2017-04-06 14:56:29

by Akinobu Mita

[permalink] [raw]
Subject: [PATCH -mm 0/5] fault-inject: improve fail-nth interface

This series tries to improve fail-nth interface which was added to -mm
tree recently.

Akinobu Mita (5):
fault-inject: automatically detect the number base for fail-nth write
interface
fault-inject: parse as natural 1-based value for fail-nth write
interface
fault-inject: make fail-nth read/write interface symmetric
fault-inject: simplify access check for fail-nth
fault-inject: add /proc/<pid>/fail-nth

Documentation/fault-injection/fault-injection.txt | 21 ++++++------
fs/proc/base.c | 40 +++++++++--------------
include/linux/sched.h | 2 +-
3 files changed, 28 insertions(+), 35 deletions(-)

Cc: Dmitry Vyukov <[email protected]>
--
2.7.4


2017-04-06 14:56:40

by Akinobu Mita

[permalink] [raw]
Subject: [PATCH -mm 2/5] fault-inject: parse as natural 1-based value for fail-nth write interface

The value written to fail-nth file is parsed as 0-based. Parsing as
one-based is more natural to understand and it enables to cancel the
previous setup by simply writing '0'.

This change also convert task->fail_nth from signed to unsigned int.

Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Akinobu Mita <[email protected]>
---
Documentation/fault-injection/fault-injection.txt | 7 +++----
fs/proc/base.c | 9 ++++-----
include/linux/sched.h | 2 +-
3 files changed, 8 insertions(+), 10 deletions(-)

diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index 192d8cb..a321905 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -138,8 +138,8 @@ o proc entries

- /proc/self/task/<current-tid>/fail-nth:

- Write to this file of integer N makes N-th call in the current task fail
- (N is 0-based). Read from this file returns a single char 'Y' or 'N'
+ Write to this file of integer N makes N-th call in the task fail.
+ Read from this file returns a single char 'Y' or 'N'
that says if the fault setup with a previous write to this file was
injected or not, and disables the fault if it wasn't yet injected.
Note that this file enables all types of faults (slab, futex, etc).
@@ -320,7 +320,7 @@ int main()
system("echo N > /sys/kernel/debug/failslab/ignore-gfp-wait");
sprintf(buf, "/proc/self/task/%ld/fail-nth", syscall(SYS_gettid));
fail_nth = open(buf, O_RDWR);
- for (i = 0;; i++) {
+ for (i = 1;; i++) {
sprintf(buf, "%d", i);
write(fail_nth, buf, strlen(buf));
res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
@@ -339,7 +339,6 @@ int main()

An example output:

-0-th fault Y: res=-1/23
1-th fault Y: res=-1/23
2-th fault Y: res=-1/23
3-th fault Y: res=-1/12
diff --git a/fs/proc/base.c b/fs/proc/base.c
index c85376b..42c52e2 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1363,7 +1363,8 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
size_t count, loff_t *ppos)
{
struct task_struct *task;
- int err, n;
+ int err;
+ unsigned int n;

task = get_proc_task(file_inode(file));
if (!task)
@@ -1371,12 +1372,10 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
put_task_struct(task);
if (task != current)
return -EPERM;
- err = kstrtoint_from_user(buf, count, 0, &n);
+ err = kstrtouint_from_user(buf, count, 0, &n);
if (err)
return err;
- if (n < 0 || n == INT_MAX)
- return -EINVAL;
- current->fail_nth = n + 1;
+ current->fail_nth = n;
return count;
}

diff --git a/include/linux/sched.h b/include/linux/sched.h
index f0f0854..bec5d97 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -945,7 +945,7 @@ struct task_struct {

#ifdef CONFIG_FAULT_INJECTION
int make_it_fail;
- int fail_nth;
+ unsigned int fail_nth;
#endif
/*
* When (nr_dirtied >= nr_dirtied_pause), it's time to call
--
2.7.4

2017-04-06 14:56:51

by Akinobu Mita

[permalink] [raw]
Subject: [PATCH -mm 3/5] fault-inject: make fail-nth read/write interface symmetric

The read interface for fail-nth looks a bit odd. Read from this file
returns "NYYYY..." or "YYYYY..." (this makes me surprise when cat this
file). Because there is no EOF condition. The first character indicates
current->fail_nth is zero or not, and then current->fail_nth is reset
to zero.

Just returning task->fail_nth value is more natural to understand.

Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Akinobu Mita <[email protected]>
---
Documentation/fault-injection/fault-injection.txt | 13 +++++++------
fs/proc/base.c | 14 ++++++--------
2 files changed, 13 insertions(+), 14 deletions(-)

diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index a321905..370ddcb 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -139,9 +139,9 @@ o proc entries
- /proc/self/task/<current-tid>/fail-nth:

Write to this file of integer N makes N-th call in the task fail.
- Read from this file returns a single char 'Y' or 'N'
- that says if the fault setup with a previous write to this file was
- injected or not, and disables the fault if it wasn't yet injected.
+ Read from this file returns a integer value. A value of '0' indicates
+ that the fault setup with a previous write to this file was injected.
+ A positive integer N indicates that the fault wasn't yet injected.
Note that this file enables all types of faults (slab, futex, etc).
This setting takes precedence over all other generic debugfs settings
like probability, interval, times, etc. But per-capability settings
@@ -325,13 +325,14 @@ int main()
write(fail_nth, buf, strlen(buf));
res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
err = errno;
- read(fail_nth, buf, 1);
+ pread(fail_nth, buf, sizeof(buf), 0);
if (res == 0) {
close(fds[0]);
close(fds[1]);
}
- printf("%d-th fault %c: res=%d/%d\n", i, buf[0], res, err);
- if (buf[0] != 'Y')
+ printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
+ res, err);
+ if (atoi(buf))
break;
}
return 0;
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 42c52e2..9d14215 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1383,7 +1383,8 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
size_t count, loff_t *ppos)
{
struct task_struct *task;
- int err;
+ char numbuf[PROC_NUMBUF];
+ ssize_t len;

task = get_proc_task(file_inode(file));
if (!task)
@@ -1391,13 +1392,10 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
put_task_struct(task);
if (task != current)
return -EPERM;
- if (count < 1)
- return -EINVAL;
- err = put_user((char)(current->fail_nth ? 'N' : 'Y'), buf);
- if (err)
- return err;
- current->fail_nth = 0;
- return 1;
+ len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
+ len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
+
+ return len;
}

static const struct file_operations proc_fail_nth_operations = {
--
2.7.4

2017-04-06 14:57:07

by Akinobu Mita

[permalink] [raw]
Subject: [PATCH -mm 1/5] fault-inject: automatically detect the number base for fail-nth write interface

Automatically detect the number base to use when writing to fail-nth
file instead of always parsing as a decimal number.

Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Akinobu Mita <[email protected]>
---
fs/proc/base.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index f3fc74c..c85376b 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1371,7 +1371,7 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
put_task_struct(task);
if (task != current)
return -EPERM;
- err = kstrtoint_from_user(buf, count, 10, &n);
+ err = kstrtoint_from_user(buf, count, 0, &n);
if (err)
return err;
if (n < 0 || n == INT_MAX)
--
2.7.4

2017-04-06 14:57:05

by Akinobu Mita

[permalink] [raw]
Subject: [PATCH -mm 4/5] fault-inject: simplify access check for fail-nth

The fail-nth file is created with 0666 and the access is permitted if
and only if the task is current.

This file is owned by the currnet user. So we can create it with 0644
and allow the owner to write it. This enables to watch the status of
task->fail_nth from another processes.

Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Akinobu Mita <[email protected]>
---
fs/proc/base.c | 22 ++++++++--------------
1 file changed, 8 insertions(+), 14 deletions(-)

diff --git a/fs/proc/base.c b/fs/proc/base.c
index 9d14215..14e7b73 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -1366,16 +1366,16 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
int err;
unsigned int n;

+ err = kstrtoint_from_user(buf, count, 0, &n);
+ if (err)
+ return err;
+
task = get_proc_task(file_inode(file));
if (!task)
return -ESRCH;
+ task->fail_nth = n;
put_task_struct(task);
- if (task != current)
- return -EPERM;
- err = kstrtouint_from_user(buf, count, 0, &n);
- if (err)
- return err;
- current->fail_nth = n;
+
return count;
}

@@ -1389,11 +1389,9 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
task = get_proc_task(file_inode(file));
if (!task)
return -ESRCH;
- put_task_struct(task);
- if (task != current)
- return -EPERM;
len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
+ put_task_struct(task);

return len;
}
@@ -3358,11 +3356,7 @@ static const struct pid_entry tid_base_stuff[] = {
#endif
#ifdef CONFIG_FAULT_INJECTION
REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
- /*
- * Operations on the file check that the task is current,
- * so we create it with 0666 to support testing under unprivileged user.
- */
- REG("fail-nth", 0666, proc_fail_nth_operations),
+ REG("fail-nth", 0644, proc_fail_nth_operations),
#endif
#ifdef CONFIG_TASK_IO_ACCOUNTING
ONE("io", S_IRUSR, proc_tid_io_accounting),
--
2.7.4

2017-04-06 14:57:32

by Akinobu Mita

[permalink] [raw]
Subject: [PATCH -mm 5/5] fault-inject: add /proc/<pid>/fail-nth

fail-nth interface is only created in /proc/self/task/<current-tid>/.
This change also adds it in /proc/<pid>/.

This makes shell based tool a bit simpler.

$ bash -c "builtin echo 100 > /proc/self/fail-nth && exec ls /"

Cc: Dmitry Vyukov <[email protected]>
Signed-off-by: Akinobu Mita <[email protected]>
---
Documentation/fault-injection/fault-injection.txt | 3 ++-
fs/proc/base.c | 1 +
2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
index 370ddcb..918972b 100644
--- a/Documentation/fault-injection/fault-injection.txt
+++ b/Documentation/fault-injection/fault-injection.txt
@@ -136,7 +136,8 @@ use the boot option:

o proc entries

-- /proc/self/task/<current-tid>/fail-nth:
+- /proc/<pid>/fail-nth:
+- /proc/self/task/<tid>/fail-nth:

Write to this file of integer N makes N-th call in the task fail.
Read from this file returns a integer value. A value of '0' indicates
diff --git a/fs/proc/base.c b/fs/proc/base.c
index 14e7b73..ea1039d 100644
--- a/fs/proc/base.c
+++ b/fs/proc/base.c
@@ -2964,6 +2964,7 @@ static const struct pid_entry tgid_base_stuff[] = {
#endif
#ifdef CONFIG_FAULT_INJECTION
REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
+ REG("fail-nth", 0644, proc_fail_nth_operations),
#endif
#ifdef CONFIG_ELF_CORE
REG("coredump_filter", S_IRUGO|S_IWUSR, proc_coredump_filter_operations),
--
2.7.4

2017-04-07 20:24:03

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH -mm 4/5] fault-inject: simplify access check for fail-nth

On Thu, Apr 6, 2017 at 4:56 PM, Akinobu Mita <[email protected]> wrote:
> The fail-nth file is created with 0666 and the access is permitted if
> and only if the task is current.
>
> This file is owned by the currnet user. So we can create it with 0644
> and allow the owner to write it. This enables to watch the status of
> task->fail_nth from another processes.
>
> Cc: Dmitry Vyukov <[email protected]>
> Signed-off-by: Akinobu Mita <[email protected]>
> ---
> fs/proc/base.c | 22 ++++++++--------------
> 1 file changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 9d14215..14e7b73 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1366,16 +1366,16 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
> int err;
> unsigned int n;
>
> + err = kstrtoint_from_user(buf, count, 0, &n);

/\/\/\/\/\

I think this should be kstrtouint_from_user. Otherwise it effective
reverts one of previous patches.

> + if (err)
> + return err;
> +
> task = get_proc_task(file_inode(file));
> if (!task)
> return -ESRCH;
> + task->fail_nth = n;
> put_task_struct(task);
> - if (task != current)
> - return -EPERM;
> - err = kstrtouint_from_user(buf, count, 0, &n);
> - if (err)
> - return err;
> - current->fail_nth = n;
> +
> return count;
> }
>
> @@ -1389,11 +1389,9 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
> task = get_proc_task(file_inode(file));
> if (!task)
> return -ESRCH;
> - put_task_struct(task);
> - if (task != current)
> - return -EPERM;
> len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
> len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
> + put_task_struct(task);
>
> return len;
> }
> @@ -3358,11 +3356,7 @@ static const struct pid_entry tid_base_stuff[] = {
> #endif
> #ifdef CONFIG_FAULT_INJECTION
> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> - /*
> - * Operations on the file check that the task is current,
> - * so we create it with 0666 to support testing under unprivileged user.
> - */
> - REG("fail-nth", 0666, proc_fail_nth_operations),
> + REG("fail-nth", 0644, proc_fail_nth_operations),
> #endif
> #ifdef CONFIG_TASK_IO_ACCOUNTING
> ONE("io", S_IRUSR, proc_tid_io_accounting),
> --
> 2.7.4
>

2017-04-07 20:37:32

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH -mm 3/5] fault-inject: make fail-nth read/write interface symmetric

On Thu, Apr 6, 2017 at 4:55 PM, Akinobu Mita <[email protected]> wrote:
> The read interface for fail-nth looks a bit odd. Read from this file
> returns "NYYYY..." or "YYYYY..." (this makes me surprise when cat this
> file). Because there is no EOF condition. The first character indicates
> current->fail_nth is zero or not, and then current->fail_nth is reset
> to zero.
>
> Just returning task->fail_nth value is more natural to understand.
>
> Cc: Dmitry Vyukov <[email protected]>
> Signed-off-by: Akinobu Mita <[email protected]>
> ---
> Documentation/fault-injection/fault-injection.txt | 13 +++++++------
> fs/proc/base.c | 14 ++++++--------
> 2 files changed, 13 insertions(+), 14 deletions(-)
>
> diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> index a321905..370ddcb 100644
> --- a/Documentation/fault-injection/fault-injection.txt
> +++ b/Documentation/fault-injection/fault-injection.txt
> @@ -139,9 +139,9 @@ o proc entries
> - /proc/self/task/<current-tid>/fail-nth:
>
> Write to this file of integer N makes N-th call in the task fail.
> - Read from this file returns a single char 'Y' or 'N'
> - that says if the fault setup with a previous write to this file was
> - injected or not, and disables the fault if it wasn't yet injected.
> + Read from this file returns a integer value. A value of '0' indicates
> + that the fault setup with a previous write to this file was injected.
> + A positive integer N indicates that the fault wasn't yet injected.
> Note that this file enables all types of faults (slab, futex, etc).
> This setting takes precedence over all other generic debugfs settings
> like probability, interval, times, etc. But per-capability settings
> @@ -325,13 +325,14 @@ int main()
> write(fail_nth, buf, strlen(buf));
> res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
> err = errno;
> - read(fail_nth, buf, 1);
> + pread(fail_nth, buf, sizeof(buf), 0);
> if (res == 0) {
> close(fds[0]);
> close(fds[1]);
> }
> - printf("%d-th fault %c: res=%d/%d\n", i, buf[0], res, err);
> - if (buf[0] != 'Y')
> + printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
> + res, err);
> + if (atoi(buf))
> break;
> }
> return 0;
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 42c52e2..9d14215 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1383,7 +1383,8 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
> size_t count, loff_t *ppos)
> {
> struct task_struct *task;
> - int err;
> + char numbuf[PROC_NUMBUF];
> + ssize_t len;
>
> task = get_proc_task(file_inode(file));
> if (!task)
> @@ -1391,13 +1392,10 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
> put_task_struct(task);
> if (task != current)
> return -EPERM;
> - if (count < 1)
> - return -EINVAL;
> - err = put_user((char)(current->fail_nth ? 'N' : 'Y'), buf);
> - if (err)
> - return err;
> - current->fail_nth = 0;
> - return 1;
> + len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);

If we allow setting this for non current task, then we need to prevent
data races as the task uses task->fail_nth concurrently. Reads then
should use READ_ONCE and writes in fault-inject.c should use
WRITE_ONCE.

> + len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
> +
> + return len;
> }
>
> static const struct file_operations proc_fail_nth_operations = {
> --
> 2.7.4
>

2017-04-07 20:45:45

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH -mm 4/5] fault-inject: simplify access check for fail-nth

On Thu, Apr 6, 2017 at 4:56 PM, Akinobu Mita <[email protected]> wrote:
> The fail-nth file is created with 0666 and the access is permitted if
> and only if the task is current.
>
> This file is owned by the currnet user. So we can create it with 0644
> and allow the owner to write it. This enables to watch the status of
> task->fail_nth from another processes.
>
> Cc: Dmitry Vyukov <[email protected]>
> Signed-off-by: Akinobu Mita <[email protected]>
> ---
> fs/proc/base.c | 22 ++++++++--------------
> 1 file changed, 8 insertions(+), 14 deletions(-)
>
> diff --git a/fs/proc/base.c b/fs/proc/base.c
> index 9d14215..14e7b73 100644
> --- a/fs/proc/base.c
> +++ b/fs/proc/base.c
> @@ -1366,16 +1366,16 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
> int err;
> unsigned int n;
>
> + err = kstrtoint_from_user(buf, count, 0, &n);
> + if (err)
> + return err;
> +
> task = get_proc_task(file_inode(file));
> if (!task)
> return -ESRCH;
> + task->fail_nth = n;
> put_task_struct(task);
> - if (task != current)
> - return -EPERM;
> - err = kstrtouint_from_user(buf, count, 0, &n);
> - if (err)
> - return err;
> - current->fail_nth = n;
> +
> return count;
> }
>
> @@ -1389,11 +1389,9 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
> task = get_proc_task(file_inode(file));
> if (!task)
> return -ESRCH;
> - put_task_struct(task);
> - if (task != current)
> - return -EPERM;
> len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
> len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
> + put_task_struct(task);
>
> return len;
> }
> @@ -3358,11 +3356,7 @@ static const struct pid_entry tid_base_stuff[] = {
> #endif
> #ifdef CONFIG_FAULT_INJECTION
> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
> - /*
> - * Operations on the file check that the task is current,
> - * so we create it with 0666 to support testing under unprivileged user.
> - */
> - REG("fail-nth", 0666, proc_fail_nth_operations),
> + REG("fail-nth", 0644, proc_fail_nth_operations),

/\/\/\/\/\/\

This breaks us.
Under setuid sandbox test threads can't open the file anymore. And we
can't pre-open the files before dropping privs as new threads can be
created afterwards.

I think the root cause of all these problems (permissions, parsing,
serialization, broken cat, symmetry) is that we are trying to fit a
programmatic API into reads and writes on textual files. We don't need
symmetry, we don't need read+write to reset injection, we don't need
parsing and serialization, it does not make sense to do this of
non-current task, it definitely does not make sense to cat this, etc.

What do you think of 2 ioctls on /sys/kernel/debug/fail_nth?

> #endif
> #ifdef CONFIG_TASK_IO_ACCOUNTING
> ONE("io", S_IRUSR, proc_tid_io_accounting),
> --
> 2.7.4
>

2017-04-08 08:26:06

by Akinobu Mita

[permalink] [raw]
Subject: Re: [PATCH -mm 4/5] fault-inject: simplify access check for fail-nth

2017-04-08 5:45 GMT+09:00 Dmitry Vyukov <[email protected]>:
> On Thu, Apr 6, 2017 at 4:56 PM, Akinobu Mita <[email protected]> wrote:
>> The fail-nth file is created with 0666 and the access is permitted if
>> and only if the task is current.
>>
>> This file is owned by the currnet user. So we can create it with 0644
>> and allow the owner to write it. This enables to watch the status of
>> task->fail_nth from another processes.
>>
>> Cc: Dmitry Vyukov <[email protected]>
>> Signed-off-by: Akinobu Mita <[email protected]>
>> ---
>> fs/proc/base.c | 22 ++++++++--------------
>> 1 file changed, 8 insertions(+), 14 deletions(-)
>>
>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>> index 9d14215..14e7b73 100644
>> --- a/fs/proc/base.c
>> +++ b/fs/proc/base.c
>> @@ -1366,16 +1366,16 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
>> int err;
>> unsigned int n;
>>
>> + err = kstrtoint_from_user(buf, count, 0, &n);
>> + if (err)
>> + return err;
>> +
>> task = get_proc_task(file_inode(file));
>> if (!task)
>> return -ESRCH;
>> + task->fail_nth = n;
>> put_task_struct(task);
>> - if (task != current)
>> - return -EPERM;
>> - err = kstrtouint_from_user(buf, count, 0, &n);
>> - if (err)
>> - return err;
>> - current->fail_nth = n;
>> +
>> return count;
>> }
>>
>> @@ -1389,11 +1389,9 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
>> task = get_proc_task(file_inode(file));
>> if (!task)
>> return -ESRCH;
>> - put_task_struct(task);
>> - if (task != current)
>> - return -EPERM;
>> len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
>> len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
>> + put_task_struct(task);
>>
>> return len;
>> }
>> @@ -3358,11 +3356,7 @@ static const struct pid_entry tid_base_stuff[] = {
>> #endif
>> #ifdef CONFIG_FAULT_INJECTION
>> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
>> - /*
>> - * Operations on the file check that the task is current,
>> - * so we create it with 0666 to support testing under unprivileged user.
>> - */
>> - REG("fail-nth", 0666, proc_fail_nth_operations),
>> + REG("fail-nth", 0644, proc_fail_nth_operations),
>
> /\/\/\/\/\/\
>
> This breaks us.
> Under setuid sandbox test threads can't open the file anymore. And we
> can't pre-open the files before dropping privs as new threads can be
> created afterwards.

Could you provide a working example for this? Because I'm not sure
I understand the problem you described here.

If we omit resetting tsk->fail_nth in dup_task_struct(), tsk->fail_nth
is inherited from parent to child process. So the parent process can
pre-open and set fail-nth file and reset parent's own ->fail_nth after
fork by writing 0 to fail-nth file. Does that fix your problem?

> I think the root cause of all these problems (permissions, parsing,
> serialization, broken cat, symmetry) is that we are trying to fit a
> programmatic API into reads and writes on textual files. We don't need
> symmetry, we don't need read+write to reset injection, we don't need
> parsing and serialization, it does not make sense to do this of
> non-current task, it definitely does not make sense to cat this, etc.
>
> What do you think of 2 ioctls on /sys/kernel/debug/fail_nth?

I think the misc device is suitable than debugfs file for ioctl only
knob. But I prefer read/write interface than ioctl if possible.

>> #endif
>> #ifdef CONFIG_TASK_IO_ACCOUNTING
>> ONE("io", S_IRUSR, proc_tid_io_accounting),
>> --
>> 2.7.4
>>

2017-04-08 17:37:01

by Dmitry Vyukov

[permalink] [raw]
Subject: Re: [PATCH -mm 4/5] fault-inject: simplify access check for fail-nth

On Sat, Apr 8, 2017 at 10:25 AM, Akinobu Mita <[email protected]> wrote:
> 2017-04-08 5:45 GMT+09:00 Dmitry Vyukov <[email protected]>:
>> On Thu, Apr 6, 2017 at 4:56 PM, Akinobu Mita <[email protected]> wrote:
>>> The fail-nth file is created with 0666 and the access is permitted if
>>> and only if the task is current.
>>>
>>> This file is owned by the currnet user. So we can create it with 0644
>>> and allow the owner to write it. This enables to watch the status of
>>> task->fail_nth from another processes.
>>>
>>> Cc: Dmitry Vyukov <[email protected]>
>>> Signed-off-by: Akinobu Mita <[email protected]>
>>> ---
>>> fs/proc/base.c | 22 ++++++++--------------
>>> 1 file changed, 8 insertions(+), 14 deletions(-)
>>>
>>> diff --git a/fs/proc/base.c b/fs/proc/base.c
>>> index 9d14215..14e7b73 100644
>>> --- a/fs/proc/base.c
>>> +++ b/fs/proc/base.c
>>> @@ -1366,16 +1366,16 @@ static ssize_t proc_fail_nth_write(struct file *file, const char __user *buf,
>>> int err;
>>> unsigned int n;
>>>
>>> + err = kstrtoint_from_user(buf, count, 0, &n);
>>> + if (err)
>>> + return err;
>>> +
>>> task = get_proc_task(file_inode(file));
>>> if (!task)
>>> return -ESRCH;
>>> + task->fail_nth = n;
>>> put_task_struct(task);
>>> - if (task != current)
>>> - return -EPERM;
>>> - err = kstrtouint_from_user(buf, count, 0, &n);
>>> - if (err)
>>> - return err;
>>> - current->fail_nth = n;
>>> +
>>> return count;
>>> }
>>>
>>> @@ -1389,11 +1389,9 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
>>> task = get_proc_task(file_inode(file));
>>> if (!task)
>>> return -ESRCH;
>>> - put_task_struct(task);
>>> - if (task != current)
>>> - return -EPERM;
>>> len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
>>> len = simple_read_from_buffer(buf, count, ppos, numbuf, len);
>>> + put_task_struct(task);
>>>
>>> return len;
>>> }
>>> @@ -3358,11 +3356,7 @@ static const struct pid_entry tid_base_stuff[] = {
>>> #endif
>>> #ifdef CONFIG_FAULT_INJECTION
>>> REG("make-it-fail", S_IRUGO|S_IWUSR, proc_fault_inject_operations),
>>> - /*
>>> - * Operations on the file check that the task is current,
>>> - * so we create it with 0666 to support testing under unprivileged user.
>>> - */
>>> - REG("fail-nth", 0666, proc_fail_nth_operations),
>>> + REG("fail-nth", 0644, proc_fail_nth_operations),
>>
>> /\/\/\/\/\/\
>>
>> This breaks us.
>> Under setuid sandbox test threads can't open the file anymore. And we
>> can't pre-open the files before dropping privs as new threads can be
>> created afterwards.
>
> Could you provide a working example for this? Because I'm not sure
> I understand the problem you described here.

The actual use case is syzkaller executor:
https://github.com/google/syzkaller/blob/fault_inject/executor/executor.cc
(search for "fail-nth"). When sandbox_setuid is enabled, open of the
file fails with EPERM.

> If we omit resetting tsk->fail_nth in dup_task_struct(), tsk->fail_nth
> is inherited from parent to child process. So the parent process can
> pre-open and set fail-nth file and reset parent's own ->fail_nth after
> fork by writing 0 to fail-nth file. Does that fix your problem?

I don't think so.
First clone does unknown number of allocations. Second I don't
need/want the thread to start failing start away. At some point in
future it _may_ need to inject failures into a particular system call.
There are multiple threads, and we don't know apriori which one of
them will need to inject failures, since thread assignment is dynamic
and depends on which system calls will block and which will not block.

>> I think the root cause of all these problems (permissions, parsing,
>> serialization, broken cat, symmetry) is that we are trying to fit a
>> programmatic API into reads and writes on textual files. We don't need
>> symmetry, we don't need read+write to reset injection, we don't need
>> parsing and serialization, it does not make sense to do this of
>> non-current task, it definitely does not make sense to cat this, etc.
>>
>> What do you think of 2 ioctls on /sys/kernel/debug/fail_nth?
>
> I think the misc device is suitable than debugfs file for ioctl only
> knob. But I prefer read/write interface than ioctl if possible.

Why do you prefer read/write?
This interface is not meant to be used from command line, like lots of
other system calls.
We did /sys/kernel/debug/kcov with ioctls and it works perfect.


>>> #endif
>>> #ifdef CONFIG_TASK_IO_ACCOUNTING
>>> ONE("io", S_IRUSR, proc_tid_io_accounting),
>>> --
>>> 2.7.4
>>>

2017-07-12 20:49:41

by Andrew Morton

[permalink] [raw]
Subject: Re: [PATCH -mm 3/5] fault-inject: make fail-nth read/write interface symmetric

On Fri, 7 Apr 2017 22:37:01 +0200 Dmitry Vyukov <[email protected]> wrote:

> On Thu, Apr 6, 2017 at 4:55 PM, Akinobu Mita <[email protected]> wrote:
> > The read interface for fail-nth looks a bit odd. Read from this file
> > returns "NYYYY..." or "YYYYY..." (this makes me surprise when cat this
> > file). Because there is no EOF condition. The first character indicates
> > current->fail_nth is zero or not, and then current->fail_nth is reset
> > to zero.
> >
> > Just returning task->fail_nth value is more natural to understand.
> >
> > Cc: Dmitry Vyukov <[email protected]>
> > Signed-off-by: Akinobu Mita <[email protected]>
> > ---
> > Documentation/fault-injection/fault-injection.txt | 13 +++++++------
> > fs/proc/base.c | 14 ++++++--------
> > 2 files changed, 13 insertions(+), 14 deletions(-)
> >
> > diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
> > index a321905..370ddcb 100644
> > --- a/Documentation/fault-injection/fault-injection.txt
> > +++ b/Documentation/fault-injection/fault-injection.txt
> > @@ -139,9 +139,9 @@ o proc entries
> > - /proc/self/task/<current-tid>/fail-nth:
> >
> > Write to this file of integer N makes N-th call in the task fail.
> > - Read from this file returns a single char 'Y' or 'N'
> > - that says if the fault setup with a previous write to this file was
> > - injected or not, and disables the fault if it wasn't yet injected.
> > + Read from this file returns a integer value. A value of '0' indicates
> > + that the fault setup with a previous write to this file was injected.
> > + A positive integer N indicates that the fault wasn't yet injected.
> > Note that this file enables all types of faults (slab, futex, etc).
> > This setting takes precedence over all other generic debugfs settings
> > like probability, interval, times, etc. But per-capability settings
> > @@ -325,13 +325,14 @@ int main()
> > write(fail_nth, buf, strlen(buf));
> > res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
> > err = errno;
> > - read(fail_nth, buf, 1);
> > + pread(fail_nth, buf, sizeof(buf), 0);
> > if (res == 0) {
> > close(fds[0]);
> > close(fds[1]);
> > }
> > - printf("%d-th fault %c: res=%d/%d\n", i, buf[0], res, err);
> > - if (buf[0] != 'Y')
> > + printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
> > + res, err);
> > + if (atoi(buf))
> > break;
> > }
> > return 0;
> > diff --git a/fs/proc/base.c b/fs/proc/base.c
> > index 42c52e2..9d14215 100644
> > --- a/fs/proc/base.c
> > +++ b/fs/proc/base.c
> > @@ -1383,7 +1383,8 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
> > size_t count, loff_t *ppos)
> > {
> > struct task_struct *task;
> > - int err;
> > + char numbuf[PROC_NUMBUF];
> > + ssize_t len;
> >
> > task = get_proc_task(file_inode(file));
> > if (!task)
> > @@ -1391,13 +1392,10 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
> > put_task_struct(task);
> > if (task != current)
> > return -EPERM;
> > - if (count < 1)
> > - return -EINVAL;
> > - err = put_user((char)(current->fail_nth ? 'N' : 'Y'), buf);
> > - if (err)
> > - return err;
> > - current->fail_nth = 0;
> > - return 1;
> > + len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
>
> If we allow setting this for non current task, then we need to prevent
> data races as the task uses task->fail_nth concurrently. Reads then
> should use READ_ONCE and writes in fault-inject.c should use
> WRITE_ONCE.

This remains unresolved?

2017-07-13 16:18:09

by Akinobu Mita

[permalink] [raw]
Subject: Re: [PATCH -mm 3/5] fault-inject: make fail-nth read/write interface symmetric

2017-07-13 5:49 GMT+09:00 Andrew Morton <[email protected]>:
> On Fri, 7 Apr 2017 22:37:01 +0200 Dmitry Vyukov <[email protected]> wrote:
>
>> On Thu, Apr 6, 2017 at 4:55 PM, Akinobu Mita <[email protected]> wrote:
>> > The read interface for fail-nth looks a bit odd. Read from this file
>> > returns "NYYYY..." or "YYYYY..." (this makes me surprise when cat this
>> > file). Because there is no EOF condition. The first character indicates
>> > current->fail_nth is zero or not, and then current->fail_nth is reset
>> > to zero.
>> >
>> > Just returning task->fail_nth value is more natural to understand.
>> >
>> > Cc: Dmitry Vyukov <[email protected]>
>> > Signed-off-by: Akinobu Mita <[email protected]>
>> > ---
>> > Documentation/fault-injection/fault-injection.txt | 13 +++++++------
>> > fs/proc/base.c | 14 ++++++--------
>> > 2 files changed, 13 insertions(+), 14 deletions(-)
>> >
>> > diff --git a/Documentation/fault-injection/fault-injection.txt b/Documentation/fault-injection/fault-injection.txt
>> > index a321905..370ddcb 100644
>> > --- a/Documentation/fault-injection/fault-injection.txt
>> > +++ b/Documentation/fault-injection/fault-injection.txt
>> > @@ -139,9 +139,9 @@ o proc entries
>> > - /proc/self/task/<current-tid>/fail-nth:
>> >
>> > Write to this file of integer N makes N-th call in the task fail.
>> > - Read from this file returns a single char 'Y' or 'N'
>> > - that says if the fault setup with a previous write to this file was
>> > - injected or not, and disables the fault if it wasn't yet injected.
>> > + Read from this file returns a integer value. A value of '0' indicates
>> > + that the fault setup with a previous write to this file was injected.
>> > + A positive integer N indicates that the fault wasn't yet injected.
>> > Note that this file enables all types of faults (slab, futex, etc).
>> > This setting takes precedence over all other generic debugfs settings
>> > like probability, interval, times, etc. But per-capability settings
>> > @@ -325,13 +325,14 @@ int main()
>> > write(fail_nth, buf, strlen(buf));
>> > res = socketpair(AF_LOCAL, SOCK_STREAM, 0, fds);
>> > err = errno;
>> > - read(fail_nth, buf, 1);
>> > + pread(fail_nth, buf, sizeof(buf), 0);
>> > if (res == 0) {
>> > close(fds[0]);
>> > close(fds[1]);
>> > }
>> > - printf("%d-th fault %c: res=%d/%d\n", i, buf[0], res, err);
>> > - if (buf[0] != 'Y')
>> > + printf("%d-th fault %c: res=%d/%d\n", i, atoi(buf) ? 'N' : 'Y',
>> > + res, err);
>> > + if (atoi(buf))
>> > break;
>> > }
>> > return 0;
>> > diff --git a/fs/proc/base.c b/fs/proc/base.c
>> > index 42c52e2..9d14215 100644
>> > --- a/fs/proc/base.c
>> > +++ b/fs/proc/base.c
>> > @@ -1383,7 +1383,8 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
>> > size_t count, loff_t *ppos)
>> > {
>> > struct task_struct *task;
>> > - int err;
>> > + char numbuf[PROC_NUMBUF];
>> > + ssize_t len;
>> >
>> > task = get_proc_task(file_inode(file));
>> > if (!task)
>> > @@ -1391,13 +1392,10 @@ static ssize_t proc_fail_nth_read(struct file *file, char __user *buf,
>> > put_task_struct(task);
>> > if (task != current)
>> > return -EPERM;
>> > - if (count < 1)
>> > - return -EINVAL;
>> > - err = put_user((char)(current->fail_nth ? 'N' : 'Y'), buf);
>> > - if (err)
>> > - return err;
>> > - current->fail_nth = 0;
>> > - return 1;
>> > + len = snprintf(numbuf, sizeof(numbuf), "%u\n", task->fail_nth);
>>
>> If we allow setting this for non current task, then we need to prevent
>> data races as the task uses task->fail_nth concurrently. Reads then
>> should use READ_ONCE and writes in fault-inject.c should use
>> WRITE_ONCE.
>
> This remains unresolved?

I have just send a proposed fix. (Subject: [PATCH -mm] fault-inject: avoid
unwanted data race to task->fail_nth)