2020-09-03 16:00:38

by Christoph Hellwig

[permalink] [raw]
Subject: [PATCH] /dev/zero: also implement ->read

Christophe reported a major speedup due to avoiding the iov_iter
overhead, so just add this trivial function. Note that /dev/zero
already implements both an iter and non-iter writes so this just
makes it more symmetric.

Christophe Leroy <[email protected]>
Signed-off-by: Christoph Hellwig <[email protected]>
---
drivers/char/mem.c | 22 ++++++++++++++++++++++
1 file changed, 22 insertions(+)

diff --git a/drivers/char/mem.c b/drivers/char/mem.c
index abd4ffdc8cdebc..1dc99ab158457a 100644
--- a/drivers/char/mem.c
+++ b/drivers/char/mem.c
@@ -726,6 +726,27 @@ static ssize_t read_iter_zero(struct kiocb *iocb, struct iov_iter *iter)
return written;
}

+static ssize_t read_zero(struct file *file, char __user *buf,
+ size_t count, loff_t *ppos)
+{
+ size_t cleared = 0;
+
+ while (count) {
+ size_t chunk = min_t(size_t, count, PAGE_SIZE);
+
+ if (clear_user(buf + cleared, chunk))
+ return cleared ? cleared : -EFAULT;
+ cleared += chunk;
+ count -= chunk;
+
+ if (signal_pending(current))
+ return cleared ? cleared : -ERESTARTSYS;
+ cond_resched();
+ }
+
+ return cleared;
+}
+
static int mmap_zero(struct file *file, struct vm_area_struct *vma)
{
#ifndef CONFIG_MMU
@@ -921,6 +942,7 @@ static const struct file_operations zero_fops = {
.llseek = zero_lseek,
.write = write_zero,
.read_iter = read_iter_zero,
+ .read = read_zero,
.write_iter = write_iter_zero,
.mmap = mmap_zero,
.get_unmapped_area = get_unmapped_area_zero,
--
2.28.0


2020-09-03 16:12:46

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read



Le 03/09/2020 à 17:59, Christoph Hellwig a écrit :
> Christophe reported a major speedup due to avoiding the iov_iter
> overhead, so just add this trivial function. Note that /dev/zero
> already implements both an iter and non-iter writes so this just
> makes it more symmetric.
>
> Christophe Leroy <[email protected]>
> Signed-off-by: Christoph Hellwig <[email protected]>

Tested-by: Christophe Leroy <[email protected]>

> ---
> drivers/char/mem.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> index abd4ffdc8cdebc..1dc99ab158457a 100644
> --- a/drivers/char/mem.c
> +++ b/drivers/char/mem.c
> @@ -726,6 +726,27 @@ static ssize_t read_iter_zero(struct kiocb *iocb, struct iov_iter *iter)
> return written;
> }
>
> +static ssize_t read_zero(struct file *file, char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + size_t cleared = 0;
> +
> + while (count) {
> + size_t chunk = min_t(size_t, count, PAGE_SIZE);
> +
> + if (clear_user(buf + cleared, chunk))
> + return cleared ? cleared : -EFAULT;
> + cleared += chunk;
> + count -= chunk;
> +
> + if (signal_pending(current))
> + return cleared ? cleared : -ERESTARTSYS;
> + cond_resched();
> + }
> +
> + return cleared;
> +}
> +
> static int mmap_zero(struct file *file, struct vm_area_struct *vma)
> {
> #ifndef CONFIG_MMU
> @@ -921,6 +942,7 @@ static const struct file_operations zero_fops = {
> .llseek = zero_lseek,
> .write = write_zero,
> .read_iter = read_iter_zero,
> + .read = read_zero,
> .write_iter = write_iter_zero,
> .mmap = mmap_zero,
> .get_unmapped_area = get_unmapped_area_zero,
>

2020-09-03 17:59:09

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read



Le 03/09/2020 à 17:59, Christoph Hellwig a écrit :
> Christophe reported a major speedup due to avoiding the iov_iter
> overhead, so just add this trivial function. Note that /dev/zero
> already implements both an iter and non-iter writes so this just
> makes it more symmetric.
>
> Christophe Leroy <[email protected]>
> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/char/mem.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> index abd4ffdc8cdebc..1dc99ab158457a 100644
> --- a/drivers/char/mem.c
> +++ b/drivers/char/mem.c
> @@ -726,6 +726,27 @@ static ssize_t read_iter_zero(struct kiocb *iocb, struct iov_iter *iter)
> return written;
> }
>
> +static ssize_t read_zero(struct file *file, char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + size_t cleared = 0;
> +
> + while (count) {
> + size_t chunk = min_t(size_t, count, PAGE_SIZE);
> +
> + if (clear_user(buf + cleared, chunk))
> + return cleared ? cleared : -EFAULT;
> + cleared += chunk;
> + count -= chunk;
> +
> + if (signal_pending(current))
> + return cleared ? cleared : -ERESTARTSYS;
> + cond_resched();
> + }
> +
> + return cleared;
> +}
> +
> static int mmap_zero(struct file *file, struct vm_area_struct *vma)
> {
> #ifndef CONFIG_MMU
> @@ -921,6 +942,7 @@ static const struct file_operations zero_fops = {
> .llseek = zero_lseek,
> .write = write_zero,
> .read_iter = read_iter_zero,
> + .read = read_zero,

Wondering if .read should be before .write, so that we get in order
read, write, read_iter, write_iter.

> .write_iter = write_iter_zero,
> .mmap = mmap_zero,
> .get_unmapped_area = get_unmapped_area_zero,
>

Christophe

2020-09-03 18:05:30

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On Thu, Sep 03, 2020 at 07:51:07PM +0200, Christophe Leroy wrote:
>
>
> Le 03/09/2020 ? 17:59, Christoph Hellwig a ?crit?:
> > Christophe reported a major speedup due to avoiding the iov_iter
> > overhead, so just add this trivial function. Note that /dev/zero
> > already implements both an iter and non-iter writes so this just
> > makes it more symmetric.
> >
> > Christophe Leroy <[email protected]>
> > Signed-off-by: Christoph Hellwig <[email protected]>
> > ---
> > drivers/char/mem.c | 22 ++++++++++++++++++++++
> > 1 file changed, 22 insertions(+)
> >
> > diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> > index abd4ffdc8cdebc..1dc99ab158457a 100644
> > --- a/drivers/char/mem.c
> > +++ b/drivers/char/mem.c
> > @@ -726,6 +726,27 @@ static ssize_t read_iter_zero(struct kiocb *iocb, struct iov_iter *iter)
> > return written;
> > }
> > +static ssize_t read_zero(struct file *file, char __user *buf,
> > + size_t count, loff_t *ppos)
> > +{
> > + size_t cleared = 0;
> > +
> > + while (count) {
> > + size_t chunk = min_t(size_t, count, PAGE_SIZE);
> > +
> > + if (clear_user(buf + cleared, chunk))
> > + return cleared ? cleared : -EFAULT;
> > + cleared += chunk;
> > + count -= chunk;
> > +
> > + if (signal_pending(current))
> > + return cleared ? cleared : -ERESTARTSYS;
> > + cond_resched();
> > + }
> > +
> > + return cleared;
> > +}
> > +
> > static int mmap_zero(struct file *file, struct vm_area_struct *vma)
> > {
> > #ifndef CONFIG_MMU
> > @@ -921,6 +942,7 @@ static const struct file_operations zero_fops = {
> > .llseek = zero_lseek,
> > .write = write_zero,
> > .read_iter = read_iter_zero,
> > + .read = read_zero,
>
> Wondering if .read should be before .write, so that we get in order read,
> write, read_iter, write_iter.

It really does not matter :)

thanks,

greg k-h

2020-09-03 21:36:20

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] /dev/zero: also implement ->read

From: Christophe Leroy
>
>
> Le 03/09/2020 à 17:59, Christoph Hellwig a écrit :
> > Christophe reported a major speedup due to avoiding the iov_iter
> > overhead, so just add this trivial function. Note that /dev/zero
> > already implements both an iter and non-iter writes so this just
> > makes it more symmetric.
> >
> > Christophe Leroy <[email protected]>
> > Signed-off-by: Christoph Hellwig <[email protected]>
>
> Tested-by: Christophe Leroy <[email protected]>

Any idea what has happened to make the 'iter' version so bad?

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-09-06 18:22:43

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

Hi!

> > > Christophe reported a major speedup due to avoiding the iov_iter
> > > overhead, so just add this trivial function. Note that /dev/zero
> > > already implements both an iter and non-iter writes so this just
> > > makes it more symmetric.
> > >
> > > Christophe Leroy <[email protected]>
> > > Signed-off-by: Christoph Hellwig <[email protected]>
> >
> > Tested-by: Christophe Leroy <[email protected]>
>
> Any idea what has happened to make the 'iter' version so bad?

Exactly. Also it would be nice to note how the speedup was measured
and what the speedup is.

Best regads,
Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Attachments:
(No filename) (775.00 B)
signature.asc (188.00 B)
Digital signature
Download all attachments

2020-09-06 18:37:54

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

Hi,

Le 06/09/2020 ? 20:21, Pavel Machek a ?crit?:
> Hi!
>
>>>> Christophe reported a major speedup due to avoiding the iov_iter
>>>> overhead, so just add this trivial function. Note that /dev/zero
>>>> already implements both an iter and non-iter writes so this just
>>>> makes it more symmetric.
>>>>
>>>> Christophe Leroy <[email protected]>
>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>
>>> Tested-by: Christophe Leroy <[email protected]>
>>
>> Any idea what has happened to make the 'iter' version so bad?
>
> Exactly. Also it would be nice to note how the speedup was measured
> and what the speedup is.
>

Was measured on an 8xx powerpc running at 132MHz with:

dd if=/dev/zero of=/dev/null count=1M

With the patch, dd displays a throughput of 113.5MB/s
Without the patch it is 99.9MB/s

Christophe

2020-09-06 18:40:13

by Pavel Machek

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On Sun 2020-09-06 20:35:38, Christophe Leroy wrote:
> Hi,
>
> Le 06/09/2020 ? 20:21, Pavel Machek a ?crit?:
> >Hi!
> >
> >>>>Christophe reported a major speedup due to avoiding the iov_iter
> >>>>overhead, so just add this trivial function. Note that /dev/zero
> >>>>already implements both an iter and non-iter writes so this just
> >>>>makes it more symmetric.
> >>>>
> >>>>Christophe Leroy <[email protected]>
> >>>>Signed-off-by: Christoph Hellwig <[email protected]>
> >>>
> >>>Tested-by: Christophe Leroy <[email protected]>
> >>
> >>Any idea what has happened to make the 'iter' version so bad?
> >
> >Exactly. Also it would be nice to note how the speedup was measured
> >and what the speedup is.
> >
>
> Was measured on an 8xx powerpc running at 132MHz with:
>
> dd if=/dev/zero of=/dev/null count=1M
>
> With the patch, dd displays a throughput of 113.5MB/s
> Without the patch it is 99.9MB/s

Actually... that does not seem like a huge deal. read(/dev/zero) is
not that common operation.

Are you getting similar speedups on normal hardware?

Pavel
--
DENX Software Engineering GmbH, Managing Director: Wolfgang Denk
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany


Attachments:
(No filename) (1.24 kB)
signature.asc (188.00 B)
Digital signature
Download all attachments

2020-09-06 18:48:13

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read



Le 06/09/2020 ? 20:38, Pavel Machek a ?crit?:
> On Sun 2020-09-06 20:35:38, Christophe Leroy wrote:
>> Hi,
>>
>> Le 06/09/2020 ? 20:21, Pavel Machek a ?crit?:
>>> Hi!
>>>
>>>>>> Christophe reported a major speedup due to avoiding the iov_iter
>>>>>> overhead, so just add this trivial function. Note that /dev/zero
>>>>>> already implements both an iter and non-iter writes so this just
>>>>>> makes it more symmetric.
>>>>>>
>>>>>> Christophe Leroy <[email protected]>
>>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>>>
>>>>> Tested-by: Christophe Leroy <[email protected]>
>>>>
>>>> Any idea what has happened to make the 'iter' version so bad?
>>>
>>> Exactly. Also it would be nice to note how the speedup was measured
>>> and what the speedup is.
>>>
>>
>> Was measured on an 8xx powerpc running at 132MHz with:

Oops. That was not on an 8xx but on an 8321 running at 333MHz, sorry.

>>
>> dd if=/dev/zero of=/dev/null count=1M
>>
>> With the patch, dd displays a throughput of 113.5MB/s
>> Without the patch it is 99.9MB/s
>
> Actually... that does not seem like a huge deal. read(/dev/zero) is
> not that common operation.

That's 14% more. It is not negligeable.

I think I need to measure the /dev/zero read standalone. I guess the
write to /dev/null flatters the result.

>
> Are you getting similar speedups on normal hardware?
>

Do you regard powerpc embedded devices as abnormal ?
AFAIK the 832x is embedded in millions of ADSL boxes.
What processor do you have in mind as normal hardware ?

Christophe

2020-09-06 20:13:55

by Greg Kroah-Hartman

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On Sun, Sep 06, 2020 at 08:38:20PM +0200, Pavel Machek wrote:
> On Sun 2020-09-06 20:35:38, Christophe Leroy wrote:
> > Hi,
> >
> > Le 06/09/2020 ? 20:21, Pavel Machek a ?crit?:
> > >Hi!
> > >
> > >>>>Christophe reported a major speedup due to avoiding the iov_iter
> > >>>>overhead, so just add this trivial function. Note that /dev/zero
> > >>>>already implements both an iter and non-iter writes so this just
> > >>>>makes it more symmetric.
> > >>>>
> > >>>>Christophe Leroy <[email protected]>
> > >>>>Signed-off-by: Christoph Hellwig <[email protected]>
> > >>>
> > >>>Tested-by: Christophe Leroy <[email protected]>
> > >>
> > >>Any idea what has happened to make the 'iter' version so bad?
> > >
> > >Exactly. Also it would be nice to note how the speedup was measured
> > >and what the speedup is.
> > >
> >
> > Was measured on an 8xx powerpc running at 132MHz with:
> >
> > dd if=/dev/zero of=/dev/null count=1M
> >
> > With the patch, dd displays a throughput of 113.5MB/s
> > Without the patch it is 99.9MB/s
>
> Actually... that does not seem like a huge deal. read(/dev/zero) is
> not that common operation.

There is nothing wrong with this patch (aside from the sparse warning),
and it's in my tree now, so I don't understand complaining about it...

greg k-h

2020-09-06 20:56:14

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] /dev/zero: also implement ->read

From: Christophe Leroy
> Sent: 06 September 2020 19:36
> Hi,
>
> Le 06/09/2020 à 20:21, Pavel Machek a écrit :
> > Hi!
> >
> >>>> Christophe reported a major speedup due to avoiding the iov_iter
> >>>> overhead, so just add this trivial function. Note that /dev/zero
> >>>> already implements both an iter and non-iter writes so this just
> >>>> makes it more symmetric.
> >>>>
> >>>> Christophe Leroy <[email protected]>
> >>>> Signed-off-by: Christoph Hellwig <[email protected]>
> >>>
> >>> Tested-by: Christophe Leroy <[email protected]>
> >>
> >> Any idea what has happened to make the 'iter' version so bad?
> >
> > Exactly. Also it would be nice to note how the speedup was measured
> > and what the speedup is.
> >
>
> Was measured on an 8xx powerpc running at 132MHz with:
>
> dd if=/dev/zero of=/dev/null count=1M
>
> With the patch, dd displays a throughput of 113.5MB/s
> Without the patch it is 99.9MB/s

That in itself isn't a problem.
What was the throughput before any of these patches?

I just remember another thread about the same test running
a lot slower after one of the related changes.
While this speeds up read /dev/zero (which is uncommon)
if this is needed to get near the old performance then
the changes to the 'iter' code will affect real workloads.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

2020-09-06 22:36:33

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On 03/09/2020 17.59, Christoph Hellwig wrote:
> Christophe reported a major speedup due to avoiding the iov_iter
> overhead, so just add this trivial function. Note that /dev/zero
> already implements both an iter and non-iter writes so this just
> makes it more symmetric.
>
> Christophe Leroy <[email protected]>

?-by ?

> Signed-off-by: Christoph Hellwig <[email protected]>
> ---
> drivers/char/mem.c | 22 ++++++++++++++++++++++
> 1 file changed, 22 insertions(+)
>
> diff --git a/drivers/char/mem.c b/drivers/char/mem.c
> index abd4ffdc8cdebc..1dc99ab158457a 100644
> --- a/drivers/char/mem.c
> +++ b/drivers/char/mem.c
> @@ -726,6 +726,27 @@ static ssize_t read_iter_zero(struct kiocb *iocb, struct iov_iter *iter)
> return written;
> }
>
> +static ssize_t read_zero(struct file *file, char __user *buf,
> + size_t count, loff_t *ppos)
> +{
> + size_t cleared = 0;
> +
> + while (count) {
> + size_t chunk = min_t(size_t, count, PAGE_SIZE);
> +
> + if (clear_user(buf + cleared, chunk))
> + return cleared ? cleared : -EFAULT;

Probably nobody really cares, but currently doing

read(fd, &unmapped_page - 5, 123);

returns 5, and those five bytes do get cleared; if I'm reading the above
right you'd return -EFAULT for that case.


> + cleared += chunk;
> + count -= chunk;
> +
> + if (signal_pending(current))
> + return cleared ? cleared : -ERESTARTSYS;

I can't see how we can get here without 'cleared' being positive, so
this can just be 'return cleared' (and if you fix the above EFAULT case
to more accurately track how much got cleared, there's probably no
longer any code to be symmetric with anyway).

Rasmus

2020-09-07 04:46:33

by Christophe Leroy

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read



Le 06/09/2020 à 22:52, David Laight a écrit :
> From: Christophe Leroy
>> Sent: 06 September 2020 19:36
>> Hi,
>>
>> Le 06/09/2020 à 20:21, Pavel Machek a écrit :
>>> Hi!
>>>
>>>>>> Christophe reported a major speedup due to avoiding the iov_iter
>>>>>> overhead, so just add this trivial function. Note that /dev/zero
>>>>>> already implements both an iter and non-iter writes so this just
>>>>>> makes it more symmetric.
>>>>>>
>>>>>> Christophe Leroy <[email protected]>
>>>>>> Signed-off-by: Christoph Hellwig <[email protected]>
>>>>>
>>>>> Tested-by: Christophe Leroy <[email protected]>
>>>>
>>>> Any idea what has happened to make the 'iter' version so bad?
>>>
>>> Exactly. Also it would be nice to note how the speedup was measured
>>> and what the speedup is.
>>>
>>
>> Was measured on an 8xx powerpc running at 132MHz with:
>>
>> dd if=/dev/zero of=/dev/null count=1M
>>
>> With the patch, dd displays a throughput of 113.5MB/s
>> Without the patch it is 99.9MB/s
>
> That in itself isn't a problem.
> What was the throughput before any of these patches?
>
> I just remember another thread about the same test running
> a lot slower after one of the related changes.

> While this speeds up read /dev/zero (which is uncommon)
> if this is needed to get near the old performance then
> the changes to the 'iter' code will affect real workloads.

If you are talking about the tests around the set_fs series from
Christoph, I identified that the degradation was linked to
CONFIG_STACKPROTECTOR_STRONG being selected by default, which provides
unreliable results from one patch to another, GCC doing some strange
things linked to unrelated changes.

With CONFIG_STACKPROTECTOR set to N, I get stable performance and no
degradation with any of the patches of the set_fs series.

Christophe

2020-09-07 06:23:53

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On Mon, Sep 07, 2020 at 12:34:37AM +0200, Rasmus Villemoes wrote:
> On 03/09/2020 17.59, Christoph Hellwig wrote:
> > Christophe reported a major speedup due to avoiding the iov_iter
> > overhead, so just add this trivial function. Note that /dev/zero
> > already implements both an iter and non-iter writes so this just
> > makes it more symmetric.
> >
> > Christophe Leroy <[email protected]>
>
> ?-by ?

Suggested-by,

> > +static ssize_t read_zero(struct file *file, char __user *buf,
> > + size_t count, loff_t *ppos)
> > +{
> > + size_t cleared = 0;
> > +
> > + while (count) {
> > + size_t chunk = min_t(size_t, count, PAGE_SIZE);
> > +
> > + if (clear_user(buf + cleared, chunk))
> > + return cleared ? cleared : -EFAULT;
>
> Probably nobody really cares, but currently doing
>
> read(fd, &unmapped_page - 5, 123);
>
> returns 5, and those five bytes do get cleared; if I'm reading the above
> right you'd return -EFAULT for that case.
>
>
> > + cleared += chunk;
> > + count -= chunk;
> > +
> > + if (signal_pending(current))
> > + return cleared ? cleared : -ERESTARTSYS;
>
> I can't see how we can get here without 'cleared' being positive, so
> this can just be 'return cleared' (and if you fix the above EFAULT case
> to more accurately track how much got cleared, there's probably no
> longer any code to be symmetric with anyway).

Yeah, I'll fix these up and resend.

2020-09-07 06:54:07

by Rasmus Villemoes

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On 07/09/2020 08.20, Christoph Hellwig wrote:
> On Mon, Sep 07, 2020 at 12:34:37AM +0200, Rasmus Villemoes wrote:
>> On 03/09/2020 17.59, Christoph Hellwig wrote:
>
>>> +static ssize_t read_zero(struct file *file, char __user *buf,
>>> + size_t count, loff_t *ppos)
>>> +{
>>> + size_t cleared = 0;
>>> +
>>> + while (count) {
>>> + size_t chunk = min_t(size_t, count, PAGE_SIZE);
>>> +
>>> + if (clear_user(buf + cleared, chunk))
>>> + return cleared ? cleared : -EFAULT;
>>
>> Probably nobody really cares, but currently doing
>>
>> read(fd, &unmapped_page - 5, 123);
>>
>> returns 5, and those five bytes do get cleared; if I'm reading the above
>> right you'd return -EFAULT for that case.
>>
>>
>>> + cleared += chunk;
>>> + count -= chunk;
>>> +
>>> + if (signal_pending(current))
>>> + return cleared ? cleared : -ERESTARTSYS;
>>
>> I can't see how we can get here without 'cleared' being positive, so
>> this can just be 'return cleared' (and if you fix the above EFAULT case
>> to more accurately track how much got cleared, there's probably no
>> longer any code to be symmetric with anyway).
>
> Yeah, I'll fix these up and resend.
>

Actually, while you're micro-optimizing it, AFAIK VFS already handles
count==0, so you can avoid the initial branch and the last
cond_resched() by writing it something like

while (1) {
size_t chunk = min_t(size_t, count, PAGE_SIZE), c;
c = chunk - clear_user(buf + cleared, chunk);
if (unlikely(!c))
return cleared ?: -EFAULT;
cleared += c;
count -= c;
if (!count || signal_pending())
return cleared;
cond_resched();
}

For the dd test case with the default bs=512 that avoids cond_resched()
and signal_pending() altogether.

Rasmus

2020-09-07 07:34:20

by Christoph Hellwig

[permalink] [raw]
Subject: Re: [PATCH] /dev/zero: also implement ->read

On Mon, Sep 07, 2020 at 08:50:53AM +0200, Rasmus Villemoes wrote:
>
> Actually, while you're micro-optimizing it, AFAIK VFS already handles
> count==0,

It doesn't.

2020-09-07 08:19:44

by David Laight

[permalink] [raw]
Subject: RE: [PATCH] /dev/zero: also implement ->read

From: Christophe Leroy
> Sent: 07 September 2020 05:44
...
> If you are talking about the tests around the set_fs series from
> Christoph, I identified that the degradation was linked to
> CONFIG_STACKPROTECTOR_STRONG being selected by default, which provides
> unreliable results from one patch to another, GCC doing some strange
> things linked to unrelated changes.
>
> With CONFIG_STACKPROTECTOR set to N, I get stable performance and no
> degradation with any of the patches of the set_fs series.

Ah I didn't spot that response going through.

Sounds like that option should be disabled if it causes that
much of a performance drop.

David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)