2010-06-01 07:45:37

by Jens Axboe

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Thu, May 27 2010, Michael Kerrisk wrote:
> Jens,
>
> On Mon, May 24, 2010 at 7:56 PM, Jens Axboe <[email protected]> wrote:
> > On Mon, May 24 2010, Michael Kerrisk wrote:
> >> On Mon, May 24, 2010 at 7:35 PM, Jens Axboe <[email protected]> wrote:
> >> > On Mon, May 24 2010, Michael Kerrisk wrote:
> >> >> > Right, that looks like a thinko.
> >> >> >
> >> >> > I'll submit a patch changing it to bytes and the agreed API and fix this
> >> >> > -Eerror. Thanks for your comments and suggestions!
> >> >>
> >> >> Thanks. And of course you are welcome. (Please CC linux-api@vger on
> >> >> this patch, and on all patches that change the API/ABI.)
> >> >
> >> > The first change is this:
> >> >
> >> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=0191f8697bbdfefcd36e7b8dc3eeddfe82893e4b
> >> >
> >> > and the one dealing with the pages vs bytes API is this:
> >> >
> >> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=b9598db3401282bb27b4aef77e3eee12015f7f29
> >> >
> >> > Not tested yet, will do so before sending in of course.
> >>
> >> Eyeballing it quickly, these changes look right.
> >
> > Good, thanks.
> >
> >> Do you have some test programs you can make available?
> >
> > Actually I don't, I test it by modifying fio's splice engine to set/get
> > the pipe size and test the resulting transfers.
>
> An afterthought. Do there not also need to be fixes to the /proc
> interfaces? I don't think they were included in your revised patches.

I think the proc part can be sanely left in pages, since it's just a
memory limiter.

--
Jens Axboe


2010-06-02 19:26:16

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Tue, Jun 1, 2010 at 9:45 AM, Jens Axboe <[email protected]> wrote:
> I think the proc part can be sanely left in pages, since it's just a
> memory limiter.

I can't see any advantage to using two different units for these
closely related APIs, and it does seem like it could be a source of
confusion. Similar APIs that I can think of like RLIMIT_MEMLOCK and
shmget() SHMMAX that impose per-process memory-related limits use
bytes. Best to be consistent, don't you think?

Cheers,

Michael


--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Author of "The Linux Programming Interface" http://blog.man7.org/

2010-06-03 06:10:44

by Jens Axboe

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Wed, Jun 02 2010, Michael Kerrisk wrote:
> I can't see any advantage to using two different units for these
> closely related APIs, and it does seem like it could be a source of
> confusion. Similar APIs that I can think of like RLIMIT_MEMLOCK and
> shmget() SHMMAX that impose per-process memory-related limits use
> bytes. Best to be consistent, don't you think?

But they are different interfaces. I think the 'pass in required size,
return actual size' where actual size is >= required size makes sense
for the syscall part, but for an "admin" interface it is more logical to
deal in pages. Perhaps that's just me and the average admin does not
agree. So while it's just a detail, it's also an interface, so it has some
importance. And if there's consensus that bytes is a cleaner interface
on the proc side as well, then let's change it.
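
For reference, the 'pass in required size, return actual size' semantics
can be sketched from user space. This is a minimal sketch in Python, not
a test program from this thread. It assumes the interface is exposed as
the fcntl commands F_SETPIPE_SZ and F_GETPIPE_SZ (the names used in
mainline Linux), and resize_pipe() is a hypothetical helper name.

```python
import fcntl
import os

# Assumption: mainline Linux command numbers (F_LINUX_SPECIFIC_BASE + 7
# and + 8). Python 3.10+ exposes them directly; older versions fall back.
F_SETPIPE_SZ = getattr(fcntl, "F_SETPIPE_SZ", 1031)
F_GETPIPE_SZ = getattr(fcntl, "F_GETPIPE_SZ", 1032)


def resize_pipe(fd, wanted_bytes):
    """Ask for at least wanted_bytes of capacity; return the size granted."""
    fcntl.fcntl(fd, F_SETPIPE_SZ, wanted_bytes)
    # Read the capacity back instead of trusting the request.
    return fcntl.fcntl(fd, F_GETPIPE_SZ)


read_fd, write_fd = os.pipe()
try:
    wanted = 3 * 4096 + 1  # deliberately not a round number
    granted = resize_pipe(write_fd, wanted)
    print(f"asked for {wanted} bytes, kernel granted {granted}")
finally:
    os.close(read_fd)
    os.close(write_fd)
```

The caller passes bytes and should expect the granted size to be equal
to or larger than the request, since the kernel rounds the capacity up.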

--
Jens Axboe

2010-06-03 06:46:40

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

Hi Jens,

On Thu, Jun 3, 2010 at 8:10 AM, Jens Axboe <[email protected]> wrote:
> But they are different interfaces. I think the 'pass in required size,
> return actual size' where actual size is >= required size makes sense
> for the syscall part, but for an "admin" interface it is more logical to
> deal in pages. Perhaps that's just me and the average admin does not
> agree. So while it's just a detail, it's also an interface, so it has some
> importance. And if there's consensus that bytes is a cleaner interface
> on the proc side as well, then let's change it.

I'll add one more datapoint to those that I already mentioned.
RLIMIT_STACK and RLIMIT_DATA (getrlimit()) are also expressed in bytes.

There was only one vaguely related limit that I could find that
measured things in pages. Consider these two System V shared memory
limits:

SHMMAX
This is the maximum size (in bytes) of a shared memory segment.

SHMALL
This is a system-wide limit on the total number of pages of shared memory.

But in a way this almost confirms my point. SHMMAX is a limit that
governs the behavior of individual processes (like your /proc file),
while SHMALL is a limit that governs the behavior of the system as a
whole. There is a (sort of) logic to using bytes for one and pages for
the other.
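
To compare these limits on a running system, the units have to be
normalised first. The sketch below is illustrative only: read_int() and
pages_to_bytes() are hypothetical helpers, and it assumes the Linux
files /proc/sys/kernel/shmmax and /proc/sys/kernel/shmall are readable.

```python
import os
import resource

PAGE_SIZE = os.sysconf("SC_PAGE_SIZE")


def pages_to_bytes(pages, page_size=PAGE_SIZE):
    """Convert a page count (as used by SHMALL) into bytes."""
    return pages * page_size


def read_int(path):
    """Return the integer stored in a /proc file, or None if unavailable."""
    try:
        with open(path) as f:
            return int(f.read().split()[0])
    except (OSError, ValueError, IndexError):
        return None


# Per-process limits: already expressed in bytes.
soft_stack, _hard_stack = resource.getrlimit(resource.RLIMIT_STACK)
shmmax_bytes = read_int("/proc/sys/kernel/shmmax")

# System-wide limit: expressed in pages, so convert before comparing.
shmall_pages = read_int("/proc/sys/kernel/shmall")
shmall_bytes = None if shmall_pages is None else pages_to_bytes(shmall_pages)

print("RLIMIT_STACK (bytes, -1 means unlimited):", soft_stack)
print("SHMMAX (bytes):", shmmax_bytes)
print("SHMALL (pages):", shmall_pages, "=", shmall_bytes, "bytes")
```

Only the SHMALL value needs the page-size multiplication; the other two
can be compared directly.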

I think that I've said all I need to say on the topic. I'm inclined to
think your /proc file should use bytes, since that seems consistent
with other similar APIs. Others may confirm, or someone else might
have a different insight.

Cheers,

Michael

PS I hope you are going to set the lower limit for the /proc file to
4096B (a page) (?).


2010-06-03 07:01:30

by Jens Axboe

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Thu, Jun 03 2010, Michael Kerrisk wrote:
> I think that I've said all I need to say on the topic. I'm inclined to
> think your /proc file should use bytes, since that seems consistent
> with other similar APIs. Others may confirm, or someone else might
> have a different insight.

I'll commit a patch to change it to bytes.

> PS I hope you are going to set the lower limit for the /proc file to
> 4096B (a page) (?).

Yes, I think I'll do that as a separate patch up front.
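
A byte-based limit with a one-page floor could behave as sketched below.
This is an assumption about the eventual behaviour, not code from the
patches: normalise_pipe_limit() is a hypothetical name, the 4096-byte
page size is only a default, and rounding the page count up to a power
of two is assumed here for illustration.

```python
def normalise_pipe_limit(size_bytes, page_size=4096):
    """Validate a byte-based limit and round it to what could be stored.

    Values below one page are rejected. Anything else is rounded up to a
    whole number of pages, and the page count to a power of two.
    """
    if size_bytes < page_size:
        raise ValueError(f"limit must be at least {page_size} bytes")
    pages = -(-size_bytes // page_size)          # ceiling division
    pages_pow2 = 1 << (pages - 1).bit_length()   # next power of two
    return pages_pow2 * page_size


for requested in (4096, 4097, 12289, 1048576):
    print(requested, "->", normalise_pipe_limit(requested))
```

With this shape the admin writes bytes, and the page granularity stays
an internal detail.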

--
Jens Axboe

2010-06-03 07:05:55

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Thu, Jun 3, 2010 at 9:01 AM, Jens Axboe <[email protected]> wrote:
> I'll commit a patch to change it to bytes.

Thanks Jens.

>> PS I hope you are going to set the lower limit for the /proc file to
>> 4096B (a page) (?).
>
> Yes, I think I'll do that as a separate patch up front.

Okay.

Cheers,

Michael



2010-06-03 07:48:36

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Thu, Jun 3, 2010 at 9:05 AM, Michael Kerrisk
<[email protected]> wrote:
> On Thu, Jun 3, 2010 at 9:01 AM, Jens Axboe <[email protected]> wrote:
>> I'll commit a patch to change it to bytes.
>
> Thanks Jens.

Since I'm going to document the /proc file, it occurred to me... What
are you going to call this file now? "pipe_max_pages" no longer makes
sense. "pipe_size_ceiling" may be more expressive than simply
"pipe_max".

Cheers,

Michael

2010-06-03 07:59:11

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

Hi Jens,

On Thu, Jun 3, 2010 at 9:48 AM, Michael Kerrisk
<[email protected]> wrote:
> Since I'm going to document the /proc file, it occurred to me... What
> are you going to call this file now? "pipe_max_pages" no longer makes
> sense. "pipe_size_ceiling" may be more expressive than simply
> "pipe_max".

So, I'm looking at this interface still more closely now. How about
using CAP_SYS_RESOURCE, rather than the hugely overloaded
CAP_SYS_ADMIN as the governor for the capability check? Again, it's
about consistency. Here's what CAP_SYS_RESOURCE currently governs:

CAP_SYS_RESOURCE
* Use reserved space on ext2 file systems;
* make ioctl(2) calls controlling ext3 journaling;
* override disk quota limits;
* increase resource limits (see setrlimit(2));
* override RLIMIT_NPROC resource limit;
* raise msg_qbytes limit for a System V message queue above the limit
in /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2)).

Including the pipe size limit in this list makes sense.
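
Whether a process currently holds either capability can be checked from
user space. The sketch below assumes the Linux capability numbers
CAP_SYS_ADMIN = 21 and CAP_SYS_RESOURCE = 24 and the CapEff field of
/proc/self/status; effective_caps() and has_capability() are
hypothetical helper names.

```python
CAP_SYS_ADMIN = 21      # Linux capability numbers from <linux/capability.h>
CAP_SYS_RESOURCE = 24


def effective_caps(status_text):
    """Extract the effective capability bitmask from /proc/<pid>/status text."""
    for line in status_text.splitlines():
        if line.startswith("CapEff:"):
            return int(line.split()[1], 16)
    return 0


def has_capability(cap_mask, cap):
    """Return True if bit number 'cap' is set in the capability mask."""
    return bool((cap_mask >> cap) & 1)


try:
    with open("/proc/self/status") as f:
        mask = effective_caps(f.read())
except OSError:
    mask = 0  # not on Linux, or /proc is unavailable

print("CAP_SYS_RESOURCE:", has_capability(mask, CAP_SYS_RESOURCE))
print("CAP_SYS_ADMIN:   ", has_capability(mask, CAP_SYS_ADMIN))
```

A process can hold CAP_SYS_RESOURCE without CAP_SYS_ADMIN, which is the
narrower grant being argued for here.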

Cheers,

Michael


2010-06-03 08:29:30

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

Hi Jens,

On Thu, Jun 3, 2010 at 9:58 AM, Michael Kerrisk
<[email protected]> wrote:
> Hi Jens,
>
> On Thu, Jun 3, 2010 at 9:48 AM, Michael Kerrisk
> <[email protected]> wrote:
>> On Thu, Jun 3, 2010 at 9:05 AM, Michael Kerrisk
>> <[email protected]> wrote:
>>> On Thu, Jun 3, 2010 at 9:01 AM, Jens Axboe <[email protected]> wrote:
>>>> On Thu, Jun 03 2010, Michael Kerrisk wrote:
>>>>> Hi Jens,
>>>>>
>>>>> On Thu, Jun 3, 2010 at 8:10 AM, Jens Axboe <[email protected]> wrote:
>>>>> > On Wed, Jun 02 2010, Michael Kerrisk wrote:
>>>>> >> On Tue, Jun 1, 2010 at 9:45 AM, Jens Axboe <[email protected]> wrote:
>>>>> >> > On Thu, May 27 2010, Michael Kerrisk wrote:
>>>>> >> >> Jens,
>>>>> >> >>
>>>>> >> >> On Mon, May 24, 2010 at 7:56 PM, Jens Axboe <[email protected]> wrote:
>>>>> >> >> > On Mon, May 24 2010, Michael Kerrisk wrote:
>>>>> >> >> >> On Mon, May 24, 2010 at 7:35 PM, Jens Axboe <[email protected]> wrote:
>>>>> >> >> >> > On Mon, May 24 2010, Michael Kerrisk wrote:
>>>>> >> >> >> >> > Right, that looks like a thinko.
>>>>> >> >> >> >> >
>>>>> >> >> >> >> > I'll submit a patch changing it to bytes and the agreed API and fix this
>>>>> >> >> >> >> > -Eerror. Thanks for your comments and suggestions!
>>>>> >> >> >> >>
>>>>> >> >> >> >> Thanks. And of course you are welcome. (Please CC linux-api@vger on
>>>>> >> >> >> >> this patch, and on all patches that change the API/ABI.)
>>>>> >> >> >> >
>>>>> >> >> >> > The first change is this:
>>>>> >> >> >> >
>>>>> >> >> >> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=0191f8697bbdfefcd36e7b8dc3eeddfe82893e4b
>>>>> >> >> >> >
>>>>> >> >> >> > and the one dealing with the pages vs bytes API is this:
>>>>> >> >> >> >
>>>>> >> >> >> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=b9598db3401282bb27b4aef77e3eee12015f7f29
>>>>> >> >> >> >
>>>>> >> >> >> > Not tested yet, will do so before sending in of course.
>>>>> >> >> >>
>>>>> >> >> >> Eyeballing it quickly, these changes look right.
>>>>> >> >> >
>>>>> >> >> > Good, thanks.
>>>>> >> >> >
>>>>> >> >> >> Do you have some test programs you can make available?
>>>>> >> >> >
>>>>> >> >> > Actually I don't, I test it by modifying fio's splice engine to set/get
>>>>> >> >> > the pipe size and test the resulting transfers.
>>>>> >> >>
>>>>> >> >> An afterthought. Do there not also need to be fixes to the /proc
>>>>> >> >> interfaces? I don't think they were included in your revised patches.
>>>>> >> >
>>>>> >> > I think the proc part can be sanely left in pages, since it's just a
>>>>> >> > memory limiter.
>>>>> >>
>>>>> >> I can't see any advantage to using two different units for these
>>>>> >> closely related APIs, and it does seem like it could be a source of
>>>>> >> confusion. Similar APIs that I can think of like RLIMIT_MEMLOCK and
>>>>> >> shmget() SHMMAX that impose per-process memory-related limits use
>>>>> >> bytes. Best to be consistent, don't you think?
>>>>> >
>>>>> > But they are different interfaces. I think the 'pass in required size,
>>>>> > return actual size' where actual size is >= required size makes sense
>>>>> > for the syscall part, but for an "admin" interface it is more logical to
>>>>> > deal in pages. Perhaps that's just me and the average admin does not
>>>>> > agree. So while it's just detail, it's also an interface so has some
>>>>> > importance. And if there's consensus that bytes is a cleaner interface
>>>>> > on the proc side as well, then let's change it.
>>>>>
>>>>> I'll add one more datapoint to those that I already mentioned.
>>>>> RLIMIT_STACK and RLIMIT_DATA (getrlimit()) are also expressed in bytes.
>>>>>
>>>>> There was only one vaguely related limit that I could find that
>>>>> measured things in pages. Consider these two System V shared memory
>>>>> limits:
>>>>>
>>>>> SHMMAX
>>>>> This is the maximum size (in bytes) of a shared memory segment.
>>>>>
>>>>> SHMALL
>>>>> This is a system-wide limit on the total number of pages of shared memory.
>>>>>
>>>>> But in a way this almost confirms my point. SHMMAX is a limit that
>>>>> governs the behavior of individual processes (like your /proc file),
>>>>> while SHMALL is a limit that governs the behavior of the system as a
>>>>> whole. There is a (sort of) logic to using bytes for one and pages for
>>>>> the other.
>>>>>
>>>>> I think that I've said all I need to say on the topic. I'm inclined to
>>>>> think your /proc file should use bytes, since that seems consistent
>>>>> with other similar APIs. Others may confirm, or someone else might
>>>>> have a different insight.
>>>>
>>>> I'll commit a patch to change it to bytes.
>>>
>>> Thanks Jens.
>>
>> Since I'm going to document the /proc file, it occurred to me... What
>> are you going to call this file now? "pipe_max_pages" no longer makes
>> sense. "pipe_size_ceiling" may be more expressive than simply
>> "pipe_max".
>
> So, I'm looking at this interface still more closely now. How about
> using CAP_SYS_RESOURCE, rather than the hugely overloaded
> CAP_SYS_ADMIN as the governor for the capability check? Again, it's
> about consistency. Here's what CAP_SYS_RESOURCE currently governs:
>
>       CAP_SYS_RESOURCE
>              * Use reserved space on ext2 file systems;
>              * make ioctl(2) calls controlling ext3 journaling;
>              * override disk quota limits;
>              * increase resource limits (see setrlimit(2));
>              * override RLIMIT_NPROC resource limit;
>              * raise msg_qbytes limit for a System V message queue above the
>                limit in /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2)).
>
> Including the pipe size limit in this list makes sense.

Another question: What happens if we adjust the capacity of a pipe to
a value that is smaller than the number of bytes currently in the
pipe?

Cheers,

Michael

2010-06-03 08:53:59

by Michael Kerrisk

Subject: Re: [patch] pipe: add support for shrinking and growing pipes

On Thu, Jun 3, 2010 at 10:29 AM, Michael Kerrisk
<[email protected]> wrote:
> Hi Jens,
>
> On Thu, Jun 3, 2010 at 9:58 AM, Michael Kerrisk
> <[email protected]> wrote:
>> Hi Jens,
>>
>> On Thu, Jun 3, 2010 at 9:48 AM, Michael Kerrisk
>> <[email protected]> wrote:
>>> On Thu, Jun 3, 2010 at 9:05 AM, Michael Kerrisk
>>> <[email protected]> wrote:
>>>> On Thu, Jun 3, 2010 at 9:01 AM, Jens Axboe <[email protected]> wrote:
>>>>> On Thu, Jun 03 2010, Michael Kerrisk wrote:
>>>>>> Hi Jens,
>>>>>>
>>>>>> On Thu, Jun 3, 2010 at 8:10 AM, Jens Axboe <[email protected]> wrote:
>>>>>> > On Wed, Jun 02 2010, Michael Kerrisk wrote:
>>>>>> >> On Tue, Jun 1, 2010 at 9:45 AM, Jens Axboe <[email protected]> wrote:
>>>>>> >> > On Thu, May 27 2010, Michael Kerrisk wrote:
>>>>>> >> >> Jens,
>>>>>> >> >>
>>>>>> >> >> On Mon, May 24, 2010 at 7:56 PM, Jens Axboe <[email protected]> wrote:
>>>>>> >> >> > On Mon, May 24 2010, Michael Kerrisk wrote:
>>>>>> >> >> >> On Mon, May 24, 2010 at 7:35 PM, Jens Axboe <[email protected]> wrote:
>>>>>> >> >> >> > On Mon, May 24 2010, Michael Kerrisk wrote:
>>>>>> >> >> >> >> > Right, that looks like a thinko.
>>>>>> >> >> >> >> >
>>>>>> >> >> >> >> > I'll submit a patch changing it to bytes and the agreed API and fix this
>>>>>> >> >> >> >> > -Eerror. Thanks for your comments and suggestions!
>>>>>> >> >> >> >>
>>>>>> >> >> >> >> Thanks. And of course you are welcome. (Please CC linux-api@vger on
>>>>>> >> >> >> >> this patch (and all patches that change the API/ABI).
>>>>>> >> >> >> >
>>>>>> >> >> >> > The first change is this:
>>>>>> >> >> >> >
>>>>>> >> >> >> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=0191f8697bbdfefcd36e7b8dc3eeddfe82893e4b
>>>>>> >> >> >> >
>>>>>> >> >> >> > and the one dealing with the pages vs bytes API is this:
>>>>>> >> >> >> >
>>>>>> >> >> >> > http://git.kernel.dk/?p=linux-2.6-block.git;a=commit;h=b9598db3401282bb27b4aef77e3eee12015f7f29
>>>>>> >> >> >> >
>>>>>> >> >> >> > Not tested yet, will do so before sending in of course.
>>>>>> >> >> >>
>>>>>> >> >> >> Eyeballing it quickly, these changes look right.
>>>>>> >> >> >
>>>>>> >> >> > Good, thanks.
>>>>>> >> >> >
>>>>>> >> >> >> Do you have some test programs you can make available?
>>>>>> >> >> >
>>>>>> >> >> > Actually I don't, I test it by modifying fio's splice engine to set/get
>>>>>> >> >> > the pipe size and test the resulting transfers.
>>>>>> >> >>
>>>>>> >> >> An afterthought. Do there not also need to be fixes to the /proc
>>>>>> >> >> interfaces? I don't think they were included in your revised patches.
>>>>>> >> >
>>>>>> >> > I think the proc part can be sanely left in pages, since it's just a
>>>>>> >> > memory limiter.
>>>>>> >>
>>>>>> >> I can't see any advantage to using two different units for these
>>>>>> >> closely related APIs, and it does seem like it could be a source of
>>>>>> >> confusion. Similar APIs that I can think of like RLIMIT_MEMLOCK and
>>>>>> >> shmget() SHMMAX that impose per-process memory-related limits use
>>>>>> >> bytes. Best to be consistent, don't you think?
>>>>>> >
>>>>>> > But they are different interfaces. I think the 'pass in required size,
>>>>>> > return actual size' where actual size is >= required size makes sense
>>>>>> > for the syscall part, but for an "admin" interface it is more logical to
>>>>>> > deal in pages. Perhaps that's just me and the average admin does not
>>>>>> > agree. So while it's just detail, it's also an interface so has some
>>>>>> > importance. And if there's consensus that bytes is a cleaner interface
>>>>>> > on the proc side as well, then let's change it.
>>>>>>
>>>>>> I'll add one more datapoint to those that I already mentioned.
>>>>>> RLIMIT_STACK and RLIMIT_DATA (getrlimit()) are also expressed in bytes.
>>>>>>
>>>>>> There was only one vaguely related limit that I could find that
>>>>>> measured things in pages. Consider these two System V shared memory
>>>>>> limits:
>>>>>>
>>>>>> SHMMAX
>>>>>> This is the maximum size (in bytes) of a shared memory segment.
>>>>>>
>>>>>> SHMALL
>>>>>> This is a system-wide limit on the total number of pages of shared memory.
>>>>>>
>>>>>> But in a way this almost confirms my point. SHMMAX is a limit that
>>>>>> governs the behavior of individual processes (like your /proc file),
>>>>>> while SHMALL is a limit that governs the behavior of the system as a
>>>>>> whole. There is a (sort of) logic to using bytes for one and pages for
>>>>>> the other.
>>>>>>
>>>>>> I think that I've said all I need to say on the topic. I'm inclined to
>>>>>> think your /proc file should use bytes, since that seems consistent
>>>>>> with other similar APIs. Others may confirm, or someone else might
>>>>>> have a different insight.
>>>>>
>>>>> I'll commit a patch to change it to bytes.
>>>>
>>>> Thanks Jens.
>>>
>>> Since I'm going to document the /proc file, it occurred to me... What
>>> are you going to call this file now? "pipe_max_pages" no longer makes
>>> sense. "pipe_size_ceiling" may be more expressive than simply
>>> "pipe_max".
>>
>> So, I'm looking at this interface still more closely now. How about
>> using CAP_SYS_RESOURCE, rather than the hugely overloaded
>> CAP_SYS_ADMIN as the governor for the capability check? Again, it's
>> about consistency. Here's what CAP_SYS_RESOURCE currently governs:
>>
>>       CAP_SYS_RESOURCE
>>              * Use reserved space on ext2 file systems;
>>              * make ioctl(2) calls controlling ext3 journaling;
>>              * override disk quota limits;
>>              * increase resource limits (see setrlimit(2));
>>              * override RLIMIT_NPROC resource limit;
>>              * raise msg_qbytes limit for a System V message queue above the
>>                limit in /proc/sys/kernel/msgmnb (see msgop(2) and msgctl(2)).
>>
>> Including the pipe size limit in this list makes sense.
>
> Another question: What happens if we adjust the capacity of a pipe to
> a value that is smaller than the number of bytes currently in the
> pipe?

Hi Jens,

This is my first cut at trying to describe the new interface:

[[
Starting with kernel 2.6.35, Linux allows the capacity of a pipe to be
modified. The Linux-specific fcntl(fd, F_SETPIPE_SZ, size) call
changes the capacity of the pipe referred to by fd to be at least size
bytes. An unprivileged process can change the pipe capacity to any
value in the range from the system page size up to the value in
/proc/sys/fs/pipe-max-pages???. The default value for
pipe-max-pages???? is ???? bytes. A privileged (CAP_SYS_RESOURCE???)
process can override this limit. When allocating space for the pipe,
the kernel may round size up to some value convenient for the
implementation. (In the initial implementation, size is rounded up to
the next power-of-two multiple of the system page size.) The fcntl(fd,
F_GETPIPE_SZ) call returns the actual size allocated for the pipe
referred to by fd.
]]
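
For anyone who wants to try the calls described in that text, here is a
minimal sketch of how they might be used from user space. It assumes a
kernel that provides F_SETPIPE_SZ and F_GETPIPE_SZ; resize_pipe() is a
hypothetical helper name of my own, and the fallback constants are the
values used in the kernel's <linux/fcntl.h>, supplied only for C library
headers that do not define them yet.

```c
#define _GNU_SOURCE
#include <fcntl.h>
#include <unistd.h>

/* Fallbacks for older headers (assumption: values as in <linux/fcntl.h>). */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif
#ifndef F_GETPIPE_SZ
#define F_GETPIPE_SZ 1032
#endif

/*
 * Hypothetical helper: ask for a capacity of at least `size` bytes on the
 * pipe referred to by fd, then read back what the kernel actually allocated.
 * Returns the capacity in bytes, or -1 on error with errno set by fcntl().
 */
static long resize_pipe(int fd, int size)
{
    if (fcntl(fd, F_SETPIPE_SZ, size) == -1)
        return -1;
    /* The kernel may have rounded the request up, so query the real size. */
    return fcntl(fd, F_GETPIPE_SZ);
}
```

A caller would create a pipe with pipe(2) and pass either end to
resize_pipe(); the value that comes back should be compared with ">="
against the request rather than "==", since the allocated size may be
larger than what was asked for.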

Obviously there are a few things that are still not final, such as
* the resource name,
* the name of the /proc file, and
* the behavior if 'size' is less than the number of bytes currently
  in the pipe.
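
The open question about 'size' being smaller than the data already in the
pipe can be probed empirically. The following is only a sketch under my
own assumptions: probe_shrink() is a hypothetical helper, it uses a
non-blocking pipe so that filling can never hang, and it deliberately
makes no claim about which result a given kernel produces.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Fallback for older headers (assumption: value as in <linux/fcntl.h>). */
#ifndef F_SETPIPE_SZ
#define F_SETPIPE_SZ 1031
#endif

/*
 * Hypothetical probe: write up to `fill` bytes into a fresh pipe, then ask
 * for a capacity of one page. Returns whatever fcntl() returned; if that is
 * -1, *err holds the errno value from the failing call (0 otherwise).
 */
static int probe_shrink(size_t fill, int *err)
{
    int p[2];
    char chunk[512];
    size_t done = 0;
    int ret;

    *err = 0;
    if (pipe2(p, O_NONBLOCK) == -1) {
        *err = errno;
        return -1;
    }

    memset(chunk, 'x', sizeof(chunk));
    while (done < fill) {
        size_t left = fill - done;
        size_t want = left < sizeof(chunk) ? left : sizeof(chunk);
        ssize_t n = write(p[1], chunk, want);
        if (n <= 0)
            break;              /* pipe is full or write failed: stop */
        done += (size_t)n;
    }

    /* Try to shrink the pipe below the amount of data it now holds. */
    ret = fcntl(p[1], F_SETPIPE_SZ, (int)sysconf(_SC_PAGESIZE));
    if (ret == -1)
        *err = errno;           /* save before close() can change errno */

    close(p[0]);
    close(p[1]);
    return ret;
}
```

Calling it with a fill of, say, three pages and printing the return value
together with strerror(err) shows what the kernel under test does, which
is the detail the manual page text still needs.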

But other than that, how does this look?

Cheers,

Michael