2014-02-06 10:30:31

by Madars Vitolins

Subject: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

Hi Folks,

I have recently ported my multi-process application (a classical open
system) which uses POSIX queues for IPC to one of the latest Linux kernels,
and I have hit the issue that the maximum number of queues has been
dramatically limited down to 1024 (see include/linux/ipc_namespace.h,
#define HARD_QUEUESMAX 1024).

Previously the maximum number of queues was INT_MAX (2147483647 on a
64-bit system).

This change imposes a painful limit on our multi-process application. Each
process opens its own set of queues (usually about 3-5 queues per process),
and in some scenarios we might run 3000 processes or more (which of course
is not a problem for Linux). Thus we might need 9000 queues or more, all
running under one user.
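To illustrate, the per-process pattern is roughly the following minimal
sketch (queue names and attributes here are made up, not our real ones):

/* Each worker process creates its own small set of POSIX queues. */
#include <fcntl.h>
#include <mqueue.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
        struct mq_attr attr = { .mq_maxmsg = 10, .mq_msgsize = 1024 };
        char name[64];
        mqd_t q[5];

        for (int i = 0; i < 5; i++) {
                snprintf(name, sizeof(name), "/worker-%d-q%d", getpid(), i);
                q[i] = mq_open(name, O_CREAT | O_RDWR, 0600, &attr);
                if (q[i] == (mqd_t)-1) {
                        /* With ~3000 such processes, a per-namespace
                         * queues_max of 1024 is exhausted long before
                         * all workers have their queues. */
                        perror("mq_open");
                        return 1;
                }
        }
        /* ... exchange messages with peers, mq_close()/mq_unlink() on exit ... */
        return 0;
}

(Compile with -lrt on older glibc.)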

With this new cap our software no longer fits, and we are getting into
trouble. We could patch the kernel manually, but not all customers are
able or willing to do that.

Thus I *kindly* ask you to raise this limit to something like 1M queues or
more (or to the technical limit, i.e. keep it at INT_MAX). If a user can
screw up the system by setting or using such maximums, leave that to the
user: whoever does the system tuning is responsible for the kernel
parameters.

The kernel limit was introduced by:
-
http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93e6f119c0ce8a1bba6e81dc8dd97d67be360844

Other people are also reporting issues with this, see:
- https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695 - for
them some database software stopped working after the kernel upgrade...

I also suspect that when people upgrade from RHEL 5 or RHEL 6 to later
versions that carry this hard limit, many more will report problems with
it...

Thanks a lot in advance,
Madars Vitolins



2014-02-07 16:27:55

by Madars Vitolins

Subject: Re: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

Hello!

Any comments or concerns about this?

many thanks,
Madars

> Hi Folks,
>
> I have recently ported my multi-process application (like a classical open
> system) which uses POSIX Queues as IPC to one of the latest Linux kernels,
> and I have faced issue that number of maximum queues are dramatically
> limited down to 1024 (see include/linux/ipc_namespace.h, #define
> HARD_QUEUESMAX 1024).
>
> Previously the max number of queues was INT_MAX (on 64bit system was:
> 2147483647).
>
> This update imposes bad limits on our multi-process application. As our
> app uses approaches that each process opens its own set of queues (usually
> something about 3-5 queues per process). In some scenarios we might run up
> to 3000 processes or more (which of-course for linux is not a problem).
> Thus we might need up to 9000 queues or more. All processes run under one
> user.
>
> But now we have this limit, which limits our software down and we are
> getting in trouble. We could patch the kernel manually, but not all
> customers are capable of this and willing to do the patching.
>
> Thus I *kindly* ask you guys to increase this limit to something like 1M
> queues or more (or to technical limit i.e. leave the same INT_MAX). If
> user can screw up the system by setting or using maximums, let it leave to
> the user. As it is doing system tuning and he is responsible for kernel
> parameters.
>
> The kernel limit was introduced by:
> -
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93e6f119c0ce8a1bba6e81dc8dd97d67be360844
>
> Also I see other people are claiming issues with this, see:
> - https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695 - for
> them some database software is not working after the kernel upgrade...
>
> Also I think that when people will upgrade from RHEL 5 or RHEL 6 to next
> versions where this hard limit will be defined, I suspect that many will
> claim problem about it...
>
> Thanks a lot in advance,
> Madars Vitolins
>
>
>

2014-02-07 20:11:13

by Davidlohr Bueso

Subject: Re: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

On Thu, 2014-02-06 at 12:21 +0200, [email protected] wrote:
> Hi Folks,
>
> I have recently ported my multi-process application (like a classical open
> system) which uses POSIX Queues as IPC to one of the latest Linux kernels,
> and I have faced issue that number of maximum queues are dramatically
> limited down to 1024 (see include/linux/ipc_namespace.h, #define
> HARD_QUEUESMAX 1024).
>
> Previously the max number of queues was INT_MAX (on 64bit system was:
> 2147483647).

Hmm yes, 1024 is quite unrealistic for some workloads and breaks
userspace - I don't see any reason for _this_ specific value in the
changelog or related changes in the patchset that introduced commits
93e6f119 and 02967ea0. And the fact that this limit is per namespace
makes no difference really. Hell, if nothing else, the mq_overview(7)
manpage description is evidence enough. For privileged users:

The default value for queues_max is 256; it can be changed to any value in the range 0 to INT_MAX.
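So a privileged process that takes the manpage at its word and tries to
raise the limit now gets an error. Roughly (untested sketch, assuming the
clamp rejects out-of-range writes the way proc_dointvec_minmax normally
does):

/* Attempt to set queues_max above 1024, as mq_overview(7) says we can. */
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>

int main(void)
{
        const char *val = "1000000";
        int fd = open("/proc/sys/fs/mqueue/queues_max", O_WRONLY);

        if (fd < 0 || write(fd, val, strlen(val)) < 0)
                perror("queues_max");   /* EINVAL on kernels with the 1024 clamp */
        if (fd >= 0)
                close(fd);
        return 0;
}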

>
> This update imposes bad limits on our multi-process application. As our
> app uses approaches that each process opens its own set of queues (usually
> something about 3-5 queues per process). In some scenarios we might run up
> to 3000 processes or more (which of-course for linux is not a problem).
> Thus we might need up to 9000 queues or more. All processes run under one
> user.
>
> But now we have this limit, which limits our software down and we are
> getting in trouble. We could patch the kernel manually, but not all
> customers are capable of this and willing to do the patching.
>
> Thus I *kindly* ask you guys to increase this limit to something like 1M
> queues or more (or to technical limit i.e. leave the same INT_MAX). If
> user can screw up the system by setting or using maximums, let it leave to
> the user. As it is doing system tuning and he is responsible for kernel
> parameters.
>
> The kernel limit was introduced by:
> -
> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93e6f119c0ce8a1bba6e81dc8dd97d67be360844
>
> Also I see other people are claiming issues with this, see:
> - https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695 - for
> them some database software is not working after the kernel upgrade...

Surprised we didn't hear about this earlier from Michael Kerrisk. At least
the upstream manpages haven't been updated to reflect this new behavior;
that would have been the wrong way to go.

>
> Also I think that when people will upgrade from RHEL 5 or RHEL 6 to next
> versions where this hard limit will be defined, I suspect that many will
> claim problem about it...

Agreed, RHEL 7 will ship with some baseline version of the 3.10 kernel
and users will be exposed to this. Of course, the same goes for just
about any distro, and Ubuntu users are already complaining about it.

I believe that instead of bumping up this HARD limit of 1024, we should
go back to the original behavior. If we just increase it instead, then
how high is high enough?

Thanks,
Davidlohr

2014-02-07 21:24:32

by Doug Ledford

Subject: Re: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

On 2/7/2014 3:11 PM, Davidlohr Bueso wrote:
> On Thu, 2014-02-06 at 12:21 +0200, [email protected] wrote:
>> Hi Folks,
>>
>> I have recently ported my multi-process application (like a classical open
>> system) which uses POSIX Queues as IPC to one of the latest Linux kernels,
>> and I have faced issue that number of maximum queues are dramatically
>> limited down to 1024 (see include/linux/ipc_namespace.h, #define
>> HARD_QUEUESMAX 1024).
>>
>> Previously the max number of queues was INT_MAX (on 64bit system was:
>> 2147483647).
>
> Hmm yes, 1024 is quite unrealistic for some workloads and breaks
> userspace - I don't see any reasons for _this_ specific value in the
> changelog or related changes in the patchset that introduced commits
> 93e6f119 and 02967ea0.

There wasn't a specific selection of that number other than a general
attempt to make the max more reasonable (INT_MAX isn't really reasonable
given the overhead of each individual queue, even if the per-queue message
count and max message size are small).

> And the fact that this limit is per namespace
> makes no difference really. Hell, if nothing else, the mq_overview(7)
> manpage description is evidence enough. For privileged users:
>
> The default value for queues_max is 256; it can be changed to any value in the range 0 to INT_MAX.

That was obviously never updated to match the change.

In hindsight, I'm not sure we really even care though. Since the limit
on queues is per namespace, and we can make as many namespaces as we
want, the limit is more or less meaningless and only serves as a
nuisance to people. Since we have accounting on a per user basis that
spans across namespaces and across queues, maybe that should be
sufficient and the limit on queues should simply be removed and we
should instead just rely on memory limits. When the user has exhausted
their allowed memory usage, whether by large queue sizes, large message
sizes, or large queue counts, then they are done. When they haven't,
they can keep allocating. That would make things considerably easier and
would avoid the breakage we are talking about here.
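Roughly, the knob left doing the work would be RLIMIT_MSGQUEUE, which is
already per user. A minimal sketch of what I mean (illustrative only):

/* The per-user byte budget for POSIX message queues. Once the accounted
 * overhead of a user's queues reaches this, mq_open() fails -- no matter
 * how many or how few queues that turns out to be. */
#include <stdio.h>
#include <sys/resource.h>

int main(void)
{
        struct rlimit rl;

        getrlimit(RLIMIT_MSGQUEUE, &rl);
        printf("mqueue bytes allowed for this user: %llu\n",
               (unsigned long long)rl.rlim_cur);
        return 0;
}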

>>
>> This update imposes bad limits on our multi-process application. As our
>> app uses approaches that each process opens its own set of queues (usually
>> something about 3-5 queues per process). In some scenarios we might run up
>> to 3000 processes or more (which of-course for linux is not a problem).
>> Thus we might need up to 9000 queues or more. All processes run under one
>> user.
>>
>> But now we have this limit, which limits our software down and we are
>> getting in trouble. We could patch the kernel manually, but not all
>> customers are capable of this and willing to do the patching.
>>
>> Thus I *kindly* ask you guys to increase this limit to something like 1M
>> queues or more (or to technical limit i.e. leave the same INT_MAX).

Technically, INT_MAX isn't (and never was) a valid limit. Because the
queue overhead memory size is accounted against the user when creating a
queue, they can never effectively get to INT_MAX whether it's allowed or
not.
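Just as a rough sense of scale: with the default attributes of 10 messages
of 8192 bytes each, a queue accounts on the order of 80KB against the user,
so the default RLIMIT_MSGQUEUE of roughly 800KB is used up after about ten
queues unless the admin raises the rlimit -- nowhere near INT_MAX.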

>> If
>> user can screw up the system by setting or using maximums, let it leave to
>> the user. As it is doing system tuning and he is responsible for kernel
>> parameters.
>>
>> The kernel limit was introduced by:
>> -
>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93e6f119c0ce8a1bba6e81dc8dd97d67be360844
>>
>> Also I see other people are claiming issues with this, see:
>> - https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695 - for
>> them some database software is not working after the kernel upgrade...
>
> Surprised we didn't hear about this earlier by Michael Kerrisk. At least
> the upstream manpages haven't been updated to reflect this new behavior,
> it would have been the wrong way to go.
>
>>
>> Also I think that when people will upgrade from RHEL 5 or RHEL 6 to next
>> versions where this hard limit will be defined, I suspect that many will
>> claim problem about it...
>
> Agreed, RHEL 7 will ship with some baseline version of the 3.10 kernel
> and users will be exposed to this. Of course, the same goes for just
> about any distro, and Ubuntu users are already complaining about it.
>
> I believe that instead of bumping up this HARD limit of 1024, we should
> go back to the original behavior. If we just increase it, instead, then
> how high is high enough?

I myself think it can be removed entirely. The memory limit is really
all we need to worry about, unless Viro comes back and says hundreds of
thousands of queues in a single namespace will kill queue lookup or
something like that.





2014-02-08 22:39:37

by Madars Vitolins

Subject: Re: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

Yes, limiting the queue count by available memory sounds correct!

Thanks,
Madars

> On 2/7/2014 3:11 PM, Davidlohr Bueso wrote:
>> On Thu, 2014-02-06 at 12:21 +0200, [email protected] wrote:
>>> Hi Folks,
>>>
>>> I have recently ported my multi-process application (like a classical
>>> open
>>> system) which uses POSIX Queues as IPC to one of the latest Linux
>>> kernels,
>>> and I have faced issue that number of maximum queues are dramatically
>>> limited down to 1024 (see include/linux/ipc_namespace.h, #define
>>> HARD_QUEUESMAX 1024).
>>>
>>> Previously the max number of queues was INT_MAX (on 64bit system was:
>>> 2147483647).
>>
>> Hmm yes, 1024 is quite unrealistic for some workloads and breaks
>> userspace - I don't see any reasons for _this_ specific value in the
>> changelog or related changes in the patchset that introduced commits
>> 93e6f119 and 02967ea0.
>
> There wasn't a specific selection of that number other than a general
> attempt to make the max more reasonable (INT_MAX isn't really reasonable
> given the overhead of each individual queue, even if the queue number
> and max msg size are small).
>
>> And the fact that this limit is per namespace
>> makes no difference really. Hell, if nothing else, the mq_overview(7)
>> manpage description is evidence enough. For privileged users:
>>
>> The default value for queues_max is 256; it can be changed to any value
>> in the range 0 to INT_MAX.
>
> That was obviously never updated to match the change.
>
> In hindsight, I'm not sure we really even care though. Since the limit
> on queues is per namespace, and we can make as many namespaces as we
> want, the limit is more or less meaningless and only serves as a
> nuisance to people. Since we have accounting on a per user basis that
> spans across namespaces and across queues, maybe that should be
> sufficient and the limit on queues should simply be removed and we
> should instead just rely on memory limits. When the user has exhausted
> their allowed memory usage, whether by large queue sizes, large message
> sizes, or large queue counts, then they are done. When they haven't,
> they can keep allocating. Would make things considerably easier and
> would avoid the breakage we are talking about here.
>
>>>
>>> This update imposes bad limits on our multi-process application. As our
>>> app uses approaches that each process opens its own set of queues
>>> (usually
>>> something about 3-5 queues per process). In some scenarios we might run
>>> up
>>> to 3000 processes or more (which of-course for linux is not a problem).
>>> Thus we might need up to 9000 queues or more. All processes run under
>>> one
>>> user.
>>>
>>> But now we have this limit, which limits our software down and we are
>>> getting in trouble. We could patch the kernel manually, but not all
>>> customers are capable of this and willing to do the patching.
>>>
>>> Thus I *kindly* ask you guys to increase this limit to something like
>>> 1M
>>> queues or more (or to technical limit i.e. leave the same INT_MAX).
>
> Technically, INT_MAX isn't (and never was) a valid limit. Because the
> queue overhead memory size is accounted against the user when creating a
> queue, they can never effectively get to INT_MAX whether it's allowed or
> not.
>
>>> If
>>> user can screw up the system by setting or using maximums, let it leave
>>> to
>>> the user. As it is doing system tuning and he is responsible for kernel
>>> parameters.
>>>
>>> The kernel limit was introduced by:
>>> -
>>> http://git.kernel.org/cgit/linux/kernel/git/torvalds/linux.git/commit/?id=93e6f119c0ce8a1bba6e81dc8dd97d67be360844
>>>
>>> Also I see other people are claiming issues with this, see:
>>> - https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695 - for
>>> them some database software is not working after the kernel upgrade...
>>
>> Surprised we didn't hear about this earlier by Michael Kerrisk. At least
>> the upstream manpages haven't been updated to reflect this new behavior,
>> it would have been the wrong way to go.
>>
>>>
>>> Also I think that when people will upgrade from RHEL 5 or RHEL 6 to
>>> next
>>> versions where this hard limit will be defined, I suspect that many
>>> will
>>> claim problem about it...
>>
>> Agreed, RHEL 7 will ship with some baseline version of the 3.10 kernel
>> and users will be exposed to this. Of course, the same goes for just
>> about any distro, and Ubuntu users are already complaining about it.
>>
>> I believe that instead of bumping up this HARD limit of 1024, we should
>> go back to the original behavior. If we just increase it, instead, then
>> how high is high enough?
>
> I think it can be removed entirely myself. The memory limit is really
> all we need worry about unless Viro comes back and says 100s of
> thousands of queues in a single namespace will kill queue lookup or
> something like that.
>
>
>
>

2014-02-09 04:17:06

by Davidlohr Bueso

Subject: Re: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

On Fri, 2014-02-07 at 16:24 -0500, Doug Ledford wrote:
> On 2/7/2014 3:11 PM, Davidlohr Bueso wrote:
> > On Thu, 2014-02-06 at 12:21 +0200, [email protected] wrote:
> >> Hi Folks,
> >>
> >> I have recently ported my multi-process application (like a classical open
> >> system) which uses POSIX Queues as IPC to one of the latest Linux kernels,
> >> and I have faced issue that number of maximum queues are dramatically
> >> limited down to 1024 (see include/linux/ipc_namespace.h, #define
> >> HARD_QUEUESMAX 1024).
> >>
> >> Previously the max number of queues was INT_MAX (on 64bit system was:
> >> 2147483647).
> >
> > Hmm yes, 1024 is quite unrealistic for some workloads and breaks
> > userspace - I don't see any reasons for _this_ specific value in the
> > changelog or related changes in the patchset that introduced commits
> > 93e6f119 and 02967ea0.
>
> There wasn't a specific selection of that number other than a general
> attempt to make the max more reasonable (INT_MAX isn't really reasonable
> given the overhead of each individual queue, even if the queue number
> and max msg size are small).
>
> > And the fact that this limit is per namespace
> > makes no difference really. Hell, if nothing else, the mq_overview(7)
> > manpage description is evidence enough. For privileged users:
> >
> > The default value for queues_max is 256; it can be changed to any value in the range 0 to INT_MAX.
>
> That was obviously never updated to match the change.
>
> In hindsight, I'm not sure we really even care though. Since the limit
> on queues is per namespace, and we can make as many namespaces as we
> want, the limit is more or less meaningless and only serves as a
> nuisance to people.

Yes, but namespaces aren't _that_ popular in reality, especially not for
the kind of workaround you describe.
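That "workaround" would be something like the sketch below -- every
application unsharing its own IPC namespace just to get a fresh
per-namespace count, which also needs CAP_SYS_ADMIN. Hardly something we
can ask regular userspace to do (untested, illustrative only):

#define _GNU_SOURCE
#include <fcntl.h>
#include <mqueue.h>
#include <sched.h>
#include <stdio.h>

int main(void)
{
        /* New IPC namespace: queues created from here on are counted
         * against this namespace's queues_max, not the original one. */
        if (unshare(CLONE_NEWIPC)) {
                perror("unshare");      /* needs CAP_SYS_ADMIN */
                return 1;
        }

        mqd_t q = mq_open("/fresh-ns-queue", O_CREAT | O_RDWR, 0600, NULL);
        if (q == (mqd_t)-1)
                perror("mq_open");
        return 0;
}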

> Since we have accounting on a per user basis that
> spans across namespaces and across queues, maybe that should be
> sufficient and the limit on queues should simply be removed and we
> should instead just rely on memory limits. When the user has exhausted
> their allowed memory usage, whether by large queue sizes, large message
> sizes, or large queue counts, then they are done. When they haven't,
> they can keep allocating. Would make things considerably easier and
> would avoid the breakage we are talking about here.
>

Right, and this is taken care of in mqueue_get_inode().

The (untested) code below simply removes this global limit; let me know
if you're okay with it and I'll send a formal/tested patch.

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index e7831d2..d78a09f 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -120,7 +120,6 @@ extern int mq_init_ns(struct ipc_namespace *ns);
*/
#define MIN_QUEUESMAX 1
#define DFLT_QUEUESMAX 256
-#define HARD_QUEUESMAX 1024
#define MIN_MSGMAX 1
#define DFLT_MSG 10U
#define DFLT_MSGMAX 10
diff --git a/ipc/mq_sysctl.c b/ipc/mq_sysctl.c
index 383d638..5bb8bfe 100644
--- a/ipc/mq_sysctl.c
+++ b/ipc/mq_sysctl.c
@@ -22,6 +22,16 @@ static void *get_mq(ctl_table *table)
return which;
}

+static int proc_mq_dointvec(ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table mq_table;
+ memcpy(&mq_table, table, sizeof(mq_table));
+ mq_table.data = get_mq(table);
+
+ return proc_dointvec(&mq_table, write, buffer, lenp, ppos);
+}
+
static int proc_mq_dointvec_minmax(ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
@@ -33,12 +43,10 @@ static int proc_mq_dointvec_minmax(ctl_table *table, int write,
lenp, ppos);
}
#else
+#define proc_mq_dointvec NULL
#define proc_mq_dointvec_minmax NULL
#endif

-static int msg_queues_limit_min = MIN_QUEUESMAX;
-static int msg_queues_limit_max = HARD_QUEUESMAX;
-
static int msg_max_limit_min = MIN_MSGMAX;
static int msg_max_limit_max = HARD_MSGMAX;

@@ -51,9 +59,7 @@ static ctl_table mq_sysctls[] = {
.data = &init_ipc_ns.mq_queues_max,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_mq_dointvec_minmax,
- .extra1 = &msg_queues_limit_min,
- .extra2 = &msg_queues_limit_max,
+ .proc_handler = proc_mq_dointvec,
},
{
.procname = "msg_max",
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ccf1f9f..c3b3117 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -433,9 +433,9 @@ static int mqueue_create(struct inode *dir, struct dentry *dentry,
error = -EACCES;
goto out_unlock;
}
- if (ipc_ns->mq_queues_count >= HARD_QUEUESMAX ||
- (ipc_ns->mq_queues_count >= ipc_ns->mq_queues_max &&
- !capable(CAP_SYS_RESOURCE))) {
+
+ if (ipc_ns->mq_queues_count >= ipc_ns->mq_queues_max &&
+ !capable(CAP_SYS_RESOURCE)) {
error = -ENOSPC;
goto out_unlock;
}

2014-02-09 16:13:13

by Doug Ledford

Subject: Re: Max number of posix queues in vanilla kernel (/proc/sys/fs/mqueue/queues_max)

On 02/08/2014 11:17 PM, Davidlohr Bueso wrote:
> On Fri, 2014-02-07 at 16:24 -0500, Doug Ledford wrote:
>> On 2/7/2014 3:11 PM, Davidlohr Bueso wrote:
>>> On Thu, 2014-02-06 at 12:21 +0200, [email protected] wrote:
>>>> Hi Folks,
>>>>
>>>> I have recently ported my multi-process application (like a classical open
>>>> system) which uses POSIX Queues as IPC to one of the latest Linux kernels,
>>>> and I have faced issue that number of maximum queues are dramatically
>>>> limited down to 1024 (see include/linux/ipc_namespace.h, #define
>>>> HARD_QUEUESMAX 1024).
>>>>
>>>> Previously the max number of queues was INT_MAX (on 64bit system was:
>>>> 2147483647).
>>>
>>> Hmm yes, 1024 is quite unrealistic for some workloads and breaks
>>> userspace - I don't see any reasons for _this_ specific value in the
>>> changelog or related changes in the patchset that introduced commits
>>> 93e6f119 and 02967ea0.
>>
>> There wasn't a specific selection of that number other than a general
>> attempt to make the max more reasonable (INT_MAX isn't really reasonable
>> given the overhead of each individual queue, even if the queue number
>> and max msg size are small).
>>
>>> And the fact that this limit is per namespace
>>> makes no difference really. Hell, if nothing else, the mq_overview(7)
>>> manpage description is evidence enough. For privileged users:
>>>
>>> The default value for queues_max is 256; it can be changed to any value in the range 0 to INT_MAX.
>>
>> That was obviously never updated to match the change.
>>
>> In hindsight, I'm not sure we really even care though. Since the limit
>> on queues is per namespace, and we can make as many namespaces as we
>> want, the limit is more or less meaningless and only serves as a
>> nuisance to people.
>
> Yes, but namespaces aren't _that_ popular in reality, specially as you
> describe the workaround.
>
>> Since we have accounting on a per user basis that
>> spans across namespaces and across queues, maybe that should be
>> sufficient and the limit on queues should simply be removed and we
>> should instead just rely on memory limits. When the user has exhausted
>> their allowed memory usage, whether by large queue sizes, large message
>> sizes, or large queue counts, then they are done. When they haven't,
>> they can keep allocating. Would make things considerably easier and
>> would avoid the breakage we are talking about here.
>>
>
> Right, and this is taken care of in mqueue_get_inode().
>
> The (untested) code below simply removes this global limit, let me know
> if you're okay with it and I'll send a formal/tested patch.
>
> diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
> index e7831d2..d78a09f 100644
> --- a/include/linux/ipc_namespace.h
> +++ b/include/linux/ipc_namespace.h
> @@ -120,7 +120,6 @@ extern int mq_init_ns(struct ipc_namespace *ns);
> */
> #define MIN_QUEUESMAX 1
> #define DFLT_QUEUESMAX 256
> -#define HARD_QUEUESMAX 1024

Since you are passing the queues setting off to proc_dointvec, I don't
think the MIN_QUEUESMAX value is used any longer, so you might as well kill
it too. Otherwise, it looks acceptable to me.

2014-02-09 21:06:10

by Davidlohr Bueso

Subject: [PATCH] ipc,mqueue: remove limits for the amount of system-wide queues

From: Davidlohr Bueso <[email protected]>

Commit 93e6f119 (ipc/mqueue: cleanup definition names and locations) added
a global hardcoded limit on the number of message queues that can be created.
While this limit is per-namespace, in reality it ends up breaking userspace
applications. Historically users have, at least in theory, been able to
create up to INT_MAX queues, and limiting it to just 1024 is way too low
and dramatic for some workloads and use cases. For instance, Madars reports:

"This update imposes bad limits on our multi-process application. As our
app uses approaches that each process opens its own set of queues (usually
something about 3-5 queues per process). In some scenarios we might run up
to 3000 processes or more (which of-course for linux is not a problem).
Thus we might need up to 9000 queues or more. All processes run under one
user."

Other affected users can be found in launchpad bug #1155695:
https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695

Instead of increasing this limit, revert it entirely and fall back to the
original way of dealing with queue limits -- where once a user's resource
limit is reached, and all memory is used, new queues cannot be created.

Reported-by: [email protected]
Cc: Doug Ledford <[email protected]>
Cc: Manfred Spraul <[email protected]>
Cc: [email protected] # v3.5+
Signed-off-by: Davidlohr Bueso <[email protected]>
---
include/linux/ipc_namespace.h | 2 --
ipc/mq_sysctl.c | 18 ++++++++++++------
ipc/mqueue.c | 6 +++---
3 files changed, 15 insertions(+), 11 deletions(-)

diff --git a/include/linux/ipc_namespace.h b/include/linux/ipc_namespace.h
index e7831d2..35e7eca 100644
--- a/include/linux/ipc_namespace.h
+++ b/include/linux/ipc_namespace.h
@@ -118,9 +118,7 @@ extern int mq_init_ns(struct ipc_namespace *ns);
* the new maximum will handle anyone else. I may have to revisit this
* in the future.
*/
-#define MIN_QUEUESMAX 1
#define DFLT_QUEUESMAX 256
-#define HARD_QUEUESMAX 1024
#define MIN_MSGMAX 1
#define DFLT_MSG 10U
#define DFLT_MSGMAX 10
diff --git a/ipc/mq_sysctl.c b/ipc/mq_sysctl.c
index 383d638..5bb8bfe 100644
--- a/ipc/mq_sysctl.c
+++ b/ipc/mq_sysctl.c
@@ -22,6 +22,16 @@ static void *get_mq(ctl_table *table)
return which;
}

+static int proc_mq_dointvec(ctl_table *table, int write,
+ void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+ struct ctl_table mq_table;
+ memcpy(&mq_table, table, sizeof(mq_table));
+ mq_table.data = get_mq(table);
+
+ return proc_dointvec(&mq_table, write, buffer, lenp, ppos);
+}
+
static int proc_mq_dointvec_minmax(ctl_table *table, int write,
void __user *buffer, size_t *lenp, loff_t *ppos)
{
@@ -33,12 +43,10 @@ static int proc_mq_dointvec_minmax(ctl_table *table, int write,
lenp, ppos);
}
#else
+#define proc_mq_dointvec NULL
#define proc_mq_dointvec_minmax NULL
#endif

-static int msg_queues_limit_min = MIN_QUEUESMAX;
-static int msg_queues_limit_max = HARD_QUEUESMAX;
-
static int msg_max_limit_min = MIN_MSGMAX;
static int msg_max_limit_max = HARD_MSGMAX;

@@ -51,9 +59,7 @@ static ctl_table mq_sysctls[] = {
.data = &init_ipc_ns.mq_queues_max,
.maxlen = sizeof(int),
.mode = 0644,
- .proc_handler = proc_mq_dointvec_minmax,
- .extra1 = &msg_queues_limit_min,
- .extra2 = &msg_queues_limit_max,
+ .proc_handler = proc_mq_dointvec,
},
{
.procname = "msg_max",
diff --git a/ipc/mqueue.c b/ipc/mqueue.c
index ccf1f9f..c3b3117 100644
--- a/ipc/mqueue.c
+++ b/ipc/mqueue.c
@@ -433,9 +433,9 @@ static int mqueue_create(struct inode *dir, struct dentry *dentry,
error = -EACCES;
goto out_unlock;
}
- if (ipc_ns->mq_queues_count >= HARD_QUEUESMAX ||
- (ipc_ns->mq_queues_count >= ipc_ns->mq_queues_max &&
- !capable(CAP_SYS_RESOURCE))) {
+
+ if (ipc_ns->mq_queues_count >= ipc_ns->mq_queues_max &&
+ !capable(CAP_SYS_RESOURCE)) {
error = -ENOSPC;
goto out_unlock;
}
--
1.8.1.4


2014-02-11 18:14:18

by Doug Ledford

Subject: Re: [PATCH] ipc,mqueue: remove limits for the amount of system-wide queues

On 2/9/2014 4:06 PM, Davidlohr Bueso wrote:
> From: Davidlohr Bueso <[email protected]>
>
> Commit 93e6f119 (ipc/mqueue: cleanup definition names and locations) added
> global hardcoded limits to the amount of message queues that can be created.
> While these limits are per-namespace, reality is that it ends up breaking
> userspace applications. Historically users have, at least in theory, been able
> to create up to INT_MAX queues, and limiting it to just 1024 is way too low
> and dramatic for some workloads and use cases. For instance, Madars reports:
>
> "This update imposes bad limits on our multi-process application. As our
> app uses approaches that each process opens its own set of queues (usually
> something about 3-5 queues per process). In some scenarios we might run up
> to 3000 processes or more (which of-course for linux is not a problem).
> Thus we might need up to 9000 queues or more. All processes run under one
> user."
>
> Other affected users can be found in launchpad bug #1155695:
> https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695
>
> Instead of increasing this limit, revert it entirely and fallback to the
> original way of dealing queue limits -- where once a user's resource limit
> is reached, and all memory is used, new queues cannot be created.
>
> Reported-by: [email protected]
> Cc: Doug Ledford <[email protected]>

Acked-by: Doug Ledford <[email protected]>

> Cc: Manfred Spraul <[email protected]>
> Cc: [email protected] # v3.5+
> Signed-off-by: Davidlohr Bueso <[email protected]>
> ---
> include/linux/ipc_namespace.h | 2 --
> ipc/mq_sysctl.c | 18 ++++++++++++------
> ipc/mqueue.c | 6 +++---
> 3 files changed, 15 insertions(+), 11 deletions(-)




2014-02-11 22:16:27

by Andrew Morton

Subject: Re: [PATCH] ipc,mqueue: remove limits for the amount of system-wide queues

On Sun, 09 Feb 2014 13:06:03 -0800 Davidlohr Bueso <[email protected]> wrote:

> From: Davidlohr Bueso <[email protected]>
>
> Commit 93e6f119 (ipc/mqueue: cleanup definition names and locations) added
> global hardcoded limits to the amount of message queues that can be created.
> While these limits are per-namespace, reality is that it ends up breaking
> userspace applications. Historically users have, at least in theory, been able
> to create up to INT_MAX queues, and limiting it to just 1024 is way too low
> and dramatic for some workloads and use cases. For instance, Madars reports:
>
> "This update imposes bad limits on our multi-process application. As our
> app uses approaches that each process opens its own set of queues (usually
> something about 3-5 queues per process). In some scenarios we might run up
> to 3000 processes or more (which of-course for linux is not a problem).
> Thus we might need up to 9000 queues or more. All processes run under one
> user."
>
> Other affected users can be found in launchpad bug #1155695:
> https://bugs.launchpad.net/ubuntu/+source/manpages/+bug/1155695
>
> Instead of increasing this limit, revert it entirely and fallback to the
> original way of dealing queue limits -- where once a user's resource limit
> is reached, and all memory is used, new queues cannot be created.
>
> --- a/ipc/mq_sysctl.c
> +++ b/ipc/mq_sysctl.c
> @@ -22,6 +22,16 @@ static void *get_mq(ctl_table *table)
> return which;
> }
>
> +static int proc_mq_dointvec(ctl_table *table, int write,
> + void __user *buffer, size_t *lenp, loff_t *ppos)
> +{
> + struct ctl_table mq_table;
> + memcpy(&mq_table, table, sizeof(mq_table));
> + mq_table.data = get_mq(table);
> +
> + return proc_dointvec(&mq_table, write, buffer, lenp, ppos);
> +}
> +
> static int proc_mq_dointvec_minmax(ctl_table *table, int write,
> void __user *buffer, size_t *lenp, loff_t *ppos)
> {
>
> ...
>
> @@ -51,9 +59,7 @@ static ctl_table mq_sysctls[] = {
> .data = &init_ipc_ns.mq_queues_max,
> .maxlen = sizeof(int),
> .mode = 0644,
> - .proc_handler = proc_mq_dointvec_minmax,
> - .extra1 = &msg_queues_limit_min,
> - .extra2 = &msg_queues_limit_max,
> + .proc_handler = proc_mq_dointvec,
> },

hm, afaict proc_mq_dointvec() isn't needed - proc_dointvec_minmax()
will do the right thing if ->extra1 and/or ->extra2 are NULL, so we can
still use proc_mq_dointvec_minmax().
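IOW something like this (untested) would keep the existing handler and
simply drop the clamping:

	{
		.procname	= "queues_max",
		.data		= &init_ipc_ns.mq_queues_max,
		.maxlen		= sizeof(int),
		.mode		= 0644,
		.proc_handler	= proc_mq_dointvec_minmax,
		/* .extra1/.extra2 left NULL: no min/max applied */
	},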

Which has absolutely nothing at all to do with your patch, but makes me
think we could take a sharp instrument to the sysctl code...