2007-11-14 14:42:22

by Ciju Rajan K

[permalink] [raw]
Subject: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

When a normal user is trying to allocate huge pages using shmget(), the
user is not able to get the memory even if the gid is present in
/proc/sys/vm/hugetlb_shm_group. The function user_shm_lock() is not
successful. The user does not have the capability to perform a
CAP_IPC_LOCK. A check is added here to see whether the gid is present in
hugetlb_shm_group. Please review the patch.

Signed-off-by: Ciju Rajan ([email protected])
---
--- linux-2.6.23/mm/mlock.c.orig 2007-11-14 17:35:02.000000000 +0530
+++ linux-2.6.23/mm/mlock.c 2007-11-14 17:50:31.000000000 +0530
@@ -8,6 +8,7 @@
#include <linux/capability.h>
#include <linux/mman.h>
#include <linux/mm.h>
+#include <linux/hugetlb.h>
#include <linux/mempolicy.h>
#include <linux/syscalls.h>
#include <linux/sched.h>
@@ -248,8 +249,14 @@ int user_shm_lock(size_t size, struct us
allowed = 1;
lock_limit >>= PAGE_SHIFT;
spin_lock(&shmlock_user_lock);
+#ifdef CONFIG_HUGETLB_PAGE
+ if (!allowed &&
+ locked + user->locked_shm > lock_limit &&
+ (!(capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group))))
+#else
if (!allowed &&
locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
+#endif
goto out;
get_uid(user);
user->locked_shm += locked;


2007-11-14 15:30:43

by Adam Litke

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

Hi Ciju:

I am still not exactly sure why this patch is needed. As I read
user_shm_lock():

> lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
> if (lock_limit == RLIM_INFINITY)
> allowed = 1;
> lock_limit >>= PAGE_SHIFT;
> spin_lock(&shmlock_user_lock);
> if (!allowed &&
> locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
> goto out;

... if the user's locked limit (ulimit -l) is set to unlimited, allowed
(above) is set to 1. In that case, the second part of that if() is
bypassed, and the function grants permission. Therefore, the easy
solution is to make sure your user's lock_limit is RLIM_INFINITY.

On Wed, 2007-11-14 at 19:45 +0530, Ciju Rajan K wrote:
<snip>
> @@ -248,8 +249,14 @@ int user_shm_lock(size_t size, struct us
> allowed = 1;
> lock_limit >>= PAGE_SHIFT;
> spin_lock(&shmlock_user_lock);
> +#ifdef CONFIG_HUGETLB_PAGE
> + if (!allowed &&
> + locked + user->locked_shm > lock_limit &&
> + (!(capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group))))

This will allow any user in hugetlb_shm_group to make unlimited use of
huge page shm segments _and_ normal page shm segments. Definitely not
what you want.

> +#else
> if (!allowed &&
> locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
> +#endif
> goto out;
> get_uid(user);
> user->locked_shm += locked;
>

Please don't add new #ifdefs into .c files, headers only.

--
Adam Litke - (agl at us.ibm.com)
IBM Linux Technology Center

2007-11-14 22:37:09

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

On Wed, Nov 14, 2007 at 09:31:41AM -0600, aglitke wrote:
> ... if the user's locked limit (ulimit -l) is set to unlimited, allowed
> (above) is set to 1. In that case, the second part of that if() is
> bypassed, and the function grants permission. Therefore, the easy
> solution is to make sure your user's lock_limit is RLIM_INFINITY.

This function deserves a minor cleanup and a bit more commenting.

Reading user->locked_shm within shmlock_user_lock would be nice, too.

Maybe something like this (untested, uncompiled) would do.


-- wli


diff --git a/mm/mlock.c b/mm/mlock.c
index 7b26560..5f51792 100644
--- a/mm/mlock.c
+++ b/mm/mlock.c
@@ -234,6 +234,12 @@ asmlinkage long sys_munlockall(void)
/*
* Objects with different lifetime than processes (SHM_LOCK and SHM_HUGETLB
* shm segments) get accounted against the user_struct instead.
+ * First, user_shm_lock() checks that the user has permission to lock
+ * enough memory; then if so, the locked shm is accounted to the user's
+ * system-wide state. shmlock_user_lock protects the per-user field
+ * tracking how much locked_shm is in use within the struct user_struct.
+ * shmlock_user_lock is taken early to guard the read-only check that
+ * user->locked_shm is in-bounds against updates to user->locked_shm.
*/
static DEFINE_SPINLOCK(shmlock_user_lock);

@@ -242,19 +248,22 @@ int user_shm_lock(size_t size, struct user_struct *user)
unsigned long lock_limit, locked;
int allowed = 0;

+ spin_lock(&shmlock_user_lock);
locked = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
if (lock_limit == RLIM_INFINITY)
allowed = 1;
- lock_limit >>= PAGE_SHIFT;
- spin_lock(&shmlock_user_lock);
- if (!allowed &&
- locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
- goto out;
- get_uid(user);
- user->locked_shm += locked;
- allowed = 1;
-out:
+ else {
+ lock_limit >>= PAGE_SHIFT;
+ if (locked + user->locked_shm <= lock_limit)
+ allowed = 1;
+ else if (capable(CAP_IPC_LOCK))
+ allowed = 1;
+ }
+ if (allowed) {
+ get_uid(user);
+ user->locked_shm += locked;
+ }
spin_unlock(&shmlock_user_lock);
return allowed;
}

2007-11-16 14:30:51

by Ciju Rajan K

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

Hi Adam,
If this condition check is not included, the root user have to use the
function
setrlimit() to set the lock_limit of a normal user to RLIM_INFINITY. I
think
the /proc interface 'hugetlb_shm_group' is introduced to avoid these
difficulties.
Please correct me, if I am wrong.

Regarding the problem with the 'if' condition, I feel that even in the
case of
user's lock_limit is set to unlimited, he could use unlimited hugepages and
normal page shm segments. So what is the advantage in this scenario.

I tried to avoid the #ifdef statements. But the variable
sysctl_hugetlb_shm_group is defined
in fs/hugetlbfs/inode.c, this segment is enabled only when the config
parameter
CONFIG_HUGETLBFS is set to yes. If the hugetlbfs is not selected while
configuring,
there would be a compilation error.

Is there any better way so that the root user can configure the gid in
'hugetlb_shm_group'
and the user is able to access the huge pages using shmget().

Thanks
Ciju

aglitke wrote:
> Hi Ciju:
>
> I am still not exactly sure why this patch is needed. As I read
> user_shm_lock():
>
>
>> lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
>> if (lock_limit == RLIM_INFINITY)
>> allowed = 1;
>> lock_limit >>= PAGE_SHIFT;
>> spin_lock(&shmlock_user_lock);
>> if (!allowed &&
>> locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
>> goto out;
>>
>
> ... if the user's locked limit (ulimit -l) is set to unlimited, allowed
> (above) is set to 1. In that case, the second part of that if() is
> bypassed, and the function grants permission. Therefore, the easy
> solution is to make sure your user's lock_limit is RLIM_INFINITY.
>
> On Wed, 2007-11-14 at 19:45 +0530, Ciju Rajan K wrote:
> <snip>
>
>> @@ -248,8 +249,14 @@ int user_shm_lock(size_t size, struct us
>> allowed = 1;
>> lock_limit >>= PAGE_SHIFT;
>> spin_lock(&shmlock_user_lock);
>> +#ifdef CONFIG_HUGETLB_PAGE
>> + if (!allowed &&
>> + locked + user->locked_shm > lock_limit &&
>> + (!(capable(CAP_IPC_LOCK) || in_group_p(sysctl_hugetlb_shm_group))))
>>
>
> This will allow any user in hugetlb_shm_group to make unlimited use of
> huge page shm segments _and_ normal page shm segments. Definitely not
> what you want.
>
>
>> +#else
>> if (!allowed &&
>> locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
>> +#endif
>> goto out;
>> get_uid(user);
>> user->locked_shm += locked;
>>
>>
>
> Please don't add new #ifdefs into .c files, headers only.
>
>

2007-11-29 18:32:17

by Ciju Rajan K

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

Hi Wli,
I tested your patch. But that is not solving the problem.
If the code change to user_shm_lock() is not a good solution, could
you please suggest a method so that the normal user is able to allocate
the huge pages, if his gid is added to /proc/sys/vm/hugetlb_shm_group

Thanks
Ciju

William Lee Irwin III wrote:
> On Wed, Nov 14, 2007 at 09:31:41AM -0600, aglitke wrote:
>> ... if the user's locked limit (ulimit -l) is set to unlimited, allowed
>> (above) is set to 1. In that case, the second part of that if() is
>> bypassed, and the function grants permission. Therefore, the easy
>> solution is to make sure your user's lock_limit is RLIM_INFINITY.
>
> This function deserves a minor cleanup and a bit more commenting.
>
> Reading user->locked_shm within shmlock_user_lock would be nice, too.
>
> Maybe something like this (untested, uncompiled) would do.
>
>
> -- wli
>
>
> diff --git a/mm/mlock.c b/mm/mlock.c
> index 7b26560..5f51792 100644
> --- a/mm/mlock.c
> +++ b/mm/mlock.c
> @@ -234,6 +234,12 @@ asmlinkage long sys_munlockall(void)
> /*
> * Objects with different lifetime than processes (SHM_LOCK and SHM_HUGETLB
> * shm segments) get accounted against the user_struct instead.
> + * First, user_shm_lock() checks that the user has permission to lock
> + * enough memory; then if so, the locked shm is accounted to the user's
> + * system-wide state. shmlock_user_lock protects the per-user field
> + * tracking how much locked_shm is in use within the struct user_struct.
> + * shmlock_user_lock is taken early to guard the read-only check that
> + * user->locked_shm is in-bounds against updates to user->locked_shm.
> */
> static DEFINE_SPINLOCK(shmlock_user_lock);
>
> @@ -242,19 +248,22 @@ int user_shm_lock(size_t size, struct user_struct *user)
> unsigned long lock_limit, locked;
> int allowed = 0;
>
> + spin_lock(&shmlock_user_lock);
> locked = (size + PAGE_SIZE - 1) >> PAGE_SHIFT;
> lock_limit = current->signal->rlim[RLIMIT_MEMLOCK].rlim_cur;
> if (lock_limit == RLIM_INFINITY)
> allowed = 1;
> - lock_limit >>= PAGE_SHIFT;
> - spin_lock(&shmlock_user_lock);
> - if (!allowed &&
> - locked + user->locked_shm > lock_limit && !capable(CAP_IPC_LOCK))
> - goto out;
> - get_uid(user);
> - user->locked_shm += locked;
> - allowed = 1;
> -out:
> + else {
> + lock_limit >>= PAGE_SHIFT;
> + if (locked + user->locked_shm <= lock_limit)
> + allowed = 1;
> + else if (capable(CAP_IPC_LOCK))
> + allowed = 1;
> + }
> + if (allowed) {
> + get_uid(user);
> + user->locked_shm += locked;
> + }
> spin_unlock(&shmlock_user_lock);
> return allowed;
> }

2007-11-29 23:12:59

by William Lee Irwin III

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

On Fri, Nov 30, 2007 at 12:02:32AM +0530, Ciju Rajan K wrote:
> I tested your patch. But that is not solving the problem.
> If the code change to user_shm_lock() is not a good solution, could
> you please suggest a method so that the normal user is able to allocate
> the huge pages, if his gid is added to /proc/sys/vm/hugetlb_shm_group

The patch I posted resolves a race unrelated to your issue. Raising your
locked memory limits should not be difficult. /etc/limits.conf or similar
should set it up for you. You can also change the default rlimit in the
kernel and compile it with default limits elevated to what you want your
unprivileged process to have to start with if you're truly having lots
of trouble getting userspace to set the default limits properly. I'd
look in include/asm-generic/resource.h


-- wli

2008-01-29 15:01:51

by Ciju Rajan K

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

William Lee Irwin III wrote:
> On Fri, Nov 30, 2007 at 12:02:32AM +0530, Ciju Rajan K wrote:
>> I tested your patch. But that is not solving the problem.
>> If the code change to user_shm_lock() is not a good solution, could
>> you please suggest a method so that the normal user is able to allocate
>> the huge pages, if his gid is added to /proc/sys/vm/hugetlb_shm_group
>
> The patch I posted resolves a race unrelated to your issue. Raising your
> locked memory limits should not be difficult. /etc/limits.conf or similar
> should set it up for you. You can also change the default rlimit in the
> kernel and compile it with default limits elevated to what you want your
> unprivileged process to have to start with if you're truly having lots
> of trouble getting userspace to set the default limits properly. I'd
> look in include/asm-generic/resource.h
>
>
> -- wli

Hi Wli,

The documentation available in the kernel for huge pages does not talk
about the issue associated with locked memory limit. I think it would be
helpful to the users if we mention about this in the documentation. I
am attaching a small documentation patch.

Thanks
Ciju

Signed-off-by: Ciju Rajan ([email protected])
---
--- Documentation/vm/hugetlbpage.txt.orig 2008-01-24 16:42:40.000000000 +0530
+++ Documentation/vm/hugetlbpage.txt 2008-01-29 19:36:45.000000000 +0530
@@ -108,6 +108,18 @@ a supplementary group and system admin n
applications to use any combination of mmaps and shm* calls, though the
mount of filesystem will be required for using mmap calls.

+Note: The default locked limit in the kernel is just 32KB. If the normal
+user whose gid is present in the file /proc/sys/vm/hugetlb_shm_group needs
+more memory than this, the default limit must be increased. If pam is installed
+on your system, resource limits can be configured by installing lines similar
+to the following in /etc/security/limits.conf:
+
+@hugegroup soft memlock 2097152
+@hugegroup hard memlock 2097152
+
+Otherwise, you may manipulate the locked limit command directly with 'ulimit'.
+See its man page for more information.
+
*******************************************************************

/*

2008-01-30 09:38:14

by Ciju Rajan K

[permalink] [raw]
Subject: Re: [RFC] [PATCH] hugetlbfs :shmget with SHM_HUGETLB only works as root

Ciju Rajan K wrote:
> William Lee Irwin III wrote:
>> On Fri, Nov 30, 2007 at 12:02:32AM +0530, Ciju Rajan K wrote:
>>> I tested your patch. But that is not solving the problem.
>>> If the code change to user_shm_lock() is not a good solution, could
>>> you please suggest a method so that the normal user is able to
>>> allocate the huge pages, if his gid is added to
>>> /proc/sys/vm/hugetlb_shm_group
>>
>> The patch I posted resolves a race unrelated to your issue. Raising your
>> locked memory limits should not be difficult. /etc/limits.conf or similar
>> should set it up for you. You can also change the default rlimit in the
>> kernel and compile it with default limits elevated to what you want your
>> unprivileged process to have to start with if you're truly having lots
>> of trouble getting userspace to set the default limits properly. I'd
>> look in include/asm-generic/resource.h
>>
>>
>> -- wli
>
> Hi Wli,
>
> The documentation available in the kernel for huge pages does not talk
> about the issue associated with locked memory limit. I think it would be
> helpful to the users if we mention about this in the documentation. I
> am attaching a small documentation patch.
>
> Thanks
> Ciju
>
> Signed-off-by: Ciju Rajan ([email protected])
> ---
> --- Documentation/vm/hugetlbpage.txt.orig 2008-01-24
> 16:42:40.000000000 +0530
> +++ Documentation/vm/hugetlbpage.txt 2008-01-29 19:36:45.000000000 +0530
> @@ -108,6 +108,18 @@ a supplementary group and system admin n
> applications to use any combination of mmaps and shm* calls, though the
> mount of filesystem will be required for using mmap calls.
>
> +Note: The default locked limit in the kernel is just 32KB. If the normal
> +user whose gid is present in the file /proc/sys/vm/hugetlb_shm_group needs
> +more memory than this, the default limit must be increased. If pam is
> installed
> +on your system, resource limits can be configured by installing lines
> similar
> +to the following in /etc/security/limits.conf:
> +
> +@hugegroup soft memlock 2097152
> +@hugegroup hard memlock 2097152
> +
> +Otherwise, you may manipulate the locked limit command directly with
> 'ulimit'.
> +See its man page for more information.
> +
> *******************************************************************
>
> /*
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
Wli,
Can this patch be included in the hugepage documentation?

Thanks
Ciju