2013-04-10 18:26:49

by Waiman Long

[permalink] [raw]
Subject: [PATCH RFC 1/2] SELinux: reduce overhead of mls_level_isvalid() function call

While running the high_systime workload of the AIM7 benchmark on
a 2-socket 12-core Westmere x86-64 machine running 3.8.2 kernel,
it was found that a pretty sizable amount of time was spent in the
SELinux code. Below was the perf trace of the "perf record -a -s"
of a test run at 1500 users:

3.96% ls [kernel.kallsyms] [k] ebitmap_get_bit
1.44% ls [kernel.kallsyms] [k] mls_level_isvalid
1.33% ls [kernel.kallsyms] [k] find_next_bit

The ebitmap_get_bit() was the hottest function in the perf-report
output. Both the ebitmap_get_bit() and find_next_bit() functions
were, in fact, called by mls_level_isvalid(). As a result, the
mls_level_isvalid() call consumed 6.73% of the total CPU time of all
the 24 virtual CPUs which is quite a lot.

Looking at the mls_level_isvalid() function, it is checking to see
if all the bits set in one of the ebitmap structure are also set in
another one as well as the highest set bit is no bigger than the one
specified by the given policydb data structure. It is doing it in
a bit-by-bit manner. So if the ebitmap structure has many bits set,
the iteration loop will be done many times.

The current code can be rewritten to use a similar algorithm as the
ebitmap_contains() function with an additional check for the highest
set bit. With that change, the perf trace showed that the used CPU
time drop down to just 0.09% of the total which is about 100X less
than before.

0.04% ls [kernel.kallsyms] [k] ebitmap_get_bit
0.04% ls [kernel.kallsyms] [k] mls_level_isvalid
0.01% ls [kernel.kallsyms] [k] find_next_bit

Actually, the remaining ebitmap_get_bit() and find_next_bit() function
calls are made by other kernel routines as the new mls_level_isvalid()
function will not call them anymore.

This patch also improves the high_systime AIM7 benchmark result,
though the improvement is not as impressive as is suggested by the
reduction in CPU time. The table below shows the performance change
on the 2-socket x86-64 system mentioned above.

+--------------+---------------+----------------+-----------------+
| Workload | mean % change | mean % change | mean % change |
| | 10-100 users | 200-1000 users | 1100-2000 users |
+--------------+---------------+----------------+-----------------+
| high_systime | +0.2% | +1.1% | +2.4% |
+--------------+---------------+----------------+-----------------+

Signed-off-by: Waiman Long <[email protected]>
---
security/selinux/ss/mls.c | 38 +++++++++++++++++++++++++++-----------
1 files changed, 27 insertions(+), 11 deletions(-)

diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
index 40de8d3..ce02803 100644
--- a/security/selinux/ss/mls.c
+++ b/security/selinux/ss/mls.c
@@ -160,8 +160,7 @@ void mls_sid_to_context(struct context *context,
int mls_level_isvalid(struct policydb *p, struct mls_level *l)
{
struct level_datum *levdatum;
- struct ebitmap_node *node;
- int i;
+ struct ebitmap_node *nodel, *noded;

if (!l->sens || l->sens > p->p_levels.nprim)
return 0;
@@ -170,16 +169,33 @@ int mls_level_isvalid(struct policydb *p, struct mls_level *l)
if (!levdatum)
return 0;

- ebitmap_for_each_positive_bit(&l->cat, node, i) {
- if (i > p->p_cats.nprim)
- return 0;
- if (!ebitmap_get_bit(&levdatum->level->cat, i)) {
- /*
- * Category may not be associated with
- * sensitivity.
- */
- return 0;
+ /*
+ * Return 1 iff
+ * 1. l->cat.node is NULL, or
+ * 2. all the bits set in l->cat are also set in levdatum->level->cat,
+ * and
+ * 3. the last bit set in l->cat should not be larger than
+ * p->p_cats.nprim.
+ */
+ noded = levdatum->level->cat.node;
+ for (nodel = l->cat.node ; nodel ; nodel = nodel->next) {
+ int i, lastsetbit = -1;
+
+ for (i = EBITMAP_UNIT_NUMS - 1 ; i >= 0 ; i--) {
+ if (!nodel->maps[i])
+ continue;
+ if (!noded ||
+ ((nodel->maps[i]&noded->maps[i]) != nodel->maps[i]))
+ return 0;
+ if (lastsetbit < 0)
+ lastsetbit = nodel->startbit +
+ i * EBITMAP_UNIT_SIZE +
+ __fls(nodel->maps[i]);
}
+ if ((lastsetbit >= 0) && (lastsetbit > p->p_cats.nprim))
+ return 0;
+ if (noded)
+ noded = noded->next;
}

return 1;
--
1.7.1


2013-04-10 18:26:47

by Waiman Long

[permalink] [raw]
Subject: [PATCH RFC 2/2] SELinux: Increase ebitmap_node size for 64-bit configuration

Currently, the ebitmap_node structure has a fixed size of 32 bytes. On
a 32-bit system, the overhead is 8 bytes, leaving 24 bytes for being
used as bitmaps. The overhead ratio is 1/4.

On a 64-bit system, the overhead is 16 bytes. Therefore, only 16 bytes
are left for bitmap purpose and the overhead ratio is 1/2. With a
3.8.2 kernel, a boot-up operation will cause the ebitmap_get_bit()
function to be called about 9 million times. The average number of
ebitmap_node traversal is about 3.7.

This patch increases the size of the ebitmap_node structure to 64
bytes for 64-bit system to keep the overhead ratio at 1/4. This may
also improve performance a little bit by making node to node traversal
less frequent (< 2) as more bits are available in each node.

Signed-off-by: Waiman Long <[email protected]>
---
security/selinux/ss/ebitmap.h | 8 +++++++-
1 files changed, 7 insertions(+), 1 deletions(-)

diff --git a/security/selinux/ss/ebitmap.h b/security/selinux/ss/ebitmap.h
index 922f8af..653a5b9 100644
--- a/security/selinux/ss/ebitmap.h
+++ b/security/selinux/ss/ebitmap.h
@@ -16,7 +16,13 @@

#include <net/netlabel.h>

-#define EBITMAP_UNIT_NUMS ((32 - sizeof(void *) - sizeof(u32)) \
+#ifdef CONFIG_64BIT
+#define EBITMAP_NODE_SIZE 64
+#else
+#define EBITMAP_NODE_SIZE 32
+#endif
+
+#define EBITMAP_UNIT_NUMS ((EBITMAP_NODE_SIZE-sizeof(void *)-sizeof(u32))\
/ sizeof(unsigned long))
#define EBITMAP_UNIT_SIZE BITS_PER_LONG
#define EBITMAP_SIZE (EBITMAP_UNIT_NUMS * EBITMAP_UNIT_SIZE)
--
1.7.1

2013-05-03 14:08:05

by Waiman Long

[permalink] [raw]
Subject: Re: [PATCH RFC 1/2] SELinux: reduce overhead of mls_level_isvalid() function call

On 04/10/2013 02:26 PM, Waiman Long wrote:
> While running the high_systime workload of the AIM7 benchmark on
> a 2-socket 12-core Westmere x86-64 machine running 3.8.2 kernel,
> it was found that a pretty sizable amount of time was spent in the
> SELinux code. Below was the perf trace of the "perf record -a -s"
> of a test run at 1500 users:
>
> 3.96% ls [kernel.kallsyms] [k] ebitmap_get_bit
> 1.44% ls [kernel.kallsyms] [k] mls_level_isvalid
> 1.33% ls [kernel.kallsyms] [k] find_next_bit
>
> The ebitmap_get_bit() was the hottest function in the perf-report
> output. Both the ebitmap_get_bit() and find_next_bit() functions
> were, in fact, called by mls_level_isvalid(). As a result, the
> mls_level_isvalid() call consumed 6.73% of the total CPU time of all
> the 24 virtual CPUs which is quite a lot.
>
> Looking at the mls_level_isvalid() function, it is checking to see
> if all the bits set in one of the ebitmap structure are also set in
> another one as well as the highest set bit is no bigger than the one
> specified by the given policydb data structure. It is doing it in
> a bit-by-bit manner. So if the ebitmap structure has many bits set,
> the iteration loop will be done many times.
>
> The current code can be rewritten to use a similar algorithm as the
> ebitmap_contains() function with an additional check for the highest
> set bit. With that change, the perf trace showed that the used CPU
> time drop down to just 0.09% of the total which is about 100X less
> than before.
>
> 0.04% ls [kernel.kallsyms] [k] ebitmap_get_bit
> 0.04% ls [kernel.kallsyms] [k] mls_level_isvalid
> 0.01% ls [kernel.kallsyms] [k] find_next_bit
>
> Actually, the remaining ebitmap_get_bit() and find_next_bit() function
> calls are made by other kernel routines as the new mls_level_isvalid()
> function will not call them anymore.
>
> This patch also improves the high_systime AIM7 benchmark result,
> though the improvement is not as impressive as is suggested by the
> reduction in CPU time. The table below shows the performance change
> on the 2-socket x86-64 system mentioned above.
>
> +--------------+---------------+----------------+-----------------+
> | Workload | mean % change | mean % change | mean % change |
> | | 10-100 users | 200-1000 users | 1100-2000 users |
> +--------------+---------------+----------------+-----------------+
> | high_systime | +0.2% | +1.1% | +2.4% |
> +--------------+---------------+----------------+-----------------+
>
> Signed-off-by: Waiman Long<[email protected]>
> ---
> security/selinux/ss/mls.c | 38 +++++++++++++++++++++++++++-----------
> 1 files changed, 27 insertions(+), 11 deletions(-)
>
> diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
> index 40de8d3..ce02803 100644
> --- a/security/selinux/ss/mls.c
> +++ b/security/selinux/ss/mls.c
> @@ -160,8 +160,7 @@ void mls_sid_to_context(struct context *context,
> int mls_level_isvalid(struct policydb *p, struct mls_level *l)
> {
> struct level_datum *levdatum;
> - struct ebitmap_node *node;
> - int i;
> + struct ebitmap_node *nodel, *noded;
>
> if (!l->sens || l->sens> p->p_levels.nprim)
> return 0;
> @@ -170,16 +169,33 @@ int mls_level_isvalid(struct policydb *p, struct mls_level *l)
> if (!levdatum)
> return 0;
>
> - ebitmap_for_each_positive_bit(&l->cat, node, i) {
> - if (i> p->p_cats.nprim)
> - return 0;
> - if (!ebitmap_get_bit(&levdatum->level->cat, i)) {
> - /*
> - * Category may not be associated with
> - * sensitivity.
> - */
> - return 0;
> + /*
> + * Return 1 iff
> + * 1. l->cat.node is NULL, or
> + * 2. all the bits set in l->cat are also set in levdatum->level->cat,
> + * and
> + * 3. the last bit set in l->cat should not be larger than
> + * p->p_cats.nprim.
> + */
> + noded = levdatum->level->cat.node;
> + for (nodel = l->cat.node ; nodel ; nodel = nodel->next) {
> + int i, lastsetbit = -1;
> +
> + for (i = EBITMAP_UNIT_NUMS - 1 ; i>= 0 ; i--) {
> + if (!nodel->maps[i])
> + continue;
> + if (!noded ||
> + ((nodel->maps[i]&noded->maps[i]) != nodel->maps[i]))
> + return 0;
> + if (lastsetbit< 0)
> + lastsetbit = nodel->startbit +
> + i * EBITMAP_UNIT_SIZE +
> + __fls(nodel->maps[i]);
> }
> + if ((lastsetbit>= 0)&& (lastsetbit> p->p_cats.nprim))
> + return 0;
> + if (noded)
> + noded = noded->next;
> }
>
> return 1;

Would you mind giving me some feedback on what you think about this patch?

Thank a lot!
Regards,
Longman

2013-06-05 14:53:28

by Waiman Long

[permalink] [raw]
Subject: Re: [PATCH RFC 1/2] SELinux: reduce overhead of mls_level_isvalid() function call

On 05/03/2013 10:07 AM, Waiman Long wrote:
> On 04/10/2013 02:26 PM, Waiman Long wrote:
>> While running the high_systime workload of the AIM7 benchmark on
>> a 2-socket 12-core Westmere x86-64 machine running 3.8.2 kernel,
>> it was found that a pretty sizable amount of time was spent in the
>> SELinux code. Below was the perf trace of the "perf record -a -s"
>> of a test run at 1500 users:
>>
>> 3.96% ls [kernel.kallsyms] [k] ebitmap_get_bit
>> 1.44% ls [kernel.kallsyms] [k] mls_level_isvalid
>> 1.33% ls [kernel.kallsyms] [k] find_next_bit
>>
>> The ebitmap_get_bit() was the hottest function in the perf-report
>> output. Both the ebitmap_get_bit() and find_next_bit() functions
>> were, in fact, called by mls_level_isvalid(). As a result, the
>> mls_level_isvalid() call consumed 6.73% of the total CPU time of all
>> the 24 virtual CPUs which is quite a lot.
>>
>> Looking at the mls_level_isvalid() function, it is checking to see
>> if all the bits set in one of the ebitmap structure are also set in
>> another one as well as the highest set bit is no bigger than the one
>> specified by the given policydb data structure. It is doing it in
>> a bit-by-bit manner. So if the ebitmap structure has many bits set,
>> the iteration loop will be done many times.
>>
>> The current code can be rewritten to use a similar algorithm as the
>> ebitmap_contains() function with an additional check for the highest
>> set bit. With that change, the perf trace showed that the used CPU
>> time drop down to just 0.09% of the total which is about 100X less
>> than before.
>>
>> 0.04% ls [kernel.kallsyms] [k] ebitmap_get_bit
>> 0.04% ls [kernel.kallsyms] [k] mls_level_isvalid
>> 0.01% ls [kernel.kallsyms] [k] find_next_bit
>>
>> Actually, the remaining ebitmap_get_bit() and find_next_bit() function
>> calls are made by other kernel routines as the new mls_level_isvalid()
>> function will not call them anymore.
>>
>> This patch also improves the high_systime AIM7 benchmark result,
>> though the improvement is not as impressive as is suggested by the
>> reduction in CPU time. The table below shows the performance change
>> on the 2-socket x86-64 system mentioned above.
>>
>> +--------------+---------------+----------------+-----------------+
>> | Workload | mean % change | mean % change | mean % change |
>> | | 10-100 users | 200-1000 users | 1100-2000 users |
>> +--------------+---------------+----------------+-----------------+
>> | high_systime | +0.2% | +1.1% | +2.4% |
>> +--------------+---------------+----------------+-----------------+
>>
>> Signed-off-by: Waiman Long<[email protected]>
>> ---
>> security/selinux/ss/mls.c | 38
>> +++++++++++++++++++++++++++-----------
>> 1 files changed, 27 insertions(+), 11 deletions(-)
>>
>> diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
>> index 40de8d3..ce02803 100644
>> --- a/security/selinux/ss/mls.c
>> +++ b/security/selinux/ss/mls.c
>> @@ -160,8 +160,7 @@ void mls_sid_to_context(struct context *context,
>> int mls_level_isvalid(struct policydb *p, struct mls_level *l)
>> {
>> struct level_datum *levdatum;
>> - struct ebitmap_node *node;
>> - int i;
>> + struct ebitmap_node *nodel, *noded;
>>
>> if (!l->sens || l->sens> p->p_levels.nprim)
>> return 0;
>> @@ -170,16 +169,33 @@ int mls_level_isvalid(struct policydb *p,
>> struct mls_level *l)
>> if (!levdatum)
>> return 0;
>>
>> - ebitmap_for_each_positive_bit(&l->cat, node, i) {
>> - if (i> p->p_cats.nprim)
>> - return 0;
>> - if (!ebitmap_get_bit(&levdatum->level->cat, i)) {
>> - /*
>> - * Category may not be associated with
>> - * sensitivity.
>> - */
>> - return 0;
>> + /*
>> + * Return 1 iff
>> + * 1. l->cat.node is NULL, or
>> + * 2. all the bits set in l->cat are also set in
>> levdatum->level->cat,
>> + * and
>> + * 3. the last bit set in l->cat should not be larger than
>> + * p->p_cats.nprim.
>> + */
>> + noded = levdatum->level->cat.node;
>> + for (nodel = l->cat.node ; nodel ; nodel = nodel->next) {
>> + int i, lastsetbit = -1;
>> +
>> + for (i = EBITMAP_UNIT_NUMS - 1 ; i>= 0 ; i--) {
>> + if (!nodel->maps[i])
>> + continue;
>> + if (!noded ||
>> + ((nodel->maps[i]&noded->maps[i]) != nodel->maps[i]))
>> + return 0;
>> + if (lastsetbit< 0)
>> + lastsetbit = nodel->startbit +
>> + i * EBITMAP_UNIT_SIZE +
>> + __fls(nodel->maps[i]);
>> }
>> + if ((lastsetbit>= 0)&& (lastsetbit> p->p_cats.nprim))
>> + return 0;
>> + if (noded)
>> + noded = noded->next;
>> }
>>
>> return 1;
>
> Would you mind giving me some feedback on what you think about this
> patch?
>
> Thank a lot!
> Regards,
> Longman

I am sorry for the annoyance, but I really want to have some feedback
on whether this patch is viable or not.

Thanks,
Longman

2013-06-05 15:15:00

by Stephen Smalley

[permalink] [raw]
Subject: Re: [PATCH RFC 1/2] SELinux: reduce overhead of mls_level_isvalid() function call

On 05/03/2013 10:07 AM, Waiman Long wrote:
> On 04/10/2013 02:26 PM, Waiman Long wrote:
>> While running the high_systime workload of the AIM7 benchmark on
>> a 2-socket 12-core Westmere x86-64 machine running 3.8.2 kernel,
>> it was found that a pretty sizable amount of time was spent in the
>> SELinux code. Below was the perf trace of the "perf record -a -s"
>> of a test run at 1500 users:
>>
>> 3.96% ls [kernel.kallsyms] [k] ebitmap_get_bit
>> 1.44% ls [kernel.kallsyms] [k] mls_level_isvalid
>> 1.33% ls [kernel.kallsyms] [k] find_next_bit
>>
>> The ebitmap_get_bit() was the hottest function in the perf-report
>> output. Both the ebitmap_get_bit() and find_next_bit() functions
>> were, in fact, called by mls_level_isvalid(). As a result, the
>> mls_level_isvalid() call consumed 6.73% of the total CPU time of all
>> the 24 virtual CPUs which is quite a lot.
>>
>> Looking at the mls_level_isvalid() function, it is checking to see
>> if all the bits set in one of the ebitmap structure are also set in
>> another one as well as the highest set bit is no bigger than the one
>> specified by the given policydb data structure. It is doing it in
>> a bit-by-bit manner. So if the ebitmap structure has many bits set,
>> the iteration loop will be done many times.
>>
>> The current code can be rewritten to use a similar algorithm as the
>> ebitmap_contains() function with an additional check for the highest
>> set bit. With that change, the perf trace showed that the used CPU
>> time drop down to just 0.09% of the total which is about 100X less
>> than before.
>>
>> 0.04% ls [kernel.kallsyms] [k] ebitmap_get_bit
>> 0.04% ls [kernel.kallsyms] [k] mls_level_isvalid
>> 0.01% ls [kernel.kallsyms] [k] find_next_bit
>>
>> Actually, the remaining ebitmap_get_bit() and find_next_bit() function
>> calls are made by other kernel routines as the new mls_level_isvalid()
>> function will not call them anymore.
>>
>> This patch also improves the high_systime AIM7 benchmark result,
>> though the improvement is not as impressive as is suggested by the
>> reduction in CPU time. The table below shows the performance change
>> on the 2-socket x86-64 system mentioned above.
>>
>> +--------------+---------------+----------------+-----------------+
>> | Workload | mean % change | mean % change | mean % change |
>> | | 10-100 users | 200-1000 users | 1100-2000 users |
>> +--------------+---------------+----------------+-----------------+
>> | high_systime | +0.2% | +1.1% | +2.4% |
>> +--------------+---------------+----------------+-----------------+
>>
>> Signed-off-by: Waiman Long<[email protected]>
>> ---
>> security/selinux/ss/mls.c | 38 +++++++++++++++++++++++++++-----------
>> 1 files changed, 27 insertions(+), 11 deletions(-)
>>
>> diff --git a/security/selinux/ss/mls.c b/security/selinux/ss/mls.c
>> index 40de8d3..ce02803 100644
>> --- a/security/selinux/ss/mls.c
>> +++ b/security/selinux/ss/mls.c
>> @@ -160,8 +160,7 @@ void mls_sid_to_context(struct context *context,
>> int mls_level_isvalid(struct policydb *p, struct mls_level *l)
>> {
>> struct level_datum *levdatum;
>> - struct ebitmap_node *node;
>> - int i;
>> + struct ebitmap_node *nodel, *noded;
>>
>> if (!l->sens || l->sens> p->p_levels.nprim)
>> return 0;
>> @@ -170,16 +169,33 @@ int mls_level_isvalid(struct policydb *p, struct
>> mls_level *l)
>> if (!levdatum)
>> return 0;
>>
>> - ebitmap_for_each_positive_bit(&l->cat, node, i) {
>> - if (i> p->p_cats.nprim)
>> - return 0;
>> - if (!ebitmap_get_bit(&levdatum->level->cat, i)) {
>> - /*
>> - * Category may not be associated with
>> - * sensitivity.
>> - */
>> - return 0;
>> + /*
>> + * Return 1 iff
>> + * 1. l->cat.node is NULL, or
>> + * 2. all the bits set in l->cat are also set in
>> levdatum->level->cat,
>> + * and
>> + * 3. the last bit set in l->cat should not be larger than
>> + * p->p_cats.nprim.
>> + */
>> + noded = levdatum->level->cat.node;
>> + for (nodel = l->cat.node ; nodel ; nodel = nodel->next) {
>> + int i, lastsetbit = -1;
>> +
>> + for (i = EBITMAP_UNIT_NUMS - 1 ; i>= 0 ; i--) {
>> + if (!nodel->maps[i])
>> + continue;
>> + if (!noded ||
>> + ((nodel->maps[i]&noded->maps[i]) != nodel->maps[i]))
>> + return 0;
>> + if (lastsetbit< 0)
>> + lastsetbit = nodel->startbit +
>> + i * EBITMAP_UNIT_SIZE +
>> + __fls(nodel->maps[i]);
>> }
>> + if ((lastsetbit>= 0)&& (lastsetbit> p->p_cats.nprim))
>> + return 0;
>> + if (noded)
>> + noded = noded->next;
>> }
>>
>> return 1;
>
> Would you mind giving me some feedback on what you think about this patch?

Can you take the core logic into a helper function within ebitmap.c?
Otherwise you are directly exposing ebitmap internals to the mls code.

2013-06-05 15:18:29

by Waiman Long

[permalink] [raw]
Subject: Re: [PATCH RFC 1/2] SELinux: reduce overhead of mls_level_isvalid() function call

On 06/05/2013 10:59 AM, Stephen Smalley wrote:
>
> Can you take the core logic into a helper function within ebitmap.c?
> Otherwise you are directly exposing ebitmap internals to the mls code.

Sure. I will move the logic to ebitmap.c & send out an updated patch.

Thank for the quick response.

Regards,
Longman