2014-01-22 09:44:11

by Dongsheng Yang

[permalink] [raw]
Subject: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

There is already a function named task_nice in sched.h to get the nice value
of task_struct. We can use it in __update_max_tr() rather than calculate it
manually.

Signed-off-by: Dongsheng Yang <[email protected]>
---
kernel/trace/trace.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
index 9d20cd9..ec149b4 100644
--- a/kernel/trace/trace.c
+++ b/kernel/trace/trace.c
@@ -970,7 +970,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
else
max_data->uid = task_uid(tsk);

- max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO;
+ max_data->nice = task_nice(tsk);
max_data->policy = tsk->policy;
max_data->rt_priority = tsk->rt_priority;

--
1.8.2.1


2014-01-23 03:56:38

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

On Wed, 22 Jan 2014 17:41:45 -0500
Dongsheng Yang <[email protected]> wrote:

> There is already a function named task_nice in sched.h to get the nice value
> of task_struct. We can use it in __update_max_tr() rather than calculate it
> manually.
>
> Signed-off-by: Dongsheng Yang <[email protected]>
> ---
> kernel/trace/trace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 9d20cd9..ec149b4 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -970,7 +970,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
> else
> max_data->uid = task_uid(tsk);
>
> - max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO;
> + max_data->nice = task_nice(tsk);

Except that's a function call in a critical path. Switch it to
TASK_NICE(), and I'll take the patch.

Thanks,

-- Steve

> max_data->policy = tsk->policy;
> max_data->rt_priority = tsk->rt_priority;
>

2014-01-23 04:00:32

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

On Wed, 22 Jan 2014 22:56:32 -0500
Steven Rostedt <[email protected]> wrote:

> On Wed, 22 Jan 2014 17:41:45 -0500
> Dongsheng Yang <[email protected]> wrote:
>
> > There is already a function named task_nice in sched.h to get the nice value
> > of task_struct. We can use it in __update_max_tr() rather than calculate it
> > manually.
> >
> > Signed-off-by: Dongsheng Yang <[email protected]>
> > ---
> > kernel/trace/trace.c | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> > index 9d20cd9..ec149b4 100644
> > --- a/kernel/trace/trace.c
> > +++ b/kernel/trace/trace.c
> > @@ -970,7 +970,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
> > else
> > max_data->uid = task_uid(tsk);
> >
> > - max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO;
> > + max_data->nice = task_nice(tsk);
>
> Except that's a function call in a critical path. Switch it to
> TASK_NICE(), and I'll take the patch.

Bah, I just noticed that TASK_NICE is in kernel/sched/sched.h not
include/linux/sched.h

Peter, is there a reason that task_nice() is not a static inline in
sched.h and have these macros there too? They only reference fields in
task_struct that are already defined there. I don't see why they need
to be private to kernel/sched.

-- Steve

>
> Thanks,
>
> -- Steve
>
> > max_data->policy = tsk->policy;
> > max_data->rt_priority = tsk->rt_priority;
> >
>

2014-01-23 04:13:25

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

On 01/22/2014 11:00 PM, Steven Rostedt wrote:
>
> Bah, I just noticed that TASK_NICE is in kernel/sched/sched.h not
> include/linux/sched.h
>
> Peter, is there a reason that task_nice() is not a static inline in
> sched.h and have these macros there too? They only reference fields in
> task_struct that are already defined there. I don't see why they need
> to be private to kernel/sched.

Agree. These macros are useful to other modules out of kernel/sched.
But they are private to kernel/sched currently.

If we move them to include/linux/sched.h, I will use TASK_NICE in this
patch.
>
> -- Steve
>
>> Thanks,
>>
>> -- Steve
>>
>>> max_data->policy = tsk->policy;
>>> max_data->rt_priority = tsk->rt_priority;
>>>
>

2014-01-23 08:26:58

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

On Thu, Jan 23, 2014 at 12:11:04PM -0500, Dongsheng Yang wrote:
> On 01/22/2014 11:00 PM, Steven Rostedt wrote:
> >
> >Bah, I just noticed that TASK_NICE is in kernel/sched/sched.h not
> >include/linux/sched.h
> >
> >Peter, is there a reason that task_nice() is not a static inline in
> >sched.h and have these macros there too? They only reference fields in
> >task_struct that are already defined there. I don't see why they need
> >to be private to kernel/sched.
>
> Agree. These macros are useful to other modules out of kernel/sched.
> But they are private to kernel/sched currently.

And the floodgates open.. _why_ would a module care about nice values?
That's sounds just so full of wrong.

2014-01-23 08:45:42

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

On 01/23/2014 03:26 AM, Peter Zijlstra wrote:
> On Thu, Jan 23, 2014 at 12:11:04PM -0500, Dongsheng Yang wrote:
>> On 01/22/2014 11:00 PM, Steven Rostedt wrote:
>>> Bah, I just noticed that TASK_NICE is in kernel/sched/sched.h not
>>> include/linux/sched.h
>>>
>>> Peter, is there a reason that task_nice() is not a static inline in
>>> sched.h and have these macros there too? They only reference fields in
>>> task_struct that are already defined there. I don't see why they need
>>> to be private to kernel/sched.
>> Agree. These macros are useful to other modules out of kernel/sched.
>> But they are private to kernel/sched currently.
> And the floodgates open.. _why_ would a module care about nice values?
> That's sounds just so full of wrong.

Sorry for my misnomer with 'modules'. Actually I mean other subsystems in
kernel, such as trace, sched, rcu. :)

I did not mean the module for modprobe.

Sorry for my terrible expression :(.
>

2014-01-23 11:52:09

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

On Thu, 23 Jan 2014 09:26:30 +0100
Peter Zijlstra <[email protected]> wrote:

> On Thu, Jan 23, 2014 at 12:11:04PM -0500, Dongsheng Yang wrote:
> > On 01/22/2014 11:00 PM, Steven Rostedt wrote:
> > >
> > >Bah, I just noticed that TASK_NICE is in kernel/sched/sched.h not
> > >include/linux/sched.h
> > >
> > >Peter, is there a reason that task_nice() is not a static inline in
> > >sched.h and have these macros there too? They only reference fields in
> > >task_struct that are already defined there. I don't see why they need
> > >to be private to kernel/sched.
> >
> > Agree. These macros are useful to other modules out of kernel/sched.
> > But they are private to kernel/sched currently.
>
> And the floodgates open.. _why_ would a module care about nice values?
> That's sounds just so full of wrong.

As Dongsheng already said, it's not for modules, but for other parts of
the core kernel.

It's not like modules or other parts can't just reimplement those
macros. All the fields are already public in linux/sched.h. In fact, the
reason for this discussion is to get rid of an open coded implementation
in the tracing facility.

-- Steve

2014-01-27 09:18:06

by Dongsheng Yang

[permalink] [raw]
Subject: [PATCH 2/3] sched: Expose some macros related with priority.

Some macros in kernel/sched/sched.h about priority are
private to kernel/sched. But they are useful to other
parts of the core kernel.

This patch move these macros from kernel/sched/sched.h to
include/linux/sched/prio.h so that they are available to
other subsystems.

Signed-off-by: Dongsheng Yang <[email protected]>
---
include/linux/sched/prio.h | 18 ++++++++++++++++++
kernel/sched/sched.h | 18 ------------------
2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
index 9382ba8..13216f1 100644
--- a/include/linux/sched/prio.h
+++ b/include/linux/sched/prio.h
@@ -20,4 +20,22 @@
#define MAX_PRIO (MAX_RT_PRIO + 40)
#define DEFAULT_PRIO (MAX_RT_PRIO + 20)

+/*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+ * and back.
+ */
+#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
+#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
+#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
+
+/*
+ * 'User priority' is the nice value converted to something we
+ * can work with better when scaling various scheduler parameters,
+ * it's a [ 0 ... 39 ] range.
+ */
+#define USER_PRIO(p) ((p)-MAX_RT_PRIO)
+#define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio)
+#define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
+
#endif /* _SCHED_PRIO_H */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c2119fd..b44720d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -24,24 +24,6 @@ extern long calc_load_fold_active(struct rq *this_rq);
extern void update_cpu_load_active(struct rq *this_rq);

/*
- * Convert user-nice values [ -20 ... 0 ... 19 ]
- * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
- * and back.
- */
-#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
-#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
-#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
-
-/*
- * 'User priority' is the nice value converted to something we
- * can work with better when scaling various scheduler parameters,
- * it's a [ 0 ... 39 ] range.
- */
-#define USER_PRIO(p) ((p)-MAX_RT_PRIO)
-#define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio)
-#define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
-
-/*
* Helpers for converting nanosecond timing to jiffy resolution
*/
#define NS_TO_JIFFIES(TIME) ((unsigned long)(TIME) / (NSEC_PER_SEC / HZ))
--
1.8.2.1

2014-01-27 09:18:09

by Dongsheng Yang

[permalink] [raw]
Subject: [PATCH 3/3] sched: Implement task_nice and task_prio as static inline functions.

As commit 0e0c0797 expose the priority related macros in linux/sched/prio.h,
we don't have to implement task_nice and task_prio in kernel/sched/core.c any
more.

This patch implement them in linux/sched/sched.h as static inline functions,
saving the kernel stack and enhancing the performance.

Signed-off-by: Dongsheng Yang <[email protected]>
---
include/linux/sched.h | 26 ++++++++++++++++++++++++--
kernel/sched/core.c | 25 -------------------------
2 files changed, 24 insertions(+), 27 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ba1b732..125145b 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2082,8 +2082,30 @@ static inline void sched_autogroup_exit(struct signal_struct *sig) { }

extern bool yield_to(struct task_struct *p, bool preempt);
extern void set_user_nice(struct task_struct *p, long nice);
-extern int task_prio(const struct task_struct *p);
-extern int task_nice(const struct task_struct *p);
+/**
+ * task_prio - return the priority value of a given task.
+ * @p: the task in question.
+ *
+ * Return: The priority value as seen by users in /proc.
+ * RT tasks are offset by -200. Normal tasks are centered
+ * around 0, value goes from -16 to +15.
+ */
+static inline int task_prio(const struct task_struct *p)
+{
+ return p->prio - MAX_RT_PRIO;
+}
+
+/**
+ * task_nice - return the nice value of a given task.
+ * @p: the task in question.
+ *
+ * Return: The nice value [ -20 ... 0 ... 19 ].
+ */
+static inline int task_nice(const struct task_struct *p)
+{
+ return TASK_NICE(p);
+}
+
extern int can_nice(const struct task_struct *p, const int nice);
extern int task_curr(const struct task_struct *p);
extern int idle_cpu(int cpu);
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7fea865..a8d6150 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3096,31 +3096,6 @@ SYSCALL_DEFINE1(nice, int, increment)
#endif

/**
- * task_prio - return the priority value of a given task.
- * @p: the task in question.
- *
- * Return: The priority value as seen by users in /proc.
- * RT tasks are offset by -200. Normal tasks are centered
- * around 0, value goes from -16 to +15.
- */
-int task_prio(const struct task_struct *p)
-{
- return p->prio - MAX_RT_PRIO;
-}
-
-/**
- * task_nice - return the nice value of a given task.
- * @p: the task in question.
- *
- * Return: The nice value [ -20 ... 0 ... 19 ].
- */
-int task_nice(const struct task_struct *p)
-{
- return TASK_NICE(p);
-}
-EXPORT_SYMBOL(task_nice);
-
-/**
* idle_cpu - is a given cpu idle currently?
* @cpu: the processor in question.
*
--
1.8.2.1

2014-01-27 09:18:34

by Dongsheng Yang

[permalink] [raw]
Subject: [PATCH 1/3] sched: Move the priority specific bits into a new header file.

Some bits about priority are defined in linux/sched/rt.h, but
some of them are not only for rt scheduler, such as MAX_PRIO.

This patch move them all into a new header file, linux/sched/prio.h.

Signed-off-by: Dongsheng Yang <[email protected]>
---
include/linux/sched.h | 4 ++++
include/linux/sched/prio.h | 23 +++++++++++++++++++++++
include/linux/sched/rt.h | 21 +++------------------
3 files changed, 30 insertions(+), 18 deletions(-)
create mode 100644 include/linux/sched/prio.h

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 68a0e84..ba1b732 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3,6 +3,10 @@

#include <uapi/linux/sched.h>

+#ifndef _SCHED_PRIO_H
+#include <linux/sched/prio.h>
+#endif /* #ifndef _SCHED_PRIO_H */
+

struct sched_param {
int sched_priority;
diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
new file mode 100644
index 0000000..9382ba8
--- /dev/null
+++ b/include/linux/sched/prio.h
@@ -0,0 +1,23 @@
+#ifndef _SCHED_PRIO_H
+#define _SCHED_PRIO_H
+
+/*
+ * Priority of a process goes from 0..MAX_PRIO-1, valid RT
+ * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
+ * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1. Priority
+ * values are inverted: lower p->prio value means higher priority.
+ *
+ * The MAX_USER_RT_PRIO value allows the actual maximum
+ * RT priority to be separate from the value exported to
+ * user-space. This allows kernel threads to set their
+ * priority to a value higher than any user task. Note:
+ * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
+ */
+
+#define MAX_USER_RT_PRIO 100
+#define MAX_RT_PRIO MAX_USER_RT_PRIO
+
+#define MAX_PRIO (MAX_RT_PRIO + 40)
+#define DEFAULT_PRIO (MAX_RT_PRIO + 20)
+
+#endif /* _SCHED_PRIO_H */
diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
index 34e4ebe..50409a3 100644
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -1,24 +1,9 @@
#ifndef _SCHED_RT_H
#define _SCHED_RT_H

-/*
- * Priority of a process goes from 0..MAX_PRIO-1, valid RT
- * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
- * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1. Priority
- * values are inverted: lower p->prio value means higher priority.
- *
- * The MAX_USER_RT_PRIO value allows the actual maximum
- * RT priority to be separate from the value exported to
- * user-space. This allows kernel threads to set their
- * priority to a value higher than any user task. Note:
- * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
- */
-
-#define MAX_USER_RT_PRIO 100
-#define MAX_RT_PRIO MAX_USER_RT_PRIO
-
-#define MAX_PRIO (MAX_RT_PRIO + 40)
-#define DEFAULT_PRIO (MAX_RT_PRIO + 20)
+#ifndef _SCHED_PRIO_H
+#include <linux/sched/prio.h>
+#endif /* ifndef _SCHED_PRIO_H */

static inline int rt_prio(int prio)
{
--
1.8.2.1

2014-01-27 09:18:49

by Dongsheng Yang

[permalink] [raw]
Subject: [PATCH 0/3] sched: Collect the bits about priority into a new header file, include/linux/sched/prio.h.

Hi Peter,
This patch set collect the bits about priority in linux/sched/prio.h and
expose some macros about priority here. So that the other parts of core kernel
can use the them without reimplementing it in an open way.

Dongsheng Yang (3):
sched: Move the priority specific bits into a new header file.
sched: Expose some macros related with priority.
sched: Implement task_nice and task_prio as static inline functions.

include/linux/sched.h | 30 ++++++++++++++++++++++++++++--
include/linux/sched/prio.h | 41 +++++++++++++++++++++++++++++++++++++++++
include/linux/sched/rt.h | 21 +++------------------
kernel/sched/core.c | 25 -------------------------
kernel/sched/sched.h | 18 ------------------
5 files changed, 72 insertions(+), 63 deletions(-)
create mode 100644 include/linux/sched/prio.h

--
1.8.2.1

2014-01-27 10:33:10

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched: Implement task_nice and task_prio as static inline functions.

On Mon, Jan 27, 2014 at 05:15:39PM -0500, Dongsheng Yang wrote:
> +/**
> + * task_prio - return the priority value of a given task.
> + * @p: the task in question.
> + *
> + * Return: The priority value as seen by users in /proc.
> + * RT tasks are offset by -200. Normal tasks are centered
> + * around 0, value goes from -16 to +15.
> + */
> +static inline int task_prio(const struct task_struct *p)
> +{
> + return p->prio - MAX_RT_PRIO;
> +}

Who would ever want to use/rely on this? It doesn't make any sense. And
therefore it shouldn't ever be considered time critical.

> +/**
> + * task_nice - return the nice value of a given task.
> + * @p: the task in question.
> + *
> + * Return: The nice value [ -20 ... 0 ... 19 ].
> + */
> +static inline int task_nice(const struct task_struct *p)
> +{
> + return TASK_NICE(p);
> +}

Urgh, no. Just remove the macro already. Although arguably we should
remove ->static_prio and clean up that entire mess.

2014-01-27 12:12:15

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched: Implement task_nice and task_prio as static inline functions.

On 01/27/2014 05:32 AM, Peter Zijlstra wrote:
> On Mon, Jan 27, 2014 at 05:15:39PM -0500, Dongsheng Yang wrote:
>> +/**
>> + * task_prio - return the priority value of a given task.
>> + * @p: the task in question.
>> + *
>> + * Return: The priority value as seen by users in /proc.
>> + * RT tasks are offset by -200. Normal tasks are centered
>> + * around 0, value goes from -16 to +15.
>> + */
>> +static inline int task_prio(const struct task_struct *p)
>> +{
>> + return p->prio - MAX_RT_PRIO;
>> +}
> Who would ever want to use/rely on this? It doesn't make any sense. And
> therefore it shouldn't ever be considered time critical.

I just copy it from kernel/sched/core.c. Currently, it is used in
fs/proc/array.c.
>
>> +/**
>> + * task_nice - return the nice value of a given task.
>> + * @p: the task in question.
>> + *
>> + * Return: The nice value [ -20 ... 0 ... 19 ].
>> + */
>> +static inline int task_nice(const struct task_struct *p)
>> +{
>> + return TASK_NICE(p);
>> +}
> Urgh, no. Just remove the macro already. Although arguably we should
> remove ->static_prio and clean up that entire mess.
>

Oops, sorry for the noise. I am a newbie here, could you help to point
out that
which tree is the latest version for sched. Thanx :)

2014-01-27 12:16:13

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched: Implement task_nice and task_prio as static inline functions.

On Mon, Jan 27, 2014 at 08:09:54PM -0500, Dongsheng Yang wrote:
> On 01/27/2014 05:32 AM, Peter Zijlstra wrote:
> >On Mon, Jan 27, 2014 at 05:15:39PM -0500, Dongsheng Yang wrote:
> >>+/**
> >>+ * task_prio - return the priority value of a given task.
> >>+ * @p: the task in question.
> >>+ *
> >>+ * Return: The priority value as seen by users in /proc.
> >>+ * RT tasks are offset by -200. Normal tasks are centered
> >>+ * around 0, value goes from -16 to +15.
> >>+ */
> >>+static inline int task_prio(const struct task_struct *p)
> >>+{
> >>+ return p->prio - MAX_RT_PRIO;
> >>+}
> >Who would ever want to use/rely on this? It doesn't make any sense. And
> >therefore it shouldn't ever be considered time critical.
>
> I just copy it from kernel/sched/core.c. Currently, it is used in
> fs/proc/array.c.

Just leave it there. Nobody should use this value anyhow.

> >>+/**
> >>+ * task_nice - return the nice value of a given task.
> >>+ * @p: the task in question.
> >>+ *
> >>+ * Return: The nice value [ -20 ... 0 ... 19 ].
> >>+ */
> >>+static inline int task_nice(const struct task_struct *p)
> >>+{
> >>+ return TASK_NICE(p);
> >>+}
> >Urgh, no. Just remove the macro already. Although arguably we should
> >remove ->static_prio and clean up that entire mess.
> >
>
> Oops, sorry for the noise. I am a newbie here, could you help to point out
> that
> which tree is the latest version for sched. Thanx :)

tip/sched/core or tip/master, but that's not the issue. There's no point
in having an inline and a macro that do the exact same thing.

2014-01-27 13:02:03

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched: Implement task_nice and task_prio as static inline functions.

On 01/27/2014 07:16 AM, Peter Zijlstra wrote:
> On Mon, Jan 27, 2014 at 08:09:54PM -0500, Dongsheng Yang wrote:
>> On 01/27/2014 05:32 AM, Peter Zijlstra wrote:
>>> On Mon, Jan 27, 2014 at 05:15:39PM -0500, Dongsheng Yang wrote:
>>>> +/**
>>>> + * task_prio - return the priority value of a given task.
>>>> + * @p: the task in question.
>>>> + *
>>>> + * Return: The priority value as seen by users in /proc.
>>>> + * RT tasks are offset by -200. Normal tasks are centered
>>>> + * around 0, value goes from -16 to +15.
>>>> + */
>>>> +static inline int task_prio(const struct task_struct *p)
>>>> +{
>>>> + return p->prio - MAX_RT_PRIO;
>>>> +}
>>> Who would ever want to use/rely on this? It doesn't make any sense. And
>>> therefore it shouldn't ever be considered time critical.
>> I just copy it from kernel/sched/core.c. Currently, it is used in
>> fs/proc/array.c.
> Just leave it there. Nobody should use this value anyhow.

Okey.
>
>>>> +/**
>>>> + * task_nice - return the nice value of a given task.
>>>> + * @p: the task in question.
>>>> + *
>>>> + * Return: The nice value [ -20 ... 0 ... 19 ].
>>>> + */
>>>> +static inline int task_nice(const struct task_struct *p)
>>>> +{
>>>> + return TASK_NICE(p);
>>>> +}
>>> Urgh, no. Just remove the macro already. Although arguably we should
>>> remove ->static_prio and clean up that entire mess.
>>>
>> Oops, sorry for the noise. I am a newbie here, could you help to point out
>> that
>> which tree is the latest version for sched. Thanx :)
> tip/sched/core or tip/master, but that's not the issue. There's no point
> in having an inline and a macro that do the exact same thing.

Okey, so, do you mean I should remove the macro of TASK_NICE and
implement it directly in inline task_nice? If so, I will send v2 soon.
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2014-01-27 13:09:02

by Peter Zijlstra

[permalink] [raw]
Subject: Re: [PATCH 3/3] sched: Implement task_nice and task_prio as static inline functions.

On Mon, Jan 27, 2014 at 08:59:40PM -0500, Dongsheng Yang wrote:
> >>>>+static inline int task_nice(const struct task_struct *p)
> >>>>+{
> >>>>+ return TASK_NICE(p);
> >>>>+}
> >>>Urgh, no. Just remove the macro already. Although arguably we should
> >>>remove ->static_prio and clean up that entire mess.
> >>>
> >>Oops, sorry for the noise. I am a newbie here, could you help to point out
> >>that
> >> which tree is the latest version for sched. Thanx :)
> >tip/sched/core or tip/master, but that's not the issue. There's no point
> >in having an inline and a macro that do the exact same thing.
>
> Okey, so, do you mean I should remove the macro of TASK_NICE and
> implement it directly in inline task_nice? If so, I will send v2 soon.

Yah, although ideally someone would rethink the entire prio thing and
get rid of the back and forth.

2014-01-27 15:16:17

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH 3/3 V2] sched: Implement task_nice as static inline function.

Peter, what about this version?

On 01/27/2014 10:00 PM, Dongsheng Yang wrote:
> As commit 0e0c0797 expose the priority related macros in linux/sched/prio.h,
> we don't have to implement task_nice in kernel/sched/core.c any more.
>
> This patch implement it in linux/sched/sched.h as static inline function,
> saving the kernel stack and enhancing the performance.
>
> Signed-off-by: Dongsheng Yang <[email protected]>
> ---
> Changelog:
> - v1:
> * leave the task_prio() in kernel/sched/core.c
> * remove macro TASK_NICE and implement it as static inline
> function in include/linux/sched.h.
> include/linux/sched.h | 11 ++++++++++-
> include/linux/sched/prio.h | 1 -
> kernel/sched/core.c | 26 +++++++-------------------
> kernel/sched/cputime.c | 4 ++--
> 4 files changed, 19 insertions(+), 23 deletions(-)
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index ba1b732..5b63361 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -2083,7 +2083,16 @@ static inline void sched_autogroup_exit(struct signal_struct *sig) { }
> extern bool yield_to(struct task_struct *p, bool preempt);
> extern void set_user_nice(struct task_struct *p, long nice);
> extern int task_prio(const struct task_struct *p);
> -extern int task_nice(const struct task_struct *p);
> +/**
> + * task_nice - return the nice value of a given task.
> + * @p: the task in question.
> + *
> + * Return: The nice value [ -20 ... 0 ... 19 ].
> + */
> +static inline int task_nice(const struct task_struct *p)
> +{
> + return PRIO_TO_NICE((p)->static_prio);
> +}
> extern int can_nice(const struct task_struct *p, const int nice);
> extern int task_curr(const struct task_struct *p);
> extern int idle_cpu(int cpu);
> diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
> index 13216f1..410ccb7 100644
> --- a/include/linux/sched/prio.h
> +++ b/include/linux/sched/prio.h
> @@ -27,7 +27,6 @@
> */
> #define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
> #define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
> -#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
>
> /*
> * 'User priority' is the nice value converted to something we
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 7fea865..b2bc1db 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -2998,7 +2998,7 @@ void set_user_nice(struct task_struct *p, long nice)
> unsigned long flags;
> struct rq *rq;
>
> - if (TASK_NICE(p) == nice || nice < -20 || nice > 19)
> + if (task_nice(p) == nice || nice < -20 || nice > 19)
> return;
> /*
> * We have to be careful, if called from sys_setpriority(),
> @@ -3076,7 +3076,7 @@ SYSCALL_DEFINE1(nice, int, increment)
> if (increment > 40)
> increment = 40;
>
> - nice = TASK_NICE(current) + increment;
> + nice = task_nice(current) + increment;
> if (nice < -20)
> nice = -20;
> if (nice > 19)
> @@ -3109,18 +3109,6 @@ int task_prio(const struct task_struct *p)
> }
>
> /**
> - * task_nice - return the nice value of a given task.
> - * @p: the task in question.
> - *
> - * Return: The nice value [ -20 ... 0 ... 19 ].
> - */
> -int task_nice(const struct task_struct *p)
> -{
> - return TASK_NICE(p);
> -}
> -EXPORT_SYMBOL(task_nice);
> -
> -/**
> * idle_cpu - is a given cpu idle currently?
> * @cpu: the processor in question.
> *
> @@ -3319,7 +3307,7 @@ recheck:
> */
> if (user && !capable(CAP_SYS_NICE)) {
> if (fair_policy(policy)) {
> - if (attr->sched_nice < TASK_NICE(p) &&
> + if (attr->sched_nice < task_nice(p) &&
> !can_nice(p, attr->sched_nice))
> return -EPERM;
> }
> @@ -3343,7 +3331,7 @@ recheck:
> * SCHED_NORMAL if the RLIMIT_NICE would normally permit it.
> */
> if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) {
> - if (!can_nice(p, TASK_NICE(p)))
> + if (!can_nice(p, task_nice(p)))
> return -EPERM;
> }
>
> @@ -3383,7 +3371,7 @@ recheck:
> * If not changing anything there's no need to proceed further:
> */
> if (unlikely(policy == p->policy)) {
> - if (fair_policy(policy) && attr->sched_nice != TASK_NICE(p))
> + if (fair_policy(policy) && attr->sched_nice != task_nice(p))
> goto change;
> if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
> goto change;
> @@ -3835,7 +3823,7 @@ SYSCALL_DEFINE3(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
> else if (task_has_rt_policy(p))
> attr.sched_priority = p->rt_priority;
> else
> - attr.sched_nice = TASK_NICE(p);
> + attr.sched_nice = task_nice(p);
>
> rcu_read_unlock();
>
> @@ -7006,7 +6994,7 @@ void normalize_rt_tasks(void)
> * Renice negative nice level userspace
> * tasks back to 0:
> */
> - if (TASK_NICE(p) < 0 && p->mm)
> + if (task_nice(p) < 0 && p->mm)
> set_user_nice(p, 0);
> continue;
> }
> diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
> index 9994791..58624a6 100644
> --- a/kernel/sched/cputime.c
> +++ b/kernel/sched/cputime.c
> @@ -142,7 +142,7 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
> p->utimescaled += cputime_scaled;
> account_group_user_time(p, cputime);
>
> - index = (TASK_NICE(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
> + index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
>
> /* Add user time to cpustat. */
> task_group_account_field(p, index, (__force u64) cputime);
> @@ -169,7 +169,7 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime,
> p->gtime += cputime;
>
> /* Add guest time to cpustat. */
> - if (TASK_NICE(p) > 0) {
> + if (task_nice(p) > 0) {
> cpustat[CPUTIME_NICE] += (__force u64) cputime;
> cpustat[CPUTIME_GUEST_NICE] += (__force u64) cputime;
> } else {

2014-01-27 15:16:19

by Dongsheng Yang

[permalink] [raw]
Subject: [PATCH 3/3 V2] sched: Implement task_nice as static inline function.

As commit 0e0c0797 expose the priority related macros in linux/sched/prio.h,
we don't have to implement task_nice in kernel/sched/core.c any more.

This patch implement it in linux/sched/sched.h as static inline function,
saving the kernel stack and enhancing the performance.

Signed-off-by: Dongsheng Yang <[email protected]>
---
Changelog:
- v1:
* leave the task_prio() in kernel/sched/core.c
* remove macro TASK_NICE and implement it as static inline
function in include/linux/sched.h.
include/linux/sched.h | 11 ++++++++++-
include/linux/sched/prio.h | 1 -
kernel/sched/core.c | 26 +++++++-------------------
kernel/sched/cputime.c | 4 ++--
4 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ba1b732..5b63361 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2083,7 +2083,16 @@ static inline void sched_autogroup_exit(struct signal_struct *sig) { }
extern bool yield_to(struct task_struct *p, bool preempt);
extern void set_user_nice(struct task_struct *p, long nice);
extern int task_prio(const struct task_struct *p);
-extern int task_nice(const struct task_struct *p);
+/**
+ * task_nice - return the nice value of a given task.
+ * @p: the task in question.
+ *
+ * Return: The nice value [ -20 ... 0 ... 19 ].
+ */
+static inline int task_nice(const struct task_struct *p)
+{
+ return PRIO_TO_NICE((p)->static_prio);
+}
extern int can_nice(const struct task_struct *p, const int nice);
extern int task_curr(const struct task_struct *p);
extern int idle_cpu(int cpu);
diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
index 13216f1..410ccb7 100644
--- a/include/linux/sched/prio.h
+++ b/include/linux/sched/prio.h
@@ -27,7 +27,6 @@
*/
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
-#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)

/*
* 'User priority' is the nice value converted to something we
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 7fea865..b2bc1db 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -2998,7 +2998,7 @@ void set_user_nice(struct task_struct *p, long nice)
unsigned long flags;
struct rq *rq;

- if (TASK_NICE(p) == nice || nice < -20 || nice > 19)
+ if (task_nice(p) == nice || nice < -20 || nice > 19)
return;
/*
* We have to be careful, if called from sys_setpriority(),
@@ -3076,7 +3076,7 @@ SYSCALL_DEFINE1(nice, int, increment)
if (increment > 40)
increment = 40;

- nice = TASK_NICE(current) + increment;
+ nice = task_nice(current) + increment;
if (nice < -20)
nice = -20;
if (nice > 19)
@@ -3109,18 +3109,6 @@ int task_prio(const struct task_struct *p)
}

/**
- * task_nice - return the nice value of a given task.
- * @p: the task in question.
- *
- * Return: The nice value [ -20 ... 0 ... 19 ].
- */
-int task_nice(const struct task_struct *p)
-{
- return TASK_NICE(p);
-}
-EXPORT_SYMBOL(task_nice);
-
-/**
* idle_cpu - is a given cpu idle currently?
* @cpu: the processor in question.
*
@@ -3319,7 +3307,7 @@ recheck:
*/
if (user && !capable(CAP_SYS_NICE)) {
if (fair_policy(policy)) {
- if (attr->sched_nice < TASK_NICE(p) &&
+ if (attr->sched_nice < task_nice(p) &&
!can_nice(p, attr->sched_nice))
return -EPERM;
}
@@ -3343,7 +3331,7 @@ recheck:
* SCHED_NORMAL if the RLIMIT_NICE would normally permit it.
*/
if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) {
- if (!can_nice(p, TASK_NICE(p)))
+ if (!can_nice(p, task_nice(p)))
return -EPERM;
}

@@ -3383,7 +3371,7 @@ recheck:
* If not changing anything there's no need to proceed further:
*/
if (unlikely(policy == p->policy)) {
- if (fair_policy(policy) && attr->sched_nice != TASK_NICE(p))
+ if (fair_policy(policy) && attr->sched_nice != task_nice(p))
goto change;
if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
goto change;
@@ -3835,7 +3823,7 @@ SYSCALL_DEFINE3(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
else if (task_has_rt_policy(p))
attr.sched_priority = p->rt_priority;
else
- attr.sched_nice = TASK_NICE(p);
+ attr.sched_nice = task_nice(p);

rcu_read_unlock();

@@ -7006,7 +6994,7 @@ void normalize_rt_tasks(void)
* Renice negative nice level userspace
* tasks back to 0:
*/
- if (TASK_NICE(p) < 0 && p->mm)
+ if (task_nice(p) < 0 && p->mm)
set_user_nice(p, 0);
continue;
}
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9994791..58624a6 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -142,7 +142,7 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
p->utimescaled += cputime_scaled;
account_group_user_time(p, cputime);

- index = (TASK_NICE(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
+ index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;

/* Add user time to cpustat. */
task_group_account_field(p, index, (__force u64) cputime);
@@ -169,7 +169,7 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime,
p->gtime += cputime;

/* Add guest time to cpustat. */
- if (TASK_NICE(p) > 0) {
+ if (task_nice(p) > 0) {
cpustat[CPUTIME_NICE] += (__force u64) cputime;
cpustat[CPUTIME_GUEST_NICE] += (__force u64) cputime;
} else {
--
1.8.2.1

2014-01-27 15:45:35

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/3] sched: Collect the bits about priority into a new header file, include/linux/sched/prio.h.

On Mon, 27 Jan 2014 17:15:36 -0500
Dongsheng Yang <[email protected]> wrote:

> Date: Mon, 27 Jan 2014 17:15:36 -0500

You need to fix your timestamps. Unless you are writing to us from the
future.

-- Steve

2014-01-28 16:06:28

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 0/3] sched: Collect the bits about priority into a new header file, include/linux/sched/prio.h.

On Tue, 28 Jan 2014 23:59:06 +0800
Dongsheng Yang <[email protected]> wrote:

> Hi Steve,
>
> Sorry for the late reply, I have been vacation for Chinese Spring Festival.
>
> Aha, yes, it looks strange a mail from future.
>
> Could you help to figure out which timezone is the standard one for LKML?

It should just be your own timezone. I think you had the wrong timezone
before as it was "Date: Mon, 27 Jan 2014 17:15:36 -0500". You had the
same timezone as I have (-0500). But this email is correct:

Date: Tue, 28 Jan 2014 23:59:06 +0800

Basically, whatever you use to send email, make sure that it is set up
to send with your own timezone. There's no standard one for LKML. The
mail clients will do the conversion.

-- Steve

2014-01-29 05:28:19

by Namhyung Kim

[permalink] [raw]
Subject: Re: [PATCH 1/3] sched: Move the priority specific bits into a new header file.

Hi Dongsheng,

On Mon, 27 Jan 2014 17:15:37 -0500, Dongsheng Yang wrote:
> Some bits about priority are defined in linux/sched/rt.h, but
> some of them are not only for rt scheduler, such as MAX_PRIO.
>
> This patch move them all into a new header file, linux/sched/prio.h.
>
> Signed-off-by: Dongsheng Yang <[email protected]>
> ---
> include/linux/sched.h | 4 ++++
> include/linux/sched/prio.h | 23 +++++++++++++++++++++++
> include/linux/sched/rt.h | 21 +++------------------
> 3 files changed, 30 insertions(+), 18 deletions(-)
> create mode 100644 include/linux/sched/prio.h
>
> diff --git a/include/linux/sched.h b/include/linux/sched.h
> index 68a0e84..ba1b732 100644
> --- a/include/linux/sched.h
> +++ b/include/linux/sched.h
> @@ -3,6 +3,10 @@
>
> #include <uapi/linux/sched.h>
>
> +#ifndef _SCHED_PRIO_H
> +#include <linux/sched/prio.h>
> +#endif /* #ifndef _SCHED_PRIO_H */

It seems you don't need to use #ifndef-#endif pair to include a header
file?

Thanks,
Namhyung

2014-02-10 02:56:21

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH 1/3] sched: Move the priority specific bits into a new header file.


Hi Namhyung,

On 01/29/2014 01:28 PM, Namhyung Kim wrote:
> Hi Dongsheng,
>
> On Mon, 27 Jan 2014 17:15:37 -0500, Dongsheng Yang wrote:
>> Some bits about priority are defined in linux/sched/rt.h, but
>> some of them are not only for rt scheduler, such as MAX_PRIO.
>>
>> This patch move them all into a new header file, linux/sched/prio.h.
>>
>> Signed-off-by: Dongsheng Yang <[email protected]>
>> ---
>> include/linux/sched.h | 4 ++++
>> include/linux/sched/prio.h | 23 +++++++++++++++++++++++
>> include/linux/sched/rt.h | 21 +++------------------
>> 3 files changed, 30 insertions(+), 18 deletions(-)
>> create mode 100644 include/linux/sched/prio.h
>>
>> diff --git a/include/linux/sched.h b/include/linux/sched.h
>> index 68a0e84..ba1b732 100644
>> --- a/include/linux/sched.h
>> +++ b/include/linux/sched.h
>> @@ -3,6 +3,10 @@
>>
>> #include <uapi/linux/sched.h>
>>
>> +#ifndef _SCHED_PRIO_H
>> +#include <linux/sched/prio.h>
>> +#endif /* #ifndef _SCHED_PRIO_H */
> It seems you don't need to use #ifndef-#endif pair to include a header
> file?

Sorry for the late reply, coming back from vacation for Chinese Spring
Festival.

The reason I use #ifndef-#endif here is that there are lots of files,
such as kernel/sched/sched.h, are including <linux/sched.h> and
<linux/sched/rt.h>. And both of them are including prio.h.

I am not sure should we avoid reincluding a file and how.

Could you help to give me some suggestion of it. Thanx :)

> Thanks,
> Namhyung
>

Subject: [tip:sched/core] sched: Move the priority specific bits into a new header file

Commit-ID: 5c228079ce8a9bb043a423069a6674dfb9268037
Gitweb: http://git.kernel.org/tip/5c228079ce8a9bb043a423069a6674dfb9268037
Author: Dongsheng Yang <[email protected]>
AuthorDate: Mon, 27 Jan 2014 17:15:37 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Sun, 9 Feb 2014 13:31:49 +0100

sched: Move the priority specific bits into a new header file

Some bits about priority are defined in linux/sched/rt.h, but
some of them are not only for rt scheduler, such as MAX_PRIO.

This patch move them all into a new header file, linux/sched/prio.h.

Signed-off-by: Dongsheng Yang <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/f7549508a1588da2c613d601748ca9de30fa5dcf.1390859827.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/sched.h | 2 ++
include/linux/sched/prio.h | 23 +++++++++++++++++++++++
include/linux/sched/rt.h | 19 +------------------
3 files changed, 26 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index ed86779..d97d0a8 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -3,6 +3,8 @@

#include <uapi/linux/sched.h>

+#include <linux/sched/prio.h>
+

struct sched_param {
int sched_priority;
diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
new file mode 100644
index 0000000..9382ba8
--- /dev/null
+++ b/include/linux/sched/prio.h
@@ -0,0 +1,23 @@
+#ifndef _SCHED_PRIO_H
+#define _SCHED_PRIO_H
+
+/*
+ * Priority of a process goes from 0..MAX_PRIO-1, valid RT
+ * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
+ * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1. Priority
+ * values are inverted: lower p->prio value means higher priority.
+ *
+ * The MAX_USER_RT_PRIO value allows the actual maximum
+ * RT priority to be separate from the value exported to
+ * user-space. This allows kernel threads to set their
+ * priority to a value higher than any user task. Note:
+ * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
+ */
+
+#define MAX_USER_RT_PRIO 100
+#define MAX_RT_PRIO MAX_USER_RT_PRIO
+
+#define MAX_PRIO (MAX_RT_PRIO + 40)
+#define DEFAULT_PRIO (MAX_RT_PRIO + 20)
+
+#endif /* _SCHED_PRIO_H */
diff --git a/include/linux/sched/rt.h b/include/linux/sched/rt.h
index 34e4ebe..f7453d4 100644
--- a/include/linux/sched/rt.h
+++ b/include/linux/sched/rt.h
@@ -1,24 +1,7 @@
#ifndef _SCHED_RT_H
#define _SCHED_RT_H

-/*
- * Priority of a process goes from 0..MAX_PRIO-1, valid RT
- * priority is 0..MAX_RT_PRIO-1, and SCHED_NORMAL/SCHED_BATCH
- * tasks are in the range MAX_RT_PRIO..MAX_PRIO-1. Priority
- * values are inverted: lower p->prio value means higher priority.
- *
- * The MAX_USER_RT_PRIO value allows the actual maximum
- * RT priority to be separate from the value exported to
- * user-space. This allows kernel threads to set their
- * priority to a value higher than any user task. Note:
- * MAX_RT_PRIO must not be smaller than MAX_USER_RT_PRIO.
- */
-
-#define MAX_USER_RT_PRIO 100
-#define MAX_RT_PRIO MAX_USER_RT_PRIO
-
-#define MAX_PRIO (MAX_RT_PRIO + 40)
-#define DEFAULT_PRIO (MAX_RT_PRIO + 20)
+#include <linux/sched/prio.h>

static inline int rt_prio(int prio)
{

Subject: [tip:sched/core] sched: Implement task_nice() as static inline function

Commit-ID: d0ea026808ad81de2af14938448419a95211b938
Gitweb: http://git.kernel.org/tip/d0ea026808ad81de2af14938448419a95211b938
Author: Dongsheng Yang <[email protected]>
AuthorDate: Mon, 27 Jan 2014 22:00:45 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Sun, 9 Feb 2014 15:28:23 +0100

sched: Implement task_nice() as static inline function

As patch "sched: Move the priority specific bits into a new header file" exposes
the priority related macros in linux/sched/prio.h, we don't have to implement
task_nice() in kernel/sched/core.c any more.

This patch implements it in linux/sched/sched.h as static inline function,
saving the kernel stack and enhancing performance a bit.

Signed-off-by: Dongsheng Yang <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/[email protected]
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/sched.h | 11 ++++++++++-
include/linux/sched/prio.h | 1 -
kernel/sched/core.c | 26 +++++++-------------------
kernel/sched/cputime.c | 4 ++--
4 files changed, 19 insertions(+), 23 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index d97d0a8..e3d5564 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -2094,7 +2094,16 @@ static inline void sched_autogroup_exit(struct signal_struct *sig) { }
extern bool yield_to(struct task_struct *p, bool preempt);
extern void set_user_nice(struct task_struct *p, long nice);
extern int task_prio(const struct task_struct *p);
-extern int task_nice(const struct task_struct *p);
+/**
+ * task_nice - return the nice value of a given task.
+ * @p: the task in question.
+ *
+ * Return: The nice value [ -20 ... 0 ... 19 ].
+ */
+static inline int task_nice(const struct task_struct *p)
+{
+ return PRIO_TO_NICE((p)->static_prio);
+}
extern int can_nice(const struct task_struct *p, const int nice);
extern int task_curr(const struct task_struct *p);
extern int idle_cpu(int cpu);
diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
index 13216f1..410ccb7 100644
--- a/include/linux/sched/prio.h
+++ b/include/linux/sched/prio.h
@@ -27,7 +27,6 @@
*/
#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
-#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)

/*
* 'User priority' is the nice value converted to something we
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 210a12a..104c816 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -3000,7 +3000,7 @@ void set_user_nice(struct task_struct *p, long nice)
unsigned long flags;
struct rq *rq;

- if (TASK_NICE(p) == nice || nice < -20 || nice > 19)
+ if (task_nice(p) == nice || nice < -20 || nice > 19)
return;
/*
* We have to be careful, if called from sys_setpriority(),
@@ -3078,7 +3078,7 @@ SYSCALL_DEFINE1(nice, int, increment)
if (increment > 40)
increment = 40;

- nice = TASK_NICE(current) + increment;
+ nice = task_nice(current) + increment;
if (nice < -20)
nice = -20;
if (nice > 19)
@@ -3111,18 +3111,6 @@ int task_prio(const struct task_struct *p)
}

/**
- * task_nice - return the nice value of a given task.
- * @p: the task in question.
- *
- * Return: The nice value [ -20 ... 0 ... 19 ].
- */
-int task_nice(const struct task_struct *p)
-{
- return TASK_NICE(p);
-}
-EXPORT_SYMBOL(task_nice);
-
-/**
* idle_cpu - is a given cpu idle currently?
* @cpu: the processor in question.
*
@@ -3321,7 +3309,7 @@ recheck:
*/
if (user && !capable(CAP_SYS_NICE)) {
if (fair_policy(policy)) {
- if (attr->sched_nice < TASK_NICE(p) &&
+ if (attr->sched_nice < task_nice(p) &&
!can_nice(p, attr->sched_nice))
return -EPERM;
}
@@ -3345,7 +3333,7 @@ recheck:
* SCHED_NORMAL if the RLIMIT_NICE would normally permit it.
*/
if (p->policy == SCHED_IDLE && policy != SCHED_IDLE) {
- if (!can_nice(p, TASK_NICE(p)))
+ if (!can_nice(p, task_nice(p)))
return -EPERM;
}

@@ -3385,7 +3373,7 @@ recheck:
* If not changing anything there's no need to proceed further:
*/
if (unlikely(policy == p->policy)) {
- if (fair_policy(policy) && attr->sched_nice != TASK_NICE(p))
+ if (fair_policy(policy) && attr->sched_nice != task_nice(p))
goto change;
if (rt_policy(policy) && attr->sched_priority != p->rt_priority)
goto change;
@@ -3837,7 +3825,7 @@ SYSCALL_DEFINE3(sched_getattr, pid_t, pid, struct sched_attr __user *, uattr,
else if (task_has_rt_policy(p))
attr.sched_priority = p->rt_priority;
else
- attr.sched_nice = TASK_NICE(p);
+ attr.sched_nice = task_nice(p);

rcu_read_unlock();

@@ -7010,7 +6998,7 @@ void normalize_rt_tasks(void)
* Renice negative nice level userspace
* tasks back to 0:
*/
- if (TASK_NICE(p) < 0 && p->mm)
+ if (task_nice(p) < 0 && p->mm)
set_user_nice(p, 0);
continue;
}
diff --git a/kernel/sched/cputime.c b/kernel/sched/cputime.c
index 9994791..58624a6 100644
--- a/kernel/sched/cputime.c
+++ b/kernel/sched/cputime.c
@@ -142,7 +142,7 @@ void account_user_time(struct task_struct *p, cputime_t cputime,
p->utimescaled += cputime_scaled;
account_group_user_time(p, cputime);

- index = (TASK_NICE(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;
+ index = (task_nice(p) > 0) ? CPUTIME_NICE : CPUTIME_USER;

/* Add user time to cpustat. */
task_group_account_field(p, index, (__force u64) cputime);
@@ -169,7 +169,7 @@ static void account_guest_time(struct task_struct *p, cputime_t cputime,
p->gtime += cputime;

/* Add guest time to cpustat. */
- if (TASK_NICE(p) > 0) {
+ if (task_nice(p) > 0) {
cpustat[CPUTIME_NICE] += (__force u64) cputime;
cpustat[CPUTIME_GUEST_NICE] += (__force u64) cputime;
} else {

Subject: [tip:sched/core] sched: Expose some macros related to priority

Commit-ID: 6b6350f155afdfdf888e18c7bf26950a6d10b0c2
Gitweb: http://git.kernel.org/tip/6b6350f155afdfdf888e18c7bf26950a6d10b0c2
Author: Dongsheng Yang <[email protected]>
AuthorDate: Mon, 27 Jan 2014 17:15:38 -0500
Committer: Ingo Molnar <[email protected]>
CommitDate: Sun, 9 Feb 2014 13:31:51 +0100

sched: Expose some macros related to priority

Some macros in kernel/sched/sched.h about priority are
private to kernel/sched. But they are useful to other
parts of the core kernel.

This patch moves these macros from kernel/sched/sched.h to
include/linux/sched/prio.h so that they are available to
other subsystems.

Signed-off-by: Dongsheng Yang <[email protected]>
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Cc: [email protected]
Signed-off-by: Peter Zijlstra <[email protected]>
Link: http://lkml.kernel.org/r/2b022810905b52d13238466807f4b2a691577180.1390859827.git.yangds.fnst@cn.fujitsu.com
Signed-off-by: Ingo Molnar <[email protected]>
---
include/linux/sched/prio.h | 18 ++++++++++++++++++
kernel/sched/sched.h | 18 ------------------
2 files changed, 18 insertions(+), 18 deletions(-)

diff --git a/include/linux/sched/prio.h b/include/linux/sched/prio.h
index 9382ba8..13216f1 100644
--- a/include/linux/sched/prio.h
+++ b/include/linux/sched/prio.h
@@ -20,4 +20,22 @@
#define MAX_PRIO (MAX_RT_PRIO + 40)
#define DEFAULT_PRIO (MAX_RT_PRIO + 20)

+/*
+ * Convert user-nice values [ -20 ... 0 ... 19 ]
+ * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
+ * and back.
+ */
+#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
+#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
+#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
+
+/*
+ * 'User priority' is the nice value converted to something we
+ * can work with better when scaling various scheduler parameters,
+ * it's a [ 0 ... 39 ] range.
+ */
+#define USER_PRIO(p) ((p)-MAX_RT_PRIO)
+#define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio)
+#define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
+
#endif /* _SCHED_PRIO_H */
diff --git a/kernel/sched/sched.h b/kernel/sched/sched.h
index c2119fd..b44720d 100644
--- a/kernel/sched/sched.h
+++ b/kernel/sched/sched.h
@@ -24,24 +24,6 @@ extern long calc_load_fold_active(struct rq *this_rq);
extern void update_cpu_load_active(struct rq *this_rq);

/*
- * Convert user-nice values [ -20 ... 0 ... 19 ]
- * to static priority [ MAX_RT_PRIO..MAX_PRIO-1 ],
- * and back.
- */
-#define NICE_TO_PRIO(nice) (MAX_RT_PRIO + (nice) + 20)
-#define PRIO_TO_NICE(prio) ((prio) - MAX_RT_PRIO - 20)
-#define TASK_NICE(p) PRIO_TO_NICE((p)->static_prio)
-
-/*
- * 'User priority' is the nice value converted to something we
- * can work with better when scaling various scheduler parameters,
- * it's a [ 0 ... 39 ] range.
- */
-#define USER_PRIO(p) ((p)-MAX_RT_PRIO)
-#define TASK_USER_PRIO(p) USER_PRIO((p)->static_prio)
-#define MAX_USER_PRIO (USER_PRIO(MAX_PRIO))
-
-/*
* Helpers for converting nanosecond timing to jiffy resolution
*/
#define NS_TO_JIFFIES(TIME) ((unsigned long)(TIME) / (NSEC_PER_SEC / HZ))

2014-02-10 14:09:55

by Steven Rostedt

[permalink] [raw]
Subject: Re: [PATCH 1/3] sched: Move the priority specific bits into a new header file.

On Mon, 10 Feb 2014 10:56:34 +0800
Dongsheng Yang <[email protected]> wrote:

> >> diff --git a/include/linux/sched.h b/include/linux/sched.h
> >> index 68a0e84..ba1b732 100644
> >> --- a/include/linux/sched.h
> >> +++ b/include/linux/sched.h
> >> @@ -3,6 +3,10 @@
> >>
> >> #include <uapi/linux/sched.h>
> >>
> >> +#ifndef _SCHED_PRIO_H
> >> +#include <linux/sched/prio.h>
> >> +#endif /* #ifndef _SCHED_PRIO_H */
> > It seems you don't need to use #ifndef-#endif pair to include a header
> > file?
>
> Sorry for the late reply, coming back from vacation for Chinese Spring
> Festival.
>
> The reason I use #ifndef-#endif here is that there are lots of files,
> such as kernel/sched/sched.h, are including <linux/sched.h> and
> <linux/sched/rt.h>. And both of them are including prio.h.
>
> I am not sure should we avoid reincluding a file and how.
>
> Could you help to give me some suggestion of it. Thanx :)


That's why you have:

+++ b/include/linux/sched/prio.h
@@ -0,0 +1,23 @@
+#ifndef _SCHED_PRIO_H
+#define _SCHED_PRIO_H

The first time a header gets included, it checks if _SCHED_PRIO_H is
defined, if not, it defines it and continues. Otherwise it skips the
content of the file.

This is so standard practice that CPP (C Pre-Processor) optimizes this
by checking if this exists and caches it. It wont even open the file
the second time it sees it included.

-- Steve

2014-02-11 01:10:58

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH 1/3] sched: Move the priority specific bits into a new header file.

Hi Steven,

On 02/10/2014 10:09 PM, Steven Rostedt wrote:
>
>
> That's why you have:
>
> +++ b/include/linux/sched/prio.h
> @@ -0,0 +1,23 @@
> +#ifndef _SCHED_PRIO_H
> +#define _SCHED_PRIO_H
>
> The first time a header gets included, it checks if _SCHED_PRIO_H is
> defined, if not, it defines it and continues. Otherwise it skips the
> content of the file.
>
> This is so standard practice that CPP (C Pre-Processor) optimizes this
> by checking if this exists and caches it. It wont even open the file
> the second time it sees it included.

Wow, yes. Thank you for your kind explanation. :)
>
> -- Steve
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>

2014-02-11 03:19:39

by Dongsheng Yang

[permalink] [raw]
Subject: Re: [PATCH] tracing: Use task_nice() in function __update_max_tr() to get the nice value of task.

Hi Steve,
As my patch to implement task_nice() as inline function was applied
to tip tree.
http://git.kernel.org/cgit/linux/kernel/git/tip/tip.git/commit/?id=d0ea026808ad81de2af14938448419a95211b938

Please consider to apply this patch in this thread. Thanx :)

On 01/23/2014 06:41 AM, Dongsheng Yang wrote:
> There is already a function named task_nice in sched.h to get the nice value
> of task_struct. We can use it in __update_max_tr() rather than calculate it
> manually.
>
> Signed-off-by: Dongsheng Yang <[email protected]>
> ---
> kernel/trace/trace.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/trace/trace.c b/kernel/trace/trace.c
> index 9d20cd9..ec149b4 100644
> --- a/kernel/trace/trace.c
> +++ b/kernel/trace/trace.c
> @@ -970,7 +970,7 @@ __update_max_tr(struct trace_array *tr, struct task_struct *tsk, int cpu)
> else
> max_data->uid = task_uid(tsk);
>
> - max_data->nice = tsk->static_prio - 20 - MAX_RT_PRIO;
> + max_data->nice = task_nice(tsk);
> max_data->policy = tsk->policy;
> max_data->rt_priority = tsk->rt_priority;
>