2020-03-17 02:04:02

by Yafang Shao

Subject: [PATCH v2] psi: move PF_MEMSTALL out of task->flags

task->flags is a 32-bit field in which 31 bits have already been
consumed, so it is hard to introduce another new per-process flag.
There is still enough room in the bit-field section of task_struct,
so we can define the memstall state as a single bit there instead.
This patch also removes an out-of-date comment pointed out by Matthew.

Suggested-by: Johannes Weiner <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Cc: Matthew Wilcox <[email protected]>
Signed-off-by: Yafang Shao <[email protected]>
---
include/linux/sched.h | 6 ++++--
kernel/sched/psi.c | 12 ++++++------
kernel/sched/stats.h | 10 +++++-----
3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 0d84f8f..c429e97 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -783,9 +783,12 @@ struct task_struct {
unsigned frozen:1;
#endif
#ifdef CONFIG_BLK_CGROUP
- /* to be used once the psi infrastructure lands upstream. */
unsigned use_memdelay:1;
#endif
+#ifdef CONFIG_PSI
+ /* Stalled due to lack of memory */
+ unsigned in_memstall:1;
+#endif

unsigned long atomic_flags; /* Flags requiring atomic access. */

@@ -1490,7 +1493,6 @@ static inline int is_global_init(struct task_struct *tsk)
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* Randomize virtual address space */
#define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */
-#define PF_MEMSTALL 0x01000000 /* Stalled due to lack of memory */
#define PF_UMH 0x02000000 /* I'm an Usermodehelper process */
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_mask */
#define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 517e371..d068e83 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -817,17 +817,17 @@ void psi_memstall_enter(unsigned long *flags)
if (static_branch_likely(&psi_disabled))
return;

- *flags = current->flags & PF_MEMSTALL;
+ *flags = current->in_memstall;
if (*flags)
return;
/*
- * PF_MEMSTALL setting & accounting needs to be atomic wrt
+ * in_memstall setting & accounting needs to be atomic wrt
* changes to the task's scheduling state, otherwise we can
* race with CPU migration.
*/
rq = this_rq_lock_irq(&rf);

- current->flags |= PF_MEMSTALL;
+ current->in_memstall = 1;
psi_task_change(current, 0, TSK_MEMSTALL);

rq_unlock_irq(rq, &rf);
@@ -850,13 +850,13 @@ void psi_memstall_leave(unsigned long *flags)
if (*flags)
return;
/*
- * PF_MEMSTALL clearing & accounting needs to be atomic wrt
+ * in_memstall clearing & accounting needs to be atomic wrt
* changes to the task's scheduling state, otherwise we could
* race with CPU migration.
*/
rq = this_rq_lock_irq(&rf);

- current->flags &= ~PF_MEMSTALL;
+ current->in_memstall = 0;
psi_task_change(current, TSK_MEMSTALL, 0);

rq_unlock_irq(rq, &rf);
@@ -920,7 +920,7 @@ void cgroup_move_task(struct task_struct *task, struct css_set *to)
else if (task->in_iowait)
task_flags = TSK_IOWAIT;

- if (task->flags & PF_MEMSTALL)
+ if (task->in_memstall)
task_flags |= TSK_MEMSTALL;

if (task_flags)
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index ba683fe..199e304 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -70,7 +70,7 @@ static inline void psi_enqueue(struct task_struct *p, bool wakeup)
return;

if (!wakeup || p->sched_psi_wake_requeue) {
- if (p->flags & PF_MEMSTALL)
+ if (p->in_memstall)
set |= TSK_MEMSTALL;
if (p->sched_psi_wake_requeue)
p->sched_psi_wake_requeue = 0;
@@ -90,7 +90,7 @@ static inline void psi_dequeue(struct task_struct *p, bool sleep)
return;

if (!sleep) {
- if (p->flags & PF_MEMSTALL)
+ if (p->in_memstall)
clear |= TSK_MEMSTALL;
} else {
if (p->in_iowait)
@@ -109,14 +109,14 @@ static inline void psi_ttwu_dequeue(struct task_struct *p)
* deregister its sleep-persistent psi states from the old
* queue, and let psi_enqueue() know it has to requeue.
*/
- if (unlikely(p->in_iowait || (p->flags & PF_MEMSTALL))) {
+ if (unlikely(p->in_iowait || p->in_memstall)) {
struct rq_flags rf;
struct rq *rq;
int clear = 0;

if (p->in_iowait)
clear |= TSK_IOWAIT;
- if (p->flags & PF_MEMSTALL)
+ if (p->in_memstall)
clear |= TSK_MEMSTALL;

rq = __task_rq_lock(p, &rf);
@@ -131,7 +131,7 @@ static inline void psi_task_tick(struct rq *rq)
if (static_branch_likely(&psi_disabled))
return;

- if (unlikely(rq->curr->flags & PF_MEMSTALL))
+ if (unlikely(rq->curr->in_memstall))
psi_memstall_tick(rq->curr, cpu_of(rq));
}
#else /* CONFIG_PSI */
--
1.8.3.1
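
The save/restore pattern in psi_memstall_enter() and psi_memstall_leave() carries over unchanged from the flag mask to the bit-field. Below is a minimal userspace sketch of that nesting behaviour. It is not the kernel code: struct task_model, memstall_enter(), memstall_leave(), and the stall_transitions counter are hypothetical stand-ins, and the runqueue locking and psi_task_change() accounting are deliberately left out.

```c
#include <assert.h>

/* Hypothetical stand-in for the relevant part of task_struct. */
struct task_model {
	unsigned int flags;      /* models the 32-bit PF_* word */
	unsigned in_iowait:1;    /* an existing neighbour in the bit-field section */
	unsigned in_memstall:1;  /* the bit added by the patch */
};

/* Counts real state changes; stands in for the psi accounting calls. */
static int stall_transitions;

/* Save the previous state in *flags; only the outermost call sets the bit. */
static void memstall_enter(struct task_model *t, unsigned long *flags)
{
	*flags = t->in_memstall;
	if (*flags)
		return;              /* nested call: already stalled */

	t->in_memstall = 1;
	stall_transitions++;
}

/* Only the call whose saved state was 0 clears the bit again. */
static void memstall_leave(struct task_model *t, unsigned long *flags)
{
	if (*flags)
		return;              /* nested call: the outer caller clears it */

	assert(t->in_memstall);  /* a leave must pair with an earlier enter */
	t->in_memstall = 0;
	stall_transitions++;
}
```

In this model a nested enter/leave pair leaves the bit untouched, so only the outermost pair changes state. That is the same property the real functions get from the saved *flags value.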


Subject: [tip: sched/core] psi: Move PF_MEMSTALL out of task->flags

The following commit has been merged into the sched/core branch of tip:

Commit-ID: 1066d1b6974e095d5a6c472ad9180a957b496cd6
Gitweb: https://git.kernel.org/tip/1066d1b6974e095d5a6c472ad9180a957b496cd6
Author: Yafang Shao <[email protected]>
AuthorDate: Mon, 16 Mar 2020 21:28:05 -04:00
Committer: Peter Zijlstra <[email protected]>
CommitterDate: Fri, 20 Mar 2020 13:06:19 +01:00

psi: Move PF_MEMSTALL out of task->flags

task->flags is a 32-bit field in which 31 bits have already been
consumed, so it is hard to introduce another new per-process flag.
There is still enough room in the bit-field section of task_struct,
so we can define the memstall state as a single bit there instead.
This patch also removes an out-of-date comment pointed out by Matthew.

Suggested-by: Johannes Weiner <[email protected]>
Signed-off-by: Yafang Shao <[email protected]>
Signed-off-by: Peter Zijlstra (Intel) <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
Link: https://lkml.kernel.org/r/[email protected]
---
include/linux/sched.h | 6 ++++--
kernel/sched/psi.c | 12 ++++++------
kernel/sched/stats.h | 10 +++++-----
3 files changed, 15 insertions(+), 13 deletions(-)

diff --git a/include/linux/sched.h b/include/linux/sched.h
index 2e9199b..09bddd9 100644
--- a/include/linux/sched.h
+++ b/include/linux/sched.h
@@ -785,9 +785,12 @@ struct task_struct {
unsigned frozen:1;
#endif
#ifdef CONFIG_BLK_CGROUP
- /* to be used once the psi infrastructure lands upstream. */
unsigned use_memdelay:1;
#endif
+#ifdef CONFIG_PSI
+ /* Stalled due to lack of memory */
+ unsigned in_memstall:1;
+#endif

unsigned long atomic_flags; /* Flags requiring atomic access. */

@@ -1480,7 +1483,6 @@ extern struct pid *cad_pid;
#define PF_KTHREAD 0x00200000 /* I am a kernel thread */
#define PF_RANDOMIZE 0x00400000 /* Randomize virtual address space */
#define PF_SWAPWRITE 0x00800000 /* Allowed to write to swap */
-#define PF_MEMSTALL 0x01000000 /* Stalled due to lack of memory */
#define PF_UMH 0x02000000 /* I'm an Usermodehelper process */
#define PF_NO_SETAFFINITY 0x04000000 /* Userland is not allowed to meddle with cpus_mask */
#define PF_MCE_EARLY 0x08000000 /* Early kill for mce process policy */
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 955a124..8f45cdb 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -865,17 +865,17 @@ void psi_memstall_enter(unsigned long *flags)
if (static_branch_likely(&psi_disabled))
return;

- *flags = current->flags & PF_MEMSTALL;
+ *flags = current->in_memstall;
if (*flags)
return;
/*
- * PF_MEMSTALL setting & accounting needs to be atomic wrt
+ * in_memstall setting & accounting needs to be atomic wrt
* changes to the task's scheduling state, otherwise we can
* race with CPU migration.
*/
rq = this_rq_lock_irq(&rf);

- current->flags |= PF_MEMSTALL;
+ current->in_memstall = 1;
psi_task_change(current, 0, TSK_MEMSTALL);

rq_unlock_irq(rq, &rf);
@@ -898,13 +898,13 @@ void psi_memstall_leave(unsigned long *flags)
if (*flags)
return;
/*
- * PF_MEMSTALL clearing & accounting needs to be atomic wrt
+ * in_memstall clearing & accounting needs to be atomic wrt
* changes to the task's scheduling state, otherwise we could
* race with CPU migration.
*/
rq = this_rq_lock_irq(&rf);

- current->flags &= ~PF_MEMSTALL;
+ current->in_memstall = 0;
psi_task_change(current, TSK_MEMSTALL, 0);

rq_unlock_irq(rq, &rf);
@@ -970,7 +970,7 @@ void cgroup_move_task(struct task_struct *task, struct css_set *to)
} else if (task->in_iowait)
task_flags = TSK_IOWAIT;

- if (task->flags & PF_MEMSTALL)
+ if (task->in_memstall)
task_flags |= TSK_MEMSTALL;

if (task_flags)
diff --git a/kernel/sched/stats.h b/kernel/sched/stats.h
index 1339f5b..33d0daf 100644
--- a/kernel/sched/stats.h
+++ b/kernel/sched/stats.h
@@ -70,7 +70,7 @@ static inline void psi_enqueue(struct task_struct *p, bool wakeup)
return;

if (!wakeup || p->sched_psi_wake_requeue) {
- if (p->flags & PF_MEMSTALL)
+ if (p->in_memstall)
set |= TSK_MEMSTALL;
if (p->sched_psi_wake_requeue)
p->sched_psi_wake_requeue = 0;
@@ -90,7 +90,7 @@ static inline void psi_dequeue(struct task_struct *p, bool sleep)
return;

if (!sleep) {
- if (p->flags & PF_MEMSTALL)
+ if (p->in_memstall)
clear |= TSK_MEMSTALL;
} else {
/*
@@ -117,14 +117,14 @@ static inline void psi_ttwu_dequeue(struct task_struct *p)
* deregister its sleep-persistent psi states from the old
* queue, and let psi_enqueue() know it has to requeue.
*/
- if (unlikely(p->in_iowait || (p->flags & PF_MEMSTALL))) {
+ if (unlikely(p->in_iowait || p->in_memstall)) {
struct rq_flags rf;
struct rq *rq;
int clear = 0;

if (p->in_iowait)
clear |= TSK_IOWAIT;
- if (p->flags & PF_MEMSTALL)
+ if (p->in_memstall)
clear |= TSK_MEMSTALL;

rq = __task_rq_lock(p, &rf);
@@ -149,7 +149,7 @@ static inline void psi_task_tick(struct rq *rq)
if (static_branch_likely(&psi_disabled))
return;

- if (unlikely(rq->curr->flags & PF_MEMSTALL))
+ if (unlikely(rq->curr->in_memstall))
psi_memstall_tick(rq->curr, cpu_of(rq));
}
#else /* CONFIG_PSI */
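
To make the bit budget concrete, here is a small sketch that uses only mask values visible in the diff above. The helper names flag_bit_index(), flag_in_use(), and check_vacated_bit(), and the PF_MEMSTALL_OLD spelling, are illustrative assumptions and not kernel identifiers. It shows that the removed PF_MEMSTALL mask occupied bit 24 of task->flags, between PF_SWAPWRITE and PF_UMH, which is the bit this commit hands back.

```c
#include <assert.h>
#include <stdint.h>

/* Mask values copied from the diff; PF_MEMSTALL_OLD is the removed one. */
#define PF_SWAPWRITE     0x00800000u
#define PF_MEMSTALL_OLD  0x01000000u
#define PF_UMH           0x02000000u

/* Index of the single set bit in mask, or -1 if not exactly one bit is set. */
static int flag_bit_index(uint32_t mask)
{
	int idx = 0;

	if (mask == 0 || (mask & (mask - 1)) != 0)
		return -1;
	while ((mask & 1u) == 0) {
		mask >>= 1;
		idx++;
	}
	return idx;
}

/* Nonzero if candidate collides with any bit already present in used. */
static int flag_in_use(uint32_t used, uint32_t candidate)
{
	return (used & candidate) != 0;
}

/* Self-check: the vacated bit sits between its two neighbours. */
static void check_vacated_bit(void)
{
	assert(flag_bit_index(PF_MEMSTALL_OLD) == 24);
	assert(!flag_in_use(PF_SWAPWRITE | PF_UMH, PF_MEMSTALL_OLD));
}
```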

2020-03-21 02:48:30

by Yafang Shao

Subject: Re: [tip: sched/core] psi: Move PF_MEMSTALL out of task->flags

On Fri, Mar 20, 2020 at 8:58 PM tip-bot2 for Yafang Shao
<[email protected]> wrote:
>
> The following commit has been merged into the sched/core branch of tip:
>
> Commit-ID: 1066d1b6974e095d5a6c472ad9180a957b496cd6
> Gitweb: https://git.kernel.org/tip/1066d1b6974e095d5a6c472ad9180a957b496cd6
> Author: Yafang Shao <[email protected]>
> AuthorDate: Mon, 16 Mar 2020 21:28:05 -04:00
> Committer: Peter Zijlstra <[email protected]>
> CommitterDate: Fri, 20 Mar 2020 13:06:19 +01:00

+ Andrew

Hi Peter,

This patch was already added to Andrew's -mm tree [1].
I'm not sure whether that could cause a merge conflict when both
copies are merged into Linus's tree.

[1]. https://marc.info/?l=linux-mm-commits&m=158456557519886&w=2

--
Yafang Shao
DiDi

2020-03-21 02:56:43

by Andrew Morton

Subject: Re: [tip: sched/core] psi: Move PF_MEMSTALL out of task->flags

On Sat, 21 Mar 2020 10:47:05 +0800 Yafang Shao <[email protected]> wrote:

> This patch was already added to Andrew's -mm tree [1].
> I'm not sure whether that could cause a merge conflict when both
> copies are merged into Linus's tree.
>

That's OK - if a patch turns up in someone else's -next tree I'll drop
my copy. Usually after checking that the other copy was the same
version (it usually is) and after checking whether it has up to date
cc:stable and review/ack tags (it usually doesn't!).

2020-03-21 03:40:42

by Yafang Shao

Subject: Re: [tip: sched/core] psi: Move PF_MEMSTALL out of task->flags

On Sat, Mar 21, 2020 at 10:55 AM Andrew Morton
<[email protected]> wrote:
>
> On Sat, 21 Mar 2020 10:47:05 +0800 Yafang Shao <[email protected]> wrote:
>
> > This patch was already added to Andrew's -mm tree [1].
> > I'm not sure whether that could cause a merge conflict when both
> > copies are merged into Linus's tree.
> >
>
> That's OK - if a patch turns up in someone else's -next tree I'll drop
> my copy. Usually after checking that the other copy was the same
> version (it usually is) and after checking whether it has up to date
> cc:stable and review/ack tags (it usually doesn't!).
>

Got it.
Thanks for your explanation.

--
Yafang Shao
DiDi