2021-11-03 18:49:31

by Jimmy Shiu

[permalink] [raw]
Subject: [PATCH] SCHED: attribute page lock and waitqueue functions as sched

trace_sched_blocked_trace in CFS is really useful for debugging via
trace because it tell where the process was stuck on callstack.

For example,
<...>-6143 ( 6136) [005] d..2 50.278987: sched_blocked_reason: pid=6136 iowait=0 caller=SyS_mprotect+0x88/0x208
<...>-6136 ( 6136) [005] d..2 50.278990: sched_blocked_reason: pid=6142 iowait=0 caller=do_page_fault+0x1f4/0x3b0
<...>-6142 ( 6136) [006] d..2 50.278996: sched_blocked_reason: pid=6144 iowait=0 caller=SyS_prctl+0x52c/0xb58
<...>-6144 ( 6136) [006] d..2 50.279007: sched_blocked_reason: pid=6136 iowait=0 caller=vm_mmap_pgoff+0x74/0x104

However, sometime it gives pointless information like this.
RenderThread-2322 ( 1805) [006] d.s3 50.319046: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220
logd.writer-594 ( 587) [002] d.s3 50.334011: sched_blocked_reason: pid=6126 iowait=1 caller=wait_on_page_bit+0x194/0x208
kworker/u16:13-333 ( 333) [007] d.s4 50.343161: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220

Such wait_on_page_bit, __lock_page_killable are pointless because it doesn't
carry on higher information to identify the callstack.

The reason is page_lock and waitqueue are special synchronization method unlike
other normal locks(mutex, spinlock).
Let's mark them as "__sched" so get_wchan which used in trace_sched_blocked_trace
could detect it and skip them. It will produce more meaningful callstack
function like this.

<...>-2867 ( 1068) [002] d.h4 124.209701: sched_blocked_reason: pid=329 iowait=0 caller=worker_thread+0x378/0x470
<...>-2867 ( 1068) [002] d.s3 124.209763: sched_blocked_reason: pid=8454 iowait=1 caller=__filemap_fdatawait_range+0xa0/0x104
<...>-2867 ( 1068) [002] d.s4 124.209803: sched_blocked_reason: pid=869 iowait=0 caller=worker_thread+0x378/0x470
ScreenDecoratio-2364 ( 1867) [002] d.s3 124.209973: sched_blocked_reason: pid=8454 iowait=1 caller=f2fs_wait_on_page_writeback+0x84/0xcc
ScreenDecoratio-2364 ( 1867) [002] d.s4 124.209986: sched_blocked_reason: pid=869 iowait=0 caller=worker_thread+0x378/0x470
<...>-329 ( 329) [000] d..3 124.210435: sched_blocked_reason: pid=538 iowait=0 caller=worker_thread+0x378/0x470
kworker/u16:13-538 ( 538) [007] d..3 124.210450: sched_blocked_reason: pid=6 iowait=0 caller=worker_thread+0x378/0x470

Test: build pass and boot to home.
Bug: 144961676
Bug: 144713689
Bug: 172212772
Signed-off-by: Minchan Kim <[email protected]>
Signed-off-by: Jimmy Shiu <[email protected]>
(cherry picked from commit 1e4de875d9e0cfaccf5131bcc709ae8646cdc168)
Signed-off-by: Will McVicker <[email protected]>
---
include/linux/pagemap.h | 17 +++++++++--------
kernel/sched/wait.c | 8 +++++---
mm/filemap.c | 14 +++++++-------
3 files changed, 21 insertions(+), 18 deletions(-)

diff --git a/include/linux/pagemap.h b/include/linux/pagemap.h
index db2c3e3eb1cf..12e82ff7686a 100644
--- a/include/linux/pagemap.h
+++ b/include/linux/pagemap.h
@@ -15,6 +15,7 @@
#include <linux/bitops.h>
#include <linux/hardirq.h> /* for in_interrupt() */
#include <linux/hugetlb_inline.h>
+#include <linux/sched/debug.h>

struct pagevec;

@@ -687,7 +688,7 @@ static inline void folio_lock(struct folio *folio)
/*
* lock_page may only be called if we have the page's inode pinned.
*/
-static inline void lock_page(struct page *page)
+static inline __sched void lock_page(struct page *page)
{
struct folio *folio;
might_sleep();
@@ -697,7 +698,7 @@ static inline void lock_page(struct page *page)
__folio_lock(folio);
}

-static inline int folio_lock_killable(struct folio *folio)
+static inline __sched int folio_lock_killable(struct folio *folio)
{
might_sleep();
if (!folio_trylock(folio))
@@ -710,7 +711,7 @@ static inline int folio_lock_killable(struct folio *folio)
* signals. It returns 0 if it locked the page and -EINTR if it was
* killed while waiting.
*/
-static inline int lock_page_killable(struct page *page)
+static inline __sched int lock_page_killable(struct page *page)
{
return folio_lock_killable(page_folio(page));
}
@@ -722,7 +723,7 @@ static inline int lock_page_killable(struct page *page)
* Return value and mmap_lock implications depend on flags; see
* __folio_lock_or_retry().
*/
-static inline bool lock_page_or_retry(struct page *page, struct mm_struct *mm,
+static inline __sched bool lock_page_or_retry(struct page *page, struct mm_struct *mm,
unsigned int flags)
{
struct folio *folio;
@@ -746,25 +747,25 @@ int folio_wait_bit_killable(struct folio *folio, int bit_nr);
* ie with increased "page->count" so that the folio won't
* go away during the wait..
*/
-static inline void folio_wait_locked(struct folio *folio)
+static inline __sched void folio_wait_locked(struct folio *folio)
{
if (folio_test_locked(folio))
folio_wait_bit(folio, PG_locked);
}

-static inline int folio_wait_locked_killable(struct folio *folio)
+static inline __sched int folio_wait_locked_killable(struct folio *folio)
{
if (!folio_test_locked(folio))
return 0;
return folio_wait_bit_killable(folio, PG_locked);
}

-static inline void wait_on_page_locked(struct page *page)
+static inline __sched void wait_on_page_locked(struct page *page)
{
folio_wait_locked(page_folio(page));
}

-static inline int wait_on_page_locked_killable(struct page *page)
+static inline __sched int wait_on_page_locked_killable(struct page *page)
{
return folio_wait_locked_killable(page_folio(page));
}
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index 76577d1642a5..a5975579a741 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -404,7 +404,8 @@ void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_en
}
EXPORT_SYMBOL(finish_wait);

-int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key)
+__sched int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
+ int sync, void *key)
{
int ret = default_wake_function(wq_entry, mode, sync, key);

@@ -440,7 +441,7 @@ static inline bool is_kthread_should_stop(void)
* } smp_mb(); // C
* remove_wait_queue(&wq_head, &wait); wq_entry->flags |= WQ_FLAG_WOKEN;
*/
-long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
+__sched long wait_woken(struct wait_queue_entry *wq_entry, unsigned int mode, long timeout)
{
/*
* The below executes an smp_mb(), which matches with the full barrier
@@ -465,7 +466,8 @@ long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
}
EXPORT_SYMBOL(wait_woken);

-int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key)
+__sched int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
+ int sync, void *key)
{
/* Pairs with the smp_store_mb() in wait_woken(). */
smp_mb(); /* C */
diff --git a/mm/filemap.c b/mm/filemap.c
index bfcef6ff7a27..ad4268ee1bf1 100644
--- a/mm/filemap.c
+++ b/mm/filemap.c
@@ -1271,7 +1271,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
/* How many times do we accept lock stealing from under a waiter? */
int sysctl_page_lock_unfairness = 5;

-static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
+static inline __sched int folio_wait_bit_common(struct folio *folio, int bit_nr,
int state, enum behavior behavior)
{
wait_queue_head_t *q = folio_waitqueue(folio);
@@ -1411,13 +1411,13 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
}

-void folio_wait_bit(struct folio *folio, int bit_nr)
+__sched void folio_wait_bit(struct folio *folio, int bit_nr)
{
folio_wait_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
}
EXPORT_SYMBOL(folio_wait_bit);

-int folio_wait_bit_killable(struct folio *folio, int bit_nr)
+__sched int folio_wait_bit_killable(struct folio *folio, int bit_nr)
{
return folio_wait_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
}
@@ -1628,21 +1628,21 @@ EXPORT_SYMBOL_GPL(page_endio);
* __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
* @folio: The folio to lock
*/
-void __folio_lock(struct folio *folio)
+__sched void __folio_lock(struct folio *folio)
{
folio_wait_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
EXCLUSIVE);
}
EXPORT_SYMBOL(__folio_lock);

-int __folio_lock_killable(struct folio *folio)
+__sched int __folio_lock_killable(struct folio *folio)
{
return folio_wait_bit_common(folio, PG_locked, TASK_KILLABLE,
EXCLUSIVE);
}
EXPORT_SYMBOL_GPL(__folio_lock_killable);

-static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
+static __sched int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
{
struct wait_queue_head *q = folio_waitqueue(folio);
int ret = 0;
@@ -1679,7 +1679,7 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
* If neither ALLOW_RETRY nor KILLABLE are set, will always return true
* with the folio locked and the mmap_lock unperturbed.
*/
-bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
+__sched bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
unsigned int flags)
{
if (fault_flag_allow_retry_first(flags)) {
--
2.34.0.rc0.344.g81b53c2807-goog


2021-11-03 19:05:39

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] SCHED: attribute page lock and waitqueue functions as sched

On Thu, Nov 04, 2021 at 02:47:03AM +0800, Jimmy Shiu wrote:
> Bug: 144961676
> Bug: 144713689
> Bug: 172212772

A bug number is meaningless without knowing which bug tracker they
refer to. I suggest contemplating the meaning of the 'U' in URL.

> Signed-off-by: Minchan Kim <[email protected]>
> Signed-off-by: Jimmy Shiu <[email protected]>
> (cherry picked from commit 1e4de875d9e0cfaccf5131bcc709ae8646cdc168)

what tree is that commit ID in? I suggest it's meaningless and
should be removed.

> @@ -687,7 +688,7 @@ static inline void folio_lock(struct folio *folio)
> /*
> * lock_page may only be called if we have the page's inode pinned.
> */
> -static inline void lock_page(struct page *page)
> +static inline __sched void lock_page(struct page *page)

Why do you need to tag a static inline function as __sched? This would
be the only place where that is done.

> +++ b/kernel/sched/wait.c
> @@ -404,7 +404,8 @@ void finish_wait(struct wait_queue_head *wq_head, struct wait_queue_entry *wq_en
> }
> EXPORT_SYMBOL(finish_wait);
>
> -int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned mode, int sync, void *key)
> +__sched int autoremove_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
> + int sync, void *key)
> {

This function doesn't sleep. Why does it need to be tagged __sched?

> @@ -440,7 +441,7 @@ static inline bool is_kthread_should_stop(void)
> * } smp_mb(); // C
> * remove_wait_queue(&wq_head, &wait); wq_entry->flags |= WQ_FLAG_WOKEN;
> */
> -long wait_woken(struct wait_queue_entry *wq_entry, unsigned mode, long timeout)
> +__sched long wait_woken(struct wait_queue_entry *wq_entry, unsigned int mode, long timeout)

This one makes sense, but you've extended the length of the line past 80
columns, and it seems like it should be a separate patch with its own
justification.

> +__sched int woken_wake_function(struct wait_queue_entry *wq_entry, unsigned int mode,
> + int sync, void *key)

This doesn't seem to sleep either?

> +++ b/mm/filemap.c
> @@ -1271,7 +1271,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
> /* How many times do we accept lock stealing from under a waiter? */
> int sysctl_page_lock_unfairness = 5;
>
> -static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
> +static inline __sched int folio_wait_bit_common(struct folio *folio, int bit_nr,
> int state, enum behavior behavior)
> {
> wait_queue_head_t *q = folio_waitqueue(folio);
> @@ -1411,13 +1411,13 @@ static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
> return wait->flags & WQ_FLAG_WOKEN ? 0 : -EINTR;
> }
>
> -void folio_wait_bit(struct folio *folio, int bit_nr)
> +__sched void folio_wait_bit(struct folio *folio, int bit_nr)
> {
> folio_wait_bit_common(folio, bit_nr, TASK_UNINTERRUPTIBLE, SHARED);
> }
> EXPORT_SYMBOL(folio_wait_bit);
>
> -int folio_wait_bit_killable(struct folio *folio, int bit_nr)
> +__sched int folio_wait_bit_killable(struct folio *folio, int bit_nr)
> {
> return folio_wait_bit_common(folio, bit_nr, TASK_KILLABLE, SHARED);
> }
> @@ -1628,21 +1628,21 @@ EXPORT_SYMBOL_GPL(page_endio);
> * __folio_lock - Get a lock on the folio, assuming we need to sleep to get it.
> * @folio: The folio to lock
> */
> -void __folio_lock(struct folio *folio)
> +__sched void __folio_lock(struct folio *folio)
> {
> folio_wait_bit_common(folio, PG_locked, TASK_UNINTERRUPTIBLE,
> EXCLUSIVE);
> }
> EXPORT_SYMBOL(__folio_lock);
>
> -int __folio_lock_killable(struct folio *folio)
> +__sched int __folio_lock_killable(struct folio *folio)
> {
> return folio_wait_bit_common(folio, PG_locked, TASK_KILLABLE,
> EXCLUSIVE);
> }
> EXPORT_SYMBOL_GPL(__folio_lock_killable);
>
> -static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
> +static __sched int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
> {
> struct wait_queue_head *q = folio_waitqueue(folio);
> int ret = 0;
> @@ -1679,7 +1679,7 @@ static int __folio_lock_async(struct folio *folio, struct wait_page_queue *wait)
> * If neither ALLOW_RETRY nor KILLABLE are set, will always return true
> * with the folio locked and the mmap_lock unperturbed.
> */
> -bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
> +__sched bool __folio_lock_or_retry(struct folio *folio, struct mm_struct *mm,
> unsigned int flags)
> {
> if (fault_flag_allow_retry_first(flags)) {
> --
> 2.34.0.rc0.344.g81b53c2807-goog
>
>

2021-11-03 19:18:49

by Matthew Wilcox

[permalink] [raw]
Subject: Re: [PATCH] SCHED: attribute page lock and waitqueue functions as sched

On Thu, Nov 04, 2021 at 02:47:03AM +0800, Jimmy Shiu wrote:
> +++ b/mm/filemap.c
> @@ -1271,7 +1271,7 @@ static inline bool folio_trylock_flag(struct folio *folio, int bit_nr,
> /* How many times do we accept lock stealing from under a waiter? */
> int sysctl_page_lock_unfairness = 5;
>
> -static inline int folio_wait_bit_common(struct folio *folio, int bit_nr,
> +static inline __sched int folio_wait_bit_common(struct folio *folio, int bit_nr,

It's not clear to me whether folio_wait_bit_common() needs to be marked
as __sched or whether marking its callers is sufficient, but I'm pretty
sure you forgot to mark put_and_wait_on_page_locked() as __sched, which
is an important one as it's now where we wait for readahead to finish
when reading a file.

2021-11-04 09:19:34

by Mel Gorman

[permalink] [raw]
Subject: Re: [PATCH] SCHED: attribute page lock and waitqueue functions as sched

On Thu, Nov 04, 2021 at 02:47:03AM +0800, Jimmy Shiu wrote:
> trace_sched_blocked_trace in CFS is really useful for debugging via
> trace because it tell where the process was stuck on callstack.
>
> For example,
> <...>-6143 ( 6136) [005] d..2 50.278987: sched_blocked_reason: pid=6136 iowait=0 caller=SyS_mprotect+0x88/0x208
> <...>-6136 ( 6136) [005] d..2 50.278990: sched_blocked_reason: pid=6142 iowait=0 caller=do_page_fault+0x1f4/0x3b0
> <...>-6142 ( 6136) [006] d..2 50.278996: sched_blocked_reason: pid=6144 iowait=0 caller=SyS_prctl+0x52c/0xb58
> <...>-6144 ( 6136) [006] d..2 50.279007: sched_blocked_reason: pid=6136 iowait=0 caller=vm_mmap_pgoff+0x74/0x104
>
> However, sometime it gives pointless information like this.
> RenderThread-2322 ( 1805) [006] d.s3 50.319046: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220
> logd.writer-594 ( 587) [002] d.s3 50.334011: sched_blocked_reason: pid=6126 iowait=1 caller=wait_on_page_bit+0x194/0x208
> kworker/u16:13-333 ( 333) [007] d.s4 50.343161: sched_blocked_reason: pid=6136 iowait=1 caller=__lock_page_killable+0x17c/0x220
>

/me peers at coffee

Maybe I'm blind but aside from Matthew's concerns, this
appears to rely on an Android patch that is not upstream
https://android.googlesource.com/kernel/msm/+/c9f00aa0e25e397533c198a0fcf6246715f99a7b%5E!/

--
Mel Gorman
SUSE Labs