Only when calling the poll syscall the first time can user
receive POLLPRI correctly. After that, user always fails to
acquire the event signal.
Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event triggered.
3. Kill and restart the process.
If the user doesn't kill the monitor process, it seems the
poll_work works fine. After killing and restarting the monitor,
the poll_work in kernel will never run again due to the wrong
value of poll_scheduled. Therefore, we should reset the value
as group_init() does after the last trigger is destroyed.
Signed-off-by: Jason Xing <[email protected]>
---
kernel/sched/psi.c | 6 ++++++
1 file changed, 6 insertions(+)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..66f4385 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
if (kworker_to_destroy) {
kthread_cancel_delayed_work_sync(&group->poll_work);
kthread_destroy_worker(kworker_to_destroy);
+ /*
+ * The poll_work should have the chance to be put into the
+ * kthread queue when calling poll syscall next time. So
+ * reset poll_scheduled to zero as group_init() does
+ */
+ atomic_set(&group->poll_scheduled, 0);
}
kfree(t);
}
--
1.8.3.1
On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
>
> Signed-off-by: Jason Xing <[email protected]>
Reviewed-by: Caspar Zhang <[email protected]>
> ---
> kernel/sched/psi.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..66f4385 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
> if (kworker_to_destroy) {
> kthread_cancel_delayed_work_sync(&group->poll_work);
> kthread_destroy_worker(kworker_to_destroy);
> + /*
> + * The poll_work should have the chance to be put into the
> + * kthread queue when calling poll syscall next time. So
> + * reset poll_scheduled to zero as group_init() does
> + */
> + atomic_set(&group->poll_scheduled, 0);
> }
> kfree(t);
> }
> --
> 1.8.3.1
>
--
Thanks,
Caspar
Hello,
Could someone take a quick look at this patch? It's not complicated at
all, just one line added into PSI which can make the poll() run in the
right way.
Thanks,
Jason
On 2019/7/23 下午2:45, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
>
> Signed-off-by: Jason Xing <[email protected]>
> ---
> kernel/sched/psi.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..66f4385 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
> if (kworker_to_destroy) {
> kthread_cancel_delayed_work_sync(&group->poll_work);
> kthread_destroy_worker(kworker_to_destroy);
> + /*
> + * The poll_work should have the chance to be put into the
> + * kthread queue when calling poll syscall next time. So
> + * reset poll_scheduled to zero as group_init() does
> + */
> + atomic_set(&group->poll_scheduled, 0);
> }
> kfree(t);
> }
>
Hi Jason,
On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
>
> Signed-off-by: Jason Xing <[email protected]>
Good catch, and the fix makes sense to me. However, it was a bit hard
to understand how the problem occurs:
> ---
> kernel/sched/psi.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..66f4385 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
> if (kworker_to_destroy) {
> kthread_cancel_delayed_work_sync(&group->poll_work);
> kthread_destroy_worker(kworker_to_destroy);
> + /*
> + * The poll_work should have the chance to be put into the
> + * kthread queue when calling poll syscall next time. So
> + * reset poll_scheduled to zero as group_init() does
> + */
> + atomic_set(&group->poll_scheduled, 0);
The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag:
if (kworker_to_destroy) {
/*
* After the RCU grace period has expired, the worker
* can no longer be found through group->poll_kworker.
* But it might have been already scheduled before
* that - deschedule it cleanly before destroying it.
*/
kthread_cancel_delayed_work_sync(&group->poll_work);
atomic_set(&group->poll_scheduled, 0);
kthread_destroy_worker(kworker_to_destroy);
}
With that change, please add:
Acked-by: Johannes Weiner <[email protected]>
Thanks!
On Mon, Jul 29, 2019 at 8:30 AM Johannes Weiner <[email protected]> wrote:
>
> Hi Jason,
>
> On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote:
> > Only when calling the poll syscall the first time can user
> > receive POLLPRI correctly. After that, user always fails to
> > acquire the event signal.
> >
> > Reproduce case:
> > 1. Get the monitor code in Documentation/accounting/psi.txt
> > 2. Run it, and wait for the event triggered.
> > 3. Kill and restart the process.
> >
> > If the user doesn't kill the monitor process, it seems the
> > poll_work works fine. After killing and restarting the monitor,
> > the poll_work in kernel will never run again due to the wrong
> > value of poll_scheduled. Therefore, we should reset the value
> > as group_init() does after the last trigger is destroyed.
> >
> > Signed-off-by: Jason Xing <[email protected]>
>
> Good catch, and the fix makes sense to me. However, it was a bit hard
> to understand how the problem occurs:
>
> > ---
> > kernel/sched/psi.c | 6 ++++++
> > 1 file changed, 6 insertions(+)
> >
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index 7acc632..66f4385 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref)
> > if (kworker_to_destroy) {
> > kthread_cancel_delayed_work_sync(&group->poll_work);
> > kthread_destroy_worker(kworker_to_destroy);
> > + /*
> > + * The poll_work should have the chance to be put into the
> > + * kthread queue when calling poll syscall next time. So
> > + * reset poll_scheduled to zero as group_init() does
> > + */
> > + atomic_set(&group->poll_scheduled, 0);
>
> The question is why we can end up with poll_scheduled = 1 but the work
> not running (which would reset it to 0). And the answer is because the
> scheduling side sees group->poll_kworker under RCU protection and then
> schedules it, but here we cancel the work and destroy the worker. The
> cancel needs to pair with resetting the poll_scheduled flag:
>
> if (kworker_to_destroy) {
> /*
> * After the RCU grace period has expired, the worker
> * can no longer be found through group->poll_kworker.
> * But it might have been already scheduled before
> * that - deschedule it cleanly before destroying it.
> */
> kthread_cancel_delayed_work_sync(&group->poll_work);
> atomic_set(&group->poll_scheduled, 0);
>
> kthread_destroy_worker(kworker_to_destroy);
> }
>
> With that change, please add:
>
> Acked-by: Johannes Weiner <[email protected]>
>
> Thanks!
The changes makes sense to me as well. Thanks!
Reviewed-by: Suren Baghdasaryan <[email protected]>
Only when calling the poll syscall the first time can user
receive POLLPRI correctly. After that, user always fails to
acquire the event signal.
Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event triggered.
3. Kill and restart the process.
If the user doesn't kill the monitor process, it seems the
poll_work works fine. After killing and restarting the monitor,
the poll_work in kernel will never run again due to the wrong
value of poll_scheduled. Therefore, we should reset the value
as group_init() does after the last trigger is destroyed.
[PATCH V2]
In the patch v2, I put the atomic_set(&group->poll_scheduled, 0);
into the right place.
Here I quoted from Johannes as the best explaination:
"The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag."
Signed-off-by: Jason Xing <[email protected]>
Reviewed-by: Caspar Zhang <[email protected]>
Reviewed-by: Joseph Qi <[email protected]>
Reviewed-by: Suren Baghdasaryan <[email protected]>
Acked-by: Johannes Weiner <[email protected]>
---
kernel/sched/psi.c | 7 +++++++
1 file changed, 7 insertions(+)
diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..acdada0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
* deadlock while waiting for psi_poll_work to acquire trigger_lock
*/
if (kworker_to_destroy) {
+ /*
+ * After the RCU grace period has expired, the worker
+ * can no longer be found through group->poll_kworker.
+ * But it might have been already scheduled before
+ * that - deschedule it cleanly before destroying it.
+ */
kthread_cancel_delayed_work_sync(&group->poll_work);
+ atomic_set(&group->poll_scheduled, 0);
kthread_destroy_worker(kworker_to_destroy);
}
kfree(t);
--
1.8.3.1
Hi all,
According to the reviews from Johoannes, I've changed the old patch and
then submitted the version 2 patch a few days ago.
Please let me know if all this sounds good, or if there are any issues.
Thanks,
Jason
On 2019/7/30 下午1:16, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
>
> [PATCH V2]
> In the patch v2, I put the atomic_set(&group->poll_scheduled, 0);
> into the right place.
> Here I quoted from Johannes as the best explaination:
> "The question is why we can end up with poll_scheduled = 1 but the work
> not running (which would reset it to 0). And the answer is because the
> scheduling side sees group->poll_kworker under RCU protection and then
> schedules it, but here we cancel the work and destroy the worker. The
> cancel needs to pair with resetting the poll_scheduled flag."
>
> Signed-off-by: Jason Xing <[email protected]>
> Reviewed-by: Caspar Zhang <[email protected]>
> Reviewed-by: Joseph Qi <[email protected]>
> Reviewed-by: Suren Baghdasaryan <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> ---
> kernel/sched/psi.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..acdada0 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
> * deadlock while waiting for psi_poll_work to acquire trigger_lock
> */
> if (kworker_to_destroy) {
> + /*
> + * After the RCU grace period has expired, the worker
> + * can no longer be found through group->poll_kworker.
> + * But it might have been already scheduled before
> + * that - deschedule it cleanly before destroying it.
> + */
> kthread_cancel_delayed_work_sync(&group->poll_work);
> + atomic_set(&group->poll_scheduled, 0);
> kthread_destroy_worker(kworker_to_destroy);
> }
> kfree(t);
>
Hello,
It's been delayed for no reason a couple of days. Any comments and
suggestions on this patch V2 would be appreciated.
Thanks,
Jason
On 2019/7/30 下午1:16, Jason Xing wrote:
> Only when calling the poll syscall the first time can user
> receive POLLPRI correctly. After that, user always fails to
> acquire the event signal.
>
> Reproduce case:
> 1. Get the monitor code in Documentation/accounting/psi.txt
> 2. Run it, and wait for the event triggered.
> 3. Kill and restart the process.
>
> If the user doesn't kill the monitor process, it seems the
> poll_work works fine. After killing and restarting the monitor,
> the poll_work in kernel will never run again due to the wrong
> value of poll_scheduled. Therefore, we should reset the value
> as group_init() does after the last trigger is destroyed.
>
> [PATCH V2]
> In the patch v2, I put the atomic_set(&group->poll_scheduled, 0);
> into the right place.
> Here I quoted from Johannes as the best explaination:
> "The question is why we can end up with poll_scheduled = 1 but the work
> not running (which would reset it to 0). And the answer is because the
> scheduling side sees group->poll_kworker under RCU protection and then
> schedules it, but here we cancel the work and destroy the worker. The
> cancel needs to pair with resetting the poll_scheduled flag."
>
> Signed-off-by: Jason Xing <[email protected]>
> Reviewed-by: Caspar Zhang <[email protected]>
> Reviewed-by: Joseph Qi <[email protected]>
> Reviewed-by: Suren Baghdasaryan <[email protected]>
> Acked-by: Johannes Weiner <[email protected]>
> ---
> kernel/sched/psi.c | 7 +++++++
> 1 file changed, 7 insertions(+)
>
> diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> index 7acc632..acdada0 100644
> --- a/kernel/sched/psi.c
> +++ b/kernel/sched/psi.c
> @@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
> * deadlock while waiting for psi_poll_work to acquire trigger_lock
> */
> if (kworker_to_destroy) {
> + /*
> + * After the RCU grace period has expired, the worker
> + * can no longer be found through group->poll_kworker.
> + * But it might have been already scheduled before
> + * that - deschedule it cleanly before destroying it.
> + */
> kthread_cancel_delayed_work_sync(&group->poll_work);
> + atomic_set(&group->poll_scheduled, 0);
> kthread_destroy_worker(kworker_to_destroy);
> }
> kfree(t);
>