From: Jason Xing
To: hannes@cmpxchg.org, surenb@google.com
Cc: dennis@kernel.org, mingo@redhat.com, axboe@kernel.dk, lizefan@huawei.com,
 peterz@infradead.org, tj@kernel.org, kerneljasonxing@linux.alibaba.com,
 linux-kernel@vger.kernel.org, caspar@linux.alibaba.com,
 joseph.qi@linux.alibaba.com
Subject: [PATCH v2] psi: get poll_work to run when calling poll syscall next time
Date: Tue, 30 Jul 2019 13:16:59 +0800
Message-Id: <1564463819-120014-1-git-send-email-kerneljasonxing@linux.alibaba.com>
X-Mailer: git-send-email 1.8.3.1
In-Reply-To: <1563864339-2621-1-git-send-email-kerneljasonxing@linux.alibaba.com>
References:
 <1563864339-2621-1-git-send-email-kerneljasonxing@linux.alibaba.com>
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
X-Mailing-List: linux-kernel@vger.kernel.org

Only the first call to the poll syscall lets the user receive POLLPRI
correctly. After that, the user always fails to receive the event signal.

Reproduce case:
1. Get the monitor code in Documentation/accounting/psi.txt
2. Run it, and wait for the event to be triggered.
3. Kill and restart the process.

If the user doesn't kill the monitor process, the poll_work seems to work
fine. After the monitor is killed and restarted, the poll_work in the
kernel never runs again because poll_scheduled is left with the wrong
value. Therefore, we should reset the value, as group_init() does, after
the last trigger is destroyed.

[PATCH V2]
In patch v2, I put the atomic_set(&group->poll_scheduled, 0); into the
right place. Johannes gave the best explanation, quoted here:

"The question is why we can end up with poll_scheduled = 1 but the work
not running (which would reset it to 0). And the answer is because the
scheduling side sees group->poll_kworker under RCU protection and then
schedules it, but here we cancel the work and destroy the worker. The
cancel needs to pair with resetting the poll_scheduled flag."

Signed-off-by: Jason Xing
Reviewed-by: Caspar Zhang
Reviewed-by: Joseph Qi
Reviewed-by: Suren Baghdasaryan
Acked-by: Johannes Weiner
---
 kernel/sched/psi.c | 7 +++++++
 1 file changed, 7 insertions(+)

diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
index 7acc632..acdada0 100644
--- a/kernel/sched/psi.c
+++ b/kernel/sched/psi.c
@@ -1131,7 +1131,14 @@ static void psi_trigger_destroy(struct kref *ref)
 	 * deadlock while waiting for psi_poll_work to acquire trigger_lock
 	 */
 	if (kworker_to_destroy) {
+		/*
+		 * After the RCU grace period has expired, the worker
+		 * can no longer be found through group->poll_kworker.
+		 * But it might have been already scheduled before
+		 * that - deschedule it cleanly before destroying it.
+		 */
 		kthread_cancel_delayed_work_sync(&group->poll_work);
+		atomic_set(&group->poll_scheduled, 0);
 		kthread_destroy_worker(kworker_to_destroy);
 	}
 	kfree(t);
-- 
1.8.3.1
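
The failure mode described in the changelog can be illustrated outside the
kernel with a small userspace model in C11. This is only a sketch, not
kernel code: struct model_group and every model_* function are hypothetical
names made up for illustration. It assumes that the scheduling side queues
work only when it manages to flip poll_scheduled from 0 to 1, and that the
work resets the flag to 0 when it runs, as the quoted explanation implies.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the relevant psi_group fields. */
struct model_group {
	atomic_int poll_scheduled;	/* 1 while work is believed to be queued */
	bool work_queued;		/* models the queued delayed work */
	int runs;			/* how many times the modeled work ran */
};

/* Scheduling side: queue work only if the flag flips from 0 to 1. */
bool model_schedule(struct model_group *g)
{
	int expected = 0;

	if (!atomic_compare_exchange_strong(&g->poll_scheduled, &expected, 1))
		return false;	/* flag already set, nothing is queued */
	g->work_queued = true;
	return true;
}

/* The work itself: running it clears the flag again. */
void model_run_work(struct model_group *g)
{
	if (!g->work_queued)
		return;
	g->work_queued = false;
	atomic_store(&g->poll_scheduled, 0);
	g->runs++;
}

/* Destroy path: cancel pending work, optionally reset the flag. */
void model_destroy(struct model_group *g, bool reset_flag)
{
	g->work_queued = false;			/* models the cancel */
	if (reset_flag)
		atomic_store(&g->poll_scheduled, 0);	/* the pairing reset */
}

/* Can work be scheduled again after a cancel hit pending work? */
bool model_reschedule_after_destroy(bool reset_flag)
{
	struct model_group g = { .work_queued = false, .runs = 0 };

	atomic_init(&g.poll_scheduled, 0);
	model_schedule(&g);		/* work queued, flag is now 1 */
	model_destroy(&g, reset_flag);	/* cancelled before it could run */
	return model_schedule(&g);	/* what a restarted monitor would see */
}

/* Small self-check covering the three cases discussed above. */
int model_demo(void)
{
	struct model_group g = { .work_queued = false, .runs = 0 };

	atomic_init(&g.poll_scheduled, 0);
	assert(model_schedule(&g));	/* first schedule succeeds */
	model_run_work(&g);		/* work ran and reset the flag */
	assert(model_schedule(&g));	/* so scheduling works again */

	assert(!model_reschedule_after_destroy(false));	/* flag stuck at 1 */
	assert(model_reschedule_after_destroy(true));	/* reset restores it */
	return 0;
}
```

Calling model_demo() from a main() exercises all three cases. The model
leaves out RCU and real concurrency on purpose; it only shows why a cancel
that does not also reset the flag leaves later scheduling attempts with
nothing to queue.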