Received: by 2002:a25:ad19:0:0:0:0:0 with SMTP id y25csp3630226ybi; Mon, 29 Jul 2019 09:47:17 -0700 (PDT) X-Google-Smtp-Source: APXvYqygvCMdcRJz5CH1hUr7bKpkKnFsREG1SipkVMyienHSwHOBuFnIBoFzKg5AUR8RY8I8YFFm X-Received: by 2002:a17:902:1e2:: with SMTP id b89mr3155791plb.7.1564418837751; Mon, 29 Jul 2019 09:47:17 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1564418837; cv=none; d=google.com; s=arc-20160816; b=tdM4+odE3RRdVSwwaM3qZuAm4HFChDrFd0ulEqZUTPR5pzhtNmuqYIPwi9ZJi2CjSM g1C8/jywCHVtOGTfxq8O/QzsUqLS0y/3zlNdjzhU6HGkhW2Z4yn7E8ld6Po4OG0eMYWs 8+MiCxuPt5knqmSwJvyjeLX8N96gtTBSxEvMT9d9OZOaVRUh+/mP3LoiarnO6Y66QmO0 EkJkUSZBV0WYJgJsGAGyi3DStDMPAoUMglEFgmuV8F4lVLw4tQhxG8xHIwWgbrJcSsVp 8a6lgPYwxRAgrdjhoYzSoltikVbBUdjc4bF/1gXa0CES+KLCWWVbVOWZdaY1gea0E4Cf Bb5A== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :in-reply-to:references:mime-version:dkim-signature; bh=mHHPMQ336CdhvDwSHAm2ZkUObmC6MYlngRuFkjj6+rQ=; b=IJUCXk/MVwdM2Gzx3hvL+Z6KZ1oF4jDHzIbUFLmENGtD99iBW7OpGHxVeuyQJ4e6yF 3gV/sbY0kYjwSFoBbNwrU2qjxWPRrgkLfXCzQe74U/CSlcfGJtcdP7tjmvWfgU861gcZ eapYzabypZGDujRYa00YLvkZI+ledEoA4A6mCq0l1zlMSIrd4tIM6cIFzgtOCPXgOY8Y ito5TSZx5fcCWMTpCeBlLnR4WvSf19wm/IFgzc7Qu8oJBFs/rSzFbffn3IchDrXGWVG/ 8bUu8wzcJqqd3JUkehEoB+wepKymrrWTYHqSzgEWLyYRlSSHjx05RLNEvA2bk/v8jMMQ 2xUA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=DwZvqBjX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 70si25384594plc.253.2019.07.29.09.47.02; Mon, 29 Jul 2019 09:47:17 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@google.com header.s=20161025 header.b=DwZvqBjX; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=REJECT sp=REJECT dis=NONE) header.from=google.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1728767AbfG2Q1e (ORCPT + 99 others); Mon, 29 Jul 2019 12:27:34 -0400 Received: from mail-wr1-f65.google.com ([209.85.221.65]:45171 "EHLO mail-wr1-f65.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1727823AbfG2Q1e (ORCPT ); Mon, 29 Jul 2019 12:27:34 -0400 Received: by mail-wr1-f65.google.com with SMTP id f9so62511318wre.12 for ; Mon, 29 Jul 2019 09:27:32 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=mHHPMQ336CdhvDwSHAm2ZkUObmC6MYlngRuFkjj6+rQ=; b=DwZvqBjX9KZPS22QyEa1iILLH2hTN+dUD4/1b4KsFWjmUYFEpnvMp5tSzwCOjk7nXb 47dFo9DlBP6wgTwKxLmF9vyO4cn8qs5wQBO3Omtt8RKttUKk/isZrTTkLcrQdd1Z7rRs D46FJFRHuv3mVsh6Y7jppOHWDuC0krJot6cLHZjhOkilCIqI+IccrKR5Wgj8v288QNhh Tm5F5iCRaAl3tKxEvEcc8RgJY7oAHbrGH0Ly1ylK9L/y6pN0E7y5Qftdv8n8Pr5y5F9q oicicY+3MN+3DuQaNUXBk9ldinCIorc4sLE/AzB8zq6Q6VeVknCRYat0EcKOmFDaTqsI ZCQg== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=mHHPMQ336CdhvDwSHAm2ZkUObmC6MYlngRuFkjj6+rQ=; b=uNsq7FD2GzBk+QieX4VeN0eVW6eGqyBqfQz9CQAvAkGtc3G5D/Xdy8AMeCYXYcgDMD THXblzgfYmfikOa1nLQTS0dzoIvtg/QA+4bIh5sO0UzYRwNB6o8iTW66CUlsevf66wdM rvTdr0c1zDxTwVrUL66d3tx7kvwHnWTs7jyTgwDKkD3+V2Uy3kqUMhtKk/XI1KUkBh0Y BQwLo07V/9W3yUvq1o+P6vV42qoFMDhvXVQbrncPTcM8qJuXAnje8hbkHscfebgmkyss BVsVk/bInXXPZfgheup73fZykfgeR7fyFbajcpQbiDNUzsxlVAcAziOK/UcVZzugduKo 4AMw== X-Gm-Message-State: APjAAAWX5nvxj7CmtskjUbh9+lVNVddBuyrBQNn5QL20SxynpN9rRKqv iEZCopVXC09CNqgdGEzWQ4Y9c64oPG5oA2OzR+VPnA== X-Received: by 2002:adf:e50c:: with SMTP id j12mr37995479wrm.117.1564417651286; Mon, 29 Jul 2019 09:27:31 -0700 (PDT) MIME-Version: 1.0 References: <1563864339-2621-1-git-send-email-kerneljasonxing@linux.alibaba.com> <20190729152957.GA21958@cmpxchg.org> In-Reply-To: <20190729152957.GA21958@cmpxchg.org> From: Suren Baghdasaryan Date: Mon, 29 Jul 2019 09:27:20 -0700 Message-ID: Subject: Re: [PATCH] psi: get poll_work to run when calling poll syscall next time To: Johannes Weiner Cc: Jason Xing , Ingo Molnar , Peter Zijlstra , Dennis Zhou , axboe@kernel.dk, lizefan@huawei.com, Tejun Heo , LKML Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Mon, Jul 29, 2019 at 8:30 AM Johannes Weiner wrote: > > Hi Jason, > > On Tue, Jul 23, 2019 at 02:45:39PM +0800, Jason Xing wrote: > > Only when calling the poll syscall the first time can user > > receive POLLPRI correctly. After that, user always fails to > > acquire the event signal. > > > > Reproduce case: > > 1. Get the monitor code in Documentation/accounting/psi.txt > > 2. Run it, and wait for the event triggered. > > 3. Kill and restart the process. > > > > If the user doesn't kill the monitor process, it seems the > > poll_work works fine. After killing and restarting the monitor, > > the poll_work in kernel will never run again due to the wrong > > value of poll_scheduled. Therefore, we should reset the value > > as group_init() does after the last trigger is destroyed. > > > > Signed-off-by: Jason Xing > > Good catch, and the fix makes sense to me. However, it was a bit hard > to understand how the problem occurs: > > > --- > > kernel/sched/psi.c | 6 ++++++ > > 1 file changed, 6 insertions(+) > > > > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c > > index 7acc632..66f4385 100644 > > --- a/kernel/sched/psi.c > > +++ b/kernel/sched/psi.c > > @@ -1133,6 +1133,12 @@ static void psi_trigger_destroy(struct kref *ref) > > if (kworker_to_destroy) { > > kthread_cancel_delayed_work_sync(&group->poll_work); > > kthread_destroy_worker(kworker_to_destroy); > > + /* > > + * The poll_work should have the chance to be put into the > > + * kthread queue when calling poll syscall next time. So > > + * reset poll_scheduled to zero as group_init() does > > + */ > > + atomic_set(&group->poll_scheduled, 0); > > The question is why we can end up with poll_scheduled = 1 but the work > not running (which would reset it to 0). And the answer is because the > scheduling side sees group->poll_kworker under RCU protection and then > schedules it, but here we cancel the work and destroy the worker. The > cancel needs to pair with resetting the poll_scheduled flag: > > if (kworker_to_destroy) { > /* > * After the RCU grace period has expired, the worker > * can no longer be found through group->poll_kworker. > * But it might have been already scheduled before > * that - deschedule it cleanly before destroying it. > */ > kthread_cancel_delayed_work_sync(&group->poll_work); > atomic_set(&group->poll_scheduled, 0); > > kthread_destroy_worker(kworker_to_destroy); > } > > With that change, please add: > > Acked-by: Johannes Weiner > > Thanks! The changes makes sense to me as well. Thanks! Reviewed-by: Suren Baghdasaryan