From: Suren Baghdasaryan
Date: Mon, 17 May 2021 12:33:38 -0700
Subject: Re: [[RFC]PATCH] psi: fix race between psi_trigger_create and psimon
To: Johannes Weiner
Cc: Huangzhaoyang, Zhaoyang Huang, Ziwei Dai, Ke Wang, LKML
References: <1621242249-8314-1-git-send-email-huangzhaoyang@gmail.com>

On Mon, May 17, 2021 at 11:36 AM Johannes Weiner wrote:
>
> CC Suren

Thanks!

>
> On Mon, May 17, 2021 at 05:04:09PM +0800, Huangzhaoyang wrote:
> > From: Zhaoyang Huang
> >
> > Race detected between psimon_new and psimon_old as shown below, which
> > cause panic by accessing invalid psi_system->poll_wait->wait_queue_entry
> > and psi_system->poll_timer->entry->next. It is not necessary to reinit
> > resource of psi_system when psi_trigger_create.

The resources of psi_system will not be reinitialized, because
group->poll_wait and friends are initialized (via
init_waitqueue_head(&group->poll_wait) etc.) only during the creation
of the first trigger for that group (see this condition:
https://elixir.bootlin.com/linux/latest/source/kernel/sched/psi.c#L1119).
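
To make that concrete, here is a minimal userspace sketch of the
"initialize only for the first trigger" behavior I am describing. It is
not the kernel code: model_group, model_trigger_create and
poll_init_count are names I made up for illustration, and the counter
merely stands in for init_waitqueue_head()/timer_setup().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for the per-group polling state. */
struct model_group {
	bool has_poll_task;   /* models "group->poll_task is set" */
	int  poll_init_count; /* times the poll resources were set up */
	int  nr_triggers;     /* triggers attached to this group */
};

/*
 * Models trigger creation: the poll resources are set up only when the
 * group has no poll task yet, i.e. when the first trigger is created.
 */
static void model_trigger_create(struct model_group *g)
{
	assert(g != NULL);

	if (!g->has_poll_task) {
		/* Stands in for init_waitqueue_head(), timer_setup(), ... */
		g->poll_init_count++;
		g->has_poll_task = true;
	}
	g->nr_triggers++;
}
```

With this model, creating a second trigger on the same group leaves
poll_init_count unchanged, which is why I do not see how an existing
trigger's wait queue could get reinitialized on this path.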

> >
> > psi_trigger_create          psimon_new              psimon_old
> >  init_waitqueue_head                                finish_wait
> >                                                       spin_lock(lock_old)
> >   spin_lock_init(lock_new)
> >  wake_up_process(psimon_new)
> >
> >                             finish_wait
> >                               spin_lock(lock_new)
> >                                 list_del                list_del

Could you please clarify this race a bit? I'm having trouble
deciphering this diagram. I'm guessing psimon_new/psimon_old refer to
a new trigger being created while an old one is being deleted, so it
seems like a race between psi_trigger_create/psi_trigger_destroy. The
combination of trigger_lock and RCU should be protecting us from that,
but maybe I missed something?
I'm excluding the possibility of a race between psi_trigger_create and
another existing trigger on the same group, because the codepath
calling init_waitqueue_head(&group->poll_wait) is taken only when the
first trigger for that group is created. Therefore, if there is an
existing trigger in that group, that codepath will not be taken.

> >
> > Signed-off-by: ziwei.dai
> > Signed-off-by: ke.wang
> > Signed-off-by: Zhaoyang Huang
> > ---
> >  kernel/sched/psi.c | 6 ++++--
> >  1 file changed, 4 insertions(+), 2 deletions(-)
> >
> > diff --git a/kernel/sched/psi.c b/kernel/sched/psi.c
> > index cc25a3c..d00e585 100644
> > --- a/kernel/sched/psi.c
> > +++ b/kernel/sched/psi.c
> > @@ -182,6 +182,8 @@ struct psi_group psi_system = {
> >
> >  static void psi_avgs_work(struct work_struct *work);
> >
> > +static void poll_timer_fn(struct timer_list *t);
> > +
> >  static void group_init(struct psi_group *group)
> >  {
> >  	int cpu;
> > @@ -201,6 +203,8 @@ static void group_init(struct psi_group *group)
> >  	memset(group->polling_total, 0, sizeof(group->polling_total));
> >  	group->polling_next_update = ULLONG_MAX;
> >  	group->polling_until = 0;
> > +	init_waitqueue_head(&group->poll_wait);
> > +	timer_setup(&group->poll_timer, poll_timer_fn, 0);
>
> This makes sense.

Well, this means we initialize resources for triggers in each psi
group even if the user never creates any triggers.
The current logic initializes them only when the first trigger in the
group gets created.

>
> >  	rcu_assign_pointer(group->poll_task, NULL);
> >  }
> >
> > @@ -1157,7 +1161,6 @@ struct psi_trigger *psi_trigger_create(struct psi_group *group,
> >  			return ERR_CAST(task);
> >  		}
> >  		atomic_set(&group->poll_wakeup, 0);
> > -		init_waitqueue_head(&group->poll_wait);
> >  		wake_up_process(task);
> >  		timer_setup(&group->poll_timer, poll_timer_fn, 0);
>
> This looks now unnecessary?
>
> >  		rcu_assign_pointer(group->poll_task, task);
> > @@ -1233,7 +1236,6 @@ static void psi_trigger_destroy(struct kref *ref)
> >  			 * But it might have been already scheduled before
> >  			 * that - deschedule it cleanly before destroying it.
> >  			 */
> > -			del_timer_sync(&group->poll_timer);
>
> And this looks wrong. Did you mean to delete the timer_setup() line
> instead?

I would like to get more details about this race before trying to fix
it. Please clarify.
Thanks!