Received: by 2002:a05:6a10:a841:0:0:0:0 with SMTP id d1csp406924pxy; Wed, 28 Apr 2021 06:45:59 -0700 (PDT) X-Google-Smtp-Source: ABdhPJxfcigE8Uh8O77MdrLVsouX9NAUtHqduhcE2aEaBqNTwi5PF3UPZZp7UjpoeDYYapI74vnp X-Received: by 2002:a05:6402:434e:: with SMTP id n14mr11286465edc.378.1619617559707; Wed, 28 Apr 2021 06:45:59 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1619617559; cv=none; d=google.com; s=arc-20160816; b=L/WsP6k8tUcP2DiF1mKX9jgjV7dyETTOPUx66uIZ8CJJTRvEuzpoOY+/FYraP9peEv iAUR6yGMqY/GByW61eB/HpLbnYIF8A58FrKO23Regt/Gvz+mEMvC00O+NcMgxjExX6ve 4sPRNmMs1IGHqc2kbpNQgQxREDQZbswdBVJq4yYlE/FZr85AqmAP9whTaoUEs2M/+vM6 C6fN2tChmHjZilOhShw+I+MmONUiY4C0lCpEIgtVzHBWFVncJxhs8uNplB25SnPs9ce7 iBZqOacXQe+PKE6mfLS65VugPtvtuUtIilIpA+EbmGpQykLfxQ7QcFY4sfYWk3EG3z6m IIBQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:cc:to:subject:message-id:date:from:in-reply-to :references:mime-version:dkim-signature; bh=Ch8iPLVnjB26GYW/E4Kp88UBIdNVnDfQkN+teceVlmQ=; b=oaRhpgv7SJdWzkx7TBT09814k5+GLgu7v705oqXXAn0ELQoF8+wiLK2c0VhlFeVwyw SFu+SnJfacYAPVjdYinqCCRCvXqIGN3HtOHXUX9BEcrGp3aWQHCapmVs3hREl7mPYcvR RNulunC43NkNuhAnm9ozzgdaATRVbG+SPo5bbW4b3K7rxtzEEDZJgKB8/YfiCfRP3UFF 9+avBfXlOLir5D+GX7cVMVIrpaVuMz9p5UBaq7l1LmIsuMv1Q5QDRlqM4znT2qFMaVB4 gjwWy/8nN/zsX4vYWxxdsPcCntHjEIUMXiIeh0f4VFe/cW/V7KvEpcKeYYwChBr8ZweJ 7JYQ== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=anF9LsF+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id z4si2150451edb.360.2021.04.28.06.45.22; Wed, 28 Apr 2021 06:45:59 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=anF9LsF+; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S239483AbhD1Kgf (ORCPT + 99 others); Wed, 28 Apr 2021 06:36:35 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:38280 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S230057AbhD1Kge (ORCPT ); Wed, 28 Apr 2021 06:36:34 -0400 Received: from mail-yb1-xb33.google.com (mail-yb1-xb33.google.com [IPv6:2607:f8b0:4864:20::b33]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 59D97C061574 for ; Wed, 28 Apr 2021 03:35:48 -0700 (PDT) Received: by mail-yb1-xb33.google.com with SMTP id 82so73259528yby.7 for ; Wed, 28 Apr 2021 03:35:48 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:references:in-reply-to:from:date:message-id:subject:to :cc; bh=Ch8iPLVnjB26GYW/E4Kp88UBIdNVnDfQkN+teceVlmQ=; b=anF9LsF+XidzgO65yYDfmPZrk1xPeZig73tPiUHWzo7iTkQXWG0f42aiiG8QKO4i4X //SG/+iOl7RnS8Xg0I5OsxFeGZCsEFb6j5CRoKmWL+GVpaDKtNa+E9POooEHc/8CiEBZ ufg1IaqHm6XgVdYKsRH2PXZJY7nAO6dME8jkFPpAaicd/B0gXkB92PIiJJRxzcV7IcMA vSVPUrsorR5/IfpbiMuG61UKKao0da2aH60s0SlR3H6exO6X3DUzmZgvLCzQ3XayU81q zPjPXanYr1RHN5y+5FkD8zJc6jU9izvZHwp01bE0ogAVScGiLwvu5mLKALFFgDnXh5bs EeWA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:references:in-reply-to:from:date :message-id:subject:to:cc; bh=Ch8iPLVnjB26GYW/E4Kp88UBIdNVnDfQkN+teceVlmQ=; b=RF7+daFccolw46O7dP4TkHEWfCpF3OMMBHinpBy4YL3bGPkaZbqG8rFXa0V4tTv8CM 0RiCM6epoPaMSN3yuLPIwONmC4cLzf/Go8ibuh152IgC3AJXBB8kM6MvkfluWbfdO7e7 WlUkRJ0fs2zgnR7g4BJmUZ9GURji7KfSqfnGSSXXlPYdcG9ee/9lylEiSxasEW2vUpq2 988CZ2ZlM32cF04KpGZukKE7ioiyZW30BdZibG9CYYfwngOYzpG1VUWUK2Vey5HU64rJ zJhyI78CIYPdeBBUWqadjjZQl1jjBI5GMvOJihjK+hVUlCJ8wdzpxx11XVbHB7144Zux NWYw== X-Gm-Message-State: AOAM531PswOekwz9Gi3Ecds4PZCQvxJaBYhdxnWR5d5nXaek8Rkdl7BD a7KO995lSxfOWN/Plf3kw5kBDwrLh+YBxi7sssg= X-Received: by 2002:a5b:602:: with SMTP id d2mr3502182ybq.199.1619606147709; Wed, 28 Apr 2021 03:35:47 -0700 (PDT) MIME-Version: 1.0 References: <20210422120459.447350175@infradead.org> <20210422123308.196692074@infradead.org> In-Reply-To: From: Aubrey Li Date: Wed, 28 Apr 2021 18:35:36 +0800 Message-ID: Subject: Re: [PATCH 04/19] sched: Prepare for Core-wide rq->lock To: Peter Zijlstra Cc: Josh Don , Joel Fernandes , "Hyser,Chris" , Ingo Molnar , Vincent Guittot , Valentin Schneider , Mel Gorman , linux-kernel , Thomas Gleixner , Don Hiatt Content-Type: text/plain; charset="UTF-8" Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Apr 28, 2021 at 5:14 PM Peter Zijlstra wrote: > > On Tue, Apr 27, 2021 at 04:30:02PM -0700, Josh Don wrote: > > > Also, did you mean to have a preempt_enable_no_resched() rather than > > prempt_enable() in raw_spin_rq_trylock? > > No, trylock really needs to be preempt_enable(), because it can have > failed, at which point it will not have incremented the preemption count > and our decrement can hit 0, at which point we really should reschedule. > > > I went over the rq_lockp stuff again after Don's reported lockup. Most > > uses are safe due to already holding an rq lock. However, > > double_rq_unlock() is prone to race: > > > > double_rq_unlock(rq1, rq2): > > /* Initial state: core sched enabled, and rq1 and rq2 are smt > > siblings. So, double_rq_lock(rq1, rq2) only took a single rq lock */ > > raw_spin_rq_unlock(rq1); > > /* now not holding any rq lock */ > > /* sched core disabled. Now __rq_lockp(rq1) != __rq_lockp(rq2), so we > > falsely unlock rq2 */ > > if (__rq_lockp(rq1) != __rq_lockp(rq2)) > > raw_spin_rq_unlock(rq2); > > else > > __release(rq2->lock); > > > > Instead we can cache __rq_lockp(rq1) and __rq_lockp(rq2) before > > releasing the lock, in order to prevent this. FWIW I think it is > > likely that Don is seeing a different issue. > > Ah, indeed so.. rq_lockp() could do with an assertion, not sure how to > sanely do that. Anyway, double_rq_unlock() is simple enough to fix, we > can simply flip the unlock()s. > > ( I'm suffering a cold and am really quite slow atm ) > > How's this then? > > --- > diff --git a/kernel/sched/core.c b/kernel/sched/core.c > index f732642e3e09..3a534c0c1c46 100644 > --- a/kernel/sched/core.c > +++ b/kernel/sched/core.c > @@ -290,6 +290,10 @@ static void sched_core_assert_empty(void) > static void __sched_core_enable(void) > { > static_branch_enable(&__sched_core_enabled); > + /* > + * Ensure raw_spin_rq_*lock*() have completed before flipping. > + */ > + synchronize_sched(); synchronize_sched() seems no longer exist... Thanks, -Aubrey