Date: Wed, 15 Apr 2020 12:55:11 +0200
From: Peter Zijlstra
To: Vineeth Remanan Pillai
Cc: Nishanth Aravamudan, Julien Desfossez, Tim Chen, Ingo Molnar,
	Thomas Gleixner, Paul Turner, Linus Torvalds,
	Linux List Kernel Mailing, Frédéric Weisbecker, Kees Cook,
	Greg Kerr, Phil Auld, Aaron Lu, Aubrey Li, "Li, Aubrey",
	Valentin Schneider, Mel Gorman, Pawan Gupta, Paolo Bonzini,
	Joel Fernandes, Joel Fernandes
Subject: Re: [RFC PATCH 03/13] sched: Core-wide rq->lock
Message-ID: <20200415105511.GC20730@hirez.programming.kicks-ass.net>
References: <855831b59e1b3774b11c3e33050eac4cc4639f06.1583332765.git.vpillai@digitalocean.com>
	<20200414113639.GS20730@hirez.programming.kicks-ass.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Apr 14, 2020 at 05:35:07PM -0400, Vineeth Remanan Pillai wrote:
> > Aside from the fact that it's probably much saner to write this as:
> >
> > 	rq->core_enabled = static_key_enabled(&__sched_core_enabled);
> >
> > I'm fairly sure I didn't write this part. And while I do somewhat see
> > the point of disabling core scheduling for a core that has only a single
> > thread on, I wonder why we care.
> >
> I think this change was to fix some crashes which happened due to
> uninitialized rq->core if a sibling was offline during boot and is
> onlined after coresched was enabled.
>
> https://lwn.net/ml/linux-kernel/20190424111913.1386-1-vpillai@digitalocean.com/
>
> I tried to fix it by initializing coresched members during a cpu online
> and tearing it down on a cpu offline. This was back in v3 and do not
> remember the exact details. I shall revisit this and see if there is a
> better way to fix the race condition above.

Argh, that problem again. So AFAIK booting with maxcpus= is broken in a
whole number of 'interesting' ways. I'm not sure what to do about that,
perhaps we should add a config around that option and make it depend on
CONFIG_BROKEN.

That said; I'm thinking it shouldn't be too hard to fix up the core
state before we add the CPU to the masks, but it will be arch specific.
See speculative_store_bypass_ht_init() for inspiration, but you'll need
to be even earlier, before set_cpu_sibling_map() in smp_callin() on x86
(no clue about other archs).

Even without maxcpus= this can happen when you do physical hotplug and
add a part (or replace one where the new part has more cores than the
old).

The moment core-scheduling is enabled and you're adding unknown
topology, we need to set up state before we publish the mask,... or I
suppose endlessly do: 'smt_mask & active_mask' all over the place :/ In
which case you can indeed do it purely in sched/core.

Hurmph...
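
To make the two options concrete, here is a rough userspace toy model in
C. It is not kernel code and is only a sketch under stated assumptions:
every toy_* name is hypothetical, CPUs are modelled as bits in a 64-bit
word, a NULL core pointer stands in for an uninitialized rq->core, and
picking the lowest-numbered sibling as the core leader is an assumption
made for illustration only.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_NR_CPUS 64

/* Hypothetical stand-in for a cpumask: one bit per CPU. */
typedef uint64_t toy_mask;

/* Hypothetical stand-in for a runqueue; core == NULL means "not set up". */
struct toy_rq {
	struct toy_rq *core;
};

static inline toy_mask toy_cpu_bit(unsigned int cpu)
{
	assert(cpu < TOY_NR_CPUS);
	return (toy_mask)1 << cpu;
}

/*
 * Option B from the mail: filter at every use, so that only siblings
 * that are both in the topology mask and active are ever visited.
 */
static inline toy_mask toy_usable_siblings(toy_mask smt_mask,
					   toy_mask active_mask)
{
	return smt_mask & active_mask;
}

/*
 * Option A from the mail: set up the per-core state first and only then
 * publish the CPU in the mask. rqs must have TOY_NR_CPUS entries.
 */
static void toy_cpu_online(struct toy_rq *rqs, toy_mask *active_mask,
			   toy_mask smt_mask, unsigned int cpu)
{
	unsigned int leader;

	/* The CPU must be part of its own sibling mask. */
	assert(smt_mask & toy_cpu_bit(cpu));

	/* Assumption: the lowest-numbered sibling acts as core leader. */
	for (leader = 0; leader < TOY_NR_CPUS; leader++)
		if (smt_mask & toy_cpu_bit(leader))
			break;

	rqs[cpu].core = &rqs[leader];		/* state first ...       */
	*active_mask |= toy_cpu_bit(cpu);	/* ... publish afterwards */
}

/* Count CPUs in mask whose core pointer was never initialized. */
static int toy_count_uninitialized(const struct toy_rq *rqs, toy_mask mask)
{
	int n = 0;

	for (unsigned int cpu = 0; cpu < TOY_NR_CPUS; cpu++)
		if ((mask & toy_cpu_bit(cpu)) && rqs[cpu].core == NULL)
			n++;
	return n;
}
```

In this model, walking the raw smt_mask while one sibling is still
offline finds an entry with a NULL core pointer, which is the analogue of
the crash described in the quoted text. Walking
toy_usable_siblings(smt_mask, active_mask) skips it, and once
toy_cpu_online() has run for that sibling the raw mask is safe as well.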