Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1751631AbdG0TYQ (ORCPT ); Thu, 27 Jul 2017 15:24:16 -0400 Received: from mail-qk0-f196.google.com ([209.85.220.196]:34973 "EHLO mail-qk0-f196.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1751577AbdG0TYO (ORCPT ); Thu, 27 Jul 2017 15:24:14 -0400 Date: Thu, 27 Jul 2017 15:24:10 -0400 From: Tejun Heo To: Michael Bringmann Cc: Lai Jiangshan , linux-kernel@vger.kernel.org, nfont@linux.vnet.ibm.com Subject: Re: [PATCH v6] workqueue: Fix edge cases for calc of pool's cpumask Message-ID: <20170727192410.GI742618@devbig577.frc2.facebook.com> References: <5c421712-fb38-3020-d3fc-b9bdea792cd3@linux.vnet.ibm.com> <20170727183148.GH742618@devbig577.frc2.facebook.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2120 Lines: 43 Hello, Michael. On Thu, Jul 27, 2017 at 02:07:53PM -0500, Michael Bringmann wrote: > The problem lies with the ordering of events with respect to the order in > which we add (or remove) CPUs to NUMA systems, and make use of that knowledge. Isn't the root cause that the upper layers including workqueue expect cpu <-> node mapping to be static but powerpc doesn't follow that? I don't get why ordering matters here. > The CPUs present are assigned to nodes, and workqueues and their infrastructure > are created to use the CPUs in a node. Workqueues are created at boot time > and updated or created as CPUs are added or removed. However, there is little > or no synchronization or ordering of these events, and the data structures What I meant was that there's no synchronization construct protection cpu <-> node mapping. If arch code changes it during hot plug, it's changing it underneath anybody who might be using that association. > mapping CPUs to nodes may not be updated before the workqueue infrastructure > is built for a node. Thus we have the possibility of an invalid CPU mask > attribute being attached to a newly created workqueue before the CPUs have > been properly registered and published to a node. > > This patch attempts to provide a partial ordering of events within workqueue > by delaying the use of newly calculated CPU masks as the value for a workqueue > attribute until they have valid content. Instead the workqueue code must delay > creating new workqueues until this function succeeds, or it can use a previously > calculated cpumask attribute that is known to be valid. > > This patch attempts to ensure that a valid, usable cpumask is used to set up > newly created pools for workqueues. This patch provides a fix for NUMA systems > which can add/subtract processors dynamically. The patch is expected to be an > intermediate one while developers find and correct any underlying issues. And what this patch does is adding a bandaid so that we at least don't crash immediately when this condition triggers until the arch code can be fixed properly. Thanks. -- tejun