Date: Thu, 27 Jul 2017 13:58:35 -0400
From: Tejun Heo
To: Michael Bringmann
Cc: Lai Jiangshan, linux-kernel@vger.kernel.org, nfont@linux.vnet.ibm.com
Subject: Re: [PATCH v5] workqueue: Fix edge cases for calc of pool's cpumask
Message-ID: <20170727175834.GG742618@devbig577.frc2.facebook.com>
References: <47aff1c6-73c2-bc8e-69e9-bdefdb3133c4@linux.vnet.ibm.com>
In-Reply-To: <47aff1c6-73c2-bc8e-69e9-bdefdb3133c4@linux.vnet.ibm.com>

Hello, Michael.

On Thu, Jul 27, 2017 at 12:06:22PM -0500, Michael Bringmann wrote:
>
> On NUMA systems with dynamic processors, the content of the cpumask
> may change over time. As new processors are added via DLPAR operations,
> workqueues are created for them. Depending upon the order in which CPUs
> are added/removed, we may run into problems with the content of the
> cpumask used by the workqueues. This patch deals with situations where
> the online cpumask for a node is a proper superset of possible cpumask
> for the node. It also deals with edge cases where the order in which
> CPUs are removed/added from the online cpumask may leave the set for a
> node empty, and require execution by CPUs on another node.

I think we already talked about this before, but can you please note
that this is a bandaid to work around an underlying bug? This isn't
something which normally happens on NUMA systems with dynamic processors.
This is bandaiding a hole so that the machine at least doesn't crash
immediately until we can get the underlying problem fixed properly.

Thanks.

-- 
tejun
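
For readers following the thread, the two edge cases in the quoted patch
description (a node's online CPUs falling outside its possible set, and a
node's set becoming empty) can be modelled in user space roughly as below.
This is only a sketch under stated assumptions, not the kernel code and not
the patch itself: cpumask_model_t, model_calc_node_cpumask() and all of its
parameters are hypothetical names, and a mask is simplified to one bit per
CPU in a 64-bit word.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model: bit N set means CPU N is in the mask. */
typedef uint64_t cpumask_model_t;

/*
 * Choose the CPU set a per-node worker pool should use.
 *
 * requested      - CPUs the workqueue is allowed to run on
 * node_online    - CPUs of this node that are currently online
 * node_possible  - CPUs that were recorded as possible for this node
 * cpu_going_down - CPU being removed, or -1 if none
 *
 * Returns true and stores a node-specific mask in *out when one exists.
 * Returns false and stores the requested mask in *out when the node has
 * no usable CPU, so the work may be executed by CPUs on another node.
 */
static bool model_calc_node_cpumask(cpumask_model_t requested,
                                    cpumask_model_t node_online,
                                    cpumask_model_t node_possible,
                                    int cpu_going_down,
                                    cpumask_model_t *out)
{
    assert(out != NULL);

    /* Does the node have any requested CPU online once one goes away? */
    cpumask_model_t mask = requested & node_online;
    if (cpu_going_down >= 0 && cpu_going_down < 64)
        mask &= ~(UINT64_C(1) << cpu_going_down);
    if (mask == 0)
        goto use_default;

    /*
     * Edge case: the online set may contain CPUs that are not in the
     * possible set, so this intersection can be empty even though the
     * check above passed.
     */
    mask = requested & node_possible;
    if (mask == 0)
        goto use_default;

    *out = mask;
    return true;

use_default:
    *out = requested;
    return false;
}
```

The point of the second emptiness check is the one made above: it keeps the
model from handing back an empty mask, but it does nothing about why the
online and possible sets disagree in the first place.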