Date: Thu, 6 Sep 2018 09:56:32 +0200 (CEST)
From: Thomas Gleixner
To: pheragu@codeaurora.org
cc: linux-kernel@vger.kernel.org, ckadabi@codeaurora.org, tsoni@codeaurora.org, bryanh@codeaurora.org, Prasad Sodagudi
Subject: Re: [PATCH] genirq: Avoid race between cpu hot plug and irq_desc() allocation paths
In-Reply-To: <819e8a811ebf49366d75676922903368@codeaurora.org>
References: <1536167131-20585-1-git-send-email-pheragu@codeaurora.org> <819e8a811ebf49366d75676922903368@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, 5 Sep 2018,
pheragu@codeaurora.org wrote:
> On 2018-09-05 11:23, Thomas Gleixner wrote:
> > On Wed, 5 Sep 2018, Prakruthi Deepak Heragu wrote:
> >
> > > One of the cores might have just allocated irq_desc() and other core
> > > might be doing irq migration in the hot plug path. In the hot plug path
> > > during the IRQ migration, for_each_active_irq macro is trying to get
> > > irqs whose bits are set in allocated_irqs bit map but there is no return
> > > value check after irq_to_desc for desc validity.
> >
> > Confused. All parts involved, irq allocation/deallocation and the CPU
> > hotplug code take sparse_irq_lock to prevent exactly that.
> >
> Removing the NULL pointer check and adding this sparse_irq_lock
> that you suggested will solve this issue. The code looks like
> this now. Is this okay?

No, it's not okay at all. You _CANNOT_ take sparse_irq_lock() there. I
wrote that _ALL_ parts take that lock already. Care to look at the code?

takedown_cpu()
{
	/*
	 * Prevent irq alloc/free while the dying cpu reorganizes the
	 * interrupt affinities.
	 */
	irq_lock_sparse();

	/*
	 * So now all preempt/rcu users must observe !cpu_active().
	 */
	err = stop_machine_cpuslocked(take_cpu_down, NULL, cpumask_of(cpu));

and __cpu_disable() which in turn calls irq_migrate_all_off_this_cpu() is
called from take_cpu_down(). So this deadlocks in any case.

I have no idea what kind of modifications you have in your kernel which make

  1) The thing explode in the first place

  2) Not immediately deadlock with that hack you just sent

and honestly I don't want to know at all.

Thanks,

	tglx
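
The deadlock described above can be illustrated with a small userspace
analogy. This is only a sketch using POSIX threads, not kernel code: the
names sparse_lock, migrate_irqs_with_extra_lock() and simulate_takedown()
are hypothetical stand-ins, and pthread_mutex_trylock() is used so that the
conflict is reported as EBUSY instead of hanging the way a blocking lock
would.

```c
/* Userspace analogy only: none of this is kernel code. */
#include <assert.h>
#include <errno.h>
#include <pthread.h>

/* Hypothetical stand-in for the kernel's sparse_irq_lock. */
static pthread_mutex_t sparse_lock = PTHREAD_MUTEX_INITIALIZER;

/*
 * Plays the role of take_cpu_down() -> __cpu_disable() ->
 * irq_migrate_all_off_this_cpu() with the proposed extra locking added.
 * trylock is used so the demo reports the conflict instead of hanging;
 * a plain pthread_mutex_lock() here would block forever.
 */
static void *migrate_irqs_with_extra_lock(void *arg)
{
	int *result = arg;

	*result = pthread_mutex_trylock(&sparse_lock);
	if (*result == 0)
		pthread_mutex_unlock(&sparse_lock);
	return NULL;
}

/*
 * Plays the role of takedown_cpu(): take the lock, then wait for the
 * worker to finish while still holding it. Returns the result of the
 * worker's lock attempt.
 */
static int simulate_takedown(void)
{
	pthread_t worker;
	int worker_result = -1;
	int rc;

	pthread_mutex_lock(&sparse_lock);	/* like irq_lock_sparse() */

	rc = pthread_create(&worker, NULL, migrate_irqs_with_extra_lock,
			    &worker_result);
	assert(rc == 0);
	rc = pthread_join(worker, NULL);	/* like waiting for stop_machine */
	assert(rc == 0);

	pthread_mutex_unlock(&sparse_lock);	/* like irq_unlock_sparse() */
	return worker_result;
}
```

Calling simulate_takedown() returns EBUSY, because the lock holder is
waiting for exactly the code that tries to take the lock again. With a
blocking lock in the worker, neither side could ever make progress.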