Subject: Re: Issue related cpuhotplug failure path on 4.9.x version
From: Mukesh Ojha
To: Thomas Gleixner, Greg Kroah-Hartman, Ingo Molnar, gkohli@codeaurora.org, Maria Yu, Prasad Sodagudi, neeraju@codeaurora.org, stable@vger.kernel.org
Cc: lkml
Date: Mon, 23 Jul 2018 14:49:27 +0530
References: <4c43c9d5-cd7d-4f2f-70b3-384863a3fe61@codeaurora.org>
In-Reply-To: <4c43c9d5-cd7d-4f2f-70b3-384863a3fe61@codeaurora.org>
X-Mailing-List: linux-kernel@vger.kernel.org
MIME-Version: 1.0
Content-Type: multipart/alternative; boundary="------------E9CB3212F58963C65DBD47DF"

--------------E9CB3212F58963C65DBD47DF
Content-Type: text/html; charset=utf-8
Content-Transfer-Encoding: 8bit

Adding stable and lkml.

Sorry for spamming the others.

-Mukesh

On 7/23/2018 1:57 PM, Mukesh Ojha wrote:

Hi All,

I wanted to discuss a corner case that exists in the 4.9 kernel (4.9.x):
hotplug of a CPU fails because of a failure in one of the callbacks
that is called after "notify:online" (by then notify_online() has
created the sysfs nodes for the hotplugged CPU).

During cleanup, notify_dead() does not get called, because
step->skip_onerr is set to true for "notify:prepare". As a result, the
sysfs nodes of that CPU are not cleaned up, which can cause issues in
the next hotplug attempt of that CPU.

cpuhp_up_callbacks => cpuhp_invoke_callback (fails) => undo_cpu_up

.name = "notify:prepare",
.teardown.single = notify_dead,
.skip_onerr = true,
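
To make the rollback behaviour concrete, here is a minimal user-space sketch in C. It is not kernel code: struct toy_step, toy_cpu_up(), the toy_* callbacks and the "toy:late" step are made-up names, and a single boolean stands in for the sysfs nodes. The sketch only assumes what is described above, namely that a step with skip_onerr set has its teardown skipped when a later step fails.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a hotplug step; field names follow the snippet above. */
struct toy_step {
    const char *name;
    int (*startup)(void);
    void (*teardown)(void);
    bool skip_onerr;
};

/* Stands in for the sysfs nodes of the CPU being plugged in. */
static bool toy_sysfs_node_present;

static int toy_notify_prepare(void) { return 0; }
static void toy_notify_dead(void) { toy_sysfs_node_present = false; }
static int toy_notify_online(void) { toy_sysfs_node_present = true; return 0; }
static int toy_late_callback(void) { return -1; } /* hypothetical step that fails */

static const struct toy_step toy_steps[] = {
    { .name = "notify:prepare", .startup = toy_notify_prepare,
      .teardown = toy_notify_dead, .skip_onerr = true },
    /* The teardown of this step is deliberately not modeled. */
    { .name = "notify:online", .startup = toy_notify_online },
    { .name = "toy:late", .startup = toy_late_callback },
};

#define TOY_NSTEPS (sizeof(toy_steps) / sizeof(toy_steps[0]))

/*
 * Run the startup callbacks in order. On the first failure, walk back
 * over the steps already completed and call their teardown callbacks,
 * except where skip_onerr is set. Returns 0 or the failing step's error.
 */
static int toy_cpu_up(const struct toy_step *steps, size_t nsteps)
{
    assert(steps != NULL);
    for (size_t i = 0; i < nsteps; i++) {
        int ret = steps[i].startup();
        if (ret == 0)
            continue;
        /* Roll back steps i-1 .. 0 in reverse order. */
        for (size_t j = i; j-- > 0; ) {
            if (!steps[j].skip_onerr && steps[j].teardown)
                steps[j].teardown();
        }
        return ret;
    }
    return 0;
}
```

Calling toy_cpu_up(toy_steps, TOY_NSTEPS) returns -1 and leaves toy_sysfs_node_present set, because toy_notify_dead() is never reached during the rollback.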

I think a possible solution here could be to remove
-            .skip_onerr = true,

for "notify:prepare" so that the CPU_DEAD notification gets sent.
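
As a rough illustration of the intended effect, the following user-space C sketch compares the rollback with and without the flag. All names (demo_node_present, demo_notify_dead(), demo_node_leaks()) are hypothetical and nothing here is taken from kernel/cpu.c; it only assumes that the rollback calls the teardown of "notify:prepare" when skip_onerr is not set.

```c
#include <stdbool.h>

/* Toy flag standing in for the sysfs nodes created during bring-up. */
static bool demo_node_present;

/* Toy stand-in for the cleanup done on the CPU_DEAD notification. */
static void demo_notify_dead(void) { demo_node_present = false; }

/*
 * Simulate a bring-up that fails after the node was created, then roll
 * back the "notify:prepare" step. Returns true if the node is left behind.
 */
static bool demo_node_leaks(bool prepare_skip_onerr)
{
    demo_node_present = true;   /* created before the failure */
    if (!prepare_skip_onerr)    /* rollback honours the flag */
        demo_notify_dead();
    return demo_node_present;
}
```

With prepare_skip_onerr set to true the node is left behind; with it set to false the teardown runs and the node is gone. Whether the real notifiers tolerate being called on this path is the open question raised below.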

Please feel free to point out any side effects of this change; I don't see any.

Ref:

https://elixir.bootlin.com/linux/v4.9/source/kernel/cpu.c#L458

Cheers,
Mukesh

--------------E9CB3212F58963C65DBD47DF--