From: Geert Uytterhoeven
Date: Wed, 5 Sep 2018 17:19:40 +0200
Subject: Re: [PATCH] cpu/hotplug: Fix rollback during error-out in takedown_cpu()
To: Sudeep Holla
Cc: Thomas Gleixner, Neeraj Upadhyay, Josh Triplett, Peter Zijlstra,
    Ingo Molnar, Lai Jiangshan, dzickus@redhat.com, Brendan Jackman,
    Mathieu Malaterre, Linux Kernel Mailing List, sramana@codeaurora.org,
    linux-arm-msm@vger.kernel.org, Lorenzo Pieralisi, Linux-Renesas
References: <1536042803-6152-1-git-send-email-neeraju@codeaurora.org>
    <20180905145353.GA14069@e107155-lin>
In-Reply-To: <20180905145353.GA14069@e107155-lin>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Sudeep,

Thanks for the CC!

On Wed, Sep 5, 2018 at 4:54 PM Sudeep Holla wrote:
> On Wed, Sep 05, 2018 at 02:23:46PM +0200, Thomas Gleixner wrote:
> > On Wed, 5 Sep 2018, Thomas Gleixner wrote:
> > > On Tue, 4 Sep 2018, Neeraj Upadhyay wrote:
> > > >   ret = cpuhp_down_callbacks(cpu, st, target);
> > > >   if (ret && st->state > CPUHP_TEARDOWN_CPU && st->state < prev_state) {
> > > > -         cpuhp_reset_state(st, prev_state);
> > > > +         /*
> > > > +          * As st->last is not set, cpuhp_reset_state() increments
> > > > +          * st->state, which results in CPUHP_AP_SMPBOOT_THREADS being
> > > > +          * skipped during rollback. So, don't use it here.
> > > > +          */
> > > > +         st->rollback = true;
> > > > +         st->target = prev_state;
> > > > +         st->bringup = !st->bringup;
> > >
> > > No, this is just papering over the actual problem.
> > >
> > > The state inconsistency happens in take_cpu_down() when it returns with a
> > > failure from __cpu_disable() because that returns with state = TEARDOWN_CPU
> > > and st->state is then incremented in undo_cpu_down().
> > >
> > > That's the real issue and we need to analyze the whole cpu_down rollback
> > > logic first.
> >
> > And looking closer this is a general issue. Just that the TEARDOWN state
> > makes it simple to observe. It's universally broken when the first teardown
> > callback fails, because st->state is only decremented _AFTER_ the callback
> > returns success, but undo_cpu_down() increments unconditionally.
> >
> > Patch below.
>
> This patch fixes the issue reported in [1]. Lorenzo did some debugging and
> I wanted to have a look at it at some point, but this discussion drew my
> attention and sounded very similar [2]. So I did a quick test with this
> patch and it fixes the issue.

> [1] https://lore.kernel.org/lkml/CAMuHMdVg868LgL5xTg5Dp5rReKxoo+8fRy+ETJiMxGWZCp+hWw@mail.gmail.com/
> [2] https://lore.kernel.org/lkml/20180823131505.GA31558@red-moon/

Thomas' patch fixes the issue for me:

Tested-by: Geert Uytterhoeven

Gr{oetje,eeting}s,

                        Geert

-- 
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
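
For readers following the thread, the off-by-one that Thomas describes can be reproduced in a small user-space toy model. This is only a sketch and not kernel code: `toy_state`, `toy_undo_unconditional()`, `toy_undo_guarded()` and `toy_final_state()` are hypothetical names, and the guarded rollback is just one assumed way to avoid the extra increment, not necessarily what the patch discussed above does.

```c
#include <assert.h>

/* Toy model of a hotplug-style state machine; all names are hypothetical. */
struct toy_state {
    int state;   /* step the machine is currently at */
    int target;  /* step the machine is heading for */
};

/* Rollback as described in the thread: increments before any check. */
static void toy_undo_unconditional(struct toy_state *st)
{
    for (st->state++; st->state < st->target; st->state++)
        ;   /* a real rollback would re-run startup callbacks here */
}

/* Assumed guard: step up only while below the rollback target. */
static void toy_undo_guarded(struct toy_state *st)
{
    while (st->state < st->target)
        st->state++;
}

/*
 * Walk down from start towards target. The teardown for step fail_at
 * fails; state is decremented only after a successful teardown.
 * Returns the state the machine ends up in after rollback.
 */
int toy_final_state(int start, int target, int fail_at, int guarded)
{
    struct toy_state st = { .state = start, .target = target };

    assert(start >= target);
    for (; st.state > target; st.state--) {
        if (st.state == fail_at) {
            /* Failure: roll back towards the state we started from. */
            st.target = start;
            if (guarded)
                toy_undo_guarded(&st);
            else
                toy_undo_unconditional(&st);
            break;
        }
    }
    return st.state;
}
```

In this model, `toy_final_state(5, 0, 5, 0)` ends at state 6, one step above where it started, because the very first teardown failed before any decrement happened. The guarded variant, `toy_final_state(5, 0, 5, 1)`, stays at 5. When a later step fails (for example `fail_at = 3`), both variants return to 5, which matches the observation that the problem shows up when the first teardown callback fails.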