Date: Wed, 27 Jun 2018 07:07:50 -0700
From: Sodagudi Prasad
To: Sebastian Andrzej Siewior
Cc: "Isaac J. Manjarres", peterz@infradead.org, matt@codeblueprint.co.uk,
    mingo@kernel.org, tglx@linutronix.de, gregkh@linuxfoundation.org,
    linux-kernel@vger.kernel.org
Subject: Re: [PATCH] stop_machine: Remove cpu swap from stop_two_cpus
In-Reply-To: <20180627071527.hvrndkz436yeqwpq@linutronix.de>
References: <1530048506-21393-1-git-send-email-isaacm@codeaurora.org>
    <20180627071527.hvrndkz436yeqwpq@linutronix.de>

On 2018-06-27 00:15, Sebastian Andrzej Siewior wrote:
> On 2018-06-26 14:28:26 [-0700], Isaac J. Manjarres wrote:
>> Remove CPU ID swapping in stop_two_cpus() so that the
>> source CPU's stopper thread is added to the wake queue last,
>> so that the source CPU's stopper thread is woken up last,
>> ensuring that all other threads that it depends on are woken
>> up before it runs.
>
> You can't do that because you could deadlock while locking the stopper
> lock.

Without this change, boot-up issues are observed with Linux 4.14.52.
One of the cores executes the stopper thread after wake_up_q() in the
cpu_stop_queue_two_works() function, without waking up the other
core's stopper thread. We see this issue on 100% of device boot-ups
with Linux 4.14.52.

Could you please explain a bit more how the deadlock occurs?

static int cpu_stop_queue_two_works(int cpu1, struct cpu_stop_work *work1,
				    int cpu2, struct cpu_stop_work *work2)
{
	struct cpu_stopper *stopper1 = per_cpu_ptr(&cpu_stopper, cpu1);
	struct cpu_stopper *stopper2 = per_cpu_ptr(&cpu_stopper, cpu2);
	DEFINE_WAKE_Q(wakeq);
	int err;
retry:
	raw_spin_lock_irq(&stopper1->lock);
	raw_spin_lock_nested(&stopper2->lock, SINGLE_DEPTH_NESTING);

I think you are suggesting switching the locking sequence too, i.e.
stopper2->lock and stopper1->lock. Could you please share a test case
to stress this code flow?

> Couldn't you swap cpu1+cpu2 and work1+work2?

work1 and work2 have the same data contents.

>
>> diff --git a/kernel/stop_machine.c b/kernel/stop_machine.c
>> index f89014a..d10d633 100644
>> --- a/kernel/stop_machine.c
>> +++ b/kernel/stop_machine.c
>> @@ -307,8 +307,6 @@ int stop_two_cpus(unsigned int cpu1, unsigned int
>> cpu2, cpu_stop_fn_t fn, void *
>>  	cpu_stop_init_done(&done, 2);
>>  	set_state(&msdata, MULTI_STOP_PREPARE);
>>
>> -	if (cpu1 > cpu2)
>> -		swap(cpu1, cpu2);
>>  	if (cpu_stop_queue_two_works(cpu1, &work1, cpu2, &work2))
>>  		return -ENOENT;
>>
>
> Sebastian

-- 
The Qualcomm Innovation Center, Inc. is a member of the Code Aurora
Forum, Linux Foundation Collaborative Project
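
A note on the deadlock concern raised in the thread: the swap presumably exists so that the two stopper locks are always taken in the same order (lower CPU number first). Without a fixed order, stop_two_cpus(0, 1) on one CPU and stop_two_cpus(1, 0) on another could each take their first lock and then wait forever for the other's (an ABBA deadlock). The sketch below illustrates only that ordering rule, in userspace C, assuming pthread mutexes as stand-ins for the per-CPU stopper locks. The names stopper_lock, order_pair(), lock_two(), unlock_two() and run_demo() are hypothetical and are not kernel code.

```c
#include <pthread.h>

#define NR_CPUS 4

/* Hypothetical stand-ins for the per-CPU stopper locks. */
static pthread_mutex_t stopper_lock[NR_CPUS] = {
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
	PTHREAD_MUTEX_INITIALIZER, PTHREAD_MUTEX_INITIALIZER,
};

static long shared_counter;

/* Normalize the pair so that *a <= *b, mirroring "if (cpu1 > cpu2) swap()". */
static void order_pair(int *a, int *b)
{
	if (*a > *b) {
		int tmp = *a;
		*a = *b;
		*b = tmp;
	}
}

/* Take both locks, always lower index first. Assumes cpu1 != cpu2. */
static void lock_two(int cpu1, int cpu2)
{
	order_pair(&cpu1, &cpu2);
	pthread_mutex_lock(&stopper_lock[cpu1]);
	pthread_mutex_lock(&stopper_lock[cpu2]);
}

/* Release in the reverse order of acquisition. */
static void unlock_two(int cpu1, int cpu2)
{
	order_pair(&cpu1, &cpu2);
	pthread_mutex_unlock(&stopper_lock[cpu2]);
	pthread_mutex_unlock(&stopper_lock[cpu1]);
}

struct pair {
	int cpu1;
	int cpu2;
	int iters;
};

/* Each worker repeatedly takes its pair of locks and bumps the counter. */
static void *worker(void *arg)
{
	struct pair *p = arg;

	for (int i = 0; i < p->iters; i++) {
		lock_two(p->cpu1, p->cpu2);
		shared_counter++;
		unlock_two(p->cpu1, p->cpu2);
	}
	return NULL;
}

/* Two threads ask for the same pair in opposite argument order. */
static long run_demo(int iters)
{
	struct pair a = { 0, 1, iters };
	struct pair b = { 1, 0, iters };
	pthread_t ta, tb;

	shared_counter = 0;
	pthread_create(&ta, NULL, worker, &a);
	pthread_create(&tb, NULL, worker, &b);
	pthread_join(ta, NULL);
	pthread_join(tb, NULL);
	return shared_counter;
}
```

Calling run_demo() starts two threads that request the pair as (0, 1) and (1, 0). Because lock_two() normalizes the order, both threads finish. If the ordering step were removed, each thread could hold one lock while waiting for the other, which is the situation the swap in stop_two_cpus() appears to guard against.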