Received: by 2002:a05:6358:16cc:b0:ea:6187:17c9 with SMTP id r12csp8903185rwl; Tue, 10 Jan 2023 21:49:44 -0800 (PST) X-Google-Smtp-Source: AMrXdXu8HhHzBgBdyJ/iPuXjKSLplgptICRn1t1cQz9z8JH9944h9g6iI6E90lAwB7fzgiW428TU X-Received: by 2002:a05:6a20:8347:b0:b0:4f70:b2b2 with SMTP id z7-20020a056a20834700b000b04f70b2b2mr65418047pzc.46.1673416184049; Tue, 10 Jan 2023 21:49:44 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1673416184; cv=none; d=google.com; s=arc-20160816; b=YWQ5stFYS64M8U/TC5u1J3F1bI2LCmR/WwRWSARTJIzcs6AViS4UlJjgy4ASKUkM6o XZJb6M4YFg4pBg59kgCJhtiMd0DFqriLl6wPjSM4ZQWywDrRZDaMfdVXjnGjUZm64l0r noRfQPhGXbMnOPZevPHvkfp81RGATOwC3ZtCGywexjHKI6HET6oprd7eJNRUG3BfZBqc jmUj8wjvQF8E1guYmFS7FCvE/no15CQ5eJ82YqnDxJ4Co6luPrEnXNr3hDBpe7opTwFU KviKadzL926dw3T4deJDzZC5cyIh3SDWeL8gd1QLymzudnNmH2Tff7fmnm7zxDP8ahFg dmGg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:in-reply-to:subject :from:references:cc:to:content-language:user-agent:mime-version:date :message-id:dkim-signature; bh=6efOzxskyle/vWG9BU/vVCnQaPEf1/mgX3EOfun9a6Q=; b=VKA/jDuj7Upy/VcSsEU+4LbZ0+c6xEQyOUdpwl1fwHpAdbFjI850vPG4/QKKn7KBI7 Qw0egGlcW4sZt1kkZfwaXP65IQC3ZGEopPzHGsIbbRlzKiu7b/DaHd+0/IrEimnxnB7d 2eiPZ47HC+lnicCWgETALFCxKgTlD3r1OScs8bgKNZ5d4TDsGTLvl6Q3xDkZ7J+vGJWa x3uI+hfXm2mvNMNYtgIWgU6OnomjdNnb0IISrhLie2LSkOBwzD8D9ZY7G/NLKnlhsAyU ikM10cJ+Me0BYwkPT9b6bbF2s0NolDRERTEMBYzGQ7Wg8e4WyMNDoXjBNrFaPVFF1R4R b9/g== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@marcan.st header.s=default header.b=eJbjeLpx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=marcan.st Return-Path: Received: from out1.vger.email (out1.vger.email. [2620:137:e000::1:20]) by mx.google.com with ESMTP id s22-20020a63f056000000b004790756c656si13548316pgj.299.2023.01.10.21.49.37; Tue, 10 Jan 2023 21:49:44 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) client-ip=2620:137:e000::1:20; Authentication-Results: mx.google.com; dkim=pass header.i=@marcan.st header.s=default header.b=eJbjeLpx; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 2620:137:e000::1:20 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=QUARANTINE sp=QUARANTINE dis=NONE) header.from=marcan.st Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S237885AbjAKFW2 (ORCPT + 55 others); Wed, 11 Jan 2023 00:22:28 -0500 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:57196 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S234716AbjAKFS3 (ORCPT ); Wed, 11 Jan 2023 00:18:29 -0500 Received: from mail.marcansoft.com (marcansoft.com [212.63.210.85]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A8A7713F8C for ; Tue, 10 Jan 2023 21:10:49 -0800 (PST) Received: from [127.0.0.1] (localhost [127.0.0.1]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256) (No client certificate requested) (Authenticated sender: marcan@marcan.st) by mail.marcansoft.com (Postfix) with ESMTPSA id BE7A742165; Wed, 11 Jan 2023 05:10:44 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=marcan.st; s=default; t=1673413848; bh=8p9ORD/j7MyhO+y0jemuohQe4qSH/sbLWaipDi0ZxtU=; h=Date:To:Cc:References:From:Subject:In-Reply-To; b=eJbjeLpx3locQgq2Ide2Gy2Z0NuQpOwqetqb56fj/QmKJ9+7f4TsjLb2AhTZqfHUA vxKhqktsYfv7PKp0T23pA+W93fFKXKCATCDQkTNgcefNPX4TlewQCn/6Sm4YXl55IO JBMYoNdDNvUUD0BQbHkfzai2owStuKyfJjP4lhSUOLHQLfmCoq5dsSuaFl/iMpgO92 mfSDZFctS2RoOfPIeMKmOSEWmz7iONmzp+hMoet/z2KyLos3AA8rVfKOCfxHNsGyJ5 ugfVZEEmGIsGCRgJ9QbQxm2eLty+mEHe8KeqZAGa6pzSbUBl4smDQTXpF9VrQdB+Mn pgQe5J3y8KfLQ== Message-ID: Date: Wed, 11 Jan 2023 14:10:42 +0900 MIME-Version: 1.0 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:91.0) Gecko/20100101 Thunderbird/91.9.1 Content-Language: en-US To: Christoph Hellwig Cc: Keith Busch , Jens Axboe , Sagi Grimberg , Eric Curtin , Janne Grunau , Sven Peter , Alyssa Rosenzweig , asahi@lists.linux.dev, linux-arm-kernel@lists.infradead.org, linux-nvme@lists.infradead.org, linux-kernel@vger.kernel.org References: <20230111043614.27087-1-marcan@marcan.st> <20230111043614.27087-2-marcan@marcan.st> <20230111045402.GB15520@lst.de> From: Hector Martin Subject: Re: [PATCH 1/2] nvme-apple: Do not try to shut down the controller twice In-Reply-To: <20230111045402.GB15520@lst.de> Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 7bit X-Spam-Status: No, score=-2.1 required=5.0 tests=BAYES_00,DKIM_SIGNED, DKIM_VALID,DKIM_VALID_AU,DKIM_VALID_EF,NICE_REPLY_A,SPF_HELO_NONE, SPF_PASS autolearn=ham autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 11/01/2023 13.54, Christoph Hellwig wrote: > On Wed, Jan 11, 2023 at 01:36:13PM +0900, Hector Martin wrote: >> The blamed commit stopped explicitly disabling the controller when we do >> a controlled shutdown, but apple_nvme_reset_work was only checking for >> the disable bit before deciding to issue another disable. Check for the >> shutdown state too, to avoid breakage. >> >> This issue does not affect nvme-pci, since it only issues controller >> shutdowns when the system is actually shutting down anyway. > > There's a few other places where nvme-pci does a shutdown like > probe/reset failure and most notably and mostly notably various > power management scenarios. > > What path is causing a problem here for nvme-apple? I fear we're > missing some highler level check here and getting further out of > sync. > OK, so the first question is who is responsible for resetting the controller after a shutdown? The spec requires a reset in order to bring it back up from that state. Indeed the PCIe code does an explicit disable right now (though, judging by the comment, it probably wasn't done with the intent of resetting after a shutdown, it just happens to work for that too :)) Right now, apple_nvme_reset_work() tries to check for the condition of an enabled controller (under the assumption that it's coming from a live controller, not considering shutdown/sleep) and issue an apple_nvme_disable(). However, this doesn't work when resuming because at that point the firmware coprocessor is shut down, so the device isn't usable (can't even get a disable command to complete properly). Perhaps a better conditional here would be to check for apple_rtkit_is_running(), since apple_nvme_disable() can't work otherwise. But then, even if we get past that, once apple_nvme_reset_work actually resets the firmware CPU and kicks things back online, the controller is still logically in the shutdown state. So something has to disable it in order for nvme_enable_ctrl() to work. An alternate patch would be to change the gate for apple_nvme_disable() in apple_nvme_reset_work() to check for apple_rtkit_is_running() on top of the controller enable state, and then add a further direct call to nvme_disable_ctrl(..., false) later in apple_nvme_reset_work, once the firmware is back up, to ensure we can enable it after. Thoughts? Alternatively, we could just revert to the prior behavior of always issuing a disable after a shutdown. We need to disable at some point to come back anyway, so it might as well be done early (before we shut down firmware, so it works). - Hector