Date: Wed, 18 Oct 2023 11:44:35 +0200
From: Lukas Wunner
To: Ricky WU
Cc: Kai-Heng Feng, bhelgaas@google.com, linux-pm@vger.kernel.org,
    linux-mmc@vger.kernel.org, Kees Cook, Tony Luck, Guilherme G. Piccoli,
    linux-pci@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH] PCI: pciehp: Prevent child devices from doing RPM on PCIe Link Down
Message-ID: <20231018094435.GA21090@wunner.de>
In-Reply-To: <263982e90fc046cf977ecb8727003690@realtek.com>

[cc -= unrelated mailing lists bpf, kernel-hardening]

On Tue, Oct 17, 2023 at 10:25:39AM +0000, Ricky WU wrote:
> > On Mon, Oct 16, 2023 at 12:01:31PM +0800, Kai-Heng Feng wrote:
> > > When inserting an SD7.0 card to Realtek card reader, it can trigger
> > > PCI slot Link down and causes the following error:
> >
> > Why does *inserting* a card cause a Link Down?
> >
> >
> > > [   63.898861] pcieport 0000:00:1c.0: pciehp: Slot(8): Link Down
> > > [   63.912118] BUG: unable to handle page fault for address:
> > ffffb24d403e5010
> > [...]
> > > [   63.912198]  ? asm_exc_page_fault+0x27/0x30
> > > [   63.912203]  ? ioread32+0x2e/0x70
> > > [   63.912206]  ? rtsx_pci_write_register+0x5b/0x90 [rtsx_pci]
> > > [   63.912217]  rtsx_set_l1off_sub+0x1c/0x30 [rtsx_pci]
> > > [   63.912226]  rts5261_set_l1off_cfg_sub_d0+0x36/0x40 [rtsx_pci]
> > > [   63.912234]  rtsx_pci_runtime_idle+0xc7/0x160 [rtsx_pci]
> > > [   63.912243]  ? __pfx_pci_pm_runtime_idle+0x10/0x10
> > > [   63.912246]  pci_pm_runtime_idle+0x34/0x70
> > > [   63.912248]  rpm_idle+0xc4/0x2b0
> > > [   63.912251]  pm_runtime_work+0x93/0xc0
> > > [   63.912254]  process_one_work+0x21a/0x430
> > > [   63.912258]  worker_thread+0x4a/0x3c0
> >
> > This looks like pcr->remap_addr is accessed after it has been iounmap'ed in
> > rtsx_pci_remove() or before it has been iomap'ed in rtsx_pci_probe().
> >
> > Is the card reader itself located below a hotplug port and unplugged here?
>
> Yes it is card reader unplug itself, because sd7.0 card is not used
> rtsx_pcr, it use nvme driver

I can only guess here as the dmesg and lspci output I requested hasn't
been provided:

I assume that this card reader may contain a PCIe switch with a regular
SD card reader below a first Downstream Port (which is hotplug-capable).
If an SD express card is inserted, the regular SD card reader disappears
from the Downstream Port and the inserted card appears as an NVMe drive
(possibly below a second Downstream Port, or below the first Downstream
Port).

The commit message is somewhat misleading in that it links the unhandled
page fault to card insertion.  That may be the trigger, but the root cause
appears to be that a runtime PM request is performed asynchronously after
the SD card reader has iounmap'ed pcr->remap_addr.

If that is indeed the root cause (as I suspect), the fix needs to be
placed somewhere else.  pciehp is only one of several hotplug drivers
and fixing this only in pciehp may leave the other hotplug drivers
vulnerable to the same issue.

Unfortunately the information provided so far is insufficient to
recommend a better fix.
It would be necessary to debug why the asynchronous RPM request is not
canceled even though rtsx_pci_remove() runtime resumes the device before
iounmap'ing pcr->remap_addr.  Perhaps there's a bug in runtime PM code
that causes asynchronous requests to not be canceled upon a
pm_runtime_get_sync()?  Or perhaps a wmb() is necessary in
pci_device_remove() after setting pci_dev->driver = NULL?

Thanks,

Lukas