Date: Thu, 14 Dec 2023 10:40:20 +0800
From: Ethan Zhao
To: Robin Murphy, Lukas Wunner
Cc: bhelgaas@google.com, baolu.lu@linux.intel.com, dwmw2@infradead.org,
    will@kernel.org, linux-pci@vger.kernel.org, iommu@lists.linux.dev,
    linux-kernel@vger.kernel.org, Haorong Ye
Subject: Re: [PATCH 2/2] iommu/vt-d: don's issue devTLB flush request when device is disconnected
References: <20231213034637.2603013-1-haifeng.zhao@linux.intel.com>
    <20231213034637.2603013-3-haifeng.zhao@linux.intel.com>
    <20231213104417.GA31964@wunner.de>
    <3b7742c4-bbae-4a78-a5a6-30df936a17d4@arm.com>
In-Reply-To: <3b7742c4-bbae-4a78-a5a6-30df936a17d4@arm.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/13/2023 7:54 PM, Robin Murphy wrote:
> On 13/12/2023 10:44 am, Lukas Wunner wrote:
>> On Tue, Dec 12, 2023 at 10:46:37PM -0500, Ethan Zhao wrote:
>>> For those endpoint devices connect to system via hotplug capable ports,
>>> users could request a warm reset to the device by flapping device's link
>>> through setting the slot's link control register,
>>
>> Well, users could just *unplug* the device, right?  Why is it relevant
>> that thay could fiddle with registers in config space?
>>
>>
>>> as pciehpt_ist() DLLSC
>>> interrupt sequence response, pciehp will unload the device driver and
>>> then power it off. thus cause an IOMMU devTLB flush request for device to
>>> be sent and a long time completion/timeout waiting in interrupt context.
>>
>> A completion timeout should be on the order of usecs or msecs, why does it
>> cause a hard lockup?  The dmesg excerpt you've provided shows a 12 *second*
>> delay between hot removal and watchdog reaction.
>
> The PCIe spec only requires an endpoint to respond to an ATS
> invalidate within a rather hilarious 90 seconds, so it's primarily a
> question of how patient the root complex and bridges in between are
> prepared to be.

The reported issue only reproduces when the endpoint device connects to
the system via a PCIe switch (one that only has the read tracking
feature); those switches seem not to be aware of ATS transactions,
whereas the root port is aware of ATS when the ATS transaction is broken
(invalidation sent, but link down, device removed, etc.). I didn't find
any public spec about that, though.

>
>>> Fix it by checking the device's error_state in
>>> devtlb_invalidation_with_pasid() to avoid sending meaningless devTLB flush
>>> request to link down device that is set to pci_channel_io_perm_failure and
>>> then powered off in
>>
>> This doesn't seem to be a proper fix.  It will work most of the time
>> but not always.  A user might bring down the slot via sysfs, then yank
>> the card from the slot just when the iommu flush occurs such that the
>> pci_dev_is_disconnected(pdev) check returns false but the card is
>> physically gone immediately afterwards.  In other words, you've shrunk
>> the time window during which the issue may occur, but haven't eliminated
>> it completely.
>
> Yeah, I think we have a subtle but fundamental issue here in that the
> iommu_release_device() callback is hooked to
> BUS_NOTIFY_REMOVED_DEVICE, so in general probably shouldn't be
> assuming it's safe to do anything with the device itself *after* it's
> already been removed from its bus - this step is primarily about
> cleaning up any of the IOMMU's own state relating to the given device.
>
> I think if we want to ensure ATCs are invalidated on hot-unplug we
> need an additional pre-removal notifier to take care of that, and that
> step would then want to distinguish between an orderly removal where
> cleaning up is somewhat meaningful, and a surprise removal where it
> definitely isn't.

So, at least, we should check the device state before issuing a devTLB
invalidation.

Thanks,
Ethan

>
>
> Thanks,
> Robin.
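
For illustration only, here is a minimal user-space C sketch of the two
ideas discussed in this thread: skipping the devTLB flush when the device
is already marked permanently failed, and a pre-removal step that treats
orderly and surprise removal differently. It is not kernel code and not
the patch under review. Every `mock_*` / `MOCK_*` name is hypothetical;
the only thing taken from the thread is the idea that a disconnected
device is one whose error_state is pci_channel_io_perm_failure. The
comment in mock_flush_dev_tlb() marks the window Lukas describes, which
this kind of check narrows but cannot close.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the kernel's PCI channel states. */
enum mock_channel_state {
    MOCK_IO_NORMAL,
    MOCK_IO_FROZEN,
    MOCK_IO_PERM_FAILURE,
};

/* Hypothetical stand-in for struct pci_dev; only what the sketch needs. */
struct mock_dev {
    enum mock_channel_state error_state;
    unsigned int flushes_sent; /* counts simulated invalidation requests */
};

/* Hypothetical: how the removal was initiated. */
enum mock_removal_kind {
    MOCK_REMOVAL_ORDERLY,  /* e.g. slot brought down first */
    MOCK_REMOVAL_SURPRISE, /* card yanked or link dropped */
};

/* Models the disconnected test: permanent failure means the device is gone. */
bool mock_dev_is_disconnected(const struct mock_dev *dev)
{
    assert(dev != NULL);
    return dev->error_state == MOCK_IO_PERM_FAILURE;
}

/*
 * Issue a simulated devTLB invalidation unless the device is known gone.
 * Returns true if a request was sent.
 */
bool mock_flush_dev_tlb(struct mock_dev *dev)
{
    if (mock_dev_is_disconnected(dev))
        return false; /* nothing to flush; avoid waiting on a dead link */

    /*
     * Race window: real hardware could disappear right here, after the
     * check passed. The check shrinks the window; it does not remove it.
     */
    dev->flushes_sent++;
    return true;
}

/*
 * Sketch of a pre-removal step: flush only on orderly removal.
 * On surprise removal, mark the device failed and skip the flush.
 */
bool mock_pre_removal(struct mock_dev *dev, enum mock_removal_kind kind)
{
    assert(dev != NULL);
    if (kind == MOCK_REMOVAL_SURPRISE) {
        dev->error_state = MOCK_IO_PERM_FAILURE;
        return false;
    }
    return mock_flush_dev_tlb(dev);
}
```

In this model a later release step would only clean up the IOMMU's own
bookkeeping and never touch the device, which matches Robin's point about
what is safe after the device has left its bus.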