Received: by 2002:a05:7412:b995:b0:f9:9502:5bb8 with SMTP id it21csp1805205rdb; Sun, 24 Dec 2023 02:47:26 -0800 (PST) X-Google-Smtp-Source: AGHT+IFeUz/i+xNkI30qzuwZG3kcoJiCf3t7+SmD4Zva+S9kFay5Rty6x4oKnNNXLvg8dM63a174 X-Received: by 2002:a05:620a:4690:b0:77f:8a87:ebfd with SMTP id bq16-20020a05620a469000b0077f8a87ebfdmr4301010qkb.41.1703414846193; Sun, 24 Dec 2023 02:47:26 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1703414846; cv=none; d=google.com; s=arc-20160816; b=xYUfS4SMNFkB7/n5fRctKNsEHfAYYXaBcPSyUSmEdkevQOEQTYy+T0V85GJ+SrgmrG ZnAthvv1PtU2y8cYZgUUEGicoTlvA7O5yY9XQvXoub5z4WjCUDboCWEqNTwGA49nWbu4 SDlWhgW09BSo+PMPt02diAHwz3F0PVg/U7zagO9mRJKI/vMfuSS279HWbfceoRk9pr6E Ei1tzpP6+5+cO3MLG7RDNYZ+4lLo/I31kR5fUhZ9z9U6zStJdPkP5EDyD8knb1yWdltR NUjOhqfhAPOhagn6VaMblNg5TAjr/UNJHncsm4bYrgtZQoln61pFU1LfW75rL4ZYTEva 1Yyg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=user-agent:in-reply-to:content-disposition:mime-version :list-unsubscribe:list-subscribe:list-id:precedence:references :message-id:subject:cc:to:from:date; bh=b5vygnLtOB5a3zRfpLWFqbL8sFO/+XIm3wdPcIZJWZs=; fh=DHEMuZO50Ax5xBtjxa5sqB9Udspp1xRBp9zNoCKdeN4=; b=DUyHSEUOT7RFetLFUzePNDcThxV97StWcnXbXBVdg5YDbH7c8PerDkemWYyMP7yJJK qqlodX8l3mgNYevX0MVy4xj3g3HCJMjVdYrNVbLCgGO/Sn5cJBThvU446QTxlCyARYDE 4qy2r6tFfuVeR2c4EJf+hxu2I5ewN60zPfzrz4Hg6pVHgPm+60tgxAZPoxNpMK1UNffs t+qUCrg8IJk8dQrbcpd6hSHu3518cIyflf9Mipaafv9h5KHzIdX2N25xdm8MCmcXjJCh WwrV1yggRfkxLXQEyyAjfROzWoizRnO66ceviAk6OtQWPrrwsm5pixXMMFzPg0odYqhS cd2w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-10688-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-10688-linux.lists.archive=gmail.com@vger.kernel.org" Return-Path: Received: from ny.mirrors.kernel.org (ny.mirrors.kernel.org. [2604:1380:45d1:ec00::1]) by mx.google.com with ESMTPS id c16-20020ae9e210000000b0078129984488si5801365qkc.148.2023.12.24.02.47.26 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Sun, 24 Dec 2023 02:47:26 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-10688-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) client-ip=2604:1380:45d1:ec00::1; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel+bounces-10688-linux.lists.archive=gmail.com@vger.kernel.org designates 2604:1380:45d1:ec00::1 as permitted sender) smtp.mailfrom="linux-kernel+bounces-10688-linux.lists.archive=gmail.com@vger.kernel.org" Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by ny.mirrors.kernel.org (Postfix) with ESMTPS id E83CC1C217E8 for ; Sun, 24 Dec 2023 10:47:25 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id 05A4423A8; Sun, 24 Dec 2023 10:47:20 +0000 (UTC) X-Original-To: linux-kernel@vger.kernel.org Received: from bmailout2.hostsharing.net (bmailout2.hostsharing.net [83.223.78.240]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 7328220F9; Sun, 24 Dec 2023 10:47:16 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=none (p=none dis=none) header.from=wunner.de Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=h08.hostsharing.net Received: from h08.hostsharing.net (h08.hostsharing.net [IPv6:2a01:37:1000::53df:5f1c:0]) (using TLSv1.3 with cipher TLS_AES_256_GCM_SHA384 (256/256 bits) key-exchange X25519 server-signature RSA-PSS (4096 bits) server-digest SHA256 client-signature RSA-PSS (4096 bits) client-digest SHA256) (Client CN "*.hostsharing.net", Issuer "RapidSSL TLS RSA CA G1" (verified OK)) by bmailout2.hostsharing.net (Postfix) with ESMTPS id 42C042800B1D7; Sun, 24 Dec 2023 11:47:09 +0100 (CET) Received: by h08.hostsharing.net (Postfix, from userid 100393) id 30CFF10A64; Sun, 24 Dec 2023 11:47:09 +0100 (CET) Date: Sun, 24 Dec 2023 11:47:09 +0100 From: Lukas Wunner To: Ethan Zhao Cc: bhelgaas@google.com, baolu.lu@linux.intel.com, dwmw2@infradead.org, will@kernel.org, robin.murphy@arm.com, linux-pci@vger.kernel.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org Subject: Re: [RFC PATCH v6 4/4] iommu/vt-d: break out devTLB invalidation if target device is gone Message-ID: <20231224104709.GB31197@wunner.de> References: <20231224050657.182022-1-haifeng.zhao@linux.intel.com> <20231224050657.182022-5-haifeng.zhao@linux.intel.com> Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20231224050657.182022-5-haifeng.zhao@linux.intel.com> User-Agent: Mutt/1.10.1 (2018-07-13) On Sun, Dec 24, 2023 at 12:06:57AM -0500, Ethan Zhao wrote: > --- a/drivers/iommu/intel/dmar.c > +++ b/drivers/iommu/intel/dmar.c > @@ -1423,6 +1423,13 @@ int qi_submit_sync(struct intel_iommu *iommu, struct qi_desc *desc, > writel(qi->free_head << shift, iommu->reg + DMAR_IQT_REG); > > while (qi->desc_status[wait_index] != QI_DONE) { > + /* > + * if the devTLB invalidation target device is gone, don't wait > + * anymore, it might take up to 1min+50%, causes system hang. > + */ > + if (type == QI_DIOTLB_TYPE && iommu->flush_target_dev) > + if (!pci_device_is_present(to_pci_dev(iommu->flush_target_dev))) > + break; As a general approach, this is much better now. Please combine the nested if-clauses into one. Please amend the code comment with a spec reference, i.e. "(see Implementation Note in PCIe r6.1 sec 10.3.1)" so that readers of the code know where the magic number "1min+50%" is coming from. Is flush_target_dev guaranteed to always be a pci_dev? I'll let iommu maintainers comment on whether storing a flush_target_dev pointer is the right approach. (May store a back pointer from struct intel_iommu to struct device_domain_info?) Maybe move the "to_pci_dev(iommu->flush_target_dev)" lookup outside the loop to avoid doing this over and over again? I think we still have a problem here if the device is not removed but simply takes a long time to respond to Invalidate Requests (as it is permitted to do per the Implementation Note). We'll busy-wait for the completion and potentially run into the watchdog's time limit again. So I think you or someone else in your org should add OKRs to refactor the code so that it sleeps in-between polling for Invalidate Completions (instead of busy-waiting with interrupts disabled). Thanks, Lukas