Message-ID: <7f756fc6-e8ea-4fea-ad8b-30066f41037e@linux.intel.com>
Date: Thu, 14 Dec 2023 10:16:31 +0800
Subject: Re: [PATCH 2/2] iommu/vt-d: don's issue devTLB flush request when device is disconnected
From: Ethan Zhao
To: Lukas Wunner
Cc: bhelgaas@google.com, baolu.lu@linux.intel.com, dwmw2@infradead.org, will@kernel.org, robin.murphy@arm.com, linux-pci@vger.kernel.org, iommu@lists.linux.dev, linux-kernel@vger.kernel.org, Haorong Ye
References: <20231213034637.2603013-1-haifeng.zhao@linux.intel.com> <20231213034637.2603013-3-haifeng.zhao@linux.intel.com> <20231213104417.GA31964@wunner.de>
In-Reply-To: <20231213104417.GA31964@wunner.de>
X-Mailing-List: linux-kernel@vger.kernel.org

On 12/13/2023 6:44 PM, Lukas Wunner wrote:
> On Tue, Dec 12, 2023 at 10:46:37PM -0500, Ethan Zhao wrote:
>> For those endpoint devices connect to system via hotplug capable ports,
>> users could request a warm reset to the device by flapping device's link
>> through setting the slot's link control register,
> Well, users could just *unplug* the device, right?  Why is it relevant
> that they could fiddle with registers in config space?
>
Yes, if the device and its slot are hotplug capable, users could just
'unplug' the device.

But in the reported case, the user tried to do a warm reset with a tool
command like:

  mlxfwreset -d -y reset

That command actually accesses configuration space, just as this would:

  setpci -s 0000:17:01.0 0x78.L=0x21050010

We can't tell users not to fiddle with PCIe config space registers like
that.

>> as pciehp_ist() DLLSC
>> interrupt sequence response, pciehp will unload the device driver and
>> then power it off. thus cause an IOMMU devTLB flush request for device to
>> be sent and a long time completion/timeout waiting in interrupt context.
> A completion timeout should be on the order of usecs or msecs, why does it
> cause a hard lockup?  The dmesg excerpt you've provided shows a 12 *second*
> delay between hot removal and watchdog reaction.
>
In my understanding, the devTLB flush request sent to an ATS capable device
is a non-posted request. If the ATS transaction is broken by an endpoint
link-down or power-off event, the timeout can take 60 to 90 seconds; see
the "Invalidate Completion Timeout" note in section 10.3.1 "Invalidate
Request" of the PCIe 6.1 spec:

"
IMPLEMENTATION NOTE: INVALIDATE COMPLETION TIMEOUT

Devices should respond to Invalidate Requests within 1 minute (+50% -0%).
Having a bounded time permits an ATPT to implement Invalidate Completion
Timeouts and reuse the associated ITag values.

ATPT designs are implementation specific. As such, Invalidate Completion
Timeouts and their associated error handling are outside the scope of this
specification
"

>> Fix it by checking the device's error_state in
>> devtlb_invalidation_with_pasid() to avoid sending meaningless devTLB flush
>> request to link down device that is set to pci_channel_io_perm_failure and
>> then powered off in
> This doesn't seem to be a proper fix.  It will work most of the time
> but not always.  A user might bring down the slot via sysfs, then yank
> the card from the slot just when the iommu flush occurs such that the
> pci_dev_is_disconnected(pdev) check returns false but the card is
> physically gone immediately afterwards.  In other words, you've shrunk
> the time window during which the issue may occur, but haven't eliminated
> it completely.

If you mean disabling the slot via sysfs, that's SAFE_REMOVAL, right?
That would issue the devTLB invalidation first and power off the device
later, so it wouldn't trigger the hard lockup even though
pci_dev_is_disconnected() returns false. This fix works for that case.

Thanks,
Ethan

>
> Thanks,
>
> Lukas
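
For readers following the thread, the check being debated can be modelled in
userspace roughly as below. This is only a sketch under stated assumptions,
not the patch itself: every mock_* name and both INV_COMPLETION_* macros are
hypothetical stand-ins invented for illustration, and the real code paths
(devtlb_invalidation_with_pasid(), pci_dev_is_disconnected()) are not
reproduced here. It shows the "skip the flush when error_state says the
device is permanently gone" idea, plus the 60 to 90 second bound implied by
the quoted implementation note.

```c
#include <stdbool.h>

/* Bounds taken from the quoted note: 1 minute (+50% -0%). */
#define INV_COMPLETION_MIN_SECS 60u
#define INV_COMPLETION_MAX_SECS \
	(INV_COMPLETION_MIN_SECS + INV_COMPLETION_MIN_SECS / 2u)

/* Hypothetical stand-in for the PCI channel states named in the thread. */
enum mock_channel_state {
	MOCK_IO_NORMAL,
	MOCK_IO_FROZEN,
	MOCK_IO_PERM_FAILURE,	/* models pci_channel_io_perm_failure */
};

/* Hypothetical, heavily reduced device model. */
struct mock_pci_dev {
	enum mock_channel_state error_state;
	bool ats_enabled;
	unsigned int flushes_sent;	/* counts flush requests issued */
};

/* Models the disconnected test: true once the device is marked gone. */
bool mock_dev_is_disconnected(const struct mock_pci_dev *dev)
{
	return dev->error_state == MOCK_IO_PERM_FAILURE;
}

/*
 * Issue a devTLB flush only when it can plausibly complete.
 * Returns true if a request was sent, false if it was skipped.
 */
bool mock_devtlb_flush(struct mock_pci_dev *dev)
{
	if (!dev->ats_enabled)
		return false;	/* no device TLB to invalidate */

	if (mock_dev_is_disconnected(dev))
		return false;	/* avoid waiting on a device that is gone */

	dev->flushes_sent++;	/* stands in for queueing the request */
	return true;
}
```

Note that this is a check-then-send pattern, so it does not close the window
Lukas describes: the device can still disappear between the check and the
request.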