Received: by 2002:a05:7412:e794:b0:fa:551:50a7 with SMTP id o20csp1470031rdd; Wed, 10 Jan 2024 23:14:37 -0800 (PST) X-Google-Smtp-Source: AGHT+IGJbesbji9TR7SG1MbkMXKvwtQG1l48tJkGqb2oPGNX7x0QL1Qb1Flx8UjDvys2eZuvLqpF X-Received: by 2002:a05:6402:1503:b0:553:ad79:b7cb with SMTP id f3-20020a056402150300b00553ad79b7cbmr334577edw.3.1704957276940; Wed, 10 Jan 2024 23:14:36 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1704957276; cv=none; d=google.com; s=arc-20160816; b=odv8/ZTW7JIBfcHUw9RszlU0vADR/uD5MYNvRcQUMmVQDQ11EJUflasbegFq0L9Tkz DfPi9vXPIqwJirm9xCgPbwVGZuVhZ6n3sL6WjN6U9JgtFw2TPFsxCP/x4nQNvmfyxgmt YRGJrV8KvH3AdtpStBzmC/F/bcL/wOp+dyWYfEgVNCzsDksV/cGOXxVnXRRHNOAZJuio L7RfiK9EGg40nkuWijm8tZjoPJu4MNzLY7/7SQk8a47bqOQ9YB53AFQbIcdM66lJfpb5 RrVqEw8HNovvKhmEUMsuDDiqon8veqN5iVHTGakJbXvr+LOLg3iITYu8cWcaaslMFZfC luGg== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=content-transfer-encoding:in-reply-to:from:references:cc:to:subject :user-agent:mime-version:list-unsubscribe:list-subscribe:list-id :precedence:date:message-id:dkim-signature; bh=H+416/3n4ylqGfKmg7Sl+d9/gRrLvv2dv0lzwFUjjWo=; fh=ZJDGnKCesxy9VVbmeFer+30hxcOR4YWMFnBG1rR6pBI=; b=Ey8eiEJLSS79LtHuy3Wbl9++jVvxrwsD0DH+lQgyMLTC044BkdSXOEzMFLqVTw/hj2 porK29hgDu903Cx/pnlvnKLrfMFKaaMgSFlnelLIYTXPWgJXPL8gHGC2vpPhbVwurFHy xYe2Dx7Rs8pA3XCw886cIJioMAKcGCjQrU4YVWNexBM/F8cx8Q3wadRheAKqJTUyP6eQ Qk4ssSGarew3rwkyUOGKse+kezapfH7to37TVeM6Bj26Mr+kT2strU35Igrs5qmabaaI LTqx07O+MIffvpd6j9HDqaxAsBYTM3Vpba5+RrAp9+2qRqTxTyH7fwMNOZFFcWROghu5 SOrA== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PsxFlqiM; spf=pass (google.com: domain of linux-kernel+bounces-23132-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-23132-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from am.mirrors.kernel.org (am.mirrors.kernel.org. [147.75.80.249]) by mx.google.com with ESMTPS id z8-20020a056402274800b00557ab3b5e62si255027edd.233.2024.01.10.23.14.36 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Wed, 10 Jan 2024 23:14:36 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-23132-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) client-ip=147.75.80.249; Authentication-Results: mx.google.com; dkim=pass header.i=@intel.com header.s=Intel header.b=PsxFlqiM; spf=pass (google.com: domain of linux-kernel+bounces-23132-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.80.249 as permitted sender) smtp.mailfrom="linux-kernel+bounces-23132-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=pass (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by am.mirrors.kernel.org (Postfix) with ESMTPS id 87CA11F24534 for ; Thu, 11 Jan 2024 07:14:36 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id B3D8AD27D; Thu, 11 Jan 2024 07:14:26 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=intel.com header.i=@intel.com header.b="PsxFlqiM" Received: from mgamail.intel.com (mgamail.intel.com [192.55.52.93]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 14AA8DDAD; Thu, 11 Jan 2024 07:14:23 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=none dis=none) header.from=linux.intel.com Authentication-Results: smtp.subspace.kernel.org; spf=none smtp.mailfrom=linux.intel.com DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1704957264; x=1736493264; h=message-id:date:mime-version:subject:to:cc:references: from:in-reply-to:content-transfer-encoding; bh=TwjCUuzwUOMo/dYEA4GwmnKF1gt7VZ+tjYe5DJBJoEM=; b=PsxFlqiM25EHVaiQabPdmwbPVsYzFdepEUQIHN7hWX9FyX3YRFBTkXPs y3j3T7impGSkxmopEmFzYP/9UwwT3ivW7USqw9Hd/23aUJeBIUTOJ70Ny tHJpxPP29fJJBV5ftKyfFV2AP0wg8J71EYDTJOBXIAnI2YpHmtGpGAcew VyDdHlv0AcHp9vcmPtkWN7LMIfre8ZtIdegGKjYrF752KCyshIhH8wUdD V39xWHoA64c+XYvWGXKiuMu3xARI4tfWn2TNF8A2NpolNHAIgBFit8xfZ h+/71UebPhobW0vAlstTdfkphgYpkz/g7yG/Swt3J5sEEj0auv5dOkb64 Q==; X-IronPort-AV: E=McAfee;i="6600,9927,10949"; a="395910247" X-IronPort-AV: E=Sophos;i="6.04,185,1695711600"; d="scan'208";a="395910247" Received: from orsmga003.jf.intel.com ([10.7.209.27]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jan 2024 23:14:23 -0800 X-ExtLoop1: 1 X-IronPort-AV: E=McAfee;i="6600,9927,10949"; a="732116839" X-IronPort-AV: E=Sophos;i="6.04,185,1695711600"; d="scan'208";a="732116839" Received: from zhaohaif-mobl.ccr.corp.intel.com (HELO [10.93.8.238]) ([10.93.8.238]) by orsmga003-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 10 Jan 2024 23:14:15 -0800 Message-ID: <4d50c00a-9718-4ec5-bdef-ea14c7727ff4@linux.intel.com> Date: Thu, 11 Jan 2024 15:14:12 +0800 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH v8 07/10] iommu/vt-d: Allow qi_submit_sync() to return the QI faults To: Baolu Lu , "Tian, Kevin" , "Liu, Yi L" , "joro@8bytes.org" , "alex.williamson@redhat.com" , "jgg@nvidia.com" , "robin.murphy@arm.com" Cc: "cohuck@redhat.com" , "eric.auger@redhat.com" , "nicolinc@nvidia.com" , "kvm@vger.kernel.org" , "mjrosato@linux.ibm.com" , "chao.p.peng@linux.intel.com" , "yi.y.sun@linux.intel.com" , "peterx@redhat.com" , "jasowang@redhat.com" , "shameerali.kolothum.thodi@huawei.com" , "lulu@redhat.com" , "suravee.suthikulpanit@amd.com" , "iommu@lists.linux.dev" , "linux-kernel@vger.kernel.org" , "linux-kselftest@vger.kernel.org" , "Duan, Zhenzhong" , "joao.m.martins@oracle.com" , "Zeng, Xin" , "Zhao, Yan Y" , "j.granados@samsung.com" References: <20231227161354.67701-1-yi.l.liu@intel.com> <20231227161354.67701-8-yi.l.liu@intel.com> <50098edc-2bbb-4c8f-9360-6990f0f5d88a@linux.intel.com> From: Ethan Zhao In-Reply-To: <50098edc-2bbb-4c8f-9360-6990f0f5d88a@linux.intel.com> Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit On 1/1/2024 11:34 AM, Baolu Lu wrote: > On 12/28/23 2:17 PM, Tian, Kevin wrote: >>> raw_spin_lock_irqsave(&qi->q_lock, flags); >>>       /* >>> @@ -1430,7 +1439,7 @@ int qi_submit_sync(struct intel_iommu *iommu, >>> struct qi_desc *desc, >>>            * a deadlock where the interrupt context can wait >>> indefinitely >>>            * for free slots in the queue. >>>            */ >>> -        rc = qi_check_fault(iommu, index, wait_index); >>> +        rc = qi_check_fault(iommu, index, wait_index, fault); >>>           if (rc) >>>               break; >> and as replied in another thread let's change qi_check_fault to return >> -ETIMEDOUT to break the restart loop when fault pointer is valid. > > It's fine to break the retry loop when fault happens and the fault > pointer is valid. Please don't forget to add an explanation comment > around the code. Something like: > > /* >  * The caller is able to handle the fault by itself. The IOMMU driver >  * should not attempt to retry this request. >  */ If caller could pass desc with mixed iotlb & devtlb invalidation request, it would be problematic/difficult for caller or qi_submit_sync() to do error handling, imagine a case like, 1. call qi_submit_sync() with iotlb & devltb. 2. qi_submit_sync() detects the target device is dead. 3.  break the loop, or will block other invalidation submitter / hang. 4. it is hard for qi_submit_sync() to extract those iotlb invalidation to retry. 5. it is also difficult for caller to retry the iotlb invalidation, or     leave iotlb out-of-sync. ---there is no sync at all, device is gone. and if only ITE fault hit, but target device is there && configuration space reading okay, the ITE is probably left by previous request for other device, not triggered by this batch, the question is we couldn't identify the ITE device is just the same as current target ? if the same, then breaking out is reasonable, or just leave the problem to caller, something in the request batch is bad, some requests someone request befoere is bad, but the request is not from the same caller. Thanks, Ethan > > Best regards, > baolu >