From: James Morse
Subject: Re: Resend: How to handle the SMMU RAS Error in the kernel
To: gengdongjiu
Cc: robin.murphy@arm.com, arm-mail-list, xuqiang36@huawei.com, Linux Kernel Mailing List, gengdongjiu
Date: Mon, 19 Nov 2018 18:05:29 +0000
X-Mailing-List: linux-kernel@vger.kernel.org

Hi gengdongjiu,

On 17/11/2018 15:41, gengdongjiu wrote:
> In the current kernel, it only handles three kinds of error, which is
> memory error, PCIE device and ARM process.
> But now the SMMU already
> support the RAS, how to handle the SMMU RAS error in the kernel?

What errors are being detected here?

I don't know much about the SMMU, but I think we should start with a list
of errors that we want to handle.

Is this a v8.2 fault handling interrupt from the SMMU taken to EL3?
Or a cpu-access that was returned as external-abort?
Or a device access that was told external-abort?

What do we intend to do with this error information?
Does the DMA layer have error handling we can hook this into?
Is this just another interface for memory-errors? (e.g. the SMMU provides
a device/address pair and the kernel works out the physical page to run
memory_failure() on)

> I check the UEFI_SPEC_2.7, the ACPI's CPER have the IOMMU type, but it
> seems the IOMMU type only are specific to AMD’s IOMMU specification,

... and Intel VT-d. It looks like UEFI generalises all these as types of
'DMAr'.

> not have the ARM’s IOMMU section type, can we reuse this IOMMU section
> type for the ARM SMMU?

The architecture specific records for AMD? No. Even if the information
was the same, the presence of this record tells you it's an AMD IOMMU,
which it's not.

The generic error section? Maybe. Assuming the 'fault reason' list in
Table 285 is sufficient to cover our list of errors, we can use the
'DMAr Generic Errors' section (N.2.11.1) to describe the generic bits of
the error ... but SMMU doesn't have an 'Architecture Type', so we at
least need to get one allocated. We will probably need an architecture
specific 'DMAr Error Section'.

I think adding the UEFI bits is probably the 'easy' bit. We should start
with a list of errors, and the error handling code. This way we know what
we need to add to the spec.

Thanks,

James