Subject: Re: Resend: How to handle the SMMU RAS Error in the kernel
From: James Morse
To: gengdongjiu, gengdongjiu
Cc: robin.murphy@arm.com, arm-mail-list, xuqiang36@huawei.com, Linux Kernel Mailing List
Date: Fri, 30 Nov 2018 18:37:29 +0000
In-Reply-To: <7a93bf1b-7b7b-240e-bc74-68bf0ee18ac9@huawei.com>

Hi gengdongjiu,

On 21/11/2018 08:10, gengdongjiu wrote:
> On 2018/11/20 2:05, James Morse wrote:
>> On 17/11/2018 15:41, gengdongjiu wrote:
>>> In the current kernel, it only handles three kinds of error, which is
>>> memory error, PCIE device and ARM process. But now the SMMU already
>>> support the RAS, how to handle the SMMU RAS error in the kernel?
>>
>> What errors are being detected here?
>>
>> I don't know much about the SMMU, but I think we should start with a list of
>> errors that we want to handle.
>
> In our platform, the SMMU RAS error mainly include below which flow the SMMU spec:
> 1. one bit ECC error, reported as CE.
> 2. two bits ECC error, reported as UEU.
> 3. fetch error in the SMMUv3 spec, reported as UER.

These are faults, but this isn't enough information for software to act on.

Are these faults in the device, host-memory, or memory that is part of the SMMU?
Was the error discovered during a read/write by the device (which one?), or
during an SMMU access to its page-tables or command-queue?

> The 2 and 3 should be handled, but I do not know how do recovery to it.

Me neither. If we can come up with the errors that can be detected, we can work
out which ones can be handled.

Thanks,

James