Date: Thu, 31 Mar 2016 19:05:00 +0300
From: Yury Norov
To: Mark Rutland
Subject: Re: [RFC] [PATCH] arm64: survive after access to unimplemented register
Message-ID: <20160331160500.GC29800@yury-N73SV>
References: <1459391223-3826-1-git-send-email-ynorov@caviumnetworks.com>
 <20160331100547.GA26532@leverpostej>
 <20160331122859.GA27859@yury-N73SV>
 <20160331131231.GF26532@leverpostej>
In-Reply-To: <20160331131231.GF26532@leverpostej>
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Mar 31, 2016 at 02:12:31PM +0100, Mark Rutland wrote:
> On Thu, Mar 31, 2016 at 03:28:59PM +0300, Yury Norov wrote:
> > Hi Mark,
> >
> > On Thu, Mar 31, 2016 at 11:05:48AM +0100, Mark Rutland wrote:
> > > On Thu, Mar 31, 2016 at 05:27:03AM +0300, Yury Norov wrote:
> > > > Not all vendors implement all the system registers ARM specifies.
> > >
> > > The ID registers in question are precisely documented in the ARM ARM
> > > (see table C5-6 in ARM DDI 0487A.i). Specifically, the ID space
> > > ID_AA64MMFR2_EL1 now falls in to is listed as RAZ.
> > >
> > > Any deviation from this is an erratum, and needs to be handled as such
> > > (e.g. listing in silicon-errata.txt).
> > >
> > > Does the issue affect ThunderX natively?
> >
> > Yes, Thunder is involved, but I cannot tell more due to NDA.
> > And this error is not in silicon-errata.txt.
> > I'll ask permission to share more details.
>
> Ok.
> Regardless of how this is solved, we need to know the details of the
> erratum (and need an entry in silicon-errata.txt).
>
> > > > So access them causes undefined instruction abort and then kernel
> > > > panic. There are 3 ways to handle it we can figure out:
> > > >  - use conditional compilation and erratas;
> > > >  - use kernel patching;
> > > >  - inline fixups resolving the abort.
> > > >
> > > > Last option is more robust as it does not require additional efforts
> > > > to support targets. It is looking reasonable because in many cases
> > > > optional registers should be RAZ if not implemented. Special cases may
> > > > be handled by underlying __read_cpuid() when needed.
> > >
> > > I don't think we should do this if the only affected implementations are
> > > software emulators which can be patched (and have already been, in the
> > > case of QEMU).
> > >
> > > In future it's very likely that early assembly code (potentially in
> > > hypervisor context) will need to access ID registers which are currently
> > > reserved/RAZ, and it will be rather painful to fix up accesses to this.
> >
> > So we will not fix. This one fixes el1 only, and don't pretend for more.
>
> At some point, it's practically guaranteed that we will have to access
> reserved/RAZ ID registers in other cases, so we _will_ need workarounds
> that cater for those sooner or later.
>
> We need to consider how we can handle those, in case it implies
> constraints on our solution elsewhere, or requires a more complex, but
> more general solution (which we can implement part of today).
>
> For example:
>
> * The sanity checks code will perform many back-to-back register
>   accesses. Trapping lots of these could be expensive, so not performing
>   the MRS at all when known to be unsafe may be preferable.
>
> * Some registers may be read in a hot/critical path, or potentially in a
>   context where we cannot handle trapping (e.g. early boot code or parts
>   of KVM).
>   In some cases, patching may be preferable to an MRS that only
>   gets executed depending on a branch condition.
>
> Before we can do any of this, we need to know the conditions of the
> erratum, however.

No matter, patching is preferable for many reasons, of course. But kernel
patching requires some investigation, and may take time. This is the
last resort for the kernel to stay alive.

> > > Additionally, this workaround will silently mask other bugs in this area
> > > (e.g. if registers like ID_AA64MMFR0_EL1 were to trap for some reason on
> > > an implementation), which doesn't seem good.
> >
> > We can mask it less silently, for example, print message to dmesg.
> >
> > Initially I was thinking about erratas as well, but Arnd suggested
> > this approach, and now think it's better. From consumer point of view,
> > it's much better to have a warning line in dmesg, instead of bricked
> > device, after another kernel or driver update.
>
> Having some warning is certainly better, though I think we need to
> scream _very loudly_ for cases we do not expect, as non-fatal warnings
> are easily/often ignored, and can later turn out to be more critical
> than previously believed.
>
> Thanks,
> Mark.

So what? Do we drop it? Or I can prepare a new version with a loud warning
and runtime patching.

Yury.
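
To make the "loud warning plus runtime patching" option concrete, here is
a rough userspace sketch of the control flow only. It is not kernel code
and executes no MRS: the fake_hw table, the known_unsafe[] flags and
read_id_reg() are all hypothetical names invented for illustration, and
the register values are made up. A register marked as known-unsafe is
never accessed and reads as zero (RAZ); an access that traps without
being marked is still survived, but is counted and reported loudly.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical identifiers; the names mirror the ID registers discussed above. */
enum id_reg { ID_AA64MMFR0_EL1, ID_AA64MMFR2_EL1, ID_REG_COUNT };

/* Stand-in for hardware: the value read, and whether the access would trap. */
struct fake_hw {
	uint64_t value;
	bool traps;
};

static struct fake_hw hw[ID_REG_COUNT] = {
	[ID_AA64MMFR0_EL1] = { 0x1122, false },	/* made-up value */
	[ID_AA64MMFR2_EL1] = { 0, true },	/* pretend the access is UNDEFINED */
};

/* Would be set at boot for registers covered by a documented erratum. */
static bool known_unsafe[ID_REG_COUNT];

/* Number of traps that no erratum entry predicted. */
static unsigned int unexpected_traps;

static uint64_t read_id_reg(enum id_reg r)
{
	assert(r < ID_REG_COUNT);

	/* "Patched" path: skip the access entirely and treat the register as RAZ. */
	if (known_unsafe[r])
		return 0;

	/* Fixup path: survive the trap, but complain loudly about it. */
	if (hw[r].traps) {
		unexpected_traps++;
		fprintf(stderr,
			"WARNING: unexpected trap on ID register %d, treating as RAZ\n",
			(int)r);
		return 0;
	}

	return hw[r].value;
}
```

With this shape, the first read of the trapping register goes through the
warning path; once known_unsafe[] is set for it (standing in for an
erratum entry plus patching), later reads return zero without touching
the register and without another warning.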