Date: Sun, 24 Jun 2018 02:55:08 +0200 (CEST)
From: Thomas Gleixner
To: Fenghua Yu
cc: Ingo Molnar, "H.
Peter Anvin", Ashok Raj, Dave Hansen, Rafael Wysocki, Tony Luck, Alan Cox, Ravi V Shankar, Arjan van de Ven, linux-kernel, x86
Subject: Re: [RFC PATCH 02/16] x86/split_lock: Handle #AC exception for split lock in kernel mode
In-Reply-To: <20180623150521.GG18979@romley-ivt3.sc.intel.com>
References: <1527435965-202085-1-git-send-email-fenghua.yu@intel.com> <1527435965-202085-3-git-send-email-fenghua.yu@intel.com> <20180623042033.GF18979@romley-ivt3.sc.intel.com> <20180623150521.GG18979@romley-ivt3.sc.intel.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, 23 Jun 2018, Fenghua Yu wrote:
> On Sat, Jun 23, 2018 at 11:17:03AM +0200, Thomas Gleixner wrote:
> > On Fri, 22 Jun 2018, Fenghua Yu wrote:
> > > Should I add kernel parameter or control knob to opt-out the feature?
> >
> > A simple command line option 'acoff' or something more sensible should be
> > ok. No sysfs knobs or whatever please. The Kconfig option is not required
> > either.
>
> Ok. I will have a command line option.
>
> BTW, I have a Kconfig option to enable split lock test in kernel mode in
> patch #15. Are the Kconfig option and the kernel test code still needed
> in next version?

Unless you do not trust #AC to work everywhere it is advertised, it's
pretty much pointless.

Btw, please also get rid of this bloated control_ac() stuff. We have
msr_set/clear_bit() so no need to reinvent the wheel.

> > > I'm afraid firmware may hang system after handling split lock if the
> > > feature is enabled by kernel, e.g. "reboot" hits split lock in firmware
> > > and firmware hangs the system after handling #AC.
> >
> > Have you observed the problem in reality? I mean why would 'reboot' be the
> > critical path? I'd rather expect that EFI callbacks or SMM 'value add'
> > would trip over it.
> >
> > Vs. reboot. If that is the only problem then we might just have to clear
> > #AC enable before issuing it, but that does not need to be part of the
> > initial patch set. It's an orthogonal issue.
>
> Yes, I do see a real firmware hang after hitting and handling a split lock
> in firmware during "reboot" in one simulation test environment. Apparently
> the split lock (and alignment access) is treated as a failure in firmware.

It's not treated as failure. The firmware simply does not have a handler
for #AC installed and dies. I hope you yelled at the firmware people
already.

> This real case triggered my concern that split lock in any future
> firmware may happen in any path including run time service, S3/S4/S5,
> hotplug. If we don't have opt-out option or something similar, system hang
> from split lock in firmware can be a blocking issue on some platforms. If
> that happens, bisect always finds the split lock patch to blame.

That's fine. The changelog will hopefully explain it along with the text
that people should use the command line option and yell at their firmware
supplier. So what? Move on....

If that is a real widespread issue in practice, then we might have to go
for some ugly workarounds, but we won't find out if we add them
upfront. So testing will show what's wrong in firmware land and we can
handle it from there. It's a completely orthogonal issue and has nothing to
do with the core functionality.

Thanks,

	tglx
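
For illustration only, the suggestion above (a plain command line opt-out plus generic set/clear-bit helpers instead of a dedicated control_ac() wrapper) could be modeled roughly as below. This is a userspace sketch, not kernel code: the `fake_msr_*` helpers, the `fake_msr_test_ctl` variable, `split_lock_opted_out()` and `split_lock_setup()` are all hypothetical names, and the MSR number and bit position are assumptions that would need checking against the SDM and the patch set. The helper return values (1 if the register changed, 0 if not) only mimic what the kernel's msr_set_bit()/msr_clear_bit() appear to do.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Assumed values for illustration; verify against the SDM and the patches. */
#define MSR_TEST_CTL           0x33u
#define SPLIT_LOCK_DETECT_BIT  29

/* Stand-in for the hardware register so the sketch runs in userspace. */
static uint64_t fake_msr_test_ctl;

/*
 * Hypothetical model of a generic read-modify-write bit helper.
 * Returns 1 if the register changed, 0 if the bit already had that value.
 */
static int fake_msr_flip_bit(uint64_t *reg, unsigned int bit, bool set)
{
	uint64_t old = *reg;
	uint64_t new;

	assert(bit < 64);
	new = set ? (old | (1ULL << bit)) : (old & ~(1ULL << bit));
	if (new == old)
		return 0;
	*reg = new;
	return 1;
}

static int fake_msr_set_bit(unsigned int bit)
{
	return fake_msr_flip_bit(&fake_msr_test_ctl, bit, true);
}

static int fake_msr_clear_bit(unsigned int bit)
{
	return fake_msr_flip_bit(&fake_msr_test_ctl, bit, false);
}

/* True when "acoff" appears as a whole space-delimited token. */
static bool split_lock_opted_out(const char *cmdline)
{
	static const char opt[] = "acoff";
	const size_t len = sizeof(opt) - 1;
	const char *p = cmdline;

	while ((p = strstr(p, opt)) != NULL) {
		bool starts = (p == cmdline) || (p[-1] == ' ');
		bool ends = (p[len] == '\0') || (p[len] == ' ');

		if (starts && ends)
			return true;
		p += len;
	}
	return false;
}

/* Enable detection unless opted out. Returns true if it ends up enabled. */
static bool split_lock_setup(const char *cmdline)
{
	if (split_lock_opted_out(cmdline)) {
		fake_msr_clear_bit(SPLIT_LOCK_DETECT_BIT);
		return false;
	}
	fake_msr_set_bit(SPLIT_LOCK_DETECT_BIT);
	return true;
}
```

In this model, the "clear #AC enable before reboot" idea from the thread would amount to a single fake_msr_clear_bit(SPLIT_LOCK_DETECT_BIT) call ahead of the reboot path. Whether and where that belongs in the real kernel is left open, as in the discussion above.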