Received: by 2002:a25:ca44:0:0:0:0:0 with SMTP id a65csp1384593ybg; Wed, 29 Jul 2020 12:46:28 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwtpoIEUqk+KtpxGi/yb6Goc1rt1DJi3UzSn+erCoL6gYoSWZ3G9bkj1CWEFEgIqgnuxPzJ X-Received: by 2002:a05:6402:b99:: with SMTP id cf25mr31585343edb.291.1596051988703; Wed, 29 Jul 2020 12:46:28 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1596051988; cv=none; d=google.com; s=arc-20160816; b=mSwSDazE8B5kmhSo/2G4tZOOGJ2fVPq1JaEtHbizlvKPttSItoRGnvSsbAdvPmnwgs P1G2iPNGSA/U58C8HtlGOyhuGC5VY+yDt10AwPJD/ZvU/TEjCHljvDN1LO6jvQ3BVIOL /4wudxJTyKkP3p2BoNmoQUjk/0mVWX9UrHPjU0A6xC4tQtPS/sySd5whZNNGwRfVszt0 vyDFIWeQ3flfFSdmFRf1zOk/7sgpNvE0qgZ+Qjp9XAEmZdd12qXEM2fxo5I3bf6Kl88d tqr9YLIaKTzfQ60j5mQwLuKiZZHgob3d7OsZoc88vIDoPnYb/OPTyUvqP54veCFq6u5p QmDw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :ironport-sdr:ironport-sdr; bh=qqAMV5lWLdFsX0lkz3XsXkKove7hE35M4rytfJaZxO4=; b=OCKwQyVPiayRLrlgqnAQK71ZWaW666klLGjWEZPZqemHKu9O2IRKPhHEX1DHil2zF+ QNvP2aGMNzxhe/dYhOZARJpoOfNLFjW5m3nsXbBVgCPoM+Klr9+faHwRjXk6SYkDENDt IeaYPsdcK+d42XsZ4NtXEL2Wo2ceHkWvZe/MDJy6amWPhZU+WI9FBkPQMwB81fTwV083 7tC+GQsivvqScNDExJ5177SfUMTM0NIaZU3yudc0VYlEFGPE1+07cth71JjcmWJ3OHeR k9g+hffTo7n1fT7mxALDmgmlvp1SUdXgUPbm/W7O7OwI0wPvgKB7y4x53f/wBd+p73Tb a9iQ== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id b14si1880839eds.102.2020.07.29.12.46.05; Wed, 29 Jul 2020 12:46:28 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=intel.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726813AbgG2TnA (ORCPT + 99 others); Wed, 29 Jul 2020 15:43:00 -0400 Received: from mga11.intel.com ([192.55.52.93]:27520 "EHLO mga11.intel.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726365AbgG2TnA (ORCPT ); Wed, 29 Jul 2020 15:43:00 -0400 IronPort-SDR: HK22QScsf9v7QXhE8ubZc/cWmdtXqFs8UREcOFRMdHYQbiAPmup12c3uwCsOgcACRENYPftVDp hZ4ThWSijPDA== X-IronPort-AV: E=McAfee;i="6000,8403,9697"; a="149325711" X-IronPort-AV: E=Sophos;i="5.75,411,1589266800"; d="scan'208";a="149325711" X-Amp-Result: SKIPPED(no attachment in message) X-Amp-File-Uploaded: False Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Jul 2020 12:42:59 -0700 IronPort-SDR: vyO1B/mhpSv8RYoTu+4lYJUGBf2tr+TOflJ3K1vNXIWZxEhSDxrxWsNncnwjZoc+FO5DSR1IBQ DatswMsKXP7Q== X-ExtLoop1: 1 X-IronPort-AV: E=Sophos;i="5.75,411,1589266800"; d="scan'208";a="272710679" Received: from otcwcpicx6.sc.intel.com ([172.25.55.29]) by fmsmga007.fm.intel.com with ESMTP; 29 Jul 2020 12:42:59 -0700 Date: Wed, 29 Jul 2020 19:42:59 +0000 From: Fenghua Yu To: Sean Christopherson Cc: "Yu, Fenghua" , Thomas Gleixner , Borislav Petkov , Ingo Molnar , Peter Zijlstra , "Shanbhogue, Vedvyas" , "Luck, Tony" , H Peter Anvin , Andy Lutomirski , "Shankar, Ravi V" , "Li, Xiaoyao" , x86 , linux-kernel Subject: Re: [PATCH RFC] x86/bus_lock: Enable bus lock detection Message-ID: <20200729194259.GA318576@otcwcpicx6.sc.intel.com> References: <1595021700-68460-1-git-send-email-fenghua.yu@intel.com> <20200729030232.GE5583@linux.intel.com> <20200729184614.GI27751@linux.intel.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200729184614.GI27751@linux.intel.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Hi, Sean, On Wed, Jul 29, 2020 at 11:46:14AM -0700, Sean Christopherson wrote: > On Wed, Jul 29, 2020 at 11:09:16AM -0700, Yu, Fenghua wrote: > > > > Some CPUs have ability to notify the kernel by an #DB trap after the > > > > instruction acquires a bus lock and is executed. This allows the > > > > kernel to enforce user application throttling or mitigations and also > > > > provides a better environment to debug kernel split lock issues since > > > > the kernel can continue instead of crashing. > > > > > > > > #DB for bus lock detect fixes all issues in #AC for split lock detect: > > > > > > Fixes "all" issues... and creates some new ones, e.g. there are use cases > > > where preventing the split lock from happening in the first place is strongly > > > desired. It's why that train wreck exists. > > > > Bus Lock Detection doesn't replace Split Lock Detection. If both features > > are enabled, default behavior is warning from bus lock, fatal behavior is > > still from split lock. See the behavior table as follows. > > > > > > > > > 1) It's architectural ... just need to look at one CPUID bit to know it > > > > exists > > > > 2) The IA32_DEBUGCTL MSR, which reports bus lock in #DB, is per-thread. > > > > So each process or guest can have different behavior. > > > > 3) It has support for VMM/guests (new VMEXIT codes, etc). > > > > > > > > Use the existing kernel command line option "split_lock_detect=" to > > > > handle #DB for bus lock: > > > > > > Are SLD and BLD mutually exclusive? Can we even guarantee that given the > > > track record of SLD? If not, we'll likely want to allow the user to choose > > > between SDL and BLD via split_lock_detect. > > > > The two hardware features can be enabled on the same platform. > > But they are mutually exclusive in the kernel because #AC from SLD happens > > before the instruction is executed and #DB happens after the instruction is > > executed. > > > > Right now, if both of them are enabled, "warn" behavior goes to > > bus lock and "fatal" behavior goes to split lock. > > > > Do you want the user to override the behaviors by something like this? > > > > split_lock_detect=warn[,sld]: if given "sld" while both features are enabled, > > warn behavior is from split lock instead of bus lock detection. > > > > split_lock_detect=fatal[,bld]: if given "bld" while both features are enabled, > > fatal behavior is from bus lock detection. > > IMO these should be completely independent features (that happen to share > some code). > > BLD in fatal mode doesn't make any sense because it can't be fatal without > a completely different implementation, e.g. the bus lock has already > happened and the application can eat the SIGBUS. The current SLD code > works because the split lock is prevented entirely, i.e. eating SIGBUS > doesn't allow the application to make forward progress. If "fatal" is meaningless for BLD, we can remove it for BLD. If we need it in the future, we can add it later. The "ratelimit:N" maybe more useful for BLD: it mitigates DOS from bus locks. > > Smushing the two into a single option is confusing, e.g. from the table > below it's not at all clear what will happen if sld=fatal, both features > are supported, and the kernel generates a split lock. > > Given that both SLD (per-core, not architectural) and BLD (#DB recursion and > inverted DR6 flag) have warts, it would be very nice to enable/disable them > independently. The lock to non-WB behavior for BLD may also be problematic, > e.g. maybe it turns out that fixing drivers to avoid locks to non-WB isn't > as straightforward as avoiding split locks. But the two features are related if both of them are enabled in hardware: If a split lock happens, SLD will generate #AC before instruction execution and BLD will generate #DB after instruction execution. The software needs to make them exclusive. The same kernel option reflects the relationship and make them exclusive, e.g. "fatal" enables SLD and disables BLD, "warn" does the other way. If using two different kernel options, the user needs to give right options to make both work, e.g. can the user give this combination "split_lock_detect=fatal bus_lock_detect=warn"? What does the combination mean? There could be many combinations of the two options, some of them are meaningful and some of them aren't. Maintaining the combinations is unnecessary complex, right? Thanks. -Fenghua