Subject: Re: [RFC PATCH 14/16] irq: Add support for core-wide protection of IRQ and softirq
To: Vineeth Remanan Pillai, Nishanth Aravamudan, Julien Desfossez,
 Peter Zijlstra, Tim Chen, mingo@kernel.org, tglx@linutronix.de,
 pjt@google.com, torvalds@linux-foundation.org
Cc: "Joel Fernandes (Google)",
 linux-kernel@vger.kernel.org, subhra.mazumdar@oracle.com, fweisbec@gmail.com,
 keescook@chromium.org, kerrnel@google.com, Phil Auld, Aaron Lu, Aubrey Li,
 Valentin Schneider, Mel Gorman, Pawan Gupta, Paolo Bonzini, Joel Fernandes,
 vineethrp@gmail.com, Chen Yu, Christian Brauner, Tim Chen, "Paul E. McKenney"
From: "Li, Aubrey"
Date: Fri, 10 Jul 2020 20:19:24 +0800

Hi Joel/Vineeth,

On 2020/7/1 5:32, Vineeth Remanan Pillai wrote:
> From: "Joel Fernandes (Google)"
>
> With the current core scheduling patchset, non-threaded IRQ and softirq
> victims can leak data from their hyperthread to a sibling hyperthread
> running an attacker.
>
> For MDS, it is possible for the IRQ and softirq handlers to leak data to
> either host or guest attackers. For L1TF, it is possible to leak to
> guest attackers. There is no possible mitigation involving flushing of
> buffers to avoid this, since the attacker and the victims execute
> concurrently on 2 or more HTs.
>
> The solution in this patch is to monitor the outer-most core-wide
> irq_enter() and irq_exit() executed by any sibling. In between these
> two, we mark the core as being in a special core-wide IRQ state.
>
> On IRQ entry, if we detect that a sibling is running untrusted code,
> we send it a reschedule IPI so that the sibling transitions through
> its irq_exit() and does any waiting there, until the IRQ being
> protected finishes.
>
> We also monitor the per-CPU outer-most irq_exit(). If, during the
> per-CPU outer-most irq_exit(), the core is still in the special
> core-wide IRQ state, we busy-wait until the core exits this state.
> This combination of per-CPU and core-wide IRQ states helps to handle
> any combination of irq_enter()s and irq_exit()s happening on all of
> the siblings of the core in any order.
>
> Lastly, we also check in the schedule loop whether we are about to
> schedule an untrusted process while the core is in such a state. This
> is possible if a trusted thread enters the scheduler by yielding the
> CPU. That path involves no transition through the irq_exit() point
> where waiting would normally happen, so we have to do the waiting
> there explicitly.
>
> Every attempt is made to avoid an unnecessary busy-wait, and in
> testing on real-world ChromeOS use cases it has not shown a
> performance drop. In ChromeOS, with this and the rest of the core
> scheduling patchset, we see around a 300% improvement in key-press
> latencies into Google Docs while camera streaming runs simultaneously
> (90th-percentile latency drops from ~150ms to ~50ms).
>
> This feature is controlled by the build-time config option
> CONFIG_SCHED_CORE_IRQ_PAUSE and is enabled by default. There is also
> a kernel boot parameter 'sched_core_irq_pause' to enable/disable the
> feature at boot time. Default is enabled at boot time.

We saw a lot of soft lockups on the screen when we tested v6:

[  186.527883] watchdog: BUG: soft lockup - CPU#86 stuck for 22s! [uperf:5551]
[  186.535884] watchdog: BUG: soft lockup - CPU#87 stuck for 22s! [uperf:5444]
[  186.555883] watchdog: BUG: soft lockup - CPU#89 stuck for 22s! [uperf:5547]
[  187.547884] rcu: INFO: rcu_sched self-detected stall on CPU
[  187.553760] rcu:     40-....: (14997 ticks this GP) idle=49a/1/0x4000000000000002 softirq=1711/1711 fqs=7279
[  187.564685] NMI watchdog: Watchdog detected hard LOCKUP on cpu 14
[  187.564723] NMI watchdog: Watchdog detected hard LOCKUP on cpu 38

The problem goes away when we revert this patch. We are running
multiple uperf threads (one per CPU) in a cgroup with coresched
enabled. This is 100% reproducible on our side.
Just wondering whether this is already known, before we dig into it.

Thanks,
-Aubrey