Received: by 2002:a25:1985:0:0:0:0:0 with SMTP id 127csp450266ybz; Fri, 17 Apr 2020 04:15:15 -0700 (PDT) X-Google-Smtp-Source: APiQypJEE/53r3LxFGi4cQ7bBrvkphK4SpPkm7SchQelCFBwhYSnMGCKhf7Scgtd8T4BEqEa+n5M X-Received: by 2002:a50:cd17:: with SMTP id z23mr2333566edi.191.1587122115436; Fri, 17 Apr 2020 04:15:15 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1587122115; cv=none; d=google.com; s=arc-20160816; b=oEf8m4GB9fSN4LRdPDV033LTrZhqljyorzP5kC1ewh6vVIcaBl6U6aL5FJe9+9PuNr NXocbdIW873yiAmbZqMCRpYUKsIAiSaV9/mBzUsfBoR78Pirhg7PRN7Ej7pXSVJ4POTC 9zUpfE23IGUQKd3hbP7nQYgBWiDRZF/UAJ+uepGa9SXKy4lEuG2uZThro5e1oslcVB7B nMKsOE6VNAhbHNxayL19Gm9XaTZc3uWpDtiNhKy8Mhn6rWLfhTEAQUYNnU1ABAppzFMj x/RnM1ZVTbF372uipwGNXCFG6VmVNbfEFWEvwPRE4B3GCr+RXF5FNqLLPAo8KOT7AbK7 ofuA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date :dkim-signature; bh=Z9R8TwgFFZEdYwWvXS1m69C8jKNDkSMDggxlCSZl1WU=; b=VutlwqTi6qFish4Evzrm6S4smeauyFiHM1RBBb7r+MJDCWrt1Prs+KQlfBkcaNUqSG Y+RrqLadQ7d8d4Vu9WjLs/yTXGYuxzCeW71rjLgRHRTm54/CQ5s8A/onAxBXF4Q6sjw9 Kcp/HnrlR77cYVFY7Dh/k9nmyR6HyB/OxH3v4A+0yCtyP6wyO6wQUSagTzX4a/sjKlBe tiRMvycGuhYITw5AB+dYGa4Ux+uRo60Qblb8ZHW5vSAYasrOZT387EOi/nF2z533kqA4 e/8feByvTUKi7O+/OM1oBgZjbGWkBOUfhULiMXzc8jHKRZp2NZYY8F9CyyYQQB5bIgiG qVwA== ARC-Authentication-Results: i=1; mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=L8F6v9W7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id v19si4474204edr.209.2020.04.17.04.14.52; Fri, 17 Apr 2020 04:15:15 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; dkim=fail header.i=@infradead.org header.s=merlin.20170209 header.b=L8F6v9W7; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1730248AbgDQLN2 (ORCPT + 99 others); Fri, 17 Apr 2020 07:13:28 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:33090 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-FAIL-OK-FAIL) by vger.kernel.org with ESMTP id S1729846AbgDQLN1 (ORCPT ); Fri, 17 Apr 2020 07:13:27 -0400 Received: from merlin.infradead.org (unknown [IPv6:2001:8b0:10b:1231::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id A3AAAC061A0C for ; Fri, 17 Apr 2020 04:13:27 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=infradead.org; s=merlin.20170209; h=In-Reply-To:Content-Type:MIME-Version: References:Message-ID:Subject:Cc:To:From:Date:Sender:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description; bh=Z9R8TwgFFZEdYwWvXS1m69C8jKNDkSMDggxlCSZl1WU=; b=L8F6v9W7jafZdYrltN0+Nk3whw s6Meo+NJQv7L5JnhaN8h9qASZOrOcJXSAbUHL2gDOzzV90yyHnji1xkl/he04CybPyvwCE/DE5CuZ RJGDNl3qhNG8TId7f4kKIrOg5IlYaqVFriwuWr7m4hoK4wnB95iD+08huaj/zPVCH8GI3wQ6MOnXD +ktAJ4XpKKKZIi8U+Ek8w0uZVqGp9oiDwAE8WaKvn3erXlUgl1H9xI8ZMS05XCk+LJ9PxFalvdqac AiFt4m4Ihdfoe2HBw56DHzw75lT0q8KcVhc+gGxQx6V6PUk/pcfvcsqTbHa+aLYbShky8TIGxDXWJ b6Nt/E4w==; Received: from j217100.upc-j.chello.nl ([24.132.217.100] helo=noisy.programming.kicks-ass.net) by merlin.infradead.org with esmtpsa (Exim 4.92.3 #3 (Red Hat Linux)) id 1jPOva-0006AO-G8; Fri, 17 Apr 2020 11:12:58 +0000 Received: from hirez.programming.kicks-ass.net (hirez.programming.kicks-ass.net [192.168.1.225]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (Client did not present a certificate) by noisy.programming.kicks-ass.net (Postfix) with ESMTPS id BFCF2304E03; Fri, 17 Apr 2020 13:12:55 +0200 (CEST) Received: by hirez.programming.kicks-ass.net (Postfix, from userid 1000) id AF8D72B1214AF; Fri, 17 Apr 2020 13:12:55 +0200 (CEST) Date: Fri, 17 Apr 2020 13:12:55 +0200 From: Peter Zijlstra To: Joel Fernandes Cc: vpillai , Nishanth Aravamudan , Julien Desfossez , Tim Chen , mingo@kernel.org, tglx@linutronix.de, pjt@google.com, torvalds@linux-foundation.org, linux-kernel@vger.kernel.org, fweisbec@gmail.com, keescook@chromium.org, kerrnel@google.com, Phil Auld , Aaron Lu , Aubrey Li , aubrey.li@linux.intel.com, Valentin Schneider , Mel Gorman , Pawan Gupta , Paolo Bonzini Subject: Re: [RFC PATCH 00/13] Core scheduling v5 Message-ID: <20200417111255.GZ20730@hirez.programming.kicks-ass.net> References: <20200414142152.GV20730@hirez.programming.kicks-ass.net> <20200415163220.GA180518@google.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20200415163220.GA180518@google.com> Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On Wed, Apr 15, 2020 at 12:32:20PM -0400, Joel Fernandes wrote: > On Tue, Apr 14, 2020 at 04:21:52PM +0200, Peter Zijlstra wrote: > > On Wed, Mar 04, 2020 at 04:59:50PM +0000, vpillai wrote: > > > TODO > > > ---- > > > - Work on merging patches that are ready to be merged > > > - Decide on the API for exposing the feature to userland > > > - Experiment with adding synchronization points in VMEXIT to mitigate > > > the VM-to-host-kernel leaking > > > > VMEXIT is too late, you need to hook irq_enter(), which is what makes > > the whole thing so horrible. > > We came up with a patch to do this as well. Currently testing it more and it > looks clean, will share it soon. Thomas said we actually first do VMEXIT, and then enable interrupts. So the VMEXIT thing should actually work, and that is indeed much saner than sticking it in irq_enter(). It does however put yet more nails in the out-of-tree hypervisors. > > > - Investigate the source of the overhead even when no tasks are tagged: > > > https://lkml.org/lkml/2019/10/29/242 > > > > - explain why we're all still doing this .... > > > > Seriously, what actual problems does it solve? The patch-set still isn't > > L1TF complete and afaict it does exactly nothing for MDS. > > The L1TF incompleteness is because of cross-HT attack from Guest vCPU > attacker to an interrupt/softirq executing on the other sibling correct? The > IRQ enter pausing the other sibling should fix that (which we will share in > a future series revision after adequate testing). Correct, the vCPU still running can glean host (kernel) state from the sibling handling the interrupt in the host kernel. > > Like I've written many times now, back when the world was simpler and > > all we had to worry about was L1TF, core-scheduling made some sense, but > > how does it make sense today? > > For ChromeOS we're planning to tag each and every task seperately except for > trusted processes, so we are isolating untrusted tasks even from each other. > > Sorry if this sounds like pushing my usecase, but we do get parallelism > advantage for the trusted tasks while still solving all security issues (for > ChromeOS). I agree that cross-HT user <-> kernel MDS is still an issue if > untrusted (tagged) tasks execute together on same core, but we are not > planning to do that on our setup at least. That doesn't completely solve things I think. Even if you run all untrusted tasks as core exclusive, you still have a problem of them vs interrupts on the other sibling. You need to somehow arrange all interrupts to the core happen on the same sibling that runs your untrusted task, such that the VERW on return-to-userspace works as intended. I suppose you can try and play funny games with interrupt routing tied to the force-idle state, but I'm dreading what that'll look like. Or were you going to handle this from your irq_enter() thing too? Can someone go write up a document that very clearly states all the problems and clearly explains how to use this feature?