Subject: Re: Avoid speculative indirect calls in kernel
To: Linus Torvalds , Pavel Machek
References: <20180103230934.15788-1-andi@firstfloor.org> <20180104112614.GA1702@amd>
Cc: Andi Kleen , Thomas Gleixner , Greg Kroah-Hartman , David Woodhouse , Tim Chen , Linux Kernel Mailing List , Dave Hansen
From: Jon Masters
Message-ID: <10ef2b03-22e4-1ac8-94a8-82613c8cf9d4@redhat.com>
Date: Thu, 4 Jan 2018 15:08:11 -0500
X-Mailing-List: linux-kernel@vger.kernel.org

On 01/04/2018 01:33 PM, Linus Torvalds wrote:
> On Thu, Jan 4, 2018 at 3:26 AM, Pavel Machek wrote:
>> On Wed 2018-01-03 15:51:35, Linus Torvalds wrote:
>>>
>>> A *competent* CPU engineer would fix this by making sure speculation
>>> doesn't happen across protection domains. Maybe even a L1 I$ that is
>>> keyed by CPL.
>>
>> Would that be enough?
>
> No, you'd need to add the CPL to the branch target buffer itself, not the
> I$ L1.
>
> And as somebody pointed out, that only helps the user space messing
> with the kernel. It doesn't help the "one user context fools another
> user context to mispredict". (Where the user contexts might be a
> JIT'ed JS vs the rest of the web browser).
>
> So you really would want to just make sure the full address is used to
> index (or at least verify) the BTB lookup, and even then you'd then
> need to invalidate the BTB on context switches so that one context
> can't fill in data for another context.

IMO the correct hardware fix is to index the BTB using the full VA
including the ASID/PCID. And guarantee (as is the case) that there is
not a live conflict between address space identifiers with entries.

The sad thing is that even the latest academic courses recommend
"optimizing" branch predictors with a few low order bits (e.g. 31 in
Intel's case, various others for different vendors).

The fix for variant 3 is similarly not that difficult in new hardware:
don't allow the speculated load to happen by enforcing the permission
check at the right time. The last several editions of Computer
Architecture spell this out in Appendix B (page 37 or thereabouts).

Jon.

-- 
Computer Architect | Sent from my Fedora powered laptop
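
The BTB indexing argument above can be illustrated with a toy model. This is only a minimal sketch, not a description of any real predictor: the class names, the 12-bit index width, the addresses, and the ASID values are all hypothetical and chosen purely for illustration. It contrasts a table keyed by a few low-order address bits with one keyed by the full VA plus the address space identifier.

```python
class PartialIndexBTB:
    """Toy BTB keyed only by the low-order bits of the branch address."""

    def __init__(self, index_bits=12):  # width is an assumption for illustration
        self.mask = (1 << index_bits) - 1
        self.entries = {}

    def key(self, asid, va):
        # Upper address bits and the address space identifier are ignored.
        return va & self.mask

    def train(self, asid, va, target):
        self.entries[self.key(asid, va)] = target

    def predict(self, asid, va):
        return self.entries.get(self.key(asid, va))


class FullTagBTB(PartialIndexBTB):
    """Toy BTB keyed by the full virtual address plus the ASID/PCID."""

    def key(self, asid, va):
        return (asid, va)


# Hypothetical values: two contexts whose branches share the low 12 bits.
ATTACKER_ASID, VICTIM_ASID = 1, 2
ATTACKER_VA = 0x0000000040001234
VICTIM_VA = 0xFFFFFFFF81000234
GADGET = 0xFFFFFFFF8100BEEF

partial, full = PartialIndexBTB(), FullTagBTB()
for btb in (partial, full):
    btb.train(ATTACKER_ASID, ATTACKER_VA, GADGET)

# With partial indexing the victim's lookup reuses the attacker's entry.
aliased = partial.predict(VICTIM_ASID, VICTIM_VA)
# With full VA + ASID keys the victim's lookup finds no entry at all.
isolated = full.predict(VICTIM_ASID, VICTIM_VA)
```

In this model the aliasing disappears once the key includes the full address and the ASID, without any flush on context switch, provided (as the message notes) that no two live address spaces share an identifier.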