Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754085AbeAIAMq (ORCPT + 1 other); Mon, 8 Jan 2018 19:12:46 -0500 Received: from mail-it0-f46.google.com ([209.85.214.46]:41091 "EHLO mail-it0-f46.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752011AbeAIAMo (ORCPT ); Mon, 8 Jan 2018 19:12:44 -0500 X-Google-Smtp-Source: ACJfBovboXihcxBQK1cDHYz/IH9Kw2XU1Gfzu8MA5r+MtAcTANMwxBF3EJcJPhUzPncAwkDuieGLLP8euqN2BN6jGfY= MIME-Version: 1.0 From: Linus Torvalds Date: Mon, 8 Jan 2018 16:12:42 -0800 X-Google-Sender-Auth: S8tiXLGkIKhGrb5t5ED0U1hrx7U Message-ID: Subject: Re: [PATCH 06/18] x86, barrier: stop speculation for failed access_ok To: Dan Williams Cc: Linux Kernel Mailing List , linux-arch@vger.kernel.org, Andi Kleen , Arnd Bergmann , Greg Kroah-Hartman , Peter Zijlstra , Network Development , "the arch/x86 maintainers" , Ingo Molnar , "H. Peter Anvin" , Thomas Gleixner , Alan Cox Content-Type: text/plain; charset="UTF-8" Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Return-Path: On Mon, Jan 8, 2018 at 3:53 PM, Dan Williams wrote: > > I've been thinking the "and" is only suitable for the array bounds > check, for get_user() we're trying to block speculation past > access_ok() at which point we can only do the lfence? Well, we *could* do the "and", at least for the simple cases (ie the true "get_user()" that integrates the access_ok with the access). IOW, mainly the code in arch/x86/lib/getuser.S. But it probably is a lot simpler to just add the "lfence" to ASM_STAC, because by definition those cases don't tend to be the truly critical ones - people who use those functions tend to do one or two accesses, and the real cost is likely the I$ misses and the D$ miss to get current->addr_limit. Not to mention the "stac" itself, which is much more expensive than the access on current microarchitectures. But something like this *might* work: index c97d935a29e8..7fa3d293beaf 100644 --- a/arch/x86/lib/getuser.S +++ b/arch/x86/lib/getuser.S @@ -38,8 +38,11 @@ .text ENTRY(__get_user_1) mov PER_CPU_VAR(current_task), %_ASM_DX - cmp TASK_addr_limit(%_ASM_DX),%_ASM_AX + mov TASK_addr_limit(%_ASM_DX),%_ASM_DX + cmp %_ASM_DX,%_ASM_AX jae bad_get_user + or $0xfff,%_ASM_DX + and %_ASM_DX,%_ASM_AX ASM_STAC 1: movzbl (%_ASM_AX),%edx xor %eax,%eax (this only does the one-byte case - the 2/4/8 byte cases are exactly the same). The above is completely untested and might have some stupid thinko/typo, so take it purely as a "example patch" to show the concept, rather than actually do it. But just adding "lfence" to the existing ASM_STAC is a hell of a lot easier, and the performance difference between that trivial patch and the above "let's be clever with 'and'" might not be measurable. I really have no idea how expensive lfence might actually end up being in practice. It's possible that lfence is actually fairly cheap in kernel code, since we tend to not have very high IPC anyway. Linus