From: Linus Torvalds
Date: Thu, 4 Jan 2018 15:06:49 -0800
Subject: Re: [RFC PATCH] asm/generic: introduce if_nospec and nospec_barrier
To: Alan Cox
Cc: Dan Williams, Pavel Machek, Julia Lawall, Linux Kernel Mailing List, Mark Rutland, linux-arch@vger.kernel.org, Peter Zijlstra, Greg KH, Thomas Gleixner, Elena Reshetova, Alan Cox, Dan Carpenter

On Thu, Jan 4, 2018 at 2:55 PM, Alan Cox wrote:
>
> How do you ensure that the CPU doesn't speculate j < _m ? ~0 : 0 pick the
> wrong mask and then reference base[] ?

.. yeah, that's exactly where we want to make sure that the compiler
uses a select or 'setb'.

That's what gcc does for me in testing:

        xorl %eax, %eax
        setbe %al
        negq %rax

but yes, we'd need to guarantee it somehow.

Presumably that is where we end up having some arch-specific stuff.
Possibly there is some gcc builtin. I wanted to avoid actually writing
architecture-specific asm.

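For reference, the kind of mask-based bounds check being discussed can be
sketched roughly as below. This is only an illustration, not the macro from
the RFC patch: the names index_mask() and load_masked() are made up here, and
nothing in the C language guarantees that the compiler turns the ternary into
a setbe/neg style sequence rather than a branch, which is exactly the concern
raised above.

```c
#include <stddef.h>

/* Hypothetical helper: all ones when idx < size, zero otherwise.
 * The compiler is free to implement this ternary with a branch,
 * so the generated asm has to be checked per compiler and arch. */
static inline size_t index_mask(size_t idx, size_t size)
{
	return idx < size ? ~(size_t)0 : 0;
}

/* Hypothetical helper: ordinary bounds check, followed by a mask that
 * forces the index to 0 if the check was (mis)predicted as passing. */
static inline int load_masked(const int *base, size_t idx, size_t size)
{
	if (idx >= size)
		return -1;		/* architectural out-of-range result */
	idx &= index_mask(idx, size);	/* clamp before the dependent load */
	return base[idx];
}
```

With a four-entry table, load_masked(table, 2, 4) returns table[2] and
load_masked(table, 7, 4) returns -1; the mask only matters for what the CPU
does speculatively before the bounds check resolves.
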
> Anding with a constant works because the constant doesn't get speculated
> and nor does the and with a constant, but you've got a whole additional
> conditional path in your macro.

Absolutely. Think of it as an example, not "the solution".

It's also possible that x86 'lfence' really is so fast that it doesn't
make sense to try to do this. Agner Fog claims that it's single-cycle
(well, except for P4, surprise, surprise), but I suspect that his
timings are simply for 'lfence' in a loop or something. Which may not
show the real cost of actually halting things until they are stable.

Also, maybe that __fcheck_files() pattern where getting a NULL pointer
happens to be the right thing for out-of-range is so unusual as to be
useless, and most people end up having to have that limit check for
other reasons anyway.

              Linus
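
As an illustration of the last point, a lookup where NULL is simply the right
answer for an out-of-range index might look roughly like the sketch below.
This is not the kernel's __fcheck_files(); nospec_mask() and lookup_or_null()
are hypothetical names. The mask is computed arithmetically, with no
comparison, which is one possible way to avoid the extra conditional path
objected to above. It assumes that size fits in intptr_t, that right-shifting
a negative value is an arithmetic shift (true for gcc, implementation-defined
in C), and that an all-zero pointer value is NULL.

```c
#include <limits.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: all ones when idx < size, zero otherwise.
 * If idx >= size, (size - 1 - idx) wraps and sets the top bit, so the
 * inverted value has a clear top bit and the shift yields 0. */
static inline uintptr_t nospec_mask(uintptr_t idx, uintptr_t size)
{
	return (uintptr_t)(~(intptr_t)(idx | (size - 1 - idx)) >>
			   (sizeof(intptr_t) * CHAR_BIT - 1));
}

/* Hypothetical lookup: slots[idx], or NULL when idx is out of range.
 * The index is clamped to 0 so the load stays inside the table, which
 * means the table must have at least one entry. */
static inline void *lookup_or_null(void *const *slots, uintptr_t idx,
				   uintptr_t size)
{
	uintptr_t mask = nospec_mask(idx, size);
	void *p = slots[idx & mask];

	return (void *)((uintptr_t)p & mask);
}
```

Here there is no separate limit check at all: an out-of-range idx reads
slots[0] and then masks the result down to NULL.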