Date: Mon, 2 Jul 2018 22:01:31 -0400 (EDT)
From: Mathieu Desnoyers
To: Andy Lutomirski
Cc: Linus Torvalds, Thomas Gleixner, linux-kernel, linux-api,
    Peter Zijlstra, "Paul E. McKenney", Boqun Feng, Dave Watson,
    Paul Turner, Andrew Morton, Russell King, Ingo Molnar,
    "H. Peter Anvin", Andi Kleen, Chris Lameter, Ben Maurer, rostedt,
    Josh Triplett, Catalin Marinas, Will Deacon, Michael Kerrisk,
    Joel Fernandes
Message-ID: <858886246.10882.1530583291379.JavaMail.zimbra@efficios.com>
In-Reply-To: <459661281.10865.1530580742205.JavaMail.zimbra@efficios.com>
References: <20180702223143.4663-1-mathieu.desnoyers@efficios.com>
    <415287289.10831.1530572418907.JavaMail.zimbra@efficios.com>
    <825871008.10839.1530573419561.JavaMail.zimbra@efficios.com>
    <1959930320.10843.1530573742647.JavaMail.zimbra@efficios.com>
    <8B2E4CEB-3080-4602-8B62-774E400892EB@amacapital.net>
    <459661281.10865.1530580742205.JavaMail.zimbra@efficios.com>
Subject: Re: [RFC PATCH for 4.18] rseq: use __u64 for rseq_cs fields, validate user inputs
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Jul 2, 2018, at 9:19 PM,
Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Jul 2, 2018, at 7:37 PM, Andy Lutomirski luto@amacapital.net wrote:
>
>>> On Jul 2, 2018, at 4:22 PM, Mathieu Desnoyers
>>> wrote:
>>>
>>> ----- On Jul 2, 2018, at 7:16 PM, Mathieu Desnoyers
>>> mathieu.desnoyers@efficios.com wrote:
>>>
>>>> ----- On Jul 2, 2018, at 7:06 PM, Linus Torvalds torvalds@linux-foundation.org
>>>> wrote:
>>>>
>>>>> On Mon, Jul 2, 2018 at 4:00 PM Mathieu Desnoyers
>>>>> wrote:
>>>>>>
>>>>>> Unfortunately, that rseq->rseq_cs field needs to be updated by user-space
>>>>>> with single-copy atomicity. Therefore, we want 32-bit user-space to initialize
>>>>>> the padding with 0, and only update the low bits with single-copy atomicity.
>>>>>
>>>>> Well... It's actually still single-copy atomicity as a 64-bit value.
>>>>>
>>>>> Why? Because it doesn't matter how you write the upper bits. You'll be
>>>>> writing the same value to them (zero) anyway.
>>>>>
>>>>> So who cares if the write ends up being two instructions, because the
>>>>> write to the upper bits doesn't actually *do* anything.
>>>>>
>>>>> Hmm?
>>>>
>>>> Are there any kind of guarantees that a __u64 update on a 32-bit architecture
>>>> won't be torn into something daft like byte-per-byte stores when performed
>>>> from C code ?
>>>>
>>>> I don't worry whether the upper bits get updated or how, but I really care
>>>> about not having store tearing of the low bits update.
>>>
>>> For the records, most updates of those low bits are done in assembly
>>> from critical sections, for which we control exactly how the update is
>>> performed.
>>>
>>> However, there is one helper function in user-space that updates that value
>>> from C through a volatile store, e.g.:
>>>
>>> static inline void rseq_prepare_unload(void)
>>> {
>>>     __rseq_abi.rseq_cs = 0;
>>> }
>>
>> How about making the field be:
>>
>> union {
>>     __u64 rseq_cs;
>>     struct {
>>         __u32 rseq_cs_low;
>>         __u32 rseq_cs_high;
>>     };
>> };
>>
>> 32-bit user code that cares about performance can just write to rseq_cs_low
>> because it already knows that rseq_cs_high == 0.
>>
>> The header could even supply a static inline helper write_rseq_cs() that
>> atomically writes a pointer and just does the right thing for 64-bit, for
>> 32-bit BE, and for 32-bit LE.
>>
>> I think the union really is needed because we can’t rely on user code being
>> built with -fno-strict-aliasing. Or the helper could use inline asm.
>>
>> Anyway, the point is that we get optimal code generation (a single instruction
>> write of the correct number of bits) without any compat magic in the kernel.
>
> That works for me! Any objection from anyone else for this approach ?

One thing to consider is how we will implement the load of that pointer
on the kernel side. Strictly-speaking, the rseq uapi talks about single-copy
atomicity, and does not specify _which_ thread is expected to update that
pointer. So arguably, the common case is that the current thread is updating
it, which would allow the kernel to read it piece-wise. However, nothing
prevents user-space from updating it from another thread with single-copy
atomicity. So in order to be on the safe side, I prefer to guarantee
single-copy atomicity of the get_user() load from the kernel that reads this
pointer. This means a 32-bit kernel would have to perform two independent
loads: one for low bits, one for high bits.

So it does look like we need some __LP64__ ifdefery even with the union
trick. Therefore, I'm not convinced the union is useful at all.

Thoughts ?
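
To make the trade-off concrete, here is a minimal user-space sketch of
both sides. It is an illustration only, not code from the rseq patch. It
assumes GCC or Clang (for the __atomic builtins and the __BYTE_ORDER__
macros) and C11 anonymous structs. The names union rseq_cs_field,
write_rseq_cs() and read_rseq_cs() are hypothetical, and read_rseq_cs()
is only a stand-in for the kernel-side get_user() load. The reader still
needs a pointer-width ifdef even though the union is there.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical stand-in for the rseq_cs field, following the union
 * proposed in the quoted text. The halves are swapped on big-endian so
 * that rseq_cs_low always overlays the low 32 bits of rseq_cs.
 */
union rseq_cs_field {
	uint64_t rseq_cs;
	struct {
#if __BYTE_ORDER__ == __ORDER_BIG_ENDIAN__
		uint32_t rseq_cs_high;
		uint32_t rseq_cs_low;
#else
		uint32_t rseq_cs_low;
		uint32_t rseq_cs_high;
#endif
	};
};

/* Writer side: a single untorn store of the native pointer width. */
static inline void write_rseq_cs(union rseq_cs_field *f, const void *cs)
{
	/* Assumption: the field is naturally aligned for a word store. */
	assert((uintptr_t)f % sizeof(uintptr_t) == 0);
#if UINTPTR_MAX > UINT32_MAX
	__atomic_store_n(&f->rseq_cs, (uint64_t)(uintptr_t)cs,
			 __ATOMIC_RELAXED);
#else
	/* Assumption: rseq_cs_high was zero-initialized and stays zero. */
	__atomic_store_n(&f->rseq_cs_low, (uint32_t)(uintptr_t)cs,
			 __ATOMIC_RELAXED);
#endif
}

/*
 * Reader side (user-space stand-in for the kernel load): one 64-bit
 * load where pointers are 64-bit, otherwise two independent 32-bit
 * loads, each of which is untorn on its own.
 */
static inline uint64_t read_rseq_cs(union rseq_cs_field *f)
{
#if UINTPTR_MAX > UINT32_MAX
	return __atomic_load_n(&f->rseq_cs, __ATOMIC_RELAXED);
#else
	uint32_t low = __atomic_load_n(&f->rseq_cs_low, __ATOMIC_RELAXED);
	uint32_t high = __atomic_load_n(&f->rseq_cs_high, __ATOMIC_RELAXED);

	return ((uint64_t)high << 32) | low;
#endif
}
```

In this sketch the union spares the writer any pointer casts, but the
reader's ifdef is the same one it would need with a plain __u64 field.
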

Thanks,

Mathieu

-- 
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com