From: Arjun Roy
Date: Wed, 14 Apr 2021 13:35:00 -0700
Subject: Re: [PATCH v2 3/3] rseq: optimise rseq_get_rseq_cs() and clear_rseq_cs()
To: Eric Dumazet
Cc: David Laight, Mathieu Desnoyers, Eric Dumazet, Ingo Molnar, Peter Zijlstra, paulmck, Boqun Feng, linux-kernel

On Wed, Apr 14, 2021 at 1:25 PM Eric Dumazet wrote:
>
> On Wed, Apr 14, 2021 at 10:15 PM Arjun Roy wrote:
> >
> > On Wed, Apr 14, 2021 at 10:35 AM Eric Dumazet wrote:
> > >
> > > On Wed, Apr 14, 2021 at 7:15 PM Arjun Roy wrote:
> > > >
> > > > On Wed, Apr 14, 2021 at 9:10 AM Eric Dumazet wrote:
> > > > >
> > > > > On Wed, Apr 14, 2021 at 6:08 PM David Laight wrote:
> > > > > >
> > > > > > From: Eric Dumazet
> > > > > > > Sent: 14 April 2021 17:00
> > > > > > ...
> > > > > > > > Repeated unsafe_get_user() calls are crying out for an optimisation.
> > > > > > > > You get something like:
> > > > > > > >     failed = 0;
> > > > > > > >     copy();
> > > > > > > >     if (failed) goto error;
> > > > > > > >     copy();
> > > > > > > >     if (failed) goto error;
> > > > > > > > Where 'failed' is set by the fault handler.
> > > > > > > >
> > > > > > > > This could be optimised to:
> > > > > > > >     failed = 0;
> > > > > > > >     copy();
> > > > > > > >     copy();
> > > > > > > >     if (failed) goto error;
> > > > > > > > Even if it faults on every invalid address it probably
> > > > > > > > doesn't matter - no one cares about that path.
> > > > > > >
> > > > > > >
> > > > > > > On which arch are you looking at ?
> > > > > > >
> > > > > > > On x86_64 at least, code generation is just perfect.
> > > > > > > Not even a conditional jmp, it is all handled by exceptions (if any)
> > > > > > >
> > > > > > >     stac
> > > > > > >     copy();
> > > > > > >     copy();
> > > > > > >     clac
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > efault_end: do error recovery.
> > > > > >
> > > > > > It will be x86_64.
> > > > > > I'm definitely seeing repeated tests of (IIRC) %rdx.
> > > > > >
> > > > > > It may well be because the compiler isn't very new.
> > > > > > Will be an Ubuntu build of 9.3.0.
> > > > > > Does that support 'asm goto with outputs' - which
> > > > > > may be the difference.
> > > > > >
> > > > >
> > > > > Yep, probably. I am using some recent clang version.
> > > > >
> > > >
> > > > On x86-64 I can confirm, for me it (4 x unsafe_get_user()) compiles
> > > > down to stac + lfence + 8 x mov + clac as straight line code. And
> > > > results in roughly a 5%-10% speedup over copy_from_user().
> > > >
> > >
> > > But rseq_get_rseq_cs() would still need three different copies,
> > > with 3 stac+lfence+clac sequences.
> > >
> > > Maybe we need to enclose all __rseq_handle_notify_resume() operations
> > > in a single section.
> > >
> > >
> >
> > To provide a bit of further exposition on this point, if you do 4x
> > unsafe_get_user() recall I mentioned a 5-10% improvement. On the other
> > hand, 4x normal get_user() I saw something like a 100% (ie. doubling
> > of sys time measured) regression.
> >
> > I assume that's the fault of multiple stac+clac.
> >
> I was suggesting only using unsafe_get_user() and unsafe_put_user(),
> and one surrounding stac/clac
>
> Basically what we had (partially) in our old Google kernels, before
> commit 8f2817701492 ("rseq: Use get_user/put_user rather than
> __get_user/__put_user")
> but with all the needed modern stuff.

Yep - in agreement with that approach.

-Arjun
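
For illustration only, here is a minimal userspace sketch of the "one surrounding stac/clac" idea discussed above. It is not kernel code and makes several assumptions: mock_access_begin()/mock_access_end() are hypothetical stand-ins for the stac(+lfence)/clac pair, MOCK_UNSAFE_GET is a hypothetical stand-in for unsafe_get_user() with its error label, and struct mock_desc is an invented four-field structure. A NULL source simulates a fault. The sketch only counts how often the access window is opened; it says nothing about real timings.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical bookkeeping: how many times the "user access window"
 * (standing in for stac + lfence ... clac) has been opened. */
static int window_opens;
static bool window_open;

static void mock_access_begin(void) { window_open = true; window_opens++; }
static void mock_access_end(void)   { assert(window_open); window_open = false; }

/* Hypothetical stand-in for unsafe_get_user(): only valid inside an open
 * window; a NULL source simulates a fault and jumps to the error label. */
#define MOCK_UNSAFE_GET(dst, src, label)        \
    do {                                        \
        const uint64_t *p_ = (src);             \
        assert(window_open);                    \
        if (!p_)                                \
            goto label;                         \
        (dst) = *p_;                            \
    } while (0)

/* Invented structure with four fields, matching the "4 x" reads above. */
struct mock_desc { uint64_t f[4]; };

static const uint64_t *field(const struct mock_desc *d, size_t i)
{
    return d ? &d->f[i] : NULL;
}

/* One window per field: mirrors four independent get_user()-style reads. */
static int read_separately(struct mock_desc *dst, const struct mock_desc *src)
{
    for (size_t i = 0; i < 4; i++) {
        mock_access_begin();
        MOCK_UNSAFE_GET(dst->f[i], field(src, i), efault);
        mock_access_end();
    }
    return 0;
efault:
    mock_access_end();
    return -1;
}

/* One window around all four reads, with a single shared error path. */
static int read_in_one_window(struct mock_desc *dst, const struct mock_desc *src)
{
    mock_access_begin();
    MOCK_UNSAFE_GET(dst->f[0], field(src, 0), efault);
    MOCK_UNSAFE_GET(dst->f[1], field(src, 1), efault);
    MOCK_UNSAFE_GET(dst->f[2], field(src, 2), efault);
    MOCK_UNSAFE_GET(dst->f[3], field(src, 3), efault);
    mock_access_end();
    return 0;
efault:
    mock_access_end();
    return -1;
}
```

With these mocks, read_separately() opens the window four times for a valid source, while read_in_one_window() opens it once and still reaches the shared error label on a simulated fault.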