From: Peter Oskolkov
Date: Mon, 29 Jun 2020 09:44:13 -0700
Subject: Re: [RFC PATCH 0/3 v3] futex/sched: introduce FUTEX_SWAP operation
To: Thomas Gleixner, Ingo Molnar
Cc: Linux Kernel Mailing List, Peter Zijlstra, Darren Hart, Peter Oskolkov, Vincent Guittot, Andrei Vagin, Paul Turner, Ben Segall, Aaron Lu
References: <20200624185247.13269-1-posk@posk.io>
In-Reply-To: <20200624185247.13269-1-posk@posk.io>

Hi Thomas, Ingo!

Do you have any comments/suggestions/objections here?

FUTEX_SWAP seems to be quite useful for fast task context switching,
and several teams at Google would like to see this capability
upstreamed.

Thanks,
Peter

On Wed, Jun 24, 2020 at 11:53 AM Peter Oskolkov wrote:
>
> From: Peter Oskolkov
>
> This is an RFC!
>
> As Paul Turner presented at LPC in 2013 ...
> - pdf: http://pdxplumbers.osuosl.org/2013/ocw//system/presentations/1653/original/LPC%20-%20User%20Threading.pdf
> - video: https://www.youtube.com/watch?v=KXuZi9aeGTw
>
> ... Google has developed an M:N userspace threading subsystem backed
> by Google-private SwitchTo Linux Kernel API (page 17 in the pdf referenced
> above).
> This subsystem provides latency-sensitive services at Google with
> fine-grained user-space control/scheduling over what is running when,
> and this subsystem is used widely internally (called schedulers or fibers).
>
> This RFC patchset is the first step to open-source this work. As explained
> in the linked pdf and video, SwitchTo API has three core operations: wait,
> resume, and swap (=switch). So this patchset adds a FUTEX_SWAP operation
> that, in addition to FUTEX_WAIT and FUTEX_WAKE, will provide a foundation
> on top of which user-space threading libraries can be built.
>
> Another common use case for FUTEX_SWAP is message passing a-la RPC
> between tasks: task/thread T1 prepares a message,
> wakes T2 to work on it, and waits for the results; when T2 is done, it
> wakes T1 and waits for more work to arrive. Currently the simplest
> way to implement this is
>
> a. T1: futex-wake T2, futex-wait
> b. T2: wakes, does what it has been woken to do
> c. T2: futex-wake T1, futex-wait
>
> With FUTEX_SWAP, steps a and c above can be reduced to one futex operation
> that runs 5-10 times faster.
>
> Patches in this patchset:
>
> Patch 1: introduce FUTEX_SWAP futex operation that,
>          internally, does wake + wait. The purpose of this patch is
>          to work out the API.
> Patch 2: a first rough attempt to make FUTEX_SWAP faster than
>          what wake + wait can do.
> Patch 3: a selftest that can also be used to benchmark FUTEX_SWAP vs
>          FUTEX_WAKE + FUTEX_WAIT.
>
> v2: fix undefined symbol error ifndef CONFIG_SMP.
> v3: rebased onto the latest tip/locking/core.
>
> Peter Oskolkov (3):
>   futex: introduce FUTEX_SWAP operation
>   futex/sched: add wake_up_process_prefer_current_cpu, use in FUTEX_SWAP
>   selftests/futex: add futex_swap selftest
>
>  include/linux/sched.h                       |   1 +
>  include/uapi/linux/futex.h                  |   2 +
>  kernel/futex.c                              |  96 ++++++--
>  kernel/sched/core.c                         |   5 +
>  kernel/sched/fair.c                         |   3 +
>  kernel/sched/sched.h                        |   1 +
>  .../selftests/futex/functional/.gitignore   |   1 +
>  .../selftests/futex/functional/Makefile     |   1 +
>  .../selftests/futex/functional/futex_swap.c | 209 ++++++++++++++++++
>  .../selftests/futex/include/futextest.h     |  19 ++
>  10 files changed, 322 insertions(+), 16 deletions(-)
>  create mode 100644 tools/testing/selftests/futex/functional/futex_swap.c
>
> --
> 2.25.1
>
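
For readers who want to see the baseline concretely, the wake + wait
handshake from steps a to c of the quoted cover letter can be sketched
with the existing FUTEX_WAKE and FUTEX_WAIT operations only. This is a
minimal illustration, not code from the patchset: the names futex(),
futex_wait_token(), futex_post_token(), struct channel, t2_worker() and
ping_pong() are all hypothetical, and the one-token-per-futex protocol is
an assumption made to keep the example short. FUTEX_SWAP itself is not
called here, since its interface is defined only by the RFC patches.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <linux/futex.h>
#include <pthread.h>
#include <stdatomic.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc has no futex() wrapper, so go through syscall(2). */
static long futex(_Atomic uint32_t *uaddr, int op, uint32_t val)
{
    return syscall(SYS_futex, uaddr, op, val, NULL, NULL, 0);
}

/* Block until a token is posted on *f, then consume it (reset to 0). */
static void futex_wait_token(_Atomic uint32_t *f)
{
    while (atomic_exchange(f, 0) == 0)
        futex(f, FUTEX_WAIT_PRIVATE, 0); /* sleeps only while *f == 0 */
}

/* Post a token on *f and wake at most one waiter. */
static void futex_post_token(_Atomic uint32_t *f)
{
    atomic_store(f, 1);
    futex(f, FUTEX_WAKE_PRIVATE, 1);
}

/* Hypothetical shared state for the T1 <-> T2 exchange. */
struct channel {
    _Atomic uint32_t t1_futex; /* T1 sleeps on this word */
    _Atomic uint32_t t2_futex; /* T2 sleeps on this word */
    int rounds;
    int served;                /* written by T2, read by T1 after join */
};

static void *t2_worker(void *arg)
{
    struct channel *ch = arg;

    for (int i = 0; i < ch->rounds; i++) {
        futex_wait_token(&ch->t2_futex); /* wait for work from T1 */
        ch->served++;                    /* step b: do the work */
        futex_post_token(&ch->t1_futex); /* step c: futex-wake T1 ... */
    }                                    /* ... then loop back to wait */
    return NULL;
}

/* Runs `rounds` request/response exchanges; returns how many T2 served. */
static int ping_pong(int rounds)
{
    struct channel ch = { .rounds = rounds };
    pthread_t t2;

    assert(rounds >= 0);
    if (pthread_create(&t2, NULL, t2_worker, &ch) != 0)
        return -1;

    for (int i = 0; i < rounds; i++) {
        futex_post_token(&ch.t2_futex);  /* step a: futex-wake T2 ... */
        futex_wait_token(&ch.t1_futex);  /* ... then futex-wait */
    }
    pthread_join(t2, NULL);
    return ch.served;
}
```

Each exchange here costs two futex syscalls on each side (one wake, one
wait). Timing a call such as ping_pong(100000) would give a rough
wake + wait baseline to compare against; per the cover letter, FUTEX_SWAP
is meant to fold each wake + wait pair into a single operation.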