From: André Almeida
To: Mathieu Desnoyers, Peter Zijlstra
Cc: linux-kernel@vger.kernel.org, Thomas Gleixner, Paul E. McKenney,
    Boqun Feng, H. Peter Anvin, Paul Turner, linux-api@vger.kernel.org,
    Christian Brauner, Florian Weimer, David.Laight@ACULAB.COM,
    carlos@redhat.com, Peter Oskolkov, Alexander Mikhalitsyn,
    Chris Kennelly, Ingo Molnar, Darren Hart, Davidlohr Bueso,
    André Almeida, libc-alpha@sourceware.org, Steven Rostedt,
    Jonathan Corbet, Noah Goldstein, Daniel Colascione,
    longman@redhat.com, kernel-dev@igalia.com
Subject: [PATCH v2 0/1] Add FUTEX_SPIN operation
Date: Thu, 23 May 2024 17:07:03 -0300
Message-ID: <20240523200704.281514-1-andrealmeid@igalia.com>

Hi,

In the last LPC, Mathieu Desnoyers and I presented[0] a proposal to
extend the rseq interface to be able to implement spin locks in
userspace correctly. Thomas Gleixner agreed that this is something that
Linux could improve, but asked for an alternative proposal first: a
futex operation that allows spinning on a user lock inside the kernel.
This patchset implements a prototype of this idea for further
discussion.

With the FUTEX2_SPIN flag set during a futex_wait(), the futex value is
expected to be the TID of the lock owner. The kernel then gets the
task_struct of the corresponding TID and checks whether it is running.
It spins until the futex is woken, the lock owner is scheduled out, or
a timeout happens.
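
To make the intended behaviour easier to discuss, here is a minimal
userspace model of the wait decision described above. It is only a
sketch, not code from the patch: the names owner_tid(), next_action()
and enum wait_action are hypothetical, and TID_MASK is a local copy of
the FUTEX_TID_MASK value from <linux/futex.h> so that the snippet
builds without kernel headers. It makes no system call and does not
use FUTEX2_SPIN itself.

```c
#include <stdbool.h>
#include <stdint.h>

/* Local copy of FUTEX_TID_MASK from <linux/futex.h> (assumed here so
 * the sketch has no dependency on kernel headers). */
#define TID_MASK 0x3fffffffu

/* Hypothetical outcomes of one pass through the wait loop. */
enum wait_action {
	WAIT_RETURN,	/* woken or timed out: go back to userspace */
	WAIT_SPIN,	/* owner is on a CPU: keep spinning */
	WAIT_SLEEP	/* owner is scheduled out: normal futex sleep */
};

/* The futex word holds the owner's TID in the low bits; the upper
 * bits are masked off and stay free for other uses. */
static uint32_t owner_tid(uint32_t futex_word)
{
	return futex_word & TID_MASK;
}

/* Model of the decision: a wake or a timeout ends the wait, otherwise
 * the waiter spins only while the lock owner is running. */
static enum wait_action next_action(bool woken, bool timed_out,
				    bool owner_running)
{
	if (woken || timed_out)
		return WAIT_RETURN;
	return owner_running ? WAIT_SPIN : WAIT_SLEEP;
}
```

In this model a waiter would call next_action() repeatedly and only
fall back to sleeping once the task identified by owner_tid() is no
longer running.
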
If the lock owner is scheduled out at any time, the syscall follows the
normal path and sleeps as usual. The user input is masked with
FUTEX_TID_MASK so we have some bits to play with.

If the futex is woken while we are spinning, we can return to userspace
quickly, avoiding being scheduled out and back in again just to wake
from a futex_wait(), which speeds up the wait operation.

Christian Brauner suggested using pidfd to avoid race conditions, and I
will implement that in the next patch iteration.

I benchmarked the implementation by measuring the time required to wait
for a futex in a simple loop, using the code at [2]. In my setup, the
total wait time for 1000 futexes using the spin method was about 8%
lower than with the normal futex wait:

	Testing with FUTEX2_SPIN | FUTEX_WAIT
	Total wait time: 8650089 usecs

	Testing with FUTEX_WAIT
	Total wait time: 9447291 usecs

However, as I varied how long the lock owner stays busy, the benchmark
results with and without spinning would match. This shows that spinning
is effective for some specific scheduling scenarios but that, depending
on the wait time, there is no big difference between spinning and not
spinning.

[0] https://lpc.events/event/17/contributions/1481/

You can find a small snippet to play with this interface here:

[1] https://gist.github.com/andrealmeid/f0b8c93a3c7a5c50458247c47f7078e1

Changelog:

v1:
 - s/PID/TID
 - masked user input with FUTEX_TID_MASK
 - add benchmark tool to the cover letter
 - dropped debug prints
 - added missing put_task_struct()

André Almeida (1):
  futex: Add FUTEX_SPIN operation

 include/uapi/linux/futex.h |  2 +-
 kernel/futex/futex.h       |  6 ++-
 kernel/futex/waitwake.c    | 78 +++++++++++++++++++++++++++++++++++++-
 3 files changed, 82 insertions(+), 4 deletions(-)

-- 
2.45.1