Date: Mon, 7 Jun 2021 19:33:02 -0700
From: Davidlohr Bueso
To: André Almeida
Cc: Nicholas Piggin, acme@kernel.org, Andrey Semashev,
 Sebastian Andrzej Siewior, corbet@lwn.net, Darren Hart,
 fweimer@redhat.com, joel@joelfernandes.org, kernel@collabora.com,
 krisman@collabora.com, libc-alpha@sourceware.org,
 linux-api@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-kselftest@vger.kernel.org, malteskarupke@fastmail.fm,
 Ingo Molnar, Peter Zijlstra, pgriffais@valvesoftware.com,
 Peter Oskolkov, Steven Rostedt, shuah@kernel.org, Thomas Gleixner,
 z.figura12@gmail.com
Subject: Re: [PATCH v4 00/15] Add futex2 syscalls
Message-ID: <20210608023302.34yzrm5ktf3qvxhq@offworld>
In-Reply-To: <22137ccd-c5e6-9fcc-a176-789558e9ab1e@collabora.com>

On Mon, 07 Jun 2021, André Almeida wrote:

>At 22:09 on 04/06/21, Nicholas Piggin wrote:
>> Actually one other scalability thing while I remember it:
>>
>> futex_wait currently requires that the lock word is tested under the
>> queue spin lock (to avoid consuming a wakeup). The problem with this is
>> that the lock word can be a very hot cache line if you have a lot of
>> concurrency, so accessing it under the queue lock can increase queue
>> lock hold time.
>>
>> I would prefer if the new API was relaxed to avoid this restriction
>> (e.g., any wait call may consume a wakeup so it's up to userspace to
>> avoid that if it is a problem).
>
>Maybe I'm wrong, but AFAIK the goal of checking the lock word inside the
>spin lock is to avoid sleeping forever (in other words, wrongly assuming
>that the lock is taken and missing a wakeup call), not to avoid
>consuming wakeups. Or at least this is my interpretation of this long
>comment in futex.c:
>
>https://elixir.bootlin.com/linux/v5.12.9/source/kernel/futex.c#L51

I think what Nick is referring to is that futex_wait() could return 0
instead of EAGAIN upon a uval != val condition if the check is done
without the hb lock. The value could have changed between when userspace
did the condition check and when it called into futex(2) to block in the
slowpath.
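
For readers following along, here is a minimal userspace sketch of the
current futex(2) behaviour described above. It is not from the thread:
the helper names futex_wait_on() and wait_with_stale_value() are made up
for illustration, and the "stale value" is simulated by simply passing an
expected value that no longer matches the futex word.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <linux/futex.h>
#include <stddef.h>
#include <stdint.h>
#include <sys/syscall.h>
#include <unistd.h>

/* glibc exports no futex() wrapper, so go through syscall(2).
 * Hypothetical helper name, for illustration only. */
static long futex_wait_on(uint32_t *uaddr, uint32_t expected)
{
    /* No timeout: if *uaddr == expected this would block until woken. */
    return syscall(SYS_futex, uaddr, FUTEX_WAIT, expected, NULL, NULL, 0);
}

/* Simulates userspace having read 0 from the futex word, after which
 * another thread changed it to 1 before the waiter entered the kernel.
 * Returns 0 if the wait returned normally, otherwise the errno value. */
static int wait_with_stale_value(void)
{
    uint32_t word = 1;      /* current value of the futex word */
    uint32_t stale = 0;     /* what the waiter believes it saw  */

    if (futex_wait_on(&word, stale) == 0)
        return 0;
    /* With the value compared inside the kernel, the mismatch is
     * reported as EAGAIN and the caller is never queued. */
    return errno;
}
```

Calling wait_with_stale_value() on a Linux system should return EAGAIN
rather than blocking, which is the result userspace relies on today to
re-check the lock word and retry.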

But such spurious scenarios should be pretty rare, and while I agree that
the cacheline can be hot, I'm not sure how much of a performance issue
this really is(?) compared to other issues, and certainly not enough to
govern the futex2 design. Changing such semantics would be a _huge_
difference between futex1 and futex2.

At least that is how it looks when compared, for example, to the hb
collisions serializing independent futexes, which affect both performance
and determinism. And I agree that a new interface should address this
problem - albeit most of the workloads I have seen in production use but
a handful of futexes and larger thread counts. One thing that crossed my
mind (but that I have not actually sat down to look at) would be to use
rhashtables for the dynamic resizing, but of course that would probably
add a decent amount of overhead to the simple hashing we currently have.

Thanks,
Davidlohr