From: Thomas Gleixner
To: Peter Zijlstra, axboe@kernel.dk
Cc: linux-kernel@vger.kernel.org, peterz@infradead.org, mingo@redhat.com,
 dvhart@infradead.org, dave@stgolabs.net, andrealmeid@igalia.com,
 Andrew Morton, urezki@gmail.com, hch@infradead.org, lstoakes@gmail.com,
 Arnd Bergmann, linux-api@vger.kernel.org, linux-mm@kvack.org,
 linux-arch@vger.kernel.org, malteskarupke@web.de
Subject: Re: [PATCH v1 11/14] futex: Implement FUTEX2_NUMA
In-Reply-To: <20230721105744.434742902@infradead.org>
References: <20230721102237.268073801@infradead.org>
 <20230721105744.434742902@infradead.org>
Date: Mon, 31 Jul 2023 19:36:21 +0200
Message-ID: <87pm48m19m.ffs@tglx>
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jul 21 2023 at 12:22, Peter Zijlstra wrote:
>  struct futex_hash_bucket *futex_hash(union futex_key *key)
>  {
> -	u32 hash = jhash2((u32 *)key, offsetof(typeof(*key), both.offset) / 4,
> +	u32 hash = jhash2((u32 *)key,
> +			  offsetof(typeof(*key), both.offset) / sizeof(u32),
>  			  key->both.offset);
> +	int node = key->both.node;
>
> -	return &futex_queues[hash & (futex_hashsize - 1)];
> +	if (node == -1) {
> +		/*
> +		 * In case of !FLAGS_NUMA, use some unused hash bits to pick a
> +		 * node -- this ensures regular futexes are interleaved across
> +		 * the nodes and avoids having to allocate multiple
> +		 * hash-tables.
> +		 *
> +		 * NOTE: this isn't perfectly uniform, but it is fast and
> +		 * handles sparse node masks.
> +		 */
> +		node = (hash >> futex_hashshift) % nr_node_ids;

Is nr_node_ids guaranteed to be stable after init? It's marked
__read_mostly, but not __ro_after_init.

> +		if (!node_possible(node)) {
> +			node = find_next_bit_wrap(node_possible_map.bits,
> +						  nr_node_ids, node);
> +		}
> +	}
> +
> +	return &futex_queues[node][hash & (futex_hashsize - 1)];
>  }

>  	fshared = flags & FLAGS_SHARED;
> +	size = futex_size(flags);
>
>  	/*
>  	 * The futex address must be "naturally" aligned.
>  	 */
>  	key->both.offset = address % PAGE_SIZE;
> -	if (unlikely((address % sizeof(u32)) != 0))
> +	if (unlikely((address % size) != 0))
>  		return -EINVAL;

Hmm. Shouldn't that have changed with the allowance of the 1 and 2 byte
futexes?

>  	address -= key->both.offset;
>
> -	if (unlikely(!access_ok(uaddr, sizeof(u32))))
> +	if (flags & FLAGS_NUMA)
> +		size *= 2;
> +
> +	if (unlikely(!access_ok(uaddr, size)))
>  		return -EFAULT;
>
>  	if (unlikely(should_fail_futex(fshared)))
>  		return -EFAULT;
>
> +	key->both.node = -1;

Please put this into an else path.

> +	if (flags & FLAGS_NUMA) {
> +		void __user *naddr = uaddr + size/2;

  size / 2;

> +
> +		if (futex_get_value(&node, naddr, flags))
> +			return -EFAULT;
> +
> +		if (node == -1) {
> +			node = numa_node_id();
> +			if (futex_put_value(node, naddr, flags))
> +				return -EFAULT;
> +		}
> +
> +		if (node >= MAX_NUMNODES || !node_possible(node))
> +			return -EINVAL;

That's clearly an else path too. No point in checking whether
numa_node_id() is valid.
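
For illustration, a user-space sketch of the control flow the two remarks
above ask for. It is only a model, not the patch: model_resolve_node(),
model_node_possible(), MODEL_MAX_NODES, MODEL_NO_NODE and the possible-node
mask are made-up stand-ins for the kernel code, and the user word is a plain
int instead of going through futex_get_value()/futex_put_value().

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

#define MODEL_MAX_NODES	8	/* hypothetical stand-in for MAX_NUMNODES */
#define MODEL_NO_NODE	(-1)	/* stand-in for the -1 "no node" marker */

/* Hypothetical possible-node mask: nodes 0, 1 and 3 exist, node 2 does not. */
static const bool model_possible[MODEL_MAX_NODES] = { true, true, false, true };

/* Models node_possible(); also rejects negative values. */
static bool model_node_possible(int node)
{
	return node >= 0 && node < MODEL_MAX_NODES && model_possible[node];
}

/*
 * numa:     models (flags & FLAGS_NUMA)
 * word:     models the node word in user space (may be NULL if !numa)
 * cur_node: models the result of numa_node_id()
 *
 * Returns 0 and stores the node in *key_node, or -EINVAL.
 */
static int model_resolve_node(bool numa, int *word, int cur_node, int *key_node)
{
	int node;

	if (!numa) {
		/* Else path: only non-NUMA futexes get the -1 marker. */
		*key_node = MODEL_NO_NODE;
		return 0;
	}

	assert(word != NULL);
	node = *word;

	if (node == MODEL_NO_NODE) {
		/* The current node is trusted, so it is not range checked. */
		node = cur_node;
		*word = node;
	} else if (node >= MODEL_MAX_NODES || !model_node_possible(node)) {
		/* Else path: only user supplied values are validated. */
		return -EINVAL;
	}

	*key_node = node;
	return 0;
}
```

With this shape the -1 marker is only written for non-NUMA futexes, and the
range check only runs for a node value that came from user space.
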

> +		key->both.node = node;
> +	}
>
> +static inline unsigned int futex_size(unsigned int flags)
> +{
> +	unsigned int size = flags & FLAGS_SIZE_MASK;
> +	return 1 << size; /* {0,1,2,3} -> {1,2,4,8} */
> +}
> +
>  static inline bool futex_flags_valid(unsigned int flags)
>  {
>  	/* Only 64bit futexes for 64bit code */
> @@ -77,13 +83,19 @@ static inline bool futex_flags_valid(uns
>  	if ((flags & FLAGS_SIZE_MASK) != FLAGS_SIZE_32)
>  		return false;
>
> -	return true;
> -}
> +	/*
> +	 * Must be able to represent both NUMA_NO_NODE and every valid nodeid
> +	 * in a futex word.
> +	 */
> +	if (flags & FLAGS_NUMA) {
> +		int bits = 8 * futex_size(flags);
> +		u64 max = ~0ULL;
> +		max >>= 64 - bits;

Your newline key is broken, right?

> +		if (nr_node_ids >= max)
> +			return false;
> +	}

Thanks,

        tglx