Received: by 2002:a05:6520:4d:b0:139:a872:a4c9 with SMTP id i13csp763746lkm; Wed, 22 Sep 2021 14:24:36 -0700 (PDT) X-Google-Smtp-Source: ABdhPJx2brc8ORV8BeKg90KET8u4zbu0WwfssRqvUKoNUGg8p1JPTGA+8CMEpiqVt0YJBwVqbgk8 X-Received: by 2002:aa7:c905:: with SMTP id b5mr1718651edt.380.1632345876275; Wed, 22 Sep 2021 14:24:36 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1632345876; cv=none; d=google.com; s=arc-20160816; b=C5uGOVkCZu3Yogyp4VVOsX4+ff5fP3yVxFoBF5ifYuhIHfkXnjLpEzS887IQ818/KB uhvk9DU20/wDEbLwAC+ETsGv6xOTeyxKyIUaJN6lhXlkx3rbxke0CL238VF2w2UuMg2g m6yQ0hj41xsa3n/FaROOIK34jy1Q1Dof1Gos6urWik2XIq2kNPcUkmoau56Ehn5RbuUn YhK29pOtFJBRI2gxOi11/LW9AFyX9BAZbUsuQRzZoIMMMWbC02ymy5u24TWhceJyTxYM z1Tlm+gsmyziN737ShbkWFpPyW2MGo4H39fCgqpLhK14V0MrCVFkwlSLAnj2l798aYXx pRJQ== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:content-transfer-encoding:content-language :in-reply-to:mime-version:user-agent:date:message-id:from:references :cc:to:subject; bh=5vs1zHshr9eEFlTTjZtYwJjtir258LihylnmGc9NvUk=; b=QqnrGAZx595IKTd6PJ7BOkAhRJPbObWNZlbrep9bAVT/nalw0uC/yF0yW4SGcAS5xY TcurivR6zLIfrwJ+2+y2ZT2ablUfN8KKftHjZVS4b5mbWKapMDF43+PFN2JvTHLTH+AB Ns6CDjb9CLNnlgIKuLMoZ3XvvU0WDMFpO+351LCRdRXwkQZFihbsm6kJkPCqho+q7f1o /B6fv6zXHC4R/bT8AVlAwIlCPtMmWfZIpW6DE/hIGU+/wKIPvDSJvwu2bc8cYe2uV6iR UiEfWCFJNzqS3eVstk7OVSyEB30BLGB0DES221NfnBq6GlNmd0plqG4H5aEnJJ4PQeMw WpAg== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [23.128.96.18]) by mx.google.com with ESMTP id d12si3657032edx.464.2021.09.22.14.24.12; Wed, 22 Sep 2021 14:24:36 -0700 (PDT) Received-SPF: pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) client-ip=23.128.96.18; Authentication-Results: mx.google.com; spf=pass (google.com: domain of linux-kernel-owner@vger.kernel.org designates 23.128.96.18 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S238005AbhIVVYY (ORCPT + 99 others); Wed, 22 Sep 2021 17:24:24 -0400 Received: from www62.your-server.de ([213.133.104.62]:37346 "EHLO www62.your-server.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S236476AbhIVVYX (ORCPT ); Wed, 22 Sep 2021 17:24:23 -0400 Received: from sslproxy06.your-server.de ([78.46.172.3]) by www62.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92.3) (envelope-from ) id 1mT9hZ-000D8G-7O; Wed, 22 Sep 2021 23:22:49 +0200 Received: from [85.1.206.226] (helo=linux.home) by sslproxy06.your-server.de with esmtpsa (TLSv1.3:TLS_AES_256_GCM_SHA384:256) (Exim 4.92) (envelope-from ) id 1mT9hY-000Rpx-Vo; Wed, 22 Sep 2021 23:22:49 +0200 Subject: Re: [PATCH v2 bpf-next] libbpf: Use sysconf to simplify libbpf_num_possible_cpus To: Muhammad Falak R Wani , Alexei Starovoitov , Andrii Nakryiko Cc: Martin KaFai Lau , Song Liu , Yonghong Song , John Fastabend , KP Singh , netdev@vger.kernel.org, bpf@vger.kernel.org, linux-kernel@vger.kernel.org References: <20210922070748.21614-1-falakreyaz@gmail.com> From: Daniel Borkmann Message-ID: Date: Wed, 22 Sep 2021 23:22:48 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.7.2 MIME-Version: 1.0 In-Reply-To: <20210922070748.21614-1-falakreyaz@gmail.com> Content-Type: text/plain; charset=utf-8; format=flowed Content-Language: en-US Content-Transfer-Encoding: 7bit X-Authenticated-Sender: daniel@iogearbox.net X-Virus-Scanned: Clear (ClamAV 0.103.3/26300/Wed Sep 22 11:04:22 2021) Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 9/22/21 9:07 AM, Muhammad Falak R Wani wrote: > Simplify libbpf_num_possible_cpus by using sysconf(_SC_NPROCESSORS_CONF) > instead of parsing a file. > This patch is a part ([0]) of libbpf-1.0 milestone. > > [0] Closes: https://github.com/libbpf/libbpf/issues/383 > > Signed-off-by: Muhammad Falak R Wani > --- > tools/lib/bpf/libbpf.c | 17 ++++------------- > 1 file changed, 4 insertions(+), 13 deletions(-) > > diff --git a/tools/lib/bpf/libbpf.c b/tools/lib/bpf/libbpf.c > index ef5db34bf913..f1c0abe5b58d 100644 > --- a/tools/lib/bpf/libbpf.c > +++ b/tools/lib/bpf/libbpf.c > @@ -10898,25 +10898,16 @@ int parse_cpu_mask_file(const char *fcpu, bool **mask, int *mask_sz) > > int libbpf_num_possible_cpus(void) > { > - static const char *fcpu = "/sys/devices/system/cpu/possible"; > static int cpus; > - int err, n, i, tmp_cpus; > - bool *mask; > + int tmp_cpus; > > tmp_cpus = READ_ONCE(cpus); > if (tmp_cpus > 0) > return tmp_cpus; > > - err = parse_cpu_mask_file(fcpu, &mask, &n); > - if (err) > - return libbpf_err(err); > - > - tmp_cpus = 0; > - for (i = 0; i < n; i++) { > - if (mask[i]) > - tmp_cpus++; > - } > - free(mask); > + tmp_cpus = sysconf(_SC_NPROCESSORS_CONF); > + if (tmp_cpus < 1) > + return libbpf_err(-EINVAL); This approach is unfortunately broken, see also commit e00c7b216f34 ("bpf: fix multiple issues in selftest suite and samples") for more details: 3) Current selftest suite code relies on sysconf(_SC_NPROCESSORS_CONF) for retrieving the number of possible CPUs. This is broken at least in our scenario and really just doesn't work. glibc tries a number of things for retrieving _SC_NPROCESSORS_CONF. First it tries equivalent of /sys/devices/system/cpu/cpu[0-9]* | wc -l, if that fails, depending on the config, it either tries to count CPUs in /proc/cpuinfo, or returns the _SC_NPROCESSORS_ONLN value instead. If /proc/cpuinfo has some issue, it returns just 1 worst case. This oddity is nothing new [1], but semantics/behaviour seems to be settled. _SC_NPROCESSORS_ONLN will parse /sys/devices/system/cpu/online, if that fails it looks into /proc/stat for cpuX entries, and if also that fails for some reason, /proc/cpuinfo is consulted (and returning 1 if unlikely all breaks down). While that might match num_possible_cpus() from the kernel in some cases, it's really not guaranteed with CPU hotplugging, and can result in a buffer overflow since the array in user space could have too few number of slots, and on perpcu map lookup, the kernel will write beyond that memory of the value buffer. William Tu reported such mismatches: [...] The fact that sysconf(_SC_NPROCESSORS_CONF) != num_possible_cpu() happens when CPU hotadd is enabled. For example, in Fusion when setting vcpu.hotadd = "TRUE" or in KVM, setting ./qemu-system-x86_64 -smp 2, maxcpus=4 ... the num_possible_cpu() will be 4 and sysconf() will be 2 [2]. [...] Documentation/cputopology.txt says /sys/devices/system/cpu/possible outputs cpu_possible_mask. That is the same as in num_possible_cpus(), so first step would be to fix the _SC_NPROCESSORS_CONF calls with our own implementation. Later, we could add support to bpf(2) for passing a mask via CPU_SET(3), for example, to just select a subset of CPUs. BPF samples code needs this fix as well (at least so that people stop copying this). Thus, define bpf_num_possible_cpus() once in selftests and import it from there for the sample code to avoid duplicating it. The remaining sysconf(_SC_NPROCESSORS_CONF) in samples are unrelated. Thanks, Daniel > WRITE_ONCE(cpus, tmp_cpus); > return tmp_cpus; >