Date: Fri, 25 Feb 2022 16:21:02 -0500 (EST)
From: Mathieu Desnoyers
To: Jonathan Corbet
Cc: Peter Zijlstra, linux-kernel, Thomas Gleixner, paulmck, Boqun Feng,
    "H. Peter Anvin", Paul Turner, linux-api, Christian Brauner,
    Florian Weimer, David Laight, carlos, Peter Oskolkov
Message-ID: <1136157594.109786.1645824062005.JavaMail.zimbra@efficios.com>
In-Reply-To: <1323451367.108396.1645811762372.JavaMail.zimbra@efficios.com>
References: <20220218210633.23345-1-mathieu.desnoyers@efficios.com>
    <20220218210633.23345-10-mathieu.desnoyers@efficios.com>
    <87k0dikfxa.fsf@meer.lwn.net>
    <1323451367.108396.1645811762372.JavaMail.zimbra@efficios.com>
Subject: Re: [RFC PATCH v2 09/11] sched: Introduce per memory space current virtual cpu id
X-Mailing-List: linux-kernel@vger.kernel.org

----- On Feb 25, 2022, at 12:56 PM, Mathieu Desnoyers mathieu.desnoyers@efficios.com wrote:

> ----- On Feb 25, 2022, at 12:35 PM, Jonathan Corbet corbet@lwn.net wrote:
>
>> Mathieu Desnoyers writes:
>>
>>> This feature allows the scheduler to expose a current virtual cpu id
>>> to user-space. This virtual cpu id is within the possible cpus range,
>>> and is temporarily (and uniquely) assigned while threads are actively
>>> running within a memory space.
>>> If a memory space has fewer threads than
>>> cores, or is limited to run on few cores concurrently through sched
>>> affinity or cgroup cpusets, the virtual cpu ids will be values close
>>> to 0, thus allowing efficient use of user-space memory for per-cpu
>>> data structures.
>>
>> So I have one possibly (probably) dumb question: if I'm writing a
>> program to make use of virtual CPU IDs, how do I know what the maximum
>> ID will be? It seems like one of the advantages of this mechanism would
>> be not having to be prepared for anything in the physical ID space, but
>> is there any guarantee that the virtual-ID space will be smaller?
>> Something like "no larger than the number of threads", say?
>
> Hi Jonathan,
>
> This is a very relevant question. Let me quote what I answered to Florian
> on the last round of review for this series:
>
> Some effective upper bounds for the number of vcpu ids observable in a process:
>
> - sysconf(3) _SC_NPROCESSORS_CONF,
> - the number of threads which exist concurrently in the process,

One small detail I forgot to mention: on a NUMA system, a single-threaded
process will observe (typically) vcpu_id=numa_node_id. So it can jump around
between vcpu_id values depending on which numa node it runs on at the moment.

So the vcpu_id is not strictly bound by the number of concurrently running
threads.

Thanks,

Mathieu

> - the number of cpus in the cpu affinity mask applied by sched_setaffinity,
>   except in corner-case situations such as cpu hotplug removing all cpus from
>   the affinity set,
> - cgroup cpuset "partition" limits,
>
> Note that AFAIR non-partition cgroup cpusets allow a cgroup to "borrow"
> additional cores from the rest of the system if they are idle, therefore
> allowing the number of concurrent threads to go beyond the specified limit.
>
> AFAIR the sched affinity mask is tweaked independently of the cgroup cpuset.
> Those are two mechanisms both affecting the scheduler task placement.
>
> I would expect the user-space code to use some sensible upper bound as a
> hint about how many per-vcpu data structure elements to expect (and how many
> to pre-allocate), but have a "lazy initialization" fall-back in case the
> vcpu id goes up to the number of configured processors - 1. And I suspect
> that even the number of configured processors may change with CRIU.
>
> If the above explanation makes sense (please let me know if I am wrong
> or missed something), I suspect I should add it to the commit message.
>
> Thanks,
>
> Mathieu
>
> --
> Mathieu Desnoyers
> EfficiOS Inc.
> http://www.efficios.com

--
Mathieu Desnoyers
EfficiOS Inc.
http://www.efficios.com
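
For illustration only, here is a minimal user-space sketch of the "upper
bound as a hint, plus lazy initialization" approach discussed in this thread.
It is not taken from the patch series. The names vcpu_slot, vcpu_table,
vcpu_prealloc_hint(), vcpu_table_init() and vcpu_slot_get() are hypothetical,
and the vcpu id is passed in as a plain argument because how it is read from
the kernel is outside the scope of this sketch. The hint is the smaller of
sysconf(3) _SC_NPROCESSORS_CONF and the size of the current affinity mask;
any id above the hint, up to the number of configured processors - 1, is
allocated on first use. The sketch is single-threaded: a real implementation
would need an atomic publish of the lazily allocated slot, and it does not
handle the configured processor count changing (e.g. with CRIU).

```c
#define _GNU_SOURCE
#include <assert.h>
#include <sched.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical per-vcpu element; the payload is a placeholder. */
struct vcpu_slot {
    long counter;
};

/* Hypothetical table: one pointer per possible vcpu id. */
struct vcpu_table {
    struct vcpu_slot **slots;
    long nr_slots;     /* hard bound: number of configured processors */
    long nr_prealloc;  /* hint: number of slots allocated up front */
};

/* Hint: min(configured cpus, cpus in this thread's affinity mask). */
static long vcpu_prealloc_hint(void)
{
    long conf = sysconf(_SC_NPROCESSORS_CONF);
    cpu_set_t mask;

    if (conf < 1)
        conf = 1;
    /* May fail on very large systems; fall back to conf in that case. */
    if (sched_getaffinity(0, sizeof(mask), &mask) == 0) {
        long allowed = CPU_COUNT(&mask);

        if (allowed >= 1 && allowed < conf)
            return allowed;
    }
    return conf;
}

/* Pre-allocate the hinted slots; leave the others NULL until needed. */
static int vcpu_table_init(struct vcpu_table *t)
{
    long conf = sysconf(_SC_NPROCESSORS_CONF);

    assert(t != NULL);
    if (conf < 1)
        conf = 1;
    t->nr_slots = conf;
    t->nr_prealloc = vcpu_prealloc_hint();
    if (t->nr_prealloc > t->nr_slots)
        t->nr_prealloc = t->nr_slots;
    t->slots = calloc((size_t)conf, sizeof(*t->slots));
    if (!t->slots)
        return -1;
    for (long i = 0; i < t->nr_prealloc; i++) {
        t->slots[i] = calloc(1, sizeof(**t->slots));
        if (!t->slots[i])
            return -1;
    }
    return 0;
}

/*
 * Lazy-initialization fall-back: ids above the hint are allocated on
 * first use. Not thread-safe as written. Returns NULL for ids outside
 * [0, nr_slots) or on allocation failure.
 */
static struct vcpu_slot *vcpu_slot_get(struct vcpu_table *t, long vcpu_id)
{
    assert(t != NULL);
    if (vcpu_id < 0 || vcpu_id >= t->nr_slots)
        return NULL;
    if (!t->slots[vcpu_id])
        t->slots[vcpu_id] = calloc(1, sizeof(**t->slots));
    return t->slots[vcpu_id];
}
```

A caller would run vcpu_table_init() once, then call vcpu_slot_get() with the
current vcpu id on each access. Because of the NUMA behaviour noted above,
even a single-threaded process should expect ids above the hint and rely on
the lazy path.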