From: Thierry Delisle
CC: Peter Buhr, Martin Karsten
Subject: Re: [RFC PATCH 3/3 v0.2] sched/umcg: RFC: implement UMCG syscalls
Date: Sun, 11 Jul 2021 14:29:39 -0400
In-Reply-To: <20210708194638.128950-4-posk@google.com>
References: <20210708194638.128950-4-posk@google.com>
X-Mailing-List: linux-kernel@vger.kernel.org

> Let's move the discussion to the new thread.

I'm happy to start a new thread. I'm re-responding to my last post because many
of my questions are still unanswered.

> + * State transitions:
> + *
> + * RUNNING => IDLE:   the current RUNNING task becomes IDLE by calling
> + *                    sys_umcg_wait();
>
> [...]
>
> +/**
> + * enum umcg_wait_flag - flags to pass to sys_umcg_wait
> + * @UMCG_WAIT_WAKE_ONLY: wake @self->next_tid, don't put @self to sleep;
> + * @UMCG_WF_CURRENT_CPU: wake @self->next_tid on the current CPU
> + *                       (use WF_CURRENT_CPU); @UMCG_WAIT_WAKE_ONLY must be set.
> + */
> +enum umcg_wait_flag {
> +    UMCG_WAIT_WAKE_ONLY = 1,
> +    UMCG_WF_CURRENT_CPU = 2,
> +};

What is the purpose of using sys_umcg_wait without next_tid or with
UMCG_WAIT_WAKE_ONLY? It looks like Java's park/unpark semantics to me, that is,
worker threads can use this for synchronization and mutual exclusion. In this
case, how do these compare to using FUTEX_WAIT/FUTEX_WAKE?

> +struct umcg_task {
> [...]
> +    /**
> +     * @server_tid: the TID of the server UMCG task that should be
> +     *              woken when this WORKER becomes BLOCKED. Can be zero.
> +     *
> +     *              If this is a UMCG server, @server_tid should
> +     *              contain the TID of @self - it will be used to find
> +     *              the task_struct to wake when pulled from
> +     *              @idle_servers.
> +     *
> +     * Read-only for the kernel, read/write for the userspace.
> +     */
> +    uint32_t    server_tid;        /* r   */
> [...]
> +    /**
> +     * @idle_servers_ptr: a single-linked list pointing to the list
> +     *                    of idle servers. Can be NULL.
> +     *
> +     * Readable/writable by both the kernel and the userspace: the
> +     * userspace adds items to the list, the kernel removes them.
> +     *
> +     * TODO: describe how the list works.
> +     */
> +    uint64_t    idle_servers_ptr;    /* r/w */
> [...]
> +} __attribute__((packed, aligned(8 * sizeof(__u64))));

From the comments and by elimination, I'm guessing that idle_servers_ptr is
somehow used by servers to block until some worker threads become idle. However,
I do not understand how userspace is expected to use it. I also do not
understand whether these link fields form a stack or a queue, or where the head
is.

> +/**
> + * sys_umcg_ctl: (un)register a task as a UMCG task.
> + * @flags:       ORed values from enum umcg_ctl_flag; see below;
> + * @self:        a pointer to struct umcg_task that describes this
> + *               task and governs the behavior of sys_umcg_wait if
> + *               registering; must be NULL if unregistering.
> + *
> + * @flags & UMCG_CTL_REGISTER: register a UMCG task:
> + *         UMCG workers:
> + *              - self->state must be UMCG_TASK_IDLE
> + *              - @flags & UMCG_CTL_WORKER
> + *
> + *         If the conditions above are met, sys_umcg_ctl() immediately returns
> + *         if the registered task is a RUNNING server or basic task; an IDLE
> + *         worker will be added to idle_workers_ptr, and the worker put to
> + *         sleep; an idle server from idle_servers_ptr will be woken, if any.

This approach to creating UMCG workers concerns me a little. My understanding
is that, in general, the number of servers controls the amount of parallelism
in the program. But when new UMCG workers are created, the new threads only
respect the M:N threading model after sys_umcg_ctl has blocked. What does this
mean for applications that create thousands of short-lived tasks? Are users
expected to create pools of reusable UMCG workers?

I would suggest adding at least one uint64_t field to struct umcg_task that is
left as-is by the kernel. This allows implementers of user-space schedulers to
attach scheduler-specific data structures to the threads without needing some
kind of table on the side.
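
To make the suggestion concrete, here is a rough sketch of how a user-space
scheduler might use such a field. Everything in it is an assumption made for
illustration: the field name user_data, the struct umcg_task_sketch stand-in
(which keeps only server_tid and idle_servers_ptr from the patch and ignores
the packed/aligned attributes), and the sched_thread type with its helpers.
None of it is taken from the patch itself.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical, trimmed-down stand-in for struct umcg_task. Only
 * server_tid and idle_servers_ptr appear in the patch; user_data is
 * the proposed addition and does not exist there.
 */
struct umcg_task_sketch {
	uint32_t server_tid;        /* from the patch */
	uint64_t idle_servers_ptr;  /* from the patch */
	uint64_t user_data;         /* proposed: left as-is by the kernel */
};

/* Hypothetical per-thread state owned by a user-space scheduler. */
struct sched_thread {
	int priority;
	struct sched_thread *next_ready;
	struct umcg_task_sketch umcg;
};

/* Store a back-pointer to the scheduler state in the opaque field. */
static inline void sched_attach(struct sched_thread *t)
{
	t->umcg.user_data = (uint64_t)(uintptr_t)t;
}

/* Recover the scheduler state from a umcg task, with no side table. */
static inline struct sched_thread *
sched_from_umcg(const struct umcg_task_sketch *u)
{
	return (struct sched_thread *)(uintptr_t)u->user_data;
}
```

With a field like this, a server that is handed a pointer to a worker's
umcg task can reach its own bookkeeping for that worker in one load,
rather than looking the TID up in a separate table.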