References: <20210716184719.269033-5-posk@google.com> <2c971806-b8f6-50b9-491f-e1ede4a33579@uwaterloo.ca>
In-Reply-To: <2c971806-b8f6-50b9-491f-e1ede4a33579@uwaterloo.ca>
From: Peter Oskolkov
Date: Mon, 19 Jul 2021 10:29:59 -0700
Subject: Re: [RFC PATCH 4/4 v0.3] sched/umcg: RFC: implement UMCG syscalls
To: Thierry Delisle
Cc: posk@posk.io, avagin@google.com, bsegall@google.com, jannh@google.com,
    jnewsome@torproject.org, joel@joelfernandes.org, linux-api@vger.kernel.org,
    linux-kernel@vger.kernel.org, mingo@redhat.com, peterz@infradead.org,
    pjt@google.com, tglx@linutronix.de, Peter Buhr
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jul 19, 2021 at 9:07 AM Thierry Delisle wrote:
>
> > /**
> >  * @idle_servers_ptr: a single-linked list pointing to the list
> >  *                    of idle servers. Can be NULL.
> >  *
> >  * Readable/writable by both the kernel and the userspace: the
> >  * userspace adds items to the list, the kernel removes them.
> >  *
> >  * This is a single-linked list (stack): head->next->next->next->NULL.
> >  * "next" nodes are idle_servers_ptr fields in struct umcg_task.
> >  *
> >  * Example:
> >  *
> >  *  a running worker             idle server 1        idle server 2
> >  *
> >  * struct umcg_task:             struct umcg_task:    struct umcg_task:
> >  *    state                         state                state
> >  *    api_version                   api_version          api_version
> >  *    ...                           ...                  ...
> >  *    idle_servers_ptr --> head --> idle_servers_ptr --> idle_servers_ptr --> NULL
> >  *    ...                           ...                  ...
> >  *
> >  *
> >  * Due to the way struct umcg_task is aligned, idle_servers_ptr
> >  * is aligned at 8 byte boundary, and so has its first byte as zero
> >  * when it holds a valid pointer.
> >  *
> >  * When pulling idle servers from the list, the kernel marks nodes as
> >  * "deleted" by ORing the node value (the pointer) with 1UL atomically.
> >  * If a node is "deleted" (i.e. its value AND 1UL is not zero),
> >  * the kernel proceeds to the next node.
> >  *
> >  * The kernel checks at most [nr_cpu_ids * 2] first nodes in the list.
> >  *
> >  * It is NOT considered an error if the kernel cannot find an idle
> >  * server.
> >  *
> >  * The userspace is responsible for cleanup/gc (i.e. for actually
> >  * removing nodes marked as "deleted" from the list).
> >  */
> > uint64_t    idle_servers_ptr;    /* r/w */
>
> I don't understand the reason for using this ad-hoc scheme, over using a
> simple eventfd to do the job. As I understand it, the goal here is to let
> servers that cannot find workers to run, block instead of spinning. Isn't
> that exactly what the eventfd interface is for?

Latency/efficiency: on worker wakeup an idle server can be picked from
the list and context-switched into synchronously, on the same CPU.
Using FDs and select/poll/epoll will add extra layers of abstraction;
synchronous context-switches (not yet fully implemented in UMCG) will
most likely be impossible. This patchset seems much more efficient and
lightweight than whatever can be built on top of FDs.

>
> Have you considered an idle_fd field, the kernel writes 1 to the fd when a
> worker is appended to the idle_workers_ptr?
> Servers that don't find work can read the fd or alternatively use
> select/poll/epoll. Multiple workers are expected to share fds, either a
> single global fd, one fd per server, or any other combination the
> scheduler may fancy.
>
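
For readers following the thread, the idle-server list described in the
quoted comment can be approximated in plain userspace C. This is only a
rough sketch under stated assumptions, not the patch's code: struct
task_sketch, push_idle_server(), claim_idle_server() and gc_deleted() are
hypothetical names, max_nodes stands in for nr_cpu_ids * 2, and the
"kernel" side is simulated with C11 atomics in the same process.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <stdint.h>

#define DELETED_BIT 1ULL /* bit 0 of a link marks the node as "deleted" */

/* Hypothetical stand-in for struct umcg_task; only the link is modeled. */
struct task_sketch {
	_Alignas(8) _Atomic uint64_t idle_servers_ptr; /* "next" link */
	int id;
};

/* Userspace side: push an idle server onto the stack rooted at *head. */
static void push_idle_server(_Atomic uint64_t *head, struct task_sketch *server)
{
	uint64_t self = (uint64_t)(uintptr_t)server;
	uint64_t old = atomic_load(head);

	/* 8-byte alignment keeps bit 0 free for the "deleted" mark. */
	assert((self & DELETED_BIT) == 0);
	do {
		atomic_store(&server->idle_servers_ptr, old);
	} while (!atomic_compare_exchange_weak(head, &old, self));
}

/*
 * Simulated kernel side: walk at most max_nodes nodes and claim the first
 * one that is not yet marked. Returning NULL is not an error.
 */
static struct task_sketch *claim_idle_server(_Atomic uint64_t *head,
					     size_t max_nodes)
{
	uint64_t cur = atomic_load(head);

	for (size_t i = 0; i < max_nodes && cur != 0; i++) {
		struct task_sketch *node = (struct task_sketch *)(uintptr_t)cur;
		uint64_t prev = atomic_fetch_or(&node->idle_servers_ptr,
						DELETED_BIT);

		if ((prev & DELETED_BIT) == 0)
			return node;		/* this caller set the mark */
		cur = prev & ~DELETED_BIT;	/* already deleted: skip it */
	}
	return NULL;
}

/*
 * Userspace cleanup: unlink nodes marked "deleted". Assumes nothing else
 * touches the list while it runs (a simplification for this sketch).
 */
static size_t gc_deleted(_Atomic uint64_t *head)
{
	_Atomic uint64_t *link = head;
	uint64_t cur = atomic_load(link);
	size_t removed = 0;

	while (cur != 0) {
		struct task_sketch *node = (struct task_sketch *)(uintptr_t)cur;
		uint64_t next = atomic_load(&node->idle_servers_ptr);

		if (next & DELETED_BIT) {
			cur = next & ~DELETED_BIT;
			atomic_store(link, cur);	/* unlink the node */
			removed++;
		} else {
			link = &node->idle_servers_ptr;
			cur = next;
		}
	}
	return removed;
}
```

Claiming a server here is a single atomic OR on the node's link, with no
file descriptor or wakeup path involved, which is the property the reply
above leans on when comparing the list against an eventfd-based design.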