From: Pasha Tatashin
Date: Mon, 18 Mar 2024 11:30:42 -0400
Subject: Re: [RFC 00/14] Dynamic Kernel Stacks
To: Matthew Wilcox
Cc: David Laight, "H. Peter Anvin", Kent Overstreet,
    "linux-kernel@vger.kernel.org", "linux-mm@kvack.org",
    "akpm@linux-foundation.org", "x86@kernel.org", "bp@alien8.de",
    "brauner@kernel.org", "bristot@redhat.com", "bsegall@google.com",
    "dave.hansen@linux.intel.com", "dianders@chromium.org",
    "dietmar.eggemann@arm.com", "eric.devolder@oracle.com",
    "hca@linux.ibm.com", "hch@infradead.org",
    "jacob.jun.pan@linux.intel.com", "jgg@ziepe.ca", "jpoimboe@kernel.org",
    "jroedel@suse.de", "juri.lelli@redhat.com", "kinseyho@google.com",
    "kirill.shutemov@linux.intel.com", "lstoakes@gmail.com",
    "luto@kernel.org", "mgorman@suse.de", "mic@digikod.net",
    "michael.christie@oracle.com", "mingo@redhat.com", "mjguzik@gmail.com",
    "mst@redhat.com", "npiggin@gmail.com", "peterz@infradead.org",
    "pmladek@suse.com", "rick.p.edgecombe@intel.com", "rostedt@goodmis.org",
    "surenb@google.com", "tglx@linutronix.de", "urezki@gmail.com",
    "vincent.guittot@linaro.org", "vschneid@redhat.com"

On Mon, Mar 18, 2024 at 11:19 AM Matthew Wilcox wrote:
>
> On Mon, Mar 18, 2024 at 11:09:47AM -0400, Pasha Tatashin wrote:
> > The TLB load is going to be exactly the same as today, we already use
> > small pages for VMA mapped stacks. We won't need to have extra
> > flushing either, the mappings are in the kernel space, and once pages
> > are removed from the page table, no one is going to access that VA
> > space until that thread enters the kernel again. We will need to
> > invalidate the VA range only when the pages are mapped, and only on
> > the local cpu.
>
> No; we can pass pointers to our kernel stack to other threads. The
> obvious one is a mutex; we put a mutex_waiter on our own stack and
> add its list_head to the mutex's waiter list. I'm sure you can
> think of many other places we do this (eg wait queues, poll(), select(),
> etc).

Hm, that means a thread can be sleeping in kernel space with its stack
pages mapped and invalidated on the local CPU, while access to those
stack pages from a remote CPU would be problematic.

I think we still won't need an IPI, but VA-range invalidation is
actually needed on unmap, and it should happen during context switch,
i.e. every time we go off-CPU. Therefore, what Brian/Andy have
suggested makes more sense than the kernel enter/exit paths.

Pasha
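
For illustration only, here is a minimal userspace sketch of the pattern
Matthew describes: a sleeping thread places a waiter record on its own
stack, links it into a shared list, and another thread then writes
through that pointer while the sleeper is blocked. This is a pthreads
analogy, not kernel code; struct waiter, struct wait_list, sleeper(),
wake_one() and run_demo() are hypothetical names, and struct waiter only
stands in for something like mutex_waiter. The concern in the thread is
the kernel equivalent of the write in wake_one(): if the sleeper's stack
pages had been unmapped while it was off-CPU, that remote access would
hit an unmapped address.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an on-stack waiter such as mutex_waiter. */
struct waiter {
    struct waiter *next;
    int woken;
    int value;
};

/* Shared waiter list; one condvar signals both "queued" and "woken". */
struct wait_list {
    pthread_mutex_t lock;
    pthread_cond_t cond;
    struct waiter *head;
};

static void *sleeper(void *arg)
{
    struct wait_list *wl = arg;
    struct waiter w = { NULL, 0, 0 };   /* lives on this thread's stack */

    pthread_mutex_lock(&wl->lock);
    w.next = wl->head;
    wl->head = &w;                      /* publish a pointer into our stack */
    pthread_cond_broadcast(&wl->cond);
    while (!w.woken)                    /* block until another thread wakes us */
        pthread_cond_wait(&wl->cond, &wl->lock);
    pthread_mutex_unlock(&wl->lock);
    return (void *)(intptr_t)w.value;
}

static void wake_one(struct wait_list *wl, int value)
{
    struct waiter *w;

    pthread_mutex_lock(&wl->lock);
    while (wl->head == NULL)            /* wait until a waiter is queued */
        pthread_cond_wait(&wl->cond, &wl->lock);
    w = wl->head;
    wl->head = w->next;
    w->value = value;                   /* remote write into the sleeper's stack */
    w->woken = 1;
    pthread_cond_broadcast(&wl->cond);
    pthread_mutex_unlock(&wl->lock);
}

int run_demo(void)
{
    struct wait_list wl = { PTHREAD_MUTEX_INITIALIZER,
                            PTHREAD_COND_INITIALIZER, NULL };
    pthread_t t;
    void *ret = NULL;
    int rc = pthread_create(&t, NULL, sleeper, &wl);

    assert(rc == 0);
    wake_one(&wl, 42);
    pthread_join(t, &ret);
    return (int)(intptr_t)ret;
}
```

Calling run_demo() returns the value the waking thread stored in the
sleeper's stack frame, which only works because that stack stays
accessible to other threads while its owner is asleep.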