X-Mailing-List: linux-kernel@vger.kernel.org
References: <20240311164638.2015063-1-pasha.tatashin@soleen.com>
 <2cb8f02d-f21e-45d2-afe2-d1c6225240f3@zytor.com>
 <2qp4uegb4kqkryihqyo6v3fzoc2nysuhltc535kxnh6ozpo5ni@isilzw7nth42>
 <39F17EC4-7844-4111-BF7D-FFC97B05D9FA@zytor.com>
From: Pasha Tatashin
Date: Mon, 18 Mar 2024 11:09:47 -0400
Subject: Re: [RFC 00/14] Dynamic Kernel Stacks
To: David Laight
Cc: "H. Peter Anvin", Matthew Wilcox, Kent Overstreet,
 "linux-kernel@vger.kernel.org", "linux-mm@kvack.org",
 "akpm@linux-foundation.org", "x86@kernel.org", "bp@alien8.de",
 "brauner@kernel.org", "bristot@redhat.com", "bsegall@google.com",
 "dave.hansen@linux.intel.com", "dianders@chromium.org",
 "dietmar.eggemann@arm.com", "eric.devolder@oracle.com",
 "hca@linux.ibm.com", "hch@infradead.org",
 "jacob.jun.pan@linux.intel.com", "jgg@ziepe.ca", "jpoimboe@kernel.org",
 "jroedel@suse.de", "juri.lelli@redhat.com", "kinseyho@google.com",
 "kirill.shutemov@linux.intel.com", "lstoakes@gmail.com",
 "luto@kernel.org", "mgorman@suse.de", "mic@digikod.net",
 "michael.christie@oracle.com", "mingo@redhat.com", "mjguzik@gmail.com",
 "mst@redhat.com", "npiggin@gmail.com", "peterz@infradead.org",
 "pmladek@suse.com", "rick.p.edgecombe@intel.com", "rostedt@goodmis.org",
 "surenb@google.com", "tglx@linutronix.de", "urezki@gmail.com",
 "vincent.guittot@linaro.org", "vschneid@redhat.com"

On Sun, Mar 17, 2024 at 2:58 PM David Laight wrote:
>
> From: Pasha Tatashin
> > Sent: 16 March 2024 19:18
> ...
> > Expanding on Matthew's idea of an interface for dynamic kernel stack
> > sizes, here's what I'm thinking:
> >
> > - Kernel Threads: Create all kernel threads with a fully populated
> > THREAD_SIZE stack. (i.e. 16K)
> > - User Threads: Create all user threads with THREAD_SIZE kernel stack
> > but only the top page mapped. (i.e. 4K)
> > - In enter_from_user_mode(): Expand the thread stack to 16K by mapping
> > three additional pages from the per-CPU stack cache. This function is
> > called early in kernel entry points.
> > - exit_to_user_mode(): Unmap the extra three pages and return them to
> > the per-CPU cache. This function is called late in the kernel exit
> > path.
>
> Isn't that entirely horrid for TLB use and so will require a lot of IPI?

The TLB load is going to be exactly the same as today: we already use
small pages for VMA mapped stacks. We won't need extra flushing either.
The mappings are in kernel space, and once pages are removed from the
page table, no one is going to access that VA space until that thread
enters the kernel again. We will need to invalidate the VA range only
when the pages are mapped, and only on the local CPU.

> Remember, if a thread sleeps in 'extra stack' and is then rescheduled
> on a different cpu the extra pages get 'pumped' from one cpu to
> another.

Yes, the per-CPU cache can get unbalanced this way. We can remember the
original CPU where we acquired the pages and return them to the same
place.

> I also suspect a stack_probe() is likely to end up being a cache miss
> and also slow???

Can you please elaborate on this point? I am not aware of stack_probe()
or how it is used.

> So you wouldn't want one on all calls.
> I'm not sure you'd want a conditional branch either.
>
> The explicit request for 'more stack' can be required to be allowed
> to sleep - removing a lot of issues.
> It would also be portable to all architectures.
> I'd also suspect that any thread that needs extra stack is likely
> to need to again.
> So while the memory could be recovered, I'd bet it isn't worth
> doing except under memory pressure.
> The call could also return 'no' - perhaps useful for (broken) code
> that insists on being recursive.

The approach currently under discussion is somewhat different from an
explicit 'more stack' request API. I am investigating how feasible it is
to use kernel stack multiplexing, so that the same pages can be re-used
by many threads, each holding them only while it actually uses them. If
the multiplexing approach won't work, I will come back to the explicit
'more stack' API.

> Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
> Registration No: 1397386 (Wales)
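
For illustration only, here is a minimal user-space sketch of the per-CPU
cache idea discussed in this thread: three extra pages are taken from the
local CPU's cache on kernel entry and handed back to the CPU they came
from on exit, so the caches stay balanced even if the thread migrated in
between. Everything here is an assumption made for the sketch: the names
(stack_cache, thread_stack, cache_fill, stack_expand, stack_shrink) are
hypothetical and not taken from the RFC patches, pages are modelled as
plain integers, and no page tables, TLB invalidation, or locking are
modelled.

```c
#include <assert.h>
#include <stddef.h>

#define NR_CPUS        2  /* size of this model, not a kernel constant */
#define EXTRA_PAGES    3  /* 16K stack = 1 always-mapped page + 3 dynamic */
#define CACHE_CAPACITY 8  /* arbitrary capacity chosen for the sketch */

/* Hypothetical per-CPU cache of free stack pages (pages are just ints). */
struct stack_cache {
    int pages[CACHE_CAPACITY];
    size_t count;
};

/* Hypothetical per-thread state for the dynamically mapped part. */
struct thread_stack {
    int extra[EXTRA_PAGES];
    int origin_cpu;  /* CPU whose cache supplied the pages, -1 if none */
};

struct stack_cache caches[NR_CPUS];

/* Reset one CPU's cache so it holds n pages numbered from first_page. */
void cache_fill(int cpu, int first_page, size_t n)
{
    assert(n <= CACHE_CAPACITY);
    caches[cpu].count = 0;
    for (size_t i = 0; i < n; i++)
        caches[cpu].pages[caches[cpu].count++] = first_page + (int)i;
}

/*
 * Models the kernel-entry step: take EXTRA_PAGES from the local cache.
 * Returns 0 on success, -1 if pages are already held or the local cache
 * cannot supply them.
 */
int stack_expand(struct thread_stack *ts, int cpu)
{
    struct stack_cache *c = &caches[cpu];

    if (ts->origin_cpu >= 0 || c->count < EXTRA_PAGES)
        return -1;
    for (int i = 0; i < EXTRA_PAGES; i++)
        ts->extra[i] = c->pages[--c->count];
    ts->origin_cpu = cpu;  /* remember where the pages came from */
    return 0;
}

/*
 * Models the kernel-exit step: give the pages back to the cache they were
 * taken from, regardless of which CPU the thread is running on now.
 */
void stack_shrink(struct thread_stack *ts)
{
    if (ts->origin_cpu < 0)
        return;  /* nothing mapped */

    struct stack_cache *c = &caches[ts->origin_cpu];
    for (int i = 0; i < EXTRA_PAGES; i++) {
        assert(c->count < CACHE_CAPACITY);
        c->pages[c->count++] = ts->extra[i];
    }
    ts->origin_cpu = -1;
}
```

A caller would fill each cache with cache_fill(), call stack_expand() with
the CPU the thread enters the kernel on, and call stack_shrink() on exit.
In a real kernel, returning pages to another CPU's cache would need some
form of synchronization, which this model leaves out.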