From: Jann Horn
Date: Fri, 7 Jun 2024 17:12:38 +0200
Subject: Re: [PATCH v16 1/5] mm: add VM_DROPPABLE for designating always lazily freeable mappings
To: "Jason A. Donenfeld"
Cc: linux-kernel@vger.kernel.org, patches@lists.linux.dev, tglx@linutronix.de,
    linux-crypto@vger.kernel.org, linux-api@vger.kernel.org, x86@kernel.org,
    Greg Kroah-Hartman, Adhemerval Zanella Netto, "Carlos O'Donell",
    Florian Weimer, Arnd Bergmann, Christian Brauner, David Hildenbrand,
    linux-mm@kvack.org
References: <20240528122352.2485958-1-Jason@zx2c4.com> <20240528122352.2485958-2-Jason@zx2c4.com>

On Fri, Jun 7, 2024 at 4:35 PM Jason A. Donenfeld wrote:
> On Fri, May 31, 2024 at 03:00:26PM +0200, Jann Horn wrote:
> > On Fri, May 31, 2024 at 2:13 PM Jason A. Donenfeld wrote:
> > > On Fri, May 31, 2024 at 12:48:58PM +0200, Jann Horn wrote:
> > > > On Tue, May 28, 2024 at 2:24 PM Jason A. Donenfeld wrote:
> > > > > c) If there's not enough memory to service a page fault, it's not fatal.
> > > > [...]
> > > > > @@ -5689,6 +5689,10 @@ vm_fault_t handle_mm_fault(struct vm_area_struct *vma, unsigned long address,
> > > > >
> > > > >  	lru_gen_exit_fault();
> > > > >
> > > > > +	/* If the mapping is droppable, then errors due to OOM aren't fatal. */
> > > > > +	if (vma->vm_flags & VM_DROPPABLE)
> > > > > +		ret &= ~VM_FAULT_OOM;
> > > >
> > > > Can you remind me how this is supposed to work? If we get an OOM
> > > > error, and the error is not fatal, does that mean we'll just keep
> > > > hitting the same fault handler over and over again (until we happen to
> > > > have memory available again I guess)?
> > >
> > > Right, it'll just keep retrying. I agree this isn't great, which is why
> > > in the 2023 patchset, I had additional code to simply skip the faulting
> > > instruction, and then the userspace code would notice the inconsistency
> > > and fallback to the syscall. This worked pretty well. But it meant
> > > decoding the instruction and in general skipping instructions is weird,
> > > and that made this patchset very very contentious. Since the skipping
> > > behavior isn't actually required by the /security goals/ of this, I
> > > figured I'd just drop that. And maybe we can all revisit it together
> > > sometime down the line. But for now I'm hoping for something a little
> > > easier to swallow.
> >
> > In that case, since we need to be able to populate this memory to make
> > forward progress, would it make sense to remove the parts of the patch
> > that treat the allocation as if it was allowed to silently fail (the
> > "__GFP_NOWARN | __GFP_NORETRY" and the "ret &= ~VM_FAULT_OOM")? I
> > think that would also simplify this a bit by making this type of
> > memory a little less special.
>
> The whole point, though, is that it needs to not fail or warn. It's
> memory that can be dropped/zeroed at any moment, and the code is
> deliberately robust to that.

Sure - but does it have to be more robust than accessing a newly
allocated piece of memory [which hasn't been populated with anonymous
pages yet] or bringing a swapped-out page back from swap?

I'm not an expert on OOM handling, but my understanding is that the
kernel tries _really_ hard to avoid failing low-order GFP_KERNEL
allocations, with the help of the OOM killer. My understanding is that
those allocations basically can't fail with a NULL return unless the
process has already been killed or it is in a memcg_kmem cgroup that
contains only processes that have been marked as exempt from OOM
killing. (Or if you're using error injection to explicitly tell the
kernel to fail the allocation.)

My understanding is that normal outcomes of an out-of-memory situation
are things like the OOM killer killing processes (including
potentially the calling one) to free up memory, or the OOM killer
panic()ing the whole system as a last resort; but getting a NULL
return from page_alloc(GFP_KERNEL) without getting killed is not one
of those outcomes.
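
The retry behavior discussed in the quoted exchange can be sketched in
user-space C. This is only an illustrative model, not kernel code: the
SKETCH_* constants and the sketch_fault()/sketch_access() helpers are
hypothetical names, and the flag values are made up. It also assumes
that a fault on a non-droppable mapping simply reports the OOM error to
its caller, which glosses over the OOM killer entirely.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical flag values for illustration; not the kernel's definitions. */
#define SKETCH_VM_DROPPABLE 0x1u
#define SKETCH_VM_FAULT_OOM 0x1u

/*
 * Model of the fault handler tail: an allocation failure sets the OOM
 * bit, and a droppable mapping masks that bit out again.
 */
static unsigned int sketch_fault(unsigned int vm_flags, bool page_available)
{
	unsigned int ret = page_available ? 0 : SKETCH_VM_FAULT_OOM;

	if (vm_flags & SKETCH_VM_DROPPABLE)
		ret &= ~SKETCH_VM_FAULT_OOM;
	return ret;
}

/*
 * Model of one memory access. The allocation starts succeeding once
 * `faults_until_memory` faults have already been taken. Returns the
 * number of faults taken before the access succeeded, or -1 if the
 * fault handler reported OOM.
 */
static int sketch_access(unsigned int vm_flags, int faults_until_memory)
{
	int faults = 0;

	assert(faults_until_memory >= 0);
	for (;;) {
		bool page_available = faults >= faults_until_memory;
		unsigned int ret = sketch_fault(vm_flags, page_available);

		faults++;
		if (ret & SKETCH_VM_FAULT_OOM)
			return -1;	/* error reported to the caller */
		if (page_available)
			return faults;	/* page mapped, access completes */
		/* No error and no page: the same instruction faults again. */
	}
}
```

Under these assumptions, sketch_access(SKETCH_VM_DROPPABLE, 3) takes
four faults before the access completes, while sketch_access(0, 3)
reports the error on the first fault.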