From: Daniel Colascione
Date: Wed, 6 Mar 2019 16:33:07 -0800
Subject: Re: [RFC] Provide in-kernel headers for making it easy to extend the kernel
To: "H. Peter Anvin"
Cc: Pavel Machek, Joel Fernandes, Greg KH, linux-kernel, Andrew Morton,
    ast@kernel.org, atish patra, Borislav Petkov, Ingo Molnar, Jan Kara,
    Jonathan Corbet, Karim Yaghmour, Kees Cook, kernel-team@android.com,
    "open list:DOCUMENTATION", Manoj Rao, Masahiro Yamada, Paul McKenney,
    "Peter Zijlstra (Intel)", Randy Dunlap, rostedt@goodmis.org,
    Thomas Gleixner, "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)",
    yhs@fb.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, Mar 6, 2019 at 4:07 PM H. Peter Anvin wrote:
>
> On 3/6/19 3:37 PM, Daniel Colascione wrote:
> >
> > I just don't get the opposition to Joel's work.
> > The rest of the thread
> > already goes into detail about the problems with pure-filesystem
> > solutions, and you and others are just totally ignoring those
> > well-thought-out rationales for the module approach and doing
> > inflooping on "lol just use a tarball". That's not productive.
> >
> > Look; here's the bottom line: without this work, doing certain kinds
> > of system tracing is a nightmare, and with this patch, it Just Works.
> > You're arguing that various tools should do a better job of keeping
> > the filesystem in sync with the kernel. Maybe you're right. But we
> > don't live in a world where they will, because if this coherence were
> > going to happen, it'd work already. But this work solves the problem:
> > by necessity, anything that changes a kernel image *must* update
> > modules coherently, whether the kernel image and module come from the
> > filesystem, network boot, some kind of SQL database, or carrier
> > pigeon. There's nothing wrong with work that very cheaply makes the
> > kernel self-describing (introspection is elegant) and that takes
> > advantage of *existing* kernel tooling infrastructure to transparently
> > do a new thing.
> >
> > You don't have to use this patch if you don't want to. Please stop
> > trying to block it.
> >
>
> No, that's not how this works. It is far worse to do something the wrong
> way than not doing it at all, when it affects the kernel-user space
> interactions.

And what are the supposedly disastrous consequences of this change?
It's basically a souped-up version of /proc/config.gz. Tell me more
about the trail of destruction and regret behind /proc/config.gz.

> Experience -- and we have almost 30 years of it -- has shown that hacks
> of this nature become engrained and all of a sudden is "mandatory". At
> the *very least* it needs to comply with the zero-one-infinity rule
> rather than being yet another ad hoc hack.

It already satisfies the zero-one-infinity rule by virtue of not being
a system for encoding an arbitrary number of random kernel header
blobs for some reason in a single kernel.

> More fundamentally, we already have multiple ways to handle objects that
> need to go into the filesystem: they can be installed with (or as)
> modules, they can use the firmware interface, and so on.

*There* *may* *be* *no* *filesystem*. Or the filesystem may be
read-only. The only thing the kernel can really guarantee is its own
existence --- it should be entire in itself. If I'm hacking on an
Android kernel and say "fastboot boot mykernel" without making any
changes to the device's boot filesystem, I should still be able to use
tracing tools that rely on knowing the headers for the kernel with
which the device happened to boot. Any approach that requires
coordinated kernel and filesystem changes to make this use case work is
inferior to what Joel's proposed.

> Saying "it can be a module" is worse than a copout: even if dynamically
> loaded -- and many setups lock out post-boot module loadings for
> security reasons -- there is nothing to cause it to unload.

Those setups can ship kernel headers as they do today. Or a
tmpfs-based approach may be workable.

> The bottom line is that in the end there is no difference between making
> this an archive of some sort and a module, except that to use the module
> you need privilege that you otherwise may not need. If your argument is
> that someone may not be providing the whole set of items provided by
> "make modules_install", what is there to say that they would include
> your specific module?
>
> Here are some better ways of implementation I can see:
>
> 1. Include an archive in "make modules_install". Most trivial; no kernel
>    changes at all.

No. See above.

> 2. Generalize the initramfs code to be able to create a pre-populated
>    tmpfs at any time, that can be populated from an archive provided by
>    the firmware loading mechanism; like all firmware allows it to either
>    be built in or fetched from the filesystem. This allows it to be
>    built in to the kernel image if that becomes necessary; using tmpfs
>    means that it can be pushed out to swap rather than permanently
>    stored in kernel memory, and this filesystem can be unmounted freeing
>    its memory.

Backing the blob storage with tmpfs is a reasonable tweak to Joel's
existing model. We can mark the header blob discardable and memcpy it
into some tmpfs-backed storage. This way, it can swap, and you can
release the memory with rm(1) as well as unmount. You might as well
expose the facility as a new just-like-tmpfs filesystem that init
scripts can mount anywhere --- once. Making the thing a firmware blob
sounds fine too, although I know less about that subsystem.

But blocking this work as a whole in favor of some yet-to-be-designed
general-purpose initramfs-tmpfs conversion thingamajig really is
perfect-is-the-enemy-of-the-good-ism, and I don't think tmpfs storage
is necessary for the initial version of this work.

> 3. Use a squashfs image instead of an archive.

Why?
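
For concreteness, the /proc/config.gz precedent mentioned above is
consumed from userspace roughly as follows. This is only a minimal
sketch, assuming a kernel built with CONFIG_IKCONFIG_PROC; the function
names are illustrative and do not come from this thread or from any
existing tool.

```python
import gzip
from pathlib import Path


def parse_kernel_config(text):
    """Return {option: value} for every CONFIG_ option that is set."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        # Unset options appear as comments: "# CONFIG_FOO is not set".
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep and name.startswith("CONFIG_"):
            # String-valued options are quoted in the config file.
            options[name] = value.strip('"')
    return options


def read_running_kernel_config(path="/proc/config.gz"):
    """Read the running kernel's config, or return None if not exposed."""
    p = Path(path)
    if not p.exists():
        return None
    with gzip.open(p, "rt") as f:
        return parse_kernel_config(f.read())
```

The caller asks the running kernel for its own description and never
has to guess which filesystem path matches the booted image.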