In-Reply-To: <48D020BD-D344-41CC-9D32-033F592605EB@zytor.com>
From: Daniel Colascione
Date: Fri, 25 Jan 2019 12:34:52 -0800
Subject: Re: [RFC] Provide in-kernel headers for making it easy to extend the kernel
To: "H. Peter Anvin"
Cc: Joel Fernandes, Karim Yaghmour, Greg KH, Christoph Hellwig,
 linux-kernel, Andrew Morton, ast@kernel.org, atish patra,
 Borislav Petkov, Ingo Molnar, Jan Kara, Jonathan Corbet, Kees Cook,
 kernel-team@android.com, "open list:DOCUMENTATION", Manoj Rao,
 Masahiro Yamada, Paul McKenney, "Peter Zijlstra (Intel)", Randy Dunlap,
 rostedt@goodmis.org, Thomas Gleixner,
 "maintainer:X86 ARCHITECTURE (32-BIT AND 64-BIT)", yhs@fb.com
X-Mailing-List: linux-kernel@vger.kernel.org

On Fri, Jan 25, 2019 at 12:28 PM wrote:
>
> On January 25, 2019 11:15:52 AM PST, Daniel Colascione wrote:
> >On Fri, Jan 25, 2019 at 11:01 AM wrote:
> >>
> >> On January 24, 2019 12:59:29 PM PST, Joel Fernandes wrote:
> >> >On Thu, Jan 24, 2019 at 07:57:26PM +0100, Karim Yaghmour wrote:
> >> >>
> >> >> On 1/23/19 11:37 PM, Daniel Colascione wrote:
> >> >[..]
> >> >> > > Personally I advocated a more aggressive approach with Joel in
> >> >> > > private: just put the darn headers straight into the kernel
> >> >> > > image, it's the *only* artifact we're sure will follow the
> >> >> > > Android device whatever happens to it (like built-in ftrace).
> >> >> >
> >> >> > I was thinking along similar lines. Ordinarily, we make loadable
> >> >> > kernel modules. What we kind of want here is a non-loadable kernel
> >> >> > module --- or a non-loadable section in the kernel image proper.
> >> >> > I'm not familiar with early-stage kernel loader operation: I know
> >> >> > it's possible to create discardable sections in the kernel image,
> >> >> > but can we create sections that are never slurped into memory in
> >> >> > the first place? If not, maybe loading and immediately discarding
> >> >> > the header section is good enough.
> >> >>
> >> >> Interesting.
> >> >> Maybe just append it to the image but have it not loaded and
> >> >> have a kernel parameter that enables a "/proc/kheaders" driver to
> >> >> know where to fetch the appended headers from storage at runtime.
> >> >> There would be no RAM loading whatsoever of the headers, just some
> >> >> sort of "kheaders=/dev/foobar:offset:size" parameter. If you turn
> >> >> the option on, you get a fatter kernel image size to store on
> >> >> permanent storage, but no impact on what's loaded at boot time.
> >> >
> >> >Embedding anything into the kernel image does impact boot time though
> >> >because it increases the time spent by the bootloader. A module OTOH
> >> >would not have such overhead.
> >> >
> >> >Also a kernel can be booted in any number of ways other than mass
> >> >storage so it is not a generic Linux-wide solution to have a kheaders=
> >> >option like that. If the option is forgotten, then the running system
> >> >can't use the feature. The other issue is it requires a kernel command
> >> >line option / bootloader changes for that, which adds more
> >> >configuration burden that would not be needed with a module.
> >> >
> >> >> > Would such a thing really do better than LZMA? LZMA already has
> >> >> > very clever techniques for eliminating long-range redundancies in
> >> >> > compressible text, including redundancies at the sub-byte level.
> >> >> > I can certainly understand the benefit of stripping comments,
> >> >> > since removing comments really does decrease the total amount of
> >> >> > information the compressor has to preserve, but I'm not sure how
> >> >> > much the encoding scheme you propose below would help, since it
> >> >> > reminds me of the encoding scheme that LZMA would discover
> >> >> > automatically.
> >> >>
> >> >> I'm no compression algorithm expert.
> >> >> If you say LZMA would do the same/better than what I suggested
> >> >> then I have no reason to contest that. My goal is to see the
> >> >> headers as part of the kernel image that's distributed on devices
> >> >> so that they don't have to be chased around. I'm just trying to
> >> >> make it as palatable as possible.
> >> >
> >> >I believe LZMA is really good at that sort of thing too.
> >> >
> >> >Also at 3.3MB of module size, I think we are really good size-wise.
> >> >But Dan is helping look at possibly reducing further if he gets time.
> >> >Many modules in my experience are much bigger. amdgpu.ko on my Linux
> >> >machine is 6.1MB.
> >> >
> >> >I really think making it a module is the best way to make sure this is
> >> >bundled with the kernel on the widest number of Android and other
> >> >Linux systems, without incurring boot time overhead, or any other
> >> >command line configuration burden.
> >> >
> >> >I spoke to so many people at LPC personally with other kernel
> >> >contributors, and many folks told me one word - MODULE :D. Even
> >> >though I hesitated at first, now it seems the right solution.
> >> >
> >> >If no one seriously objects, I'll clean this up and post a v2 with
> >> >the RFC tag taken off. Thank you!
> >> >
> >> > - Joel
> >>
> >> So let me throw in a different notion.
> >>
> >> A kernel module really is nothing other than a kernel build system
> >> artifact stored in the filesystem.
> >>
> >> I really don't see any reason whatsoever why this is different from
> >> just producing an archive and putting it in the module directory,
> >> except that the latter is far simpler.
> >>
> >> I see literally *no* problem, social or technical, you are solving by
> >> actually making it a kernel ELF object.
> >
> >Joel does have a point. Suppose we're on Android and we're testing a
> >random local kernel we've built.
> >We can command the device to boot into the bootloader, then send our
> >locally-built kernel to the device with "fastboot boot mykernelz".
> >Having booted the device this way, there's no on-disk artifact
> >corresponding to mykernelz: we just sent the boot kernel directly to
> >the device's memory. Now, suppose I want to attach DCTV or some other
> >fancy ftrace-based analysis tool to the device in order to study how
> >mykernelz acts in some scenario I care about. With Joel's approach,
> >DCTV would be able to grab the kernel headers from the running kernel,
> >compile whatever kprobe or BPF incantations needed, and have
> >everything Just Work. If we put the headers only on disk without any
> >way to retrieve them at runtime, we'd need a different path for kernel
> >self-description in the case where a running kernel doesn't have an
> >on-disk representation, and that adds a lot of complexity for everyone
> >everywhere. By providing guaranteed-correct kernel headers via some
> >runtime interface, we make a lot of things Just Work, and that has
> >value.
>
> Joel specifically is talking about using a module, which *does* have
> to be in the filesystem.
>
> You can't have it both ways, unfortunately.

In general, whatever we support in module form, we also support as part
of the kernel image itself.
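
To make the tooling side of the quoted argument concrete, here is a
minimal sketch of the lookup a tracing tool might do when it needs
headers for the running kernel. Everything named here is an assumption
for illustration: "/proc/kheaders" is only the name floated earlier in
this thread, not an existing interface, and find_kernel_headers() is a
hypothetical helper. The on-disk fallback uses the conventional
/lib/modules/<release>/build location.

```python
import os
from typing import Iterable, Optional

# Hypothetical runtime interface; the name comes from the thread and is
# an assumption, not something the kernel is known to provide.
RUNTIME_HEADERS = "/proc/kheaders"


def on_disk_headers(release: str) -> str:
    """Conventional on-disk location of the build tree for a kernel release."""
    return f"/lib/modules/{release}/build"


def find_kernel_headers(candidates: Iterable[str]) -> Optional[str]:
    """Return the first candidate path that exists, or None if none do."""
    for path in candidates:
        if os.path.exists(path):
            return path
    return None


if __name__ == "__main__":
    release = os.uname().release
    # Prefer headers exported by the running kernel, then fall back to disk.
    found = find_kernel_headers([RUNTIME_HEADERS, on_disk_headers(release)])
    print(found if found else "no headers available for " + release)
```

With a "fastboot boot" kernel the second candidate would typically be
missing or would describe a different kernel, which is the case the
runtime interface is meant to cover.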