References: <20190320163116.39275-1-joel@joelfernandes.org> <20190408203601.GF133872@google.com>
From: Daniel Colascione
Date: Wed, 10 Apr 2019 10:35:44 -0700
Subject: Re: [PATCH v5 1/3] Provide in-kernel headers to make extending kernel easier
To: Olof Johansson
Cc: Joel Fernandes, Joel Fernandes, Linux Kernel Mailing List,
    Qais Yousef, Dietmar Eggemann, Manoj Rao, Andrew Morton,
    Alexei Starovoitov, atish patra, Daniel Colascione, Dan Williams,
    Greg Kroah-Hartman, Guenter Roeck, Jonathan Corbet, Karim Yaghmour,
    Kees Cook, Android Kernel Team, "open list:DOCUMENTATION",
    "open list:KERNEL SELFTEST FRAMEWORK",
    linux-trace-devel@vger.kernel.org, Masahiro Yamada,
    Masami Hiramatsu, Randy Dunlap, Steven Rostedt, Shuah Khan,
    Yonghong Song

On Wed, Apr 10, 2019 at 9:35 AM Olof Johansson wrote:
> There are lots of things where we provide suitable ways of doing
> things to the user instead of making them come up with their own
> handling of things. devtmpfs is a perfect example of this -- doing
> things in userspace was perfectly possible but still a hassle in many
> cases, and having the kernel do it for you when it already has the
> data makes sense.

devtmpfs is something that the kernel can do *better* than userspace.
The userspace equivalent is error-prone. Unpacking an archive, on the
other hand, is trivial. Does userspace really need the kernel's help
to do tar xvf? Are we really so worried about userspace getting tar
xvf wrong somehow that we should provide tar xvf in the kernel?

The kernel should have a feature only if there's some particular
reason to put that feature in the kernel instead of userspace. A
smaller kernel is a better kernel, all things being equal.

It's also worth noting that years and years ago, many proposals to do
what udevd does (but in the kernel) failed to make it into the kernel.

> I'd expect many users to still want to do this to tmpfs. Also, I
> expect whatever userspace tools and programs that will consume this
> data is likely to consume similar or more memory while running anyway.
> So mounting + copying + unmounting on the heavily constrained systems
> shouldn't be raising the high water mark on memory consumption.

Consider the embedded case: who's to say that the machine
decompressing the header bundle is the machine that provides it? We
can suck a compressed archive off the device without ever
decompressing it. I know that you proposed providing access to both a
compressed cpio archive and a filesystem view of that archive, but in
this scheme, the filesystem view is redundant. If someone wants a
filesystem view of an archive, he can make one with FUSE without the
kernel's help and in a general way.

> > > If you absolutely need to export a file to userspace with the archive,
> > > my suggestion is to do it through debugfs. That way the format isn't
> > > in a /proc ABI that can't be changed in the future (debugfs isn't
> > > required to be stable in the same way). This way we can change the
> > > format carried in the kernel over time without changing the official
> > > way we present the data to userspace (via a filesystem view).
> > >
> > > As far as format goes; there's clear precedent on cpio being used and
> > > supported; we already have build time requirements on the userspace
> > > tools with some options. Using tar would actually be a new dependency
> > > even if it is a common tool to have installed. With a self-populating
> > > FS, there's no new tool requirements on the runtime side either.
> >
> > debugfs is going away for Android and is controversial in the fact
> > that its functionality isn't guaranteed to be there (debugfs breakages
> > aren't necessarily bugs AFAIK). So this isn't an option.
>
> The argument that this needs to go into /proc because Android is
> removing debugfs isn't a very strong one.
>
> And "debugfs breakages aren't bugs" is exactly why I'm suggesting to
> do the non-supported export of the archive that way instead.

Breaking this header bundle *should* be a bug though: tools will rely
on it. It's not a critical interface in the same way that, say,
open(2) is, but the interface stability guarantee should nevertheless
be stronger than what debugfs provides.

> > > > We had close to 2-3 months of discussions now with various folks up until v5.
> > > > I am about to post v6 which is in line with Masahiro Yamada's expectations. In
> > > > that I will be dropping module building artifacts due to his module building
> > > > concerns and only include the headers.
> > >
> > > I've found some of the old discussion and read up on it. I think it
> > > was pretty quick at dismissing ideas for more robust implementations
> > > ("it needs squashfs-tools"), and had some narrow viewpoints (exporting
> > > a tarball is the least amount of kernel change, while adding
> > > complexity at the system/usage side).
> >
> > Honestly, that's kind of unfair to be quoting just a few points like
> > that. If I remember there were 100s of emails and many good view
> > points were brought up by many people. We have done the diligence in
> > the discussions of this over a period of time.
>
> That wasn't captured with the patch submission, and having people go
> find 100s of emails to figure out why your seemingly lacking solution
> is the best one available is not how you motivate getting your code
> into the kernel.
>
> > > I'd also like to clarify: I'm not opposed to the general idea of
> > > providing the needed headers with the kernel somehow. I just think
> > > it's worth spending effort making sure an interface for it that we'll
> > > need to live with forever is appropriately thought through and not
> > > rushed in, especially since we're likely to get substantial
> > > infrastructure on top of it quickly (eBPF and friends in particular).
> >
> > We have spent the time :) This seems like the best solution of all.
>
> That should be documented.
>
> > Greg KH and other maintainers are also supportive of it as can be seen
> > in other threads.
>
> I've found support for the desire to provide headers. If there's so
> much support for this solution, the number of Acks to the patch should
> have been higher.
>
> > We can consider an alternate proposal if it is
> > better, but I don't see any better one proposed at the moment.
>
> Really?

Joel's proposal is the simplest approach that solves the problem we're
trying to solve. The model you're proposing adds a lot of complexity,
and I'm not convinced that the complexity buys us anything.

> > - squashfs-tools requirement on the build really sucks
>
> Nah, this is a minor detail.

tar is ubiquitous. The squashfs tool isn't. Treating both dependencies
the same way is a false equivalence.

> > - cpio uncompressed to memory equally sucks because it consumes all
> > the memory uncompressed instead of reclaimable pages
>
> Only while mounted.
>
> > - decompressing into tmpfs will suck for Android because we don't use
> > disk-based swap and we run into the same cpio issue above. We use ZRAM
> > for compressed swap.
>
> See comments above about high water marks for memory consumption
> likely not moving much.
>
> > - debugfs is a non-option for Android
>
> Not my problem.

It's important to make interfaces that work for everyone. Robust eBPF
use should be possible without debugfs.
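
To illustrate the earlier point that unpacking an archive is trivial
for userspace, here is a minimal sketch in Python. It is only an
illustration under stated assumptions: the function name
unpack_headers and the archive and destination paths are hypothetical,
and nothing in this thread fixes where the kernel would expose the
archive or how it would be compressed.

```python
import tarfile
from pathlib import Path


def unpack_headers(archive: Path, dest: Path) -> list[str]:
    """Unpack a (possibly compressed) tar archive into dest.

    Hypothetical helper for illustration only; returns the member names.
    """
    dest.mkdir(parents=True, exist_ok=True)
    # "r:*" lets tarfile detect gzip, bzip2 or xz compression on its own.
    with tarfile.open(archive, "r:*") as tar:
        names = tar.getnames()
        # Use the safer "data" extraction filter where this Python has it.
        kwargs = {"filter": "data"} if hasattr(tarfile, "data_filter") else {}
        tar.extractall(dest, **kwargs)
    return names
```

The same code works whether the archive was read on the device or
copied off it first, since it never needs the archive to be unpacked
on the machine that produced it.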