From: Jann Horn
Date: Tue, 10 Jul 2018 18:14:10 -0700
Subject: Re: [PATCH 24/32] vfs: syscall: Add fsopen() to prepare for superblock creation [ver #9]
To: Andy Lutomirski
Cc: David Howells, Al Viro, Linux API, linux-fsdevel@vger.kernel.org, Linus Torvalds, kernel list
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, Jul 10, 2018 at 4:59 PM Andy Lutomirski wrote:
>
> [cc Jann - you love this stuff]
>
> > On Jul 10, 2018, at 3:44 PM, David Howells wrote:
> >
> > Provide an fsopen() system call that starts the process of preparing to
> > create a superblock that will then be mountable, using an fd as a context
> > handle.  fsopen() is given the name of the filesystem that will be used:
> >
> >    int mfd = fsopen(const char *fsname, unsigned int flags);
>
> This is great in principle, but I think you’re seriously playing with fire with the API.
>
> >
> > where flags can be 0 or FSOPEN_CLOEXEC.
> >
> > For example:
> >
> >    sfd = fsopen("ext4", FSOPEN_CLOEXEC);
> >    write(sfd, "s /dev/sdb1"); // note I'm ignoring write's length arg
>
> Imagine some malicious program passes sfd as stdout to a setuid program. That program gets persuaded to write “s /etc/shadow”. What happens? You’re okay as long as *every single fs* gets it right, but that’s asking a lot.
>
> >    write(sfd, "o noatime");
> >    write(sfd, "o acl");
> >    write(sfd, "o user_attr");
> >    write(sfd, "o iversion");
> >    write(sfd, "o ");
> >    write(sfd, "r /my/container"); // root inside the fs
> >    write(sfd, "x create"); // create the superblock
>
> From cursory inspection of a bunch of the code, I think the expectation is that the actual device access happens in the “x” action. This is not okay. You can’t do this kind of thing in a write() handler, unless you somehow make every single access using f_cred, which is a real pain.
>
> I think the right solution is one of:
>
> (a) Pass a netlink-formatted blob to fsopen() and do the whole thing in one syscall. I don’t mean using netlink sockets -- just the nlattr format. Or you could use a different format. The part that matters is using just one syscall to do the whole thing.
>
> (b) Keep the current structure but use a new syscall instead of write().
>
> (c) Keep using write() but literally just buffer the data. Then have a new syscall to commit it. In other words, replace “x” with a syscall and call all the fs_context_operations helpers in that context instead of from write().

I also love ioctls, so I think you could also use an ioctl to do the
commit? You can do anything (well, almost anything) that you can do in
syscall context in ioctl context, too; and when you already have a
file descriptor of a specific type that you want to perform an
operation on, an ioctl works just fine.
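
To make the "buffer in write(), act only on commit" idea from option (c) concrete, here is a rough userspace model of it. This is only a sketch and not kernel code: struct fs_ctx_model, struct cred_model, ctx_write() and ctx_commit() are all hypothetical names invented for illustration, and the single may_open_device flag stands in for whatever permission checks a real filesystem would do. The point it tries to show is that the write path never looks at anyone's credentials, and the privileged decision is made against whoever calls the commit step (a new syscall or an ioctl).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace model of the proposal; none of this is kernel API. */
#define CTX_BUF_SIZE 256

/* Stand-in for the credentials of the task making a call (assumption). */
struct cred_model {
	bool may_open_device;
};

/* Stand-in for the context behind the fsopen() fd (assumption). */
struct fs_ctx_model {
	char buf[CTX_BUF_SIZE];	/* commands queued by write(), not yet acted on */
	size_t len;
	bool committed;
};

/*
 * Models the write() handler: it only stores the command bytes and never
 * touches a device, so it does not matter who the writer is.
 */
static int ctx_write(struct fs_ctx_model *ctx, const char *cmd)
{
	size_t n;

	assert(ctx != NULL && cmd != NULL);
	n = strlen(cmd) + 1;	/* keep the NUL as a command separator */
	if (ctx->committed || n > CTX_BUF_SIZE - ctx->len)
		return -1;
	memcpy(ctx->buf + ctx->len, cmd, n);
	ctx->len += n;
	return 0;
}

/*
 * Models the commit step (a dedicated syscall or an ioctl): all privileged
 * work happens here and is checked against the caller of commit.
 * Returns the number of commands applied, or -1 on refusal.
 */
static int ctx_commit(struct fs_ctx_model *ctx, const struct cred_model *caller)
{
	int applied = 0;
	size_t off;

	if (ctx->committed)
		return -1;
	for (off = 0; off < ctx->len; off += strlen(ctx->buf + off) + 1) {
		const char *cmd = ctx->buf + off;

		/* "s <source>" is where a device or file would be opened. */
		if (cmd[0] == 's' && !caller->may_open_device)
			return -1;
		applied++;
	}
	ctx->committed = true;
	return applied;
}
```

In this model, a setuid program that gets tricked into writing "s /etc/shadow" to the fd only adds bytes to the buffer; whether the source is ever opened depends on the credentials of whoever later calls ctx_commit().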