Received: by 2002:ac0:a5b6:0:0:0:0:0 with SMTP id m51-v6csp4355392imm; Mon, 18 Jun 2018 13:31:47 -0700 (PDT) X-Google-Smtp-Source: ADUXVKKdom8PTR4/FGeUa06MYpZfC5rXH6OqrhdZqQ2O8wqgthtj2HlsJuyiDjUAW5nOVqw/f5Be X-Received: by 2002:a63:91c8:: with SMTP id l191-v6mr12384707pge.53.1529353907409; Mon, 18 Jun 2018 13:31:47 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1529353907; cv=none; d=google.com; s=arc-20160816; b=ONBph+uH2MnjT+irr+eekuVJ6iawTVoexUhb/U5EbCQWp3jUyqcwzFznaMsckmu93N xphD12ftHWgW06j1PcB8ZgT5227ZSWzT5ZHjMxhhju25/6gp30NPzbXYaA0pbXsAnh88 ARHgq8+07e0NCS31pOruqmByC/kHs14hwKMhePy3nosRMgv3FLeO89ss7K8CBlOW0D/L tOcCVT1Qmov96KEy1omUrvYNqM9oMTDFGNsr50WM8O9vBE7ihZXbgk+3AtMBK6xOB9wj f50JO8BhOKf1Xdn0/SC7XJk1aHyvzi1zx4ne6t0Qtvjdk72TD0ZHpwjMYcaMnDRAWn/v jq6Q== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:message-id:date:content-transfer-encoding :mime-version:from:subject:cc:to:references:in-reply-to:organization :arc-authentication-results; bh=CmjGQcBhpQcP/3STU2FDPFIobvx/Q5bQzS7f27RH8Gc=; b=apIURyq8qzbd7BNfdvRv3hJGrQYetsorYFWUGhOd+/xwuolxnH/6uzISdXtIJDY0fr ALe+r8AGGJS48y3f2luo20U2kaVd2whBqYaU/VwKYrYrkDZ+EIjKSIEqOOTRuDAPNMBb fLcFOlJLPky07MHo0nJP5GXmu8UyDGOyOWZAR9hD6z+4Dc/dmaEa/maBUbujuELLnhCH lyfJXb7JqvGbzzqLcedH7ZPU7Zy/Uq2LoZBcrkUldogTeuS2QHPeK+AXFql8dLhzTwND MmcYuJwCHKh7YE8sk8BvsvxCV90qTKT49Pugfso2D8D5YH1GBUDG4G5/+5QRByR3b5Wr Un7A== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id y15-v6si12726099pgr.555.2018.06.18.13.31.33; Mon, 18 Jun 2018 13:31:47 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=fail (p=NONE sp=NONE dis=NONE) header.from=redhat.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S935531AbeFRUax convert rfc822-to-8bit (ORCPT + 99 others); Mon, 18 Jun 2018 16:30:53 -0400 Received: from mx3-rdu2.redhat.com ([66.187.233.73]:32962 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1754801AbeFRUaw (ORCPT ); Mon, 18 Jun 2018 16:30:52 -0400 Received: from smtp.corp.redhat.com (int-mx05.intmail.prod.int.rdu2.redhat.com [10.11.54.5]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id E844180125CE; Mon, 18 Jun 2018 20:30:51 +0000 (UTC) Received: from warthog.procyon.org.uk (ovpn-121-111.rdu2.redhat.com [10.10.121.111]) by smtp.corp.redhat.com (Postfix) with ESMTP id D4A497C42; Mon, 18 Jun 2018 20:30:50 +0000 (UTC) Organization: Red Hat UK Ltd. Registered Address: Red Hat UK Ltd, Amberley Place, 107-111 Peascod Street, Windsor, Berkshire, SI4 1TE, United Kingdom. Registered in England and Wales under Company Registration No. 3798903 In-Reply-To: <87in6kptqt.fsf@xmission.com> References: <87in6kptqt.fsf@xmission.com> <152720672288.9073.9868393448836301272.stgit@warthog.procyon.org.uk> To: ebiederm@xmission.com (Eric W. Biederman) Cc: dhowells@redhat.com, viro@zeniv.linux.org.uk, linux-fsdevel@vger.kernel.org, linux-afs@lists.infradead.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH 00/32] VFS: Introduce filesystem context [ver #8] From: David Howells MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: 8BIT Date: Mon, 18 Jun 2018 21:30:50 +0100 Message-ID: <3949.1529353850@warthog.procyon.org.uk> X-Scanned-By: MIMEDefang 2.79 on 10.11.54.5 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Mon, 18 Jun 2018 20:30:51 +0000 (UTC) X-Greylist: inspected by milter-greylist-4.5.16 (mx1.redhat.com [10.11.55.8]); Mon, 18 Jun 2018 20:30:51 +0000 (UTC) for IP:'10.11.54.5' DOMAIN:'int-mx05.intmail.prod.int.rdu2.redhat.com' HELO:'smtp.corp.redhat.com' FROM:'dhowells@redhat.com' RCPT:'' Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Eric W. Biederman wrote: > I have read through these patches and I noticed a significant issue. > > Today in mount_bdev we do something that looks like: > > mount_bdev(...) > { > s = sget(..., bdev); > if (s->s_root) { > /* Noop */ > } else { > err = fill_super(s, ...); > if (err) { > deactivate_locked_super(s); > return ERR_PTR(err); > } > s->s_flags |= SB_ATTIVE; > bdev->bd_super = s; > } > return dget(s->s_root); > } > > The key point is that we don't process the mount options at all if > a super block already exists in the kernel. Similar to what > your fscontext changes are doing (after parsing the options). Actually, no, that's not the case. The fscontext code *requires* you to parse the parameters *before* any attempt to access the superblock is made. Note that this will actually be a problem for, say, ext4 which passes a text string stored in the superblock through the parser *before* parsing the mount syscall data. Fun. I'm intending to deal with that particular case by having ext4 create multiple private contexts, one filled in from the user data, and then a second one filled in from the superblock string. These can then be validated one against the other before the super_block struct is published. And if the super_block struct already exists, the user's specified parameters can be validated against that. > Your fscontext changes do not improve upon this area of the mount api at > all and that concerns me. This is an area where people can and already > do shoot themselves in their feet. This *will* be dealt with, but I wanted to get the core changes upstream before tackling all the filesystems. The legacy wrapper is just that and should be got rid of when all the filesystems have been converted. > ... > > Creating a new mount and finding an old mount are the same operation in > the kernel today. This is fundamentally confusing. In the new api > could we please separate these two operations? > > Perhaps someting like: > x create > x find > > With the "x create" case failing if the filesystem already exists, > still allowing "x find"? And with the "x find" case failing if > the superblock is not already created in the kernel. No. What you're suggesting introduces a userspace-userspace and a userspace-kernel race - unless you're willing to let userspace lock against superblock creation by other parties. Further, some filesystems *have* to be parameterised before you can do the check for the superblock. Network filesystems, for example, where you have to set the network parameters upfront and the key to the superblock might not be known until you've queried the server. > That should make it clear to a userspace program what is going on > and give it a chance to mount a filesystem anyway. That said, I'm willing to provide a "fail if already extant" option if we think that's actually likely to be of use. However, you'd still have to provide parameters before the check can be made. > In a similar vein could we please clarify the rules for changing mount > options for an existing superblock are in the new api? You mean remount/reconfigure? Note that we have to provide backward compatibility with every single filesystem. > Today mount assumes that it has to provide all of the existing options to > reconfigure a mount. What people want to do and what most filesystems > support is just specifying the options that need to be changed. Can we > please make this the rule of how this are expected to work for fscontext? > That only changing mount options need to be specified before: "x > reconfigure" Fine by me - but it must *also* support every option being specified if that is what mount currently does. I don't really want to supply extra parsers if I can avoid it. Miklós, for example wanted a different, incompatible interface, so you'd do: write(fd, "o +foo"); write(fd, "o -bar"); write(fd, "x reconfig"); sort of thing to enable or disable options... but this assumes that options are binary and requires a separate parser to the one that does the initial configuration - and you still have to support the old remount data parse. I'm okay with specifying that you should just specify the options you want to change and that the normal way to 'disable' something is to prefix it with "no". I guess I could pass a flag through to indicate that this came from sys_mount(MS_REMOUNT) rather than the new method. David