2014-04-03 17:57:45

by Dave Reisner

[permalink] [raw]
Subject: Initramfs FSID altered in 3.14

Hi,

[This is a repost of a G+ post at Tejun's request]

With Linux 3.14, you might notice in /proc/self/mountinfo that your
root's parent FSID is now 0, instead of the 1 that it's been for the
last N years. Tejun wrote the change (9e30cc9595303b27b48) that caused
this, but the change comes in a rather innocuous way. Instead of an
internal kernel mount of sysfs being assigned 0, it's now the initramfs.

So far, this has already caused switch_root and findmnt (from
util-linux) to break, cp (from coreutils) to break when using the -x
flag in early userspace, and it's also been pointed out that systemd's
readahead code makes assumptions about a device number of 0.

Are we now supposed to go and change all the assumptions in userspace
about 0 being special? I'm conflicted. The kernel isn't supposed to
break userspace, but it seems to me that FSIDs were never something to
rely on -- similar to the block device numbering scheme.

Cheers,
Dave


2014-04-03 18:14:02

by Thomas Bächler

[permalink] [raw]
Subject: Re: Initramfs FSID altered in 3.14

Am 03.04.2014 19:57, schrieb Dave Reisner:
> Hi,
>
> [This is a repost of a G+ post at Tejun's request]
>
> With Linux 3.14, you might notice in /proc/self/mountinfo that your
> root's parent FSID is now 0, instead of the 1 that it's been for the
> last N years. Tejun wrote the change (9e30cc9595303b27b48) that caused
> this, but the change comes in a rather innocuous way. Instead of an
> internal kernel mount of sysfs being assigned 0, it's now the initramfs.
>
> So far, this has already caused switch_root and findmnt (from
> util-linux) to break, cp (from coreutils) to break when using the -x
> flag in early userspace, and it's also been pointed out that systemd's
> readahead code makes assumptions about a device number of 0.
>
> Are we now supposed to go and change all the assumptions in userspace
> about 0 being special? I'm conflicted. The kernel isn't supposed to
> break userspace, but it seems to me that FSIDs were never something to
> rely on -- similar to the block device numbering scheme.

Most of these bugs were not caused by rootfs' FSID being different from
1, but rather because there was a file system with FSID 0.

Only util-linux/switch_root assumed that rootfs always had exactly FSID
1 - which is IMO a wrong assumption.

However, tt seems that people have been assuming that st_dev > 0 for a
while. If we want to revert this in the kernel, this patch (untested)
should be sufficient:

diff --git a/fs/super.c b/fs/super.c
index 80d5cf2..d9fddde 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -802,7 +802,7 @@ void emergency_remount(void)
static DEFINE_IDA(unnamed_dev_ida);
static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
-static int unnamed_dev_start = 0; /* don't bother trying below it */
+static int unnamed_dev_start = 1; /* don't bother trying below it */
int get_anon_bdev(dev_t *p)
{



Attachments:
signature.asc (901.00 B)
OpenPGP digital signature

2014-04-03 18:57:38

by Pádraig Brady

[permalink] [raw]
Subject: Re: Initramfs FSID altered in 3.14

On 04/03/2014 06:57 PM, Dave Reisner wrote:
> Hi,
>
> [This is a repost of a G+ post at Tejun's request]
>
> With Linux 3.14, you might notice in /proc/self/mountinfo that your
> root's parent FSID is now 0, instead of the 1 that it's been for the
> last N years. Tejun wrote the change (9e30cc9595303b27b48) that caused
> this, but the change comes in a rather innocuous way. Instead of an
> internal kernel mount of sysfs being assigned 0, it's now the initramfs.
>
> So far, this has already caused switch_root and findmnt (from
> util-linux) to break, cp (from coreutils) to break when using the -x
> flag in early userspace, and it's also been pointed out that systemd's
> readahead code makes assumptions about a device number of 0.

For reference we've changed coreutils not to assume 0 is an invalid device ID:
http://git.sv.gnu.org/gitweb/?p=coreutils.git;a=commit;h=d0294ff3

> Are we now supposed to go and change all the assumptions in userspace
> about 0 being special? I'm conflicted. The kernel isn't supposed to
> break userspace, but it seems to me that FSIDs were never something to
> rely on -- similar to the block device numbering scheme.

I would say the kernel doesn't care what the value is,
so to ease compat worries just use >= 1.

cheers,
P?draig

2014-04-03 19:13:16

by Tejun Heo

[permalink] [raw]
Subject: Re: Initramfs FSID altered in 3.14

Hello, Dave.

On Thu, Apr 03, 2014 at 01:57:44PM -0400, Dave Reisner wrote:
> With Linux 3.14, you might notice in /proc/self/mountinfo that your
> root's parent FSID is now 0, instead of the 1 that it's been for the
> last N years. Tejun wrote the change (9e30cc9595303b27b48) that caused
> this, but the change comes in a rather innocuous way. Instead of an
> internal kernel mount of sysfs being assigned 0, it's now the initramfs.
>
> So far, this has already caused switch_root and findmnt (from
> util-linux) to break, cp (from coreutils) to break when using the -x
> flag in early userspace, and it's also been pointed out that systemd's
> readahead code makes assumptions about a device number of 0.
>
> Are we now supposed to go and change all the assumptions in userspace
> about 0 being special? I'm conflicted. The kernel isn't supposed to
> break userspace, but it seems to me that FSIDs were never something to
> rely on -- similar to the block device numbering scheme.

I think this is the same problem Alexandre Demers reported. Arch was
failing to boot with the commit. There's already a patch pending to
reinstate the internal mount but I think what Thomas is proposing -
just starting allocating FSID from 1 is a better solution. Alexandre,
can you please test Thomas' patch?

Thanks.

--
tejun

2014-04-03 19:16:27

by Tejun Heo

[permalink] [raw]
Subject: Re: Initramfs FSID altered in 3.14

Hello,

On Thu, Apr 03, 2014 at 08:13:50PM +0200, Thomas B?chler wrote:
> Most of these bugs were not caused by rootfs' FSID being different from
> 1, but rather because there was a file system with FSID 0.
>
> Only util-linux/switch_root assumed that rootfs always had exactly FSID
> 1 - which is IMO a wrong assumption.
>
> However, tt seems that people have been assuming that st_dev > 0 for a
> while. If we want to revert this in the kernel, this patch (untested)
> should be sufficient:
>
> diff --git a/fs/super.c b/fs/super.c
> index 80d5cf2..d9fddde 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -802,7 +802,7 @@ void emergency_remount(void)
> static DEFINE_IDA(unnamed_dev_ida);
> static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
> -static int unnamed_dev_start = 0; /* don't bother trying below it */
> +static int unnamed_dev_start = 1; /* don't bother trying below it */
> int get_anon_bdev(dev_t *p)
> {

Alexandre, this is the one line change that should fix it. Can you
please test it?

Thomas, can you please write proper patch description with reference
to the following thread and stable # 3.14 tag?

http://lkml.kernel.org/g/CAPEhTTFP3N-ReasmgL5n82mve8p8M3crqmaMvzV+F2p5JCSRbQ@mail.gmail.com

Thanks.

--
tejun

2014-04-03 19:20:33

by Theodore Ts'o

[permalink] [raw]
Subject: Re: Initramfs FSID altered in 3.14

I agree with Tejun that we should fix it up, and not force
distributions to have to send out emergency releases of various
userspace utilities --- especially since the fix is so simple.

- Ted

2014-04-03 19:50:21

by Thomas Bächler

[permalink] [raw]
Subject: [PATCH] fs: Don't return 0 from get_anon_bdev

Commit 9e30cc9595303b27b48 removed an internal mount. This
has the side-effect that rootfs now has FSID 0. Many
userspace utilities assume that st_dev in struct stat
is never 0, so this change breaks a number of tools in
early userspace.

Since we don't know how many userspace programs are affected,
make sure that FSID is at least 1.

References: http://article.gmane.org/gmane.linux.kernel/1666905
References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
Cc: 3.14 <[email protected]>
Signed-off-by: Thomas Bächler <[email protected]>
---
fs/super.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index 80d5cf2..d9fddde 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -802,7 +802,7 @@ void emergency_remount(void)

static DEFINE_IDA(unnamed_dev_ida);
static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
-static int unnamed_dev_start = 0; /* don't bother trying below it */
+static int unnamed_dev_start = 1; /* don't bother trying below it */

int get_anon_bdev(dev_t *p)
{
--
1.9.1

2014-04-03 19:52:06

by Tejun Heo

[permalink] [raw]
Subject: Re: [PATCH] fs: Don't return 0 from get_anon_bdev

On Thu, Apr 03, 2014 at 09:49:55PM +0200, Thomas B?chler wrote:
> Commit 9e30cc9595303b27b48 removed an internal mount. This
> has the side-effect that rootfs now has FSID 0. Many
> userspace utilities assume that st_dev in struct stat
> is never 0, so this change breaks a number of tools in
> early userspace.
>
> Since we don't know how many userspace programs are affected,
> make sure that FSID is at least 1.
>
> References: http://article.gmane.org/gmane.linux.kernel/1666905
> References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
> Cc: 3.14 <[email protected]>
> Signed-off-by: Thomas B?chler <[email protected]>

Provided Alexandre confirms the fix.

Acked-by: Tejun Heo <[email protected]>

> ---
> fs/super.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 80d5cf2..d9fddde 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -802,7 +802,7 @@ void emergency_remount(void)
>
> static DEFINE_IDA(unnamed_dev_ida);
> static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
> -static int unnamed_dev_start = 0; /* don't bother trying below it */
> +static int unnamed_dev_start = 1; /* don't bother trying below it */

Maybe a comment saying that userland considers fsid 0 to be invalid
would be nice?

Thanks.

--
tejun

2014-04-03 19:55:51

by Thomas Bächler

[permalink] [raw]
Subject: [PATCHv2] fs: Don't return 0 from get_anon_bdev

Commit 9e30cc9595303b27b48 removed an internal mount. This
has the side-effect that rootfs now has FSID 0. Many
userspace utilities assume that st_dev in struct stat
is never 0, so this change breaks a number of tools in
early userspace.

Since we don't know how many userspace programs are affected,
make sure that FSID is at least 1.

References: http://article.gmane.org/gmane.linux.kernel/1666905
References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
Cc: 3.14 <[email protected]>
Signed-off-by: Thomas Bächler <[email protected]>
---
fs/super.c | 5 ++++-
1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/fs/super.c b/fs/super.c
index 80d5cf2..7624267 100644
--- a/fs/super.c
+++ b/fs/super.c
@@ -802,7 +802,10 @@ void emergency_remount(void)

static DEFINE_IDA(unnamed_dev_ida);
static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
-static int unnamed_dev_start = 0; /* don't bother trying below it */
+/* Many userspace utilities consider an FSID of 0 invalid.
+ * Always return at least 1 from get_anon_bdev.
+ */
+static int unnamed_dev_start = 1;

int get_anon_bdev(dev_t *p)
{
--
1.9.1

2014-04-03 20:22:03

by H. Peter Anvin

[permalink] [raw]
Subject: Re: [PATCH] fs: Don't return 0 from get_anon_bdev

On 04/03/2014 12:49 PM, Thomas Bächler wrote:
> Commit 9e30cc9595303b27b48 removed an internal mount. This
> has the side-effect that rootfs now has FSID 0. Many
> userspace utilities assume that st_dev in struct stat
> is never 0, so this change breaks a number of tools in
> early userspace.
>
> Since we don't know how many userspace programs are affected,
> make sure that FSID is at least 1.
>
> References: http://article.gmane.org/gmane.linux.kernel/1666905
> References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
> Cc: 3.14 <[email protected]>
> Signed-off-by: Thomas Bächler <[email protected]>
> ---
> fs/super.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>

Acked-by: H. Peter Anvin <[email protected]>

It is worth noting that zero has been documented as a null device number
since at least 1995, probably more like 1993.

-hpa

2014-04-03 20:30:33

by Greg KH

[permalink] [raw]
Subject: Re: [PATCHv2] fs: Don't return 0 from get_anon_bdev

On Thu, Apr 03, 2014 at 09:55:37PM +0200, Thomas B?chler wrote:
> Commit 9e30cc9595303b27b48 removed an internal mount. This
> has the side-effect that rootfs now has FSID 0. Many
> userspace utilities assume that st_dev in struct stat
> is never 0, so this change breaks a number of tools in
> early userspace.
>
> Since we don't know how many userspace programs are affected,
> make sure that FSID is at least 1.
>
> References: http://article.gmane.org/gmane.linux.kernel/1666905
> References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
> Cc: 3.14 <[email protected]>
> Signed-off-by: Thomas B?chler <[email protected]>
> ---
> fs/super.c | 5 ++++-
> 1 file changed, 4 insertions(+), 1 deletion(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 80d5cf2..7624267 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -802,7 +802,10 @@ void emergency_remount(void)
>
> static DEFINE_IDA(unnamed_dev_ida);
> static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
> -static int unnamed_dev_start = 0; /* don't bother trying below it */
> +/* Many userspace utilities consider an FSID of 0 invalid.
> + * Always return at least 1 from get_anon_bdev.
> + */
> +static int unnamed_dev_start = 1;
>
> int get_anon_bdev(dev_t *p)
> {

Very nice, I'll go queue this one up instead of the other patch from
Tejun.

greg k-h

2014-04-03 22:02:19

by Alexandre Demers

[permalink] [raw]
Subject: Re: Initramfs FSID altered in 3.14

I'll be testing it later tonight. It was either that or I would have
been completing my Qemu image. ;)

Stay tuned.

Alexandre

On Thu, Apr 3, 2014 at 3:20 PM, Theodore Ts'o <[email protected]> wrote:
> I agree with Tejun that we should fix it up, and not force
> distributions to have to send out emergency releases of various
> userspace utilities --- especially since the fix is so simple.
>
> - Ted

2014-04-03 23:41:22

by Alexandre Demers

[permalink] [raw]
Subject: Re: [PATCH] fs: Don't return 0 from get_anon_bdev

It works over here, tested on 3.14-rc8 which was previously failing. You
have my

Tested-by: Alexandre Demers

Alexandre Demers

On 04/03/2014 03:49 PM, Thomas Bächler wrote:
> Commit 9e30cc9595303b27b48 removed an internal mount. This
> has the side-effect that rootfs now has FSID 0. Many
> userspace utilities assume that st_dev in struct stat
> is never 0, so this change breaks a number of tools in
> early userspace.
>
> Since we don't know how many userspace programs are affected,
> make sure that FSID is at least 1.
>
> References: http://article.gmane.org/gmane.linux.kernel/1666905
> References: http://permalink.gmane.org/gmane.linux.utilities.util-linux-ng/8557
> Cc: 3.14 <[email protected]>
> Signed-off-by: Thomas Bächler <[email protected]>
> ---
> fs/super.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/super.c b/fs/super.c
> index 80d5cf2..d9fddde 100644
> --- a/fs/super.c
> +++ b/fs/super.c
> @@ -802,7 +802,7 @@ void emergency_remount(void)
>
> static DEFINE_IDA(unnamed_dev_ida);
> static DEFINE_SPINLOCK(unnamed_dev_lock);/* protects the above */
> -static int unnamed_dev_start = 0; /* don't bother trying below it */
> +static int unnamed_dev_start = 1; /* don't bother trying below it */
>
> int get_anon_bdev(dev_t *p)
> {