2001-11-03 20:24:05

by Simon Kirby

[permalink] [raw]
Subject: Something broken in sys_swapon

Looking through sys_swapon() for the culprit of my corruption after a
nonexistent swap device is added (/dev/hdb2 when /dev/hda is my only hard
drive and hdc and hdd are cdroms), I notice a things that look a bit odd.

First, set_blocksize(dev, PAGE_SIZE) is done twice in the S_ISBLK block
(it should only be needed once?), but furthermore:

kdev_t dev = swap_inode->i_rdev;
struct block_device_operations *bdops;

p->swap_device = dev;
set_blocksize(dev, PAGE_SIZE);

I don't know much at all about the inode structure, but doesn't this set
the block size of the originating filesystem containing the inode rather
than the block device that inode happens to be pointing to? That would
definitely explain the corruption I see if my file system block size is
changed (/ is a 2KB block-sized EXT2 filesystem).

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [email protected] ][ [email protected] ]
[ Opinions expressed are not necessarily those of my employers. ]


2001-11-03 20:31:45

by Alexander Viro

[permalink] [raw]
Subject: Re: Something broken in sys_swapon



On Sat, 3 Nov 2001, Simon Kirby wrote:

> kdev_t dev = swap_inode->i_rdev;
> struct block_device_operations *bdops;
>
> p->swap_device = dev;
> set_blocksize(dev, PAGE_SIZE);
>
> I don't know much at all about the inode structure, but doesn't this set
> the block size of the originating filesystem containing the inode rather
> than the block device that inode happens to be pointing to? That would

man 2 stat

i_rdev is equivalent of st_rdev, i_dev - of st_dev.

2001-11-03 21:14:09

by Simon Kirby

[permalink] [raw]
Subject: Re: Something broken in sys_swapon

On Sat, Nov 03, 2001 at 03:31:25PM -0500, Alexander Viro wrote:

> On Sat, 3 Nov 2001, Simon Kirby wrote:
>
> > kdev_t dev = swap_inode->i_rdev;
> > struct block_device_operations *bdops;
> >
> > p->swap_device = dev;
> > set_blocksize(dev, PAGE_SIZE);
> >
> > I don't know much at all about the inode structure, but doesn't this set
> > the block size of the originating filesystem containing the inode rather
> > than the block device that inode happens to be pointing to? That would
>
> man 2 stat
>
> i_rdev is equivalent of st_rdev, i_dev - of st_dev.

Okay, would you see any other reason why my root filesystem would
completely blow up after swapon /dev/hdb2 when /dev/hdb no longer exists?

All I did was remove /dev/hdb and forget to take the swap entry out of
/etc/fstab. On boot I got "attempt to access beyond end of device"
messages looping endlessly. I tried once with / mounted rw (it looks
like Debian still has / mounted ro when it turns on swap), and lots of
filesystem corruption resulted.

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [email protected] ][ [email protected] ]
[ Opinions expressed are not necessarily those of my employers. ]

2001-11-04 11:46:26

by Mike Black

[permalink] [raw]
Subject: Re: Something broken in sys_swapon

I think I see a potential problem (I'm looking at 2.4.10) in fs/namei.c.

If path_init() returns 0 in __user_walk() then err is not set to anything
(it defaults to 0)

int __user_walk(const char *name, unsigned flags, struct nameidata *nd)
{
char *tmp;
int err;

tmp = getname(name);
err = PTR_ERR(tmp);
if (!IS_ERR(tmp)) {
err = 0;
if (path_init(tmp, flags, nd))
err = path_walk(tmp, nd);
putname(tmp);
}
return err;
}

----- Original Message -----
From: "Simon Kirby" <[email protected]>
To: "Alexander Viro" <[email protected]>
Cc: <[email protected]>
Sent: Saturday, November 03, 2001 4:13 PM
Subject: Re: Something broken in sys_swapon


> On Sat, Nov 03, 2001 at 03:31:25PM -0500, Alexander Viro wrote:
>
> > On Sat, 3 Nov 2001, Simon Kirby wrote:
> >
> > > kdev_t dev = swap_inode->i_rdev;
> > > struct block_device_operations *bdops;
> > >
> > > p->swap_device = dev;
> > > set_blocksize(dev, PAGE_SIZE);
> > >
> > > I don't know much at all about the inode structure, but doesn't this
set
> > > the block size of the originating filesystem containing the inode
rather
> > > than the block device that inode happens to be pointing to? That
would
> >
> > man 2 stat
> >
> > i_rdev is equivalent of st_rdev, i_dev - of st_dev.
>
> Okay, would you see any other reason why my root filesystem would
> completely blow up after swapon /dev/hdb2 when /dev/hdb no longer exists?
>
> All I did was remove /dev/hdb and forget to take the swap entry out of
> /etc/fstab. On boot I got "attempt to access beyond end of device"
> messages looping endlessly. I tried once with / mounted rw (it looks
> like Debian still has / mounted ro when it turns on swap), and lots of
> filesystem corruption resulted.
>
> Simon-
>
> [ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
> [ [email protected] ][ [email protected] ]
> [ Opinions expressed are not necessarily those of my employers. ]
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

2001-11-04 12:05:26

by Alexander Viro

[permalink] [raw]
Subject: Re: Something broken in sys_swapon



On Sun, 4 Nov 2001, Mike Black wrote:

> I think I see a potential problem (I'm looking at 2.4.10) in fs/namei.c.
>
> If path_init() returns 0 in __user_walk() then err is not set to anything
> (it defaults to 0)

path_init() returns 0 only if you are in altroot situation (running with
personality that has its own /usr/gnuemul/<...>/) _and_ it had succeeded
doing lookup from the altroot. Otherwise it returns non-zero and lets
path_walk() do the right thing.

2001-11-04 20:23:03

by Simon Kirby

[permalink] [raw]
Subject: Re: Something broken in sys_swapon

Hmm...Guessing that it had something to do with set_blocksize, I added
this patch to the kernel:

--- linux/fs/block_dev.c.orig Sun Nov 4 11:35:05 2001
+++ linux/fs/block_dev.c Sun Nov 4 11:54:39 2001
@@ -95,6 +95,9 @@
/* Ok, we're actually changing the blocksize.. */
bdev = bdget(dev);
sync_buffers(dev, 2);
+ printk("Changing device %02x:%02x block size from %u to %u\n",
+ MAJOR(dev),MINOR(dev),
+ blksize_size[MAJOR(dev)][MINOR(dev)],size);
blksize_size[MAJOR(dev)][MINOR(dev)] = size;
bdev->bd_inode->i_blkbits = blksize_bits(size);
kill_bdev(bdev);

And tried booting again with /dev/hdb2 swap in my fstab...

NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
Changing device 03:02 block size from 1024 to 2048
VFS: Mounted root (ext2 filesystem) readonly.
Freeing unused kernel memory: 224k freed
Changing device 03:03 block size from 1024 to 4096
Adding Swap: 265064k swap-space (priority 0)
Changing device 03:42 block size from 10739452 to 4096
attempt to access beyond end of device
03:02: rw=0, want=10555924, limit=4096
attempt to access beyond end of device
03:02: rw=0, want=10555926, limit=4096

It looks like it's not changing 03:02 (/dev/hda2, my root fs), which is
good, but it seems to be trying to change 03:42 (/dev/hdb2) even though
that device doesn't exist (which could also be fine, but it looks like it
wasn't initialized). I'm guessing blksize_size is statically allocated,
so this isn't a problem.

I wonder what else could cause 03:02 to barf... Hmm.

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [email protected] ][ [email protected] ]
[ Opinions expressed are not necessarily those of my employers. ]

2001-11-04 20:33:23

by Simon Kirby

[permalink] [raw]
Subject: Re: Something broken in sys_swapon

Aha! I then tried this patch:

--- linux/fs/block_dev.c.orig Sun Nov 4 11:35:05 2001
+++ linux/fs/block_dev.c Sun Nov 4 12:21:51 2001
@@ -84,6 +84,15 @@
}

oldsize = blksize_size[MAJOR(dev)][MINOR(dev)];
+
+ printk("Changing device %02x:%02x block size from %u to %u\n",
+ MAJOR(dev),MINOR(dev),
+ oldsize,size);
+ if (MAJOR(dev) == 0x03 && MINOR(dev) == 0x42){
+ printk("...Refused.\n");
+ return 0;
+ }
+
if (oldsize == size)
return 0;

...And now my system boots fine with /dev/hdb2 swap in the fstab.

In fact, I tried /dev/hdb1 after and then I couldn't read any more from
/boot (which is /dev/hda1). So, some sort of wraparound is happening
here. Why would blksize_size[3][2] be affected by blksize_size[3][0x42]?

Simon-

[ Stormix Technologies Inc. ][ NetNation Communications Inc. ]
[ [email protected] ][ [email protected] ]
[ Opinions expressed are not necessarily those of my employers. ]