2014-06-19 09:30:16

by Alan

[permalink] [raw]
Subject: Cannot partition 32GB disk on a 32bit machine

The block code has 32bit cleanness problems with the iterator. This
prevents things like partitioning a 32GB volume on a 32bit system.

I hit this with a volume of exactly 32GB in size (easy to duplicate with
virtual machines). Tracing at step by step through the kernel I found
the problem lines in blkdev_read_iter which truncates the size value
into a 32bit value when setting up the iterator.

The hack below if applied "fixes" this and I think demonstrates that
this is the problem spot.

What I'm less clear on is what the correct fix for this should be.


diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6d72746..3792406 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1603,6 +1603,9 @@ static ssize_t blkdev_read_iter(struct kiocb
*iocb, struct iov_iter *to)

size -= pos;
iov_iter_truncate(to, size);
+ /* Fix up for 32bit boxes for now */
+ if (to->count < size)
+ to->count = size;
return generic_file_read_iter(iocb, to);
}



2014-06-19 09:33:31

by Cox, Alan

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)

On Thu, 2014-06-19 at 10:30 +0100, Alan Cox wrote:
> The block code has 32bit cleanness problems with the iterator. This
> prevents things like partitioning a 32GB volume on a 32bit system.
>
> I hit this with a volume of exactly 32GB in size (easy to duplicate with
> virtual machines). Tracing at step by step through the kernel I found
> the problem lines in blkdev_read_iter which truncates the size value
> into a 32bit value when setting up the iterator.
>
> The hack below if applied "fixes" this and I think demonstrates that
> this is the problem spot.
>
> What I'm less clear on is what the correct fix for this should be.

block: fix inability to partition large volumes

From: Alan <[email protected]>

The block code has 32bit cleanness problems with the iterator. This
prevents things like partitioning a 32GB volume on a 32bit system.

This is a simple initial "fix" that clips the problem cases so get
behaviour that is at least sane and trivially backportable.

Signed-off-by: Alan Cox <[email protected]>
---
fs/block_dev.c | 3 +++
1 file changed, 3 insertions(+)

diff --git a/fs/block_dev.c b/fs/block_dev.c
index 6d72746..bef2414 100644
--- a/fs/block_dev.c
+++ b/fs/block_dev.c
@@ -1603,6 +1603,9 @@ static ssize_t blkdev_read_iter(struct kiocb
*iocb, struct iov_iter *to)

size -= pos;
iov_iter_truncate(to, size);
+ /* Fix up for 32bit boxes for now */
+ if (to->count < size)
+ to->count = 0xFFFFFFFF;
return generic_file_read_iter(iocb, to);
}


---------------------------------------------------------------------
Intel Corporation (UK) Limited
Registered No. 1134945 (England)
Registered Office: Pipers Way, Swindon SN3 1RJ
VAT No: 860 2173 47

This e-mail and any attachments may contain confidential material for
the sole use of the intended recipient(s). Any review or distribution
by others is strictly prohibited. If you are not the intended
recipient, please contact the sender and delete all copies.
????{.n?+???????+%?????ݶ??w??{.n?+????{??G?????{ay?ʇڙ?,j??f???h?????????z_??(?階?ݢj"???m??????G????????????&???~???iO???z??v?^?m???? ????????I?

2014-06-19 18:44:04

by James Bottomley

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)

On Thu, 2014-06-19 at 09:33 +0000, Cox, Alan wrote:
> On Thu, 2014-06-19 at 10:30 +0100, Alan Cox wrote:
> > The block code has 32bit cleanness problems with the iterator. This
> > prevents things like partitioning a 32GB volume on a 32bit system.
> >
> > I hit this with a volume of exactly 32GB in size (easy to duplicate with
> > virtual machines). Tracing at step by step through the kernel I found
> > the problem lines in blkdev_read_iter which truncates the size value
> > into a 32bit value when setting up the iterator.
> >
> > The hack below if applied "fixes" this and I think demonstrates that
> > this is the problem spot.
> >
> > What I'm less clear on is what the correct fix for this should be.
>
> block: fix inability to partition large volumes
>
> From: Alan <[email protected]>
>
> The block code has 32bit cleanness problems with the iterator. This
> prevents things like partitioning a 32GB volume on a 32bit system.
>
> This is a simple initial "fix" that clips the problem cases so get
> behaviour that is at least sane and trivially backportable.
>
> Signed-off-by: Alan Cox <[email protected]>
> ---
> fs/block_dev.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 6d72746..bef2414 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1603,6 +1603,9 @@ static ssize_t blkdev_read_iter(struct kiocb
> *iocb, struct iov_iter *to)
>
> size -= pos;
> iov_iter_truncate(to, size);
> + /* Fix up for 32bit boxes for now */
> + if (to->count < size)
> + to->count = 0xFFFFFFFF;
> return generic_file_read_iter(iocb, to);
> }
>
>
> ---------------------------------------------------------------------
> Intel Corporation (UK) Limited
> Registered No. 1134945 (England)
> Registered Office: Pipers Way, Swindon SN3 1RJ
> VAT No: 860 2173 47
>
> This e-mail and any attachments may contain confidential material for
> the sole use of the intended recipient(s). Any review or distribution
> by others is strictly prohibited. If you are not the intended
> recipient, please contact the sender and delete all copies.
> NrybXǧv^)޺{.n+{~^b^nrzh&Gh(階ݢj"mzޖfh~m

Wow that's junk issued by an Exchange server ... Alan, really ...

Do you have CONFIG_LBD turned on? That's supposed to let us go up to
about 16TB before we run out of page index bits. If you do, we might
have a variable that's int but should be sector_t somewhere.

James

2014-06-19 20:51:48

by Andrew Morton

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)

On Thu, 19 Jun 2014 11:43:57 -0700 James Bottomley <[email protected]> wrote:

> Do you have CONFIG_LBD turned on? That's supposed to let us go up to
> about 16TB before we run out of page index bits. If you do, we might
> have a variable that's int but should be sector_t somewhere.

I assume the problem is that iov_iter uses size_t (which is 32-bit on
32-bit) and we're trying to use iov_iter for blockdevs.

All this code is new and I've NFI how it's supposed to work and cannot
find a description anywhere.

cc's viro and runs away.

2014-06-19 21:12:07

by Paul Bolle

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)

On Thu, 2014-06-19 at 11:43 -0700, James Bottomley wrote:
> On Thu, 2014-06-19 at 09:33 +0000, Cox, Alan wrote:
> > NrybXǧv^)޺{.n+{~^b^nrzh&Gh(階ݢj"mzޖfh~m
>
> Wow that's junk issued by an Exchange server ... Alan, really ...

I'm guessing that would be your client (Evolution) trying to decode
lkml's footer as if it were base64. I use Evolution too and notice that
all the time.


Paul Bolle

2014-06-19 21:29:25

by Andries E. Brouwer

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)

On Thu, Jun 19, 2014 at 09:33:26AM +0000, Cox, Alan wrote:
> On Thu, 2014-06-19 at 10:30 +0100, Alan Cox wrote:
> > The block code has 32bit cleanness problems with the iterator. This
> > prevents things like partitioning a 32GB volume on a 32bit system.
> >
> > I hit this with a volume of exactly 32GB in size (easy to duplicate with
> > virtual machines). Tracing at step by step through the kernel I found
> > the problem lines in blkdev_read_iter which truncates the size value
> > into a 32bit value when setting up the iterator.
>
> This is a simple initial "fix" that clips the problem cases so get
> behaviour that is at least sane and trivially backportable.
>
> Signed-off-by: Alan Cox <[email protected]>
> ---
> fs/block_dev.c | 3 +++
> 1 file changed, 3 insertions(+)
>
> diff --git a/fs/block_dev.c b/fs/block_dev.c
> index 6d72746..bef2414 100644
> --- a/fs/block_dev.c
> +++ b/fs/block_dev.c
> @@ -1603,6 +1603,9 @@ static ssize_t blkdev_read_iter(struct kiocb
> *iocb, struct iov_iter *to)
>
> size -= pos;
> iov_iter_truncate(to, size);
> + /* Fix up for 32bit boxes for now */
> + if (to->count < size)
> + to->count = 0xFFFFFFFF;
> return generic_file_read_iter(iocb, to);
> }


It is ages ago that I last looked at such things.
Certainly I have partitioned 160GB+ disks on 32-bit machines, years ago,
so maybe the problem is due to recent bitrot, e.g. the use of a size_t
instead of a loff_t somewhere.

Fetched linux-3.15.1 and linux-3.16-rc1 tar balls.
The diff shows

-static ssize_t blkdev_aio_read(struct kiocb *iocb, const struct iovec *iov,
- unsigned long nr_segs, loff_t pos)
+static ssize_t blkdev_read_iter(struct kiocb *iocb, struct iov_iter *to)
{
struct file *file = iocb->ki_filp;
struct inode *bd_inode = file->f_mapping->host;
loff_t size = i_size_read(bd_inode);
+ loff_t pos = iocb->ki_pos;

if (pos >= size)
return 0;

size -= pos;
- if (size < iocb->ki_nbytes)
- nr_segs = iov_shorten((struct iovec *)iov, nr_segs, size);
- return generic_file_aio_read(iocb, iov, nr_segs, pos);
+ iov_iter_truncate(to, size);
+ return generic_file_read_iter(iocb, to);
}

that a test of size was deleted.

In older kernels the test was

if (size < INT_MAX)
nr_segs = iov_shorten((struct iovec *)iov, nr_segs, size);

which more clearly shows that this is because the last arg of iov_shorten()
is a size_t. In later source this is called iov_iter_truncate,

static inline void iov_iter_truncate(struct iov_iter *i, size_t count)

still with a size_t as lat arg, so probably the test is still needed.

Andries

2014-06-19 23:04:47

by Alan

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)


> Wow that's junk issued by an Exchange server ... Alan, really ...

Blame evolution. It apparently thinks that if you follow up to your own
email from one address it should randomly switch to another.

> Do you have CONFIG_LBD turned on? That's supposed to let us go up to
> about 16TB before we run out of page index bits. If you do, we might
> have a variable that's int but should be sector_t somewhere.

LBDAF is set


2014-06-19 23:44:10

by James Bottomley

[permalink] [raw]
Subject: Re: Cannot partition 32GB disk on a 32bit machine (correct version of the patch this time)

On Fri, 2014-06-20 at 00:04 +0100, Alan Cox wrote:
> > Wow that's junk issued by an Exchange server ... Alan, really ...
>
> Blame evolution. It apparently thinks that if you follow up to your own
> email from one address it should randomly switch to another.

so you @linux.intel.com is sane and you @intel.com is exchange ...

> > Do you have CONFIG_LBD turned on? That's supposed to let us go up to
> > about 16TB before we run out of page index bits. If you do, we might
> > have a variable that's int but should be sector_t somewhere.
>
> LBDAF is set

OK, so as Andrew said, it looks like Viro's conversion from aio to
iterators has a length and offset problem on 32 bits.

my suspicion is that iov_offset and count in struct iov_iter have to
become loff_t quantities ... plus a lot of other size_t to loff_t
conversions. Hopefully Viro will fix it as a matter of urgency.

James