2015-07-17 07:27:14

by Hannes Reinecke

Subject: [PATCH 0/4] loop: enable different physical blocksizes

Currently the loop driver just simulates 512-byte blocks. When
creating images for virtual machines it might be required to use
a different physical blocksize (eg 4k for S/390 DASD).
This patchset extends the current LOOP_SET_STATUS64 ioctl to
set the logical and physical blocksize by re-using the existing
'init' fields, which are currently unused.

As usual, comments and reviews are welcome.

Hannes Reinecke (4):
loop: Enable correct physical blocksize
loop: Remove unused 'bdev' argument from loop_set_capacity
loop: Add 'lo_logical_blocksize'
loop: Pass logical blocksize in 'lo_init[0]' ioctl field

drivers/block/loop.c | 35 ++++++++++++++++++++++++++++++-----
drivers/block/loop.h | 1 +
include/uapi/linux/loop.h | 1 +
3 files changed, 32 insertions(+), 5 deletions(-)

--
1.8.5.2


2015-07-17 07:27:17

by Hannes Reinecke

Subject: [PATCH 1/4] loop: Enable correct physical blocksize

When running on files the physical blocksize is actually 4k,
so we should be announcing it as such. This is enabled with
a new LO_FLAGS_BLOCKSIZE flag value to the existing ioctl.

Signed-off-by: Hannes Reinecke <[email protected]>
---
drivers/block/loop.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index f7a4c9d..62d74c0 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -177,6 +177,8 @@ figure_loop_size(struct loop_device *lo, loff_t offset, loff_t sizelimit)
lo->lo_offset = offset;
if (lo->lo_sizelimit != sizelimit)
lo->lo_sizelimit = sizelimit;
+ if (lo->lo_flags & LO_FLAGS_BLOCKSIZE)
+ blk_queue_physical_block_size(lo->lo_queue, lo->lo_blocksize);
set_capacity(lo->lo_disk, x);
bd_set_size(bdev, (loff_t)get_capacity(bdev->bd_disk) << 9);
/* let user-space know about the new size */
@@ -758,7 +760,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,

lo->lo_blocksize = lo_blocksize;
lo->lo_device = bdev;
- lo->lo_flags = lo_flags;
+ lo->lo_flags |= lo_flags;
lo->lo_backing_file = file;
lo->transfer = NULL;
lo->ioctl = NULL;
@@ -769,6 +771,8 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
if (!(lo_flags & LO_FLAGS_READ_ONLY) && file->f_op->fsync)
blk_queue_flush(lo->lo_queue, REQ_FLUSH);

+ if (lo->lo_flags & LO_FLAGS_BLOCKSIZE)
+ blk_queue_physical_block_size(lo->lo_queue, lo->lo_blocksize);
set_capacity(lo->lo_disk, size);
bd_set_size(bdev, size << 9);
loop_sysfs_init(lo);
@@ -951,6 +955,9 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (err)
return err;

+ if (info->lo_flags & LO_FLAGS_BLOCKSIZE)
+ lo->lo_flags |= LO_FLAGS_BLOCKSIZE;
+
if (lo->lo_offset != info->lo_offset ||
lo->lo_sizelimit != info->lo_sizelimit)
if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit))
--
1.8.5.2

2015-07-17 07:27:15

by Hannes Reinecke

Subject: [PATCH 2/4] loop: Remove unused 'bdev' argument from loop_set_capacity

Signed-off-by: Hannes Reinecke <[email protected]>
---
drivers/block/loop.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 62d74c0..fce13bd 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -1137,7 +1137,7 @@ loop_get_status64(struct loop_device *lo, struct loop_info64 __user *arg) {
return err;
}

-static int loop_set_capacity(struct loop_device *lo, struct block_device *bdev)
+static int loop_set_capacity(struct loop_device *lo)
{
if (unlikely(lo->lo_state != Lo_bound))
return -ENXIO;
@@ -1186,7 +1186,7 @@ static int lo_ioctl(struct block_device *bdev, fmode_t mode,
case LOOP_SET_CAPACITY:
err = -EPERM;
if ((mode & FMODE_WRITE) || capable(CAP_SYS_ADMIN))
- err = loop_set_capacity(lo, bdev);
+ err = loop_set_capacity(lo);
break;
default:
err = lo->ioctl ? lo->ioctl(lo, cmd, arg) : -EINVAL;
--
1.8.5.2

2015-07-17 07:27:13

by Hannes Reinecke

Subject: [PATCH 3/4] loop: Add 'lo_logical_blocksize'

Add a new field 'lo_logical_blocksize' to hold the logical
blocksize of the loop device.

Signed-off-by: Hannes Reinecke <[email protected]>
---
drivers/block/loop.c | 14 +++++++++++---
drivers/block/loop.h | 1 +
include/uapi/linux/loop.h | 1 +
3 files changed, 13 insertions(+), 3 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index fce13bd..321f296 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -177,8 +177,11 @@ figure_loop_size(struct loop_device *lo, loff_t offset, loff_t sizelimit)
lo->lo_offset = offset;
if (lo->lo_sizelimit != sizelimit)
lo->lo_sizelimit = sizelimit;
- if (lo->lo_flags & LO_FLAGS_BLOCKSIZE)
+ if (lo->lo_flags & LO_FLAGS_BLOCKSIZE) {
blk_queue_physical_block_size(lo->lo_queue, lo->lo_blocksize);
+ blk_queue_logical_block_size(lo->lo_queue,
+ lo->lo_logical_blocksize);
+ }
set_capacity(lo->lo_disk, x);
bd_set_size(bdev, (loff_t)get_capacity(bdev->bd_disk) << 9);
/* let user-space know about the new size */
@@ -666,6 +669,7 @@ static void loop_config_discard(struct loop_device *lo)
struct file *file = lo->lo_backing_file;
struct inode *inode = file->f_mapping->host;
struct request_queue *q = lo->lo_queue;
+ int lo_bits = blksize_bits(lo->lo_logical_blocksize);

/*
* We use punch hole to reclaim the free space used by the
@@ -685,7 +689,7 @@ static void loop_config_discard(struct loop_device *lo)

q->limits.discard_granularity = inode->i_sb->s_blocksize;
q->limits.discard_alignment = 0;
- q->limits.max_discard_sectors = UINT_MAX >> 9;
+ q->limits.max_discard_sectors = UINT_MAX >> lo_bits;
q->limits.discard_zeroes_data = 1;
queue_flag_set_unlocked(QUEUE_FLAG_DISCARD, q);
}
@@ -759,6 +763,7 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
set_device_ro(bdev, (lo_flags & LO_FLAGS_READ_ONLY) != 0);

lo->lo_blocksize = lo_blocksize;
+ lo->lo_logical_blocksize = 512;
lo->lo_device = bdev;
lo->lo_flags |= lo_flags;
lo->lo_backing_file = file;
@@ -771,8 +776,11 @@ static int loop_set_fd(struct loop_device *lo, fmode_t mode,
if (!(lo_flags & LO_FLAGS_READ_ONLY) && file->f_op->fsync)
blk_queue_flush(lo->lo_queue, REQ_FLUSH);

- if (lo->lo_flags & LO_FLAGS_BLOCKSIZE)
+ if (lo->lo_flags & LO_FLAGS_BLOCKSIZE) {
blk_queue_physical_block_size(lo->lo_queue, lo->lo_blocksize);
+ blk_queue_logical_block_size(lo->lo_queue,
+ lo->lo_logical_blocksize);
+ }
set_capacity(lo->lo_disk, size);
bd_set_size(bdev, size << 9);
loop_sysfs_init(lo);
diff --git a/drivers/block/loop.h b/drivers/block/loop.h
index 25e8997..93af885 100644
--- a/drivers/block/loop.h
+++ b/drivers/block/loop.h
@@ -49,6 +49,7 @@ struct loop_device {
struct file * lo_backing_file;
struct block_device *lo_device;
unsigned lo_blocksize;
+ unsigned lo_logical_blocksize;
void *key_data;

gfp_t old_gfp_mask;
diff --git a/include/uapi/linux/loop.h b/include/uapi/linux/loop.h
index e0cecd2..caec9d3 100644
--- a/include/uapi/linux/loop.h
+++ b/include/uapi/linux/loop.h
@@ -21,6 +21,7 @@ enum {
LO_FLAGS_READ_ONLY = 1,
LO_FLAGS_AUTOCLEAR = 4,
LO_FLAGS_PARTSCAN = 8,
+ LO_FLAGS_BLOCKSIZE = 16,
};

#include <asm/posix_types.h> /* for __kernel_old_dev_t */
--
1.8.5.2
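
Editorial sketch (not part of the patch): the change to max_discard_sectors above replaces the fixed shift of 9 with a shift derived from the logical block size. The stand-alone C below mirrors that arithmetic in user space. The names sketch_blksize_bits() and sketch_max_discard() are hypothetical; they only imitate the idea of the kernel's blksize_bits() helper for illustration and are not kernel API.

```c
#include <assert.h>
#include <limits.h>

/*
 * Hypothetical user-space stand-in for the kernel's blksize_bits():
 * returns log2(size) for a power-of-two block size of at least 512.
 */
unsigned int sketch_blksize_bits(unsigned int size)
{
	unsigned int bits = 0;

	/* Only power-of-two sizes >= 512 make sense as block sizes here. */
	assert(size >= 512 && (size & (size - 1)) == 0);
	while (size > 1) {
		size >>= 1;
		bits++;
	}
	return bits;
}

/* Mirrors "UINT_MAX >> lo_bits" from the patch for a given logical block size. */
unsigned int sketch_max_discard(unsigned int logical_blocksize)
{
	return UINT_MAX >> sketch_blksize_bits(logical_blocksize);
}
```

With a 512-byte logical block size the shift is 9, as before the patch; with 4096 bytes it is 12, so the resulting limit is roughly eight times smaller.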

2015-07-17 07:27:18

by Hannes Reinecke

Subject: [PATCH 4/4] loop: Pass logical blocksize in 'lo_init[0]' ioctl field

The current LOOP_SET_STATUS64 ioctl has two unused fields
'init[2]', which can be used in conjunction with the
LO_FLAGS_BLOCKSIZE flag to pass in the new logical blocksize.

Signed-off-by: Hannes Reinecke <[email protected]>
---
drivers/block/loop.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)

diff --git a/drivers/block/loop.c b/drivers/block/loop.c
index 321f296..3d2ee0f 100644
--- a/drivers/block/loop.c
+++ b/drivers/block/loop.c
@@ -963,11 +963,21 @@ loop_set_status(struct loop_device *lo, const struct loop_info64 *info)
if (err)
return err;

- if (info->lo_flags & LO_FLAGS_BLOCKSIZE)
+ if (info->lo_flags & LO_FLAGS_BLOCKSIZE) {
lo->lo_flags |= LO_FLAGS_BLOCKSIZE;
+ if ((info->lo_init[0] != 512) &&
+ (info->lo_init[0] != 1024) &&
+ (info->lo_init[0] != 2048) &&
+ (info->lo_init[0] != 4096))
+ return -EINVAL;
+ if (info->lo_init[0] > lo->lo_blocksize)
+ return -EINVAL;
+ lo->lo_logical_blocksize = info->lo_init[0];
+ }

if (lo->lo_offset != info->lo_offset ||
- lo->lo_sizelimit != info->lo_sizelimit)
+ lo->lo_sizelimit != info->lo_sizelimit ||
+ lo->lo_flags & LO_FLAGS_BLOCKSIZE)
if (figure_loop_size(lo, info->lo_offset, info->lo_sizelimit))
return -EFBIG;

--
1.8.5.2
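
Editorial sketch (not part of the patch): the checks that this patch applies to info->lo_init[0] can be tried out in user space. check_logical_blocksize() is a hypothetical helper name. It assumes, as the diff does, that only 512, 1024, 2048 and 4096 are accepted and that the logical block size may not exceed lo_blocksize (called backing_blocksize here).

```c
#include <assert.h>
#include <errno.h>

/*
 * Hypothetical helper mirroring the validation in the diff above:
 * returns 0 if 'requested' is an acceptable logical block size for a
 * backing file with block size 'backing_blocksize', -EINVAL otherwise.
 */
int check_logical_blocksize(unsigned int requested,
			    unsigned int backing_blocksize)
{
	if (requested != 512 && requested != 1024 &&
	    requested != 2048 && requested != 4096)
		return -EINVAL;
	if (requested > backing_blocksize)
		return -EINVAL;
	return 0;
}

/* A few illustrative cases; the values are assumptions chosen for the demo. */
void check_logical_blocksize_examples(void)
{
	assert(check_logical_blocksize(512, 4096) == 0);
	assert(check_logical_blocksize(4096, 4096) == 0);
	assert(check_logical_blocksize(4096, 1024) == -EINVAL);
	assert(check_logical_blocksize(3000, 4096) == -EINVAL);
}
```

Note that the sketch validates before anything is stored; a caller that wants to leave its state untouched on error would apply the new value only after the check returns 0.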

2015-07-27 05:15:55

by Christoph Hellwig

Subject: Re: [PATCH 1/4] loop: Enable correct physical blocksize

On Fri, Jul 17, 2015 at 09:27:04AM +0200, Hannes Reinecke wrote:
> When running on files the physical blocksize is actually 4k,
> so we should be announcing it as such. This is enabled with
> a new LO_FLAGS_BLOCKSIZE flag value to the existing ioctl.

The flag is only used in this patch, but not actually defined anywhere.

2015-07-27 05:16:14

by Christoph Hellwig

Subject: Re: [PATCH 2/4] loop: Remove unused 'bdev' argument from loop_set_capacity

On Fri, Jul 17, 2015 at 09:27:05AM +0200, Hannes Reinecke wrote:
> Signed-off-by: Hannes Reinecke <[email protected]>

Looks good,

Reviewed-by: Christoph Hellwig <[email protected]>

(and should be moved first in the series)

2015-07-27 05:59:49

by Hannes Reinecke

Subject: Re: [PATCH 1/4] loop: Enable correct physical blocksize

On 07/27/2015 07:15 AM, Christoph Hellwig wrote:
> On Fri, Jul 17, 2015 at 09:27:04AM +0200, Hannes Reinecke wrote:
>> When running on files the physical blocksize is actually 4k,
>> so we should be announcing it as such. This is enabled with
>> a new LO_FLAGS_BLOCKSIZE flag value to the existing ioctl.
>
> The flag is only used in this patch, but not actually defined anywhere.
>
Ah, Merge error.
I'll fix it up.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX Products GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: J. Hawn, J. Guild, F. Imendörffer, HRB 16746 (AG Nürnberg)

2015-07-27 10:40:59

by Ming Lei

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Fri, Jul 17, 2015 at 3:27 AM, Hannes Reinecke <[email protected]> wrote:
> Currently the loop driver just simulates 512-byte blocks. When
> creating images for virtual machines it might be required to use
> a different physical blocksize (eg 4k for S/390 DASD).

Looks 'qemu-img create' doesn't have parameter of block size,
so could you share your use case? And I am just curious why
512-byte can't work for this case.

> This patchset extends the current LOOP_SET_STATUS64 ioctl to
> set the logical and physical blocksize by re-using the existing
> 'init' fields, which are currently unused.
>
> As usual, comments and reviews are welcome.
>
> Hannes Reinecke (4):
> loop: Enable correct physical blocksize
> loop: Remove unused 'bdev' argument from loop_set_capacity
> loop: Add 'lo_logical_blocksize'
> loop: Pass logical blocksize in 'lo_init[0]' ioctl field
>
> drivers/block/loop.c | 35 ++++++++++++++++++++++++++++++-----
> drivers/block/loop.h | 1 +
> include/uapi/linux/loop.h | 1 +
> 3 files changed, 32 insertions(+), 5 deletions(-)
>
> --
> 1.8.5.2
>



--
Ming Lei

2015-08-03 23:00:36

by Alexander Graf

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes



On 27.07.15 11:40, Ming Lei wrote:
> On Fri, Jul 17, 2015 at 3:27 AM, Hannes Reinecke <[email protected]> wrote:
>> Currently the loop driver just simulates 512-byte blocks. When
>> creating images for virtual machines it might be required to use
>> a different physical blocksize (eg 4k for S/390 DASD).
>
> Looks 'qemu-img create' doesn't have parameter of block size,
> so could you share your use case? And I am just curious why
> 512-byte can't work for this case.

If you only want to access the virtual disk inside of QEMU you're all
good. However, if you want to directly run mkfs or fdasd on a loopback
device, then you need to expose 4k blocksize to the tools or they end up
creating a different on-disk format (if they work at all).

So the easiest case where things go wrong is

$ qemu-img create test.img 1G
$ losetup /dev/loop0 test.img
$ mkfs.ext4 /dev/loop0
$ qemu <with lots of options to pass the disk as 4k disk>

because the guest thinks the disk is formatted with 4k sector size,
while mkfs thought it's formatted with 512 byte sector size.

With this patch you can do

$ qemu-img create test.img 1G
$ losetup -B 4096 /dev/loop0 test.img
$ mkfs.ext4 /dev/loop0
$ qemu <with lots of options to pass the disk as 4k disk>

and it will work because both views of the world are identical. The same
applies for images you pull from a disk. So if you have a real 4k
logical sector size disk and you dd an image from it, you won't be able
to loop mount it today. With this patch set, you can.


Alex

2015-08-07 05:07:11

by Ming Lei

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Mon, Aug 3, 2015 at 7:00 PM, Alexander Graf <[email protected]> wrote:
>
>
> On 27.07.15 11:40, Ming Lei wrote:
>> On Fri, Jul 17, 2015 at 3:27 AM, Hannes Reinecke <[email protected]> wrote:
>>> Currently the loop driver just simulates 512-byte blocks. When
>>> creating images for virtual machines it might be required to use
>>> a different physical blocksize (eg 4k for S/390 DASD).
>>
>> Looks 'qemu-img create' doesn't have parameter of block size,
>> so could you share your use case? And I am just curious why
>> 512-byte can't work for this case.
>
> If you only want to access the virtual disk inside of QEMU you're all
> good. However, if you want to directly run mkfs or fdasd on a loopback
> device, then you need to expose 4k blocksize to the tools or they end up
> creating a different on-disk format (if they work at all).
>
> So the easiest case where things go wrong is
>
> $ qemu-img create test.img 1G
> $ losetup /dev/loop0 test.img
> $ mkfs.ext4 /dev/loop0

The default block size of mkfs.ext4 is 4K, so let's suppose it is set to 1024
by passing '-b 1024'; otherwise the filesystem can be mounted correctly inside
the VM on a block device with a 4k logical block size.

> $ qemu <with lots of options to pass the disk as 4k disk>

Then you should pass 'logical_block_size=1024' (or 512) in the '-device'
parameter of qemu. The rule is that the block size of the filesystem should
be equal to or greater than the logical block size of the block device, see
sb_min_blocksize().

>
> because the guest thinks the disk is formatted with 4k sector size,
> while mkfs thought it's formatted with 512 byte sector size.

I am wondering whether mkfs remembers the sector size of the actual block
device; at least it can't be found with 'dumpe2fs'. And it shouldn't
do that, otherwise it isn't flexible. An fs image can often be looped
successfully because loop's block size is 512.

That is why I am wondering if we need support other logical block size
for loop.

>
> With this patch you can do
>
> $ qemu-img create test.img 1G
> $ losetup -B 4096 /dev/loop0 test.img
> $ mkfs.ext4 /dev/loop0
> $ qemu <with lots of options to pass the disk as 4k disk>
>
> and it will work because both views of the world are identical. The same
> applies for images you pull from a disk. So if you have a real 4k
> logical sector size disk and you dd an image from it, you won't be able
> to loop mount it today. With this patch set, you can.

No, as long as the filesystem block size is equal to or bigger than the
logical block size of the backing device, it can be loop mounted
successfully without any problem.


Thanks,
Ming Lei
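
Editorial sketch (not from the thread): the rule referred to above, that a filesystem's block size must be at least the logical block size of the device, can be written down in a few lines of user-space C. Both function names are hypothetical. The first imitates the clamping step that sb_min_blocksize() performs in the kernel; the second is just the comparison described in the message.

```c
#include <assert.h>

/*
 * Hypothetical stand-in for the clamp done by sb_min_blocksize():
 * a filesystem asking for 'wanted' bytes per block never gets less
 * than the device's logical block size.
 */
unsigned int sketch_min_blocksize(unsigned int wanted,
				  unsigned int logical_block_size)
{
	assert(logical_block_size >= 512);
	return wanted < logical_block_size ? logical_block_size : wanted;
}

/*
 * Returns 1 if a filesystem with on-disk block size 'fs_blocksize'
 * satisfies the rule for a device with the given logical block size.
 */
int sketch_fs_fits_device(unsigned int fs_blocksize,
			  unsigned int logical_block_size)
{
	return fs_blocksize >= logical_block_size;
}
```

Under these assumptions, an ext4 image created with '-b 1024' fails the check on a device with a 4096-byte logical block size, while one created with the 4K default passes it.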

2015-08-07 06:46:42

by Hannes Reinecke

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On 08/07/2015 07:07 AM, Ming Lei wrote:
> On Mon, Aug 3, 2015 at 7:00 PM, Alexander Graf <[email protected]> wrote:
>>

[ .. ]

>>
>> because the guest thinks the disk is formatted with 4k sector size,
>> while mkfs thought it's formatted with 512 byte sector size.
>
> I am wondering whether mkfs remembers the sector size of the actual block
> device; at least it can't be found with 'dumpe2fs'. And it shouldn't
> do that, otherwise it isn't flexible. An fs image can often be looped
> successfully because loop's block size is 512.
>
> That is why I am wondering if we need support other logical block size
> for loop.
>
If you were to install a bootloader (like lilo or zipl for S/390) it
needs to write the _physical_ block addresses of the kernel and the
initrd. And these do vary, depending on the physical blocksize.
So while the filesystems indeed do not care (all translation is done
in the block driver, not the filesystem), bootloaders most certainly
do.
If you were to create a bootable disk on 4k disks you need this patch.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)
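
Editorial sketch (not from the thread): the point about bootloaders can be made concrete with one division. A bootloader that records block addresses stores "byte offset / block size", so the same byte offset yields different numbers for 512-byte and 4k blocks. sketch_block_address() is a hypothetical name, and the 1 MiB offset in the note below is an arbitrary example value.

```c
#include <assert.h>

/*
 * Hypothetical helper: converts a byte offset on the disk into the
 * block address a bootloader would record for a given block size.
 */
unsigned long long sketch_block_address(unsigned long long byte_offset,
					unsigned int blocksize)
{
	assert(blocksize != 0);
	return byte_offset / blocksize;
}
```

For a kernel image starting at byte offset 1048576 (1 MiB), this gives block 2048 with 512-byte blocks but block 256 with 4096-byte blocks, so a block list written against one block size points at the wrong data when read with the other.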

2015-08-07 07:23:07

by Ming Lei

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Fri, Aug 7, 2015 at 2:46 AM, Hannes Reinecke <[email protected]> wrote:
> On 08/07/2015 07:07 AM, Ming Lei wrote:
>> On Mon, Aug 3, 2015 at 7:00 PM, Alexander Graf <[email protected]> wrote:
>>>
>
> [ .. ]
>
>>>
>>> because the guest thinks the disk is formatted with 4k sector size,
>>> while mkfs thought it's formatted with 512 byte sector size.
>>
>> I am wondering whether mkfs remembers the sector size of the actual block
>> device; at least it can't be found with 'dumpe2fs'. And it shouldn't
>> do that, otherwise it isn't flexible. An fs image can often be looped
>> successfully because loop's block size is 512.
>>
>> That is why I am wondering if we need support other logical block size
>> for loop.
>>
> If you were to install a bootloader (like lilo or zipl for S/390) it
> needs to write the _physical_ block addresses of the kernel and the
> initrd. And these do vary, depending on the physical blocksize.

So there isn't filesystem involved in your case of installing bootloader,
then I am wondering why you don't write the data to the backing block
directly? And why does loop have to be involved in this special case?

> So while the filesystems indeed do not care (all translation is done
> in the block driver, not the filesystem), bootloaders most certainly
> do.
> If you were to create a bootable disk on 4k disks you need this patch.

If it were me, I would choose to do that against the disk directly, instead of
using loop, :-)


Thanks,
Ming Lei

2015-08-07 07:31:46

by Christoph Hellwig

Subject: Re: [PATCH 1/4] loop: Enable correct physical blocksize

On Mon, Jul 27, 2015 at 07:59:47AM +0200, Hannes Reinecke wrote:
> On 07/27/2015 07:15 AM, Christoph Hellwig wrote:
> > On Fri, Jul 17, 2015 at 09:27:04AM +0200, Hannes Reinecke wrote:
> >> When running on files the physical blocksize is actually 4k,
> >> so we should be announcing it as such. This is enabled with
> >> a new LO_FLAGS_BLOCKSIZE flag value to the existing ioctl.
> >
> > The flag is only used in this patch, but not actually defined anywhere.
> >
> Ah, Merge error.
> I'll fix it up.

Can you resend the series?

2015-08-07 07:33:33

by Christoph Hellwig

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Mon, Jul 27, 2015 at 06:40:57AM -0400, Ming Lei wrote:
> On Fri, Jul 17, 2015 at 3:27 AM, Hannes Reinecke <[email protected]> wrote:
> > Currently the loop driver just simulates 512-byte blocks. When
> > creating images for virtual machines it might be required to use
> > a different physical blocksize (eg 4k for S/390 DASD).
>
> Looks 'qemu-img create' doesn't have parameter of block size,
> so could you share your use case? And I am just curious why
> 512-byte can't work for this case.

The use case is to support 4k sector sizes such as DASDs usually provide,
or just to create a 4k block device to check your filesystem of choice
handles 4k sectors just fine. Replace 4k with other sector sizes of
your choice for added benefit. In addition to the DASD use case it's really a
very useful debugging tool.

2015-08-07 07:45:00

by Ming Lei

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Fri, Aug 7, 2015 at 3:33 AM, Christoph Hellwig <[email protected]> wrote:
> On Mon, Jul 27, 2015 at 06:40:57AM -0400, Ming Lei wrote:
>> On Fri, Jul 17, 2015 at 3:27 AM, Hannes Reinecke <[email protected]> wrote:
>> > Currently the loop driver just simulates 512-byte blocks. When
>> > creating images for virtual machines it might be required to use
>> > a different physical blocksize (eg 4k for S/390 DASD).
>>
>> Looks 'qemu-img create' doesn't have parameter of block size,
>> so could you share your use case? And I am just curious why
>> 512-byte can't work for this case.
>
> The use case is to support 4k sector sizes such as DASDs usually provide,
> or just to create a 4k block device to check your filesystem of choice
> handles 4k sectors just fine. Replace 4k with other sector sizes of
> your choice for added benefit. In addition to the DASD use case it's really a
> very useful debugging tool.

There shouldn't be any problem looping over a DASD which has a
4k sector size. Also, for debugging purposes, we can easily emulate a 4k
sector size disk with QEMU/virtio-blk.

We can support 4k sector size on loop for debugging purposes too, but
the side effect is that some images can't be loop mounted any more
after the sector size becomes larger, and people might complain about that.

Thanks,
Ming Lei

2015-08-07 07:45:22

by Hannes Reinecke

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On 08/07/2015 09:23 AM, Ming Lei wrote:
> On Fri, Aug 7, 2015 at 2:46 AM, Hannes Reinecke <[email protected]> wrote:
>> On 08/07/2015 07:07 AM, Ming Lei wrote:
>>> On Mon, Aug 3, 2015 at 7:00 PM, Alexander Graf <[email protected]> wrote:
>>>>
>>
>> [ .. ]
>>
>>>>
>>>> because the guest thinks the disk is formatted with 4k sector size,
>>>> while mkfs thought it's formatted with 512 byte sector size.
>>>
>>> I am wondering whether mkfs remembers the sector size of the actual block
>>> device; at least it can't be found with 'dumpe2fs'. And it shouldn't
>>> do that, otherwise it isn't flexible. An fs image can often be looped
>>> successfully because loop's block size is 512.
>>>
>>> That is why I am wondering if we need support other logical block size
>>> for loop.
>>>
>> If you were to install a bootloader (like lilo or zipl for S/390) it
>> needs to write the _physical_ block addresses of the kernel and the
>> initrd. And these do vary, depending on the physical blocksize.
>
> So there isn't filesystem involved in your case of installing bootloader,
> then I am wondering why you don't write the data to the backing block
> directly? And why does loop have to be involved in this special case?
>
Because this is a virtual environment.
Hardware is a limited resource, and you would need to assign each
one to a guest.
Using loop you can run fully virtualized, without having to recurse
on hardware limitations.

>> So while the filesystems indeed do not care (all translation is done
>> in the block driver, not the filesystem), bootloaders most certainly
>> do.
>> If you were to create a bootable disk on 4k disks you need this patch.
>
> If it were me, I would choose to do that against the disk directly, instead of
> using loop, :-)
>
See above. The reason why we did this patch is precisely because we
do _not_ want to use physical disks.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

2015-08-07 07:48:43

by Christoph Hellwig

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Fri, Aug 07, 2015 at 03:44:58AM -0400, Ming Lei wrote:
> There shouldn't be any problem looping over a DASD which has a
> 4k sector size. Also, for debugging purposes, we can easily emulate a 4k
> sector size disk with QEMU/virtio-blk.
>
> We can support 4k sector size on loop for debugging purposes too, but
> the side effect is that some images can't be loop mounted any more
> after the sector size becomes larger, and people might complain about that.

Have you read the patches?

There isn't any change in default behavior, but it allows you to expose
a non-512 byte sector size _optionally_. So no images will stop being
loop mountable, quite to the contrary - you can now loop mount an image
copied off from the 4k disk which otherwise would have been unusable
because the file system geometry didn't match what's returned by
the block layer as the sector size.

2015-08-07 07:50:56

by Hannes Reinecke

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On 08/07/2015 09:44 AM, Ming Lei wrote:
> On Fri, Aug 7, 2015 at 3:33 AM, Christoph Hellwig <[email protected]> wrote:
>> On Mon, Jul 27, 2015 at 06:40:57AM -0400, Ming Lei wrote:
>>> On Fri, Jul 17, 2015 at 3:27 AM, Hannes Reinecke <[email protected]> wrote:
>>>> Currently the loop driver just simulates 512-byte blocks. When
>>>> creating images for virtual machines it might be required to use
>>>> a different physical blocksize (eg 4k for S/390 DASD).
>>>
>>> Looks 'qemu-img create' doesn't have parameter of block size,
>>> so could you share your use case? And I am just curious why
>>> 512-byte can't work for this case.
>>
>> The use case is to support 4k sector sizes such as DASDs usually provide,
>> or just to create a 4k block device to check your filesystem of choice
>> handles 4k sectors just fine. Replace 4k with other sector sizes of
>> your choice for added benefit. In addition to the DASD use case it's really a
>> very useful debugging tool.
>
> There shouldn't be any problem looping over a DASD which has a
> 4k sector size. Also, for debugging purposes, we can easily emulate a 4k
> sector size disk with QEMU/virtio-blk.
>
> We can support 4k sector size on loop for debugging purposes too, but
> the side effect is that some images can't be loop mounted any more
> after the sector size becomes larger, and people might complain about that.
>
Which is why I made it optional, and having to use some ioctl fields
to enable this feature.
So _if_ someone uses these new features _and_ then complains that
the sector size is different I'll have only limited compassion.

Cheers,

Hannes
--
Dr. Hannes Reinecke zSeries & Storage
[email protected] +49 911 74053 688
SUSE LINUX GmbH, Maxfeldstr. 5, 90409 Nürnberg
GF: F. Imendörffer, J. Smithard, J. Guild, D. Upmanyu, G. Norton
HRB 21284 (AG Nürnberg)

2015-08-07 07:58:31

by Ming Lei

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Fri, Aug 7, 2015 at 3:48 AM, Christoph Hellwig <[email protected]> wrote:
> On Fri, Aug 07, 2015 at 03:44:58AM -0400, Ming Lei wrote:
>> There shouldn't be any problem looping over a DASD which has a
>> 4k sector size. Also, for debugging purposes, we can easily emulate a 4k
>> sector size disk with QEMU/virtio-blk.
>>
>> We can support 4k sector size on loop for debugging purposes too, but
>> the side effect is that some images can't be loop mounted any more
>> after the sector size becomes larger, and people might complain about that.
>
> Have you read the patches?
>
> There isn't any change in default behavior, but it allows you to expose
> a non-512 byte sector size _optionally_. So no images will stop being
> loop mountable, quite to the contrary - you can now loop mount an image
> copied off from the 4k disk which otherwise would have been unusable
> because the file system geometry didn't match what's returned by
> the block layer as the sector size.

I mean the following case:

#losetup -B 4096 /dev/loop0 test.img
#mount /dev/loop0 /mnt
$...
#umount /mnt
#losetup -d /dev/loop0
#losetup /dev/loop0 test1.img
#mount /dev/loop0 /mnt

Then the last mount may fail because the logical block size is still 4096.

Thanks,
Ming Lei

2015-08-07 08:02:28

by Ming Lei

Subject: Re: [PATCH 0/4] loop: enable different physical blocksizes

On Fri, Aug 7, 2015 at 3:45 AM, Hannes Reinecke <[email protected]> wrote:
> On 08/07/2015 09:23 AM, Ming Lei wrote:
>> On Fri, Aug 7, 2015 at 2:46 AM, Hannes Reinecke <[email protected]> wrote:
>>> On 08/07/2015 07:07 AM, Ming Lei wrote:
>>>> On Mon, Aug 3, 2015 at 7:00 PM, Alexander Graf <[email protected]> wrote:
>>>>>
>>>
>>> [ .. ]
>>>
>>>>>
>>>>> because the guest thinks the disk is formatted with 4k sector size,
>>>>> while mkfs thought it's formatted with 512 byte sector size.
>>>>
>>>> I am wondering whether mkfs remembers the sector size of the actual block
>>>> device; at least it can't be found with 'dumpe2fs'. And it shouldn't
>>>> do that, otherwise it isn't flexible. An fs image can often be looped
>>>> successfully because loop's block size is 512.
>>>>
>>>> That is why I am wondering if we need support other logical block size
>>>> for loop.
>>>>
>>> If you were to install a bootloader (like lilo or zipl for S/390) it
>>> needs to write the _physical_ block addresses of the kernel and the
>>> initrd. And these do vary, depending on the physical blocksize.
>>
>> So there isn't filesystem involved in your case of installing bootloader,
>> then I am wondering why you don't write the data to the backing block
>> directly? And why does loop have to be involved in this special case?
>>
> Because this is a virtual environment.
> Hardware is a limited resource, and you would need to assign each
> one to a guest.
> Using loop you can run fully virtualized, without having to recurse
> on hardware limitations.

OK, that sounds like a valid case. I suggest adding the bootloader
installation story to the commit log.


thanks,
Ming Lei