2002-10-13 17:36:10

by Michael Clark

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On 10/14/02 01:10, Alexander Viro wrote:
>
> On Mon, 14 Oct 2002, Michael Clark wrote:
>
>
>>Some of us have large arrays and SANs where the absence of a volume
>>manager is a big thing. I'm glad to see the distros picking it up
>>- I guess they have customers who need this sort of stuff.
>>
>>How about feedback from other kernel developers on EVMS. Does anyone
>>think 'it's good enough for inclusion now as long as a few cleanups
>>are done after the freeze'?
>
> Mostly those who won't have to clean up the mess afterwards.
> For the record, my vote is "not ready".
>
> There are good chunks, no arguments about that. However, IMNSHO
> we will be better off if we gradually pick the pieces that make sense
> and integrate them into the system. As it is, a wholesale merge would cost
> us too much half a year down the road.

I guess it boils down to differentiation between architectural flaws
and more trivial code cleanup. I guess the thing that drew me to Christoph's
particular criticism was whether or not it is a flaw or a feature for
remapping layers to just be remapping layers and not also block devices.

If it is the consensus that remapping layers should also be block devices
then I concede. Although clearly there needs to be a little more reason
than having a device node to do an ioctl on.

> I have seen major subsystem rewrites. I have done several myself.
> I have also done more than a bit of wading through "yet another drivers".
> EVMS in its current state shows a lot of signs promising very painful
> work on cleanups and integration. "Few cleanups after the freeze" doesn't
> come anywhere near the impression I'm getting from it and I would bet a lot
> on that particular impression.

Okay. It's just not clear which criticisms are of the trivial post-merge
code cleanup kind, which are true architectural problems, and which are
singly held opinions on architectural requirements.

Can we have some consensus on whether intermediate remapping layers also need
to be exposed as block devices? This requirement would have a large impact
on the code.

From the discussion so far:

Pros
* Simplify ioctl routing to plugins

Cons
* Chew up a minor
* Get a block device we don't need or want (ie. we can still easily
directly access the underlying physical block devices)
* lose purely logical remapping abstraction in plugins
* Complicate mapping of request queues to devices (ie. shouldn't only
the top level volume device and the underlying physical devices need
request queues)

~mc


2002-10-14 04:42:02

by Andreas Dilger

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On Oct 14, 2002 01:41 +0800, Michael Clark wrote:
> Can we have some consensus on whether intermediate remapping layers also
> need to be exposed as block devices? This requirement would have a large
> impact on the code.
>
> From the discussion so far:
>
> Pros
> * Simplify ioctl routing to plugins
>
> Cons
> * Chew up a minor
> * Get a block device we don't need or want (ie. we can still easily
> directly access the underlying physical block devices)
> * lose purely logical remapping abstraction in plugins
> * Complicate mapping of request queues to devices (ie. shouldn't only
> the top level volume device and the underlying physical devices need
> request queues)

I never did get a clear understanding why Christoph wants access to
"intermediate" block devices from EVMS, except for the ioctl issue.
Granted, it _may_ be more "pure" to keep within the current block device
paradigm for each internal stacking layer, or whatever. If AV wants
it implemented differently, then do it, I say. But, that said, the
fact that each intermediate layer _could_ be considered a block device
does not mean that it makes sense to allow user-space access to these
intermediate layers.

Why on earth would I want to start reading from or writing to a block
device which is part of a RAID 5 volume which is chunked into 16MB
logical extents for a volume which is part of a snapshot of some
totally different volume? Yes, in some strange recovery scenario, I
might want to 'dd' the underlying disks somewhere else for backup, but
otherwise the only other thing I can possibly do is corrupt my data
by mistake.

Don't get me wrong, I'm all for giving people enough rope to shoot
themselves in the foot, but I'd rather have /proc/partitions show me
"you have 10 volumes" than "you have these 500 partial volumes that
aren't really useful by themselves - good luck finding which one
currently isn't in use for the filesystem you need to make".

It's like "ls -l" showing you each and every block that makes up a file,
or "ps aux" listing the address of each chunk of RAM that a process has
allocated. Even better (Al will hate this one), it's like processes
being able to read(2) and write(2) directory "files", ugh. Sure, in
some strange context it might be useful to have this information (and
there are definitely tools which will let you know it and work directly
on the raw bits), but in 99.9999% of cases it is just an accident
waiting to happen, a waste of time to see this much detail, and confusion
for users.

Ted will recall an incident at VA where an intern mke2fs'd part of an
MD RAID device that made up Sourceforge, because he couldn't see it
mounted anywhere, and thought it wasn't in use.

Cheers, Andreas
--
Andreas Dilger
http://www-mddsp.enel.ucalgary.ca/People/adilger/
http://sourceforge.net/projects/ext2resize/

2002-10-14 15:15:46

by Christoph Hellwig

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On Mon, Oct 14, 2002 at 01:41:51AM +0800, Michael Clark wrote:
> From the discussion so far:
>
> Pros
> * Simplify ioctl routing to plugins

* allow code reuse
* simplify userspace access to intermediate layer
* avoid data duplication
* avoid having different data structures for very different things

> Cons
> * Chew up a minor
> * Get a block device we don't need or want (ie. we can still easily
> directly access the underlying physical block devices)
> * lose purely logical remapping abstraction in plugins

Explain the exact meaning of this to non-native speakers like me, please.

> * Complicate mapping of request queues to devices (ie. shouldn't only
> the top level volume device and the underlying physical devices need
> request queues)

You need struct request_queue as a data structure, but you don't actually
need a queue. Please check the code.

And no, I don't think having a proper, custom make_request_fn will
complicate the code.
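
[Editor's illustration, not part of the original message.] The point
about needing the data structure but not an actual queue can be sketched
as a small user-space model: each intermediate layer has a
make_request-style function that rewrites the target device and sector
and hands the request straight down, so nothing is ever queued at that
layer. Every name below (model_bio, model_layer, model_make_request,
model_submit) is hypothetical and invented for this sketch; none of it
is the kernel's block API or EVMS code.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical user-space model; these are NOT kernel or EVMS types. */
struct model_bio {
    int      dev;     /* id of the device the request currently targets */
    uint64_t sector;  /* sector offset on that device */
};

struct model_layer {
    int      dev;         /* id of this remapping layer */
    int      lower_dev;   /* id of the device directly below it */
    uint64_t start;       /* offset added when remapping downward */
    uint64_t nr_sectors;  /* size of this layer in sectors */
};

/*
 * Remap the request in place and return immediately: the layer never
 * holds the request, so it needs no queue of its own.
 * Returns 0 on success, -1 if the request is not for this layer or is
 * out of range.
 */
static int model_make_request(const struct model_layer *l,
                              struct model_bio *bio)
{
    if (bio->dev != l->dev || bio->sector >= l->nr_sectors)
        return -1;
    bio->dev = l->lower_dev;
    bio->sector += l->start;
    return 0;
}

/* Pass a request through a stack of layers, top to bottom. */
static int model_submit(const struct model_layer *stack, size_t n,
                        struct model_bio *bio)
{
    for (size_t i = 0; i < n; i++)
        if (model_make_request(&stack[i], bio) != 0)
            return -1;
    return 0;
}
```

In this model only the device at the bottom of the stack would need to
queue and schedule I/O; whether the real code ends up simpler that way
is exactly what the thread is debating.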

2002-10-14 16:10:50

by Christoph Hellwig

Subject: Re: [Evms-devel] Re: Linux v2.5.42

On Sun, Oct 13, 2002 at 10:43:55PM -0600, Andreas Dilger wrote:
> I never did get a clear understanding why Christoph wants access to
> "intermediate" block devices from EVMS, except for the ioctl issue.

It's not really the userspace access that matters (it comes for free
when doing it properly) but more that I want to avoid duplicating
kernel-internal data structures and code. Just look at ldev_mgr.c
in the evms source code and see how much simpler it would get if we
merged struct evms_logical_node (and its members) into struct gendisk and
struct block_device - sure that's not a trivial task, but it'll pay
off in the long term.

> It's like "ls -l" showing you each and every block that makes up a file,
> or "ps aux"

It's more like ps aux showing you all threads of a multithreaded
process. Yes, people have turned it off now, but you really want
to be able to see it without doing hacks.