2005-10-18 05:52:28

by Roushan Ali

Subject: file system block size

Hi All,
We want to write a new file system with a block size larger than
4KB. Can anyone suggest how we should proceed?



Regards,
Roushan


2005-10-18 14:14:21

by Nathan Scott

Subject: Re: file system block size

On Tue, Oct 18, 2005 at 11:22:27AM +0530, Roushan Ali wrote:
> Hi All,
> we want to write a new file system with block size more than
> 4KB. Can anyone suggest us how should we proceed ?

With great difficulty. ;)

There is really no support for this in the generic page cache
code in the kernel, and you'd probably need some mechanism for
doing multi-page metadata IOs. There's been sporadic discussion
on fs-devel and linux-xfs in the past on this topic - you could
search those archives for details.

cheers.

--
Nathan

2005-10-18 14:37:03

by Anton Altaparmakov

Subject: Re: file system block size

On Wed, 2005-10-19 at 00:12 +1000, Nathan Scott wrote:
> On Tue, Oct 18, 2005 at 11:22:27AM +0530, Roushan Ali wrote:
> > Hi All,
> > we want to write a new file system with block size more than
> > 4KB. Can anyone suggest us how should we proceed ?
>
> With great difficulty. ;)

Indeed, it makes everything harder. Have a look at ntfs in the kernel,
which has to cope with file system block sizes between 512 bytes and
several hundred KiB (at least they are powers of two, thank
goodness...). You end up unable to use a lot of the generic
functions because, for example, you need to lock multiple pages, and
the locks must be taken in the correct order. If you look at the latest
-mm kernel, the ntfs driver there has file write(2) support for any
cluster size, and you will find an example of a solution to the multiple
page locking problem there.
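
As a rough illustration of the ordering point, here is a minimal
userspace sketch in Python. It is not the ntfs driver's code: the
PageCache class and the lock_pages/unlock_pages helpers are hypothetical
names, and ordinary thread locks stand in for kernel page locks. The
only idea it shows is that every writer takes its page locks in
ascending page-index order, so two writers touching overlapping ranges
cannot deadlock against each other.

```python
import threading


class PageCache:
    """Toy stand-in for a page cache: one lock per page index (hypothetical)."""

    def __init__(self):
        self._locks = {}
        self._guard = threading.Lock()  # protects the dictionary itself

    def lock_for(self, index):
        # Create the per-page lock on first use, then always return the same one.
        with self._guard:
            return self._locks.setdefault(index, threading.Lock())


def lock_pages(cache, indices):
    """Acquire the page locks in ascending index order; return that order."""
    order = sorted(set(indices))  # de-duplicate, then fix one global order
    for index in order:
        cache.lock_for(index).acquire()
    return order


def unlock_pages(cache, order):
    """Release the locks taken by lock_pages, in reverse order."""
    for index in reversed(order):
        cache.lock_for(index).release()
```

A caller would pass whatever page indices its write spans, in any order,
and always get them locked lowest index first.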

One note: I chose not to do full fs block size I/O in ntfs, so
I avoid most of the big problems. The only time it really hits is at
allocation time: you can only allocate in units of the fs block size,
so when you write even one byte you need to lock the pages for the whole
byte range, from fs block start to fs block end, that includes the byte
being written. If the block is already allocated, ntfs simply writes
that one byte in the page it is in and ignores the other pages. That
makes life much easier and faster, given that you avoid unnecessary
writes to the other pages. (-: In fact we would only write out the
512-byte block containing the one written byte, so it is even faster
than page granularity. (-:
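
The arithmetic behind this can be sketched as follows. This is only an
illustration in Python, not code from the ntfs driver: the 4096-byte
page size, the function names, and the 64 KiB example block size are
assumptions, and block sizes are taken to be powers of two as described
above.

```python
PAGE_SIZE = 4096    # assumed page size in bytes
SECTOR_SIZE = 512   # the 512-byte unit mentioned above


def block_page_range(pos, block_size, page_size=PAGE_SIZE):
    """Return (first, last) page index covering the fs block that holds pos.

    block_size must be a power of two. This is the set of pages that would
    need locking when the block has to be allocated.
    """
    block_start = pos & ~(block_size - 1)   # round down to the block boundary
    block_end = block_start + block_size    # exclusive end of the block
    return block_start // page_size, (block_end - 1) // page_size


def sector_range(pos, sector_size=SECTOR_SIZE):
    """Return the (start, end) byte range of the sector that holds pos."""
    start = pos & ~(sector_size - 1)
    return start, start + sector_size


# Writing one byte at offset 70000 with a 64 KiB fs block:
print(block_page_range(70000, 64 * 1024))   # pages to lock on allocation
print(sector_range(70000))                  # bytes written if already allocated
```

With these assumed sizes, allocation involves sixteen pages while the
already-allocated case touches a single 512-byte range, which is the
saving described above.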

Best regards,

Anton
--
Anton Altaparmakov <aia21 at cam.ac.uk> (replace at with @)
Unix Support, Computing Service, University of Cambridge, CB2 3QH, UK
Linux NTFS maintainer / IRC: #ntfs on irc.freenode.net
WWW: http://linux-ntfs.sf.net/ & http://www-stu.christs.cam.ac.uk/~aia21/

2005-10-18 15:12:13

by Bob Copeland

Subject: Re: file system block size

> Indeed, it makes everything harder. Have a look at ntfs in the kernel
> which has to cope with file system block sizes between 512 bytes and
> several hundred kiB (at least they are in powers of two thank
> goodness...). You end up not being able to use a lot of generic
> functions as you for example need to lock multiple pages which needs to
> be ordered correctly, etc... If you look at the latest -mm kernel, the
> ntfs driver there has file write(2) support for any cluster size and you
> will see an example of the multiple page locking problem solution there.

Any chance this will make it into common code? I also need it for my
filesystem driver (here: http://bobcopeland.com/karma/). So far I've
only done the read side, which is not too bad, but as you say, writing
makes things complicated.

That, and the extent-supporting mpage_readpages would make me a happy person.

-Bob

2005-10-18 15:31:11

by Badari Pulavarty

Subject: Re: file system block size

On Tue, 2005-10-18 at 11:12 -0400, Bob Copeland wrote:
> > Indeed, it makes everything harder. Have a look at ntfs in the kernel
> > which has to cope with file system block sizes between 512 bytes and
> > several hundred kiB (at least they are in powers of two thank
> > goodness...). You end up not being able to use a lot of generic
> > functions as you for example need to lock multiple pages which needs to
> > be ordered correctly, etc... If you look at the latest -mm kernel, the
> > ntfs driver there has file write(2) support for any cluster size and you
> > will see an example of the multiple page locking problem solution there.
>
> Any chance this will make it into common code? I also need it for my
> filesystem driver (here: http://bobcopeland.com/karma/). So far I've
> only done read side which is not too bad, but as you say writing makes
> things complicated.
>
> That, and the extent-supporting mpage_readpages would make me a happy person.

Can you elaborate? What would you like to see in mpage_readpages()?
Christoph recently posted patches to add support for getblocks() in
mpage_readpages(). What else do you need?
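
For readers unfamiliar with what extent support buys on the read side,
the general idea can be sketched like this. It is a toy Python model,
not the kernel interface or the patches mentioned above: the block_map
dictionary and the map_extents function are hypothetical, and the real
getblocks() signature is not shown here. The sketch only shows how a run
of physically contiguous blocks can be described by one extent instead
of one lookup result per block.

```python
def map_extents(block_map, first, count):
    """Coalesce per-block mappings into (logical, physical, length) extents.

    block_map is a hypothetical dict from logical block number to physical
    block number. Consecutive logical blocks whose physical blocks are also
    consecutive are merged into a single extent.
    """
    extents = []
    for logical in range(first, first + count):
        physical = block_map[logical]
        if extents:
            lstart, pstart, length = extents[-1]
            # Extend the previous extent if this block continues it exactly.
            if lstart + length == logical and pstart + length == physical:
                extents[-1] = (lstart, pstart, length + 1)
                continue
        extents.append((logical, physical, 1))
    return extents
```

A read path that receives extents like these can build one large I/O per
extent rather than asking the file system about each block separately.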


Thanks,
Badari

2005-10-18 15:42:10

by Bob Copeland

Subject: Re: file system block size

On 10/18/05, Badari Pulavarty wrote:
> On Tue, 2005-10-18 at 11:12 -0400, Bob Copeland wrote:
> > That, and the extent-supporting mpage_readpages would make me a happy person.
>
> Can you elaborate ? What would you like to see in mpage_readpages() ?
> Christoph recently posted patches to add support for getblocks() in
> mpage_readpages(). What else do you need ?

Yes, that is exactly what I was referring to. I think Christoph's
patch will work great for my fs but I haven't had a chance to try it
out yet.

-Bob