Hello!
This patch makes sure that indirect pointers for holes are correctly filled in by zeroes at
hole-creation time. (Author is Chris Mason. fs/buffer.c part (generic_cont_expand) were written by
Alexander Viro)
Please apply.
Bye,
Oleg
Oleg Drokin wrote:
>
> Hello!
>
> This patch makes sure that indirect pointers for holes are correctly filled in by zeroes at
> hole-creation time. (Author is Chris Mason. fs/buffer.c part (generic_cont_expand) were written by
> Alexander Viro)
>
> Please apply.
>
> ...
>
Please. Do not add new library functions to the kernel with
no descriptive commentary at all. We really need to get past
that stage.
What is this function supposed to do? It appears that if
the page at `size' is not in cache, we write uninitialised
data into the file.
--- linux-2.4.18-pre1/fs/buffer.c.orig Thu Jan 3 10:02:03 2002
+++ linux-2.4.18-pre1/fs/buffer.c Thu Jan 3 10:09:24 2002
@@ -1760,6 +1760,46 @@
return 0;
}
+int generic_cont_expand(struct inode *inode, loff_t size)
+{
+ struct address_space *mapping = inode->i_mapping;
+ struct page *page;
+ unsigned long index, offset, limit;
+ int err;
+
+ limit = current->rlim[RLIMIT_FSIZE].rlim_cur;
+ if (limit != RLIM_INFINITY) {
+ if (size > limit) {
+ send_sig(SIGXFSZ, current, 0);
+ size = limit;
+ }
+ }
+ offset = (size & (PAGE_CACHE_SIZE-1)); /* Within page */
+
+ /* ugh. in prepare/commit_write, if from==to==start of block, we
+ ** skip the prepare. make sure we never send an offset for the start
+ ** of a block
+ */
+ if ((offset & (inode->i_sb->s_blocksize - 1)) == 0) {
+ offset++ ;
+ }
+ index = size >> PAGE_CACHE_SHIFT;
+ err = -ENOMEM;
+ page = grab_cache_page(mapping, index);
+ if (!page)
+ goto out;
+ err = mapping->a_ops->prepare_write(NULL, page, offset, offset);
+ if (!err) {
+ err = mapping->a_ops->commit_write(NULL, page, offset, offset);
+ }
+ UnlockPage(page);
+ page_cache_release(page);
+ if (err > 0)
+ err = 0;
+out:
+ return err;
+}
+
> This patch makes sure that indirect pointers for holes are correctly filled in by zeroes at
> hole-creation time. (Author is Chris Mason. fs/buffer.c part (generic_cont_expand) were written by
> Alexander Viro)
Why is that even needed. If you truncate a file larger it doesn't need to
fill in the datablocks until they are touched surely
Hello!
On Thu, Jan 03, 2002 at 12:25:34PM +0000, Alan Cox wrote:
> > This patch makes sure that indirect pointers for holes are correctly filled in by zeroes at
> > hole-creation time. (Author is Chris Mason. fs/buffer.c part (generic_cont_expand) were written by
> > Alexander Viro)
> Why is that even needed. If you truncate a file larger it doesn't need to
> fill in the datablocks until they are touched surely
Purpose of this patch is of course not to fill in the datablocks with zeroes.
The purpose (as applied to reiserfs) is to fill indirect data pointers (that is - pointers to real data blocks)
with zeroes (and to organize proper in-tree data structure for such pointers).
As of now such organization and zero-filling is done on a lazy manner at disk-flushing time.
Unfortunatelly this leads to races in the code.
I do not know why parts of this code can be needed by other filesystem and why Al Viro put it in generic VFS
code. (but he can comment on it, I think)
Bye,
Oleg
On Thu, 3 Jan 2002, Oleg Drokin wrote:
> Purpose of this patch is of course not to fill in the datablocks with zeroes.
> The purpose (as applied to reiserfs) is to fill indirect data pointers (that is - pointers to real data blocks)
> with zeroes (and to organize proper in-tree data structure for such pointers).
> As of now such organization and zero-filling is done on a lazy manner at disk-flushing time.
> Unfortunatelly this leads to races in the code.
> I do not know why parts of this code can be needed by other filesystem and why Al Viro put it in generic VFS
> code. (but he can comment on it, I think)
I could, if I would remember doing that... ;-/
Seriously, it looks like a half-arsed and very old attempt to do common
expanding truncate() for no-holes filesystems. BTW, these days rlimit checks
are done by vmtruncate().
On Thursday, January 03, 2002 11:17:56 AM -0500 Alexander Viro
<[email protected]> wrote:
[ expanding truncate patch ]
>
> Seriously, it looks like a half-arsed and very old attempt to do common
> expanding truncate() for no-holes filesystems. BTW, these days rlimit
> checks are done by vmtruncate().
Whoops, good point. It was a very old and half-arsed attempt at doing
expanding truncate to no-holes filesystems (mostly the fat flavors).
reiserfs likes it because it needs to insert zero'd out indirect items for
holes.
I've got the bits to make fat use it too, the only real gain being able to
save staroffice files onto fat.
<email backlog disclaimer, ignore if already dealt with>
-chris
On Friday, January 04, 2002 12:33:32 PM -0500 Chris Mason <[email protected]> wrote:
[ expanding truncate patch ]
>> Seriously, it looks like a half-arsed and very old attempt to do common
>> expanding truncate() for no-holes filesystems. BTW, these days rlimit
>> checks are done by vmtruncate().
Ok, the rlimit checks where still there because generic_cont_expand is called before vmtruncate. I changed it around to match vmtruncate better though.
Does anyone still object to the patch below?
-chris
--- 0.22/fs/reiserfs/file.c Fri, 11 Jan 2002 14:17:09 -0500
+++ 0.22(w)/fs/reiserfs/file.c Fri, 11 Jan 2002 15:28:35 -0500
@@ -103,6 +103,20 @@
if (get_inode_item_key_version(inode) == KEY_FORMAT_3_5 &&
attr->ia_size > MAX_NON_LFS)
return -EFBIG ;
+
+ /* fill in hole pointers in the expanding truncate case. */
+ if (attr->ia_size > inode->i_size) {
+ error = generic_cont_expand(inode, attr->ia_size) ;
+ if (inode->u.reiserfs_i.i_prealloc_count > 0) {
+ struct reiserfs_transaction_handle th ;
+ /* we're changing at most 2 bitmaps, inode + super */
+ journal_begin(&th, inode->i_sb, 4) ;
+ reiserfs_discard_prealloc (&th, inode);
+ journal_end(&th, inode->i_sb, 4) ;
+ }
+ if (error)
+ return error ;
+ }
}
if ((((attr->ia_valid & ATTR_UID) && (attr->ia_uid & ~0xffff)) ||
--- 0.22/fs/reiserfs/inode.c Fri, 11 Jan 2002 14:17:09 -0500
+++ 0.22(w)/fs/reiserfs/inode.c Fri, 11 Jan 2002 14:43:42 -0500
@@ -2125,7 +2125,7 @@
/* we test for O_SYNC here so we can commit the transaction
** for any packed tails the file might have had
*/
- if (f->f_flags & O_SYNC) {
+ if (f && f->f_flags & O_SYNC) {
lock_kernel() ;
reiserfs_commit_for_inode(inode) ;
unlock_kernel();
--- 0.22/fs/buffer.c Fri, 11 Jan 2002 11:53:37 -0500
+++ 0.22(w)/fs/buffer.c Fri, 11 Jan 2002 15:28:08 -0500
@@ -1760,6 +1760,52 @@
return 0;
}
+/* utility function for filesystems that need to do work on expanding
+ * truncates. Uses prepare/commit_write to allow the filesystem to
+ * deal with the hole.
+ */
+int generic_cont_expand(struct inode *inode, loff_t size)
+{
+ struct address_space *mapping = inode->i_mapping;
+ struct page *page;
+ unsigned long index, offset, limit;
+ int err;
+
+ err = -EFBIG;
+ limit = current->rlim[RLIMIT_FSIZE].rlim_cur;
+ if (limit != RLIM_INFINITY && size > (loff_t)limit) {
+ send_sig(SIGXFSZ, current, 0);
+ goto out;
+ }
+ if (size > inode->i_sb->s_maxbytes)
+ goto out;
+
+ offset = (size & (PAGE_CACHE_SIZE-1)); /* Within page */
+
+ /* ugh. in prepare/commit_write, if from==to==start of block, we
+ ** skip the prepare. make sure we never send an offset for the start
+ ** of a block
+ */
+ if ((offset & (inode->i_sb->s_blocksize - 1)) == 0) {
+ offset++;
+ }
+ index = size >> PAGE_CACHE_SHIFT;
+ err = -ENOMEM;
+ page = grab_cache_page(mapping, index);
+ if (!page)
+ goto out;
+ err = mapping->a_ops->prepare_write(NULL, page, offset, offset);
+ if (!err) {
+ err = mapping->a_ops->commit_write(NULL, page, offset, offset);
+ }
+ UnlockPage(page);
+ page_cache_release(page);
+ if (err > 0)
+ err = 0;
+out:
+ return err;
+}
+
/*
* For moronic filesystems that do not allow holes in file.
* We may have to extend the file.
--- 0.22/kernel/ksyms.c Fri, 11 Jan 2002 11:54:05 -0500
+++ 0.22(w)/kernel/ksyms.c Fri, 11 Jan 2002 14:17:35 -0500
@@ -204,6 +204,7 @@
EXPORT_SYMBOL(block_read_full_page);
EXPORT_SYMBOL(block_prepare_write);
EXPORT_SYMBOL(block_sync_page);
+EXPORT_SYMBOL(generic_cont_expand);
EXPORT_SYMBOL(cont_prepare_write);
EXPORT_SYMBOL(generic_commit_write);
EXPORT_SYMBOL(block_truncate_page);
--- 0.22/include/linux/fs.h Fri, 11 Jan 2002 14:17:09 -0500
+++ 0.22(w)/include/linux/fs.h Fri, 11 Jan 2002 14:17:35 -0500
@@ -1416,6 +1416,7 @@
extern int block_prepare_write(struct page*, unsigned, unsigned, get_block_t*);
extern int cont_prepare_write(struct page*, unsigned, unsigned, get_block_t*,
unsigned long *);
+extern int generic_cont_expand(struct inode *inode, loff_t size) ;
extern int block_commit_write(struct page *page, unsigned from, unsigned to);
extern int block_sync_page(struct page *);