2010-01-11 18:40:46

by OGAWA Hirofumi

Subject: [PATCH] vfs: Fix vmtruncate() regression

Hi,

Could you review this one?



If __block_prepare_write() fails in block_write_begin(), the blocks it
allocated can lie outside ->i_size, and the error path calls
vmtruncate() to trim them.

But the new truncate_pagecache() called from vmtruncate() does nothing
unless new < old. The error path passes the unchanged ->i_size, so
new == old and the cleanup no longer happens.

So this patch fixes it by removing the "new < old" check. More
cleanup is probably needed, but we are in -rc and the truncate rework
is in progress, so this is the minimal fix.

Cc: [email protected]
Signed-off-by: OGAWA Hirofumi <[email protected]>
---

mm/truncate.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)

diff -puN mm/truncate.c~truncate_pagecache-fix mm/truncate.c
--- linux-2.6/mm/truncate.c~truncate_pagecache-fix 2010-01-12 02:41:27.000000000 +0900
+++ linux-2.6-hirofumi/mm/truncate.c 2010-01-12 02:42:53.000000000 +0900
@@ -522,22 +522,20 @@ EXPORT_SYMBOL_GPL(invalidate_inode_pages
  */
 void truncate_pagecache(struct inode *inode, loff_t old, loff_t new)
 {
-	if (new < old) {
-		struct address_space *mapping = inode->i_mapping;
+	struct address_space *mapping = inode->i_mapping;

-		/*
-		 * unmap_mapping_range is called twice, first simply for
-		 * efficiency so that truncate_inode_pages does fewer
-		 * single-page unmaps. However after this first call, and
-		 * before truncate_inode_pages finishes, it is possible for
-		 * private pages to be COWed, which remain after
-		 * truncate_inode_pages finishes, hence the second
-		 * unmap_mapping_range call must be made for correctness.
-		 */
-		unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
-		truncate_inode_pages(mapping, new);
-		unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
-	}
+	/*
+	 * unmap_mapping_range is called twice, first simply for
+	 * efficiency so that truncate_inode_pages does fewer
+	 * single-page unmaps. However after this first call, and
+	 * before truncate_inode_pages finishes, it is possible for
+	 * private pages to be COWed, which remain after
+	 * truncate_inode_pages finishes, hence the second
+	 * unmap_mapping_range call must be made for correctness.
+	 */
+	unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
+	truncate_inode_pages(mapping, new);
+	unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
 }
 EXPORT_SYMBOL(truncate_pagecache);

_

--
OGAWA Hirofumi <[email protected]>
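
The failure mode described in the patch can be illustrated with a small
userspace model. This is only a sketch, not kernel code: struct toy_inode,
toy_truncate_pagecache() and the "guarded" flag are hypothetical names
made up for illustration, the page cache is reduced to two arrays of
flags, and a 4096-byte page is assumed. It also assumes that the error
path hands the unchanged ->i_size to the helper as both the old and the
new size.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_PAGE_SIZE 4096L	/* assumed page size for this model */
#define TOY_NR_PAGES  8

/* Hypothetical stand-in for an inode together with its page cache. */
struct toy_inode {
	long i_size;			/* file size in bytes */
	bool cached[TOY_NR_PAGES];	/* is the page in the cache? */
	bool has_private[TOY_NR_PAGES];	/* does it carry buffer state? */
};

/* Drop every cached page that lies wholly at or beyond 'new'. */
static void toy_truncate_pages(struct toy_inode *inode, long new)
{
	size_t first = (size_t)((new + TOY_PAGE_SIZE - 1) / TOY_PAGE_SIZE);

	for (size_t i = first; i < TOY_NR_PAGES; i++) {
		inode->cached[i] = false;
		inode->has_private[i] = false;	/* models the invalidate step */
	}
}

/* guarded == true models the pre-patch "if (new < old)" check. */
static void toy_truncate_pagecache(struct toy_inode *inode,
				   long old, long new, bool guarded)
{
	if (guarded && !(new < old))
		return;		/* old == new: nothing is cleaned up */
	toy_truncate_pages(inode, new);
}

/* Count cached pages beyond i_size that still hold private state. */
static int toy_stale_pages(const struct toy_inode *inode)
{
	size_t first = (size_t)((inode->i_size + TOY_PAGE_SIZE - 1) /
				TOY_PAGE_SIZE);
	int stale = 0;

	for (size_t i = first; i < TOY_NR_PAGES; i++)
		if (inode->cached[i] && inode->has_private[i])
			stale++;
	return stale;
}

/* State after a failed write: one page with buffer state past i_size. */
static struct toy_inode toy_after_failed_write(void)
{
	struct toy_inode inode = { .i_size = 4096 };

	inode.cached[0] = true;		/* valid data inside i_size */
	inode.cached[2] = true;		/* instantiated beyond i_size */
	inode.has_private[2] = true;
	return inode;
}
```

Calling toy_truncate_pagecache(&inode, inode.i_size, inode.i_size, true)
on the state from toy_after_failed_write() leaves the page beyond i_size
in place, while the same call with guarded == false removes it and keeps
the page inside i_size.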


2010-01-13 06:07:33

by Nick Piggin

Subject: Re: [PATCH] vfs: Fix vmtruncate() regression

On Tue, Jan 12, 2010 at 03:40:42AM +0900, OGAWA Hirofumi wrote:
> Hi,
>
> Could you review this one?
>
>
>
> If __block_prepare_write() fails in block_write_begin(), the blocks it
> allocated can lie outside ->i_size, and the error path calls
> vmtruncate() to trim them.
>
> But the new truncate_pagecache() called from vmtruncate() does nothing
> unless new < old. The error path passes the unchanged ->i_size, so
> new == old and the cleanup no longer happens.
>
> So this patch fixes it by removing the "new < old" check. More
> cleanup is probably needed, but we are in -rc and the truncate rework
> is in progress, so this is the minimal fix.
>
> Cc: [email protected]
> Signed-off-by: OGAWA Hirofumi <[email protected]>

Hmm, truncate_pagecache() is for truncating the mm/vm part of the
pagecache. vmtruncate should still call inode->i_op->truncate() to
trim blocks if required.

However I'd say we do still need to ensure do_invalidatepage is
called for the page, for private metadata. So yes I think your patch
looks good.

Acked-by: Nick Piggin <[email protected]>

Please apply to mainline and 2.6.32.


2010-01-13 12:08:12

by OGAWA Hirofumi

Subject: Re: [PATCH] vfs: Fix vmtruncate() regression

Nick Piggin <[email protected]> writes:

>> If __block_prepare_write() fails in block_write_begin(), the blocks it
>> allocated can lie outside ->i_size, and the error path calls
>> vmtruncate() to trim them.
>>
>> But the new truncate_pagecache() called from vmtruncate() does nothing
>> unless new < old. The error path passes the unchanged ->i_size, so
>> new == old and the cleanup no longer happens.
>>
>> So this patch fixes it by removing the "new < old" check. More
>> cleanup is probably needed, but we are in -rc and the truncate rework
>> is in progress, so this is the minimal fix.
>>
>> Cc: [email protected]
>> Signed-off-by: OGAWA Hirofumi <[email protected]>
>
> Hmm, truncate_pagecache() is for truncating the mm/vm part of the
> pagecache. vmtruncate should still call inode->i_op->truncate() to
> trim blocks if required.
>
> However I'd say we do still need to ensure do_invalidatepage is
> called for the page, for private metadata. So yes I think your patch
> looks good.

Thanks for reviewing. Yes, and it also needs to be called so that the
on-disk state and the page/bh state stay consistent. [BTW, this
regression was the cause of a fatfs corruption.]

Thanks.
--
OGAWA Hirofumi <[email protected]>

2010-01-13 12:14:16

by OGAWA Hirofumi

Subject: [PATCH] vfs: Fix vmtruncate() regression


If __block_prepare_write() fails in block_write_begin(), the blocks it
allocated can lie outside ->i_size, and the error path calls
vmtruncate() to trim them.

But the new truncate_pagecache() called from vmtruncate() does nothing
unless new < old. The error path passes the unchanged ->i_size, so
new == old and the cleanup no longer happens.

So this patch fixes it by removing the "new < old" check. More
cleanup is probably needed, but we are in -rc and the truncate rework
is in progress, so this is the minimal fix.

Acked-by: Nick Piggin <[email protected]>
Signed-off-by: OGAWA Hirofumi <[email protected]>
---

mm/truncate.c | 28 +++++++++++++---------------
1 file changed, 13 insertions(+), 15 deletions(-)

diff -puN mm/truncate.c~truncate_pagecache-fix mm/truncate.c
--- linux-2.6/mm/truncate.c~truncate_pagecache-fix 2010-01-12 05:43:06.000000000 +0900
+++ linux-2.6-hirofumi/mm/truncate.c 2010-01-12 05:43:06.000000000 +0900
@@ -522,22 +522,20 @@ EXPORT_SYMBOL_GPL(invalidate_inode_pages
  */
 void truncate_pagecache(struct inode *inode, loff_t old, loff_t new)
 {
-	if (new < old) {
-		struct address_space *mapping = inode->i_mapping;
+	struct address_space *mapping = inode->i_mapping;

-		/*
-		 * unmap_mapping_range is called twice, first simply for
-		 * efficiency so that truncate_inode_pages does fewer
-		 * single-page unmaps. However after this first call, and
-		 * before truncate_inode_pages finishes, it is possible for
-		 * private pages to be COWed, which remain after
-		 * truncate_inode_pages finishes, hence the second
-		 * unmap_mapping_range call must be made for correctness.
-		 */
-		unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
-		truncate_inode_pages(mapping, new);
-		unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
-	}
+	/*
+	 * unmap_mapping_range is called twice, first simply for
+	 * efficiency so that truncate_inode_pages does fewer
+	 * single-page unmaps. However after this first call, and
+	 * before truncate_inode_pages finishes, it is possible for
+	 * private pages to be COWed, which remain after
+	 * truncate_inode_pages finishes, hence the second
+	 * unmap_mapping_range call must be made for correctness.
+	 */
+	unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
+	truncate_inode_pages(mapping, new);
+	unmap_mapping_range(mapping, new + PAGE_SIZE - 1, 0, 1);
 }
 EXPORT_SYMBOL(truncate_pagecache);

_
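
The offset handed to unmap_mapping_range() in the patch is
new + PAGE_SIZE - 1, which rounds the new size up to a page boundary so
that the page holding the last valid byte stays mapped. A minimal
arithmetic sketch of that rounding follows. It assumes a 4096-byte page
(the real PAGE_SIZE depends on the architecture), and
toy_first_unmapped_page() is a hypothetical helper name, not a kernel
function.

```c
#include <assert.h>

#define TOY_PAGE_SHIFT 12			/* assumed: 4096-byte pages */
#define TOY_PAGE_SIZE  (1LL << TOY_PAGE_SHIFT)

/*
 * Index of the first page that lies wholly at or beyond byte offset
 * 'new'. A page that still contains valid bytes below 'new' is skipped.
 */
static long toy_first_unmapped_page(long long new)
{
	return (long)((new + TOY_PAGE_SIZE - 1) >> TOY_PAGE_SHIFT);
}
```

With these assumptions a new size of 4096 gives page index 1, so page 0
is untouched, while a new size of 4097 gives index 2 because page 1
still holds one valid byte.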

2010-01-14 22:32:10

by Andrew Morton

Subject: Re: [PATCH] vfs: Fix vmtruncate() regression

On Wed, 13 Jan 2010 21:14:09 +0900
OGAWA Hirofumi <[email protected]> wrote:

>
> If __block_prepare_write() fails in block_write_begin(), the blocks it
> allocated can lie outside ->i_size, and the error path calls
> vmtruncate() to trim them.
>
> But the new truncate_pagecache() called from vmtruncate() does nothing
> unless new < old. The error path passes the unchanged ->i_size, so
> new == old and the cleanup no longer happens.
>
> So this patch fixes it by removing the "new < old" check. More
> cleanup is probably needed, but we are in -rc and the truncate rework
> is in progress, so this is the minimal fix.
>
> Acked-by: Nick Piggin <[email protected]>
> Signed-off-by: OGAWA Hirofumi <[email protected]>

The fix was applied to 2.6.33-rcX
(cedabed49b39b4319bccc059a63344b6232b619c), appears to be needed in
2.6.32.x but no cc:stable's are present?

2010-01-15 00:26:11

by OGAWA Hirofumi

Subject: Re: [PATCH] vfs: Fix vmtruncate() regression

Andrew Morton <[email protected]> writes:

> The fix was applied to 2.6.33-rcX
> (cedabed49b39b4319bccc059a63344b6232b619c), appears to be needed in
> 2.6.32.x but no cc:stable's are present?

Ah, yes. I forgot to add "Cc: stable". Please apply this to 2.6.32.x.
--
OGAWA Hirofumi <[email protected]>

2010-01-18 03:58:31

by Nick Piggin

Subject: Re: [PATCH] vfs: Fix vmtruncate() regression

On Fri, Jan 15, 2010 at 09:26:02AM +0900, OGAWA Hirofumi wrote:
> Andrew Morton <[email protected]> writes:
>
> > The fix was applied to 2.6.33-rcX
> > (cedabed49b39b4319bccc059a63344b6232b619c), appears to be needed in
> > 2.6.32.x but no cc:stable's are present?
>
> Ah, yes. I forgot to add "Cc: stable". Please apply this to 2.6.32.x.

Thanks guys, it's quite important so please apply to stable 2.6.32 and
consider it as a high priority to release.

2010-01-20 00:01:40

by Greg KH

Subject: Re: [stable] [PATCH] vfs: Fix vmtruncate() regression

On Mon, Jan 18, 2010 at 02:51:46PM +1100, Nick Piggin wrote:
> On Fri, Jan 15, 2010 at 09:26:02AM +0900, OGAWA Hirofumi wrote:
> > Andrew Morton <[email protected]> writes:
> >
> > > The fix was applied to 2.6.33-rcX
> > > (cedabed49b39b4319bccc059a63344b6232b619c), appears to be needed in
> > > 2.6.32.x but no cc:stable's are present?
> >
> > Ah, yes. I forgot to add "Cc: stable". Please apply this to 2.6.32.x.
>
> Thanks guys, it's quite important so please apply to stable 2.6.32 and
> consider it as a high priority to release.

Now queued up for the next .32 -stable release.

greg k-h