2016-03-15 14:16:21

by Richard Weinberger

[permalink] [raw]
Subject: Page migration issue with UBIFS

Hi!

We're facing this issue from 2014 on UBIFS:
http://www.spinics.net/lists/linux-fsdevel/msg79941.html

So sum up:
UBIFS does not allow pages directly marked as dirty. It want's everyone to do it via UBIFS's
->wirte_end() and ->page_mkwirte() functions.
This assumption *seems* to be violated by CMA which migrates pages.
UBIFS enforces this because it has to account free space on the flash,
in UBIFS speak "budget", for details please see fs/ubifs/file.c.

As in the report from 2014 the page is writable but not dirty.
The kernel has this debug patch applied:
http://www.spinics.net/lists/linux-fsdevel/msg80471.html
But our kernel is based on v4.4 and does *not* use proprietary modules.

[ 213.450000] page:debe03c0 count:3 mapcount:1 mapping:dce4b5fc index:0x2f
[ 213.460000] flags: 0x9(locked|uptodate)
[ 213.460000] page dumped because: try_to_unmap_one
[ 213.470000] pte_write: 1
[ 213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
[ 213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
[ 213.490000] Hardware name: Allwinner sun4i/sun5i Families
[ 213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
[ 213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
[ 213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
[ 213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
[ 213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
[ 213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
[ 213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
[ 213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
[ 213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
[ 213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
[ 213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
[ 213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
[ 213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
[ 213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
[ 213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
[ 213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
[ 213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
[ 213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
[ 213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)

The full kernellog can be found here:
http://code.bulix.org/ysuo9x-93716?raw

So, let me repeat Artem's question from 2014:
> Now the question is: is it UBIFS which has incorrect assumptions, or this is the
> Linux MM which is not doing the right thing? I do not know the answer, let's see
> if the MM list may give us a clue.

Thanks,
//richard


2016-03-15 15:17:33

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Tue, Mar 15, 2016 at 03:16:11PM +0100, Richard Weinberger wrote:
> Hi!
>
> We're facing this issue from 2014 on UBIFS:
> http://www.spinics.net/lists/linux-fsdevel/msg79941.html
>
> So sum up:
> UBIFS does not allow pages directly marked as dirty. It want's everyone to do it via UBIFS's
> ->wirte_end() and ->page_mkwirte() functions.
> This assumption *seems* to be violated by CMA which migrates pages.

I don't thing the CMA/migration is the root cause.

How did we end up with writable and dirty pte, but not having
->page_mkwrite() called for the page?

Or if ->page_mkwrite() was called, why the page is not dirty?


> UBIFS enforces this because it has to account free space on the flash,
> in UBIFS speak "budget", for details please see fs/ubifs/file.c.
>
> As in the report from 2014 the page is writable but not dirty.
> The kernel has this debug patch applied:
> http://www.spinics.net/lists/linux-fsdevel/msg80471.html
> But our kernel is based on v4.4 and does *not* use proprietary modules.
>
> [ 213.450000] page:debe03c0 count:3 mapcount:1 mapping:dce4b5fc index:0x2f
> [ 213.460000] flags: 0x9(locked|uptodate)
> [ 213.460000] page dumped because: try_to_unmap_one
> [ 213.470000] pte_write: 1
> [ 213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
> [ 213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
> [ 213.490000] Hardware name: Allwinner sun4i/sun5i Families
> [ 213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
> [ 213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
> [ 213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
> [ 213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
> [ 213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
> [ 213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
> [ 213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
> [ 213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
> [ 213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
> [ 213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
> [ 213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
> [ 213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
> [ 213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
> [ 213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
> [ 213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
> [ 213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
> [ 213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
> [ 213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
> [ 213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
>
> The full kernellog can be found here:
> http://code.bulix.org/ysuo9x-93716?raw
>
> So, let me repeat Artem's question from 2014:
> > Now the question is: is it UBIFS which has incorrect assumptions, or this is the
> > Linux MM which is not doing the right thing? I do not know the answer, let's see
> > if the MM list may give us a clue.
>
> Thanks,
> //richard
> --
> To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html

--
Kirill A. Shutemov

2016-03-15 15:26:19

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Kirill,

Am 15.03.2016 um 16:17 schrieb Kirill A. Shutemov:
> On Tue, Mar 15, 2016 at 03:16:11PM +0100, Richard Weinberger wrote:
>> Hi!
>>
>> We're facing this issue from 2014 on UBIFS:
>> http://www.spinics.net/lists/linux-fsdevel/msg79941.html
>>
>> So sum up:
>> UBIFS does not allow pages directly marked as dirty. It want's everyone to do it via UBIFS's
>> ->wirte_end() and ->page_mkwirte() functions.
>> This assumption *seems* to be violated by CMA which migrates pages.
>
> I don't thing the CMA/migration is the root cause.
>
> How did we end up with writable and dirty pte, but not having
> ->page_mkwrite() called for the page?
>
> Or if ->page_mkwrite() was called, why the page is not dirty?

Thanks for your quick response!

I also don't think that the root cause is CMA or migration but it seems
to be the messenger.

Can you confirm that UBIFS's assumptions are valid?
I'm trying to rule out possible issues and hunt down the root cause...

Thanks,
//richard

2016-03-15 15:32:46

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Am 15.03.2016 um 16:17 schrieb Kirill A. Shutemov:
> On Tue, Mar 15, 2016 at 03:16:11PM +0100, Richard Weinberger wrote:
>> Hi!
>>
>> We're facing this issue from 2014 on UBIFS:
>> http://www.spinics.net/lists/linux-fsdevel/msg79941.html
>>
>> So sum up:
>> UBIFS does not allow pages directly marked as dirty. It want's everyone to do it via UBIFS's
>> ->wirte_end() and ->page_mkwirte() functions.
>> This assumption *seems* to be violated by CMA which migrates pages.
>
> I don't thing the CMA/migration is the root cause.
>
> How did we end up with writable and dirty pte, but not having
> ->page_mkwrite() called for the page?
>
> Or if ->page_mkwrite() was called, why the page is not dirty?

BTW: UBIFS does not implement ->migratepage(), could this be a problem?

Thanks,
//richard

2016-03-15 15:36:08

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Tue, Mar 15, 2016 at 04:25:50PM +0100, Richard Weinberger wrote:
> Thanks for your quick response!
>
> I also don't think that the root cause is CMA or migration but it seems
> to be the messenger.
>
> Can you confirm that UBIFS's assumptions are valid?
> I'm trying to rule out possible issues and hunt down the root cause...

FYI, XFS would blow up unless either ->write_begin or ->page_mkwrite
were called before dirtying a page. We do an assert that the page has
buffers as the first thing in writepage, and those are the only
two places that should create buffers.

So either no one is using CMA with XFS, or there is another weird
interaction involved.

2016-03-15 15:38:37

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
> > Or if ->page_mkwrite() was called, why the page is not dirty?
>
> BTW: UBIFS does not implement ->migratepage(), could this be a problem?

This might be the reason. I can't reall make sense of
buffer_migrate_page, but it seems to migrate buffer_head state to
the new page.

I'd love to know why CMA even tries to migrate pages that don't have a
->migratepage method, this seems incredibly dangerous to me.

2016-03-15 15:47:14

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Tue, Mar 15, 2016 at 04:25:50PM +0100, Richard Weinberger wrote:
> Kirill,
>
> Am 15.03.2016 um 16:17 schrieb Kirill A. Shutemov:
> > On Tue, Mar 15, 2016 at 03:16:11PM +0100, Richard Weinberger wrote:
> >> Hi!
> >>
> >> We're facing this issue from 2014 on UBIFS:
> >> http://www.spinics.net/lists/linux-fsdevel/msg79941.html
> >>
> >> So sum up:
> >> UBIFS does not allow pages directly marked as dirty. It want's everyone to do it via UBIFS's
> >> ->wirte_end() and ->page_mkwirte() functions.
> >> This assumption *seems* to be violated by CMA which migrates pages.
> >
> > I don't thing the CMA/migration is the root cause.
> >
> > How did we end up with writable and dirty pte, but not having
> > ->page_mkwrite() called for the page?
> >
> > Or if ->page_mkwrite() was called, why the page is not dirty?
>
> Thanks for your quick response!
>
> I also don't think that the root cause is CMA or migration but it seems
> to be the messenger.
>
> Can you confirm that UBIFS's assumptions are valid?
> I'm trying to rule out possible issues and hunt down the root cause...

The assumption looks reasonable for me, but I am not confident enough to
"confirm" it.

--
Kirill A. Shutemov

2016-03-15 16:02:24

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Christoph,

Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
> On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
>>> Or if ->page_mkwrite() was called, why the page is not dirty?
>>
>> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
>
> This might be the reason. I can't reall make sense of
> buffer_migrate_page, but it seems to migrate buffer_head state to
> the new page.

Oh, yes. This makes a lot of sense.

> I'd love to know why CMA even tries to migrate pages that don't have a
> ->migratepage method, this seems incredibly dangerous to me.

CMA folks, can you please clarify? :-)

UBIFS cannot use buffer_migrate_page() as this function assumes a
buffer head and UBIFS works on top of an MTD.
This is most likely why ->migratepage() was never implemented in UBIFS.

Also the documentation is not clear, reads more like an not required optimization:
migrate_page: This is used to compact the physical memory usage.
If the VM wants to relocate a page (maybe off a memory card
that is signalling imminent failure) it will pass a new page
and an old page to this function. migrate_page should
transfer any private data across and update any references
that it has to the page.

...assuming s/migrate_page/migratepage.

Thanks,
//richard

2016-03-15 23:18:58

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
> On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
>>> Or if ->page_mkwrite() was called, why the page is not dirty?
>>
>> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
>
> This might be the reason. I can't reall make sense of
> buffer_migrate_page, but it seems to migrate buffer_head state to
> the new page.
>
> I'd love to know why CMA even tries to migrate pages that don't have a
> ->migratepage method, this seems incredibly dangerous to me.

FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
longer explode upon page migration.
Tomorrow I'll do more tests to make sure.

Thanks,
//richard

2016-03-16 14:22:03

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Wed, Mar 16, 2016 at 12:18:50AM +0100, Richard Weinberger wrote:
> Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
> > On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
> >>> Or if ->page_mkwrite() was called, why the page is not dirty?
> >>
> >> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
> >
> > This might be the reason. I can't reall make sense of
> > buffer_migrate_page, but it seems to migrate buffer_head state to
> > the new page.
> >
> > I'd love to know why CMA even tries to migrate pages that don't have a
> > ->migratepage method, this seems incredibly dangerous to me.
>
> FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
> longer explode upon page migration.
> Tomorrow I'll do more tests to make sure.

Could you check if something like this would fix the issue.
Completely untested.

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 065c88f8e4b8..9da34120dc5e 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -52,6 +52,7 @@
#include "ubifs.h"
#include <linux/mount.h>
#include <linux/slab.h>
+#include <linux/migrate.h>

static int read_block(struct inode *inode, void *addr, unsigned int block,
struct ubifs_data_node *dn)
@@ -1452,6 +1453,20 @@ static int ubifs_set_page_dirty(struct page *page)
return ret;
}

+static int ubifs_migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+ if (PagePrivate(page)) {
+ SetPagePrivate(newpage);
+ __set_page_dirty_nobuffers(newpage);
+ }
+
+ if (PageChecked(page))
+ SetPageChecked(newpage);
+
+ return migrate_page(mapping, newpage, page, mode);
+}
+
static int ubifs_releasepage(struct page *page, gfp_t unused_gfp_flags)
{
/*
@@ -1591,6 +1606,7 @@ const struct address_space_operations ubifs_file_address_operations = {
.write_end = ubifs_write_end,
.invalidatepage = ubifs_invalidatepage,
.set_page_dirty = ubifs_set_page_dirty,
+ .migratepage = ubifs_migrate_page,
.releasepage = ubifs_releasepage,
};

--
Kirill A. Shutemov

2016-03-16 14:27:41

by Kirill A. Shutemov

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Wed, Mar 16, 2016 at 05:21:56PM +0300, Kirill A. Shutemov wrote:
> On Wed, Mar 16, 2016 at 12:18:50AM +0100, Richard Weinberger wrote:
> > Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
> > > On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
> > >>> Or if ->page_mkwrite() was called, why the page is not dirty?
> > >>
> > >> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
> > >
> > > This might be the reason. I can't reall make sense of
> > > buffer_migrate_page, but it seems to migrate buffer_head state to
> > > the new page.
> > >
> > > I'd love to know why CMA even tries to migrate pages that don't have a
> > > ->migratepage method, this seems incredibly dangerous to me.
> >
> > FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
> > longer explode upon page migration.
> > Tomorrow I'll do more tests to make sure.
>
> Could you check if something like this would fix the issue.
> Completely untested.
>
> diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
> index 065c88f8e4b8..9da34120dc5e 100644
> --- a/fs/ubifs/file.c
> +++ b/fs/ubifs/file.c
> @@ -52,6 +52,7 @@
> #include "ubifs.h"
> #include <linux/mount.h>
> #include <linux/slab.h>
> +#include <linux/migrate.h>
>
> static int read_block(struct inode *inode, void *addr, unsigned int block,
> struct ubifs_data_node *dn)
> @@ -1452,6 +1453,20 @@ static int ubifs_set_page_dirty(struct page *page)
> return ret;
> }
>
> +static int ubifs_migrate_page(struct address_space *mapping,
> + struct page *newpage, struct page *page, enum migrate_mode mode)
> +{
> + if (PagePrivate(page)) {
> + SetPagePrivate(newpage);
> + __set_page_dirty_nobuffers(newpage);
> + }
> +
> + if (PageChecked(page))
> + SetPageChecked(newpage);

These two lines are redundant, migrate_page_copy() would do this for us.

> +
> + return migrate_page(mapping, newpage, page, mode);
> +}
> +
> static int ubifs_releasepage(struct page *page, gfp_t unused_gfp_flags)
> {
> /*
> @@ -1591,6 +1606,7 @@ const struct address_space_operations ubifs_file_address_operations = {
> .write_end = ubifs_write_end,
> .invalidatepage = ubifs_invalidatepage,
> .set_page_dirty = ubifs_set_page_dirty,
> + .migratepage = ubifs_migrate_page,
> .releasepage = ubifs_releasepage,
> };
>
> --
> Kirill A. Shutemov

--
Kirill A. Shutemov

2016-03-16 20:47:33

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Adding more CC's.

Am 16.03.2016 um 15:27 schrieb Kirill A. Shutemov:
> On Wed, Mar 16, 2016 at 05:21:56PM +0300, Kirill A. Shutemov wrote:
>> On Wed, Mar 16, 2016 at 12:18:50AM +0100, Richard Weinberger wrote:
>>> Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
>>>> On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
>>>>>> Or if ->page_mkwrite() was called, why the page is not dirty?
>>>>>
>>>>> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
>>>>
>>>> This might be the reason. I can't reall make sense of
>>>> buffer_migrate_page, but it seems to migrate buffer_head state to
>>>> the new page.
>>>>
>>>> I'd love to know why CMA even tries to migrate pages that don't have a
>>>> ->migratepage method, this seems incredibly dangerous to me.
>>>
>>> FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
>>> longer explode upon page migration.
>>> Tomorrow I'll do more tests to make sure.
>>
>> Could you check if something like this would fix the issue.

Nope.

[ 108.080000] BUG: Bad page state in process drm-stress-test pfn:5c674
[ 108.080000] page:deb8ce80 count:0 mapcount:0 mapping: (null) index:0x0
[ 108.090000] flags: 0x810(dirty|private)
[ 108.100000] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
[ 108.100000] bad because of flags:
[ 108.110000] flags: 0x800(private)
[ 108.110000] Modules linked in:
[ 108.120000] CPU: 0 PID: 1855 Comm: drm-stress-test Not tainted 4.4.4-gaae1ad1-dirty #14
[ 108.120000] Hardware name: Allwinner sun4i/sun5i Families
[ 108.120000] [<c0015eb4>] (unwind_backtrace) from [<c0012cec>] (show_stack+0x10/0x14)
[ 108.120000] [<c0012cec>] (show_stack) from [<c02abaf8>] (dump_stack+0x8c/0xa0)
[ 108.120000] [<c02abaf8>] (dump_stack) from [<c00cbe78>] (bad_page+0xcc/0x11c)
[ 108.120000] [<c00cbe78>] (bad_page) from [<c00cc0f4>] (free_pages_prepare+0x22c/0x2f4)
[ 108.120000] [<c00cc0f4>] (free_pages_prepare) from [<c00cdf2c>] (free_hot_cold_page+0x34/0x194)
[ 108.120000] [<c00cdf2c>] (free_hot_cold_page) from [<c00ce0d4>] (free_hot_cold_page_list+0x48/0xdc)
[ 108.120000] [<c00ce0d4>] (free_hot_cold_page_list) from [<c00d55a8>] (release_pages+0x1dc/0x224)
[ 108.120000] [<c00d55a8>] (release_pages) from [<c00d56d8>] (pagevec_lru_move_fn+0xe8/0xf8)
[ 108.120000] [<c00d56d8>] (pagevec_lru_move_fn) from [<c00d579c>] (__lru_cache_add+0x60/0x88)
[ 108.120000] [<c00d579c>] (__lru_cache_add) from [<c00d9578>] (putback_lru_page+0x68/0xbc)
[ 108.120000] [<c00d9578>] (putback_lru_page) from [<c010bd6c>] (migrate_pages+0x208/0x730)
[ 108.120000] [<c010bd6c>] (migrate_pages) from [<c00d0860>] (alloc_contig_range+0x168/0x2f4)
[ 108.120000] [<c00d0860>] (alloc_contig_range) from [<c010cdb4>] (cma_alloc+0x170/0x2c0)
[ 108.120000] [<c010cdb4>] (cma_alloc) from [<c001a9d4>] (__alloc_from_contiguous+0x38/0xd8)
[ 108.120000] [<c001a9d4>] (__alloc_from_contiguous) from [<c001adb8>] (__dma_alloc+0x234/0x278)
[ 108.120000] [<c001adb8>] (__dma_alloc) from [<c001ae8c>] (arm_dma_alloc+0x54/0x5c)
[ 108.120000] [<c001ae8c>] (arm_dma_alloc) from [<c035bd70>] (drm_gem_cma_create+0x9c/0xf0)
[ 108.120000] [<c035bd70>] (drm_gem_cma_create) from [<c035bde0>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
[ 108.120000] [<c035bde0>] (drm_gem_cma_create_with_handle) from [<c035bf48>] (drm_gem_cma_dumb_create+0x3c/0x48)
[ 108.120000] [<c035bf48>] (drm_gem_cma_dumb_create) from [<c0340d18>] (drm_ioctl+0x12c/0x440)
[ 108.120000] [<c0340d18>] (drm_ioctl) from [<c011fc7c>] (do_vfs_ioctl+0x3f4/0x614)
[ 108.120000] [<c011fc7c>] (do_vfs_ioctl) from [<c011fed0>] (SyS_ioctl+0x34/0x5c)
[ 108.120000] [<c011fed0>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)

It is still not clear why UBIFS has to provide a >migratepage() and what the expected semantics
are.
What we know so far is that the fall back migration function is broken. I'm sure not only on UBIFS.

Can CMA folks please clarify? :-)

Thanks,
//richard

2016-03-16 22:55:39

by Richard Weinberger

[permalink] [raw]
Subject: [PATCH] UBIFS: Implement ->migratepage()

From: "Kirill A. Shutemov" <[email protected]>

When using CMA during page migrations UBIFS might get confused
and the following assert triggers:
UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)

UBIFS is using PagePrivate() which can have different meanings across
filesystems. Therefore the generic page migration code cannot handle this
case correctly.
We have to implement our own migration function which basically does a
plain copy but also duplicates the page private flag.

Signed-off-by: Kirill A. Shutemov <[email protected]>
[rw: Massaged changelog]
Signed-off-by: Richard Weinberger <[email protected]>
---
fs/ubifs/file.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
index 0edc128..48b2944 100644
--- a/fs/ubifs/file.c
+++ b/fs/ubifs/file.c
@@ -52,6 +52,7 @@
#include "ubifs.h"
#include <linux/mount.h>
#include <linux/slab.h>
+#include <linux/migrate.h>

static int read_block(struct inode *inode, void *addr, unsigned int block,
struct ubifs_data_node *dn)
@@ -1452,6 +1453,24 @@ static int ubifs_set_page_dirty(struct page *page)
return ret;
}

+static int ubifs_migrate_page(struct address_space *mapping,
+ struct page *newpage, struct page *page, enum migrate_mode mode)
+{
+ int rc;
+
+ rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
+ if (rc != MIGRATEPAGE_SUCCESS)
+ return rc;
+
+ if (PagePrivate(page)) {
+ ClearPagePrivate(page);
+ SetPagePrivate(newpage);
+ }
+
+ migrate_page_copy(newpage, page);
+ return MIGRATEPAGE_SUCCESS;
+}
+
static int ubifs_releasepage(struct page *page, gfp_t unused_gfp_flags)
{
/*
@@ -1591,6 +1610,7 @@ const struct address_space_operations ubifs_file_address_operations = {
.write_end = ubifs_write_end,
.invalidatepage = ubifs_invalidatepage,
.set_page_dirty = ubifs_set_page_dirty,
+ .migratepage = ubifs_migrate_page,
.releasepage = ubifs_releasepage,
};

--
2.5.0

2016-03-16 23:14:16

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] UBIFS: Implement ->migratepage()

Hi Kirill,

[auto build test ERROR on v4.5-rc7]
[also build test ERROR on next-20160316]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url: https://github.com/0day-ci/linux/commits/Richard-Weinberger/UBIFS-Implement-migratepage/20160317-065742
config: sh-allyesconfig (attached as .config)
reproduce:
wget https://git.kernel.org/cgit/linux/kernel/git/wfg/lkp-tests.git/plain/sbin/make.cross -O ~/bin/make.cross
chmod +x ~/bin/make.cross
# save the attached .config to linux build tree
make.cross ARCH=sh

All errors (new ones prefixed by >>):

fs/ubifs/file.c: In function 'ubifs_migrate_page':
>> fs/ubifs/file.c:1461:2: error: implicit declaration of function 'migrate_page_move_mapping' [-Werror=implicit-function-declaration]
cc1: some warnings being treated as errors

vim +/migrate_page_move_mapping +1461 fs/ubifs/file.c

1455
1456 static int ubifs_migrate_page(struct address_space *mapping,
1457 struct page *newpage, struct page *page, enum migrate_mode mode)
1458 {
1459 int rc;
1460
> 1461 rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
1462 if (rc != MIGRATEPAGE_SUCCESS)
1463 return rc;
1464

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (1.38 kB)
.config.gz (37.68 kB)
Download all attachments

2016-03-17 04:40:37

by kernel test robot

[permalink] [raw]
Subject: Re: [PATCH] UBIFS: Implement ->migratepage()

Hi Kirill,

[auto build test ERROR on v4.5-rc7]
[also build test ERROR on next-20160316]
[if your patch is applied to the wrong git tree, please drop us a note to help improving the system]

url: https://github.com/0day-ci/linux/commits/Richard-Weinberger/UBIFS-Implement-migratepage/20160317-065742
config: x86_64-allmodconfig (attached as .config)
reproduce:
# save the attached .config to linux build tree
make ARCH=x86_64

All errors (new ones prefixed by >>):

>> ERROR: "migrate_page_move_mapping" [fs/ubifs/ubifs.ko] undefined!
>> ERROR: "migrate_page_copy" [fs/ubifs/ubifs.ko] undefined!

---
0-DAY kernel test infrastructure Open Source Technology Center
https://lists.01.org/pipermail/kbuild-all Intel Corporation


Attachments:
(No filename) (777.00 B)
.config.gz (51.88 kB)
Download all attachments

2016-03-17 07:10:52

by Joonsoo Kim

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Wed, Mar 16, 2016 at 09:47:20PM +0100, Richard Weinberger wrote:
> Adding more CC's.
>
> Am 16.03.2016 um 15:27 schrieb Kirill A. Shutemov:
> > On Wed, Mar 16, 2016 at 05:21:56PM +0300, Kirill A. Shutemov wrote:
> >> On Wed, Mar 16, 2016 at 12:18:50AM +0100, Richard Weinberger wrote:
> >>> Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
> >>>> On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
> >>>>>> Or if ->page_mkwrite() was called, why the page is not dirty?
> >>>>>
> >>>>> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
> >>>>
> >>>> This might be the reason. I can't reall make sense of
> >>>> buffer_migrate_page, but it seems to migrate buffer_head state to
> >>>> the new page.
> >>>>
> >>>> I'd love to know why CMA even tries to migrate pages that don't have a
> >>>> ->migratepage method, this seems incredibly dangerous to me.
> >>>
> >>> FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
> >>> longer explode upon page migration.
> >>> Tomorrow I'll do more tests to make sure.
> >>
> >> Could you check if something like this would fix the issue.
>
> Nope.
>
> [ 108.080000] BUG: Bad page state in process drm-stress-test pfn:5c674
> [ 108.080000] page:deb8ce80 count:0 mapcount:0 mapping: (null) index:0x0
> [ 108.090000] flags: 0x810(dirty|private)
> [ 108.100000] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [ 108.100000] bad because of flags:
> [ 108.110000] flags: 0x800(private)
> [ 108.110000] Modules linked in:
> [ 108.120000] CPU: 0 PID: 1855 Comm: drm-stress-test Not tainted 4.4.4-gaae1ad1-dirty #14
> [ 108.120000] Hardware name: Allwinner sun4i/sun5i Families
> [ 108.120000] [<c0015eb4>] (unwind_backtrace) from [<c0012cec>] (show_stack+0x10/0x14)
> [ 108.120000] [<c0012cec>] (show_stack) from [<c02abaf8>] (dump_stack+0x8c/0xa0)
> [ 108.120000] [<c02abaf8>] (dump_stack) from [<c00cbe78>] (bad_page+0xcc/0x11c)
> [ 108.120000] [<c00cbe78>] (bad_page) from [<c00cc0f4>] (free_pages_prepare+0x22c/0x2f4)
> [ 108.120000] [<c00cc0f4>] (free_pages_prepare) from [<c00cdf2c>] (free_hot_cold_page+0x34/0x194)
> [ 108.120000] [<c00cdf2c>] (free_hot_cold_page) from [<c00ce0d4>] (free_hot_cold_page_list+0x48/0xdc)
> [ 108.120000] [<c00ce0d4>] (free_hot_cold_page_list) from [<c00d55a8>] (release_pages+0x1dc/0x224)
> [ 108.120000] [<c00d55a8>] (release_pages) from [<c00d56d8>] (pagevec_lru_move_fn+0xe8/0xf8)
> [ 108.120000] [<c00d56d8>] (pagevec_lru_move_fn) from [<c00d579c>] (__lru_cache_add+0x60/0x88)
> [ 108.120000] [<c00d579c>] (__lru_cache_add) from [<c00d9578>] (putback_lru_page+0x68/0xbc)
> [ 108.120000] [<c00d9578>] (putback_lru_page) from [<c010bd6c>] (migrate_pages+0x208/0x730)
> [ 108.120000] [<c010bd6c>] (migrate_pages) from [<c00d0860>] (alloc_contig_range+0x168/0x2f4)
> [ 108.120000] [<c00d0860>] (alloc_contig_range) from [<c010cdb4>] (cma_alloc+0x170/0x2c0)
> [ 108.120000] [<c010cdb4>] (cma_alloc) from [<c001a9d4>] (__alloc_from_contiguous+0x38/0xd8)
> [ 108.120000] [<c001a9d4>] (__alloc_from_contiguous) from [<c001adb8>] (__dma_alloc+0x234/0x278)
> [ 108.120000] [<c001adb8>] (__dma_alloc) from [<c001ae8c>] (arm_dma_alloc+0x54/0x5c)
> [ 108.120000] [<c001ae8c>] (arm_dma_alloc) from [<c035bd70>] (drm_gem_cma_create+0x9c/0xf0)
> [ 108.120000] [<c035bd70>] (drm_gem_cma_create) from [<c035bde0>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
> [ 108.120000] [<c035bde0>] (drm_gem_cma_create_with_handle) from [<c035bf48>] (drm_gem_cma_dumb_create+0x3c/0x48)
> [ 108.120000] [<c035bf48>] (drm_gem_cma_dumb_create) from [<c0340d18>] (drm_ioctl+0x12c/0x440)
> [ 108.120000] [<c0340d18>] (drm_ioctl) from [<c011fc7c>] (do_vfs_ioctl+0x3f4/0x614)
> [ 108.120000] [<c011fc7c>] (do_vfs_ioctl) from [<c011fed0>] (SyS_ioctl+0x34/0x5c)
> [ 108.120000] [<c011fed0>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
>
> It is still not clear why UBIFS has to provide a >migratepage() and what the expected semantics
> are.
> What we know so far is that the fall back migration function is broken. I'm sure not only on UBIFS.
>
> Can CMA folks please clarify? :-)

Hello,

As you mentioned earlier, this issue would not be directly related
to CMA. It looks like it is more general issue related to interaction
between MM and FS. Your first error log shows that error happens when
ubifs_set_page_dirty() is called in try_to_unmap_one() which also
can be called by reclaimer (kswapd or direct reclaim). Quick search shows
that problem also happens on reclaim. Is that fixed?

http://www.spinics.net/lists/linux-fsdevel/msg79531.html

I think that you need to CC other people who understand interaction
between MM and FS perfectly.

Sorry about not much helpful here.

Thanks.

2016-03-17 08:10:13

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH] UBIFS: Implement ->migratepage()

Am 17.03.2016 um 05:39 schrieb kbuild test robot:
> Hi Kirill,
>
> [auto build test ERROR on v4.5-rc7]
> [also build test ERROR on next-20160316]
> [if your patch is applied to the wrong git tree, please drop us a note to help improving the system]
>
> url: https://github.com/0day-ci/linux/commits/Richard-Weinberger/UBIFS-Implement-migratepage/20160317-065742
> config: x86_64-allmodconfig (attached as .config)
> reproduce:
> # save the attached .config to linux build tree
> make ARCH=x86_64
>
> All errors (new ones prefixed by >>):
>
>>> ERROR: "migrate_page_move_mapping" [fs/ubifs/ubifs.ko] undefined!
>>> ERROR: "migrate_page_copy" [fs/ubifs/ubifs.ko] undefined!

Meh. Just noticted that these functions are not exported and therefore not
usable in modules.
So, this patch is not really the solution although it makes the problem go away.

Thanks,
//richard

2016-03-17 08:13:44

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Am 17.03.2016 um 08:11 schrieb Joonsoo Kim:
>> It is still not clear why UBIFS has to provide a >migratepage() and what the expected semantics
>> are.
>> What we know so far is that the fall back migration function is broken. I'm sure not only on UBIFS.
>>
>> Can CMA folks please clarify? :-)
>
> Hello,
>
> As you mentioned earlier, this issue would not be directly related
> to CMA. It looks like it is more general issue related to interaction
> between MM and FS. Your first error log shows that error happens when
> ubifs_set_page_dirty() is called in try_to_unmap_one() which also
> can be called by reclaimer (kswapd or direct reclaim). Quick search shows
> that problem also happens on reclaim. Is that fixed?
>
> http://www.spinics.net/lists/linux-fsdevel/msg79531.html

Well, this problem happened only on a tainted kernel and never popped up again.
So, I really don't know. :-)

> I think that you need to CC other people who understand interaction
> between MM and FS perfectly.

Who is missing on the CC list?

Thanks,
//richard

2016-03-17 09:57:50

by Vlastimil Babka

[permalink] [raw]
Subject: Re: [PATCH] UBIFS: Implement ->migratepage()

+CC Hugh, Mel

On 03/16/2016 11:55 PM, Richard Weinberger wrote:
> From: "Kirill A. Shutemov" <[email protected]>
>
> When using CMA during page migrations UBIFS might get confused

It shouldn't be CMA specific, the same code runs from compaction,
autonuma balancing...

> and the following assert triggers:
> UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
>
> UBIFS is using PagePrivate() which can have different meanings across
> filesystems. Therefore the generic page migration code cannot handle this
> case correctly.
> We have to implement our own migration function which basically does a
> plain copy but also duplicates the page private flag.

Lack of PagePrivate() migration is surely a bug, but at a glance of how
UBIFS uses the flag, it's more about accounting, it shouldn't prevent a
page from being marked PageDirty()?
I suspect your initial bug (which is IIUC the fact that there's a dirty
pte, but PageDirty(page) is false) comes from the generic
fallback_migrate_page() which does:

if (PageDirty(page)) {
/* Only writeback pages in full synchronous migration */
if (mode != MIGRATE_SYNC)
return -EBUSY;
return writeout(mapping, page);
}

And writeout() seems to Clear PageDirty() through
clear_page_dirty_for_io() but I'm not so sure about the pte (or pte's in
all rmaps). But this comment in the latter function:

* Yes, Virginia, this is indeed insane.

scared me enough to not investigate further. Hopefully the people I CC'd
understand more about page migration than me. I'm just an user :)

In any case, this patch would solve both lack of PageDirty() transfer,
and avoid the path leading from fallback_migrate_page() to writeout().
But I'm not confident enough here to ack it.

> Signed-off-by: Kirill A. Shutemov <[email protected]>
> [rw: Massaged changelog]
> Signed-off-by: Richard Weinberger <[email protected]>
> ---
> fs/ubifs/file.c | 20 ++++++++++++++++++++
> 1 file changed, 20 insertions(+)
>
> diff --git a/fs/ubifs/file.c b/fs/ubifs/file.c
> index 0edc128..48b2944 100644
> --- a/fs/ubifs/file.c
> +++ b/fs/ubifs/file.c
> @@ -52,6 +52,7 @@
> #include "ubifs.h"
> #include <linux/mount.h>
> #include <linux/slab.h>
> +#include <linux/migrate.h>
>
> static int read_block(struct inode *inode, void *addr, unsigned int block,
> struct ubifs_data_node *dn)
> @@ -1452,6 +1453,24 @@ static int ubifs_set_page_dirty(struct page *page)
> return ret;
> }
>
> +static int ubifs_migrate_page(struct address_space *mapping,
> + struct page *newpage, struct page *page, enum migrate_mode mode)
> +{
> + int rc;
> +
> + rc = migrate_page_move_mapping(mapping, newpage, page, NULL, mode, 0);
> + if (rc != MIGRATEPAGE_SUCCESS)
> + return rc;
> +
> + if (PagePrivate(page)) {
> + ClearPagePrivate(page);
> + SetPagePrivate(newpage);
> + }
> +
> + migrate_page_copy(newpage, page);
> + return MIGRATEPAGE_SUCCESS;
> +}
> +
> static int ubifs_releasepage(struct page *page, gfp_t unused_gfp_flags)
> {
> /*
> @@ -1591,6 +1610,7 @@ const struct address_space_operations ubifs_file_address_operations = {
> .write_end = ubifs_write_end,
> .invalidatepage = ubifs_invalidatepage,
> .set_page_dirty = ubifs_set_page_dirty,
> + .migratepage = ubifs_migrate_page,
> .releasepage = ubifs_releasepage,
> };
>
>

2016-03-17 15:17:22

by Joonsoo Kim

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

2016-03-17 17:13 GMT+09:00 Richard Weinberger <[email protected]>:
> Am 17.03.2016 um 08:11 schrieb Joonsoo Kim:
>>> It is still not clear why UBIFS has to provide a >migratepage() and what the expected semantics
>>> are.
>>> What we know so far is that the fall back migration function is broken. I'm sure not only on UBIFS.
>>>
>>> Can CMA folks please clarify? :-)
>>
>> Hello,
>>
>> As you mentioned earlier, this issue would not be directly related
>> to CMA. It looks like it is more general issue related to interaction
>> between MM and FS. Your first error log shows that error happens when
>> ubifs_set_page_dirty() is called in try_to_unmap_one() which also
>> can be called by reclaimer (kswapd or direct reclaim). Quick search shows
>> that problem also happens on reclaim. Is that fixed?
>>
>> http://www.spinics.net/lists/linux-fsdevel/msg79531.html
>
> Well, this problem happened only on a tainted kernel and never popped up again.
> So, I really don't know. :-)
>
>> I think that you need to CC other people who understand interaction
>> between MM and FS perfectly.
>
> Who is missing on the CC list?

Vlastimil already added Hugh and Mel to other relevant thread but I think
this thread has more information about the problem so I add them here.

Thanks.

2016-03-17 15:26:03

by Boris Brezillon

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Hi,

On Tue, 15 Mar 2016 15:16:11 +0100
Richard Weinberger <[email protected]> wrote:

> Hi!
>
> We're facing this issue from 2014 on UBIFS:
> http://www.spinics.net/lists/linux-fsdevel/msg79941.html

Just to let you know I was able to reproduce the exact same bug on a
sama5d3 with UBIFS + CMA enabled (CMA allocation through the
generic DRM/CMA code), so I think we can exclude a platform specific
bug.

>
> So sum up:
> UBIFS does not allow pages directly marked as dirty. It want's everyone to do it via UBIFS's
> ->wirte_end() and ->page_mkwirte() functions.
> This assumption *seems* to be violated by CMA which migrates pages.
> UBIFS enforces this because it has to account free space on the flash,
> in UBIFS speak "budget", for details please see fs/ubifs/file.c.
>
> As in the report from 2014 the page is writable but not dirty.
> The kernel has this debug patch applied:
> http://www.spinics.net/lists/linux-fsdevel/msg80471.html
> But our kernel is based on v4.4 and does *not* use proprietary modules.
>
> [ 213.450000] page:debe03c0 count:3 mapcount:1 mapping:dce4b5fc index:0x2f
> [ 213.460000] flags: 0x9(locked|uptodate)
> [ 213.460000] page dumped because: try_to_unmap_one
> [ 213.470000] pte_write: 1
> [ 213.480000] UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
> [ 213.490000] CPU: 0 PID: 436 Comm: drm-stress-test Not tainted 4.4.4-00176-geaa802524636-dirty #1008
> [ 213.490000] Hardware name: Allwinner sun4i/sun5i Families
> [ 213.490000] [<c0015e70>] (unwind_backtrace) from [<c0012cdc>] (show_stack+0x10/0x14)
> [ 213.490000] [<c0012cdc>] (show_stack) from [<c02ad834>] (dump_stack+0x8c/0xa0)
> [ 213.490000] [<c02ad834>] (dump_stack) from [<c0236ee8>] (ubifs_set_page_dirty+0x44/0x50)
> [ 213.490000] [<c0236ee8>] (ubifs_set_page_dirty) from [<c00fa0bc>] (try_to_unmap_one+0x10c/0x3a8)
> [ 213.490000] [<c00fa0bc>] (try_to_unmap_one) from [<c00fadb4>] (rmap_walk+0xb4/0x290)
> [ 213.490000] [<c00fadb4>] (rmap_walk) from [<c00fb1bc>] (try_to_unmap+0x64/0x80)
> [ 213.490000] [<c00fb1bc>] (try_to_unmap) from [<c010dc28>] (migrate_pages+0x328/0x7a0)
> [ 213.490000] [<c010dc28>] (migrate_pages) from [<c00d0cb0>] (alloc_contig_range+0x168/0x2f4)
> [ 213.490000] [<c00d0cb0>] (alloc_contig_range) from [<c010ec00>] (cma_alloc+0x170/0x2c0)
> [ 213.490000] [<c010ec00>] (cma_alloc) from [<c001a958>] (__alloc_from_contiguous+0x38/0xd8)
> [ 213.490000] [<c001a958>] (__alloc_from_contiguous) from [<c001ad44>] (__dma_alloc+0x23c/0x274)
> [ 213.490000] [<c001ad44>] (__dma_alloc) from [<c001ae08>] (arm_dma_alloc+0x54/0x5c)
> [ 213.490000] [<c001ae08>] (arm_dma_alloc) from [<c035cecc>] (drm_gem_cma_create+0xb8/0xf0)
> [ 213.490000] [<c035cecc>] (drm_gem_cma_create) from [<c035cf20>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
> [ 213.490000] [<c035cf20>] (drm_gem_cma_create_with_handle) from [<c035d088>] (drm_gem_cma_dumb_create+0x3c/0x48)
> [ 213.490000] [<c035d088>] (drm_gem_cma_dumb_create) from [<c0341ed8>] (drm_ioctl+0x12c/0x444)
> [ 213.490000] [<c0341ed8>] (drm_ioctl) from [<c0121adc>] (do_vfs_ioctl+0x3f4/0x614)
> [ 213.490000] [<c0121adc>] (do_vfs_ioctl) from [<c0121d30>] (SyS_ioctl+0x34/0x5c)
> [ 213.490000] [<c0121d30>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
>
> The full kernellog can be found here:
> http://code.bulix.org/ysuo9x-93716?raw
>
> So, let me repeat Artem's question from 2014:
> > Now the question is: is it UBIFS which has incorrect assumptions, or this is the
> > Linux MM which is not doing the right thing? I do not know the answer, let's see
> > if the MM list may give us a clue.
>
> Thanks,
> //richard



--
Boris Brezillon, Free Electrons
Embedded Linux and Kernel engineering
http://free-electrons.com

2016-03-21 15:28:49

by Christoph Hellwig

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Wed, Mar 16, 2016 at 05:21:56PM +0300, Kirill A. Shutemov wrote:
> > FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
> > longer explode upon page migration.
> > Tomorrow I'll do more tests to make sure.
>
> Could you check if something like this would fix the issue.
> Completely untested.

We really need to make the default safe for file systems that don't
have a migratepage method.

2016-03-21 23:00:08

by Andrew Morton

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

On Wed, 16 Mar 2016 21:47:20 +0100 Richard Weinberger <[email protected]> wrote:

> Adding more CC's.
>
> Am 16.03.2016 um 15:27 schrieb Kirill A. Shutemov:
> > On Wed, Mar 16, 2016 at 05:21:56PM +0300, Kirill A. Shutemov wrote:
> >> On Wed, Mar 16, 2016 at 12:18:50AM +0100, Richard Weinberger wrote:
> >>> Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
> >>>> On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
> >>>>>> Or if ->page_mkwrite() was called, why the page is not dirty?
> >>>>>
> >>>>> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
> >>>>
> >>>> This might be the reason. I can't reall make sense of
> >>>> buffer_migrate_page, but it seems to migrate buffer_head state to
> >>>> the new page.
> >>>>
> >>>> I'd love to know why CMA even tries to migrate pages that don't have a
> >>>> ->migratepage method, this seems incredibly dangerous to me.
> >>>
> >>> FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
> >>> longer explode upon page migration.
> >>> Tomorrow I'll do more tests to make sure.
> >>
> >> Could you check if something like this would fix the issue.
>
> Nope.
>
> [ 108.080000] BUG: Bad page state in process drm-stress-test pfn:5c674
> [ 108.080000] page:deb8ce80 count:0 mapcount:0 mapping: (null) index:0x0
> [ 108.090000] flags: 0x810(dirty|private)
> [ 108.100000] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
> [ 108.100000] bad because of flags:
> [ 108.110000] flags: 0x800(private)
> [ 108.110000] Modules linked in:
> [ 108.120000] CPU: 0 PID: 1855 Comm: drm-stress-test Not tainted 4.4.4-gaae1ad1-dirty #14
> [ 108.120000] Hardware name: Allwinner sun4i/sun5i Families
> [ 108.120000] [<c0015eb4>] (unwind_backtrace) from [<c0012cec>] (show_stack+0x10/0x14)
> [ 108.120000] [<c0012cec>] (show_stack) from [<c02abaf8>] (dump_stack+0x8c/0xa0)
> [ 108.120000] [<c02abaf8>] (dump_stack) from [<c00cbe78>] (bad_page+0xcc/0x11c)
> [ 108.120000] [<c00cbe78>] (bad_page) from [<c00cc0f4>] (free_pages_prepare+0x22c/0x2f4)
> [ 108.120000] [<c00cc0f4>] (free_pages_prepare) from [<c00cdf2c>] (free_hot_cold_page+0x34/0x194)
> [ 108.120000] [<c00cdf2c>] (free_hot_cold_page) from [<c00ce0d4>] (free_hot_cold_page_list+0x48/0xdc)
> [ 108.120000] [<c00ce0d4>] (free_hot_cold_page_list) from [<c00d55a8>] (release_pages+0x1dc/0x224)
> [ 108.120000] [<c00d55a8>] (release_pages) from [<c00d56d8>] (pagevec_lru_move_fn+0xe8/0xf8)
> [ 108.120000] [<c00d56d8>] (pagevec_lru_move_fn) from [<c00d579c>] (__lru_cache_add+0x60/0x88)
> [ 108.120000] [<c00d579c>] (__lru_cache_add) from [<c00d9578>] (putback_lru_page+0x68/0xbc)
> [ 108.120000] [<c00d9578>] (putback_lru_page) from [<c010bd6c>] (migrate_pages+0x208/0x730)
> [ 108.120000] [<c010bd6c>] (migrate_pages) from [<c00d0860>] (alloc_contig_range+0x168/0x2f4)
> [ 108.120000] [<c00d0860>] (alloc_contig_range) from [<c010cdb4>] (cma_alloc+0x170/0x2c0)
> [ 108.120000] [<c010cdb4>] (cma_alloc) from [<c001a9d4>] (__alloc_from_contiguous+0x38/0xd8)
> [ 108.120000] [<c001a9d4>] (__alloc_from_contiguous) from [<c001adb8>] (__dma_alloc+0x234/0x278)
> [ 108.120000] [<c001adb8>] (__dma_alloc) from [<c001ae8c>] (arm_dma_alloc+0x54/0x5c)
> [ 108.120000] [<c001ae8c>] (arm_dma_alloc) from [<c035bd70>] (drm_gem_cma_create+0x9c/0xf0)
> [ 108.120000] [<c035bd70>] (drm_gem_cma_create) from [<c035bde0>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
> [ 108.120000] [<c035bde0>] (drm_gem_cma_create_with_handle) from [<c035bf48>] (drm_gem_cma_dumb_create+0x3c/0x48)
> [ 108.120000] [<c035bf48>] (drm_gem_cma_dumb_create) from [<c0340d18>] (drm_ioctl+0x12c/0x440)
> [ 108.120000] [<c0340d18>] (drm_ioctl) from [<c011fc7c>] (do_vfs_ioctl+0x3f4/0x614)
> [ 108.120000] [<c011fc7c>] (do_vfs_ioctl) from [<c011fed0>] (SyS_ioctl+0x34/0x5c)
> [ 108.120000] [<c011fed0>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
>
> It is still not clear why UBIFS has to provide a >migratepage() and what the expected semantics
> are.
> What we know so far is that the fall back migration function is broken. I'm sure not only on UBIFS.
>

The above says "PagePrivate was still set", and UBIFS does muck with
PagePrivate. Perhaps the fs isn't clearing things up in all the right
places.

2016-03-21 23:06:46

by Richard Weinberger

[permalink] [raw]
Subject: Re: Page migration issue with UBIFS

Am 22.03.2016 um 00:00 schrieb Andrew Morton:
> On Wed, 16 Mar 2016 21:47:20 +0100 Richard Weinberger <[email protected]> wrote:
>
>> Adding more CC's.
>>
>> Am 16.03.2016 um 15:27 schrieb Kirill A. Shutemov:
>>> On Wed, Mar 16, 2016 at 05:21:56PM +0300, Kirill A. Shutemov wrote:
>>>> On Wed, Mar 16, 2016 at 12:18:50AM +0100, Richard Weinberger wrote:
>>>>> Am 15.03.2016 um 16:37 schrieb Christoph Hellwig:
>>>>>> On Tue, Mar 15, 2016 at 04:32:40PM +0100, Richard Weinberger wrote:
>>>>>>>> Or if ->page_mkwrite() was called, why the page is not dirty?
>>>>>>>
>>>>>>> BTW: UBIFS does not implement ->migratepage(), could this be a problem?
>>>>>>
>>>>>> This might be the reason. I can't reall make sense of
>>>>>> buffer_migrate_page, but it seems to migrate buffer_head state to
>>>>>> the new page.
>>>>>>
>>>>>> I'd love to know why CMA even tries to migrate pages that don't have a
>>>>>> ->migratepage method, this seems incredibly dangerous to me.
>>>>>
>>>>> FYI, with a dummy ->migratepage() which returns only -EINVAL UBIFS does no
>>>>> longer explode upon page migration.
>>>>> Tomorrow I'll do more tests to make sure.
>>>>
>>>> Could you check if something like this would fix the issue.
>>
>> Nope.
>>
>> [ 108.080000] BUG: Bad page state in process drm-stress-test pfn:5c674
>> [ 108.080000] page:deb8ce80 count:0 mapcount:0 mapping: (null) index:0x0
>> [ 108.090000] flags: 0x810(dirty|private)
>> [ 108.100000] page dumped because: PAGE_FLAGS_CHECK_AT_FREE flag(s) set
>> [ 108.100000] bad because of flags:
>> [ 108.110000] flags: 0x800(private)
>> [ 108.110000] Modules linked in:
>> [ 108.120000] CPU: 0 PID: 1855 Comm: drm-stress-test Not tainted 4.4.4-gaae1ad1-dirty #14
>> [ 108.120000] Hardware name: Allwinner sun4i/sun5i Families
>> [ 108.120000] [<c0015eb4>] (unwind_backtrace) from [<c0012cec>] (show_stack+0x10/0x14)
>> [ 108.120000] [<c0012cec>] (show_stack) from [<c02abaf8>] (dump_stack+0x8c/0xa0)
>> [ 108.120000] [<c02abaf8>] (dump_stack) from [<c00cbe78>] (bad_page+0xcc/0x11c)
>> [ 108.120000] [<c00cbe78>] (bad_page) from [<c00cc0f4>] (free_pages_prepare+0x22c/0x2f4)
>> [ 108.120000] [<c00cc0f4>] (free_pages_prepare) from [<c00cdf2c>] (free_hot_cold_page+0x34/0x194)
>> [ 108.120000] [<c00cdf2c>] (free_hot_cold_page) from [<c00ce0d4>] (free_hot_cold_page_list+0x48/0xdc)
>> [ 108.120000] [<c00ce0d4>] (free_hot_cold_page_list) from [<c00d55a8>] (release_pages+0x1dc/0x224)
>> [ 108.120000] [<c00d55a8>] (release_pages) from [<c00d56d8>] (pagevec_lru_move_fn+0xe8/0xf8)
>> [ 108.120000] [<c00d56d8>] (pagevec_lru_move_fn) from [<c00d579c>] (__lru_cache_add+0x60/0x88)
>> [ 108.120000] [<c00d579c>] (__lru_cache_add) from [<c00d9578>] (putback_lru_page+0x68/0xbc)
>> [ 108.120000] [<c00d9578>] (putback_lru_page) from [<c010bd6c>] (migrate_pages+0x208/0x730)
>> [ 108.120000] [<c010bd6c>] (migrate_pages) from [<c00d0860>] (alloc_contig_range+0x168/0x2f4)
>> [ 108.120000] [<c00d0860>] (alloc_contig_range) from [<c010cdb4>] (cma_alloc+0x170/0x2c0)
>> [ 108.120000] [<c010cdb4>] (cma_alloc) from [<c001a9d4>] (__alloc_from_contiguous+0x38/0xd8)
>> [ 108.120000] [<c001a9d4>] (__alloc_from_contiguous) from [<c001adb8>] (__dma_alloc+0x234/0x278)
>> [ 108.120000] [<c001adb8>] (__dma_alloc) from [<c001ae8c>] (arm_dma_alloc+0x54/0x5c)
>> [ 108.120000] [<c001ae8c>] (arm_dma_alloc) from [<c035bd70>] (drm_gem_cma_create+0x9c/0xf0)
>> [ 108.120000] [<c035bd70>] (drm_gem_cma_create) from [<c035bde0>] (drm_gem_cma_create_with_handle+0x1c/0xe8)
>> [ 108.120000] [<c035bde0>] (drm_gem_cma_create_with_handle) from [<c035bf48>] (drm_gem_cma_dumb_create+0x3c/0x48)
>> [ 108.120000] [<c035bf48>] (drm_gem_cma_dumb_create) from [<c0340d18>] (drm_ioctl+0x12c/0x440)
>> [ 108.120000] [<c0340d18>] (drm_ioctl) from [<c011fc7c>] (do_vfs_ioctl+0x3f4/0x614)
>> [ 108.120000] [<c011fc7c>] (do_vfs_ioctl) from [<c011fed0>] (SyS_ioctl+0x34/0x5c)
>> [ 108.120000] [<c011fed0>] (SyS_ioctl) from [<c000f2c0>] (ret_fast_syscall+0x0/0x34)
>>
>> It is still not clear why UBIFS has to provide a >migratepage() and what the expected semantics
>> are.
>> What we know so far is that the fall back migration function is broken. I'm sure not only on UBIFS.
>>
>
> The above says "PagePrivate was still set", and UBIFS does muck with
> PagePrivate. Perhaps the fs isn't clearing things up in all the right
> places.

This is not the original splat. It is the output after applying a non-functional patch from Kirill.

Thanks,
//richard

2016-03-25 22:53:57

by Richard Weinberger

[permalink] [raw]
Subject: Re: [PATCH] UBIFS: Implement ->migratepage()

Am 17.03.2016 um 10:57 schrieb Vlastimil Babka:
> +CC Hugh, Mel
>
> On 03/16/2016 11:55 PM, Richard Weinberger wrote:
>> From: "Kirill A. Shutemov" <[email protected]>
>>
>> When using CMA during page migrations UBIFS might get confused
>
> It shouldn't be CMA specific, the same code runs from compaction, autonuma balancing...
>
>> and the following assert triggers:
>> UBIFS assert failed in ubifs_set_page_dirty at 1451 (pid 436)
>>
>> UBIFS is using PagePrivate() which can have different meanings across
>> filesystems. Therefore the generic page migration code cannot handle this
>> case correctly.
>> We have to implement our own migration function which basically does a
>> plain copy but also duplicates the page private flag.
>
> Lack of PagePrivate() migration is surely a bug, but at a glance of how UBIFS uses the flag, it's more about accounting, it shouldn't prevent a page from being marked PageDirty()?
> I suspect your initial bug (which is IIUC the fact that there's a dirty pte, but PageDirty(page) is false) comes from the generic fallback_migrate_page() which does:
>
> if (PageDirty(page)) {
> /* Only writeback pages in full synchronous migration */
> if (mode != MIGRATE_SYNC)
> return -EBUSY;
> return writeout(mapping, page);
> }
>
> And writeout() seems to Clear PageDirty() through clear_page_dirty_for_io() but I'm not so sure about the pte (or pte's in all rmaps). But this comment in the latter function:
>
> * Yes, Virginia, this is indeed insane.
>
> scared me enough to not investigate further. Hopefully the people I CC'd understand more about page migration than me. I'm just an user :)
>
> In any case, this patch would solve both lack of PageDirty() transfer, and avoid the path leading from fallback_migrate_page() to writeout(). But I'm not confident enough here to
> ack it.

Hugh? Mel? Anyone? :-)

It is still not clear to me whether this needs fixing in MM or UBIFS.

Thanks,
//richard