Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1754856AbdCIMPP (ORCPT ); Thu, 9 Mar 2017 07:15:15 -0500 Received: from mx2.suse.de ([195.135.220.15]:36155 "EHLO mx2.suse.de" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1753160AbdCIMPM (ORCPT ); Thu, 9 Mar 2017 07:15:12 -0500 Date: Thu, 9 Mar 2017 13:15:07 +0100 From: Jan Kara To: Vlastimil Babka Cc: akpm@linux-foundation.org, zhouxianrong@huawei.com, Mi.Sophia.Wang@huawei.com, hannes@cmpxchg.org, kirill.shutemov@linux.intel.com, mgorman@techsingularity.net, minchan@kernel.org, viro@zeniv.linux.org.uk, weidu.du@huawei.com, won.ho.park@huawei.com, zhangshiming5@huawei.com, zhouxiaoyan1@huawei.com, zhouxiyu@huawei.com, mm-commits@vger.kernel.org, Jan Kara , Hugh Dickins , Linux-FSDevel , LKML Subject: Re: + compaction-add-def_blk_aops-migrate-function-for-memory-compaction.patch added to -mm tree Message-ID: <20170309121507.GH15874@quack2.suse.cz> References: <58c099dc.qZs+2fQBHcULYdhi%akpm@linux-foundation.org> <6867e5fe-b67e-fe01-4cc4-8338b4043577@suse.cz> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <6867e5fe-b67e-fe01-4cc4-8338b4043577@suse.cz> User-Agent: Mutt/1.5.24 (2015-08-30) Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 4664 Lines: 111 On Thu 09-03-17 08:58:45, Vlastimil Babka wrote: > On 03/09/2017 12:55 AM, akpm@linux-foundation.org wrote: > > > > The patch titled > > Subject: compaction: add def_blk_aops migrate function for memory compaction > > has been added to the -mm tree. Its filename is > > compaction-add-def_blk_aops-migrate-function-for-memory-compaction.patch > > > > This patch should soon appear at > > http://ozlabs.org/~akpm/mmots/broken-out/compaction-add-def_blk_aops-migrate-function-for-memory-compaction.patch > > and later at > > http://ozlabs.org/~akpm/mmotm/broken-out/compaction-add-def_blk_aops-migrate-function-for-memory-compaction.patch > > > > Before you just go and hit "reply", please: > > a) Consider who else should be cc'ed > > b) Prefer to cc a suitable mailing list as well > > c) Ideally: find the original patch on the mailing list and do a > > reply-to-all to that, adding suitable additional cc's > > > > *** Remember to use Documentation/SubmitChecklist when testing your code *** > > > > The -mm tree is included into linux-next and is updated > > there every 3-4 working days > > > > ------------------------------------------------------ > > From: zhouxianrong > > Subject: compaction: add def_blk_aops migrate function for memory compaction > > That's not really a mm/compaction patch, but a block layer/migration patch. I > don't know internals of those so well, so I added some CC's. Thanks! > > The reason for doing this is based on two factors. > > > > 1. larg file read/write operations with order 0 can fragmentize > > memory rapidly. > > > > 2. when a special filesystem does not supply migratepage callback, > > kernel would fallback to default function fallback_migrate_page. > > but fallback_migrate_page could not migrate diry page nicely; > > specially kcompactd with MIGRATE_SYNC_LIGHT could not migrate > > diry pages due to this until clear_page_dirty_for_io in some > > procedure. i think it is not suitable here in this scenario. > > for dirty pages we should migrate it rather than skip or writeout > > it in kcomapctd with MIGRATE_SYNC_LIGHT. i think this problem is > > for all filesystem without migratepage not only for block device fs. > > > > So for compaction under large file writing supply migratepage for > > def_blk_aops. > > Is this really safe to do? buffer_migrate_page() has some assumptions listed in > its comment (and maybe more that are not listed). Do we know it's safe to use it > for all def_blk_aops users? So this is definitely *not* OK for ext4 which is using def_blk_aops. Ext4 will create temporary buffer heads that alias buffer heads attached to the block device page when committing a transaction (see fs/jbd2/journal.c: jbd2_journal_write_metadata_buffer()) and these will point to the original page. Now I'm not saying ext4 / jbd2 cannot be modified so that it somehow makes it clear to the page migration code it should stay away from the page. However it demostrates that this patch is not safe as is... Honza > > > Link: http://lkml.kernel.org/r/1488937915-78955-1-git-send-email-zhouxianrong@huawei.com > > Signed-off-by: zhouxianrong > > Cc: Kirill A. Shutemov > > Cc: Johannes Weiner > > Cc: Minchan Kim > > Cc: Mel Gorman > > Cc: Vlastimil Babka > > Cc: Al Viro > > Cc: > > Cc: > > Cc: > > Cc: > > Cc: > > Cc: > > Signed-off-by: Andrew Morton > > --- > > > > fs/block_dev.c | 3 +++ > > 1 file changed, 3 insertions(+) > > > > diff -puN fs/block_dev.c~compaction-add-def_blk_aops-migrate-function-for-memory-compaction fs/block_dev.c > > --- a/fs/block_dev.c~compaction-add-def_blk_aops-migrate-function-for-memory-compaction > > +++ a/fs/block_dev.c > > @@ -2064,6 +2064,9 @@ static const struct address_space_operat > > .releasepage = blkdev_releasepage, > > .direct_IO = blkdev_direct_IO, > > .is_dirty_writeback = buffer_check_dirty_writeback, > > +#ifdef CONFIG_MIGRATION > > + .migratepage = buffer_migrate_page, > > +#endif > > }; > > > > #define BLKDEV_FALLOC_FL_SUPPORTED \ > > _ > > > > Patches currently in -mm which might be from zhouxianrong@huawei.com are > > > > compaction-add-def_blk_aops-migrate-function-for-memory-compaction.patch > > > -- Jan Kara SUSE Labs, CR