Subject: Re: Suspicious error for CMA stress test
From: Hanjun Guo
To: Joonsoo Kim
CC: Joonsoo Kim, Vlastimil Babka, "Leizhen (ThunderTown)", Laura Abbott, "linux-arm-kernel@lists.infradead.org", "linux-kernel@vger.kernel.org", Andrew Morton, Sasha Levin, qiuxishi, Catalin Marinas, Will Deacon, Arnd Bergmann, dingtinahong, "linux-mm@kvack.org"
Date: Fri, 18 Mar 2016 10:03:50 +0800
Message-ID: <56EB6206.4070802@huawei.com>

On 2016/3/17 23:31, Joonsoo Kim wrote:
[...]
>>> I may find that there is a bug which was introduced by me some time
>>> ago. Could you test following change in __free_one_page() on top of
>>> Vlastimil's patch?
>>>
>>> -page_idx = pfn & ((1 << max_order) - 1);
>>> +page_idx = pfn & ((1 << MAX_ORDER) - 1);
>> I tested Vlastimil's patch + your change with stress for more than half hour, the bug
>> I reported is gone :)
> Good to hear!
>
>> I have some questions, Joonsoo, you provided a patch as following:
>>
>> diff --git a/mm/cma.c b/mm/cma.c
>> index 3a7a67b..952a8a3 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -448,7 +448,10 @@ bool cma_release(struct cma *cma, const struct page *pages, unsigned int count)
>>
>>  	VM_BUG_ON(pfn + count > cma->base_pfn + cma->count);
>>
>> +	mutex_lock(&cma_mutex);
>>  	free_contig_range(pfn, count);
>> +	mutex_unlock(&cma_mutex);
>> +
>>  	cma_clear_bitmap(cma, pfn, count);
>>  	trace_cma_release(pfn, pages, count);
>>
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 7f32950..68ed5ae 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -1559,7 +1559,8 @@ void free_hot_cold_page(struct page *page, bool cold)
>>  	 * excessively into the page allocator
>>  	 */
>>  	if (migratetype >= MIGRATE_PCPTYPES) {
>> -		if (unlikely(is_migrate_isolate(migratetype))) {
>> +		if (is_migrate_cma(migratetype) ||
>> +		    unlikely(is_migrate_isolate(migratetype))) {
>>  			free_one_page(zone, page, pfn, 0, migratetype);
>>  			goto out;
>>  		}
>>
>> This patch also works to fix the bug, why not just use this one? is there
>> any side effects for this patch? maybe there is performance issue as the
>> mutex lock is used, any other issues?
> The changes in free_hot_cold_page() would cause unacceptable performance
> problem in a big machine, because, with above change, it takes zone->lock
> whenever freeing one page on CMA region.

Thanks for the clarification :)

Hanjun