Message-ID: <55837224.2090702@huawei.com>
Date: Fri, 19 Jun 2015 09:36:36 +0800
From: Xishi Qiu
To: "Luck, Tony"
CC: Vlastimil Babka , Andrew Morton , , Yinghai Lu , "H. Peter Anvin" , Thomas Gleixner , , Xiexiuqi , Hanjun Guo , Linux MM , LKML
Subject: Re: [RFC PATCH 00/12] mm: mirrored memory support for page buddy allocations
In-Reply-To: <20150618203335.GA3829@agluck-desk.sc.intel.com>

On 2015/6/19 4:33, Luck, Tony wrote:

> On Thu, Jun 18, 2015 at 11:55:42AM +0200, Vlastimil Babka wrote:
>>>>> If there are many mirror regions in one node, then it will be many holes in the
>>>>> normal zone, is this fine?
>>>>
>>>> Yeah, it doesn't matter how many holes there are.
>>>
>>> So mirror zone and normal zone will span each other, right?
>>>
>>> e.g. node 1: 4G-8G(normal), 8-12G(mirror), 12-16G(normal), 16-24G(mirror), 24-28G(normal) ...
>>> normal: start=4G, size=28-4=24G,
>>> mirror: start=8G, size=24-8=16G,
>>
>> Yes, that works.
>> It's somewhat unfortunate wrt performance that the hardware
>> does it like this though.
>
> With current Xeon h/w you can have one mirrored range per memory
> controller ... and there are two memory controllers on a cpu socket,
> so two mirrored ranges per node. So a map might look like:
>
> SKT0: MC0: 0-2G      Mirrored (but we may want to ignore mirror here to keep it for ZONE_DMA)
> SKT0: MC0: 2G-4G     No memory ... I/O mapping area
> SKT0: MC0: 4G-34G    Not mirrored
> SKT0: MC1: 34G-40G   Mirrored
> SKT0: MC1: 40G-66G   Not mirrored
>
> SKT1: MC0: 66G-70G   Mirror
> SKT1: MC0: 70G-98G   Not Mirrored
> SKT1: MC1: 98G-102G  Mirror
> SKT1: MC1: 102G-130G Not Mirrored
>
> ... and so on.
>
>>> I think zone is defined according to the special address range, like 16M(DMA), 4G(DMA32),
>>
>> Traditionally yes. But then there is ZONE_MOVABLE, this year's LSF/MM we
>> discussed (and didn't outright deny) ZONE_CMA...
>> I'm not saying others will favour the new zone approach though, it's just my
>> opinion that it might be a better option than a new migratetype.
>
> If we are going to have lots of zones ... then perhaps we will
> need a fast way to look at a "struct page" and decide which zone
> it belongs to. Complicated math on the address doesn't sound ideal.
> If the complex zone model is just for 64-bit, are there enough bits
> available in page->flags (3 bits for 8 options ... which we are close
> to filling now ... 4 bits for future breathing room).
>
>>> and is it appropriate to add a new mirror zone with a volatile physical address?
>>
>> By "volatile" you mean what, that the example above would change
>> dynamically? That would be rather challenging...
>
> If we hot-add another cpu together with on die memory controllers connected
> to more memory ... then some of the new memory might be mirrored. Current
> h/w doesn't allow mirrored areas to grow/shrink (though if there are a lot
> of errors we may break a mirror so a whole range could lose the mirror attribute).
>
> -Tony
>

Hi Tony,

What's your suggestion: a new zone or a new migratetype?
Adding a new zone may change more mm code.

Thanks,
Xishi Qiu
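To make the span arithmetic in the quoted node 1 example concrete, here is a minimal userspace sketch in C. It is not kernel code: the names sketch_zone, mem_range, zone_span, zone_span_of() and zone_of_addr() are all hypothetical and exist only for this illustration. It assumes each zone is described by a flat list of [start, end) ranges, and it separates the spanned size (holes included) from the memory actually present, which is similar in spirit to the spanned_pages / present_pages split in struct zone.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GiB (1ULL << 30)

/* Hypothetical zone tags for this sketch; NOT the kernel's enum zone_type. */
enum sketch_zone { SKETCH_NORMAL, SKETCH_MIRROR, SKETCH_NR_ZONES };

/* One physically contiguous range [start, end), tagged with its zone. */
struct mem_range {
	uint64_t start;
	uint64_t end;
	enum sketch_zone zone;
};

/* Per-zone summary: where it starts, how far it spans, how much is present. */
struct zone_span {
	uint64_t start;   /* lowest address belonging to the zone */
	uint64_t spanned; /* highest end minus lowest start, holes included */
	uint64_t present; /* sum of the ranges that really belong to the zone */
};

/* Walk all ranges and accumulate the span and present size of one zone. */
static struct zone_span zone_span_of(const struct mem_range *r, size_t n,
				     enum sketch_zone zone)
{
	struct zone_span s = { 0, 0, 0 };
	uint64_t lo = UINT64_MAX, hi = 0;

	assert(r != NULL || n == 0);
	for (size_t i = 0; i < n; i++) {
		if (r[i].zone != zone)
			continue;
		assert(r[i].start < r[i].end);
		if (r[i].start < lo)
			lo = r[i].start;
		if (r[i].end > hi)
			hi = r[i].end;
		s.present += r[i].end - r[i].start;
	}
	if (hi > lo) {
		s.start = lo;
		s.spanned = hi - lo;
	}
	return s;
}

/*
 * Address-based lookup: a linear scan over the ranges. Returns
 * SKETCH_NR_ZONES when addr falls in no range. This is the kind of
 * per-lookup address math that a zone id kept in page->flags avoids.
 */
static enum sketch_zone zone_of_addr(const struct mem_range *r, size_t n,
				     uint64_t addr)
{
	for (size_t i = 0; i < n; i++)
		if (addr >= r[i].start && addr < r[i].end)
			return r[i].zone;
	return SKETCH_NR_ZONES;
}

/* The node 1 layout from the quoted example. */
static const struct mem_range node1[] = {
	{  4 * GiB,  8 * GiB, SKETCH_NORMAL },
	{  8 * GiB, 12 * GiB, SKETCH_MIRROR },
	{ 12 * GiB, 16 * GiB, SKETCH_NORMAL },
	{ 16 * GiB, 24 * GiB, SKETCH_MIRROR },
	{ 24 * GiB, 28 * GiB, SKETCH_NORMAL },
};
#define NODE1_NR (sizeof(node1) / sizeof(node1[0]))
```

With this layout, zone_span_of(node1, NODE1_NR, SKETCH_NORMAL) gives start=4G and spanned=24G, and the mirror zone gives start=8G and spanned=16G, matching the numbers in the quoted example. Each zone has only 12G present, so the two spans overlap and each contains holes that belong to the other zone.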