From: Naoya Horiguchi
To: "Luck, Tony"
CC: Xishi Qiu, Andrew Morton, "nao.horiguchi@gmail.com", Yinghai Lu, "H. Peter Anvin", Thomas Gleixner, "mingo@elte.hu", Xiexiuqi, Hanjun Guo, Linux MM, LKML
Subject: Re: [RFC PATCH 00/12] mm: mirrored memory support for page buddy allocations
Date: Mon, 15 Jun 2015 00:25:00 +0000
Message-ID: <20150615002500.GC4214@hori1.linux.bs1.fc.nec.co.jp>
References: <55704A7E.5030507@huawei.com> <20150612084233.GB19075@hori1.linux.bs1.fc.nec.co.jp> <20150612190335.GA21994@agluck-desk.sc.intel.com>
In-Reply-To: <20150612190335.GA21994@agluck-desk.sc.intel.com>

On Fri, Jun 12, 2015 at 12:03:35PM -0700, Luck, Tony wrote:
> On Fri, Jun 12, 2015 at 08:42:33AM +0000, Naoya Horiguchi wrote:
> > 4?) I don't have the whole picture of how address range mirroring works,
> > but I'm curious about what happens when an uncorrected memory error happens
> > on a mirror page. If HW/FW do some useful work invisible to the kernel,
> > please document it somewhere. And my questions are:
> > - can the kernel with this patchset really continue its operation without
> > breaking consistency? More specifically, the corrupted page is replaced with
> > its mirror page, but can any other pages which have references (like struct
> > page or pfn) for the corrupted page properly switch these references to the
> > mirror page? Or no worry about that? (This is difficult for kernel pages
> > like slab, and that's why currently hwpoison doesn't handle any kernel pages.)
>
> The mirror is operated by h/w (perhaps with some platform firmware
> intervention when things start breaking badly).
>
> In normal operation there are two DIMM addresses backing each
> system physical address in the mirrored range (thus total system
> memory capacity is reduced when mirror is enabled). Memory writes
> are directed to both locations. Memory reads are interleaved to
> maintain bandwidth, so could come from either address.

I had mistakenly assumed that both the mirrored page and its mirror copy
are visible to the OS, which is incorrect.

> When a read returns with an ECC failure the h/w automatically:
> 1) Re-issues the read to the other DIMM address. If that also fails - then
> we do the normal machine check processing for an uncorrected error
> 2) But if the other side of the mirror is good, we can send the good
> data to the reader (cpu, or dma) and, in parallel try to fix the
> bad side by writing the good data to it.
> 3) A corrected error will be logged, it may indicate whether the
> attempt to fix succeeded or not.
> 4) If platform firmware wants, it can be notified of the correction
> and it may keep statistics on the rate of errors, correction status,
> etc. If things get very bad it may "break" the mirror and direct
> all future reads to the remaining "good" side. If it does this it will
> likely tell the OS via some ACPI method.

Thanks, this fully answered my question.

> All of this is done at much less than page granularity. Cache coherence
> is maintained ... apart from some small performance glitches and the corrected
> error logs, the OS is unaware of all of this.
>
> Note that in current implementations the mirror copies are both behind
> the same memory controller ... so this isn't intended to cope with high
> level failure of a memory controller ... just to deal with randomly
> distributed ECC errors.

OK, I looked at "Memory Address Range Mirroring Validation Guide" and Fig 2-2
clearly shows that.

> > - How can we test/confirm that the whole scheme works fine? Is current memory
> > error injection framework enough?
>
> Still working on that piece. To validate you need to be able to
> inject errors to just one side of the mirror, and I'm not really
> sure that the ACPI/EINJ interface is up to the task.

OK.

Thanks,
Naoya Horiguchi
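
For readers following the thread, the hardware read path Tony describes in steps 1) to 3) can be sketched as a toy model. This is only an illustration in Python, not kernel or firmware code: the names `MirroredLine`, `UncorrectedError`, `corrupt` and `corrected_errors` are all hypothetical, and the sketch assumes the repair write in step 2) always succeeds, which the real hardware does not guarantee. Step 4) (firmware breaking the mirror) is left out.

```python
class UncorrectedError(Exception):
    """Both mirror copies failed ECC; real h/w would raise a machine check."""


class MirroredLine:
    """Toy model of one system physical address backed by two DIMM addresses."""

    def __init__(self, data):
        self.copies = [data, data]   # the two sides of the mirror
        self.ecc_ok = [True, True]   # whether each side currently passes ECC
        self.corrected_errors = 0    # stands in for the corrected error log

    def write(self, data):
        # Writes are directed to both locations.
        self.copies = [data, data]
        self.ecc_ok = [True, True]

    def corrupt(self, side):
        # Hypothetical helper: simulate an ECC failure on one side only.
        self.ecc_ok[side] = False

    def read(self, side):
        # Reads are interleaved, so the caller picks which side is tried first.
        if self.ecc_ok[side]:
            return self.copies[side]
        other = 1 - side
        # Step 1: re-issue the read to the other DIMM address.
        if not self.ecc_ok[other]:
            raise UncorrectedError("both sides of the mirror are bad")
        # Step 2: return the good data and repair the bad side with it
        # (assumption: the repair always succeeds in this model).
        good = self.copies[other]
        self.copies[side] = good
        self.ecc_ok[side] = True
        # Step 3: log a corrected error.
        self.corrected_errors += 1
        return good
```

In this model a read that hits a corrupted side still returns the original data and bumps `corrected_errors`, while corrupting both sides makes `read` raise `UncorrectedError`. That matches the point in the mail that validation needs a way to inject an error into just one side of the mirror.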