Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S1758873Ab1FVUeE (ORCPT ); Wed, 22 Jun 2011 16:34:04 -0400
Received: from terminus.zytor.com ([198.137.202.10]:38842 "EHLO
	mail.zytor.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org
	with ESMTP id S1758077Ab1FVUeC (ORCPT );
	Wed, 22 Jun 2011 16:34:02 -0400
Message-ID: <4E0251AB.8090702@zytor.com>
Date: Wed, 22 Jun 2011 13:33:47 -0700
From: "H. Peter Anvin"
User-Agent: Mozilla/5.0 (X11; U; Linux x86_64; en-US; rv:1.9.2.17) Gecko/20110428 Fedora/3.1.10-1.fc15 Thunderbird/3.1.10
MIME-Version: 1.0
To: Stefan Assmann
CC: linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	akpm@linux-foundation.org, tony.luck@intel.com, andi@firstfloor.org,
	mingo@elte.hu, rick@vanrein.org, rdunlap@xenotime.net
Subject: Re: [PATCH v2 0/3] support for broken memory modules (BadRAM)
References: <1308741534-6846-1-git-send-email-sassmann@kpanic.de> <4E023142.1080605@zytor.com> <4E0250F2.2010607@kpanic.de>
In-Reply-To: <4E0250F2.2010607@kpanic.de>
Content-Type: text/plain; charset=ISO-8859-1
Content-Transfer-Encoding: 7bit
Sender: linux-kernel-owner@vger.kernel.org
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 2172
Lines: 48

On 06/22/2011 01:30 PM, Stefan Assmann wrote:
> On 22.06.2011 20:15, H. Peter Anvin wrote:
>> On 06/22/2011 04:18 AM, Stefan Assmann wrote:
>>>
>>> The idea is to allow the user to specify RAM addresses that shouldn't be
>>> touched by the OS, because they are broken in some way. Not all machines have
>>> hardware support for hwpoison, ECC RAM, etc, so here's a solution that allows to
>>> use bitmasks to mask address patterns with the new "badram" kernel command line
>>> parameter.
>>> Memtest86 has an option to generate these patterns since v2.3 so the only thing
>>> for the user to do should be:
>>> - run Memtest86
>>> - note down the pattern
>>> - add badram= to the kernel command line
>>>
>>
>> We already support the equivalent functionality with
>> memmap=$ for those with only a few ranges... this has
>> been supported for ages, literally. For those with a lot of ranges,
>> like Google, the command line is insufficient.
>
> Right, I think this has been discussed a while ago. So the advantages I
> see in this approach are. It allows to break down memory exclusion to
> the page level with a pattern of non-consecutive pages. So if every
> other page would be considered bad that's a bit tough to deal with using
> memmap.
> Secondly patterns can be easily generated by running Memtest86 and thus
> easily be fed to the kernel by command line. Making it much more feasible
> for the average user to take advantage of it.
>

How common are nontrivial patterns on real hardware? This would be
interesting to hear from Google or another large user.

If they are common, we should probably introduce this as another
linked-list data structure; we can allow it to be preprocessed from the
command line if need be.

I have to say I agree with Google's point that truncating the list is
unacceptable: that would mean running in a known-bad configuration, and
even a hard crash would be better.

	-hpa