Date: Tue, 24 May 2011 04:48:48 +0200
From: Ingo Molnar
To: Huang Ying
Cc: huang ying, Len Brown, "linux-kernel@vger.kernel.org", Andi Kleen, "Luck, Tony", "linux-acpi@vger.kernel.org", Andi Kleen, "Wu, Fengguang", Andrew Morton, Linus Torvalds, Peter Zijlstra, Borislav Petkov
Subject: Re: [PATCH 5/9] HWPoison: add memory_failure_queue()
Message-ID: <20110524024848.GA25230@elte.hu>
In-Reply-To: <4DDB1396.7050205@intel.com>

* Huang Ying wrote:

> >> - How to deal with ring-buffer overflow? For example, there is full of
> >>   corrected memory error in ring-buffer, and now a recoverable memory error
> >>   occurs but it can not be put into perf ring buffer because of ring-buffer
> >>   overflow, how to deal with the recoverable memory error?
> >
> > The solution is to make it large enough. With *every* queueing solution there
> > will be some sort of queue size limit.
>
> Another solution could be:
>
> Create two ring-buffer. One is for logging and will be read by RAS
> daemon; the other is for recovering, the event record will be removed
> from the ring-buffer after all 'active filters' have been run on it.
> Even RAS daemon being restarted or hang, recoverable error can be taken
> cared of.

Well, filters will always be executed since they execute when the event is
inserted - not when it's extracted.

So if you worry about losing *filter* executions (and dependent policy action)
- there should be no loss there, ever.

But yes, the scheme you outline would work as well: a counting-only event with
a filter specified - this will do no buffering at all.

So ... to get the ball rolling in this area one of you guys active in RAS
should really try a first approximation for the active filter approach: add a
test-TRACE_EVENT() for the errors you are interested in and define a convenient
way to register policy action with post-filter events. This should work even
without having the 'active' portion defined at the ABI and filter-string level.

Thanks,

	Ingo
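
To illustrate the point about filters running at insertion time, here is a minimal userspace model in Python. It is only a sketch, not kernel code: the names EventChannel, register_action, emit and MemoryErrorEvent are hypothetical and do not correspond to any perf or tracing API, and the choice to drop new records when the buffer is full is an assumption made for the example.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class MemoryErrorEvent:
    pfn: int           # page frame number of the affected page (hypothetical field)
    recoverable: bool  # True for an uncorrected but recoverable error


class EventChannel:
    """Toy model: a bounded buffer plus filters that run when an event is inserted."""

    def __init__(self, capacity):
        self.capacity = capacity   # 0 models a counting-only event: no buffering
        self.buffer = deque()
        self.lost = 0              # records dropped because the buffer was full
        self.active_filters = []   # (predicate, action) pairs

    def register_action(self, predicate, action):
        """Attach a policy action to events that pass the given filter."""
        self.active_filters.append((predicate, action))

    def emit(self, event):
        # Filters run here, at insertion, before any buffering decision,
        # so a full buffer cannot suppress the policy action.
        for predicate, action in self.active_filters:
            if predicate(event):
                action(event)

        if self.capacity == 0:
            return                 # counting-only: nothing is stored
        if len(self.buffer) >= self.capacity:
            self.lost += 1         # assumption: new records are dropped on overflow
            return
        self.buffer.append(event)


# Usage: corrected errors fill the buffer, then a recoverable error arrives.
recovered = []
chan = EventChannel(capacity=2)
chan.register_action(lambda e: e.recoverable, lambda e: recovered.append(e.pfn))

for pfn in (1, 2, 3):
    chan.emit(MemoryErrorEvent(pfn, recoverable=False))
chan.emit(MemoryErrorEvent(0x1234, recoverable=True))
```

In this model the recoverable event never reaches the full buffer, yet its action still runs, because the filter is evaluated in emit() rather than by whoever reads the buffer. Constructing the channel with capacity=0 gives the counting-only variant: the action fires and nothing is stored.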