Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1760916Ab2ERLAW (ORCPT ); Fri, 18 May 2012 07:00:22 -0400 Received: from mx1.redhat.com ([209.132.183.28]:40506 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932556Ab2ERLAQ (ORCPT ); Fri, 18 May 2012 07:00:16 -0400 Message-ID: <4FB62BAB.90808@redhat.com> Date: Fri, 18 May 2012 07:59:55 -0300 From: Mauro Carvalho Chehab User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:12.0) Gecko/20120430 Thunderbird/12.0.1 MIME-Version: 1.0 To: Borislav Petkov , Ingo Molnar CC: Linux Edac Mailing List , Linux Kernel Mailing List , Aristeu Rozanski , Doug Thompson , Steven Rostedt , Frederic Weisbecker , Ingo Molnar Subject: Re: [PATCH v24b] RAS: Add a tracepoint for reporting memory controller events References: <1337287277-523-1-git-send-email-mchehab@redhat.com> <20120517214859.GA16777@aftab.osrc.amd.com> <20120518071244.GE429@gmail.com> <20120518095638.GA20215@aftab.osrc.amd.com> In-Reply-To: <20120518095638.GA20215@aftab.osrc.amd.com> X-Enigmail-Version: 1.4.1 Content-Type: text/plain; charset=ISO-8859-1 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5500 Lines: 133 Em 18-05-2012 06:56, Borislav Petkov escreveu: > On Fri, May 18, 2012 at 09:12:44AM +0200, Ingo Molnar wrote: >>>> Of course, any userspace tools meant to handle errors should not parse >>>> the above data. They should, instead, use the binary fields provided by >>>> the tracepoint, mapping them directly into their MIBs. >>> >>> Nacked-by: Borislav Petkov >> >> Just wondering why this got nacked, and what the >> suggestions/plans are to improve the situation: > > Basically this is the thread which lead to it: http://marc.info/?l=linux-kernel&m=133709477524773&w=2 > >> I assume Mauro is working on these things to solve problems, or to >> add features, Mauro could you please give a higher level list of >> those problems or features? There must be more to it than just a new >> tracepoint! :-) > > My main objection was that the tracepoint to report errors from edac > contains the following prototype: > > + TP_PROTO(const unsigned int err_type, > + const unsigned int mc_index, > + const char *error_msg, > + const char *label, > + const char *location, > + const char *core_detail, > + const char *driver_detail), > > and that the last args should be merged simply into one 'const char > *detail' which every driver can populate as it sees fit. > > But Mauro did not want to parse the string in userspace The "detail" field above is, in fact, a group of the following integer values: - physical address of the error, provided as 3 integers: page, offset an the error offset granularity; - ECC syndrome, with has a XOR value used to determine how many and what bit(s) have errors. "driver_detail" contains other driver-specific details about the errors. The "location" field is also given by 3 integers: layer0, layer1, layer2 with contains the branch/slot/channel or slot/channel or chip select row/channel position of the affected DIMM. On my last comment, I said that I'm convinced that those two fields should be broken into their integer components. The fields at "detail" and "location" are generated by a snprintf() logic inside the EDAC core. "driver_detail" is filled by the drivers. I did that merge in order to provide a clearer output message, but this doesn't help userspace tools that could get the binary information at the trace directly. There's a conceptual issue here, and it seems that Boris is not understanding, probably because he doesn't have any experience on handling errors on userspace. In this point, my previous experience on writing network management and my work at the userspace EDAC tool helps me to see both faces of the coin. Currently, EDAC prints an error message, and increments some Kernel counters to indicate where the error is located. Userspace tools can either get the dmesg logs or read the error counters. The edac-tools (with is the only public tool for EDAC that I'm aware of) prefers to rely on the error counters, instead of parsing the error logs. I suspect that one of the reasons is because printk messages are not a consistent ABI, as changing them are not considered as a Kernel regression. There are several issues on relying at the Kernel error counters: they don't have decay or any other kind of logic that would allow detecting bursts. So, on a machine running for 30 days with, let's say, 10 corrected errors, it is not possible to tell if they were all generated on a burst (because of some temporary interference) or if they are sparse errors generated at the same DRAM, indicating a bad memory. The big advantage (maybe the only advantage) of using a tracepoint is that the binary data can be exported directly. If the location and memory location integers are exported as integers, it is simple to change the edac-tools to work with tracepoints, instead of working with the error counters. With this simple change, the tool can be improved to provide a more consistent memory error report, as userspace should not be worried on parsing fields that may change in future without notice. Also, doing that will avoid the extra effort of joining everything into a single string, and then breaking them back into their individual fields on userspace. > but feed it > straight into a MIB (which could mean "Men In Black" for all I know), > right from the tracepoint: I can do a s/MIB/Management Information Base/. We can also avoid all other acronyms that have more than one sense, like RAS (with all network people will associate it with "Remote Access Service"). >> Of course, any userspace tools meant to handle errors should not parse >> the above data. They should, instead, use the binary fields provided by >> the tracepoint, mapping them directly into their MIBs. > > And I wanted to have a generic, usable-for-all tracepoint output > which anyone in userspace can parse, decode, cut, paste as she sees > fit without forcing kernel output formatting into any abstract error > management hierarchy or whatever. s/printk/trace/? That doesn't sound a good idea. The advantage of tracepoints of printk's is that they provide an ABI that allows passing strucutured information to userspace. > As Tony put it, we need to hammer that out properly now before it > becomes an ABI. Yes, sure. Regards, Mauro -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/