Date: Mon, 18 Aug 2008 10:51:34 -0700
From: "Tim Hockin"
To: "Pavel Machek"
Subject: Re: [patch 1/3] kmsg: Kernel message catalog macros.
Cc: "Jan Blunck", "Greg KH", "Joe Perches", schwidefsky@de.ibm.com,
    linux-kernel@vger.kernel.org, linux-s390@vger.kernel.org,
    lf_kernel_messages@lists.linux-foundation.org, "Andrew Morton",
    "Michael Holzheu", "Gerrit Huizenga", "Randy Dunlap", "Jan Kara",
    "Sam Ravnborg", "Jochen Voß", "Kunai Takashi", "Tim Bird"
In-Reply-To: <20080818092317.GD6635@atrey.karlin.mff.cuni.cz>
References: <20080730165656.118280544@de.ibm.com>
    <1218733457.2651.11.camel@localhost>
    <1218769739.24527.76.camel@localhost>
    <20080815034419.GB803@suse.de>
    <20080815112117.GP10078@bolzano.suse.de>
    <20080818092317.GD6635@atrey.karlin.mff.cuni.cz>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Aug 18, 2008 at 2:23 AM, Pavel Machek wrote:
> Hi!
>
>> > I don't think that he wants to unify all the printk's in the system. I don't
>> > think that reporting all errors "in the same way as an ATA error" makes any
>> > sense. That would just lead to very stupid and unnatural messages for all
>> > errors that are not like "ATA errors". Annotation of existing errors is a much
>> > more flexible and feasible solution to that problem.
>>
>> Please don't misinterpret. I don't want to make other errors parse
>> like an ATA error, I want to make the plumbing be parallel. I want
>> one umbrella mechanism for reporting things that are more important
>> than just-another-printk().
>>
>> Because frankly, "parse dmesg" is a pretty crappy way to have to
>> monitor your system for failures, and I am tired of explaining to
>> people why we still do that.
>
> "parse dmesg" does not work for monitoring your system for failures;
> dmesg buffer can overflow.
>
> If something fails, you should get errno returned for userspace, and
> that's where you should be doing the monitoring.
>
> So... what parts don't return enough information to userspace so that
> you need to parse dmesg? Lets fix them.

If I get a DMA timeout on my disk, I want to know about it. If I get
an OOM kill, I want to know about it. Etc.

I *don't* want every application to participate in system monitoring,
and that's what it seems you're suggesting. I want a monitoring
daemon which is notified of important system events. We like to
report these things in various ways, including squirting them out onto
the network.

I *don't* want to run regexes against dmesg or /var/log/messages or
/var/log/kernel every N seconds, that's just a gross hack. I really
want first-class notifications of significant events.

I don't mind having to do parsing of events - as I said before, they
can even be loosely structured strings. They just need to be more
important than a plain old printk(), and preferably come through a
different channel.

I understand that many users will not want this level of monitoring,
and that's why it should be flexible enough to devolve into printk().
But we have thousands of systems. I need a better view of what is
happening.

Tim