There are 5 patches that represent the combined efforts of Rusty
Russell, the Enterprise Event logging team (Larry Kessler, Jim Keniston
and Hien Nguyen), and Mike Sullivan (all from IBM). Patchs 1-4 are in
separate notes. Patch 5 is included at the end of this note.
The concept:
-----------
* Device Drivers use new macros to log "problems" when errors are
detected. Devices are also "introduced" at init time.
* If event logging is not configured, then the "details" passed to
problem() and introduce() are written to printk.
If event logging is configured....
* During the build process
the static details (textual description, problem attribute names,
format specifiers for problem attributes, source file name, function
name and line number) associated with the problem() and introduce()
calls are stored in a .log section in the .o file.
* A user-mode utility reads the static details from the .log
section and creates a "formatting template", which contains the
static details needed to interpret and format the problem data
that's logged during runtime.
* Developers, Distros, Sys Admins, etc. can simply edit the template
(or provide an alternate template) to control which information from
the problem record is displayed, how it is displayed, what language
it is displayed in, and can add additional information like probable
cause, recommended operator actions, recommended repair actions, etc.
...all without requiring any changes in the device driver source code.
* Event logging utilities apply the templates to problem records for
querying events, displaying events, event notification, and log
management. Named-attributes in the problem data allow the above
actions to key on specific attributes like MAC address, device name,
etc.
The patches for 2.5.38:
----------------------
Patch 1/5 - Logging macros and template generation code
(separate note)
Patch 2/5 - Event Logging (separate note)
Patch 3/5 - KBUILD_MODNAME (from Kai Germaschewski; separate note)
Patch 4/5 - scsi device driver using the macros (separate note)
Patch 5/5 - eepro100 device driver using the macros (end of this note)
-> apply in the above sequence (patch 4 and 5 do not depend on
each other, but both contain pci_problem.h)
Example:
-------
(1) disk_dummy.c uses the problem() and detail() macros:
/* serious disk problem */
problem(LOG_ALERT, "Disk on fire!",
detail(disk, "%s", drive->name),
detail(temperature, "%d", drive->degC),
detail(action, "%s", "Put out fire; run fsck."));
(2) During 'make bzImage' or 'make modules' static event data is stored
in a .log section in the disk_dummy.o file.
(3) 'make templates' extracts this data from the disk_dummy.o file and
generates a formatting template in templates/disk_dummy/disk_dummy.t:
facility "disk_dummy";
event_type 0x8ab218f4; /* file, message */
const {
string message = "Disk on fire!";
string file = "disk_dummy.c";
string function = "disk_mon";
int line = 81;
}
attributes {
string action "%s";
string disk "%s";
int temperature "%d";
}
format
%file%:%function%:%line%
%message% action=%action% disk=%disk% temperature=%temperature%
The .log section is not included in bzImage nor in modules installed
with 'make modules_install'. However, the original disk_dummy.o file
still has it. 'objcopy -R .log disk_dummy.o' removes it.
(4) 'make templates_install' copies disk_dummy/disk_dummy.t to
/var/evlog/templates.
(5) 'evlfacility -a disk_dummy' adds the facility to the registry.
'evltc disk_dummy.t' compiles the template, and generates
/var/evlog/templates/disk_dummy/0x8ab218f4.to, which is used
by the event logging utilities.
(6) When a problem() is logged by the device driver, the static info. is
not stored in the event. Instead it is read by event logging
utilities from the 0x8ab218f4.to file after the problem record is
read from the event log file.
(7) The template under (3) above would allow the command...
>evlview -b -f 'disk="sda3" && temperature>80'
to display events where sda3's temperature was greater than 80...
recid=2163, size=33, format=BINARY, event_type=0x8ab218f4, facility=disk_dummy,
severity=ALERT, uid=root, gid=root, pid=1, pgrp=0,
time=Fri 20 Sep 2002 04:00:01 PM PDT, flags=0x2 (KERNEL), thread=0x0,
processor=2
disk_dummy.c:dummy_mon:62
Disk on fire! action=Put out fire; run fsck. disk=sda3 temperature=88
Some other examples...
evlview -b -f 'disk="sda3" && temperature>80' -m
Sep 20 16:00:01 elm3b99 kernel: disk_dummy.c:dummy_mon:62
Disk on fire! action=Put out fire; run fsck. disk=sda3 temperature=88
...-m causes a printk style display.
By editing disk_dummy.t, recompiling it, and then reissuing the above
command, the same problem data would be displayed differently...
Sep 20 16:00:01 elm3b99 kernel:
<<< Se quema el disco! Se quema el disco! >>>
!!! temperatura=88 degrees C el disko=sda3
Notes:
-----
For the following 3 invocations, the first 2 work, the 3rd does not...
problem(LOG_ALERT, "Disk on fire"); // OK
#define DISK_ON_FIRE "Disk on fire"
problem(LOG_ALERT, DISK_ON_FIRE); // OK
msg = "Disk on fire";
problem(LOG_ALERT, msg); // No good
Furthermore, you cannot have more than one problem() call on a single
line. This restriction does not apply to the detail() macro.
See http://evlog.sourceforge.net/ for more details about event logging.
Go to https://sourceforge.net/project/showfiles.php?group_id=34226
to download release evlog-2.5_kernel, 1.4.2_k2.5 for the companion
user lib and utilities.
To-do List
----------
1) Resolve "one problem() per line" restriction.
2) Generate shell scripts during 'make templates_install' that
execute 'evlfacility' for all facilities and 'evltc' for all .t
files (currently have to do one at a time).
3) For event-logging not configured case, buffer problem() data and
make a single call to printk(), since multiple printks are
non-atomic.
4) Define valid severities to use with for problem()...3 or 4.
An actual device driver
-----------------------
Note that this patch includes pci_problem.h, as does the ips.c
device driver patch included in the '4 of 4' note.
Summary of this patch...
drivers/net/eepro100.c
Device Driver for the Intel PCI EtherExpressPro with new logging
macros replacing prink() for error conditions.
include/linux/net_problem.h
- net_detail() macro providing common information of interest
for ethernet-class devices.
- net_problem, net_pci_problem, and net_introduce macros
include/linux/pci_problem.h
- pci_detail() macro providing common information on a per class
basis when problems are being reported for devices of that class.
- pci_problem and pci_introduce macros.
--- linux-2.5.37/drivers/net/eepro100.c Fri Sep 20 10:20:31 2002
+++ linux-2.5.37-net/drivers/net/eepro100.c Mon Sep 23 20:20:14 2002
@@ -119,6 +119,7 @@
#include <linux/etherdevice.h>
#include <linux/skbuff.h>
#include <linux/ethtool.h>
+#include <linux/net_problem.h>
MODULE_AUTHOR("Maintainer: Andrey V. Savochkin <[email protected]>");
MODULE_DESCRIPTION("Intel i82557/i82558/i82559 PCI EtherExpressPro driver");
@@ -325,7 +326,8 @@
while(inb(cmd_ioaddr) && --wait >= 0);
#ifndef final_version
if (wait < 0)
- printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n");
+ problem(LOG_ALERT, "eepro100: wait_for_cmd_done timeout!",
+ detail(ioaddr, "%lx", cmd_ioaddr));
#endif
}
@@ -568,6 +570,7 @@
static int cards_found /* = 0 */;
static int did_version /* = 0 */; /* Already printed version info. */
+ pci_introduce(pdev);
if (speedo_debug > 0 && did_version++ == 0)
printk(version);
@@ -586,12 +589,12 @@
if (!request_region(pci_resource_start(pdev, 1),
pci_resource_len(pdev, 1), "eepro100")) {
- printk (KERN_ERR "eepro100: cannot reserve I/O ports\n");
+ pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve I/O ports");
goto err_out_none;
}
if (!request_mem_region(pci_resource_start(pdev, 0),
pci_resource_len(pdev, 0), "eepro100")) {
- printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
+ pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO region");
goto err_out_free_pio_region;
}
@@ -605,8 +608,10 @@
ioaddr = (unsigned long)ioremap(pci_resource_start(pdev, 0),
pci_resource_len(pdev, 0));
if (!ioaddr) {
- printk (KERN_ERR "eepro100: cannot remap MMIO region %lx @ %lx\n",
- pci_resource_len(pdev, 0), pci_resource_start(pdev, 0));
+ pci_problem(LOG_ERR, pdev,
+ "eepro100: cannot remap MMIO region %lx @ %lx",
+ detail(region, "%lx", pci_resource_len(pdev, 0)),
+ detail(base, "%lx", pci_resource_start(pdev, 0)));
goto err_out_free_mmio_region;
}
if (speedo_debug > 2)
@@ -653,11 +658,12 @@
dev = init_etherdev(NULL, sizeof(struct speedo_private));
if (dev == NULL) {
- printk(KERN_ERR "eepro100: Could not allocate ethernet device.\n");
+ pci_problem(LOG_ERR, pdev, "Could not allocate ethernet device.");
pci_free_consistent(pdev, size, tx_ring_space, tx_ring_dma);
return -1;
}
+ net_introduce(dev);
SET_MODULE_OWNER(dev);
if (dev->mem_start > 0)
@@ -700,9 +706,9 @@
}
}
if (sum != 0xBABA)
- printk(KERN_WARNING "%s: Invalid EEPROM checksum %#4.4x, "
- "check settings before activating this device!\n",
- dev->name, sum);
+ net_pci_problem(LOG_WARNING, dev, pdev, "Invalid EEPROM checksum, "
+ "check settings before activating this device!",
+ detail(checksum, "%#4.4x", sum));
/* Don't unregister_netdev(dev); as the EEPro may actually be
usable, especially if the MAC address is set later.
On the other hand, it may be unusable if MDI data is corrupted. */
@@ -784,11 +790,9 @@
} while (self_test_results[1] == -1 && --boguscnt >= 0);
if (boguscnt < 0) { /* Test optimized out. */
- printk(KERN_ERR "Self test failed, status %8.8x:\n"
- KERN_ERR " Failure to initialize the i82557.\n"
- KERN_ERR " Verify that the card is a bus-master"
- " capable slot.\n",
- self_test_results[1]);
+ net_pci_problem(LOG_ERR, dev, pdev,
+ "Self test failed.Failure to initialize the i82557. Verify that the card is a bus-master capable slot.\n",
+ detail(results, "%8.8x", self_test_results[1]));
} else
printk(KERN_INFO " General self-test: %s.\n"
KERN_INFO " Serial sub-system self-test: %s.\n"
@@ -931,7 +935,9 @@
do {
val = inl(ioaddr + SCBCtrlMDI);
if (--boguscnt < 0) {
- printk(KERN_ERR " mdio_read() timed out with val = %8.8x.\n", val);
+ problem(LOG_ERR, " mdio_read() timed out.\n",
+ detail(ioaddr, "%lx", ioaddr),
+ detail(value, "%8.8x", val));
break;
}
} while (! (val & 0x10000000));
@@ -947,7 +953,9 @@
do {
val = inl(ioaddr + SCBCtrlMDI);
if (--boguscnt < 0) {
- printk(KERN_ERR" mdio_write() timed out with val = %8.8x.\n", val);
+ problem(LOG_ERR, " mdio_write() timed out.\n",
+ detail(ioaddr, "%lx", ioaddr),
+ detail(value, "%8.8x", val));
break;
}
} while (! (val & 0x10000000));
@@ -1370,11 +1378,12 @@
int status = inw(ioaddr + SCBStatus);
unsigned long flags;
- printk(KERN_WARNING "%s: Transmit timed out: status %4.4x "
- " %4.4x at %d/%d command %8.8x.\n",
- dev->name, status, inw(ioaddr + SCBCmd),
- sp->dirty_tx, sp->cur_tx,
- sp->tx_ring[sp->dirty_tx % TX_RING_SIZE].status);
+ net_pci_problem(LOG_WARNING, dev, sp->pdev, "Transmit timed out\n",
+ detail(scbstatus, "%4.4x", status),
+ detail(scbcmd, "%4.4x", inw(ioaddr + SCBCmd)),
+ detail(dirty_tx, "%d", sp->dirty_tx),
+ detail(current_tx, "%d", sp->cur_tx),
+ detail(commandstatus, "%8.8x", sp->tx_ring[sp->dirty_tx % TX_RING_SIZE].status));
speedo_show_state(dev);
#if 0
@@ -1436,7 +1445,7 @@
/* Check if there are enough space. */
if ((int)(sp->cur_tx - sp->dirty_tx) >= TX_QUEUE_LIMIT) {
- printk(KERN_ERR "%s: incorrect tbusy state, fixed.\n", dev->name);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "incorrect tbusy state, fixed.\n");
netif_stop_queue(dev);
sp->tx_full = 1;
spin_unlock_irqrestore(&sp->lock, flags);
@@ -1529,9 +1538,10 @@
}
if (speedo_debug && (int)(sp->cur_tx - dirty_tx) > TX_RING_SIZE) {
- printk(KERN_ERR "out-of-sync dirty pointer, %d vs. %d,"
- " full=%d.\n",
- dirty_tx, sp->cur_tx, sp->tx_full);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "out-of-sync dirty pointer\n",
+ detail(dirty_tx, "%d", dirty_tx),
+ detail(current_tx, "%d", sp->cur_tx),
+ detail(tx_full, "%d", sp->tx_full));
dirty_tx += TX_RING_SIZE;
}
@@ -1563,7 +1573,8 @@
#ifndef final_version
if (dev == NULL) {
- printk(KERN_ERR "speedo_interrupt(): irq %d for unknown device.\n", irq);
+ net_problem(LOG_ERR, dev,"speedo_interrupt() for unknown device\n",
+ detail(irq, "%d", irq));
return;
}
#endif
@@ -1574,8 +1585,7 @@
#ifndef final_version
/* A lock to prevent simultaneous entry on SMP machines. */
if (test_and_set_bit(0, (void*)&sp->in_interrupt)) {
- printk(KERN_ERR"%s: SMP simultaneous entry of an interrupt handler.\n",
- dev->name);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "SMP simultaneous entry of an interrupt handler.\n");
sp->in_interrupt = 0; /* Avoid halting machine. */
return;
}
@@ -1640,8 +1650,8 @@
spin_unlock(&sp->lock);
if (--boguscnt < 0) {
- printk(KERN_ERR "%s: Too much work at interrupt, status=0x%4.4x.\n",
- dev->name, status);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "Too much work at interrupt\n",
+ detail(status, "%4.4x", status));
/* Clear all interrupt sources. */
/* Will change from 0xfc00 to 0xff00 when we start handling
FCP and ER interrupts --Dragan */
@@ -1712,8 +1722,8 @@
unsigned int forw;
int forw_entry;
if (speedo_debug > 2 || !(sp->rx_ring_state & RrOOMReported)) {
- printk(KERN_WARNING "%s: can't fill rx buffer (force %d)!\n",
- dev->name, force);
+ net_pci_problem(LOG_WARNING, dev, sp->pdev, "can't fill rx buffer\n",
+ detail(force, "%d", force));
speedo_show_state(dev);
sp->rx_ring_state |= RrOOMReported;
}
@@ -1793,14 +1803,13 @@
pkt_len);
if ((status & (RxErrTooBig|RxOK|0x0f90)) != RxOK) {
if (status & RxErrTooBig)
- printk(KERN_ERR "%s: Ethernet frame overran the Rx buffer, "
- "status %8.8x!\n", dev->name, status);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "Ethernet frame overran the Rx buffer\n",
+ detail(status, "%8.8x", status));
else if (! (status & RxOK)) {
/* There was a fatal error. This *should* be impossible. */
sp->stats.rx_errors++;
- printk(KERN_ERR "%s: Anomalous event in speedo_rx(), "
- "status %8.8x.\n",
- dev->name, status);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "Anomalous event in speedo_rx()\n",
+ detail(status, "%8.8x", status));
}
} else {
struct sk_buff *skb;
@@ -1827,8 +1836,7 @@
/* Pass up the already-filled skbuff. */
skb = sp->rx_skbuff[entry];
if (skb == NULL) {
- printk(KERN_ERR "%s: Inconsistent Rx descriptor chain.\n",
- dev->name);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "Inconsistent Rx descriptor chain.\n");
break;
}
sp->rx_skbuff[entry] = NULL;
@@ -2193,8 +2201,7 @@
mc_blk = kmalloc(sizeof(*mc_blk) + 2 + multicast_filter_limit*6,
GFP_ATOMIC);
if (mc_blk == NULL) {
- printk(KERN_ERR "%s: Failed to allocate a setup frame.\n",
- dev->name);
+ net_pci_problem(LOG_ERR, dev, sp->pdev, "Failed to allocate a setup frame.\n");
sp->rx_mode = -1; /* We failed, try again. */
return;
}
--- linux-2.5.37/drivers/include/linux/net_problem.h Wed Dec 31 18:00:00 1969
+++ linux-2.5.37-net/include/linux/net_problem.h Mon Sep 23 20:04:23 2002
@@ -0,0 +1,97 @@
+/*
+ * Linux Event Logging for the Enterprise
+ * Copyright (c) International Business Machines Corp., 2002
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Please send e-mail to [email protected] if you have
+ * questions or comments.
+ *
+ * Project Website: http://evlog.sourceforge.net/
+ *
+ */
+
+#ifndef _NET_PROBLEM_H
+#define _NET_PROBLEM_H
+
+#include <linux/pci.h>
+#include <linux/pci_problem.h>
+#include <linux/problem.h>
+
+/* Many network devices start with a bus specific probe and
+ * have a bus specific reference (ie. pci_dev) before a
+ * net_device instance is allocated. Logging that needs to
+ * occur during these phases would only be able to make pci_dev
+ * info available. To simplify the interface for the DD writer
+ * the same macro would always be used, ie. net_pci_problem()
+ * for a pci based network device (suggestions on shortening the
+ * name would be appreciated). Unfortunately the problem() macros
+ * limit us to one invocation per line, so multiple invocations
+ * based on the the value of the parameters is not allowed :(
+ * The alternative is to have the dd writer use different macros
+ * depending on the info available. ie. problem(), pci_problem(),
+ * net_problem() or net_pci_problem. Complicating
+ * this is that since the net_device structure doesn't have
+ * a bus specific type & member included, invoking
+ * net_problem(struct *net_device) doesn't allow the bus instance
+ * to be located and information to be added to the record automatically
+ * from within the net_problem macro.
+ */
+
+/* Added macaddr to net_detail since support
+ * for byte[] fmt objects exists */
+#define net_detail(dev) \
+ detail(net_name, "%s", (dev)->name), \
+ array_detail(net_mac, "%02hhx", ":", (dev)->dev_addr, 8), \
+ detail(net_addr, "%p", (dev))
+
+/* This macro could conditionally drop down to invoking
+ * problem() without net_detail() if dev is NULL, but one problem()
+ * per line restriction must be resolved first (just do a printk
+ * to warn the developer if they have used this macro without
+ * valid args */
+#define net_problem(sev, dev, string,...) \
+do { \
+ if (dev) \
+ problem(sev, string, net_detail((struct net_device*)dev), ## __VA_ARGS__); \
+ else \
+ printk("net_problem. Invalid usage struct net_device * is NULL\n"); \
+} while (0)
+
+
+/* This macro could conditionally add details depending on the
+ * value of dev and pdev. But one problem()
+ * per line restriction must be resolved first (just do a printk
+ * to warn the developer if they have used this macro without
+ * valid args */
+/* For use by PCI based network drivers */
+#define net_pci_problem(sev, dev, pdev, string,...) \
+do { \
+ if ( (dev) && (pdev) ) \
+ problem(sev, string, net_detail((struct net_device*)dev), pci_detail((struct pci_dev *)pdev), ## __VA_ARGS__); \
+ else { \
+ if (!dev) \
+ printk("net_problem. Invalid usage struct net_device * is NULL\n"); \
+ if (!pdev) \
+ printk("net_problem. Invalid usage struct pci_dev * is NULL\n"); \
+ } \
+} while (0)
+
+static inline void net_introduce(struct net_device *dev) {
+ if (dev) introduce(__stringify(KBUILD_MODNAME) " introduces network device: ", dev, net_detail(dev));
+}
+
+#endif /* _NET_PROBLEM_H */
--- linux-2.5.37/drivers/include/linux/pci_problem.h Wed Dec 31 18:00:00 1969
+++ linux-2.5.37-net/include/linux/pci_problem.h Mon Sep 23 19:56:11 2002
@@ -0,0 +1,52 @@
+/*
+ * Linux Event Logging for the Enterprise
+ * Copyright (c) International Business Machines Corp., 2002
+ *
+ *
+ * This program is free software; you can redistribute it and/or modify
+ * it under the terms of the GNU General Public License as published by
+ * the Free Software Foundation; either version 2 of the License, or
+ * (at your option) any later version.
+ *
+ * This program is distributed in the hope that it will be useful,
+ * but WITHOUT ANY WARRANTY; without even the implied warranty of
+ * MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
+ * GNU General Public License for more details.
+ *
+ * You should have received a copy of the GNU General Public License
+ * along with this program; if not, write to the Free Software
+ * Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.
+ *
+ * Please send e-mail to [email protected] if you have
+ * questions or comments.
+ *
+ * Project Website: http://evlog.sourceforge.net/
+ *
+ */
+
+#ifndef _PCI_PROBLEM_H
+#define _PCI_PROBLEM_H
+
+#include <linux/problem.h>
+
+#define pci_detail(pdev) \
+ detail(pci_name, "%s", (pdev)->name), \
+ detail(pci_slot, "%s", (pdev)->slot_name), \
+ detail(pci_vendorid, "%x", (pdev)->vendor), \
+ detail(pci_deviceid, "%x", (pdev)->device), \
+ detail(pci_dev_addr, "%p", (pdev))
+
+#define pci_problem(sev, pdev, string,...) \
+do { \
+ if (pdev) \
+ problem(sev, string, pci_detail((struct pci_dev *)pdev), ## __VA_ARGS__); \
+ else \
+ printk("pci_problem. Invalid usage struct pci_dev * is NULL\n"); \
+} while (0)
+
+static inline void pci_introduce(struct pci_dev *pdev) {
+ introduce(__stringify(KBUILD_MODNAME) "introduces pci device: ", pdev, pci_detail(pdev));
+}
+
+
+#endif /* _PCI_PROBLEM_H */
Larry Kessler wrote:
> There are 5 patches that represent the combined efforts of Rusty
> Russell, the Enterprise Event logging team (Larry Kessler, Jim Keniston
> and Hien Nguyen), and Mike Sullivan (all from IBM). Patchs 1-4 are in
> separate notes. Patch 5 is included at the end of this note.
>
> The concept:
> -----------
> * Device Drivers use new macros to log "problems" when errors are
> detected. Devices are also "introduced" at init time.
>
> * If event logging is not configured, then the "details" passed to
> problem() and introduce() are written to printk.
>
> If event logging is configured....
>
> * During the build process
> the static details (textual description, problem attribute names,
> format specifiers for problem attributes, source file name, function
> name and line number) associated with the problem() and introduce()
> calls are stored in a .log section in the .o file.
>
> * A user-mode utility reads the static details from the .log
> section and creates a "formatting template", which contains the
> static details needed to interpret and format the problem data
> that's logged during runtime.
>
> * Developers, Distros, Sys Admins, etc. can simply edit the template
> (or provide an alternate template) to control which information from
> the problem record is displayed, how it is displayed, what language
> it is displayed in, and can add additional information like probable
> cause, recommended operator actions, recommended repair actions, etc.
> ...all without requiring any changes in the device driver source code.
>
> * Event logging utilities apply the templates to problem records for
> querying events, displaying events, event notification, and log
> management. Named-attributes in the problem data allow the above
> actions to key on specific attributes like MAC address, device name,
> etc.
>
>
> The patches for 2.5.38:
> ----------------------
> Patch 1/5 - Logging macros and template generation code
> (separate note)
> Patch 2/5 - Event Logging (separate note)
> Patch 3/5 - KBUILD_MODNAME (from Kai Germaschewski; separate note)
> Patch 4/5 - scsi device driver using the macros (separate note)
> Patch 5/5 - eepro100 device driver using the macros (end of this note)
> -> apply in the above sequence (patch 4 and 5 do not depend on
> each other, but both contain pci_problem.h)
>
>
> Example:
> -------
> (1) disk_dummy.c uses the problem() and detail() macros:
> /* serious disk problem */
> problem(LOG_ALERT, "Disk on fire!",
> detail(disk, "%s", drive->name),
> detail(temperature, "%d", drive->degC),
> detail(action, "%s", "Put out fire; run fsck."));
>
> (2) During 'make bzImage' or 'make modules' static event data is stored
> in a .log section in the disk_dummy.o file.
>
> (3) 'make templates' extracts this data from the disk_dummy.o file and
> generates a formatting template in templates/disk_dummy/disk_dummy.t:
>
> facility "disk_dummy";
> event_type 0x8ab218f4; /* file, message */
> const {
> string message = "Disk on fire!";
> string file = "disk_dummy.c";
> string function = "disk_mon";
> int line = 81;
> }
> attributes {
> string action "%s";
> string disk "%s";
> int temperature "%d";
> }
> format
> %file%:%function%:%line%
> %message% action=%action% disk=%disk% temperature=%temperature%
>
> The .log section is not included in bzImage nor in modules installed
> with 'make modules_install'. However, the original disk_dummy.o file
> still has it. 'objcopy -R .log disk_dummy.o' removes it.
>
> (4) 'make templates_install' copies disk_dummy/disk_dummy.t to
> /var/evlog/templates.
>
> (5) 'evlfacility -a disk_dummy' adds the facility to the registry.
> 'evltc disk_dummy.t' compiles the template, and generates
> /var/evlog/templates/disk_dummy/0x8ab218f4.to, which is used
> by the event logging utilities.
>
> (6) When a problem() is logged by the device driver, the static info. is
> not stored in the event. Instead it is read by event logging
> utilities from the 0x8ab218f4.to file after the problem record is
> read from the event log file.
>
> (7) The template under (3) above would allow the command...
> >evlview -b -f 'disk="sda3" && temperature>80'
> to display events where sda3's temperature was greater than 80...
>
> recid=2163, size=33, format=BINARY, event_type=0x8ab218f4, facility=disk_dummy,
> severity=ALERT, uid=root, gid=root, pid=1, pgrp=0,
> time=Fri 20 Sep 2002 04:00:01 PM PDT, flags=0x2 (KERNEL), thread=0x0,
> processor=2
> disk_dummy.c:dummy_mon:62
> Disk on fire! action=Put out fire; run fsck. disk=sda3 temperature=88
>
> Some other examples...
>
> evlview -b -f 'disk="sda3" && temperature>80' -m
>
> Sep 20 16:00:01 elm3b99 kernel: disk_dummy.c:dummy_mon:62
> Disk on fire! action=Put out fire; run fsck. disk=sda3 temperature=88
>
> ...-m causes a printk style display.
>
> By editing disk_dummy.t, recompiling it, and then reissuing the above
> command, the same problem data would be displayed differently...
>
> Sep 20 16:00:01 elm3b99 kernel:
> <<< Se quema el disco! Se quema el disco! >>>
> !!! temperatura=88 degrees C el disko=sda3
>
>
> Notes:
> -----
> For the following 3 invocations, the first 2 work, the 3rd does not...
>
> problem(LOG_ALERT, "Disk on fire"); // OK
>
> #define DISK_ON_FIRE "Disk on fire"
> problem(LOG_ALERT, DISK_ON_FIRE); // OK
>
> msg = "Disk on fire";
> problem(LOG_ALERT, msg); // No good
>
> Furthermore, you cannot have more than one problem() call on a single
> line. This restriction does not apply to the detail() macro.
>
> See http://evlog.sourceforge.net/ for more details about event logging.
>
> Go to https://sourceforge.net/project/showfiles.php?group_id=34226
> to download release evlog-2.5_kernel, 1.4.2_k2.5 for the companion
> user lib and utilities.
>
>
> To-do List
> ----------
>
> 1) Resolve "one problem() per line" restriction.
> 2) Generate shell scripts during 'make templates_install' that
> execute 'evlfacility' for all facilities and 'evltc' for all .t
> files (currently have to do one at a time).
> 3) For event-logging not configured case, buffer problem() data and
> make a single call to printk(), since multiple printks are
> non-atomic.
> 4) Define valid severities to use with for problem()...3 or 4.
>
>
> An actual device driver
> -----------------------
>
> Note that this patch includes pci_problem.h, as does the ips.c
> device driver patch included in the '4 of 4' note.
>
> Summary of this patch...
>
> drivers/net/eepro100.c
> Device Driver for the Intel PCI EtherExpressPro with new logging
> macros replacing prink() for error conditions.
>
> include/linux/net_problem.h
> - net_detail() macro providing common information of interest
> for ethernet-class devices.
> - net_problem, net_pci_problem, and net_introduce macros
>
> include/linux/pci_problem.h
>
> - pci_detail() macro providing common information on a per class
> basis when problems are being reported for devices of that class.
> - pci_problem and pci_introduce macros.
>
>
> --- linux-2.5.37/drivers/net/eepro100.c Fri Sep 20 10:20:31 2002
> +++ linux-2.5.37-net/drivers/net/eepro100.c Mon Sep 23 20:20:14 2002
> @@ -119,6 +119,7 @@
> #include <linux/etherdevice.h>
> #include <linux/skbuff.h>
> #include <linux/ethtool.h>
> +#include <linux/net_problem.h>
>
> MODULE_AUTHOR("Maintainer: Andrey V. Savochkin <[email protected]>");
> MODULE_DESCRIPTION("Intel i82557/i82558/i82559 PCI EtherExpressPro driver");
> @@ -325,7 +326,8 @@
> while(inb(cmd_ioaddr) && --wait >= 0);
> #ifndef final_version
> if (wait < 0)
> - printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n");
> + problem(LOG_ALERT, "eepro100: wait_for_cmd_done timeout!",
> + detail(ioaddr, "%lx", cmd_ioaddr));
bloat, the ioaddr can easily be deduced
> #endif
> }
>
> @@ -568,6 +570,7 @@
> static int cards_found /* = 0 */;
>
> static int did_version /* = 0 */; /* Already printed version info. */
> + pci_introduce(pdev);
bloat, we don't need foo_introduce() functions for every subsystem, when
every subsystem always has an attach-new-device function.
> - printk (KERN_ERR "eepro100: cannot reserve I/O ports\n");
> + pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve I/O ports");
> goto err_out_none;
> }
> if (!request_mem_region(pci_resource_start(pdev, 0),
> pci_resource_len(pdev, 0), "eepro100")) {
> - printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
> + pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO region");
bloat, no advantage over printk
> goto err_out_free_pio_region;
> }
>
> @@ -605,8 +608,10 @@
> ioaddr = (unsigned long)ioremap(pci_resource_start(pdev, 0),
> pci_resource_len(pdev, 0));
> if (!ioaddr) {
> - printk (KERN_ERR "eepro100: cannot remap MMIO region %lx @ %lx\n",
> - pci_resource_len(pdev, 0), pci_resource_start(pdev, 0));
> + pci_problem(LOG_ERR, pdev,
> + "eepro100: cannot remap MMIO region %lx @ %lx",
> + detail(region, "%lx", pci_resource_len(pdev, 0)),
> + detail(base, "%lx", pci_resource_start(pdev, 0)));
> goto err_out_free_mmio_region;
> }
> if (speedo_debug > 2)
> @@ -653,11 +658,12 @@
>
> dev = init_etherdev(NULL, sizeof(struct speedo_private));
> if (dev == NULL) {
> - printk(KERN_ERR "eepro100: Could not allocate ethernet device.\n");
> + pci_problem(LOG_ERR, pdev, "Could not allocate ethernet device.");
likewise
> pci_free_consistent(pdev, size, tx_ring_space, tx_ring_dma);
> return -1;
> }
>
> + net_introduce(dev);
likewise, RE foo_introduce()
> SET_MODULE_OWNER(dev);
>
> if (dev->mem_start > 0)
> @@ -700,9 +706,9 @@
> }
> }
> if (sum != 0xBABA)
> - printk(KERN_WARNING "%s: Invalid EEPROM checksum %#4.4x, "
> - "check settings before activating this device!\n",
> - dev->name, sum);
> + net_pci_problem(LOG_WARNING, dev, pdev, "Invalid EEPROM checksum, "
> + "check settings before activating this device!",
> + detail(checksum, "%#4.4x", sum));
bloat, checksum is purely informational, and can be obtained through
other means
> /* Don't unregister_netdev(dev); as the EEPro may actually be
> usable, especially if the MAC address is set later.
> On the other hand, it may be unusable if MDI data is corrupted. */
> @@ -784,11 +790,9 @@
> } while (self_test_results[1] == -1 && --boguscnt >= 0);
>
> if (boguscnt < 0) { /* Test optimized out. */
> - printk(KERN_ERR "Self test failed, status %8.8x:\n"
> - KERN_ERR " Failure to initialize the i82557.\n"
> - KERN_ERR " Verify that the card is a bus-master"
> - " capable slot.\n",
> - self_test_results[1]);
> + net_pci_problem(LOG_ERR, dev, pdev,
> + "Self test failed.Failure to initialize the i82557. Verify that the card is a bus-master capable slot.\n",
> + detail(results, "%8.8x", self_test_results[1]));
> } else
> printk(KERN_INFO " General self-test: %s.\n"
> KERN_INFO " Serial sub-system self-test: %s.\n"
> @@ -931,7 +935,9 @@
> do {
> val = inl(ioaddr + SCBCtrlMDI);
> if (--boguscnt < 0) {
> - printk(KERN_ERR " mdio_read() timed out with val = %8.8x.\n", val);
> + problem(LOG_ERR, " mdio_read() timed out.\n",
> + detail(ioaddr, "%lx", ioaddr),
> + detail(value, "%8.8x", val));
bloat, no need for ioaddr
> break;
> }
> } while (! (val & 0x10000000));
> @@ -947,7 +953,9 @@
> do {
> val = inl(ioaddr + SCBCtrlMDI);
> if (--boguscnt < 0) {
> - printk(KERN_ERR" mdio_write() timed out with val = %8.8x.\n", val);
> + problem(LOG_ERR, " mdio_write() timed out.\n",
> + detail(ioaddr, "%lx", ioaddr),
> + detail(value, "%8.8x", val));
likewise
etcetera...
Jeff Garzik wrote:
> > }
> > if (!request_mem_region(pci_resource_start(pdev, 0),
> > pci_resource_len(pdev, 0), "eepro100")) {
> > - printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
> > + pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO region");
>
> bloat, no advantage over printk
the advantage is that the string, which means plenty to the developer, but possibly
much less to a Sys Admin, can be replaced with a more descriptive message,
in the local language, by editing the formatting template in user-space. Since the
printk messages were mapped directly over to the problem macros, then the issue here
I think is how useful (or not) the info. is more so than what interface is used.
> > if (sum != 0xBABA)
> > - printk(KERN_WARNING "%s: Invalid EEPROM checksum %#4.4x, "
> > - "check settings before activating this device!\n",
> > - dev->name, sum);
> > + net_pci_problem(LOG_WARNING, dev, pdev, "Invalid EEPROM checksum, "
> > + "check settings before activating this device!",
>
> > + detail(checksum, "%#4.4x", sum));
>
> bloat, checksum is purely informational, and can be obtained through
> other means
indeed. See previous comment.
Hi,
First off, this looks much nicer than what the Driver Hardening Group
just tried to pass off as event logging. The two groups might want to
get together to sort out which formats is the desired ones, as they are
radically different.
On Mon, Sep 23, 2002 at 06:55:13PM -0700, Larry Kessler wrote:
> There are 5 patches that represent the combined efforts of Rusty
> Russell, the Enterprise Event logging team (Larry Kessler, Jim Keniston
> and Hien Nguyen), and Mike Sullivan (all from IBM). Patchs 1-4 are in
> separate notes. Patch 5 is included at the end of this note.
>
> The concept:
> -----------
> * Device Drivers use new macros to log "problems" when errors are
> detected.
Nice concept. But what's wrong with the existing method of logging when
errors are detected? Can you give us some background as to what is
lacking in the current stuff?
> Devices are also "introduced" at init time.
Um, this isn't a new concept. Devices show up when they are found by a
bus. Or are you talking about a different kind of "device"? And all
"introduction" of devices pass through /sbin/hotplug, so you can grab
all the detailed information about new devices you want right then.
> * If event logging is not configured, then the "details" passed to
> problem() and introduce() are written to printk.
So problem() and introduce() are just printk macros? Do they provide
the same compile-time type-safety that we have in printk (hint, based on
your example below, I would guess not.)
Can you have different levels of problem()? Like dbg(), warn(), and
info() are used in the USB subsystem (see drivers/include/usb.h). If
so, can these levels be adjusted at driver load, or run-time (like the
"debug" option on many kernel drivers.)
IMHO, that would be one of the main places to look at standardizing
across the kernel, much like Linus just mentioned last week or so.
> If event logging is configured....
>
> * During the build process
> the static details (textual description, problem attribute names,
> format specifiers for problem attributes, source file name, function
> name and line number) associated with the problem() and introduce()
> calls are stored in a .log section in the .o file.
Nice.
> * A user-mode utility reads the static details from the .log
> section and creates a "formatting template", which contains the
> static details needed to interpret and format the problem data
> that's logged during runtime.
Nice.
> * Developers, Distros, Sys Admins, etc. can simply edit the template
> (or provide an alternate template) to control which information from
> the problem record is displayed, how it is displayed, what language
> it is displayed in, and can add additional information like probable
> cause, recommended operator actions, recommended repair actions, etc.
> ...all without requiring any changes in the device driver source code.
But who is going to be doing these "translations"? Kernel log messages
change with every release. That would be a _huge_ undertaking to
translate them all.
> * Event logging utilities apply the templates to problem records for
> querying events, displaying events, event notification, and log
> management. Named-attributes in the problem data allow the above
> actions to key on specific attributes like MAC address, device name,
> etc.
But what happens when the attributes, and events change on every kernel
release? Who is keeping track of all of these? Who is managing them?
Is there some maintainer people need to notify when they create a new
type of event or attribute?
I can see someone creating some job security for a long time with this
task :)
If event logging is enabled in a kernel, what kind of format will the
kernel log messages show up as (i.e. can they be read by humans?)
> The patches for 2.5.38:
> ----------------------
> Patch 1/5 - Logging macros and template generation code
> (separate note)
> Patch 2/5 - Event Logging (separate note)
> Patch 3/5 - KBUILD_MODNAME (from Kai Germaschewski; separate note)
> Patch 4/5 - scsi device driver using the macros (separate note)
> Patch 5/5 - eepro100 device driver using the macros (end of this note)
> -> apply in the above sequence (patch 4 and 5 do not depend on
> each other, but both contain pci_problem.h)
>
>
> Example:
> -------
> (1) disk_dummy.c uses the problem() and detail() macros:
> /* serious disk problem */
> problem(LOG_ALERT, "Disk on fire!",
> detail(disk, "%s", drive->name),
> detail(temperature, "%d", drive->degC),
> detail(action, "%s", "Put out fire; run fsck."));
Is the second argument of problem() a format string? In your example
below, sometimes it is, and sometimes it is not.
If not, you just lost a very handy way of showing data in a line of
text, and changed the existing messages in a large way. What would the
above message look like in the kernel log if event logging is not
enabled?
> (2) During 'make bzImage' or 'make modules' static event data is stored
> in a .log section in the disk_dummy.o file.
Nice.
> (3) 'make templates' extracts this data from the disk_dummy.o file and
> generates a formatting template in templates/disk_dummy/disk_dummy.t:
>
> facility "disk_dummy";
> event_type 0x8ab218f4; /* file, message */
How is this generated? What does it match up with? Does this value
show up in the log file now?
> const {
> string message = "Disk on fire!";
> string file = "disk_dummy.c";
> string function = "disk_mon";
> int line = 81;
> }
> attributes {
> string action "%s";
> string disk "%s";
> int temperature "%d";
> }
> format
> %file%:%function%:%line%
> %message% action=%action% disk=%disk% temperature=%temperature%
>
> The .log section is not included in bzImage nor in modules installed
> with 'make modules_install'. However, the original disk_dummy.o file
> still has it. 'objcopy -R .log disk_dummy.o' removes it.
>
> (4) 'make templates_install' copies disk_dummy/disk_dummy.t to
> /var/evlog/templates.
>
> (5) 'evlfacility -a disk_dummy' adds the facility to the registry.
> 'evltc disk_dummy.t' compiles the template, and generates
> /var/evlog/templates/disk_dummy/0x8ab218f4.to, which is used
> by the event logging utilities.
>
> (6) When a problem() is logged by the device driver, the static info. is
> not stored in the event. Instead it is read by event logging
> utilities from the 0x8ab218f4.to file after the problem record is
> read from the event log file.
But where does the log info go to? Is there a chunk of new code that
now does kernel logging stuff, that differs from the way printk() works
today?
> (7) The template under (3) above would allow the command...
> >evlview -b -f 'disk="sda3" && temperature>80'
> to display events where sda3's temperature was greater than 80...
>
> recid=2163, size=33, format=BINARY, event_type=0x8ab218f4, facility=disk_dummy,
^^^^^^
um, where did binary show up from? This used to
be a simple text message.
> severity=ALERT, uid=root, gid=root, pid=1, pgrp=0,
> time=Fri 20 Sep 2002 04:00:01 PM PDT, flags=0x2 (KERNEL), thread=0x0,
> processor=2
> disk_dummy.c:dummy_mon:62
> Disk on fire! action=Put out fire; run fsck. disk=sda3 temperature=88
Like Jeff said, policy remains in userspace. Let the user decide if
they really want to put the fire out, or not :)
> Some other examples...
>
> evlview -b -f 'disk="sda3" && temperature>80' -m
>
> Sep 20 16:00:01 elm3b99 kernel: disk_dummy.c:dummy_mon:62
> Disk on fire! action=Put out fire; run fsck. disk=sda3 temperature=88
>
> ...-m causes a printk style display.
>
> By editing disk_dummy.t, recompiling it, and then reissuing the above
> command, the same problem data would be displayed differently...
>
> Sep 20 16:00:01 elm3b99 kernel:
> <<< Se quema el disco! Se quema el disco! >>>
> !!! temperatura=88 degrees C el disko=sda3
>
>
> Notes:
> -----
> For the following 3 invocations, the first 2 work, the 3rd does not...
>
> problem(LOG_ALERT, "Disk on fire"); // OK
>
> #define DISK_ON_FIRE "Disk on fire"
> problem(LOG_ALERT, DISK_ON_FIRE); // OK
>
> msg = "Disk on fire";
> problem(LOG_ALERT, msg); // No good
Why does this not work?
> Furthermore, you cannot have more than one problem() call on a single
> line. This restriction does not apply to the detail() macro.
>
> See http://evlog.sourceforge.net/ for more details about event logging.
>
> Go to https://sourceforge.net/project/showfiles.php?group_id=34226
> to download release evlog-2.5_kernel, 1.4.2_k2.5 for the companion
> user lib and utilities.
>
>
> To-do List
> ----------
>
> 1) Resolve "one problem() per line" restriction.
> 2) Generate shell scripts during 'make templates_install' that
> execute 'evlfacility' for all facilities and 'evltc' for all .t
> files (currently have to do one at a time).
> 3) For event-logging not configured case, buffer problem() data and
> make a single call to printk(), since multiple printks are
> non-atomic.
> 4) Define valid severities to use with for problem()...3 or 4.
So this would be a "logging level", much like we currently have with the
KERN_* levels?
> An actual device driver
> -----------------------
>
> Note that this patch includes pci_problem.h, as does the ips.c
> device driver patch included in the '4 of 4' note.
>
> Summary of this patch...
>
> drivers/net/eepro100.c
> Device Driver for the Intel PCI EtherExpressPro with new logging
> macros replacing prink() for error conditions.
>
> include/linux/net_problem.h
> - net_detail() macro providing common information of interest
> for ethernet-class devices.
> - net_problem, net_pci_problem, and net_introduce macros
>
> include/linux/pci_problem.h
>
> - pci_detail() macro providing common information on a per class
> basis when problems are being reported for devices of that class.
> - pci_problem and pci_introduce macros.
>
>
> --- linux-2.5.37/drivers/net/eepro100.c Fri Sep 20 10:20:31 2002
> +++ linux-2.5.37-net/drivers/net/eepro100.c Mon Sep 23 20:20:14 2002
> @@ -119,6 +119,7 @@
> #include <linux/etherdevice.h>
> #include <linux/skbuff.h>
> #include <linux/ethtool.h>
> +#include <linux/net_problem.h>
>
> MODULE_AUTHOR("Maintainer: Andrey V. Savochkin <[email protected]>");
> MODULE_DESCRIPTION("Intel i82557/i82558/i82559 PCI EtherExpressPro driver");
> @@ -325,7 +326,8 @@
> while(inb(cmd_ioaddr) && --wait >= 0);
> #ifndef final_version
> if (wait < 0)
> - printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n");
> + problem(LOG_ALERT, "eepro100: wait_for_cmd_done timeout!",
> + detail(ioaddr, "%lx", cmd_ioaddr));
Ok, msg is not a format string here.
Why not just use the existing KERN_* values, and not create new LOG_*
values, as it looks like you are matching them 1 to 1.
> @@ -568,6 +570,7 @@
> static int cards_found /* = 0 */;
>
> static int did_version /* = 0 */; /* Already printed version info. */
> + pci_introduce(pdev);
Please put this in the driver core. That way you only have to modify
one file, not 500.
> if (speedo_debug > 0 && did_version++ == 0)
> printk(version);
>
> @@ -586,12 +589,12 @@
>
> if (!request_region(pci_resource_start(pdev, 1),
> pci_resource_len(pdev, 1), "eepro100")) {
> - printk (KERN_ERR "eepro100: cannot reserve I/O ports\n");
> + pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve I/O ports");
> goto err_out_none;
> }
> if (!request_mem_region(pci_resource_start(pdev, 0),
> pci_resource_len(pdev, 0), "eepro100")) {
> - printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
> + pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO region");
> goto err_out_free_pio_region;
> }
>
> @@ -605,8 +608,10 @@
> ioaddr = (unsigned long)ioremap(pci_resource_start(pdev, 0),
> pci_resource_len(pdev, 0));
> if (!ioaddr) {
> - printk (KERN_ERR "eepro100: cannot remap MMIO region %lx @ %lx\n",
> - pci_resource_len(pdev, 0), pci_resource_start(pdev, 0));
> + pci_problem(LOG_ERR, pdev,
> + "eepro100: cannot remap MMIO region %lx @ %lx",
> + detail(region, "%lx", pci_resource_len(pdev, 0)),
> + detail(base, "%lx", pci_resource_start(pdev, 0)));
But msg is a format string here. What's the rules for this?
> @@ -784,11 +790,9 @@
> } while (self_test_results[1] == -1 && --boguscnt >= 0);
>
> if (boguscnt < 0) { /* Test optimized out. */
> - printk(KERN_ERR "Self test failed, status %8.8x:\n"
> - KERN_ERR " Failure to initialize the i82557.\n"
> - KERN_ERR " Verify that the card is a bus-master"
> - " capable slot.\n",
> - self_test_results[1]);
> + net_pci_problem(LOG_ERR, dev, pdev,
> + "Self test failed.Failure to initialize the i82557. Verify that the card is a bus-master capable slot.\n",
> + detail(results, "%8.8x", self_test_results[1]));
This used to be 3 messages. Now it's 1. Is that acceptable?
thanks,
greg k-h
Greg KH wrote:
>>The concept:
>>-----------
>>* Device Drivers use new macros to log "problems" when errors are
>> detected.
>
>
> Nice concept. But what's wrong with the existing method of logging when
> errors are detected? Can you give us some background as to what is
> lacking in the current stuff?
Bah, who needs define a problem when you have a sexy solution...
</sarcasm>
>>If event logging is configured....
>>
>>* During the build process
>> the static details (textual description, problem attribute names,
>> format specifiers for problem attributes, source file name, function
>> name and line number) associated with the problem() and introduce()
>> calls are stored in a .log section in the .o file.
>
>
> Nice.
indeed
>>(3) 'make templates' extracts this data from the disk_dummy.o file and
>> generates a formatting template in templates/disk_dummy/disk_dummy.t:
>>
>> facility "disk_dummy";
>> event_type 0x8ab218f4; /* file, message */
I don't see why we need a "make templates" at all in the kernel tarball.
This can be totally external to the kernel and still work fine.
>>(4) 'make templates_install' copies disk_dummy/disk_dummy.t to
>> /var/evlog/templates.
If they are compiled into the kernel and modules, this is not needed in
the kernel tarball either.
It should be straightforward to [re-]generate templates on boot, much
like module dependencies are [re-]computed on boot when necessary.
>>Notes:
>>-----
>>For the following 3 invocations, the first 2 work, the 3rd does not...
>>
>>problem(LOG_ALERT, "Disk on fire"); // OK
>>
>>#define DISK_ON_FIRE "Disk on fire"
>>problem(LOG_ALERT, DISK_ON_FIRE); // OK
>>
>>msg = "Disk on fire";
>>problem(LOG_ALERT, msg); // No good
>
>
> Why does this not work?
doh! I missed that. That "no good" example is in use in the kernel
today, implying that this new API reduces functionality...
Jeff
In message <[email protected]> you write:
> > @@ -325,7 +326,8 @@
> > while(inb(cmd_ioaddr) && --wait >= 0);
> > #ifndef final_version
> > if (wait < 0)
> > - printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n");
> > + problem(LOG_ALERT, "eepro100: wait_for_cmd_done timeout!",
> > + detail(ioaddr, "%lx", cmd_ioaddr));
>
> bloat, the ioaddr can easily be deduced
No! That's *exactly* the problem: you see:
eepro100: wait_for_cmd_done timeout!
in your logs, now *which* of the 5 eepro100 cards was it?
wait_for_cmd_done(long cmd_ioaddr) should take a 'struct net_device *'
and use net_problem, then no details needed.
> > + pci_introduce(pdev);
>
> bloat, we don't need foo_introduce() functions for every subsystem, when
> every subsystem always has an attach-new-device function.
Yes, this should be moved up into the generic pci/networking code, but I
guess they wanted a self-contained example.
> > - printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
> > + pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO regio
n");
>
> bloat, no advantage over printk
Now, which of those 5 cards was it again?
> > - printk(KERN_WARNING "%s: Invalid EEPROM checksum %#4.4x
, "
> > - "check settings before activating this devic
e!\n",
> > - dev->name, sum);
> > + net_pci_problem(LOG_WARNING, dev, pdev, "Invalid EEPROM
checksum, "
> > + "check settings before activating this devic
e!",
>
> > + detail(checksum, "%#4.4x", sum));
>
> bloat, checksum is purely informational, and can be obtained through
> other means
It's a direct translation from the printk ferchissakes!
> likewise
>
> etcetera...
Exactly.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
One further comment:
On Mon, Sep 23, 2002 at 06:55:13PM -0700, Larry Kessler wrote:
> --- linux-2.5.37/drivers/include/linux/net_problem.h Wed Dec 31 18:00:00 1969
> +++ linux-2.5.37-net/include/linux/net_problem.h Mon Sep 23 20:04:23 2002
> --- linux-2.5.37/drivers/include/linux/pci_problem.h Wed Dec 31 18:00:00 1969
> +++ linux-2.5.37-net/include/linux/pci_problem.h Mon Sep 23 19:56:11 2002
{sigh}
Have people been ignoring all of the core driver changes that have been
happening? Almost everything that is "struct device" now, with some bus
specific things tacked on (and those bus specific things are getting
slowly merged into struct device too.)
It would make more sense (if you continue this path of changes to the
kernel) to focus on the device, bus, and class structures. That way you
don't have to create a usb_problem.h, iee1394_problem.h, i2c_problem.h,
i2o_problem.h, scsi_problem.h, ide_problem.h, etc.
thanks,
greg k-h
Rusty Russell wrote:
> In message <[email protected]> you write:
>
>>>@@ -325,7 +326,8 @@
>>> while(inb(cmd_ioaddr) && --wait >= 0);
>>> #ifndef final_version
>>> if (wait < 0)
>>>- printk(KERN_ALERT "eepro100: wait_for_cmd_done timeout!\n");
>>>+ problem(LOG_ALERT, "eepro100: wait_for_cmd_done timeout!",
>>>+ detail(ioaddr, "%lx", cmd_ioaddr));
>>
>>bloat, the ioaddr can easily be deduced
>
>
> No! That's *exactly* the problem: you see:
> eepro100: wait_for_cmd_done timeout!
>
> in your logs, now *which* of the 5 eepro100 cards was it?
>
> wait_for_cmd_done(long cmd_ioaddr) should take a 'struct net_device *'
> and use net_problem, then no details needed.
right, that's a bug, it needs struct net_device * like the standard
Becker style.
>>>- printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
>>>+ pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO regio
>>
> n");
>
>>bloat, no advantage over printk
>
>
> Now, which of those 5 cards was it again?
Another bug, this driver should be using pci_request_regions() which
prints that stuff out :)
Does IBM want to submit a patch that cleans up these problems, and makes
the existing event logging more standard [and is compatible with
existing 2.4 and 2.5 kernels]?
As an aside, changing all those printks also introduces a _huge_ PITA
for driver developers porting drivers back and forth between 2.4 and 2.5.
Jeff
In message <[email protected]> you write:
> right, that's a bug, it needs struct net_device * like the standard
> Becker style.
And Andrey Savochkin is no idiot. But if he had a standard
problem-reporting macro to guide him, he wouldn't have done this.
>
> >>>- printk (KERN_ERR "eepro100: cannot reserve MMIO region\n");
> >>>+ pci_problem(LOG_ERR, pdev, "eepro100: cannot reserve MMIO regio
> >>
> > n");
> >
> >>bloat, no advantage over printk
> >
> >
> > Now, which of those 5 cards was it again?
>
> Another bug, this driver should be using pci_request_regions() which
> prints that stuff out :)
Ditto.
> Does IBM want to submit a patch that cleans up these problems, and makes
> the existing event logging more standard [and is compatible with
> existing 2.4 and 2.5 kernels]?
(Um, this is supposed to go into the 2.5 kernel, so the last arg is
void).
This doesn't deprecate printk(). It just formalizes printk() for
problem reporting inside drivers. As we see more tools come out
making this useful, I expect drivers will willingly change across.
You can ensure that each message says what the driver name and kernel
version is if you want, *without* changing every driver.
You can do this with some "printk templates", but *that* would be too
ugly to live:
#include <linux/net_templates.h>
...
printk(NETPROBLEM_27, dev, skb);
...
> As an aside, changing all those printks also introduces a _huge_ PITA
> for driver developers porting drivers back and forth between 2.4 and 2.5.
That's not a good reason. compat.h can define them into printks with
a bit of cpp magic. In fact, Werner already wrote that. Hmm... the
one in the patch they posted is not based on Werner's, but my older
work. It should look vaguely like:
/* Do not meddle in the affairs of cpp, for it is subtle, and quick to anger */
#define __detS_
#define __detV_
#define __detS_detail(label,format,value) " " #label "=" format
#define __detV_detail(label,format,value) value,
#define __detV_3(detail) detail
#define __detV_2(detail,...) detail __detV_3( __detV_ ## __VA_ARGS__)
#define __detV_1(detail,...) detail __detV_2( __detV_ ## __VA_ARGS__)
#define __detS_3(detail) detail
#define __detS_2(detail,...) detail __detS_3( __detS_ ## __VA_ARGS__)
#define __detS_1(detail,...) detail __detS_2( __detS_ ## __VA_ARGS__)
#define recovered(id, msg, ...) \
printk(KERN_WARNING "%p:%s:" __detS_1( __detS_ ## __VA_ARGS__) "%c", \
id, msg, __detV_1( __detV_ ## __VA_ARGS__) '\n')
#define problem(id, msg, ...) \
printk(KERN_ERR "%p:%s:" __detS_1( __detS_ ## __VA_ARGS__) "%c", \
id, msg, __detV_1( __detV_ ## __VA_ARGS__) '\n')
#define introduce(id, name, ...) \
printk(KERN_INFO "Introducing %p:%s:" \
__detS_1( __detS_ ## __VA_ARGS__) "%c", \
id, msg, __detV_1( __detV_ ## __VA_ARGS__) '\n')
#define unintroduce(id) \
printk(KERN_INFO "Unintroducing %p\n", id)
================
Now, these were the previous macros which didn't have a severity arg,
but you get the idea.
Rusty.
--
Anyone who quotes me in their sig is an idiot. -- Rusty Russell.
On 24 September 2002 02:49, Larry Kessler wrote:
> Jeff Garzik wrote:
> > > }
> > > if (!request_mem_region(pci_resource_start(pdev, 0),
> > > pci_resource_len(pdev, 0), "eepro100")) {
> > > - printk (KERN_ERR "eepro100: cannot reserve MMIO
> > > region\n"); + pci_problem(LOG_ERR, pdev, "eepro100: cannot
> > > reserve MMIO region");
> >
> > bloat, no advantage over printk
>
> the advantage is that the string, which means plenty to the developer, but
> possibly much less to a Sys Admin, can be replaced with a more descriptive
> message, in the local language, by editing the formatting template in
> user-space. Since the printk messages were mapped directly over to the
> problem macros, then the issue here I think is how useful (or not) the
> info. is more so than what interface is used.
The problem is that printks are very easy, people won't easily switch
to any other thing if it is hard to understand/use. If you can provide
such easy interface, then ok.
Regarding translation problem: it makes life easier to admins, i.e. you
enable Linux to be used by more stupid admins :-).
This race could not be won: Universe can always produce more remarkable
idiots. Next time they will ask you "what is eepro100?" and if you say
"it's a NIC, if you have intermittent link indicator problem try to reseat
network cord..." they will ask "what is NIC and how to reseat the cord?"
(btw, they will ask that in their native language). 8-(
Maybe requiring admins to know basic English is not that bad?
--
vda
My $0.02.
I understand some of the reasons why certain logging standardization
is helpful.
An example: an administrator is mainly interested in problems with the disks.
Can anyone write, say, a regular expression matching printk messages related
to disks with very low false positive and false negative levels?
I can't.
The current Larry's proposal has a disadvantage of being so big.
Anyone will have a strong internal opposition to the need to learn this
interface for just a simple logging.
Second. One of the most important things is handling of log messages
split over multiple printk and, possibly, multiple lines.
Larry seemed to omit the most interesting place in eepro100 driver:
speedo_show_state() :-)
Any logging infrastructure not being able to deal with places like
speedo_show_state() is only half-useful, unfortunately.
The user-level log management system should be notified that
such a dump is just a single (long) piece of information, consisting of
multiple lines.
I would think about an interface looking like:
log_piece_start_netdev(dev);
printk(KERN_WARNING "%s: Transmit timed out: status %4.4x "
" %4.4x at %d/%d command %8.8x.\n",
dev->name, status, inw(ioaddr + SCBCmd),
sp->dirty_tx, sp->cur_tx,
sp->tx_ring[sp->dirty_tx % TX_RING_SIZE].status);
speedo_show_state(dev);
log_piece_end_netdev(dev);
It is simple and not intrusive, allowing to keep most of the code as it is
(or as driver author prefers).
On top of that simple interface, you can have whatever complex infrastructure
you want, gradually bringing the code to something like
log_piece_start_netdev(dev);
log_severity(LOG_WARNING);
log_netdev_attrib(LOG_NETDEV_TRANSMIT);
printk(KERN_WARNING "%s: Transmit timed out: status %4.4x "
" %4.4x at %d/%d command %8.8x.\n",
dev->name, status, inw(ioaddr + SCBCmd),
sp->dirty_tx, sp->cur_tx,
sp->tx_ring[sp->dirty_tx % TX_RING_SIZE].status);
log_severity(LOG_DEBUG);
speedo_show_state(dev);
log_piece_end_netdev(dev);
or whatever you consider nice and useful.
Best regards
Andrey
P.S. of course, wait_for_cmd_done should have net_device as an argument,
and any logging infrastructure can't help there :-)
On Tue, 24 Sep 2002, Denis Vlasenko wrote:
> Regarding translation problem: it makes life easier to admins, i.e. you
> enable Linux to be used by more stupid admins :-).
Judging by the fact that most linux webservers were still running
vunlerable versions of apache as of the last netcraft survey I'd say that
goal has been already been hit. ;)
Gerhard
--
Gerhard Mack
[email protected]
<>< As a computer I find your faith in technology amusing.
That's a moot point.
You'll have to ask RedHat et al why they persist in backporting
security patches to "old" releases of Apache (etc) instead of
releasing the new versions. The effect is the same, with
vulnerabilities being squashed, but the version numbers reported
suggesting otherwise.
Phil
---------------------------------------------
Phil Randal
Network Engineer
Herefordshire Council
Hereford, UK
> -----Original Message-----
> From: Gerhard Mack [mailto:[email protected]]
> Sent: 24 September 2002 14:59
> To: Denis Vlasenko
> Cc: Larry Kessler; linux-kernel mailing list
> Subject: Re: [PATCH-RFC] README 1ST - New problem logging macros
> (2.5.38)
>
>
> On Tue, 24 Sep 2002, Denis Vlasenko wrote:
>
> > Regarding translation problem: it makes life easier to
> admins, i.e. you
> > enable Linux to be used by more stupid admins :-).
>
> Judging by the fact that most linux webservers were still running
> vunlerable versions of apache as of the last netcraft survey
> I'd say that
> goal has been already been hit. ;)
>
> Gerhard
>
>
> --
> Gerhard Mack
>
> [email protected]
>
> <>< As a computer I find your faith in technology amusing.
>
> -
> To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Tue, 24 Sep 2002, Randal, Phil wrote:
> You'll have to ask RedHat et al why they persist in backporting
> security patches to "old" releases of Apache (etc) instead of
> releasing the new versions. The effect is the same, with
> vulnerabilities being squashed, but the version numbers reported
> suggesting otherwise.
Because every new version gives you some new problems/incompatibilities,
which must not happen in a running production-environment.
And testing every security-update 4 weeks in the lab before putting
it into production would be worse.
c'ya
sven
--
The Internet treats censorship as a routing problem, and routes around it.
(John Gilmore on http://www.cygnus.com/~gnu/)
On Mon, 23 Sep 2002, Greg KH wrote:
> One further comment:
>
> On Mon, Sep 23, 2002 at 06:55:13PM -0700, Larry Kessler wrote:
> > --- linux-2.5.37/drivers/include/linux/net_problem.h Wed Dec 31 18:00:00 1969
> > +++ linux-2.5.37-net/include/linux/net_problem.h Mon Sep 23 20:04:23 2002
>
>
> > --- linux-2.5.37/drivers/include/linux/pci_problem.h Wed Dec 31 18:00:00 1969
> > +++ linux-2.5.37-net/include/linux/pci_problem.h Mon Sep 23 19:56:11 2002
>
> {sigh}
>
> Have people been ignoring all of the core driver changes that have been
> happening? Almost everything that is "struct device" now, with some bus
> specific things tacked on (and those bus specific things are getting
> slowly merged into struct device too.)
>
> It would make more sense (if you continue this path of changes to the
> kernel) to focus on the device, bus, and class structures. That way you
> don't have to create a usb_problem.h, iee1394_problem.h, i2c_problem.h,
> i2o_problem.h, scsi_problem.h, ide_problem.h, etc.
On a tangent and ignoring the completely correct statement...
Why do we have such a flat namespace in include/linux/ anyway? Assuming
introducing a problem.h header for each subsystem was a sound idea, we
could just have
include/linux/problem/pci.h
include/linux/problem/net.h
...
-pat
Hi,
On Tue, 24 Sep 2002, Gerhard Mack wrote:
> Judging by the fact that most linux webservers were still running
> vunlerable versions of apache as of the last netcraft survey I'd say that
> goal has been already been hit. ;)
They all must go through hell and back at least twice to understand, you
know... ;-)
Thunder
--
assert(typeof((fool)->next) == typeof(fool)); /* wrong */
On Tue, 2002-09-24 at 15:04, Randal, Phil wrote:
> That's a moot point.
>
> You'll have to ask RedHat et al why they persist in backporting
> security patches to "old" releases of Apache (etc) instead of
> releasing the new versions. The effect is the same, with
Simple enough answer. Because thats what our customers want generally.
They want fixes, they do not want updates that may break things in new
ways or require they change third party modules
Greg KH wrote:
>
> Hi,
>
> First off, this looks much nicer than what the Driver Hardening Group
> just tried to pass off as event logging.
Thanks for the feedback. You've asked a lot of good questions, some of
which have been answered in subsequent posts. Instead of attempting to
answer them all now, we'll make sure we take them into account as we
update and fix the original proposal. I'll answer a couple though.
> The two groups might want to
> get together to sort out which formats is the desired ones, as they are
> radically different.
We will.
>
> > * Developers, Distros, Sys Admins, etc. can simply edit the template
> > (or provide an alternate template) to control which information from
> > the problem record is displayed, how it is displayed, what language
> > it is displayed in, and can add additional information like probable
> > cause, recommended operator actions, recommended repair actions, etc.
> > ...all without requiring any changes in the device driver source code.
>
> But who is going to be doing these "translations"? Kernel log messages
> change with every release. That would be a _huge_ undertaking to
> translate them all.
I would not expect translations to be done for each and every version.
Not is the expectation that EVERY prink would be converted, only those that
report errors.
Distros could be motivated to provide translations, etc. for the kernel
versions that they base new releases on. It would just have to make
sense for them financially to translate and supplement what's in the
templates AND be an accepted/expected practice in the community (based on
some Distro feedback).
>
> I can see someone creating some job security for a long time with this
> task :)
>
And what's wrong with that ? 8)
On Thu, 26 Sep 2002, Larry Kessler wrote:
> Distros could be motivated to provide translations, etc. for the kernel
> versions that they base new releases on.
Unlikely. It's hard enough already when somebody who doesn't
speak the language submits a bugreport by email or through
bugzilla.
I don't want to imagine receiving a bug report from eg. Japan
that has a cut'n'pasted kernel error in Japanese. It's not just
that I can't read Japanese ... I don't even have the FONT to
display it.
regards,
Rik
--
A: No.
Q: Should I include quotations after my reply?
http://www.surriel.com/ http://distro.conectiva.com/
Rik van Riel wrote:
>
> On Thu, 26 Sep 2002, Larry Kessler wrote:
>
> > Distros could be motivated to provide translations, etc. for the kernel
> > versions that they base new releases on.
>
> Unlikely. It's hard enough already when somebody who doesn't
> speak the language submits a bugreport by email or through
> bugzilla.
>
> I don't want to imagine receiving a bug report from eg. Japan
> that has a cut'n'pasted kernel error in Japanese. It's not just
> that I can't read Japanese ... I don't even have the FONT to
> display it.
Right, so the tools that take kernel events and display them in
human-readable form must be written to always display in english,
with the option to also display in another language, thus allowing
the non-English-reading SysAdmin in Japan to easily understand the
info.
On Thursday 26 September 2002 04:01 pm, Larry Kessler wrote:
> Rik van Riel wrote:
> > On Thu, 26 Sep 2002, Larry Kessler wrote:
> > > Distros could be motivated to provide translations, etc. for the kernel
> > > versions that they base new releases on.
> >
> > Unlikely. It's hard enough already when somebody who doesn't
> > speak the language submits a bugreport by email or through
> > bugzilla.
> >
> > I don't want to imagine receiving a bug report from eg. Japan
> > that has a cut'n'pasted kernel error in Japanese. It's not just
> > that I can't read Japanese ... I don't even have the FONT to
> > display it.
>
> Right, so the tools that take kernel events and display them in
> human-readable form must be written to always display in english,
> with the option to also display in another language, thus allowing
> the non-English-reading SysAdmin in Japan to easily understand the
> info.
If your system is functional enough after the "event" that the user can make
this selection and the system listens to them, then it can clearly be done in
userspace.
Otherwise, the ability to select is kind of pointless. It has to be done
before you actually have a problem, and will result in swahili bug reports to
the list because that's the language it was outputting when the problem
happened and they can't necessarily reproduce the problem on demand.
Rob