ok, found and fixed the bug reported by K.R. Foley, and found the bug
reported by Florian Schmidt as well.
the first bug was caused by a lack of robustness in cond_resched(). The bug
happens when a task that is in do_exit() happens to be preempted via
cond_resched() - the TASK_ZOMBIE/TASK_DEAD task state is overwritten
with TASK_RUNNING and then the task crashes in the 'final' schedule. To
fix this i've changed cond_resched() to be much closer in behavior to
preempt_schedule() - this makes sense anyway.
Florian's bug triggers if softirq_preemption is disabled: if a softirq
still gets delayed to softirqd (this can happen even in the stock
kernel, under certain circumstances) then it would be executed without
disabling direct softirq execution. While this is safe and intended to
make softirqd preemptable when softirq_preemption==1, it's unsafe and an
illegal preemption when there are indirect softirqs around. The fix is
to properly disable softirqs in this branch too.
i've uploaded -R4 which fixes these two bugs:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk4-R4
other changes in -R4:
- add the RX-break-up to e100.c which was promised in -R0 - patch was
missing by mistake.
- small tweaks to the latency_trace header
Ingo
i've released -R5:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R5
2.6.9-rc1-bk12 patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
Changes in -R5:
- merge to 2.6.9-rc1-bk12
- fixed an in_atomic() bug in the SMP && !PREEMPT kernel
Ingo
On Sat, 4 Sep 2004 21:51:41 +0200
Ingo Molnar <[email protected]> wrote:
> i've uploaded -R4 which fixes these two bugs:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk4-R4
yep, no more lockup.. cool
flo
Florian Schmidt wrote:
> On Sat, 4 Sep 2004 21:51:41 +0200
> Ingo Molnar <[email protected]> wrote:
>
>
>>i've uploaded -R4 which fixes these two bugs:
>>
>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk4-R4
>
>
> yep, no more lockup.. cool
>
> flo
So far so good here also. Thanks Ingo.
kr
Ingo,
Any chance you will start syncing your patches up with the -mm tree ? Or is it
just -bk for the time being ?
Matt H.
On Sunday 05 September 2004 7:02 am, Ingo Molnar wrote:
> i've released -R5:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R5
>
> 2.6.9-rc1-bk12 patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
> + http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
>
> Changes in -R5:
>
> - merge to 2.6.9-rc1-bk12
>
> - fixed an in_atomic() bug in the SMP && !PREEMPT kernel
>
> Ingo
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
On Sun, 2004-09-05 at 10:02, Ingo Molnar wrote:
> i've released -R5:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R5
>
> 2.6.9-rc1-bk12 patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
> + http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
>
Ok, first new one in a while. This was with -R0, but I haven't seen
anyone else report it. Let me know if you need the complete trace.
preemption latency trace v1.0.2
-------------------------------
latency: 511 us, entries: 951 (951)
-----------------
| task: dbench/4810, uid:1000 nice:0 policy:0 rt_prio:0
-----------------
=> started at: kill_pg_info+0x10/0x50
=> ended at: kill_pg_info+0x2e/0x50
=======>
00000001 0.000ms (+0.000ms): kill_pg_info (sys_kill)
00000001 0.000ms (+0.000ms): __kill_pg_info (kill_pg_info)
00000001 0.000ms (+0.000ms): find_pid (__kill_pg_info)
00000001 0.001ms (+0.000ms): group_send_sig_info (__kill_pg_info)
00000001 0.001ms (+0.000ms): check_kill_permission (group_send_sig_info)
00000001 0.002ms (+0.000ms): dummy_task_kill (check_kill_permission)
00000002 0.002ms (+0.000ms): __group_send_sig_info (group_send_sig_info)
00000002 0.003ms (+0.000ms): handle_stop_signal (__group_send_sig_info)
00000002 0.003ms (+0.000ms): rm_from_queue (handle_stop_signal)
00000002 0.004ms (+0.000ms): rm_from_queue (handle_stop_signal)
00000002 0.004ms (+0.000ms): wake_up_state (handle_stop_signal)
00000002 0.005ms (+0.000ms): try_to_wake_up (wake_up_state)
00000002 0.005ms (+0.000ms): task_rq_lock (try_to_wake_up)
00000002 0.006ms (+0.000ms): next_thread (handle_stop_signal)
00000002 0.006ms (+0.000ms): sig_ignored (__group_send_sig_info)
00000002 0.007ms (+0.000ms): send_signal (__group_send_sig_info)
00000002 0.008ms (+0.000ms): kmem_cache_alloc (send_signal)
00000002 0.009ms (+0.001ms): memcpy (send_signal)
[...]
00000002 0.496ms (+0.000ms): kmem_cache_alloc (send_signal)
00000002 0.497ms (+0.000ms): memcpy (send_signal)
00000002 0.498ms (+0.000ms): __group_complete_signal (__group_send_sig_info)
00000002 0.498ms (+0.000ms): task_curr (__group_complete_signal)
00000001 0.498ms (+0.000ms): preempt_schedule (group_send_sig_info)
00000001 0.499ms (+0.000ms): group_send_sig_info (__kill_pg_info)
00000001 0.499ms (+0.000ms): check_kill_permission (group_send_sig_info)
00000001 0.499ms (+0.000ms): dummy_task_kill (check_kill_permission)
00000002 0.500ms (+0.000ms): __group_send_sig_info (group_send_sig_info)
00000002 0.500ms (+0.000ms): handle_stop_signal (__group_send_sig_info)
00000002 0.501ms (+0.000ms): rm_from_queue (handle_stop_signal)
00000002 0.501ms (+0.000ms): rm_from_queue (handle_stop_signal)
00000002 0.502ms (+0.000ms): wake_up_state (handle_stop_signal)
00000002 0.502ms (+0.000ms): try_to_wake_up (wake_up_state)
00000002 0.503ms (+0.000ms): task_rq_lock (try_to_wake_up)
00000003 0.503ms (+0.000ms): activate_task (try_to_wake_up)
00000003 0.503ms (+0.000ms): sched_clock (activate_task)
00000003 0.504ms (+0.000ms): recalc_task_prio (activate_task)
00000003 0.505ms (+0.000ms): effective_prio (recalc_task_prio)
00000003 0.505ms (+0.000ms): enqueue_task (activate_task)
00000002 0.505ms (+0.000ms): preempt_schedule (try_to_wake_up)
00000002 0.506ms (+0.000ms): next_thread (handle_stop_signal)
00000002 0.506ms (+0.000ms): sig_ignored (__group_send_sig_info)
00000002 0.507ms (+0.000ms): send_signal (__group_send_sig_info)
00000002 0.507ms (+0.000ms): kmem_cache_alloc (send_signal)
00000002 0.508ms (+0.000ms): cache_alloc_refill (kmem_cache_alloc)
00000002 0.509ms (+0.001ms): preempt_schedule (cache_alloc_refill)
00000002 0.510ms (+0.000ms): memcpy (send_signal)
00000002 0.510ms (+0.000ms): __group_complete_signal (__group_send_sig_info)
00000002 0.511ms (+0.000ms): task_curr (__group_complete_signal)
00000001 0.511ms (+0.000ms): preempt_schedule (group_send_sig_info)
00000001 0.512ms (+0.001ms): sub_preempt_count (kill_pg_info)
00000001 0.513ms (+0.000ms): update_max_trace (check_preempt_timing)
00000001 0.513ms (+0.000ms): _mmx_memcpy (update_max_trace)
00000001 0.514ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
Lee
* Lee Revell <[email protected]> wrote:
> Ok, first new one in a while. This was with -R0, but I haven't seen
> anyone else report it. Let me know if you need the complete trace.
>
> preemption latency trace v1.0.2
> -------------------------------
> latency: 511 us, entries: 951 (951)
> -----------------
> | task: dbench/4810, uid:1000 nice:0 policy:0 rt_prio:0
> -----------------
> => started at: kill_pg_info+0x10/0x50
> => ended at: kill_pg_info+0x2e/0x50
> =======>
> 00000001 0.000ms (+0.000ms): kill_pg_info (sys_kill)
> 00000001 0.000ms (+0.000ms): __kill_pg_info (kill_pg_info)
> 00000001 0.000ms (+0.000ms): find_pid (__kill_pg_info)
this is quite hard to fix - lots of processes were SIGKILL-ed (or
SIGTERM-ed) and the signal semantics require us to deliver signals
atomically. The only fix would be to turn the signal locks into
semaphores but that's quite hard. (it's also a bit problematic for
interrupt-delivered signals.)
Ingo
On Sun, 2004-09-05 at 15:12, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > Ok, first new one in a while. This was with -R0, but I haven't seen
> > anyone else report it. Let me know if you need the complete trace.
> >
> > preemption latency trace v1.0.2
> > -------------------------------
> > latency: 511 us, entries: 951 (951)
> > -----------------
> > | task: dbench/4810, uid:1000 nice:0 policy:0 rt_prio:0
> > -----------------
> > => started at: kill_pg_info+0x10/0x50
> > => ended at: kill_pg_info+0x2e/0x50
> > =======>
> > 00000001 0.000ms (+0.000ms): kill_pg_info (sys_kill)
> > 00000001 0.000ms (+0.000ms): __kill_pg_info (kill_pg_info)
> > 00000001 0.000ms (+0.000ms): find_pid (__kill_pg_info)
>
> this is quite hard to fix - lots of processes were SIGKILL-ed (or
> SIGTERM-ed) and the signal semantics require us to deliver signals
> atomically. The only fix would be to turn the signal locks into
> semaphores but that's quite hard. (it's also a bit problematic for
> interrupt-delivered signals.)
>
> Ingo
>
Here is a histogram I generated using realfeel2. This should provide
better data than my jackd histograms because the latter are dependent on
the ALSA driver, jackd's design, etc.
I had to modify the amlat utilities to use usecs instead of msecs, which
is a very good sign. ;-)
http://krustophenia.net/testresults.php?dataset=2.6.9-rc1-R0#/var/www/2.6.9-rc1-R0/foo.hist
I find the two smaller spikes to either side of the central spike really
odd. These showed up in my jackd tests too; I had attributed them to
some measurement artifact, but they seem real. Maybe a rounding bug, or
some kind of weird cache effect?
Lee
* Lee Revell <[email protected]> wrote:
> http://krustophenia.net/testresults.php?dataset=2.6.9-rc1-R0#/var/www/2.6.9-rc1-R0/foo.hist
>
> I find the two smaller spikes to either side of the central spike
> really odd. These showed up in my jackd tests too, I had attributed
> them to some measurement artifact, but they seem real. Maybe a
> rounding bug, or some kind of weird cache effect?
interesting - the histograms are pretty symmetric around the center.
E.g. the exponential foo.hist2 diagram is way too symmetric around 50
usecs! What precisely is being measured?
Ingo
On Mon, 2004-09-06 at 02:30, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > http://krustophenia.net/testresults.php?dataset=2.6.9-rc1-R0#/var/www/2.6.9-rc1-R0/foo.hist
> >
> > I find the two smaller spikes to either side of the central spike
> > really odd. These showed up in my jackd tests too, I had attributed
> > them to some measurement artifact, but they seem real. Maybe a
> > rounding bug, or some kind of weird cache effect?
>
> interesting - the histograms are pretty symmetric around the center.
> E.g. the exponential foo.hist2 diagram is way too symmetric around 50
> usecs! What precisely is being measured?
>
Here's the program. It does mlockall(), acquires realtime scheduling,
then sets up a 2048 Hz stream of interrupts from the RTC and measures the
delay. It's quite possible there's a bug: the amlat program did not
seem to work, so something must have changed with the RTC from 2.4 to 2.6.
/*
 * This was originally written by Mark Hahn. Obtained from
 * http://brain.mcmaster.ca/~hahn/realfeel.c
 */
#define _GNU_SOURCE /* must come before any #include to take effect */
#include <linux/rtc.h>
#include <sys/ioctl.h>
#include <fcntl.h>
#include <unistd.h>
#include <errno.h>
#include <math.h>
#include <stdlib.h>
#include <stdio.h>
#include <sys/time.h>
#include <sys/mman.h>
#include <sched.h>
#include <signal.h>
#include <string.h>
#include <getopt.h>

/* global vars */
volatile sig_atomic_t stopit; /* set from the signal handler to stop measuring */
#define SAMPLES 10000
int histogram[SAMPLES]; /* bins of 0.1 usec, shifted by 50 usec (see main) */
#define PAGE_SIZE 4096UL /* virtual memory page size */
#define OSCR_HZ 3686400 /* frequency of clock ticks */
#define OSCR 0x90000010 /* physical address of OSCR register */
unsigned long *oscr; /* ptr to OSCR */
void setup_clock (void)
{
    int fd;
    void *map_base;
    off_t target;
    off_t page;

#ifndef ARCHARM
    return;
#endif
    fd = open ("/dev/mem", O_RDWR | O_SYNC);
    if (-1 == fd) {
        perror ("open of /dev/mem failed\n");
        exit (1);
    }
    /* Map one page */
    target = OSCR;
    page = target & ~(PAGE_SIZE - 1);
    map_base = mmap (0, PAGE_SIZE, PROT_READ, MAP_SHARED, fd, page);
    if (MAP_FAILED == map_base) {
        perror ("mmap failed");
        (void) close (fd);
        exit (2);
    }
    oscr = map_base + (target - page);
    /* This does not end the mmap,
     * the mmap will go away when the process dies
     */
    close (fd);
}
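double second() {
    struct timeval tv;
    gettimeofday(&tv,0);
    return tv.tv_sec + 1e-6 * tv.tv_usec;
}

typedef unsigned long long u64;

u64 rdtsc() {
    u64 tsc;
#ifdef ARCHARM
    tsc = *oscr;
#else
    /* Read both halves explicitly: the "=A" constraint only names the
     * edx:eax pair on 32-bit x86 and gives wrong results on x86-64. */
    unsigned int lo, hi;
    __asm__ __volatile__("rdtsc" : "=a" (lo), "=d" (hi));
    tsc = ((u64)hi << 32) | lo;
#endif
    return tsc;
}

void selectsleep(unsigned us) {
    struct timeval tv;
    tv.tv_sec = 0;
    tv.tv_usec = us;
    select(0,0,0,0,&tv);
}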
double secondsPerTick, ticksPerSecond;

void calibrate()
{
    double sumx = 0;
    double sumy = 0;
    double sumxx = 0;
    double sumxy = 0;
    double slope;
    // least squares linear regression of ticks onto real time
    // as returned by gettimeofday.
    const unsigned n = 30;
    unsigned i;

    for (i=0; i<n; i++) {
        double breal,real,ticks;
        u64 bticks;

        breal = second();
        bticks = rdtsc();
        selectsleep((unsigned)(10000 + drand48() * 200000));
        ticks = rdtsc() - bticks;
        real = second() - breal;
        sumx += real;
        sumxx += real * real;
        sumxy += real * ticks;
        sumy += ticks;
    }
    slope = ((sumxy - (sumx*sumy) / n) /
             (sumxx - (sumx*sumx) / n));
    ticksPerSecond = slope;
    secondsPerTick = 1.0 / slope;
    printf("%3.3f MHz\n",ticksPerSecond*1e-6);
}
void fatal(char *msg)
{
    perror(msg);
    exit(1);
}

int set_realtime_priority(void)
{
    struct sched_param schp;
    /*
     * set the process to realtime privs
     */
    memset(&schp, 0, sizeof(schp));
    schp.sched_priority = sched_get_priority_max(SCHED_FIFO);
    if (sched_setscheduler(0, SCHED_FIFO, &schp) != 0) {
        perror("sched_setscheduler");
        exit(1);
    }
    return 0;
}

void hist(char *file)
{
    int i;
    FILE *f;

    f = fopen(file, "w");
    if (f == 0) {
        fprintf(stderr, "realfeel: can't open `%s':%s\n",
                file, strerror(errno));
        exit(1);
    }
    for (i = 0; i < SAMPLES; i++) {
        if (histogram[i]) {
            fprintf(f, "%d.%d %d\n",
                    i / 10, i % 10, histogram[i]);
        }
    }
    fclose(f);
    exit(0);
}

void signalled(int sig)
{
    stopit = 1;
}
static void usage(void)
{
    fprintf(stderr, "Usage: realfeel [--samples n] [--hertz n] filename.hist\n");
    exit(1);
}

uint bounded=0;
uint ncycles=0;
uint current=0;
int hz = 2048;

char *parse_options(int argc, char **argv)
{
    int c, option_index;
    static struct option long_options[] = {
        {"samples", required_argument, 0, 0},
        {"hertz", required_argument, 0, 0},
        {0, 0, 0, 0},
    };

    while ((c = getopt_long (argc, argv, "b:", long_options, &option_index)) != -1) {
        switch (option_index) {
        default:
        case -1:
            fprintf (stderr, "Bad option (%s)\n", optarg);
            usage();
            exit (1);
            break;
        case 0:
            bounded = 1;
            ncycles = atoi(optarg);
            break;
        case 1:
            hz = atoi(optarg);
            if (hz > 2048) {
                fprintf(stderr, "max allowable interrupt frequency is 2048 Hz\n");
                hz = 2048;
            }
            else if (hz <= 0) {
                fprintf(stderr, "zero or negative frequency doesn't make sense!\n");
                hz = 1;
            }
        }
    }
    if (argv[optind] == NULL) {
        fprintf(stderr, "histogram file name required\n");
        usage();
        exit (2);
    }
    return argv[optind];
}
#define msec(f) (1e3 * (f))
#define usec(f) (1e6 * (f))

int main(int argc, char *argv[])
{
    int fd;
    double ideal;
    u64 last;
    double max_delay = 0;
    char *histfile = parse_options(argc, argv);

    if (mlockall(MCL_CURRENT|MCL_FUTURE) != 0) {
        perror("mlockall");
        exit(1);
    }
    setup_clock ();
    set_realtime_priority();
    calibrate();
    printf("secondsPerTick=%f\n", secondsPerTick);
    printf("ticksPerSecond=%f\n", ticksPerSecond);
    fd = open("/dev/rtc",O_RDONLY);
    if (fd == -1)
        fatal("failed to open /dev/rtc");
    ideal = 1.0 / hz;
    if (ioctl(fd, RTC_IRQP_SET, hz) == -1)
        fatal("ioctl(RTC_IRQP_SET) failed");
    printf("Interrupt frequency: %d Hz\n",hz);
    if (bounded) {
        printf("running for %d samples\n", ncycles);
    }
    /* Enable periodic interrupts */
    if (ioctl(fd, RTC_PIE_ON, 0) == -1)
        fatal("ioctl(RTC_PIE_ON) failed");
    signal(SIGINT, signalled);
    last = rdtsc();
    while (!stopit) {
        u64 now;
        double delay;
        int data;
        int ms; /* despite the name: bin index in units of 0.1 usec */

        if (read(fd, &data, sizeof(data)) == -1)
            fatal("blocking read failed");
        now = rdtsc();
        delay = secondsPerTick * (now - last);
        if (delay > max_delay) {
            max_delay = delay;
            // printf("%.3f msec\n", -(1e3 * (ideal - delay)));
        }
        ms = (-(ideal - delay) + 1.0/20000.0) * 10000000;
        if (ms < 0)
            ms = 0; /* hmmm */
        if (ms >= SAMPLES)
            ms = SAMPLES - 1; /* clamp to the last valid bin */
        histogram[ms]++;
        if (bounded) {
            if (++current >= ncycles) {
                printf ("finished collecting %d samples\n", ncycles);
                printf ("maximum cycle time: %.3fms\n",
                        -msec(ideal - max_delay));
                break;
            }
            if ((current % 10000) == 0) {
                printf("%d cycles (max cycle time so far: %.3fus)\n",
                       current, -(usec(ideal - max_delay)));
            }
        }
        last = now;
    }
    if (ioctl(fd, RTC_PIE_OFF, 0) == -1)
        fatal("ioctl(RTC_PIE_OFF) failed");
    hist(histfile);
    return 0;
}
> Ingo
>
i've released the -R6 patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R6
Changes in -R6:
- fixed a CONFIG_SMP + CONFIG_PREEMPT bug that had the potential to
cause spinlock related lockups. (UP kernels are unaffected.) This bug
got introduced in -R5.
2.6.9-rc1-bk12 patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
Ingo
On Monday 06 of September 2004 13:06, Ingo Molnar wrote:
>
> i've released the -R6 patch:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R6
>
> Changes in -R6:
>
> - fixed a CONFIG_SMP + CONFIG_PREEMPT bug that had the potential to
> cause spinlock related lockups. (UP kernels are unaffected.) This bug
> got introduced in -R5.
>
> 2.6.9-rc1-bk12 patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
> + http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
I did as instructed, but it didn't compile (on a UP x86-64 system). I got
this:
CC arch/x86_64/kernel/irq.o
arch/x86_64/kernel/irq.c: In function `request_irq':
arch/x86_64/kernel/irq.c:498: warning: implicit declaration of function `setup_irq'
CC arch/x86_64/kernel/ptrace.o
CC arch/x86_64/kernel/i8259.o
arch/x86_64/kernel/i8259.c: In function `init_IRQ':
arch/x86_64/kernel/i8259.c:570: warning: implicit declaration of function `setup_irq'
CC arch/x86_64/kernel/ioport.o
CC arch/x86_64/kernel/ldt.o
CC arch/x86_64/kernel/setup.o
CC arch/x86_64/kernel/time.o
arch/x86_64/kernel/time.c: In function `time_init':
arch/x86_64/kernel/time.c:820: warning: implicit declaration of function `setup_irq'
CC arch/x86_64/kernel/sys_x86_64.o
[- snip -]
CC kernel/hardirq.o
kernel/hardirq.c: In function `recalculate_desc_flags':
kernel/hardirq.c:314: error: `SA_NODELAY' undeclared (first use in this function)
kernel/hardirq.c:314: error: (Each undeclared identifier is reported only once
kernel/hardirq.c:314: error: for each function it appears in.)
kernel/hardirq.c: In function `generic_setup_irq':
kernel/hardirq.c:344: error: `SA_NODELAY' undeclared (first use in this function)
kernel/hardirq.c: In function `threaded_read_proc':
kernel/hardirq.c:659: error: `SA_NODELAY' undeclared (first use in this function)
kernel/hardirq.c: In function `threaded_write_proc':
kernel/hardirq.c:677: error: `SA_NODELAY' undeclared (first use in this function)
make[1]: *** [kernel/hardirq.o] Error 1
make: *** [kernel] Error 2
The .config is attached.
Regards,
RJW
--
For a successful technology, reality must take precedence over public
relations, for nature cannot be fooled.
-- Richard P. Feynman
On Mon, 2004-09-06 at 13:48, Rafael J. Wysocki wrote:
> On Monday 06 of September 2004 13:06, Ingo Molnar wrote:
> >
> > i've released the -R6 patch:
> >
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R6
> >
> > Changes in -R6:
> >
> > - fixed a CONFIG_SMP + CONFIG_PREEMPT bug that had the potential to
> > cause spinlock related lockups. (UP kernels are unaffected.) This bug
> > got introduced in -R5.
> >
> > 2.6.9-rc1-bk12 patching order is:
> >
> > http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> > + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
> > + http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
>
> I did as instructed, but it didn't compile (on a UP x86-64 system). I got
> this:
>
> CC arch/x86_64/kernel/irq.o
> arch/x86_64/kernel/irq.c: In function `request_irq':
> arch/x86_64/kernel/irq.c:498: warning: implicit declaration of function
> `setup_irq'
> CC arch/x86_64/kernel/ptrace.o
> CC arch/x86_64/kernel/i8259.o
> arch/x86_64/kernel/i8259.c: In function `init_IRQ':
> arch/x86_64/kernel/i8259.c:570: warning: implicit declaration of function
> `setup_irq'
> CC arch/x86_64/kernel/ioport.o
> CC arch/x86_64/kernel/ldt.o
> CC arch/x86_64/kernel/setup.o
> CC arch/x86_64/kernel/time.o
> arch/x86_64/kernel/time.c: In function `time_init':
> arch/x86_64/kernel/time.c:820: warning: implicit declaration of function
> `setup_irq'
> CC arch/x86_64/kernel/sys_x86_64.o
> [- snip -]
> C kernel/hardirq.o
> kernel/hardirq.c: In function `recalculate_desc_flags':
> kernel/hardirq.c:314: error: `SA_NODELAY' undeclared (first use in this
> function)
> kernel/hardirq.c:314: error: (Each undeclared identifier is reported only once
> kernel/hardirq.c:314: error: for each function it appears in.)
> kernel/hardirq.c: In function `generic_setup_irq':
> kernel/hardirq.c:344: error: `SA_NODELAY' undeclared (first use in this
> function)
> kernel/hardirq.c: In function `threaded_read_proc':
> kernel/hardirq.c:659: error: `SA_NODELAY' undeclared (first use in this
> function)
> kernel/hardirq.c: In function `threaded_write_proc':
> kernel/hardirq.c:677: error: `SA_NODELAY' undeclared (first use in this
> function)
> make[1]: *** [kernel/hardirq.o] Error 1
> make: *** [kernel] Error 2
>
It doesn't look like it is fully ported to x86_64 systems yet. These
compile errors are easy to get rid of, but the functionality doesn't seem
to be there. That's probably why Ingo hasn't added PREEMPT_VOLUNTARY to the
x86_64 Kconfig, even though I saw a few bits of x86_64 code in the patch.
Or am I missing something?
* Alexander Nyberg <[email protected]> wrote:
> It doesn't look like it is fully ported to x86_64 systems yet, these
> compile errors are easy to move away but the functionality doesn't
> seem to be there. Probably why Ingo hasn't added the PREEMPT_VOLUNTARY
> to the x86_64 Kconfig even though I saw a few bits of x86_64 code in
> the patch.
yeah, it probably doesnt compile on anything other than x86 right now.
Ingo
Lee Revell wrote:
> On Mon, 2004-09-06 at 02:30, Ingo Molnar wrote:
>
>>* Lee Revell <[email protected]> wrote:
>>
>>
>>>http://krustophenia.net/testresults.php?dataset=2.6.9-rc1-R0#/var/www/2.6.9-rc1-R0/foo.hist
>>>
>>>I find the two smaller spikes to either side of the central spike
>>>really odd. These showed up in my jackd tests too, I had attributed
>>>them to some measurement artifact, but they seem real. Maybe a
>>>rounding bug, or some kind of weird cache effect?
>>
>>interesting - the histograms are pretty symmetric around the center.
>>E.g. the exponential foo.hist2 diagram is way too symmetric around 50
>>usecs! What precisely is being measured?
>>
>
>
> Here's the program. It does mlockall(), acquires realtime scheduling,
> then sets up a 2048 Hz stream of interrupts from the RTC and measures the
> delay. It's quite possible there's a bug, the amlat program did not
> seem to work, something must have changed with the RTC from 2.4 to 2.6.
>
Actually the amlat program works fine for applying real-time scheduling
pressure. I believe it just doesn't do any real latency measuring
without the hooks provided by Andrew's rtc-debug patch.
kr
i've ported the VP patch to x64. I havent boot-tested it, but it
compiles cleanly and it might even boot:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R7
Caveats: normal kernel with PREEMPT, PREEMPT_VOLUNTARY, PREEMPT_SOFTIRQS
and PREEMPT_HARDIRQS disabled should work just fine.
A kernel with PREEMPT=y and PREEMPT_VOLUNTARY=y ought to work too - with
a smaller probability though. PREEMPT_SOFTIRQS=y should be the next step
- this one might work too. (PREEMPT_HARDIRQS=y doesnt do anything on x64
yet, because i havent changed the irq code. I'd like to keep non-x86
changes small, unless a developer picks it up - like it happened for the
ppc and ppc64 port of the VP patch.)
PREEMPT_TIMING=y might work too if the previous ones worked. The most
problematic one is probably LATENCY_TRACE=y - i've added the proper
mcount assembly code but mostly blindly. It does compile.
so please try this kernel on real hw and try to figure out step by step
at which stage in the following order of parameters it breaks:
PREEMPT=y
PREEMPT_VOLUNTARY=y
PREEMPT_SOFTIRQS=y
PREEMPT_TIMING=y
LATENCY_TRACE=y
(when enabling a new option in this sequence keep all the previous
options enabled.)
Worst-case it already breaks with all these options disabled - in this
case please double-check whether vanilla -bk12 x64 boots fine with the
same .config.
Best-case it works fine with all options enabled - quite unlikely.
If it breaks it will break early and hard during bootup, so data is
probably not at risk - but be careful nevertheless.
to get a 2.6.9-rc1-bk12 kernel the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
Ingo
test-booted the x64 kernel and found a number of bugs in the x64 port of
the VP patch. I've uploaded -R8 that fixes them:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R8
NOTE: i tested a (non-modular) 64-bit bzImage on a 32-bit OS (FC2) but
havent booted it on a 64-bit userland yet. But i'd expect 64-bit
userspace to work just fine too.
to get a 2.6.9-rc1-bk12 kernel the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
Ingo
On Tue, 2004-09-07 at 13:57, Ingo Molnar wrote:
> test-booted the x64 kernel and found a number of bugs in the x64 port of
> the VP patch. I've uploaded -R8 that fixes them:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R8
>
> NOTE: i tested a (non-modular) 64-bit bzImage on a 32-bit OS (FC2) but
> havent booted it on a 64-bit userland yet. But i'd expect 64-bit
> userspace to work just fine too.
Looks fine over here on a 2-CPU box with Debian 64-bit user-space, with both
preempt & voluntary preempt turned on. Init seems to explode (gets killed
over and over, not sure how this happens) with CONFIG_LATENCY_TRACE. I'll
take a look at that later today unless you're aware of a likely offender.
===== linux-2.5/arch/x86_64/kernel/x8664_ksyms.c 1.34 vs edited =====
--- 1.34/arch/x86_64/kernel/x8664_ksyms.c 2004-08-24 11:08:31 +02:00
+++ edited/linux-2.5/arch/x86_64/kernel/x8664_ksyms.c 2004-09-07 16:31:46 +02:00
@@ -221,3 +221,7 @@
#endif
EXPORT_SYMBOL(cpu_khz);
+
+#ifdef CONFIG_LATENCY_TRACE
+EXPORT_SYMBOL(mcount);
+#endif
> > Looks fine over here on 2-CPU, debian 64-bit user-space with both
> > preempt & voluntary preempt turned on. Init seems to explode (gets
> > killed over and over, not sure how this happens) on
> > CONFIG_LATENCY_TRACE, I'll take a look at that later today unless you
> > have any offender you're aware of.
>
> > +#ifdef CONFIG_LATENCY_TRACE
> > +EXPORT_SYMBOL(mcount);
> > +#endif
>
> thanks, i've added this to latency.c. You can find my current snapshot
> at:
>
> http://redhat.com/~mingo/private/voluntary-preempt-2.6.9-rc1-bk12-R9-A6
>
> there are a lot of changes - perhaps one of them fixes your LATENCY_TRACE
> problem - but the likelihood is low, for me LATENCY_TRACE worked fine on
> amd64 even with -R8.
I didn't even get to see any text before it rebooted, hmm? As for
LATENCY_TRACE, I'll take a look with -R8 later. Do you know what could
have caused this sudden reboot?
* Alexander Nyberg <[email protected]> wrote:
> Looks fine over here on 2-CPU, debian 64-bit user-space with both
> preempt & voluntary preempt turned on. Init seems to explode (gets
> killed over and over, not sure how this happens) on
> CONFIG_LATENCY_TRACE, I'll take a look at that later today unless you
> have any offender you're aware of.
> +#ifdef CONFIG_LATENCY_TRACE
> +EXPORT_SYMBOL(mcount);
> +#endif
thanks, i've added this to latency.c. You can find my current snapshot
at:
http://redhat.com/~mingo/private/voluntary-preempt-2.6.9-rc1-bk12-R9-A6
there are a lot of changes - perhaps one of them fixes your LATENCY_TRACE
problem - but the likelihood is low, for me LATENCY_TRACE worked fine on
amd64 even with -R8.
Ingo
On Tuesday 07 of September 2004 13:57, Ingo Molnar wrote:
>
> test-booted the x64 kernel and found a number of bugs in the x64 port of
> the VP patch. I've uploaded -R8 that fixes them:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R8
>
> NOTE: i tested a (non-modular) 64-bit bzImage on a 32-bit OS (FC2) but
> havent booted it on a 64-bit userland yet. But i'd expect 64-bit
> userspace to work just fine too.
Heh, works for me like a charm! Thank you very much, Ingo, for the excellent
job!
It hung once (the first time I tried to boot it), while starting either
postfix or cron (AFAIR), but I haven't been able to get any trace and I can't
reproduce this. I'm using the serial console just in case right now. I'll
give it a run in production tomorrow.
I'm attaching the output of dmesg in case you want to know what the system is.
Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
On Sun, 2004-09-05 at 10:02, Ingo Molnar wrote:
> i've released -R5:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R5
Ingo, here is a report from a user (Rui) of a problem that seems to have
been introduced in Q5. The symptoms look very similar to the SMP/HT
problems that were thought to be fixed. I have already requested more
info as to what happens if soft/hardirq preemption are enabled.
---
I'm having some trouble with latest VP patches on my P4 HT/SMP box. The
trouble is that since Q5 I can't get my machine to boot reliably,
if at all. It goes almost all through the init scripts to drop dead on
the beach, so to speak. It just freezes completely somewhere before the
login prompts.
This only happens if the kernel is configured for SMP/SMT
(HyperThreading). The very same kernel configured and built for UP boots
and runs fine. As I said before, this was introduced in the Q5 patch, and
the same showstopper is present in the latest R6. Only with Q3 am I still
happy, although only with softirq-preempt=0 AND hardirq-preempt=0.
The "offending" box is a SUSE 9.1 based one, P4 2.80C HT on a ASUS
P4P800 mobo, 1GB DDR.
---
Lee
On Tue, 2004-09-07 at 07:57, Ingo Molnar wrote:
> test-booted the x64 kernel and found a number of bugs in the x64 port of
> the VP patch. I've uploaded -R8 that fixes them:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R8
>
Does not work on 32 bit x86:
CHK include/linux/compile.h
UPD include/linux/compile.h
CC init/version.o
LD init/built-in.o
LD .tmp_vmlinux1
kernel/built-in.o(.init.text+0xcbf): In function `interruptible_sleep_on':
kernel/sched.c:1563: undefined reference to `init_irq_proc'
make: *** [.tmp_vmlinux1] Error 1
Lee
On Tue, 2004-09-07 at 18:55, Lee Revell wrote:
> Ingo, here is a report from a user (Rui) of a problem that seems to have
> been introduced in Q5.
Ugh, I got Rui's address wrong in the first post, please use the one
above for any followups. Sorry.
Lee
> Lee Revell wrote:
>> Ingo, here is a report from a user (Rui) of a problem that seems to
>> have been introduced in Q5.
>
> Ugh, I got Rui's address wrong in the first post, please use the one
> above for any followups. Sorry.
>
>Rui Nuno Capela wrote:
>>
>> I'm having some trouble with latest VP patches on my P4 HT/SMP box. The
>> trouble is that since Q5 I can't get my machine to boot reliably,
>> if at all. It goes almost all through the init scripts to drop dead on
>> the beach, so to speak. It just freezes completely somewhere before the
>> login prompts.
>>
>> This only happens if the kernel is configured for SMP/SMT
>> (HyperThreading). The very same kernel configured and built for UP
>> boots and runs fine. As I said before this was introduced on the Q5
>> patch, and the same showstopper is present on latest R6. Only with Q3
>> I'm still happy, although only with softirq-preempt=0 AND
>> hardirq-preempt=0.
>>
>> The "offending" box is a SUSE 9.1 based one, P4 2.80C HT on an ASUS
>> P4P800 mobo, 1GB DDR.
>>
>
> I posted the above report to LKML and cc'ed you and Ingo. None of the
> LKML testers are reporting this problem, it sounds very similar to the
> problems others have had with SMP/HT but those were thought to all be
> solved.
>
Thanks Lee. Yes I was pretty excited with those reports too (I've been
lurking on LKML :), but I just seem to be unlucky. That's why I've been
trying one by one of the patches, from Q5 to R6 and tweaking the available
options as much as I know and time permitted.
OK, could just someone with a P4 HT/SMP box hand me their working kernel
.config file for me to try? That could be a good starting point, if not a
plain baseline.
>
> Has any version worked with softirq or hardirq preemption enabled? What
> are the symptoms if you boot Q3 with either of the above enabled?
>
On this box in question, softirq and hardirq-preempt options have NEVER
led to a stable SMP/HT system, ever since my first rehearsal with VP,
which I think was around O3. UP is quite different, it works and always
worked, as advertised :) FWIW, R6 is pumping hard on my laptop :)
Q3-SMP doesn't pass the KDE 3.3 startup splash if I set softirq=1 or
hardirq=1. The system just hangs. I still keep softirq-preempt=0 and
hardirq-preempt=0 as my Q3-SMP kernel bootloader parameters though.
Hope this gets cleared up. All I can do is offer my best efforts to
dig this out, provided someone gives me a hint ;)
Thank you all.
--
rncbc aka Rui Nuno Capela
[email protected]
I'm running the VP patch on a PII 400MHz to more closely approximate an
embedded target. I get a 21ms latency trace during boot which dwarfs
other latencies and prevents me from seeing any of the later latencies
when I'm running my test. The trace (from -R5) is available here:
http://hilman.org/kevin/VP/trace-cond_resched.txt
At first glance, it appears to be the result of an accumulation of
calls to __delay() from the 3c59x vortex driver. Any ideas what's
going on here?
Is there a way to disable the trace by default and enable it later via
/proc? I see that the preemption itself can be disabled via
command-line and then enabled later via /proc but I don't see the same
for the latency trace.
Kevin
http://hilman.org/
On Wed, 2004-09-08 at 00:22, Kevin Hilman wrote:
> At first glance, it appears to be the result of an accumulation of
> calls to __delay() from the 3c59x vortex driver. Any ideas what's
> going on here?
>
> Is there a way to disable the trace by default and enable it later via
> /proc? I see that the preemption itself can be disabled via
> command-line and then enabled later via /proc but I don't see the same
> for the latency trace.
echo 0 > /proc/sys/kernel/preempt_max_latency will reset the counter.
There is probably no point in fixing boot time latencies.
Lee
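For anyone scripting the reset rather than typing the echo by hand, the same write can be done from C. This is only a sketch: the helper name reset_latency_counter is made up for illustration, and it takes the path as a parameter (the thread gives /proc/sys/kernel/preempt_max_latency) so it can be tried against a scratch file first.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Write "0" to a sysctl-style file, which is what
 * `echo 0 > /proc/sys/kernel/preempt_max_latency` does.
 * Returns 0 on success, -1 if the file could not be opened or written.
 * (Hypothetical helper, not part of the VP patch.)
 */
int reset_latency_counter(const char *path)
{
    FILE *f;
    int ok;

    assert(path != NULL);

    f = fopen(path, "w");
    if (!f)
        return -1;

    ok = (fputs("0\n", f) != EOF);
    if (fclose(f) != 0)   /* buffered data is flushed here, so check it */
        ok = 0;

    return ok ? 0 : -1;
}
```

Calling it with the /proc path needs root on a VP-patched kernel; on any other system the open is expected to fail and the helper returns -1.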
On Mon, 2004-09-06 at 07:06, Ingo Molnar wrote:
> i've released the -R6 patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R6
I get these latencies when I cause the machine to swap by compiling a
kernel with make -j32. They get bigger as the machine gets further into
swap.
Every 2.0s: head -60 /proc/latency_trace Wed Sep 8 02:51:40 2004
preemption latency trace v1.0.6 on 2.6.9-rc1-bk12-VP-R6
--------------------------------------------------
latency: 605 us, entries: 5 (5) [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: kswapd0/35, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: get_swap_page+0x23/0x490
=> ended at: get_swap_page+0x13f/0x490
=======>
00000001 0.000ms (+0.606ms): get_swap_page (add_to_swap)
00000001 0.606ms (+0.000ms): sub_preempt_count (get_swap_page)
00000001 0.606ms (+0.000ms): update_max_trace (check_preempt_timing)
00000001 0.606ms (+0.000ms): _mmx_memcpy (update_max_trace)
00000001 0.607ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
Lee
* Lee Revell <[email protected]> wrote:
> Does not work on 32 bit x86:
>
> CHK include/linux/compile.h
> UPD include/linux/compile.h
> CC init/version.o
> LD init/built-in.o
> LD .tmp_vmlinux1
> kernel/built-in.o(.init.text+0xcbf): In function `interruptible_sleep_on':
> kernel/sched.c:1563: undefined reference to `init_irq_proc'
> make: *** [.tmp_vmlinux1] Error 1
does -R9 work for you:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R9
to get a 2.6.9-rc1-bk12 kernel the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> OK, could just someone with a P4 HT/SMP box hand me their working
> kernel .config file for me to try? That could be a good starting
> point, if not a plain baseline.
I'll try the latest VP kernel (-R9) on a P4/HT SMP box in a minute and
will send you a .config if it works. Could you also send me your
.config?
Ingo
* Ingo Molnar <[email protected]> wrote:
> * Rui Nuno Capela <[email protected]> wrote:
>
> > OK, could just someone with a P4 HT/SMP box hand me their working
> > kernel .config file for me to try? That could be a good starting
> > point, if not a plain baseline.
>
> I'll try the latest VP kernel (-R9) on a P4/HT SMP box in a minute and
> will send you a .config if it works. [...]
P4/HT SMP works fine here - config attached.
since your lockups occur under X, could you try to disable DRI/DRM in
your XConfig? Also, would it be possible to connect that box to another
box via a serial line and enable the kernel's serial console via the
'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
other box, set the serial line to 38400 baud there too and capture all
kernel messages that occur when the lockups happens? Also enable the NMI
watchdog via nmi_watchdog=1.
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela <[email protected]> wrote:
>
>> OK, could just someone with a P4 HT/SMP box hand me their working
>> kernel .config file for me to try? That could be a good starting
>> point, if not a plain baseline.
>
> I'll try the latest VP kernel (-R9) on a P4/HT SMP box in a minute and
> will send you a .config if it works. Could you also send me your
> .config?
>
Thanks Ingo. Here goes my .config; although Q3-specific, it's been the base
for all others (from Q5 to R6).
A few moments ago, Tim Savannah was kind enough to send me his working
.config for 2.6.9-rc1-bk13 with ck patch and voluntary
preempt R6. At first glance it seems that he takes the monolithic approach
while I prefer all-modular. The other main difference is that he has
HIGHMEM disabled, while I'm on HIGHMEM(4GB) 'coz my machine has 1GB of RAM
:)
I didn't test Tim's configuration yet, as I can only do it when I get back
home tonight. I'll wait for yours too then.
Thanks again.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> A few moments ago, Tim Savannah was kind enough to send me his working
> .config for 2.6.9-rc1-bk13 with ck patch and voluntary preempt R6. At
> first glance it seems that he takes the monolithic approach while I
> prefer all-modular. The other main difference is that he has HIGHMEM
> disabled, while I'm on HIGHMEM(4GB) 'coz my machine has 1GB of RAM
> :)
mine is HIGHMEM4G too.
Ingo
On Wednesday 08 of September 2004 10:20, Ingo Molnar wrote:
>
> * Lee Revell <[email protected]> wrote:
>
> > Does not work on 32 bit x86:
> >
> > CHK include/linux/compile.h
> > UPD include/linux/compile.h
> > CC init/version.o
> > LD init/built-in.o
> > LD .tmp_vmlinux1
> > kernel/built-in.o(.init.text+0xcbf): In function `interruptible_sleep_on':
> > kernel/sched.c:1563: undefined reference to `init_irq_proc'
> > make: *** [.tmp_vmlinux1] Error 1
>
> does -R9 work for you:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-R9
Er, it doesn't work for me:
HOSTCC scripts/genksyms/parse.o
HOSTLD scripts/genksyms/genksyms
CC scripts/mod/empty.o
/bin/sh: line 1: x86_64-unknown-linux-gcc: command not found
make[2]: *** [scripts/mod/empty.o] Error 127
make[1]: *** [scripts/mod] Error 2
make: *** [scripts] Error 2
Greets,
RJW
--
- Would you tell me, please, which way I ought to go from here?
- That depends a good deal on where you want to get to.
-- Lewis Carroll "Alice's Adventures in Wonderland"
* Rafael J. Wysocki <[email protected]> wrote:
> Er, it doesn't work for me:
>
> HOSTCC scripts/genksyms/parse.o
> HOSTLD scripts/genksyms/genksyms
> CC scripts/mod/empty.o
> /bin/sh: line 1: x86_64-unknown-linux-gcc: command not found
> make[2]: *** [scripts/mod/empty.o] Error 127
> make[1]: *** [scripts/mod] Error 2
> make: *** [scripts] Error 2
please re-download -R9, it had my crosscompiler flags included by
accident.
Ingo
On Wed, 2004-09-08 at 04:20, Ingo Molnar wrote:
> does -R9 work for you:
>
No, same error:
LD init/built-in.o
LD .tmp_vmlinux1
kernel/built-in.o(.init.text+0xcbf): In function `interruptible_sleep_on':
kernel/sched.c:1563: undefined reference to `init_irq_proc'
make: *** [.tmp_vmlinux1] Error 1
Here is the change that is responsible. R6 compiles:
rlrevell@mindpipe:~/kernel-source/linux-2.6.9-rc1-bk12-R8$ grep init_irq_proc ../voluntary-preempt-2.6.9-rc1-bk12-R6
-void init_irq_proc (void)
-void init_irq_proc (void)
-void init_irq_proc (void)
+void init_irq_proc (void)
R8 and later do not:
rlrevell@mindpipe:~/kernel-source/linux-2.6.9-rc1-bk12-R8$ grep init_irq_proc ../voluntary-preempt-2.6.9-rc1-bk12-R9
-void init_irq_proc (void)
-void init_irq_proc (void)
-void init_irq_proc (void)
+extern void generic_init_irq_proc(void);
+static inline void init_irq_proc(void)
+ generic_init_irq_proc();
+void generic_init_irq_proc(void)
Lee
* Lee Revell <[email protected]> wrote:
> LD .tmp_vmlinux1
> kernel/built-in.o(.init.text+0xcbf): In function `interruptible_sleep_on':
> kernel/sched.c:1563: undefined reference to `init_irq_proc'
> make: *** [.tmp_vmlinux1] Error 1
could you try -S0:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc1-bk12-S0
[ to get a 2.6.9-rc1-bk12 kernel the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/patch-2.6.9-rc1-bk12.bz2 ]
Ingo
Hi,
I'm back with some news :)
>* Ingo Molnar wrote:
>
>> * Rui Nuno Capela wrote:
>>
>> > OK, could just someone with a P4 HT/SMP box hand me their working
>> > kernel .config file for me to try? That could be a good starting
>> > point, if not a plain baseline.
>>
>> I'll try the latest VP kernel (-R9) on a P4/HT SMP box in a minute and
>> will send you a .config if it works. [...]
>
> P4/HT SMP works fine here - config attached.
>
> since your lockups occur under X, could you try to disable DRI/DRM in
> your XConfig? Also, would it be possible to connect that box to another
> box via a serial line and enable the kernel's serial console via the
> 'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
> other box, set the serial line to 38400 baud there too and capture all
> kernel messages that occur when the lockups happens? Also enable the NMI
> watchdog via nmi_watchdog=1.
>
OK. Spent the whole late night building several R9's, enabling modules one
by one, and testing most options in an incremental fashion. Haven't tried
the serial console yet but I found something that's otherwise specific to
my hardware setup, but only exposed with the VP SMP configuration, as I've
been recently reporting.
So the base experiments were conducted by applying on:
linux-2.6.9-rc1.tar.bz2
the following patches in sequence:
patch-2.6.9-rc1-bk12.bz2
voluntary-preempt-2.6.9-rc1-bk12-R9
Here is my summary report:
1) My first experiment was using Ingo's .config exactly as he gave me,
with the sole exception that I switched reiserfs support to linked in, as
needed by primary/boot disk partitions. It boots OK, fine but missing
almost every needed module/device support.
2) Switched module (un|auto)loading support. OK.
3) Several config iterations later, which were taken by switching in some
modules I found obvious to be included one by one, I've finally reached
the showstopper: USB support.
4) Then I wondered: the problem must be in one of the plugged USB devices. And
right I was: my WACOM GRAPHIRE2 USB tablet was the culprit. Strange
enough, the hid and wacom modules weren't compiled in yet. Some more
iterations later, and it didn't matter if those modules are in or not: if
the tablet is plugged in at boot time the VP+SMP combination freezes.
5) Incidentally I found that I must unplug the tablet at boot time of a
freshly built VP+SMP kernel. Then I found that installing the linuxwacom
project [http://linuxwacom.sourceforge.net] drivers, which adds some
changes to mousedev (built-in), evdev and wacom kernel modules, I end up
with a kernel that I can boot and run later already with the tablet
plugged in.
6) Now that I had found the major showstopper, I decided to go for audio:
among some other thingies, switched ALSA sound modules on, included the
realtime-lsm patch and built what comes to be my latest VP+SMP working
kernel. And it boots OK. Great.
7) Now let's start jackd... start some client applications, hear some
sound, and... horror! The system hangs completely. The time it takes to
hang is by no means deterministic. Sooner or later it hangs. Hard-reboot is
always the only way around, no magic-sysrq :( Gasp, I've seen this before.
8) Indeed, only by disabling both softirq and hardirq preemption I get a
usable VP+SMP kernel. But that's no surprise either, it has been always
like that until Q3, which was the latest VP+SMP combination that didn't
suffer with the Wacom tablet presence at boot/init time. I only hoped the
(soft|hard)irq trouble would be solved by R9 time.
Nevertheless, I'm now a lil'bit happier, now that I've caught up on this
VP promise :)
Thanks for the patience.
--
rncbc aka Rui Nuno Capela
[email protected]
P.S. Included is my latest .config.
* Lee Revell <[email protected]> wrote:
> I get these latencies when I cause the machine to swap by compiling a
> kernel with make -j32. They get bigger as the machine gets further
> into swap.
>
> Every 2.0s: head -60 /proc/latency_trace Wed Sep 8 02:51:40 2004
>
> preemption latency trace v1.0.6 on 2.6.9-rc1-bk12-VP-R6
> --------------------------------------------------
> latency: 605 us, entries: 5 (5) [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
> -----------------
> | task: kswapd0/35, uid:0 nice:0 policy:0 rt_prio:0
> -----------------
> => started at: get_swap_page+0x23/0x490
> => ended at: get_swap_page+0x13f/0x490
> =======>
> 00000001 0.000ms (+0.606ms): get_swap_page (add_to_swap)
> 00000001 0.606ms (+0.000ms): sub_preempt_count (get_swap_page)
> 00000001 0.606ms (+0.000ms): update_max_trace (check_preempt_timing)
> 00000001 0.606ms (+0.000ms): _mmx_memcpy (update_max_trace)
> 00000001 0.607ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
yep, the get_swap_page() latency. I can easily trigger 10+ msec
latencies on a box with a lot of swap by just letting stuff swap out. I
had a quick look but there was no obvious way to break the lock. Maybe
Andrew has better ideas? get_swap_page() is pretty stupid, it does a
near linear search for a free slot in the swap bitmap - this not only is
a latency issue but also an overhead thing as we do it for every other
page that touches swap.
rationale: this is pretty much the only latency that we still have
during heavy VM load and it would Just Be Cool if we fixed this final
one. audio daemons and apps like jackd use mlockall() so they are not
affected by swapping.
Ingo
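To make the "near linear search" point concrete, here is a simplified user-space model of scanning a swap map for a free slot. It is a sketch under stated assumptions, not the kernel's scan_swap_map(): the function name find_free_slot, the plain array standing in for the swap map, and the step counter are all illustrative. The convention that slot 0 is never handed out and that 0 means "nothing free" follows the patch posted later in this thread.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Linear search for the first free entry (count == 0) in a swap-map-like
 * array. Slot 0 is treated as reserved, so a return value of 0 means
 * "no free slot". *steps receives how many entries were examined, which
 * is the work that would be done while the lock is held.
 */
size_t find_free_slot(const unsigned short *map, size_t max, size_t *steps)
{
    size_t examined = 0;
    size_t i;

    assert(map != NULL && steps != NULL);

    for (i = 1; i < max; i++) {
        examined++;
        if (map[i] == 0) {
            *steps = examined;
            return i;
        }
    }

    *steps = examined;
    return 0;
}
```

With a mostly full map the number of entries examined grows with the length of the occupied run before the first free slot, which fits the report that the latency gets bigger as the machine goes further into swap.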
On Thu, 2004-09-09 at 02:17, Ingo Molnar wrote:
> could you try -S0:
Nope, different error:
CC arch/i386/kernel/irq.o
arch/i386/kernel/irq.c: In function `do_IRQ':
arch/i386/kernel/irq.c:273: warning: implicit declaration of function `redirect_hardirq'
arch/i386/kernel/irq.c:349: error: `noirqdebug' undeclared (first use in this function)
arch/i386/kernel/irq.c:349: error: (Each undeclared identifier is reported only once
arch/i386/kernel/irq.c:349: error: for each function it appears in.)
arch/i386/kernel/irq.c:350: warning: implicit declaration of function `note_interrupt'
make[1]: *** [arch/i386/kernel/irq.o] Error 1
make: *** [arch/i386/kernel] Error 2
Lee
On Thu, 2004-09-09 at 15:29, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > I get these latencies when I cause the machine to swap by compiling a
> > kernel with make -j32. They get bigger as the machine gets further
> > into swap.
> >
> > Every 2.0s: head -60 /proc/latency_trace Wed Sep 8 02:51:40 2004
> >
> > preemption latency trace v1.0.6 on 2.6.9-rc1-bk12-VP-R6
> > --------------------------------------------------
> > latency: 605 us, entries: 5 (5) [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
> > -----------------
> > | task: kswapd0/35, uid:0 nice:0 policy:0 rt_prio:0
> > -----------------
> > => started at: get_swap_page+0x23/0x490
> > => ended at: get_swap_page+0x13f/0x490
> > =======>
> > 00000001 0.000ms (+0.606ms): get_swap_page (add_to_swap)
> > 00000001 0.606ms (+0.000ms): sub_preempt_count (get_swap_page)
> > 00000001 0.606ms (+0.000ms): update_max_trace (check_preempt_timing)
> > 00000001 0.606ms (+0.000ms): _mmx_memcpy (update_max_trace)
> > 00000001 0.607ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
>
> yep, the get_swap_page() latency. I can easily trigger 10+ msec
> latencies on a box with a lot of swap by just letting stuff swap out. I
> had a quick look but there was no obvious way to break the lock. Maybe
> Andrew has better ideas? get_swap_page() is pretty stupid, it does a
> near linear search for a free slot in the swap bitmap - this not only is
> a latency issue but also an overhead thing as we do it for every other
> page that touches swap.
>
> rationale: this is pretty much the only latency that we still have
> during heavy VM load and it would Just Be Cool if we fixed this final
> one. audio daemons and apps like jackd use mlockall() so they are not
> affected by swapping.
>
I believe Scott Wood suggested a fix back when I first reported this,
have to check my mailbox. Scott?
Lee
On Thu, 2004-09-09 at 15:29, Ingo Molnar wrote:
> yep, the get_swap_page() latency. I can easily trigger 10+ msec
> latencies on a box with a lot of swap by just letting stuff swap out. I
> had a quick look but there was no obvious way to break the lock. Maybe
> Andrew has better ideas? get_swap_page() is pretty stupid, it does a
> near linear search for a free slot in the swap bitmap - this not only is
> a latency issue but also an overhead thing as we do it for every other
> page that touches swap.
>
> rationale: this is pretty much the only latency that we still have
> during heavy VM load and it would Just Be Cool if we fixed this final
> one. audio daemons and apps like jackd use mlockall() so they are not
> affected by swapping.
Considering that the default swappiness behavior is to swap out idle
apps to get more page cache, this would indeed be nice to fix. If the
default behavior were to not swap until someone actually asks for more
memory than we have, it would be less of a concern.
Lee
Ingo Molnar <[email protected]> wrote:
>
> yep, the get_swap_page() latency. I can easily trigger 10+ msec
> latencies on a box with a lot of swap by just letting stuff swap out. I
> had a quick look but there was no obvious way to break the lock. Maybe
> Andrew has better ideas? get_swap_page() is pretty stupid, it does a
> near linear search for a free slot in the swap bitmap - this not only is
> a latency issue but also an overhead thing as we do it for every other
> page that touches swap.
Someone needs to get down and redesign the swap block allocator. I bet
latency improvements would fall out of that automatically.
The main problem is that swap blocks are now physically clustered according
to the page lru ordering, which doesn't have much relationship to
process-virtual-address-ordering.
The swap allocator made sense when we were doing a virtual scan. It
doesn't make much sense now.
I did a patch a while back which switches the swapspace allocator over to
perform program-virtual-address clustering, but it didn't help much in
brief testing and I haven't got back onto it.
And contrary to my above assertion, I don't think it'll help latency ;)
A short-term bodge would be to scan the map without locks held, take the
lock just to actually claim the block, retry if we raced. Use swapon_sem
to avoid races. After checking that we never perform GFP_WAIT allocations
while holding swapon_sem.
The whole thing needs work.
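The short-term bodge described above (scan without the lock, take the lock only to claim, retry on a race) can be sketched in user space roughly as follows. Everything here is an assumption made for illustration: the names slot_map, lock_map, unlock_map and claim_slot are hypothetical, a toy C11 atomic_flag spinlock stands in for the swap device lock, and the sketch does not model swapon_sem or freeing of slots at all.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define NSLOTS 64

static _Atomic unsigned char slot_map[NSLOTS];   /* 0 = free, 1 = in use */
static atomic_flag map_lock = ATOMIC_FLAG_INIT;  /* toy spinlock */

static void lock_map(void)
{
    while (atomic_flag_test_and_set_explicit(&map_lock, memory_order_acquire))
        ;  /* spin until the flag was previously clear */
}

static void unlock_map(void)
{
    atomic_flag_clear_explicit(&map_lock, memory_order_release);
}

/* Returns the claimed slot (>= 1), or 0 if the unlocked scan found none. */
size_t claim_slot(void)
{
    for (;;) {
        size_t candidate = 0;
        size_t i;

        /* Phase 1: the long scan runs with no lock held. */
        for (i = 1; i < NSLOTS; i++) {
            if (atomic_load_explicit(&slot_map[i], memory_order_relaxed) == 0) {
                candidate = i;
                break;
            }
        }
        if (candidate == 0)
            return 0;
        assert(candidate < NSLOTS);

        /* Phase 2: short critical section, re-check and claim. */
        lock_map();
        if (atomic_load_explicit(&slot_map[candidate], memory_order_relaxed) == 0) {
            atomic_store_explicit(&slot_map[candidate], 1, memory_order_relaxed);
            unlock_map();
            return candidate;
        }
        unlock_map();
        /* Someone else claimed it between the scan and the lock: retry. */
    }
}
```

The point of the arrangement is that the time spent under the lock no longer depends on how full the map is; only the re-check and the store happen inside the critical section.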
diff -puN mm/vmscan.c~swapspace-layout-improvements mm/vmscan.c
--- 25/mm/vmscan.c~swapspace-layout-improvements 2004-06-03 21:32:51.087602712 -0700
+++ 25-akpm/mm/vmscan.c 2004-06-03 21:32:51.102600432 -0700
@@ -381,8 +381,11 @@ static int shrink_list(struct list_head
* XXX: implement swap clustering ?
*/
if (PageAnon(page) && !PageSwapCache(page)) {
+ void *cookie = page->mapping;
+ pgoff_t index = page->index;
+
page_map_unlock(page);
- if (!add_to_swap(page))
+ if (!add_to_swap(page, cookie, index))
goto activate_locked;
page_map_lock(page);
}
diff -puN mm/swap_state.c~swapspace-layout-improvements mm/swap_state.c
--- 25/mm/swap_state.c~swapspace-layout-improvements 2004-06-03 21:32:51.089602408 -0700
+++ 25-akpm/mm/swap_state.c 2004-06-03 21:32:51.103600280 -0700
@@ -137,8 +137,12 @@ void __delete_from_swap_cache(struct pag
*
* Allocate swap space for the page and add the page to the
* swap cache. Caller needs to hold the page lock.
+ *
+ * We attempt to lay pages out on swap so that virtually-contiguous pages are
+ * contiguous on-disk. To do this we utilise page->index (offset into vma) and
+ * page->mapping (the anon_vma's address).
*/
-int add_to_swap(struct page * page)
+int add_to_swap(struct page *page, void *cookie, pgoff_t index)
{
swp_entry_t entry;
int pf_flags;
@@ -148,7 +152,7 @@ int add_to_swap(struct page * page)
BUG();
for (;;) {
- entry = get_swap_page();
+ entry = get_swap_page(cookie, index);
if (!entry.val)
return 0;
diff -puN include/linux/swap.h~swapspace-layout-improvements include/linux/swap.h
--- 25/include/linux/swap.h~swapspace-layout-improvements 2004-06-03 21:32:51.090602256 -0700
+++ 25-akpm/include/linux/swap.h 2004-06-03 21:32:51.104600128 -0700
@@ -193,7 +193,7 @@ extern int rw_swap_page_sync(int, swp_en
extern struct address_space swapper_space;
#define total_swapcache_pages swapper_space.nrpages
extern void show_swap_cache_info(void);
-extern int add_to_swap(struct page *);
+extern int add_to_swap(struct page *page, void *cookie, pgoff_t index);
extern void __delete_from_swap_cache(struct page *);
extern void delete_from_swap_cache(struct page *);
extern int move_to_swap_cache(struct page *, swp_entry_t);
@@ -210,7 +210,7 @@ extern int total_swap_pages;
extern unsigned int nr_swapfiles;
extern struct swap_info_struct swap_info[];
extern void si_swapinfo(struct sysinfo *);
-extern swp_entry_t get_swap_page(void);
+extern swp_entry_t get_swap_page(void *cookie, pgoff_t index);
extern int swap_duplicate(swp_entry_t);
extern int valid_swaphandles(swp_entry_t, unsigned long *);
extern void swap_free(swp_entry_t);
@@ -259,7 +259,7 @@ static inline int remove_exclusive_swap_
return 0;
}
-static inline swp_entry_t get_swap_page(void)
+static inline swp_entry_t get_swap_page(void *cookie, pgoff_t index)
{
swp_entry_t entry;
entry.val = 0;
diff -puN mm/shmem.c~swapspace-layout-improvements mm/shmem.c
--- 25/mm/shmem.c~swapspace-layout-improvements 2004-06-03 21:32:51.092601952 -0700
+++ 25-akpm/mm/shmem.c 2004-06-03 21:32:51.108599520 -0700
@@ -744,7 +744,7 @@ static int shmem_writepage(struct page *
struct shmem_inode_info *info;
swp_entry_t *entry, swap;
struct address_space *mapping;
- unsigned long index;
+ pgoff_t index;
struct inode *inode;
BUG_ON(!PageLocked(page));
@@ -756,7 +756,7 @@ static int shmem_writepage(struct page *
info = SHMEM_I(inode);
if (info->flags & VM_LOCKED)
goto redirty;
- swap = get_swap_page();
+ swap = get_swap_page(mapping, index);
if (!swap.val)
goto redirty;
diff -puN mm/swapfile.c~swapspace-layout-improvements mm/swapfile.c
--- 25/mm/swapfile.c~swapspace-layout-improvements 2004-06-03 21:32:51.094601648 -0700
+++ 25-akpm/mm/swapfile.c 2004-06-03 23:40:44.396082512 -0700
@@ -25,6 +25,7 @@
#include <linux/rmap.h>
#include <linux/security.h>
#include <linux/backing-dev.h>
+#include <linux/hash.h>
#include <asm/pgtable.h>
#include <asm/tlbflush.h>
@@ -83,71 +84,51 @@ void swap_unplug_io_fn(struct backing_de
up_read(&swap_unplug_sem);
}
-static inline int scan_swap_map(struct swap_info_struct *si)
-{
- unsigned long offset;
- /*
- * We try to cluster swap pages by allocating them
- * sequentially in swap. Once we've allocated
- * SWAPFILE_CLUSTER pages this way, however, we resort to
- * first-free allocation, starting a new cluster. This
- * prevents us from scattering swap pages all over the entire
- * swap partition, so that we reduce overall disk seek times
- * between swap pages. -- sct */
- if (si->cluster_nr) {
- while (si->cluster_next <= si->highest_bit) {
- offset = si->cluster_next++;
- if (si->swap_map[offset])
- continue;
- si->cluster_nr--;
- goto got_page;
- }
- }
- si->cluster_nr = SWAPFILE_CLUSTER;
+int akpm;
- /* try to find an empty (even not aligned) cluster. */
- offset = si->lowest_bit;
- check_next_cluster:
- if (offset+SWAPFILE_CLUSTER-1 <= si->highest_bit)
- {
- int nr;
- for (nr = offset; nr < offset+SWAPFILE_CLUSTER; nr++)
- if (si->swap_map[nr])
- {
- offset = nr+1;
- goto check_next_cluster;
- }
- /* We found a completly empty cluster, so start
- * using it.
- */
- goto got_page;
- }
- /* No luck, so now go finegrined as usual. -Andrea */
- for (offset = si->lowest_bit; offset <= si->highest_bit ; offset++) {
- if (si->swap_map[offset])
+/*
+ * We divide the swapdev into 1024 kilobyte chunks. We use the cookie and the
+ * upper bits of the index to select a chunk and the rest of the index as the
+ * offset into the selected chunk.
+ */
+#define CHUNK_SHIFT (20 - PAGE_SHIFT)
+#define CHUNK_MASK (-1UL << CHUNK_SHIFT)
+
+static int
+scan_swap_map(struct swap_info_struct *si, void *cookie, pgoff_t index)
+{
+ unsigned long chunk;
+ unsigned long nchunks;
+ unsigned long block;
+ unsigned long scan;
+
+ nchunks = si->max >> CHUNK_SHIFT;
+ chunk = 0;
+ if (nchunks)
+ chunk = hash_long((unsigned long)cookie + (index & CHUNK_MASK),
+ BITS_PER_LONG) % nchunks;
+
+ block = (chunk << CHUNK_SHIFT) + (index & ~CHUNK_MASK);
+
+ for (scan = 0; scan < si->max; scan++, block++) {
+ if (block == si->max)
+ block = 0;
+ if (block == 0)
continue;
- si->lowest_bit = offset+1;
- got_page:
- if (offset == si->lowest_bit)
- si->lowest_bit++;
- if (offset == si->highest_bit)
- si->highest_bit--;
- if (si->lowest_bit > si->highest_bit) {
- si->lowest_bit = si->max;
- si->highest_bit = 0;
- }
- si->swap_map[offset] = 1;
- si->inuse_pages++;
+ if (si->swap_map[block])
+ continue;
+ si->swap_map[block] = 1;
nr_swap_pages--;
- si->cluster_next = offset+1;
- return offset;
+ if (akpm)
+ printk("cookie:%p, index:%lu, chunk:%lu nchunks:%lu "
+ "block:%lu\n",
+ cookie, index, chunk, nchunks, block);
+ return block;
}
- si->lowest_bit = si->max;
- si->highest_bit = 0;
return 0;
}
-swp_entry_t get_swap_page(void)
+swp_entry_t get_swap_page(void *cookie, pgoff_t index)
{
struct swap_info_struct * p;
unsigned long offset;
@@ -166,7 +147,7 @@ swp_entry_t get_swap_page(void)
p = &swap_info[type];
if ((p->flags & SWP_ACTIVE) == SWP_ACTIVE) {
swap_device_lock(p);
- offset = scan_swap_map(p);
+ offset = scan_swap_map(p, cookie, index);
swap_device_unlock(p);
if (offset) {
entry = swp_entry(type,offset);
diff -puN kernel/power/swsusp.c~swapspace-layout-improvements kernel/power/swsusp.c
--- 25/kernel/power/swsusp.c~swapspace-layout-improvements 2004-06-03 21:32:51.096601344 -0700
+++ 25-akpm/kernel/power/swsusp.c 2004-06-03 21:32:51.112598912 -0700
@@ -317,7 +317,7 @@ static int write_suspend_image(void)
for (i=0; i<nr_copy_pages; i++) {
if (!(i%100))
printk( "." );
- if (!(entry = get_swap_page()).val)
+ if (!(entry = get_swap_page(NULL, i)).val)
panic("\nNot enough swapspace when writing data" );
if (swapfile_used[swp_type(entry)] != SWAPFILE_SUSPEND)
@@ -334,7 +334,7 @@ static int write_suspend_image(void)
cur = (union diskpage *)((char *) pagedir_nosave)+i;
BUG_ON ((char *) cur != (((char *) pagedir_nosave) + i*PAGE_SIZE));
printk( "." );
- if (!(entry = get_swap_page()).val) {
+ if (!(entry = get_swap_page(NULL, i)).val) {
printk(KERN_CRIT "Not enough swapspace when writing pgdir\n" );
panic("Don't know how to recover");
free_page((unsigned long) buffer);
@@ -356,7 +356,7 @@ static int write_suspend_image(void)
BUG_ON (sizeof(struct suspend_header) > PAGE_SIZE-sizeof(swp_entry_t));
BUG_ON (sizeof(union diskpage) != PAGE_SIZE);
BUG_ON (sizeof(struct link) != PAGE_SIZE);
- if (!(entry = get_swap_page()).val)
+ if (!(entry = get_swap_page(NULL, i)).val)
panic( "\nNot enough swapspace when writing header" );
if (swapfile_used[swp_type(entry)] != SWAPFILE_SUSPEND)
panic("\nNot enough swapspace for header on suspend device" );
diff -puN kernel/power/pmdisk.c~swapspace-layout-improvements kernel/power/pmdisk.c
--- 25/kernel/power/pmdisk.c~swapspace-layout-improvements 2004-06-03 21:32:51.098601040 -0700
+++ 25-akpm/kernel/power/pmdisk.c 2004-06-03 21:32:51.113598760 -0700
@@ -206,7 +206,7 @@ static int write_swap_page(unsigned long
swp_entry_t entry;
int error = 0;
- entry = get_swap_page();
+ entry = get_swap_page(NULL, addr >> PAGE_SHIFT);
if (swp_offset(entry) &&
swapfile_used[swp_type(entry)] == SWAPFILE_SUSPEND) {
error = rw_swap_page_sync(WRITE, entry,
_
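The chunk arithmetic in the new scan_swap_map() above can be tried out in user space. The following is a sketch, assuming 4 KiB pages (PAGE_SHIFT of 12), so that a 1024 KB chunk is 256 pages. toy_hash is a made-up multiplicative stand-in for the kernel's hash_long(), and start_block is a hypothetical name for the computation of the initial value of `block` in the patch; the wrap-around scan that follows it in the patch is not reproduced.

```c
#include <assert.h>

#define PAGE_SHIFT  12                      /* assumption: 4 KiB pages */
#define CHUNK_SHIFT (20 - PAGE_SHIFT)       /* 1 MiB chunk = 256 pages */
#define CHUNK_MASK  (~0UL << CHUNK_SHIFT)   /* upper bits of the index */

/* Stand-in for hash_long(); NOT the kernel's hash function. */
static unsigned long toy_hash(unsigned long v)
{
    return v * 2654435761UL;
}

/*
 * Where the scan would start: the cookie plus the upper index bits pick a
 * chunk, and the low index bits give the offset inside that chunk.
 */
unsigned long start_block(unsigned long cookie, unsigned long index,
                          unsigned long max_blocks)
{
    unsigned long nchunks = max_blocks >> CHUNK_SHIFT;
    unsigned long chunk = 0;

    assert(CHUNK_SHIFT > 0);

    if (nchunks)
        chunk = toy_hash(cookie + (index & CHUNK_MASK)) % nchunks;

    return (chunk << CHUNK_SHIFT) + (index & ~CHUNK_MASK);
}
```

Two pages of the same mapping whose indices fall in the same 256-page window get the same chunk and adjacent offsets, which is the virtual-address clustering the patch comment describes. A swap area smaller than one chunk degenerates to using the low index bits directly.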
On Thu, 2004-09-09 at 15:30, Lee Revell wrote:
> On Thu, 2004-09-09 at 02:17, Ingo Molnar wrote:
> > could you try -S0:
>
> Nope, different error:
>
> CC arch/i386/kernel/irq.o
> arch/i386/kernel/irq.c: In function `do_IRQ':
> arch/i386/kernel/irq.c:273: warning: implicit declaration of function `redirect_hardirq'
> arch/i386/kernel/irq.c:349: error: `noirqdebug' undeclared (first use in this function)
> arch/i386/kernel/irq.c:349: error: (Each undeclared identifier is reported only once
> arch/i386/kernel/irq.c:349: error: for each function it appears in.)
> arch/i386/kernel/irq.c:350: warning: implicit declaration of function `note_interrupt'
> make[1]: *** [arch/i386/kernel/irq.o] Error 1
> make: *** [arch/i386/kernel] Error 2
>
OK, this was user error on my part. R7 introduced the
CONFIG_GENERIC_HARDIRQS config item, and I had neglected to run make
oldconfig.
Lee
On Thu, 2004-09-09 at 15:33, Lee Revell wrote:
> I believe Scott Wood suggested a fix back when I first reported this,
> have to check my mailbox. Scott?
>
Nope, checking the original thread, Scott pointed out that any RT
process will have mlockall'ed anyway and thus won't be affected by this
latency. So, this one would be cool to fix, but it's not a problem as
such.
Lee
* Lee Revell <[email protected]> wrote:
> > I believe Scott Wood suggested a fix back when I first reported this,
> > have to check my mailbox. Scott?
>
> Nope, checking the original thread, Scott pointed out that any RT
> process will have mlockall'ed anyway and thus won't be affected by
> this latency. So, this one would be cool to fix, but it's not a
> problem as such.
RT threads will be affected by this latency just as much because it's a
non-preemptable critical section. You are right in that an RT task won't
see the overhead itself because it doesn't generate swap entries.
Ingo
On Thu, Sep 09, 2004 at 04:43:49PM -0400, Lee Revell wrote:
> On Thu, 2004-09-09 at 15:33, Lee Revell wrote:
> > I believe Scott Wood suggested a fix back when I first reported this,
> > have to check my mailbox. Scott?
> >
>
> Nope, checking the original thread, Scott pointed out that any RT
> process will have mlockall'ed anyway and thus won't be affected by this
> latency. So, this one would be cool to fix, but it's not a problem as
> such.
Though, if this is an actual lock latency (as opposed to merely being
a page-fault latency suffered by the task swapping something in), it
could affect mlockall'd processes as well due to some other task
swapping.
One way to fix the latency would be to turn the locks involved into
sleeping mutexes. There's a comment in the code saying that swaplock
cannot be turned into a semaphore, but it does not say why; if this
is due to it nesting in other locks, those locks would need to be
converted as well. It could turn into quite a mess doing it
manually, though turning all spinlocks into mutexes (except
hand-chosen exceptions) should take care of it.
-Scott
On Thu, 2004-09-09 at 21:05, Andrew Morton wrote:
> I did a patch a while back which switches the swapspace allocator over to
> perform program-virtual-address clustering, but it didn't help much in
> brief testing and I haven't got back onto it.
>
> And contrary to my above assertion, I don't think it'll help latency ;)
I would still expect the only thing to materially improve swap latency
to be a log structured swap, possibly with a cleaner which tidies
together pages that are referenced together.
You also want contiguous runs of at least 64K and probably a lot more on
bigger memory systems.
Alan Cox <[email protected]> wrote:
>
> I would still expect the only thing to materially improve swap latency
> to be a log structured swap, possibly with a cleaner which tidies
> together pages that are referenced together.
>
Maybe. It'd be nice to show some benefit from the "organise pages by
virtual address" patch first.
But then, maybe that doesn't help because there is little correlation
between address congruency and time-of-reference. That's hard to believe
though.
hm. The patch _does_ do what I wanted it to do. Maybe I tested it with
silly workloads.
>
> You also want contiguous runs of at least 64K and probably a lot more on
> bigger memory systems.
I used 1MB.
+/*
+ * We divide the swapdev into 1024 kilobyte chunks. We use the cookie and the
+ * upper bits of the index to select a chunk and the rest of the index as the
+ * offset into the selected chunk.
+ */
+#define CHUNK_SHIFT (20 - PAGE_SHIFT)
Ingo Molnar <[email protected]> wrote:
>> yep, the get_swap_page() latency. I can easily trigger 10+ msec
>> latencies on a box with a lot of swap by just letting stuff swap out. I
>> had a quick look but there was no obvious way to break the lock. Maybe
>> Andrew has better ideas? get_swap_page() is pretty stupid, it does a
>> near linear search for a free slot in the swap bitmap - this not only is
>> a latency issue but also an overhead thing as we do it for every other
>> page that touches swap.
On Thu, Sep 09, 2004 at 01:05:26PM -0700, Andrew Morton wrote:
> Someone needs to get down and redesign the swap block allocator. I bet
> latency improvements would fall out of that automatically.
> The main problem is that swap blocks are now physically clustered according
> to the page lru ordering, which doesn't have much relationship to
> process-virtual-address-ordering.
> The swap allocator made sense when we were doing a virtual scan. It
> doesn't make much sense now.
Something odd is going on, in part because I get *blistering* IO speeds
running benchmarks like dbench, tiobench, et al on tmpfs with striped
swap. In fact, IO speeds markedly faster than any other filesystem I've
ever tried, by about 30MB/s (i.e. wirespeed, where others fall about
37.5% short of it). Virtual alignment issues do hurt, but the core
allocation algorithm appears to be better than good, it's astounding.
On Thu, Sep 09, 2004 at 01:05:26PM -0700, Andrew Morton wrote:
> I did a patch a while back which switches the swapspace allocator over to
> perform program-virtual-address clustering, but it didn't help much in
> brief testing and I haven't got back onto it.
> And contrary to my above assertion, I don't think it'll help latency ;)
> A short-term bodge would be to scan the map without locks held, take the
> lock just to actually claim the block, retry if we raced. Use swapon_sem
> to avoid races. After checking that we never perform GFP_WAIT allocations
> while holding swapon_sem.
> The whole thing needs work.
Well, yes, dbench on tmpfs isn't really the load we're shooting for.
-- wli
On Thu, 2004-09-09 at 23:45, William Lee Irwin III wrote:
> Something odd is going on, in part because I get *blistering* IO speeds
> running benchmarks like dbench, tiobench, et al on tmpfs with striped
> swap. In fact, IO speeds markedly faster than any other filesystem I've
> ever tried, by about 30MB/s (i.e. wirespeed, where others fall about
> 37.5% short of it). Virtual alignment issues do hurt, but the core
> allocation algorithm appears to be better than good, it's astounding.
That's a very atypical load where you can expect to get long linear
write-outs. The seek vs. write numbers for a disk nowadays have more in common
with a tape drive. Paging tends to be much, much more random.
Alan
On Thu, 2004-09-09 at 23:45, William Lee Irwin III wrote:
>> Something odd is going on, in part because I get *blistering* IO speeds
>> running benchmarks like dbench, tiobench, et al on tmpfs with striped
>> swap. In fact, IO speeds markedly faster than any other filesystem I've
>> ever tried, by about 30MB/s (i.e. wirespeed, where others fall about
>> 37.5% short of it). Virtual alignment issues do hurt, but the core
>> allocation algorithm appears to be better than good, it's astounding.
On Thu, Sep 09, 2004 at 11:11:39PM +0100, Alan Cox wrote:
> That's a very atypical load where you can expect to get long linear
> write-outs. The seek vs. write numbers for a disk nowadays have more in common
> with a tape drive. Paging tends to be much, much more random.
Yes, I mentioned that those kinds of benchmarks are not the workload
we're shooting for in the second part of the message. The commentary
regarding dbench et al on tmpfs suggests that the lower-level parts of
the algorithm are sound, but are somehow driven inappropriately or in a
manner unaligned with what locality of reference there may be.
-- wli
* Andrew Morton <[email protected]> wrote:
> diff -puN mm/vmscan.c~swapspace-layout-improvements mm/vmscan.c
> --- 25/mm/vmscan.c~swapspace-layout-improvements 2004-06-03 21:32:51.087602712 -0700
> +++ 25-akpm/mm/vmscan.c 2004-06-03 21:32:51.102600432 -0700
i've attached a merge against current BK-ish kernels. Lee, would you be
interested in testing it? It applies cleanly to an -S0 VP tree. I've
tested it only lightly - it compiles and boots and survives some simple
swapping but that's all.
Ingo
On Fri, 10 Sep 2004 15:28:41 +0200, Ingo Molnar <[email protected]> wrote:
>
> * Andrew Morton <[email protected]> wrote:
>
> > diff -puN mm/vmscan.c~swapspace-layout-improvements mm/vmscan.c
> > --- 25/mm/vmscan.c~swapspace-layout-improvements 2004-06-03 21:32:51.087602712 -0700
> > +++ 25-akpm/mm/vmscan.c 2004-06-03 21:32:51.102600432 -0700
>
> i've attached a merge against current BK-ish kernels. Lee, would you be
> interested in testing it? It applies cleanly to an -S0 VP tree. I've
> tested it only lightly - it compiles and boots and survives some simple
> swapping but that's all.
Hello kernel folks,
what's the plan regarding the inclusion of VP in mainstream ?
--
Paolo
Personal home page: paoloc.doesntexist.org
Buy cool stuff here: http://www.cafepress.com/paoloc
On Fri, 2004-09-10 at 10:28, Paolo Ciarrocchi wrote:
> On Fri, 10 Sep 2004 15:28:41 +0200, Ingo Molnar <[email protected]> wrote:
> >
> > * Andrew Morton <[email protected]> wrote:
> >
> > > diff -puN mm/vmscan.c~swapspace-layout-improvements mm/vmscan.c
> > > --- 25/mm/vmscan.c~swapspace-layout-improvements 2004-06-03 21:32:51.087602712 -0700
> > > +++ 25-akpm/mm/vmscan.c 2004-06-03 21:32:51.102600432 -0700
> >
> > i've attached a merge against current BK-ish kernels. Lee, would you be
> > interested in testing it? It applies cleanly to an -S0 VP tree. I've
> > tested it only lightly - it compiles and boots and survives some simple
> > swapping but that's all.
>
> Hello kernel folks,
> what's the plan regarding the inclusion of VP in mainstream ?
>
I believe the plan is to merge the individual fixes one at a time. See
Ingo's recent non-VP-related posts. Once the fixes for all of the real
deficiencies in the kernel that the VP patches revealed are merged, then
we will have a very small patch.
Lee
On Fri, 2004-09-10 at 09:28, Ingo Molnar wrote:
> * Andrew Morton <[email protected]> wrote:
>
> > diff -puN mm/vmscan.c~swapspace-layout-improvements mm/vmscan.c
> > --- 25/mm/vmscan.c~swapspace-layout-improvements 2004-06-03 21:32:51.087602712 -0700
> > +++ 25-akpm/mm/vmscan.c 2004-06-03 21:32:51.102600432 -0700
>
OK, Andrew's patch seems to be an improvement. I can still cause
unbounded latencies, but these only seem to happen when we fill all
available RAM and swap space, at which point we start spending
milliseconds at a time in scan_swap_map:
preemption latency trace v1.0.7 on 2.6.9-rc1-bk12-VP-S0
-------------------------------------------------------
latency: 6032 us, entries: 550 (550) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: xfs/1098, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: rtc_interrupt+0x294/0x450
=> ended at: get_swap_page+0x13f/0x350
=======>
00010002 0.000ms (+0.000ms): touch_preempt_timing (rtc_interrupt)
00010002 0.000ms (+0.000ms): printk (rtc_interrupt)
00010002 0.000ms (+0.001ms): vprintk (printk)
00010003 0.002ms (+0.000ms): vscnprintf (vprintk)
00010003 0.002ms (+0.002ms): vsnprintf (vscnprintf)
00010003 0.005ms (+0.004ms): number (vsnprintf)
00010003 0.009ms (+0.001ms): number (vsnprintf)
00010003 0.010ms (+0.001ms): number (vsnprintf)
00010003 0.011ms (+0.000ms): emit_log_char (vprintk)
[...]
00010002 1.983ms (+0.000ms): preempt_schedule (do_IRQ)
00000003 1.984ms (+0.000ms): do_softirq (do_IRQ)
00000003 1.984ms (+0.911ms): __do_softirq (do_softirq)
00010002 2.896ms (+0.000ms): do_IRQ (scan_swap_map)
00010002 2.896ms (+0.000ms): do_IRQ (<00000008>)
00010003 2.897ms (+0.004ms): mask_and_ack_8259A (do_IRQ)
00010003 2.901ms (+0.000ms): preempt_schedule (do_IRQ)
Full trace:
http://krustophenia.net/testresults.php?dataset=2.6.9-rc1-bk12-S0#/var/www/2.6.9-rc1-bk12-S0/swapspace-layout-improvements-A1.txt
The above are just the initial results; I am still testing this. It
certainly seems like it can take a beating.
Lee
Lee Revell wrote:
> On Fri, 2004-09-10 at 09:28, Ingo Molnar wrote:
>
>>* Andrew Morton <[email protected]> wrote:
>>
>>
>>>diff -puN mm/vmscan.c~swapspace-layout-improvements mm/vmscan.c
>>>--- 25/mm/vmscan.c~swapspace-layout-improvements 2004-06-03 21:32:51.087602712 -0700
>>>+++ 25-akpm/mm/vmscan.c 2004-06-03 21:32:51.102600432 -0700
>>
>
> OK, Andrew's patch seems to be an improvement. I can still cause
> unbounded latencies, but these only seem to happen when we fill all
> available RAM and swap space, at which point we start spending
> milliseconds at a time in scan_swap_map:
>
>
I see much improved performance so far. Been running for about 3 hours
and the highest latency I've seen thus far is ~260 usec and that was
mmap not swap. The highest latency I've seen from swapping is ~198 and
we have been in and out of swap at least several times. The latency
trace can be seen here:
http://www.cybsft.com/testresults/2.6.9-rc1-bk12-S0/latencytrace1.txt
kr
On Thu, 2004-09-09 at 07:09, Rui Nuno Capela wrote:
> 4) Then I wondered: the problem must be in one of the plugged-in USB devices. And
> right I was: my WACOM GRAPHIRE2 USB tablet was the culprit. Strange
> enough, the hid and wacom modules weren't compiled in yet. Some more
> iterations later, and it didn't matter if those modules are in or not: if
> the tablet is plugged in at boot time the VP+SMP combination freezes.
>
> 5) Incidentally I found that I must unplug the tablet at boot time of
> freshly built VP+SMP kernel. Then I found that installing the linuxwacom
> project [http://linuxwacom.sourceforge.net] drivers, which adds some
> changes to mousedev (built-in), evdev and wacom kernel modules, I end up
> with a kernel that I can boot and run later already with the tablet
> plugged in.
>
> 6) Now that I had found the major showstopper, I decided to go for audio:
> among some other thingies, switched ALSA sound modules on, included the
> realtime-lsm patch and built what comes to be my latest VP+SMP working
> kernel. And it boots OK. Great.
>
> 7) Now let's start jackd... start some client applications, hear some
> sound, and... horror! The system hangs completely. The time it takes to
> hang is by no means deterministic. Sooner or later it hangs. Hard-reboot is
> always the only way around, no magic-sysrq :( Gasp, I've seen this before.
>
> 8) Indeed, only by disabling both softirq and hardirq preemption do I get a
> usable VP+SMP kernel. But that's no surprise either, it has always been
> like that until Q3, which was the latest VP+SMP combination that didn't
> suffer with the Wacom tablet presence at boot/init time. I only hoped the
> (soft|hard)irq trouble would be solved by R9 time.
Rui, did you ever get this working? Other testers are not reporting
problems, it would be good to know if there are still bugs lurking.
Have you tried booting with hard and softirq preemption disabled and
enabling them one at a time?
Lee
Hi,
Lee Revell wrote:
>
>Rui Nuno Capela wrote:
>>
>> 8) Indeed, only by disabling both softirq and hardirq preemption do I get
>> a usable VP+SMP kernel. But that's no surprise either, it has been
>> always like that until Q3, which was the latest VP+SMP combination that
>> didn't suffer with the Wacom tablet presence at boot/init time. I only
>> hoped the (soft|hard)irq trouble would be solved by R9 time.
>
> Rui, did you ever get this working? Other testers are not reporting
> problems, it would be good to know if there are still bugs lurking.
>
> Have you tried booting with hard and softirq preemption disabled and
> enabling them one at a time?
>
I've been doing a lot of experiments, the trial-and-error way, by tweaking
kernel config options on and off, and (re)building the
linux-2.6.9-rc1-bk12-S0 SMP kernel.
I have some news indeed. As you may recall, I'm trying to run jackd
realtime low-latency audio on a P4 2.80C HT (Hyperthreading), and I keep
CONFIG_SMP=y always set.
I found, almost by mistake, that whether CONFIG_SCHED_SMT is set makes a
lot of difference.
a) With CONFIG_SCHED_SMT=y, which is what I've always used, the system
behavior is the same one I've been complaining about: having
softirq-preempt=0 and hardirq-preempt=0 is the minimal setting to run
jackd in realtime mode without hard-locking the whole system. Even then,
the system freezes completely more often than I'd like, almost twice a
day! I can't figure out who or what the culprit is here. It's quite random.
b) When CONFIG_SCHED_SMT is not set, I can run all along with
softirq-preempt=1, hardirq-preempt=1 et al. While running jackd in
realtime mode, I get NO hard-locks, but unfortunately XRUNs are plenty. A
real storm. However, I've noticed that the whole system seems pretty much stable,
as I didn't experience one single system hang. Regression to
softirq-preempt=0 and hardirq-preempt=0 dissolves the xrun storm to
nothing again.
All my experiments were done based on starting jackd -R -p 128 -n 2 ...
using an onboard Intel ICH5 soundcrap driven by snd-intel8x0 (alsa). Oh, I
forgot to say that it's been always with kernel-preempt=1 and
voluntary-preempt=1.
I'm preparing to take some latency traces later on with the
SMP=1 SMT=0 configuration and the softirq=1 hardirq=1 setting, in an effort to
get that horrible XRUN flood out of the way somehow, someday ;)
So, the bottom line is that the SMT-aware scheduler is not ready for VP, is
it? Would anyone care to confirm this?
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> b) When CONFIG_SCHED_SMT is not set, I can run all along with
> softirq-preempt=1, hardirq-preempt=1 et al. While running jackd in
> realtime mode, I get NO hard-locks, but unfortunately XRUNs are
> plenty. A real storm. However I've noticed that the whole seems pretty
> much stable, as I didn't experience one single system hang. Regression
> to softirq-preempt=0 and hardirq-preempt=0 dissolves the xrun storm to
> nothing again.
when you set softirq-preempt=1 and hardirq-preempt=1 then you also need
to make the soundcard's IRQ non-threaded via /proc/irq/*/*/threaded
(pick the right one that is your soundcard). E.g. i have a CMI8738-MC6
on IRQ11, so i'd have to do this:
echo 0 > /proc/irq/11/CMI8738-MC6/threaded
to mark the soundcard's interrupt as directly-executed.
Ingo
Hi Ingo,
Ingo Molnar wrote:
>
>Rui Nuno Capela wrote:
>
>> b) When CONFIG_SCHED_SMT is not set, I can run all along with
>> softirq-preempt=1, hardirq-preempt=1 et al. While running jackd in
>> realtime mode, I get NO hard-locks, but unfortunately XRUNs are
>> plenty. A real storm. However I've noticed that the whole seems pretty
>> much stable, as I didn't experience one single system hang. Regression
>> to softirq-preempt=0 and hardirq-preempt=0 dissolves the xrun storm to
>> nothing again.
>
> when you set softirq-preempt=1 and hardirq-preempt=1 then you also need
> to make the soundcard's IRQ non-threaded via /proc/irq/*/*/threaded
> (pick the right one that is your soundcard). E.g. i have a CMI8738-MC6
> on IRQ11, so i'd have to do this:
>
> echo 0 > /proc/irq/11/CMI8738-MC6/threaded
>
> to mark the soundcard's interrupt as directly-executed.
>
Yes, I didn't mention that, but I did apply it, and it was in effect in all
my reported trials:
echo 0 > "/proc/irq/8/rtc/threaded"
echo 0 > "/proc/irq/17/Intel ICH5/threaded"
Thanks.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Yes, I didn't mention that, but I did apply it, and it was in effect in
> all my reported trials:
>
> echo 0 > "/proc/irq/8/rtc/threaded"
> echo 0 > "/proc/irq/17/Intel ICH5/threaded"
>
> Thanks.
weird. You shouldn't get any xruns - unless jackd for whatever reason
doesn't truly run under RT priorities. (there was some NPTL-related
buglet that caused such a symptom in earlier jackd versions.)
Ingo
Ingo Molnar wrote:
>
>Rui Nuno Capela wrote:
>
>> Yes, I didn't mention that, but I did apply it, and it was in effect in
>> all my reported trials:
>>
>> echo 0 > "/proc/irq/8/rtc/threaded"
>> echo 0 > "/proc/irq/17/Intel ICH5/threaded"
>>
>> Thanks.
>
> weird. You shouldn't get any xruns - unless jackd for whatever reason
> doesn't truly run under RT priorities. (there was some NPTL-related
> buglet that caused such a symptom in earlier jackd versions.)
>
I thought it had been ironed out here.
Note that the difference arises only whether softirq-preempt and
hardirq-preempt are enabled or not.
- with softirq-preempt=0 and hardirq-preempt=0, jackd realtime runs
perfectly, as advertised (jackd -R -p 128 -n 2), and sounds good too ;)
- with softirq-preempt=1 and hardirq-preempt=1, the XRUN storm is terribly
annoying. And sound is obviously a crackling festival.
Remember that all this is on the same hardware, same SMP kernel configuration
(CONFIG_SCHED_SMT is not set), same everything else (SuSE 9.1, NPTL 0.61,
jack-0.98.11cvs).
I hope the latency-traces show something useful. Til then...
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
i've released the -S1 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
NOTE: this patch is against Andrew's -mm tree and the VP patchset will
stay based on -mm until the merging process has been finished.
to get a 2.6.9-rc2-mm1-VP-S1 kernel, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/2.6.9-rc2-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
Changes relative to -S0:
- lots of merging. A good chunk of the VP patch latency breakers and
support patches are in -mm already.
- integrated my 'preemptible big kernel lock' patch into VP. This makes
all BKL code preemptible while keeping correctness. A new debugging
infrastructure has been added to catch code that might use the BKL
in an unsafe way. If the debugging check triggers it will print
messages like:
using smp_processor_id() in preemptible code: bash/1020
please report such messages and backtraces to me. Most of the messages
i've fixed so far were false positives, but one bug has been caught
already via this.
Also, this BKL patch allowed the removal of two questionable
latency breakers: the tty.c and the DRM BKL relaxation hack.
- fixed an SMP hardirq redirection bug - IRQ threads could be bound to
multiple CPUs resulting in potentially illegal preemption of hardirq
contexts.
- temporarily dropped the ppc/ppc64 GENERIC_HARDIRQS changes, they broke
and i cannot test them.
Reports, comments welcome,
Ingo
On Sep 19, 2004, at 14:26, Ingo Molnar wrote:
>
> i've released the -S1 VP patch:
...
> Reports, comments welcome,
I've been running 2.6.9-rc2-mm1-VP-S1 for some time now and it seems to
be performing well.
Hi Ingo,
things improved here after having applied
swapspace-layout-improvements-2.6.9-rc1-bk12-A1.
I'm happily running jackd and clients realtime now without any dropouts even
under heavy swapping pressure.
(Machine is a PIII@600MHz with 256MB RAM)
Could you please include the swapspace-layout-improvements in the
voluntary-preempt patches?
Just 1 small correction:
>>>>
--- kernel/time.c~ 2004-09-19 15:09:38.000000000 +0200
+++ kernel/time.c 2004-09-19 17:02:35.000000000 +0200
@@ -96,8 +96,10 @@
asmlinkage long sys_gettimeofday(struct timeval __user *tv, struct timezone __user *tz)
{
#ifdef CONFIG_LATENCY_TRACE
- if (!tv && ((long)tz == 1))
+ if (!tv && ((long)tz == 1)) {
user_trace_start();
+ tz = NULL;
+ }
if (!tv && !tz)
user_trace_stop();
#endif
<<<<
thanks for your splendid patches,
Karsten
* Karsten Wiese <[email protected]> wrote:
> Just 1 small correction:
> >>>>
> --- kernel/time.c~ 2004-09-19 15:09:38.000000000 +0200
> +++ kernel/time.c 2004-09-19 17:02:35.000000000 +0200
> @@ -96,8 +96,10 @@
> asmlinkage long sys_gettimeofday(struct timeval __user *tv, struct timezone __user *tz)
> {
> #ifdef CONFIG_LATENCY_TRACE
> - if (!tv && ((long)tz == 1))
> + if (!tv && ((long)tz == 1)) {
> user_trace_start();
> + tz = NULL;
> + }
> if (!tv && !tz)
> user_trace_stop();
The point is to let gettimeofday(0,1) start tracing and
gettimeofday(0,0) stop tracing - a system-call-controlled tracing
facility (if trace_enabled=2). This was used to trace weird latencies
before, but it's not the normal mode of operation.
Ingo
On Sunday 19 September 2004 22:48, Ingo Molnar wrote:
>
> The point is to let gettimeofday(0,1) start tracing and
> gettimeofday(0,0) stop tracing - a system-call-controlled tracing
> facility (if trace_enabled=2). This was used to trace weird latencies
> before, but it's not the normal mode of operation.
>
Ok. The other point is a page_fault being generated later on in
sys_gettimeofday() if tz is not reset:
>>>>
if (unlikely(tz != NULL)) {
^^
if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
return -EFAULT;
}
<<<<
What do you think about including the swapspace-layout-improvements in the
voluntary-preempt patches?
best regards,
Karsten
* Karsten Wiese <[email protected]> wrote:
> Hi Ingo,
>
> things improved here after having applied
> swapspace-layout-improvements-2.6.9-rc1-bk12-A1. I'm happily running
> jackd and clients realtime now without any dropouts even under heavy
> swapping pressure. (Machine is a PIII@600MHz with 256MB RAM) Could you
> please include the swapspace-layout-improvements in the
> voluntary-preempt patches?
only if Andrew agrees that the patch has a chance for -mm inclusion and
possible upstream merging.
Ingo
* Karsten Wiese <[email protected]> wrote:
> On Sunday 19 September 2004 22:48, Ingo Molnar wrote:
> >
> > The point is to let gettimeofday(0,1) start tracing and
> > gettimeofday(0,0) stop tracing - a system-call-controlled tracing
> > facility (if trace_enabled=2). This was used to trace weird latencies
> > before, but it's not the normal mode of operation.
> >
> Ok. The other point is a page_fault being generated later on in
> sys_gettimeofday() if tz is not reset:
> >>>>
> if (unlikely(tz != NULL)) {
> ^^
> if (copy_to_user(tz, &sys_tz, sizeof(sys_tz)))
> return -EFAULT;
> }
> <<<<
yeah - it's a bit ugly. The right thing is to return from that branch.
Ingo
Ingo Molnar <[email protected]> wrote:
>
> > Could you
> > please include the swapspace-layout-improvements in the
> > voluntary-preempt patches?
>
> only if Andrew agrees that the patch has a chance for -mm inclusion and
> possible upstream merging.
It needs more work - from the (brief) testing I did, it didn't seem to
improve that which it was intended to improve: swap I/O performance. Not
sure why, really.
The latency improvements were serendipitous.
Ingo Molnar wrote:
> i've released the -S1 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
>
> NOTE: this patch is against Andrew's -mm tree and the VP patchset will
> stay based on -mm until the merging process has been finished.
>
> to get a 2.6.9-rc2-mm1-VP-S1 kernel, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/2.6.9-rc2-mm1.bz2
> + http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
>
> Changes relative to -S0:
>
> - lots of merging. A good chunk of the VP patch latency breakers and
> support patches are in -mm already.
>
> - integrated my 'preemptible big kernel lock' patch into VP. This makes
> all BKL code preemptible while keeping correctness. A new debugging
> infrastructure has been added to catch code that might use the BKL
> in an unsafe way. If the debugging check triggers it will print
> messages like:
>
> using smp_processor_id() in preemptible code: bash/1020
>
> please report such messages and backtraces to me. Most of the messages
> i've fixed so far were false positives, but one bug has been caught
> already via this.
>
> Also, this BKL patch allowed the removal of two questionable
> latency breakers: the tty.c and the DRM BKL relaxation hack.
>
> - fixed an SMP hardirq redirection bug - IRQ threads could be bound to
> multiple CPUs resulting in potentially illegal preemption of hardirq
> contexts.
>
> - temporarily dropped the ppc/ppc64 GENERIC_HARDIRQS changes, they broke
> and i cannot test them.
>
> Reports, comments welcome,
>
> Ingo
Is anyone else having trouble getting this to build on x86 smp? I am
getting undefined references to smp_processor_id within most, if not
all, modules.
kr
* K.R. Foley <[email protected]> wrote:
> Is anyone else having trouble getting this to build on x86 smp? I am
> getting undefined references to smp_processor_id within most, if not
> all, modules.
add EXPORT_SYMBOL(smp_processor_id) to the end of sched.c.
Ingo
Hello Ingo,
On Sunday 19 September 2004 14.26, Ingo Molnar wrote:
> i've released the -S1 VP patch:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
>
> NOTE: this patch is against Andrew's -mm tree and the VP patchset will
> stay based on -mm until the merging process has been finished.
>
I got this trace on my laptop (first time I'm testing VP):
preemption latency trace v1.0.7 on 2.6.9-rc2-mm1-VP-S1
-------------------------------------------------------
latency: 1658 us, entries: 150 (150) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: powersaved/3542, uid:0 nice:10 policy:0 rt_prio:0
-----------------
=> started at: acpi_ec_write+0x62/0x1c9
=> ended at: acpi_ec_write+0x19a/0x1c9
=======>
00000001 0.000ms (+0.000ms): acpi_ec_write (acpi_ec_space_handler)
00000001 0.000ms (+0.000ms): acpi_hw_low_level_write (acpi_ec_write)
00000001 0.000ms (+0.001ms): acpi_os_write_port (acpi_hw_low_level_write)
00000001 0.001ms (+0.000ms): acpi_ec_wait (acpi_ec_write)
00000001 0.002ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 0.002ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
00000001 0.003ms (+0.000ms): __const_udelay (acpi_ec_wait)
00000001 0.004ms (+0.000ms): __delay (acpi_ec_wait)
00000001 0.004ms (+0.099ms): delay_pmtmr (__delay)
00000001 0.103ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 0.103ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
...
00000001 1.320ms (+0.000ms): __const_udelay (acpi_ec_wait)
00000001 1.320ms (+0.000ms): __delay (acpi_ec_wait)
00000001 1.321ms (+0.099ms): delay_pmtmr (__delay)
00000001 1.420ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 1.420ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
00000001 1.422ms (+0.000ms): acpi_hw_low_level_write (acpi_ec_write)
00000001 1.422ms (+0.001ms): acpi_os_write_port (acpi_hw_low_level_write)
00000001 1.423ms (+0.000ms): acpi_ec_wait (acpi_ec_write)
00000001 1.423ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 1.424ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
00000001 1.425ms (+0.000ms): __const_udelay (acpi_ec_wait)
00000001 1.425ms (+0.000ms): __delay (acpi_ec_wait)
00000001 1.425ms (+0.099ms): delay_pmtmr (__delay)
00000001 1.525ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 1.525ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
00000001 1.526ms (+0.000ms): acpi_hw_low_level_write (acpi_ec_write)
00000001 1.527ms (+0.001ms): acpi_os_write_port (acpi_hw_low_level_write)
00000001 1.528ms (+0.000ms): acpi_ec_wait (acpi_ec_write)
00000001 1.528ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 1.529ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
00000001 1.530ms (+0.000ms): __const_udelay (acpi_ec_wait)
00000001 1.530ms (+0.000ms): __delay (acpi_ec_wait)
00000001 1.530ms (+0.099ms): delay_pmtmr (__delay)
00000001 1.630ms (+0.000ms): acpi_hw_low_level_read (acpi_ec_wait)
00000001 1.630ms (+0.001ms): acpi_os_read_port (acpi_hw_low_level_read)
00000001 1.632ms (+0.000ms): smp_apic_timer_interrupt (acpi_ec_write)
00010001 1.632ms (+0.000ms): profile_tick (smp_apic_timer_interrupt)
00010001 1.632ms (+0.000ms): profile_hook (profile_tick)
00010002 1.632ms (+0.000ms): notifier_call_chain (profile_hook)
00010001 1.633ms (+0.001ms): profile_hit (smp_apic_timer_interrupt)
00010001 1.634ms (+0.000ms): do_IRQ (acpi_ec_write)
00010001 1.634ms (+0.000ms): do_IRQ (<00000000>)
00010002 1.635ms (+0.002ms): mask_and_ack_8259A (do_IRQ)
00010002 1.637ms (+0.000ms): redirect_hardirq (do_IRQ)
00010001 1.638ms (+0.000ms): handle_IRQ_event (do_IRQ)
00010001 1.638ms (+0.000ms): timer_interrupt (handle_IRQ_event)
00010002 1.638ms (+0.002ms): mark_offset_pmtmr (timer_interrupt)
00010002 1.641ms (+0.000ms): do_timer (timer_interrupt)
00010002 1.641ms (+0.000ms): update_process_times (do_timer)
00010002 1.641ms (+0.000ms): update_one_process (update_process_times)
00010002 1.642ms (+0.000ms): run_local_timers (update_process_times)
00010002 1.642ms (+0.000ms): raise_softirq (update_process_times)
00010002 1.642ms (+0.000ms): scheduler_tick (update_process_times)
00010002 1.643ms (+0.000ms): task_timeslice (scheduler_tick)
00010002 1.643ms (+0.000ms): update_wall_time (do_timer)
00010002 1.643ms (+0.000ms): update_wall_time_one_tick (update_wall_time)
00010002 1.644ms (+0.000ms): note_interrupt (do_IRQ)
00010002 1.644ms (+0.000ms): end_8259A_irq (do_IRQ)
00010002 1.644ms (+0.001ms): enable_8259A_irq (do_IRQ)
00000002 1.645ms (+0.000ms): do_softirq (do_IRQ)
00000002 1.646ms (+0.000ms): __do_softirq (do_softirq)
00000002 1.646ms (+0.000ms): wake_up_process (do_softirq)
00000002 1.646ms (+0.000ms): try_to_wake_up (wake_up_process)
00000002 1.646ms (+0.000ms): task_rq_lock (try_to_wake_up)
00000003 1.647ms (+0.000ms): activate_task (try_to_wake_up)
00000003 1.647ms (+0.000ms): sched_clock (activate_task)
00000003 1.647ms (+0.000ms): task_priority (activate_task)
00000003 1.648ms (+0.000ms): task_sleep_avg (task_priority)
00000003 1.648ms (+0.000ms): enqueue_task (activate_task)
00000002 1.648ms (+0.001ms): preempt_schedule (try_to_wake_up)
00010001 1.650ms (+0.000ms): do_IRQ (acpi_ec_write)
00010001 1.650ms (+0.000ms): do_IRQ (<00000009>)
00010002 1.650ms (+0.003ms): mask_and_ack_8259A (do_IRQ)
00010002 1.654ms (+0.000ms): preempt_schedule (do_IRQ)
00010002 1.654ms (+0.000ms): redirect_hardirq (do_IRQ)
00010002 1.654ms (+0.000ms): wake_up_process (redirect_hardirq)
00010002 1.654ms (+0.000ms): try_to_wake_up (wake_up_process)
00010002 1.655ms (+0.000ms): task_rq_lock (try_to_wake_up)
00010003 1.655ms (+0.000ms): activate_task (try_to_wake_up)
00010003 1.655ms (+0.000ms): sched_clock (activate_task)
00010003 1.655ms (+0.000ms): task_priority (activate_task)
00010003 1.656ms (+0.000ms): task_sleep_avg (task_priority)
00010003 1.656ms (+0.000ms): enqueue_task (activate_task)
00010002 1.656ms (+0.000ms): preempt_schedule (try_to_wake_up)
00010001 1.657ms (+0.000ms): preempt_schedule (do_IRQ)
00000002 1.657ms (+0.000ms): do_softirq (do_IRQ)
00000002 1.657ms (+0.000ms): __do_softirq (do_softirq)
00000001 1.658ms (+0.000ms): sub_preempt_count (acpi_ec_write)
00000001 1.658ms (+0.000ms): update_max_trace (check_preempt_timing)
I don't know if anything can be done about it, but I get lots of them.
The computer is an Intel P-M 1.5GHz with 512MB RAM.
/Magnus Määttä
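For anyone who wants to sift traces like the one above mechanically, here is a minimal sketch in C that splits one trace line into its fields. The struct and function names are made up for illustration. The split of the first column into preemption, softirq and hardirq depth is an assumption about the 2.6-era preempt_count layout (8/8/12 bits); it is not stated anywhere in this thread.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* One parsed line of a latency trace (field names are illustrative). */
struct trace_entry {
    unsigned long preempt_count; /* first column, hexadecimal */
    double t_ms;                 /* time since the start of the trace */
    double delta_ms;             /* time spent until the next entry */
    char func[64];               /* function that was entered */
    char parent[64];             /* caller, printed in parentheses */
};

/* Returns 1 if the line matched the expected layout, 0 otherwise. */
int parse_trace_line(const char *line, struct trace_entry *e)
{
    int n;

    assert(line != NULL && e != NULL);
    n = sscanf(line, "%lx %lfms (+%lfms): %63s (%63[^)])",
               &e->preempt_count, &e->t_ms, &e->delta_ms,
               e->func, e->parent);
    return n == 5;
}

/* Returns 1 if the entry records a call to the named function. */
int entered(const struct trace_entry *e, const char *name)
{
    return strcmp(e->func, name) == 0;
}

/* Assumed split of preempt_count: bits 0-7 preemption depth,
 * bits 8-15 softirq depth, bits 16-27 hardirq depth. */
unsigned int preempt_depth(unsigned long pc) { return pc & 0xff; }
unsigned int softirq_depth(unsigned long pc) { return (pc >> 8) & 0xff; }
unsigned int hardirq_depth(unsigned long pc) { return (pc >> 16) & 0xfff; }
```

Summing delta_ms per function over the ACPI trace above would point at the delay_pmtmr entries, each of which accounts for roughly 0.099ms.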
I am having what appears to be IDE DMA problems with 2.6.9-rc2-mm1-S1.
2.6.9-rc2-mm1 does not show this problem and runs fine. Before this I
was happily using 2.6.8-rc3-O5.
I tried booting with acpi=off but was unable to enter my user name at
the login prompt, it just hung with no response to sysreq. I also tried
turning off irq threading for that irq but it made no difference.
There is one drive on the secondary channel of this Promise TX133. This
is what appears in the log after a minute or two of using the drive.
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
hdg: DMA interrupt recovery
hdg: lost interrupt
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
hdg: DMA interrupt recovery
hdg: lost interrupt
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
[..many repeats..]
It sometimes recovers but it immediately happens again. This leaves apps
touching that drive stuck in an un-killable D state and eventually I
have to reboot.
Linux video capture interface: v1.00
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
PDC20269: IDE controller at PCI slot 0000:00:0d.0
ACPI: PCI interrupt 0000:00:0d.0[A] -> GSI 16 (level, low) -> IRQ 16
PDC20269: chipset revision 2
PDC20269: 100% native mode on irq 16
ide2: BM-DMA at 0xa000-0xa007, BIOS settings: hde:pio, hdf:pio
ide3: BM-DMA at 0xa008-0xa00f, BIOS settings: hdg:pio, hdh:pio
hdg: MAXTOR 6L080J4, ATA DISK drive
requesting new irq thread for IRQ16...
ide3 at 0xa800-0xa807,0xa402 on irq 16
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
ACPI: PCI interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 20
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
ide0: BM-DMA at 0x7400-0x7407, BIOS settings: hda:pio, hdb:pio
ide1: BM-DMA at 0x7408-0x740f, BIOS settings: hdc:DMA, hdd:pio
hdc: HL-DT-ST DVDRAM GSA-4082B, ATAPI CD/DVD-ROM drive
requesting new irq thread for IRQ15...
ide1 at 0x170-0x177,0x376 on irq 15
hdg: max request size: 128KiB
IRQ#16 thread started up.
hdg: 156355584 sectors (80054 MB) w/1819KiB Cache, CHS=65535/16/63, UDMA(133)
hdg: hdg1
IRQ#15 thread started up.
hdc: ATAPI 63X DVD-ROM DVD-R-RAM CD-R/RW drive, 2048kB Cache, UDMA(33)
Uniform CD-ROM driver Revision: 3.20
ACPI: PCI interrupt 0000:00:0c.0[A] -> GSI 19 (level, low) -> IRQ 19
Regards,
Shane
sorry for the scrambled reply,
i'm not subscribed
(please add me to the CC list)
in reply to:
-----------------------
List: linux-kernel
Subject: Re: [patch] voluntary-preempt-2.6.9-rc2-mm1-S1
From: Shane Shrybman <shrybman () aei ! ca>
Date: 2004-09-20 21:16:07
Message-ID: <1095714967.3646.14.camel () mars>
I am having what appears to be IDE DMA problems with 2.6.9-rc2-mm1-S1.
2.6.9-rc2-mm1 does not show this problem and runs fine. Before this I
was happily using 2.6.8-rc3-O5.
I tried booting with acpi=off but was unable to enter my user name at
the login prompt, it just hung with no response to sysreq. I also tried
turning off irq threading for that irq but it made no difference.
There is one drive on the secondary channel of this Promise TX133. This
is what appears in the log after a minute or two of using the drive.
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
hdg: DMA interrupt recovery
hdg: lost interrupt
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
hdg: DMA interrupt recovery
hdg: lost interrupt
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
[..many repeats..]
......
---------------------------------------
i'm getting the same problem
(although not sure if the dma status was the same)
if i enable hardirq preemption
turning acpi on/off doesn't change anything
io/up apic is on (haven't tried disabling it)
with hardirq preemption disabled in .config
everything looks fine so far (~5h)
-----------------------------
666 root 16 -10 0 0 0 S 0.0 0.0 0:00.00 IRQ 17
670 root 15 -10 0 0 0 S 0.0 0.0 0:00.00 IRQ 14
673 root 15 -10 0 0 0 S 0.0 0.0 0:00.00 IRQ 15
711 root 25 0 0 0 0 S 0.0 0.0 0:00.23 khubd
712 root 16 -10 0 0 0 S 0.0 0.0 0:00.00 IRQ 21
----------------------------
this puzzles me a bit
aren't those hardirqs?
why are they listed as threads when i compiled
with
----------------
[svetljo@svetljo rc2mm1]$ grep PREEMPT .config
# CONFIG_PREEMPT_TIMING is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_SOFTIRQS=y
# CONFIG_PREEMPT_HARDIRQS is not set
----------------
best,
svetljo
PS.
PC is up amd xp 2700 KT400 VT8235 (Epox 8k9A3+)
[svetljo@svetljo rc2mm1]$ cat /proc/interrupts
CPU0
0: 20809263 IO-APIC-edge timer
2: 0 XT-PIC cascade
14: 516 IO-APIC-edge ide4
15: 478 IO-APIC-edge ide5
16: 1763130 IO-APIC-level radeon@pci:0000:01:00.0
17: 894337 IO-APIC-level ide0, ide2, ide3, eth0
18: 59771 IO-APIC-level EMU10K1
21: 1120812 IO-APIC-level uhci_hcd, uhci_hcd, uhci_hcd, ehci_hcd
23: 2395783 IO-APIC-level eth1
NMI: 0
LOC: 20810017
ERR: 0
MIS: 485
Ingo Molnar wrote:
> i've released the -S1 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
>
All of these were generated while booting:
Sep 20 19:45:10 porky kernel: using smp_processor_id() in preemptible code: modprobe/1019
Sep 20 19:45:10 porky kernel: [<c011c58e>] smp_processor_id+0x8e/0xa0
Sep 20 19:45:10 porky kernel: [<c013ace6>] module_unload_init+0x46/0x70
Sep 20 19:45:10 porky kernel: [<c013ce58>] load_module+0x598/0xb10
Sep 20 19:45:10 porky kernel: [<c013d438>] sys_init_module+0x68/0x280
Sep 20 19:45:10 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
The above one of course repeats on each module load.
Sep 20 19:45:10 porky kernel: using smp_processor_id() in preemptible code: X/1017
Sep 20 19:45:10 porky kernel: [<c011c58e>] smp_processor_id+0x8e/0xa0
Sep 20 19:45:10 porky kernel: [<c01d6c15>] add_timer_randomness+0x125/0x150
Sep 20 19:45:10 porky kernel: [<c01d6c9e>] add_mouse_randomness+0x1e/0x30
Sep 20 19:45:10 porky kernel: [<c022b835>] input_event+0x55/0x3f0
Sep 20 19:45:10 porky kernel: [<c01151e8>] mcount+0x14/0x18
Sep 20 19:45:10 porky kernel: [<c01e559e>] kbd_rate+0x5e/0xc0
Sep 20 19:45:10 porky kernel: [<c01e2196>] vt_ioctl+0xe06/0x1ad0
Sep 20 19:45:10 porky kernel: [<c014fdcf>] pte_alloc_map+0x9f/0xd0
Sep 20 19:45:10 porky kernel: [<c015214b>] handle_mm_fault+0x17b/0x1a0
Sep 20 19:45:10 porky kernel: [<c0119440>] do_page_fault+0x1e0/0x621
Sep 20 19:45:10 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 20 19:45:10 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 20 19:45:10 porky kernel: [<c0138512>] check_preempt_timing+0x192/0x200
Sep 20 19:45:10 porky kernel: [<c0175034>] sys_ioctl+0xe4/0x240
Sep 20 19:45:10 porky kernel: [<c01dbd1e>] tty_ioctl+0xe/0x4d0
Sep 20 19:45:10 porky kernel: [<c01151e8>] mcount+0x14/0x18
Sep 20 19:45:10 porky kernel: [<c01e1390>] vt_ioctl+0x0/0x1ad0
Sep 20 19:45:10 porky kernel: [<c01dc08b>] tty_ioctl+0x37b/0x4d0
Sep 20 19:45:10 porky kernel: [<c0175034>] sys_ioctl+0xe4/0x240
Sep 20 19:45:10 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
The X one above repeats once also.
kr
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Is anyone else having trouble getting this to build on x86 smp? I am
>>getting undefined references to smp_processor_id within most, if not
>>all, modules.
>
>
> add EXPORT_SYMBOL(smp_processor_id) to the end of sched.c.
>
> Ingo
>
Thanks. That did the trick along with the afs patch Andrew posted.
kr
* K.R. Foley <[email protected]> wrote:
> All of these were generated while booting:
>
> Sep 20 19:45:10 porky kernel: using smp_processor_id() in preemptible code: modprobe/1019
> Sep 20 19:45:10 porky kernel: [<c011c58e>] smp_processor_id+0x8e/0xa0
> Sep 20 19:45:10 porky kernel: [<c013ace6>] module_unload_init+0x46/0x70
> Sep 20 19:45:10 porky kernel: [<c013ce58>] load_module+0x598/0xb10
> Sep 20 19:45:10 porky kernel: [<c013d438>] sys_init_module+0x68/0x280
> Sep 20 19:45:10 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
>
> The above one of course repeats on each module load.
ok, this is a harmless false positive - the attached patch fixes it.
> Sep 20 19:45:10 porky kernel: using smp_processor_id() in preemptible code: X/1017
> Sep 20 19:45:10 porky kernel: [<c011c58e>] smp_processor_id+0x8e/0xa0
> Sep 20 19:45:10 porky kernel: [<c01d6c15>] add_timer_randomness+0x125/0x150
> Sep 20 19:45:10 porky kernel: [<c01d6c9e>] add_mouse_randomness+0x1e/0x30
> Sep 20 19:45:10 porky kernel: [<c022b835>] input_event+0x55/0x3f0
> Sep 20 19:45:10 porky kernel: [<c01e1390>] vt_ioctl+0x0/0x1ad0
> Sep 20 19:45:10 porky kernel: [<c01dc08b>] tty_ioctl+0x37b/0x4d0
> Sep 20 19:45:10 porky kernel: [<c0175034>] sys_ioctl+0xe4/0x240
> Sep 20 19:45:10 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
>
> The X one above repeats once also.
aha! This is a real one, fixed by the second patch. This piece of code
relied on add_timer_randomness() always being called with preemption
disabled.
these fixes will show up in -S2.
Ingo
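The difference between the two reports can be modelled in userspace. The sketch below is only a mock: the preempt counter, the CPU number and every mock_* name are hypothetical stand-ins, and it is not the random.c patch itself. It shows why a per-CPU access that relies on the caller having disabled preemption trips the debug check, and how wrapping the access in a disable/enable pair (in the spirit of the kernel's get_cpu()/put_cpu()) avoids it.

```c
#include <assert.h>

/* Userspace stand-ins for kernel state; all names are hypothetical. */
int mock_preempt_count;          /* 0 means "preemptible" */
int mock_cpu;                    /* CPU we pretend to run on */
int debug_warnings;              /* number of debug reports issued */
unsigned long per_cpu_samples[4];/* some per-CPU statistic */

void mock_preempt_disable(void)
{
    mock_preempt_count++;
}

void mock_preempt_enable(void)
{
    assert(mock_preempt_count > 0);
    mock_preempt_count--;
}

/* Debug variant: count a report whenever the CPU number is looked up
 * while preemptible, because the task could migrate to another CPU
 * before the result is used. */
int mock_smp_processor_id(void)
{
    if (mock_preempt_count == 0)
        debug_warnings++;
    return mock_cpu;
}

/* Relies on the caller having disabled preemption already. */
void record_sample_unsafe(void)
{
    per_cpu_samples[mock_smp_processor_id()]++;
}

/* Disables preemption itself for the duration of the per-CPU access. */
void record_sample_safe(void)
{
    mock_preempt_disable();
    per_cpu_samples[mock_smp_processor_id()]++;
    mock_preempt_enable();
}
```

Calling record_sample_unsafe() from a preemptible context bumps debug_warnings, which mirrors the "using smp_processor_id() in preemptible code" message; record_sample_safe() does not.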
* Shane Shrybman <[email protected]> wrote:
> I am having what appears to be IDE DMA problems with 2.6.9-rc2-mm1-S1.
> 2.6.9-rc2-mm1 does not show this problem and runs fine. Before this I
> was happily using 2.6.8-rc3-O5.
>
> I tried booting with acpi=off but was unable to enter my user name at
> the login prompt, it just hung with no response to sysreq. I also
> tried turning off irq threading for that irq but it made no
> difference.
does undoing (patch -R) the attached patch fix this IDE problem?
Ingo
i've released the -S2 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S2
Changes since -S1:
- added the swapspace-layout patch to fix the get_swap_page()
latencies. (sw-suspend won't compile but everything else should work
fine.)
- fixed the random.c BKL non-preemptability assumption
- export smp_processor_id() to fix modules
- module init smp_processor_id()-debug false positive fix
To get a 2.6.9-rc2-mm1-VP-S2 kernel, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/2.6.9-rc2-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S2
Ingo
On Tue, 2004-09-21 at 03:32, Ingo Molnar wrote:
> * Shane Shrybman <[email protected]> wrote:
>
> > I am having what appears to be IDE DMA problems with 2.6.9-rc2-mm1-S1.
> > 2.6.9-rc2-mm1 does not show this problem and runs fine. Before this I
> > was happily using 2.6.8-rc3-O5.
> >
> > I tried booting with acpi=off but was unable to enter my user name at
> > the login prompt, it just hung with no response to sysreq. I also
> > tried turning off irq threading for that irq but it made no
> > difference.
>
> does undoing (patch -R) the attached patch fix this IDE problem?
>
Yes, backing out that patch seems to have fixed that problem. Thanks.
> Ingo
Regards,
Shane
On Tue, 2004-09-21 at 03:32, Ingo Molnar wrote:
> * Shane Shrybman <[email protected]> wrote:
>
> > I am having what appears to be IDE DMA problems with 2.6.9-rc2-mm1-S1.
> > 2.6.9-rc2-mm1 does not show this problem and runs fine. Before this I
> > was happily using 2.6.8-rc3-O5.
> >
> > I tried booting with acpi=off but was unable to enter my user name at
> > the login prompt, it just hung with no response to sysreq. I also
> > tried turning off irq threading for that irq but it made no
> > difference.
>
> does undoing (patch -R) the attached patch fix this IDE problem?
>
Oh, I spoke too soon. A few minutes after I sent the last email the
problem re-appeared.
IRQ#22 thread started up.
hdg: dma_timer_expiry: dma status == 0x24
ALSA sound/core/pcm_native.c:1330: playback drain error (DMA or IRQ trouble?)
PDC202XX: Secondary channel reset.
hdg: DMA interrupt recovery
hdg: lost interrupt
hdg: dma_timer_expiry: dma status == 0x24
PDC202XX: Secondary channel reset.
hdg: DMA interrupt recovery
hdg: lost interrupt
> Ingo
Regards,
Shane
Ingo Molnar wrote:
> i've released the -S1 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
>
Two separate oopses this morning running the above patch. One appears
to happen in locks_delete_lock. The log output follows. Unfortunately I
am not sure what is relevant to the oops and what's not, so I am sending
it all. Also the trace that was generated when this happened can be
found here:
http://www.cybsft.com/testresults/2.6.9-rc2-mm1-VP-S0/lat_trace22.txt
log output:
http://www.cybsft.com/testresults/2.6.9-rc2-mm1-VP-S0/dump1.txt
The other appears to happen in __posix_lock_file.
Trace here:
http://www.cybsft.com/testresults/2.6.9-rc2-mm1-VP-S0/lat_trace23.txt
log output here:
http://www.cybsft.com/testresults/2.6.9-rc2-mm1-VP-S0/dump2.txt
If there is anything else that I can provide on these, or if there is a
better way to post this, please let me know.
kr
Ingo Molnar wrote:
> i've released the -S2 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S2
>
Another smp_processor_id in modprobe. Now I see these for every
modprobe. Is this a different global lock?
Sep 21 13:27:53 porky kernel: using smp_processor_id() in preemptible code: modprobe/1737
Sep 21 13:27:53 porky kernel: [<c011c58e>] smp_processor_id+0x8e/0xa0
Sep 21 13:27:53 porky kernel: [<c01401e5>] __stop_machine_run+0xb5/0xc0
Sep 21 13:27:53 porky kernel: [<c013de30>] __try_stop_module+0x0/0x46
Sep 21 13:27:53 porky kernel: [<c01151e8>] mcount+0x14/0x18
Sep 21 13:27:53 porky kernel: [<c0140214>] stop_machine_run+0x24/0x3d
Sep 21 13:27:53 porky kernel: [<c013de30>] __try_stop_module+0x0/0x46
Sep 21 13:27:53 porky kernel: [<c013b019>] try_stop_module+0x39/0x40
Sep 21 13:27:53 porky kernel: [<c013de30>] __try_stop_module+0x0/0x46
Sep 21 13:27:53 porky kernel: [<c013b1e0>] sys_delete_module+0x110/0x180
Sep 21 13:27:53 porky kernel: [<c0154c09>] sys_munmap+0x59/0x80
Sep 21 13:27:53 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
kr
* K.R. Foley <[email protected]> wrote:
> Ingo Molnar wrote:
> >i've released the -S1 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
> >
>
> Two separate oopses this morning running that above patch. One appears
> to happen in locks_delete_lock. [...]
i too got this one today. Seems to be related to the BKL changes -
locks.c is a heavy user of the BKL. You have an SMP system, right? Does
the oops happen if you boot with maxcpus=1?
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Ingo Molnar wrote:
>>
>>>i've released the -S1 VP patch:
>>>
>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S1
>>>
>>
>>Two separate oopses this morning running that above patch. One appears
>>to happen in locks_delete_lock. [...]
>
>
> i too got this one today. Seems to be related to the BKL changes -
> locks.c is a heavy user of the BKL. You have an SMP system, right? Does
> the oops happen if you boot with maxcpus=1?
>
> Ingo
>
This was on my SMP system. I can try the maxcpus=1. However, the trouble
may be reproducing the oops. This happened at ~5:35 this morning (~9
hrs. ago). Hadn't happened again as of an hour ago when I booted S2. I
will give it a try though.
kr
Reports in my dmesg from using voluntary-preempt-2.6.9-rc2-mm1-S2
This doesn't include a bunch of reports from nvidia.
using smp_processor_id() in preemptible code: rmmod/3329
[<c011d200>] smp_processor_id+0x84/0x8a
[<c013b527>] __stop_machine_run+0xc3/0xc7
[<c01376a3>] __try_stop_module+0x0/0x42
[<c013b530>] stop_machine_run+0x5/0x18
[<c013769f>] try_stop_module+0x1f/0x23
[<c0137854>] sys_delete_module+0xef/0x171
[<c014d87f>] unmap_vma_list+0xe/0x17
[<c014dbe9>] do_munmap+0x11a/0x176
[<c0104049>] sysenter_past_esp+0x52/0x71
using smp_processor_id() in preemptible code: gkrellm/3542
[<c011d200>] smp_processor_id+0x84/0x8a
[<c02dc66a>] disk_round_stats+0x23/0x8d
[<c02df0cf>] diskstats_show+0x17/0x312
[<c0243de6>] inode_has_perm+0x53/0x87
[<c024668f>] selinux_file_mmap+0x27/0x135
[<c0275834>] __copy_to_user_ll+0x52/0x61
[<c02758e3>] copy_to_user+0x40/0x53
[<c0161bba>] cp_new_stat64+0xf0/0x102
[<c0246280>] selinux_file_permission+0x111/0x166
[<c01750cb>] seq_read+0xc3/0x273
[<c0158a3a>] vfs_read+0xcd/0x126
[<c0158d00>] sys_read+0x41/0x6a
[<c0104049>] sysenter_past_esp+0x52/0x71
using smp_processor_id() in preemptible code: gkrellm/3542
[<c011d200>] smp_processor_id+0x84/0x8a
[<c02dc66a>] disk_round_stats+0x23/0x8d
[<c02df0cf>] diskstats_show+0x17/0x312
[<c01751db>] seq_read+0x1d3/0x273
[<c0158a3a>] vfs_read+0xcd/0x126
[<c0158d00>] sys_read+0x41/0x6a
[<c0104049>] sysenter_past_esp+0x52/0x71
* Lee Revell <[email protected]> wrote:
>> preemption latency trace v1.0.6 on 2.6.9-rc1-bk12-VP-R6
>> --------------------------------------------------
>> latency: 605 us, entries: 5 (5) [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
>> -----------------
>> | task: kswapd0/35, uid:0 nice:0 policy:0 rt_prio:0
>> -----------------
>> => started at: get_swap_page+0x23/0x490
>> => ended at: get_swap_page+0x13f/0x490
>> =======>
>> 00000001 0.000ms (+0.606ms): get_swap_page (add_to_swap)
>> 00000001 0.606ms (+0.000ms): sub_preempt_count (get_swap_page)
>> 00000001 0.606ms (+0.000ms): update_max_trace (check_preempt_timing)
>> 00000001 0.606ms (+0.000ms): _mmx_memcpy (update_max_trace)
>> 00000001 0.607ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
This appears to be the middle of the function, which is about right...
On Thu, Sep 09, 2004 at 09:29:24PM +0200, Ingo Molnar wrote:
> yep, the get_swap_page() latency. I can easily trigger 10+ msec
> latencies on a box with a lot of swap by just letting stuff swap out. I
> had a quick look but there was no obvious way to break the lock. Maybe
> Andrew has better ideas? get_swap_page() is pretty stupid, it does a
> near linear search for a free slot in the swap bitmap - this not only is
> a latency issue but also an overhead thing as we do it for every other
> page that touches swap.
> rationale: this is pretty much the only latency that we are still having
> during heavy VM load and it would Just Be Cool if we fixed this final
> one. audio daemons and apps like jackd use mlockall() so they are not
> affected by swapping.
I presume most of the time is due to scan_swap_map() and not much of the
rest of get_swap_page(). Dangling hierarchical bitmaps off of the
swap_info structures to accelerate the search sounds plausible, though
code reuse is largely infeasible due to memory allocation concerns (it
must be fully-populated during swaplist element creation). The space
overhead should be 1 bit per unsigned short at the first level and 1
bit per word for higher levels until the terms vanish. So that's
sum(i>=0, BITS_PER_LONG**i<=si->max) floor(si->max/BITS_PER_LONG**i)
<= si->max/(1-1/BITS_PER_LONG) bits, or
si->max*sizeof(long)/(BITS_PER_LONG-1) bytes, which is
sizeof(long)/sizeof(short)/(BITS_PER_LONG-1) times the size of the
original swap map, which is 2/31 on 32-bit and 4/63 on 64-bit of the
size of the original swap map, both of which are just above 1/16 (1/496
above on 32-bit and 1/1008 above on 64-bit), so the space overhead
appears to be acceptable. A hierarchical bitmap should reduce the time
requirements for the search from O(sum(si) si->max) to
O(BITS_PER_LONG/lg(BITS_PER_LONG)). Sound reasonable?
-- wli
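The space estimate above is easy to check numerically. The following is a small sketch, not code from any patch: hier_bitmap_bits() is a hypothetical helper that adds up the bits of every level (one bit per swap slot at the first level, then one bit per word of the level below) for as long as BITS_PER_LONG**i does not exceed the number of slots.

```c
#include <assert.h>

/* Total bits in a hierarchical bitmap covering `max` swap slots with a
 * fan-out of `bits_per_long` per level:
 *   floor(max / B^0) + floor(max / B^1) + ...  while B^i <= max. */
unsigned long long hier_bitmap_bits(unsigned long long max,
                                    unsigned int bits_per_long)
{
    unsigned long long total = 0, span = 1;

    assert(bits_per_long >= 2);
    while (span <= max) {
        total += max / span;              /* floor(max / B^i) */
        if (span > max / bits_per_long)
            break;                        /* next power would exceed max */
        span *= bits_per_long;
    }
    return total;
}
```

For 1024 slots and 32-bit longs the levels hold 1024, 32 and 1 bits, 1057 in total, which stays under the bound 1024*32/31. Measured against the 16 bits per slot of the swap map, that is about 2/31 of its size, in line with the estimate above.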
i've released the -S3 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
most importantly, -S3 fixes the SMP+PREEMPT bug reported by K.R. Foley.
It was a bug in BKL-preemption: forced preemption still caused automatic
dropping of the BKL - this is bad and broke fs/locks.c. (The race could
occur on UP+PREEMPT too but it has never been reproduced there.)
other changes since -S2:
- introduced a CONFIG_PREEMPT_BKL - just in case there are other
problems. This can be used to turn BKL preemption on/off. Can be
useful for performance tests as well.
- fixed a couple more smp_processor_id() false positives.
- cleaned up hardirq.c some more
To get a 2.6.9-rc2-mm1-VP-S3 kernel, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm1/2.6.9-rc2-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
Ingo
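The BKL rule described in the -S3 notes can be pictured with a toy model. This is a sketch under assumptions, not the -S3 code: all toy_* names are hypothetical, and hiding lock_depth around the forced reschedule is just one way to express "forced preemption must not drop the BKL". A voluntary schedule() drops the lock while the task sleeps and retakes it afterwards; a forced preemption leaves it alone.

```c
#include <assert.h>

/* Toy model of a task and the Big Kernel Lock; not kernel code. */
struct toy_task {
    int lock_depth;   /* -1: BKL not held, >= 0: held (nesting depth) */
};

int bkl_owned;        /* 1 while the BKL is owned */
int bkl_drops;        /* how often the scheduler auto-dropped it */

void toy_lock_kernel(struct toy_task *t)
{
    assert(t->lock_depth >= -1);
    if (++t->lock_depth == 0)
        bkl_owned = 1;
}

/* Voluntary scheduling: the BKL is released while the task sleeps and
 * taken again before the task resumes. */
void toy_schedule(struct toy_task *t)
{
    if (t->lock_depth >= 0) {
        bkl_owned = 0;          /* auto-drop */
        bkl_drops++;
        /* ... other tasks would run here ... */
        bkl_owned = 1;          /* reacquire on the way back */
    }
}

/* Forced preemption: hide lock_depth so the auto-drop is skipped and
 * the preempted task keeps the BKL throughout. */
void toy_preempt_schedule(struct toy_task *t)
{
    int saved = t->lock_depth;

    t->lock_depth = -1;
    toy_schedule(t);
    t->lock_depth = saved;
}
```

In this model code such as fs/locks.c, which depends on the BKL staying held between two points that do not sleep, sees no drop when it is preempted involuntarily.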
Ingo Molnar wrote:
> i've released the -S3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
>
In order to get this to build I had to add
#include <asm/delay.h>
to linux/kernel/time.c
kr
Ingo Molnar wrote:
> i've released the -S3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
>
OK. Bad things seem to happen with this patch. Each time I booted it
(twice) several telnet connections get dropped before I get a prompt
(this is without any load on the system). The system SEEMED a bit less
responsive. I can't be sure about that because I booted it remotely.
After starting stress-kernel and logging out, I couldn't get back into
the system remotely. Telnet and ssh both would just hang indefinitely.
The console was still useable I think (according to my wife being my
remote hands and eyes.) I saw no indications in the log of any
unhappiness or any indications of why connections were dropping and
hanging. Also the highest latency reported was 252 usec.
kr
On Wed, 2004-09-22 at 11:07, K.R. Foley wrote:
> Ingo Molnar wrote:
> > i've released the -S3 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
> >
>
> In order to get this to build I had to add
>
> #include <asm/delay.h>
>
> to linux/kernel/time.c
>
Builds fine for me, this must be specific to your config, or Ingo fixed the
patch.
Lee
Ingo Molnar wrote:
> i've released the -S2 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S2
>
I don't know if this one has been fixed in S3 or not but I also saw this
in S1 I think. This just happened when I booted back into S2 so I
thought I would report it.
kr
Sep 22 12:00:44 porky kernel: Unable to handle kernel NULL pointer dereference at virtual address 00000000
Sep 22 12:00:44 porky kernel: printing eip:
Sep 22 12:00:44 porky kernel: c01776dd
Sep 22 12:00:44 porky kernel: *pde = 00000000
Sep 22 12:00:44 porky kernel: using smp_processor_id() in preemptible code: mingetty/2667
Sep 22 12:00:44 porky kernel: [<c011c58e>] smp_processor_id+0x8e/0xa0
Sep 22 12:00:44 porky kernel: [<c01078b8>] die+0x18/0x190
Sep 22 12:00:44 porky kernel: [<c0121675>] vprintk+0x125/0x170
Sep 22 12:00:44 porky kernel: [<c012153d>] printk+0x1d/0x30
Sep 22 12:00:44 porky kernel: [<c011958d>] do_page_fault+0x32d/0x621
Sep 22 12:00:44 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 22 12:00:44 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 22 12:00:44 porky kernel: [<c02b3f9a>] _spin_unlock+0x1a/0x40
Sep 22 12:00:44 porky init: open(/dev/pts/0): No such file or directory
Sep 22 12:00:44 porky kernel: [<c0138512>] check_preempt_timing+0x192/0x200
Sep 22 12:00:45 porky init: open(/dev/pts/0): No such file or directory
Sep 22 12:00:45 porky kernel: [<c02b3f9a>] _spin_unlock+0x1a/0x40
Sep 22 12:00:45 porky kernel: [<c014a8a3>] kmem_cache_alloc+0x63/0x70
Sep 22 12:00:45 porky kernel: [<c02b314e>] cond_resched+0xe/0x80
Sep 22 12:00:45 porky kernel: [<c0119260>] do_page_fault+0x0/0x621
Sep 22 12:00:45 porky kernel: [<c0107175>] error_code+0x2d/0x38
Sep 22 12:00:45 porky kernel: [<c01776dd>] __posix_lock_file+0x6d/0x5a0
Sep 22 12:00:45 porky kernel: [<c0176a2e>] flock_to_posix_lock+0xe/0x170
Sep 22 12:00:45 porky kernel: [<c0178bd6>] fcntl_setlk+0x166/0x2d0
Sep 22 12:00:45 porky kernel: [<c01b3fd8>] dummy_file_lock+0x8/0x10
Sep 22 12:00:45 porky kernel: [<c0178c0f>] fcntl_setlk+0x19f/0x2d0
Sep 22 12:00:45 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 22 12:00:45 porky kernel: [<c02b3f9a>] _spin_unlock+0x1a/0x40
Sep 22 12:00:45 porky kernel: [<c0138512>] check_preempt_timing+0x192/0x200
Sep 22 12:00:45 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 22 12:00:45 porky kernel: [<c02b3f9a>] _spin_unlock+0x1a/0x40
Sep 22 12:00:45 porky kernel: [<c01747d2>] sys_fcntl64+0xa2/0xc0
Sep 22 12:00:45 porky kernel: [<c0174501>] do_fcntl+0x11/0x1c0
Sep 22 12:00:45 porky kernel: [<c01745f5>] do_fcntl+0x105/0x1c0
Sep 22 12:00:45 porky kernel: [<c01747d2>] sys_fcntl64+0xa2/0xc0
Sep 22 12:00:46 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
Sep 22 12:00:46 porky kernel: Oops: 0000 [#1]
Sep 22 12:00:46 porky kernel: PREEMPT SMP
Sep 22 12:00:46 porky kernel: Modules linked in: parport_pc lp parport autofs4 sunrpc tulip ide_cd cdrom floppy sg microcode dm_mod uhci_hcd usbcore ipv6 ext3 jbd aic7xxx sd_mod scsi_mod
Sep 22 12:00:46 porky kernel: CPU: 0
Sep 22 12:00:47 porky kernel: EIP: 0060:[<c01776dd>] Not tainted VLI
Sep 22 12:00:47 porky kernel: EFLAGS: 00010246 (2.6.9-rc2-mm1-VP-S2)
Sep 22 12:00:47 porky kernel: EIP is at __posix_lock_file+0x6d/0x5a0
Sep 22 12:00:47 porky kernel: eax: 00000000 ebx: 00000000 ecx: d74e05ac edx: 00000000
Sep 22 12:00:47 porky kernel: esi: dc53e328 edi: dc53e3d8 ebp: de96ff00 esp: de96fea4
Sep 22 12:00:47 porky kernel: ds: 007b es: 007b ss: 0068 preempt: 00000001
Sep 22 12:00:47 porky kernel: Process mingetty (pid: 2667, threadinfo=de96e000 task=dfcecbf0)
Sep 22 12:00:48 porky kernel: Stack: d74e05ac d74e01ec 00000202 00000046 c0176a2e d74e05ac dea26ce0 d74e05ac
Sep 22 12:00:48 porky kernel: dea26ce0 de96fee4 00000202 c0178bd6 c01b3fd8 d74e05ac 00000000 00000000
Sep 22 12:00:48 porky kernel: 00000000 00000000 d74e054c d74e04ec d74e05ac dea26ce0 00000000 de96ff7c
Sep 22 12:00:48 porky kernel: Call Trace:
Sep 22 12:00:48 porky kernel: [<c0176a2e>] flock_to_posix_lock+0xe/0x170
Sep 22 12:00:48 porky kernel: [<c0178bd6>] fcntl_setlk+0x166/0x2d0
Sep 22 12:00:48 porky kernel: [<c01b3fd8>] dummy_file_lock+0x8/0x10
Sep 22 12:00:48 porky kernel: [<c0178c0f>] fcntl_setlk+0x19f/0x2d0
Sep 22 12:00:48 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 22 12:00:48 porky kernel: [<c02b3f9a>] _spin_unlock+0x1a/0x40
Sep 22 12:00:48 porky kernel: [<c0138512>] check_preempt_timing+0x192/0x200
Sep 22 12:00:48 porky kernel: [<c0138759>] sub_preempt_count+0x69/0x80
Sep 22 12:00:48 porky kernel: [<c02b3f9a>] _spin_unlock+0x1a/0x40
Sep 22 12:00:48 porky kernel: [<c01747d2>] sys_fcntl64+0xa2/0xc0
Sep 22 12:00:48 porky kernel: [<c0174501>] do_fcntl+0x11/0x1c0
Sep 22 12:00:48 porky kernel: [<c01745f5>] do_fcntl+0x105/0x1c0
Sep 22 12:00:48 porky kernel: [<c01747d2>] sys_fcntl64+0xa2/0xc0
Sep 22 12:00:48 porky kernel: [<c01066b9>] sysenter_past_esp+0x52/0x71
Sep 22 12:00:48 porky kernel: Code: 74 2f 8d 96 b0 00 00 00 89 55 e0 8b 86 b0 00 00 00 85 c0 74 1c 89 c3 8d b4 26 00 00 00 00 f6 43 30 01 0f 85 ce 04 00 00 89 5d e0 <8b> 1b 85 db 75 ed 8b 55 0c 31 ff f6 42 30 08 0f 85 6e 02 00 00
Lee Revell wrote:
> On Wed, 2004-09-22 at 11:07, K.R. Foley wrote:
>
>>Ingo Molnar wrote:
>>
>>>i've released the -S3 VP patch:
>>>
>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
>>>
>>
>>In order to get this to build I had to add
>>
>>#include <asm/delay.h>
>>
>>to linux/kernel/time.c
>>
>
>
> Builds fine for me, this must be specific to your config, or Ingo fixed the
> patch.
>
> Lee
>
>
Hmm. That seems odd.
By the way, should I have included linux/delay.h, which includes
asm/delay.h, instead of including asm/delay.h directly?
kr
On Wed, 2004-09-22 at 06:33, Ingo Molnar wrote:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
The rt_garbage_collect latency is back:
preemption latency trace v1.0.7 on 2.6.9-rc2-mm1-VP-S3
-------------------------------------------------------
latency: 2040 us, entries: 2321 (2321) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: ksoftirqd/0/2, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: netif_receive_skb+0x6a/0x1d0
=> ended at: netif_receive_skb+0x125/0x1d0
=======>
00000001 0.000ms (+0.001ms): netif_receive_skb (process_backlog)
00000001 0.001ms (+0.000ms): packet_rcv_spkt (netif_receive_skb)
00000001 0.002ms (+0.000ms): skb_clone (packet_rcv_spkt)
00000001 0.002ms (+0.001ms): kmem_cache_alloc (skb_clone)
00000001 0.004ms (+0.001ms): memcpy (skb_clone)
00000001 0.006ms (+0.001ms): strlcpy (packet_rcv_spkt)
00000002 0.007ms (+0.003ms): sk_run_filter (packet_rcv_spkt)
00000001 0.011ms (+0.000ms): __kfree_skb (packet_rcv_spkt)
00000001 0.012ms (+0.000ms): kfree_skbmem (__kfree_skb)
00000001 0.012ms (+0.000ms): skb_release_data (kfree_skbmem)
00000001 0.012ms (+0.000ms): kmem_cache_free (kfree_skbmem)
00000001 0.013ms (+0.001ms): ip_rcv (netif_receive_skb)
00000001 0.015ms (+0.000ms): ip_route_input (ip_rcv)
00000001 0.015ms (+0.003ms): rt_hash_code (ip_route_input)
00000001 0.019ms (+0.001ms): ip_route_input_slow (ip_rcv)
00000001 0.021ms (+0.001ms): rt_hash_code (ip_route_input_slow)
00000001 0.022ms (+0.002ms): fn_hash_lookup (ip_route_input_slow)
00000002 0.024ms (+0.002ms): fib_semantic_match (fn_hash_lookup)
00000001 0.027ms (+0.000ms): fib_validate_source (ip_route_input_slow)
00000001 0.028ms (+0.001ms): fn_hash_lookup (fib_validate_source)
00000001 0.029ms (+0.001ms): fn_hash_lookup (fib_validate_source)
00000002 0.031ms (+0.001ms): fib_semantic_match (fn_hash_lookup)
00000001 0.032ms (+0.000ms): __fib_res_prefsrc (fib_validate_source)
00000001 0.033ms (+0.002ms): inet_select_addr (__fib_res_prefsrc)
00000001 0.035ms (+0.000ms): dst_alloc (ip_route_input_slow)
00000001 0.036ms (+0.002ms): rt_garbage_collect (dst_alloc)
00000102 0.039ms (+0.001ms): rt_may_expire (rt_garbage_collect)
00000101 0.040ms (+0.000ms): local_bh_enable (rt_garbage_collect)
00000102 0.041ms (+0.001ms): rt_may_expire (rt_garbage_collect)
00000101 0.042ms (+0.000ms): local_bh_enable (rt_garbage_collect)
00000102 0.043ms (+0.000ms): rt_may_expire (rt_garbage_collect)
00000101 0.044ms (+0.000ms): local_bh_enable (rt_garbage_collect)
00000102 0.045ms (+0.001ms): rt_may_expire (rt_garbage_collect)
00000102 0.046ms (+0.001ms): rt_may_expire (rt_garbage_collect)
00000101 0.047ms (+0.000ms): local_bh_enable (rt_garbage_collect)
00000102 0.048ms (+0.001ms): rt_may_expire (rt_garbage_collect)
[these 2 repeat hundreds of times]
00000101 1.875ms (+0.000ms): local_bh_enable (rt_garbage_collect)
00000102 1.876ms (+0.000ms): rt_may_expire (rt_garbage_collect)
00000102 1.877ms (+0.000ms): rt_may_expire (rt_garbage_collect)
00000101 1.877ms (+0.000ms): local_bh_enable (rt_garbage_collect)
00000102 1.878ms (+0.000ms): rt_may_expire (rt_garbage_collect)
00000102 1.879ms (+0.000ms): rt_may_expire (rt_garbage_collect)
00000101 1.880ms (+0.001ms): local_bh_enable (rt_garbage_collect)
00000001 1.881ms (+0.001ms): kmem_cache_alloc (dst_alloc)
00000001 1.882ms (+0.003ms): cache_alloc_refill (kmem_cache_alloc)
[some other stuff]
Lee
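The first column of each trace line above looks like a hexadecimal
preempt_count snapshot. Assuming that is what it is, and assuming the
2.6-era field layout (low 8 bits preemption depth, next 8 bits softirq
count, hardirq count above that), a value such as 00000102 inside
rt_garbage_collect would decode as softirq count 1 and preemption depth 2.
A minimal userspace sketch; the masks are assumptions and the struct and
function names are made up for illustration:

```c
#include <stdint.h>

/* Assumed layout, modelled on the 2.6-era preempt_count:
 * bits 0-7   preemption depth
 * bits 8-15  softirq (bh-disable) count
 * bits 16-27 hardirq count
 */
#define PREEMPT_MASK 0x000000ffu
#define SOFTIRQ_MASK 0x0000ff00u
#define HARDIRQ_MASK 0x0fff0000u

/* Hypothetical container for the decoded fields. */
struct pc_fields {
    unsigned preempt_depth;
    unsigned softirq_count;
    unsigned hardirq_count;
};

/* Split a raw counter value into its three fields. */
struct pc_fields decode_preempt_count(uint32_t pc)
{
    struct pc_fields f;

    f.preempt_depth = pc & PREEMPT_MASK;
    f.softirq_count = (pc & SOFTIRQ_MASK) >> 8;
    f.hardirq_count = (pc & HARDIRQ_MASK) >> 16;
    return f;
}
```

Under those assumptions, the alternation between 00000102 and 00000101 in
the trace would be one preemption level being taken and dropped while
bottom halves stay disabled.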
* Lee Revell <[email protected]> wrote:
> On Wed, 2004-09-22 at 11:07, K.R. Foley wrote:
> > Ingo Molnar wrote:
> > > i've released the -S3 VP patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
> > >
> >
> > In order to get this to build I had to add
> >
> > #include <asm/delay.h>
> >
> > to linux/kernel/time.c
> >
>
> Builds fine for me, this must be specific to your config, or Ingo fixed
> the patch.
yeah, this was reported very early so i just fixed up the patch.
Ingo
* K.R. Foley <[email protected]> wrote:
> Ingo Molnar wrote:
> >i've released the -S2 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S2
> >
>
> I don't know if this one has been fixed in S3 or not but I also saw
> this in S1 I think. This just happened when I booted back into S2 so I
> thought I would report it.
>
> kr
>
>
> Sep 22 12:00:44 porky kernel: Unable to handle kernel NULL pointer
> dereference at virtual address 00000000
> Sep 22 12:00:45 porky kernel: [<c0107175>] error_code+0x2d/0x38
> Sep 22 12:00:45 porky kernel: [<c01776dd>] __posix_lock_file+0x6d/0x5a0
yeah, this is one of the bugs that should be fixed in -S3.
Ingo
On Wed, 2004-09-22 at 06:33, Ingo Molnar wrote:
> i've released the -S3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm1-S3
>
Ingo, it looks like there are still a few hardware configurations on
which the VP patches just don't work:
http://eca.cx/lad/2004/09/0221.html
This user definitely knows what they are doing. IIRC the problems were
introduced with -O5, which involved big SMP changes so this really looks
like a bug. Any ideas?
Lee
i've released the -S4 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm2-S4
-S4 fixes a softirq latency processing bug introduced in -S3. The
symptoms of this bug can be erratic mouse/keyboard behavior, higher
networking latencies, and similar things. (If CONFIG_PREEMPT is disabled
then another effect of this bug can lead to crashes.)
-S4 is also a merge to 2.6.9-rc2-mm2.
To get a 2.6.9-rc2-mm2-VP-S4 kernel, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm2/2.6.9-rc2-mm2.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm2-S4
Ingo
* Norberto Bensa <[email protected]> wrote:
> Hello,
>
> Ingo Molnar wrote:
> > i've released the -S4 VP patch:
> >
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm2-S4
>
> CC arch/i386/kernel/irq.o
> arch/i386/kernel/irq.c: In function `do_IRQ':
> arch/i386/kernel/irq.c:287: warning: implicit declaration of function
> `redirect_hardirq'
> arch/i386/kernel/irq.c:344: error: `noirqdebug' undeclared (first use in this
did you do a 'make oldconfig'? Make sure there's
CONFIG_GENERIC_HARDIRQ=y in your .config.
Ingo
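The check suggested above (is CONFIG_GENERIC_HARDIRQ=y present in .config)
can be sketched as a small helper. This is only an illustration: the
function name config_enabled is made up, and it scans an in-memory copy of
the .config text rather than reading the file:

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Return true if config_text contains a line that is exactly
 * "<symbol>=y". Lines such as "# <symbol> is not set" do not match.
 */
bool config_enabled(const char *config_text, const char *symbol)
{
    size_t sym_len = strlen(symbol);
    const char *line = config_text;

    while (*line != '\0') {
        const char *eol = strchr(line, '\n');
        size_t len = eol ? (size_t)(eol - line) : strlen(line);

        if (len == sym_len + 2 &&
            strncmp(line, symbol, sym_len) == 0 &&
            line[sym_len] == '=' && line[sym_len + 1] == 'y')
            return true;

        if (!eol)
            break;
        line = eol + 1;
    }
    return false;
}
```

If the symbol is missing even after 'make oldconfig', the Kconfig entry
itself is probably not in the tree, which points at the patch rather than
at the local configuration.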
Hello,
Ingo Molnar wrote:
> i've released the -S4 VP patch:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm2-S4
CC arch/i386/kernel/irq.o
arch/i386/kernel/irq.c: In function `do_IRQ':
arch/i386/kernel/irq.c:287: warning: implicit declaration of function
`redirect_hardirq'
arch/i386/kernel/irq.c:344: error: `noirqdebug' undeclared (first use in this
function)
arch/i386/kernel/irq.c:344: error: (Each undeclared identifier is reported
only once
arch/i386/kernel/irq.c:344: error: for each function it appears in.)
arch/i386/kernel/irq.c:345: warning: implicit declaration of function
`note_interrupt'
make[1]: *** [arch/i386/kernel/irq.o] Error 1
make: *** [arch/i386/kernel] Error 2
Regards,
Norberto
oops - Kconfig chunks are missing. fixing.
Ingo
* Ingo Molnar <[email protected]> wrote:
>
> * Norberto Bensa <[email protected]> wrote:
>
> > Hello,
> >
> > Ingo Molnar wrote:
> > > i've released the -S4 VP patch:
> > >
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm2-S4
> >
> > CC arch/i386/kernel/irq.o
> > arch/i386/kernel/irq.c: In function `do_IRQ':
> > arch/i386/kernel/irq.c:287: warning: implicit declaration of function
> > `redirect_hardirq'
> > arch/i386/kernel/irq.c:344: error: `noirqdebug' undeclared (first use in this
>
>
> did you do a 'make oldconfig'? Make sure there's
> CONFIG_GENERIC_HARDIRQ=y in your .config.
>
> Ingo
i've uploaded the correct patch - please download -S4 again.
Ingo
Ingo Molnar wrote:
>
> i've released the -S4 VP patch:
>
Just tried to configure for 2.6.9-rc2-mm2-VP-S4 on my laptop. Strangely
enough, the PREEMPT_VOLUNTARY, PREEMPT_SOFTIRQS and PREEMPT_HARDIRQS
symbols are not available in Kconfig.
Not surprisingly, make stops on these messages:
[...]
CC arch/i386/kernel/irq.o
arch/i386/kernel/irq.c: In function `do_IRQ':
arch/i386/kernel/irq.c:287: warning: implicit declaration of function
`redirect_hardirq'
arch/i386/kernel/irq.c:363: error: `noirqdebug' undeclared (first use in
this function)
arch/i386/kernel/irq.c:363: error: (Each undeclared identifier is reported
only once
arch/i386/kernel/irq.c:363: error: for each function it appears in.)
arch/i386/kernel/irq.c:364: warning: implicit declaration of function
`note_interrupt'
make[1]: *** [arch/i386/kernel/irq.o] Error 1
make: *** [arch/i386/kernel] Error 2
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Ingo Molnar wrote:
> >
> > i've released the -S4 VP patch:
> >
>
> Just tried to configure for 2.6.9-rc2-mm2-VP-S4 on my laptop. Strange
> enough I don't get the PREEMPT_VOLUNTARY, PREEMPT_SOFTIRQS and
> PREEMPT_HARDIRQS symbols available for Kconfig.
>
> Not surprisingly, make stops on these messages:
>
> [...]
> CC arch/i386/kernel/irq.o
> arch/i386/kernel/irq.c: In function `do_IRQ':
yeah, please re-download the -S4 patch, i fixed this meanwhile.
Ingo
Ingo Molnar wrote:
>
> yeah, please re-download the -S4 patch, i fixed this meanwhile.
>
Yes, now it builds fine on my laptop.
However, after a couple of reboots, there appear to be some verbose
PCI-related messages, but my main complaint is the USB subsystem, which
is now failing miserably.
I guess this is the relevant excerpt of the log messages:
[...]
Mounted devfs on /dev
Freeing unused kernel memory: 160k freed
IRQ#8 thread started up.
usbcore: registered new driver usbfs
usbcore: registered new driver hub
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt Link [LNK8] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:02.0: OHCI Host Controller
requesting new irq thread for IRQ10...
ohci_hcd 0000:00:02.0: irq 10, pci mem 0xd4000000
ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:02.0: init err (00002edf 0000)
ohci_hcd 0000:00:02.0: can't start
ohci_hcd 0000:00:02.0: init error -75
ohci_hcd 0000:00:02.0: remove, state 0
ohci_hcd 0000:00:02.0: USB bus 1 deregistered
ohci_hcd: probe of 0000:00:02.0 failed with error -75
ACPI: PCI Interrupt Link [LNK4] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:0f.0: OHCI Host Controller
ohci_hcd 0000:00:0f.0: irq 10, pci mem 0xd4009000
ohci_hcd 0000:00:0f.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:0f.0: init err (00002edf 0000)
ohci_hcd 0000:00:0f.0: can't start
ohci_hcd 0000:00:0f.0: init error -75
ohci_hcd 0000:00:0f.0: remove, state 0
ohci_hcd 0000:00:0f.0: USB bus 1 deregistered
ohci_hcd: probe of 0000:00:0f.0 failed with error -75
[...]
This probably isn't strictly related to VP; more likely it came in with
-mm1 and -mm2, though I can't tell for sure. And please don't count it as
a hardware failure: going back to 2.6.9-rc1 is enough to get USB back to
normal ;)
Any thoughts?
--
rncbc aka Rui Nuno Capela
[email protected]
Hello,
Ingo Molnar wrote:
> i've uploaded the correct patch - please download -S4 again.
>
Is it just me, or does this patch break the quiet command-line parameter?
Kernel command line: quiet vga=0x030c
Yet this kernel is very verbose :(
.config attached. BTW, I think you got your Subject: wrong ;)
Best regards,
Norberto
Hello,
Norberto Bensa wrote:
> Is it just me, or does this patch break the quiet command-line parameter?
Ingo, is this on purpose:
--- linux/kernel/printk.c.orig
+++ linux/kernel/printk.c
@@ -401,7 +401,7 @@ static void __call_console_drivers(unsig
static void _call_console_drivers(unsigned long start,
unsigned long end, int msg_log_level)
{
- if (msg_log_level < console_loglevel &&
+ if (/*msg_log_level < console_loglevel && */
console_drivers && start != end) {
if ((start & LOG_BUF_MASK) > (end & LOG_BUF_MASK)) {
/* wrapped write */
If so, why is it needed?
Thanks,
Norberto
Rui Nuno Capela (meself:) wrote:
>
> Yes, now it builds fine on my laptop.
>
> However, after a couple of reboots, there appears to be some verbose
> messages regarding PCI something, but the my main complaint is the USB
> subsystem which is failing miserably now.
>
> I guess these are the relevant log messages excerpt:
>
> [...]
> Mounted devfs on /dev
> Freeing unused kernel memory: 160k freed
> IRQ#8 thread started up.
> usbcore: registered new driver usbfs
> usbcore: registered new driver hub
> ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
> ACPI: PCI Interrupt Link [LNK8] enabled at IRQ 10
> ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
> ohci_hcd 0000:00:02.0: OHCI Host Controller
> requesting new irq thread for IRQ10...
> ohci_hcd 0000:00:02.0: irq 10, pci mem 0xd4000000
> ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 1
> ohci_hcd 0000:00:02.0: init err (00002edf 0000)
> ohci_hcd 0000:00:02.0: can't start
> ohci_hcd 0000:00:02.0: init error -75
> ohci_hcd 0000:00:02.0: remove, state 0
> ohci_hcd 0000:00:02.0: USB bus 1 deregistered
> ohci_hcd: probe of 0000:00:02.0 failed with error -75
> ACPI: PCI Interrupt Link [LNK4] enabled at IRQ 10
> ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 10 (level, low) -> IRQ 10
> ohci_hcd 0000:00:0f.0: OHCI Host Controller
> ohci_hcd 0000:00:0f.0: irq 10, pci mem 0xd4009000
> ohci_hcd 0000:00:0f.0: new USB bus registered, assigned bus number 1
> ohci_hcd 0000:00:0f.0: init err (00002edf 0000)
> ohci_hcd 0000:00:0f.0: can't start
> ohci_hcd 0000:00:0f.0: init error -75
> ohci_hcd 0000:00:0f.0: remove, state 0
> ohci_hcd 0000:00:0f.0: USB bus 1 deregistered
> ohci_hcd: probe of 0000:00:0f.0 failed with error -75
> [...]
>
> Probably this isn't strictly related to VP, but surely it was introduced
> by mm1 and mm2. Can't tell for sure. And please don't count as hardware
> failure as it suffices to go back to 2.6.9-rc1 to get USB back to normal
> ;)
>
> Any thoughts?
This isn't related to VP at all. Sorry for the bandwidth waste.
Just noticed a post from Andre Eisenbach [1] a few moments ago, with the
subject "USB (OHCI) not working without pci=routeirq", and it's exactly
what is going wrong here.
[1] http://lkml.org/lkml/2004/9/23/163
Thanks anyhow.
--
rncbc aka Rui Nuno Capela
[email protected]
* Norberto Bensa <[email protected]> wrote:
> Ingo, is this on purpose:
>
> --- linux/kernel/printk.c.orig
> +++ linux/kernel/printk.c
> @@ -401,7 +401,7 @@ static void __call_console_drivers(unsig
> static void _call_console_drivers(unsigned long start,
> unsigned long end, int msg_log_level)
> {
> - if (msg_log_level < console_loglevel &&
> + if (/*msg_log_level < console_loglevel && */
> console_drivers && start != end) {
not intended, debugging leftover, you can remove it.
Ingo
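For reference, the effect of that leftover hunk can be modelled in a few
lines. The sketch below is a userspace illustration, not the printk.c
code: the function names are made up, and the numeric levels are the
commonly used values (console_loglevel 7 by default, 4 with "quiet",
KERN_INFO 6, KERN_ERR 3), taken here as assumptions:

```c
#include <stdbool.h>

/* Assumed values, modelled on the usual kernel defaults. */
enum {
    MODEL_CONSOLE_LOGLEVEL_DEFAULT = 7,
    MODEL_CONSOLE_LOGLEVEL_QUIET   = 4,
    MODEL_KERN_ERR                 = 3,
    MODEL_KERN_INFO                = 6
};

/* The test that the stray debugging hunk commented out:
 * only messages more severe than console_loglevel reach the console.
 */
bool console_should_print(int msg_log_level, int console_loglevel)
{
    return msg_log_level < console_loglevel;
}

/* Behaviour with the check removed: every message is printed,
 * whatever the boot command line asked for.
 */
bool console_should_print_unfiltered(int msg_log_level, int console_loglevel)
{
    (void)msg_log_level;
    (void)console_loglevel;
    return true;
}
```

With the filter in place a KERN_INFO message is dropped under "quiet";
with the filter commented out it is printed anyway, which matches the
verbose boot Norberto saw.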
i've released the -S5 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm3-S5
this iteration fixes the jackd lockup reported by Rui Nuno Capela.
since Andrew has not released -mm3 yet i've uploaded his latest
intermediate tree plus two additional fixes that will likely show up in
the real -mm3, and unrolled some of the more experimental scheduler
stuff. The patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://redhat.com/~mingo/voluntary-preempt/2.6.9-rc2-pre-mm3-mingo.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm3-S5
Ingo
Dear Sir Molnar,
>
> i've released the -S5 VP patch:
>
This is going GREAT!
While I'm still writing these very lines (via some webmail client on
konqui), my dreaded P4/HT box is pumping hard along jackd -R -p 128 -n2,
for more than one hour now, with two qsynth/fluidsynth instances up,
ardour continuous 8-track in loop-streaming playback nonstop, and also
some more juk and artsd (kde 3.3 here) all trashing my-full-head-of-sound
in my phones. No xrun, NO lockup, NO nothing. And still counting...
I even have CONFIG_SCHED_SMT set, as you can check by attached config
file. Amazing, to say the least, if you remember my latest complaints ;)
Ingo, you're a champ!
Gotta go to sleep now, it was worth the effort staying awake this late...
tomorrow is another day :)
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
> i've released the -S5 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm3-S5
This one seems to bring back some issues with the network interface. The
only noticeable symptom is dropping ~30 percent of new telnet
connections under heavy load. When not loaded it still drops ~5 percent.
I had no dropped connections with S4 even when loaded. This just happens
to be one of the things that I have been testing manually since I noticed
some problems with previous patches.
Currently using an SMC card with a DEC 21140 chip and the tulip driver
on my SMP system.
kr
K.R. Foley wrote:
> Ingo Molnar wrote:
>
>> i've released the -S5 VP patch:
>>
>>
>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm3-S5
>>
>
>
> This one seems to bring back some issues with the network interface. The
> only noticeable symptom is dropping ~30 percent of new telnet
> connections under heavy load. When not loaded it still drops ~5 percent.
> I had no dropped connections with S4 even when loaded. This just happens
> to be one of the things that I have been testing manually since I noticed
> some problems with previous patches.
>
> Currently using an SMC card with a DEC 21140 chip and the tulip driver
> on my SMP system.
>
> kr
The following, on top of Ingo's patch above, fixes the problem with
dropping new connections and doesn't have any adverse effects that I've
seen:
--- linux-2.6.9-rc2-pre-mm3/net/ipv4/tcp_output.c.orig 2004-09-23 22:16:42.249435870 -0500
+++ linux-2.6.9-rc2-pre-mm3/net/ipv4/tcp_output.c 2004-09-23 22:12:03.911811945 -0500
@@ -699,11 +699,6 @@
tcp_minshall_update(tp, mss_now, skb);
sent_pkts = 1;
- /*
- * Break out early - we'll continue later:
- */
- if (softirq_need_resched())
- break;
}
if (sent_pkts) {
* K.R. Foley <[email protected]> wrote:
> The following, on top of Ingo's patch above, fixes the problem with
> dropping new connections and doesn't have any adverse effects that
> I've seen:
>
> --- linux-2.6.9-rc2-pre-mm3/net/ipv4/tcp_output.c.orig 2004-09-23
> 22:16:42.249435870 -0500
> +++ linux-2.6.9-rc2-pre-mm3/net/ipv4/tcp_output.c 2004-09-23
> 22:12:03.911811945 -0500
> @@ -699,11 +699,6 @@
>
> tcp_minshall_update(tp, mss_now, skb);
> sent_pkts = 1;
> - /*
> - * Break out early - we'll continue later:
> - */
> - if (softirq_need_resched())
> - break;
hm, ok, i'll revert this in my tree. I suspect we'll see some latencies
resurfacing under high network load again, but correctness goes first
obviously. If so, we'll have to find some other method to break that
critical path.
Ingo
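The trade-off being discussed (leave a transmit loop early to bound
latency, then continue later) can be sketched with a toy queue. This is a
userspace model under stated assumptions, not the tcp_output.c code:
tx_queue, tx_drain and model_need_resched are hypothetical names, and the
"reschedule wanted" condition is simulated with a per-pass quantum:

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical transmit queue state. */
struct tx_queue {
    size_t pending;     /* packets still waiting to be sent */
    size_t sent;        /* packets sent so far */
    bool   needs_rerun; /* set when a pass stopped before the queue was empty */
};

/* Stand-in for a "reschedule wanted" test: ask to yield once
 * `quantum` packets have gone out in the current pass.
 */
bool model_need_resched(size_t sent_this_pass, size_t quantum)
{
    return sent_this_pass >= quantum;
}

/* Send until the queue is empty or a reschedule is wanted.
 * Returns the number of packets sent in this pass.
 */
size_t tx_drain(struct tx_queue *q, size_t quantum)
{
    size_t sent_this_pass = 0;

    q->needs_rerun = false;
    while (q->pending > 0) {
        q->pending--;
        q->sent++;
        sent_this_pass++;

        /* Break out early; the caller must come back for the rest. */
        if (q->pending > 0 && model_need_resched(sent_this_pass, quantum)) {
            q->needs_rerun = true;
            break;
        }
    }
    return sent_this_pass;
}
```

The point of the model is the needs_rerun flag: an early break is only
safe if something reliably calls tx_drain() again. Whether that was what
went wrong in -S5 is not established in this thread.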
i've released the -S6 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm3-S6
this iteration fixes the 'dropped tcp connections' problem reported and
fixed by K.R. Foley.
The patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://redhat.com/~mingo/voluntary-preempt/2.6.9-rc2-pre-mm3-mingo.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm3-S6
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>The following, on top of Ingo's patch above, fixes the problem with
>>dropping new connections and doesn't have any adverse effects that
>>I've seen:
>>
>>--- linux-2.6.9-rc2-pre-mm3/net/ipv4/tcp_output.c.orig 2004-09-23
>>22:16:42.249435870 -0500
>>+++ linux-2.6.9-rc2-pre-mm3/net/ipv4/tcp_output.c 2004-09-23
>>22:12:03.911811945 -0500
>>@@ -699,11 +699,6 @@
>>
>> tcp_minshall_update(tp, mss_now, skb);
>> sent_pkts = 1;
>>- /*
>>- * Break out early - we'll continue later:
>>- */
>>- if (softirq_need_resched())
>>- break;
>
>
> hm, ok, i'll revert this in my tree. I suspect we'll see some latencies
> resurfacing under high network load again, but correctness goes first
> obviously. If then we'll have to find some other method to break that
> critical path.
>
> Ingo
>
Ingo,
Maybe this wasn't the right way to fix the problem? I just looked at the
S4 patch and it had the same change in it, but did not exhibit the same
problem. Not knowing exactly what I was looking for, I just started
looking for obvious changes that might affect dropping tcp connections
and this one seemed reasonable. I made the change and the problem went
away. Maybe this needs looking at a little closer.
kr
* K.R. Foley <[email protected]> wrote:
> Maybe this wasn't the right way to fix the problem? I just looked at
> the S4 patch and it had the same change in it, but did not exhibit the
> same problem. Not knowing exactly what I was looking for, I just
> started looking for obvious changes that might affect dropping tcp
> connections and this one seemed reasonable. I made the change and the
> problem went away. Maybe this needs looking at a little closer.
S4 had other problems with softirq processing so i'd not be surprised if
that magically fixed the problem introduced by this change.
Ingo
Hi,
My laptop is a Compaq Presario 2516EA, loaded with Mandrake 10.0 Official,
which had been working fine up to 2.6.9-rc2. With -mm1 and -mm2, however,
the USB subsystem started failing miserably, due to errors while
(re)starting the ohci_hcd stuff.
By lurking on some recent threads on lkml, and following some other
suggestions about this very problem, I've tried every combination on the
boot prompt: pci=routeirq, acpi=off, softirq-preempt=0, hardirq-preempt=0.
This laptop also has the USB legacy BIOS mode switch. Tried that too.
All combinations resulted in the same failure, without exception. Here
are the relevant dmesg lines for the ohci_hcd failure (note: USB verbose
debug messages are turned on):
[...]
usbcore: registered new driver usbfs
usbcore: registered new driver hub
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ohci_hcd: block sizes: ed 64 td 64
ACPI: PCI Interrupt Link [LNK8] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:02.0: OHCI Host Controller
ohci_hcd 0000:00:02.0: USB HC TakeOver from BIOS/SMM
requesting new irq thread for IRQ10...
ohci_hcd 0000:00:02.0: irq 10, pci mem 0xd4000000
ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:02.0: resetting from state 'reset', control = 0x0
ohci_hcd 0000:00:02.0: init err (00002edf 0000)
ohci_hcd 0000:00:02.0: can't start
ohci_hcd 0000:00:02.0: stop reset controller (state 0x00)
ohci_hcd 0000:00:02.0: OHCI controller state
ohci_hcd 0000:00:02.0: OHCI 1.0, with legacy support registers
ohci_hcd 0000:00:02.0: control 0x0c0 HCFS=suspend CBSR=0
ohci_hcd 0000:00:02.0: cmdstatus 0x00000 SOC=0
ohci_hcd 0000:00:02.0: intrstatus 0x00000000
ohci_hcd 0000:00:02.0: intrenable 0x00000000
ohci_hcd 0000:00:02.0: hcca frame #0000
ohci_hcd 0000:00:02.0: roothub.a 03000003 POTPGT=3 NDP=3
ohci_hcd 0000:00:02.0: roothub.b 00000000 PPCM=0000 DR=0000
ohci_hcd 0000:00:02.0: roothub.status 00000000
ohci_hcd 0000:00:02.0: roothub.portstatus [0] 0x00000000
ohci_hcd 0000:00:02.0: roothub.portstatus [1] 0x00000000
ohci_hcd 0000:00:02.0: roothub.portstatus [2] 0x00000000
ohci_hcd 0000:00:02.0: init error -75
ohci_hcd 0000:00:02.0: remove, state 0
ohci_hcd 0000:00:02.0: roothub graceful disconnect
usb_disconnect nodev
ohci_hcd 0000:00:02.0: stop reset controller (state 0x00)
ohci_hcd 0000:00:02.0: OHCI controller state
ohci_hcd 0000:00:02.0: OHCI 1.0, with legacy support registers
ohci_hcd 0000:00:02.0: control 0x000 HCFS=reset CBSR=0
ohci_hcd 0000:00:02.0: cmdstatus 0x00000 SOC=0
ohci_hcd 0000:00:02.0: intrstatus 0x00000000
ohci_hcd 0000:00:02.0: intrenable 0x00000000
ohci_hcd 0000:00:02.0: roothub.a 03000003 POTPGT=3 NDP=3
ohci_hcd 0000:00:02.0: roothub.b 00000000 PPCM=0000 DR=0000
ohci_hcd 0000:00:02.0: roothub.status 00000000
ohci_hcd 0000:00:02.0: roothub.portstatus [0] 0x00000000
ohci_hcd 0000:00:02.0: roothub.portstatus [1] 0x00000000
ohci_hcd 0000:00:02.0: roothub.portstatus [2] 0x00000000
ohci_hcd 0000:00:02.0: USB bus 1 deregistered
ohci_hcd: probe of 0000:00:02.0 failed with error -75
ACPI: PCI Interrupt Link [LNK4] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:0f.0: OHCI Host Controller
ohci_hcd 0000:00:0f.0: USB HC TakeOver from BIOS/SMM
ohci_hcd 0000:00:0f.0: irq 10, pci mem 0xd4009000
ohci_hcd 0000:00:0f.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:0f.0: resetting from state 'reset', control = 0x0
ohci_hcd 0000:00:0f.0: init err (00002edf 0000)
ohci_hcd 0000:00:0f.0: can't start
ohci_hcd 0000:00:0f.0: stop reset controller (state 0x00)
ohci_hcd 0000:00:0f.0: OHCI controller state
ohci_hcd 0000:00:0f.0: OHCI 1.0, with legacy support registers
ohci_hcd 0000:00:0f.0: control 0x0c0 HCFS=suspend CBSR=0
ohci_hcd 0000:00:0f.0: cmdstatus 0x00000 SOC=0
ohci_hcd 0000:00:0f.0: intrstatus 0x00000000
ohci_hcd 0000:00:0f.0: intrenable 0x00000000
ohci_hcd 0000:00:0f.0: hcca frame #0000
ohci_hcd 0000:00:0f.0: roothub.a 03000003 POTPGT=3 NDP=3
ohci_hcd 0000:00:0f.0: roothub.b 00000000 PPCM=0000 DR=0000
ohci_hcd 0000:00:0f.0: roothub.status 00000000
ohci_hcd 0000:00:0f.0: roothub.portstatus [0] 0x00000000
ohci_hcd 0000:00:0f.0: roothub.portstatus [1] 0x00000000
ohci_hcd 0000:00:0f.0: roothub.portstatus [2] 0x00000000
ohci_hcd 0000:00:0f.0: init error -75
ohci_hcd 0000:00:0f.0: remove, state 0
ohci_hcd 0000:00:0f.0: roothub graceful disconnect
usb_disconnect nodev
ohci_hcd 0000:00:0f.0: stop reset controller (state 0x00)
ohci_hcd 0000:00:0f.0: OHCI controller state
ohci_hcd 0000:00:0f.0: OHCI 1.0, with legacy support registers
ohci_hcd 0000:00:0f.0: control 0x000 HCFS=reset CBSR=0
ohci_hcd 0000:00:0f.0: cmdstatus 0x00000 SOC=0
ohci_hcd 0000:00:0f.0: intrstatus 0x00000000
ohci_hcd 0000:00:0f.0: intrenable 0x00000000
ohci_hcd 0000:00:0f.0: roothub.a 03000003 POTPGT=3 NDP=3
ohci_hcd 0000:00:0f.0: roothub.b 00000000 PPCM=0000 DR=0000
ohci_hcd 0000:00:0f.0: roothub.status 00000000
ohci_hcd 0000:00:0f.0: roothub.portstatus [0] 0x00000000
ohci_hcd 0000:00:0f.0: roothub.portstatus [1] 0x00000000
ohci_hcd 0000:00:0f.0: roothub.portstatus [2] 0x00000000
ohci_hcd 0000:00:0f.0: USB bus 1 deregistered
ohci_hcd: probe of 0000:00:0f.0 failed with error -75
[...]
Then, following a quiet suggestion from David Brownell, I hacked
ohci-hcd.c to force the OHCI_QUIRK_INITRESET behaviour, built a new kernel
and modules and voila, USB is fully functional again.
Here goes the complete dmesg output after the new kernel boot, where
everything seems to work as it should (note: I'm using Ingo Molnar's
voluntary-preempt-2.6.9-rc2-mm3-S6 patch):
[...]
usbcore: registered new driver usbfs
usbcore: registered new driver hub
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ohci_hcd: block sizes: ed 64 td 64
ACPI: PCI Interrupt Link [LNK8] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:02.0: OHCI Host Controller
ohci_hcd 0000:00:02.0: USB HC TakeOver from BIOS/SMM
requesting new irq thread for IRQ10...
ohci_hcd 0000:00:02.0: irq 10, pci mem 0xd4000000
ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:02.0: resetting from state 'reset', control = 0x0
ohci_hcd 0000:00:02.0: OHCI_QUIRK_INITRESET forced!
ohci_hcd 0000:00:02.0: OHCI controller state
ohci_hcd 0000:00:02.0: OHCI 1.0, with legacy support registers
ohci_hcd 0000:00:02.0: control 0x083 HCFS=operational CBSR=3
ohci_hcd 0000:00:02.0: cmdstatus 0x00000 SOC=0
ohci_hcd 0000:00:02.0: intrstatus 0x00000004 SF
ohci_hcd 0000:00:02.0: intrenable 0x8000000a MIE RD WDH
ohci_hcd 0000:00:02.0: hcca frame #0007
ohci_hcd 0000:00:02.0: roothub.a 03000203 POTPGT=3 NPS NDP=3
ohci_hcd 0000:00:02.0: roothub.b 00000000 PPCM=0000 DR=0000
ohci_hcd 0000:00:02.0: roothub.status 00008000 DRWE
ohci_hcd 0000:00:02.0: roothub.portstatus [0] 0x00000100 PPS
ohci_hcd 0000:00:02.0: roothub.portstatus [1] 0x00000100 PPS
ohci_hcd 0000:00:02.0: roothub.portstatus [2] 0x00000100 PPS
usb usb1: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb1: default language 0x0409
usb usb1: Product: OHCI Host Controller
usb usb1: Manufacturer: Linux 2.6.9-rc2-mm3-S6.1 ohci_hcd
usb usb1: SerialNumber: 0000:00:02.0
usb usb1: hotplug
usb usb1: adding 1-0:1.0 (config #1, interface 0)
usb 1-0:1.0: hotplug
hub 1-0:1.0: usb_probe_interface
hub 1-0:1.0: usb_probe_interface - got id
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
hub 1-0:1.0: standalone hub
hub 1-0:1.0: no power switching (usb 1.0)
hub 1-0:1.0: global over-current protection
hub 1-0:1.0: power on to power good time: 6ms
hub 1-0:1.0: local power source is good
hub 1-0:1.0: no over-current condition exists
ohci_hcd 0000:00:02.0: created debug files
ACPI: PCI Interrupt Link [LNK4] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:0f.0: OHCI Host Controller
ohci_hcd 0000:00:0f.0: USB HC TakeOver from BIOS/SMM
ohci_hcd 0000:00:0f.0: irq 10, pci mem 0xd4009000
ohci_hcd 0000:00:0f.0: new USB bus registered, assigned bus number 2
ohci_hcd 0000:00:0f.0: resetting from state 'reset', control = 0x0
ohci_hcd 0000:00:02.0: suspend root hub
ohci_hcd 0000:00:0f.0: OHCI_QUIRK_INITRESET forced!
ohci_hcd 0000:00:0f.0: OHCI controller state
ohci_hcd 0000:00:0f.0: OHCI 1.0, with legacy support registers
ohci_hcd 0000:00:0f.0: control 0x083 HCFS=operational CBSR=3
ohci_hcd 0000:00:0f.0: cmdstatus 0x00000 SOC=0
ohci_hcd 0000:00:0f.0: intrstatus 0x00000004 SF
ohci_hcd 0000:00:0f.0: intrenable 0x8000000a MIE RD WDH
ohci_hcd 0000:00:0f.0: hcca frame #0007
ohci_hcd 0000:00:0f.0: roothub.a 03000203 POTPGT=3 NPS NDP=3
ohci_hcd 0000:00:0f.0: roothub.b 00000000 PPCM=0000 DR=0000
ohci_hcd 0000:00:0f.0: roothub.status 00008000 DRWE
ohci_hcd 0000:00:0f.0: roothub.portstatus [0] 0x00000100 PPS
ohci_hcd 0000:00:0f.0: roothub.portstatus [1] 0x00000100 PPS
ohci_hcd 0000:00:0f.0: roothub.portstatus [2] 0x00000100 PPS
usb usb2: new device strings: Mfr=3, Product=2, SerialNumber=1
usb usb2: default language 0x0409
usb usb2: Product: OHCI Host Controller
usb usb2: Manufacturer: Linux 2.6.9-rc2-mm3-S6.1 ohci_hcd
usb usb2: SerialNumber: 0000:00:0f.0
usb usb2: hotplug
usb usb2: adding 2-0:1.0 (config #1, interface 0)
usb 2-0:1.0: hotplug
hub 2-0:1.0: usb_probe_interface
hub 2-0:1.0: usb_probe_interface - got id
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
hub 2-0:1.0: standalone hub
hub 2-0:1.0: no power switching (usb 1.0)
hub 2-0:1.0: global over-current protection
hub 2-0:1.0: power on to power good time: 6ms
hub 2-0:1.0: local power source is good
hub 2-0:1.0: no over-current condition exists
ohci_hcd 0000:00:0f.0: created debug files
ohci_hcd 0000:00:0f.0: suspend root hub
[...]
OK. And yes, I know this is a very dirty hack (see attached patch) that
most probably only serves me and this specific hardware configuration.
I didn't find any way to set ohci->flags |= OHCI_QUIRK_INITRESET other
than hardcoding it. Sorry.
So I'll ask: could this flag be set through an ohci_hcd module parameter,
or some other way?
Please forgive my lack of knowledge in this matter. It's probably just
laziness :)
Cheers and thanks anyway.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Then, in a silent suggestion from David Brownell, I hacked ohci_hcd.c,
> forcing the OHCI_QUIRK_INITRESET flag behaviour, build a new kernel and
> modules and voila', all USB is back to functionality.
> diff -duPNr linux.0/drivers/usb/host/ohci-hcd.c linux.1/drivers/usb/host/ohci-hcd.c
> --- linux.0/drivers/usb/host/ohci-hcd.c 2004-09-24 11:07:00.982690336 +0100
> +++ linux.1/drivers/usb/host/ohci-hcd.c 2004-09-24 11:19:06.232435616 +0100
> @@ -564,11 +564,12 @@
> * (SiS, OPTi ...), so reset again instead. SiS doesn't need
> * this if we write fmInterval after we're OPERATIONAL.
> */
> - if (ohci->flags & OHCI_QUIRK_INITRESET) {
> + ohci_dbg(ohci, "OHCI_QUIRK_INITRESET forced!\n");
> +/* if (ohci->flags & OHCI_QUIRK_INITRESET) { */
> writel (ohci->hc_control, &ohci->regs->control);
> // flush those writes
> (void) ohci_readl (&ohci->regs->control);
> - }
> +/* } */
> writel (ohci->fminterval, &ohci->regs->fminterval);
it would be cleaner to make this dependent on your chipset/vendor-id -
look how OHCI_QUIRK_INITRESET gets activated for e.g. SiS
(PCI_VENDOR_ID_SI) and OPTi (PCI_VENDOR_ID_OPTI). What is your box's
pdev->vendor and pdev->device? (lspci -v) (If it's indeed a quirk that
is needed, not some other fix.)
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> Then, in a silent suggestion from David Brownell, I hacked ohci_hcd.c,
>> forcing the OHCI_QUIRK_INITRESET flag behaviour, build a new kernel and
>> modules and voila', all USB is back to functionality.
>
>
> it would be cleaner to make this dependent on your chipset/vendor-id -
> look how OHCI_QUIRK_INITRESET gets activated for e.g. SiS
> (PCI_VENDOR_ID_SI) and OPTi (PCI_VENDOR_ID_OPTI). What is your box's
> pdev->vendor and pdev->device? (lspci -v) (If it's indeed a quirk that
> is needed, not some other fix.)
>
Here it is:
(lspci -n)
00:02.0 Class 0c03: 10b9:5237 (rev 03)
(lspci -v)
00:02.0 USB Controller: ALi Corporation USB 1.1 Controller (rev 03)
(prog-if 10 [OHCI])
Subsystem: Hewlett-Packard Company: Unknown device 0850
Flags: bus master, medium devsel, latency 64, IRQ 10
Memory at d4000000 (32-bit, non-prefetchable) [size=4K]
Capabilities: [60] Power Management version 2
So I made it strictly specific to this hardware vendor PCI_VENDOR_ID_AL
(0x10b9) and device PCI_DEVICE_ID_AL_M5237 (0x5237).
The attached patch now applies to ohci-pci.c and it just works for me (tm) :)
Thanks for the suggestion Ingo.
I feel better now.
--
rncbc aka Rui Nuno Capela
[email protected]
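A minimal user-space sketch of the ID-keyed quirk idea discussed above, for
illustration only. It is not the attached patch and not the ohci_hcd code:
the struct, the table, the lookup function and the flag value are all
hypothetical. Only the vendor/device pair 10b9:5237 comes from the lspci -n
output in this thread.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag value, chosen only for this illustration. */
#define EXAMPLE_QUIRK_INITRESET 0x04u

/* One entry per controller that needs special handling. */
struct example_pci_quirk {
    uint16_t vendor;
    uint16_t device;
    unsigned int flags;
};

static const struct example_pci_quirk example_quirks[] = {
    /* ALi M5237, IDs as reported by lspci -n in this thread */
    { 0x10b9, 0x5237, EXAMPLE_QUIRK_INITRESET },
};

/* Return the OR of all quirk flags whose IDs match, or 0 if none match. */
unsigned int example_lookup_quirks(uint16_t vendor, uint16_t device)
{
    unsigned int flags = 0;
    size_t i;

    for (i = 0; i < sizeof(example_quirks) / sizeof(example_quirks[0]); i++) {
        if (example_quirks[i].vendor == vendor &&
            example_quirks[i].device == device)
            flags |= example_quirks[i].flags;
    }
    return flags;
}
```

With this shape, the reset path stays conditional on the flag and only the
table grows when another controller turns out to need the same treatment.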
My box has:
0000:00:0f.2 USB Controller: ServerWorks OSB4/CSB5 OHCI USB Controller (rev 05) (prog-if 10 [OHCI])
Subsystem: ServerWorks OSB4/CSB5 OHCI USB Controller
Flags: bus master, medium devsel, latency 64, IRQ 10
Memory at f5e70000 (32-bit, non-prefetchable)
0000:00:0f.2 Class 0c03: 1166:0220 (rev 05)
The attached patch (which applies on top of Rui's patch for
ALI M5237) fixes the problem for my DL360. Here's the relevant
output:
ohci_hcd 0000:00:0f.2: ServerWorks OSB4/CSB5 OHCI USB Controller
ohci_hcd 0000:00:0f.2: irq 10, pci mem 0xf5e70000
ohci_hcd 0000:00:0f.2: new USB bus registered, assigned bus number 1
ohci_hcd 0000:00:0f.2: Serverworks OSB4/CSB5 init quirk
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 4 ports detected
USB Universal Host Controller Interface driver v2.2
Thanks for chasing this down!
On Wed, 2004-09-08 at 04:31, Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
>
> > * Rui Nuno Capela <[email protected]> wrote:
> >
> > > OK, could just someone with a P4 HT/SMP box hand me their working
> > > kernel .config file for me to try? That could be a good starting
> > > point, if not a plain baseline.
> >
> > I'll try the latest VP kernel (-R9) on a P4/HT SMP box in a minute and
> > will send you a .config if it works. [...]
>
> P4/HT SMP works fine here - config attached.
>
> since your lockups occur under X, could you try to disable DRI/DRM in
> your XConfig? Also, would it be possible to connect that box to another
> box via a serial line and enable the kernel's serial console via the
> 'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
> other box, set the serial line to 38400 baud there too and capture all
> kernel messages that occur when the lockups happens? Also enable the NMI
> watchdog via nmi_watchdog=1.
Rui brought up an interesting point on the alsa list. Is this procedure
possible at all on a new laptop without a legacy serial port?
Lee
* Lee Revell <[email protected]> wrote:
> > since your lockups occur under X, could you try to disable DRI/DRM in
> > your XConfig? Also, would it be possible to connect that box to another
> > box via a serial line and enable the kernel's serial console via the
> > 'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
> > other box, set the serial line to 38400 baud there too and capture all
> > kernel messages that occur when the lockups happens? Also enable the NMI
> > watchdog via nmi_watchdog=1.
>
> Rui brought up an interesting point on the alsa list. Is this
> procedure possible at all on a new laptop without a legacy serial
> port?
well, netconsole should work.
Ingo
On Sat, 2004-09-25 at 16:38, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > > since your lockups occur under X, could you try to disable DRI/DRM in
> > > your XConfig? Also, would it be possible to connect that box to another
> > > box via a serial line and enable the kernel's serial console via the
> > > 'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
> > > other box, set the serial line to 38400 baud there too and capture all
> > > kernel messages that occur when the lockups happens? Also enable the NMI
> > > watchdog via nmi_watchdog=1.
> >
> > Rui brought up an interesting point on the alsa list. Is this
> > procedure possible at all on a new laptop without a legacy serial
> > port?
>
> well, netconsole should work.
>
OK just to save everyone a google search here is the procedure:
http://technocrat.net/article.pl?sid=04/08/14/0236245&mode=thread
Lee
On Saturday 25 September 2004 22:40, Lee Revell wrote:
> On Sat, 2004-09-25 at 16:38, Ingo Molnar wrote:
> > * Lee Revell <[email protected]> wrote:
> >
> > > > since your lockups occur under X, could you try to disable DRI/DRM in
> > > > your XConfig? Also, would it be possible to connect that box to another
> > > > box via a serial line and enable the kernel's serial console via the
> > > > 'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
> > > > other box, set the serial line to 38400 baud there too and capture all
> > > > kernel messages that occur when the lockups happens? Also enable the NMI
> > > > watchdog via nmi_watchdog=1.
> > >
> > > Rui brought up an interesting point on the alsa list. Is this
> > > procedure possible at all on a new laptop without a legacy serial
> > > port?
> >
> > well, netconsole should work.
> >
>
> OK just to save everyone a google search here is the procedure:
>
> http://technocrat.net/article.pl?sid=04/08/14/0236245&mode=thread
You may need this patch:
http://linux.bkbits.net:8080/linux-2.5/cset%4041470590n9GHsFJI2h0NeYTRXiyWMQ?nav=index.html|ChangeSet@-8w
Ciao,
Duncan.
On Friday 24 September 2004 9:16 am, Bjorn Helgaas wrote:
>
> The attached patch (which applies on top of Rui's patch for
> ALI M5237) fixes the problem for my DL360.
Hmm, I'd rather avoid needing a quirk table ... especially
when I've always suspected this is some subtle bug in the
way Linux initializes! Does this patch behave too?
- Dave
On Sat, 2004-09-25 at 16:50, Duncan Sands wrote:
> On Saturday 25 September 2004 22:40, Lee Revell wrote:
> > On Sat, 2004-09-25 at 16:38, Ingo Molnar wrote:
> > > * Lee Revell <[email protected]> wrote:
> > >
> > > > > since your lockups occur under X, could you try to disable DRI/DRM in
> > > > > your XConfig? Also, would it be possible to connect that box to another
> > > > > box via a serial line and enable the kernel's serial console via the
> > > > > 'console=ttyS0,38400 console=tty' boot option and run 'minicom' on that
> > > > > other box, set the serial line to 38400 baud there too and capture all
> > > > > kernel messages that occur when the lockups happens? Also enable the NMI
> > > > > watchdog via nmi_watchdog=1.
> > > >
> > > > Rui brought up an interesting point on the alsa list. Is this
> > > > procedure possible at all on a new laptop without a legacy serial
> > > > port?
> > >
> > > well, netconsole should work.
> > >
> >
> > OK just to save everyone a google search here is the procedure:
> >
> > http://technocrat.net/article.pl?sid=04/08/14/0236245&mode=thread
>
> You may need this patch:
>
> http://linux.bkbits.net:8080/linux-2.5/cset%4041470590n9GHsFJI2h0NeYTRXiyWMQ?nav=index.html|ChangeSet@-8w
>
You know, this actually seems like an easier process than the serial
console. Is there any good reason this isn't the "recommended" way to
diagnose lockups? Unless they used to work at a telco ;-), most users
are more likely to have an ethernet crossover cable handy than a serial
cable.
Here's the netconsole mini-mini HOWTO.
load the module like:
insmod netconsole \
netconsole=source-port@source-ip/net-interface,dest-port@dest-ip/MAC-address
for example:
modprobe netconsole \
[email protected]/eth0,[email protected]/00:0A:8A:05:3D:80
then connect the other machine and run:
netcat -u -l -p dest-port
to watch the output.
Lee
David Brownell wrote:
> Bjorn Helgaas wrote:
>>
>> The attached patch (which applies on top of Rui's patch for
>> ALI M5237) fixes the problem for my DL360.
>
> Hmm, I'd rather avoid needing a quirk table ... especially
> when I've always suspected this is some subtle bug in the
> way Linux initializes! Does this patch behave too?
>
The patch works on my ALI M5237, having ohci_hcd (re)starting properly. No
need for my previous specific patch.
Thanks.
--
rncbc aka Rui Nuno Capela
[email protected]
Lee Revell wrote:
>
> You know, this actually seems like an easier process than the serial
> console. Is there any good reason this isn't the "recommended" way to
> diagnose lockups? Unless they used to work at a telco ;-), most users
> are more likely to have an ethernet crossover cable handy than a serial
> cable.
>
> Here's the netconsole mini-mini HOWTO.
>
> load the module like:
> insmod netconsole \
> netconsole=source-port@source-ip/net-interface,dest-port@dest-ip/MAC-address
>
> for example:
> modprobe netconsole \
> [email protected]/eth0,[email protected]/00:0A:8A:05:3D:80
>
> then connect the other machine and run:
> netcat -u -l -p dest-port
>
> to watch the output.
>
Thanks Lee.
I have tested this netconsole stuff and it seems to work pretty well.
Already sent some results to alsa-devel bugtracker on the respective bug
entry.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
On Saturday 25 September 2004 5:37 pm, David Brownell wrote:
> On Friday 24 September 2004 9:16 am, Bjorn Helgaas wrote:
> >
> > The attached patch (which applies on top of Rui's patch for
> > ALI M5237) fixes the problem for my DL360.
>
> Hmm, I'd rather avoid needing a quirk table ... especially
> when I've always suspected this is some subtle bug in the
> way Linux initializes! Does this patch behave too?
Yes, your patch seems to work on my DL360 (ServerWorks OSB4/CSB5 OHCI)
as well.
i've released the -S7 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
this iteration is mainly a merge to -mm4. (-mm4 includes PREEMPT_BKL so
the -VP patch got smaller again - at least until -mm carries it.) The
patch undoes some more experimental scheduler patches that -mm includes.
The patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc2.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc2/2.6.9-rc2-mm4/2.6.9-rc2-mm4.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
Ingo
Ingo Molnar wrote:
>
> i've released the -S7 VP patch:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
>
Works here on SMP/SMT (P4/HT). However, I have a probably off-topic
complaint about -mm4 (both vanilla and VP):
My Wacom Graphire USB mouse wheel stopped working properly, at least under X.
Trying to scroll with the mouse wheel just causes flicker and the view
stays stuck in the same position.
Again, this was surely introduced in -mm4.
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
>
> My Wacom Graphire USB mouse wheel stopped to work properly, at least on
> X. Trying to scroll with the mouse wheel just causes flicker and the
> view stucks in the same position.
>
> Again this was surely introduced on -mm4.
>
Apparently this was solved by applying the kernel-specific stuff from
linuxwacom-0.6.4 (http://linuxwacom.sourceforge.net).
The curious thing is that the tablet had been working flawlessly on
2.6.9-rc2 kernels before -mm4 (either vanilla or VP).
Just to clear things up, for now.
Sorry for the noise :)
--
rncbc aka Rui Nuno Capela
[email protected]
This is another quirk I have found on -mm4: I have a couple of out-of-tree
modules, both related to webcams, that fail on modprobe with the same
missing kernel symbol:
w9968cf: Unknown symbol remap_page_range
spca50x: Unknown symbol remap_page_range
CU
--
rncbc aka Rui Nuno Capela
[email protected]
This is due in part to the patches Andrew merged that changed
remap_page_range to remap_pfn_range.
On Tuesday 28 September 2004 2:46 pm, Rui Nuno Capela wrote:
> This is another quirk on -mm4 I have found: I have a couple of outsider
> modules, both related to webcams, that fail on modprobe wrt same missing
> kernel symbol:
>
> w9968cf: Unknown symbol remap_page_range
> spca50x: Unknown symbol remap_page_range
>
> CU
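For out-of-tree modules hit by this rename, the relevant difference is that
remap_pfn_range takes a page frame number where remap_page_range took a
physical address. A minimal user-space sketch of that conversion follows; it
assumes 4 KiB pages (a shift of 12), and the helper name is hypothetical. A
real module would use the kernel's own PAGE_SHIFT rather than the constant
below.

```c
#include <stdint.h>

/* Assumption for this sketch: 4 KiB pages. */
#define EXAMPLE_PAGE_SHIFT 12

/*
 * Convert a physical address to a page frame number, which is the
 * form of the third argument that remap_pfn_range expects:
 *
 *   old: remap_page_range(vma, start, phys_addr, size, prot);
 *   new: remap_pfn_range(vma, start, phys_addr >> PAGE_SHIFT, size, prot);
 */
uint64_t example_phys_to_pfn(uint64_t phys_addr)
{
    return phys_addr >> EXAMPLE_PAGE_SHIFT;
}
```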
On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> i've released the -S7 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
>
I think I might be seeing the mysterious network problems that K. R.
Foley reported. The symptoms are that Evolution often fails to download
mail with various errors like "Interrupted system call", "Connection
reset by peer". I can't rule out a bug in Evolution, but this did not
seem to happen with 2.6.8.1.
These problems could also be related to the changes to the via-rhine
driver. ISTR that when the VP patches were based on 2.6.8.1, I applied
the patches from -mm4 affecting via-rhine, and that was when the problem
was introduced.
I will try backing these out and see if the problem goes away.
Lee
On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> i've released the -S7 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
Disabling latency tracing does not seem to work. To demonstrate:
echo 0 > /proc/sys/kernel/preempt_max_latency
echo 0 > /proc/sys/kernel/trace_enabled
modprobe foo-module (will reliably cause a ~3-600 usec latency in resolve_symbol)
check /proc/latency_trace, or dmesg, it will be the modprobe latency.
cat /proc/sys/kernel/trace_enabled, it is still 0
This definitely worked at one point. Not sure when it broke.
Lee
* Lee Revell <[email protected]> wrote:
> On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> > i've released the -S7 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
>
> Disabling latency tracing does not seem to work. To demonstrate:
>
> echo 0 > /proc/sys/kernel/preempt_max_latency
> echo 0 > /proc/sys/kernel/trace_enabled
> modprobe foo-module (will reliably cause a ~3-600 usec latency in resolve_symbol)
> check /proc/latency_trace, or dmesg, it will be the modprobe latency.
> cat /proc/sys/kernel/trace_enabled, it is still 0
>
> This definitely worked at one point. Not sure when it broke.
is it the full modprobe latency trace, or just the header? Putting zero
into trace_enabled won't disable the critical-section-timing code - it
only disables the function tracer. Since /proc/latency_trace takes the
header portion from the latency-timing code that might change. To
disable both do something like:
echo 100000000 > /proc/sys/kernel/preempt_max_latency
echo 0 > /proc/sys/kernel/trace_enabled
Ingo
On Wed, 2004-09-29 at 16:30, Ingo Molnar wrote:
> >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
> >
> > Disabling latency tracing does not seem to work. To demonstrate:
> is it the full modprobe latency trace, or just the header? Putting zero
> into trace_enabled wont disable the critical-section-timing code - it
> only disables the function tracer. Since /proc/latency_trace takes the
> header portion from the latency-timing code that might change. To
> disable both do something like:
>
> echo 100000000 > /proc/sys/kernel/preempt_max_latency
> echo 0 > /proc/sys/kernel/trace_enabled
OK, thanks for clarifying.
Lee
Hello,
The problem seems to affect more chipsets than suspected.
Is it possible that it is indeed not a quirk as David Brownell suspects?
I am getting this:
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI interrupt 0000:02:00.0[D] -> GSI 19 (level, low) -> IRQ 19
ohci_hcd 0000:02:00.0: Advanced Micro Devices [AMD] AMD-768 [Opus] USB
ohci_hcd 0000:02:00.0: irq 19, pci mem 0xd3000000
ohci_hcd 0000:02:00.0: new USB bus registered, assigned bus number 1
ohci_hcd 0000:02:00.0: init err (00002edf 0000)
ohci_hcd 0000:02:00.0: can't start
ohci_hcd 0000:02:00.0: init error -75
ohci_hcd 0000:02:00.0: remove, state 0
ohci_hcd 0000:02:00.0: USB bus 1 deregistered
ohci_hcd: probe of 0000:02:00.0 failed with error -75
the deviceinfo from lspci -n/-v:
0000:02:00.0 Class 0c03: 1022:7449 (rev 07)
0000:02:00.0 USB Controller: Advanced Micro Devices [AMD] AMD-768 [Opus] USB (rev 07) (prog-if 10 [OHCI])
Subsystem: Asustek Computer, Inc.: Unknown device 8044
Flags: medium devsel, IRQ 19
Memory at d3000000 (32-bit, non-prefetchable)
If wanted I will test with a USB debug enabled kernel as soon as I physically
get to my box.
It would be nice if you'd cc me on Reply as I'm not subscribed.
Greets,
Peter G.
--
"I do not think the way you think I think."
-- Kai, last of the Brunnen G
On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> i've released the -S7 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
>
OK, I have been busy with other things, so haven't been able to test as
much. There might be a few regressions with S7. Here is a trace from
the ext3 journaling code that I never saw before. It starts with some
printks from the rtc_interrupt, due to having the rtc-debug patch
installed, but these account for less than 100 usecs of the ~600 usec
latency.
http://krustophenia.net/testresults.php?dataset=2.6.9-rc2-mm4-S7
This part repeats many times:
00000001 0.127ms (+0.000ms): journal_refile_buffer (journal_commit_transaction)
00000003 0.127ms (+0.000ms): __journal_refile_buffer (journal_refile_buffer)
00000003 0.128ms (+0.000ms): __journal_unfile_buffer (journal_refile_buffer)
00000002 0.128ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000002 0.128ms (+0.000ms): journal_remove_journal_head (journal_refile_buffer)
00000003 0.129ms (+0.000ms): __journal_remove_journal_head (journal_remove_journal_head)
00000003 0.129ms (+0.000ms): __brelse (__journal_remove_journal_head)
00000003 0.130ms (+0.000ms): journal_free_journal_head (journal_remove_journal_head)
00000003 0.130ms (+0.000ms): kmem_cache_free (journal_free_journal_head)
00000002 0.130ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000001 0.131ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000001 0.131ms (+0.000ms): __brelse (journal_commit_transaction)
00000002 0.132ms (+0.000ms): kfree (journal_commit_transaction)
00000001 0.132ms (+0.000ms): preempt_schedule (journal_commit_transaction)
00000001 0.133ms (+0.000ms): journal_refile_buffer (journal_commit_transaction)
00000003 0.133ms (+0.000ms): __journal_refile_buffer (journal_refile_buffer)
00000003 0.133ms (+0.000ms): __journal_unfile_buffer (journal_refile_buffer)
00000002 0.134ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000002 0.134ms (+0.000ms): journal_remove_journal_head (journal_refile_buffer)
00000003 0.135ms (+0.000ms): __journal_remove_journal_head (journal_remove_journal_head)
00000003 0.135ms (+0.000ms): __brelse (__journal_remove_journal_head)
00000003 0.135ms (+0.000ms): journal_free_journal_head (journal_remove_journal_head)
00000003 0.136ms (+0.000ms): kmem_cache_free (journal_free_journal_head)
00000002 0.136ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000001 0.136ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000001 0.137ms (+0.000ms): __brelse (journal_commit_transaction)
00000002 0.137ms (+0.000ms): kfree (journal_commit_transaction)
00000001 0.138ms (+0.000ms): preempt_schedule (journal_commit_transaction)
00000001 0.138ms (+0.000ms): journal_refile_buffer (journal_commit_transaction)
00000003 0.139ms (+0.000ms): __journal_refile_buffer (journal_refile_buffer)
00000003 0.139ms (+0.000ms): __journal_unfile_buffer (journal_refile_buffer)
00000002 0.140ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000002 0.140ms (+0.000ms): journal_remove_journal_head (journal_refile_buffer)
00000003 0.141ms (+0.000ms): __journal_remove_journal_head (journal_remove_journal_head)
00000003 0.141ms (+0.000ms): __brelse (__journal_remove_journal_head)
00000003 0.141ms (+0.000ms): journal_free_journal_head (journal_remove_journal_head)
00000003 0.142ms (+0.000ms): kmem_cache_free (journal_free_journal_head)
00000002 0.142ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000001 0.142ms (+0.000ms): preempt_schedule (journal_refile_buffer)
00000001 0.143ms (+0.001ms): __brelse (journal_commit_transaction)
00000002 0.144ms (+0.000ms): kfree (journal_commit_transaction)
Lee
* Lee Revell <[email protected]> wrote:
> OK, I have been busy with other things, so haven't been able to test
> as much. There might be a few regressions with S7. Here is a trace
> from the ext3 journaling code that I never saw before. It starts with
> some printks from the rtc_interrupt, due to having the rtc-debug patch
> installed, but these account for less than 100 usecs of the ~600 usec
> latency.
>
> http://krustophenia.net/testresults.php?dataset=2.6.9-rc2-mm4-S7
>
> This part repeats many times:
>
> 00000001 0.127ms (+0.000ms): journal_refile_buffer (journal_commit_transaction)
ok, will take another look at this one, it seems we don't break out of
the loop early enough.
Ingo
On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> i've released the -S7 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
>
I believe we have found a bug. A user reported massive xruns with S7;
they turned out to be printk() overhead from tons of "using
smp_processor_id() in preemptible code" errors. The trace below repeats
over and over. Looks like raid0 is the problem.
using smp_processor_id() in preemptible code: kjournald/685
[<c011892a>] smp_processor_id+0x8a/0xa0
[<c0273f67>] raid0_make_request+0x37/0x240
[<c02e1bde>] cond_resched+0xe/0x80
[<c0111608>] mcount+0x14/0x18
[<c020f4a7>] generic_make_request+0x117/0x1a0
[<c01621e6>] submit_bh+0xe6/0x150
[<c020f544>] submit_bio+0x14/0x120
[<c0111608>] mcount+0x14/0x18
[<c020f5a2>] submit_bio+0x72/0x120
[<c0133050>] autoremove_wake_function+0x0/0x60
[<c0111608>] mcount+0x14/0x18
[<c0162a6a>] bio_alloc+0xea/0x1d0
[<c01621e6>] submit_bh+0xe6/0x150
[<c01b022b>] journal_commit_transaction+0x102b/0x1680
[<c02e27b3>] _spin_unlock+0x13/0x40
[<c0136914>] _metered_spin_unlock+0x14/0xa0
[<c0117c51>] find_busiest_group+0xf1/0x310
[<c01de306>] find_next_bit+0x16/0xa0
[<c011815e>] load_balance_newidle+0x8e/0xc0
[<c01178ee>] move_tasks+0xe/0x280
[<c02e27b3>] _spin_unlock+0x13/0x40
[<c0136914>] _metered_spin_unlock+0x14/0xa0
[<c0135059>] sub_preempt_count+0x69/0x80
[<c0135059>] sub_preempt_count+0x69/0x80
[<c01263bf>] del_timer_sync+0xaf/0x160
[<c01de306>] find_next_bit+0x16/0xa0
[<c0111608>] mcount+0x14/0x18
[<c01de306>] find_next_bit+0x16/0xa0
[<c01263bf>] del_timer_sync+0xaf/0x160
[<c0111608>] mcount+0x14/0x18
[<c02e27bf>] _spin_unlock+0x1f/0x40
[<c01b31f3>] kjournald+0xf3/0x270
[<c0133050>] autoremove_wake_function+0x0/0x60
[<c02e2b20>] _spin_unlock_irq+0x20/0x40
[<c0133050>] autoremove_wake_function+0x0/0x60
[<c0117223>] schedule_tail+0x23/0x70
[<c01b30d0>] commit_timeout+0x0/0x20
[<c01b3100>] kjournald+0x0/0x270
[<c0102779>] kernel_thread_helper+0x5/0xc
printk: 4203 messages suppressed.
using smp_processor_id() in preemptible code: kjournald/685
[<c011892a>] smp_processor_id+0x8a/0xa0
[<c0273f67>] raid0_make_request+0x37/0x240
etc.
I have the .config if you need more info. All PREEMPT_ options are
enabled.
Joel, you reported the same xruns with SCSI and IDE. Are you running
raid0 in both cases? Also what is your hardware configuration? Is this
regular SMP, or HT?
Lee
On Sat, 2004-10-02 at 22:01, Lee Revell wrote:
> On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> > i've released the -S7 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
> >
>
> I believe we have found a bug. A user reported massive xruns with S7,
> they turned out to be printk() overhead from tons of "using
> smp_processor_id() in preemptible code" errors. The trace below repeats
> over and over. Looks like raid0 is the problem.
The exact configuration is 4 SCSI drives in a raid 0, and a single IDE
drive. The raid0 code apparently is not preempt safe.
Lee
On Sat, 2004-10-02 at 22:14, Lee Revell wrote:
> On Sat, 2004-10-02 at 22:01, Lee Revell wrote:
> > On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> > > i've released the -S7 VP patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
> > >
> >
> > I believe we have found a bug. A user reported massive xruns with S7,
> > they turned out to be printk() overhead from tons of "using
> > smp_processor_id() in preemptible code" errors. The trace below repeats
> > over and over. Looks like raid0 is the problem.
>
> The exact configuration is 4 SCSI drives in a raid 0, and a single IDE
> drive. The raid0 code apparently is not preempt safe.
s/preempt safe/preempt+SMP safe/. System is a dual PIII-600.
Lee
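The warning in the trace above comes down to reading the current CPU number
while preemption is still enabled. The following user-space simulation
sketches why the check fires and how bracketing the access avoids it. It is
not kernel code: every name below is a stand-in invented for this
illustration, loosely modelled on the kernel's get_cpu()/put_cpu() pairing.

```c
/* Simulated state; in the kernel these live in per-task/per-CPU data. */
int sim_preempt_count;              /* > 0 means preemption is disabled */
int sim_current_cpu;                /* CPU the simulated task runs on */
int sim_warnings;                   /* number of debug warnings raised */
unsigned long sim_sectors[2];       /* per-CPU statistics counters */

/* Debug variant: complains when called with preemption enabled. */
int sim_smp_processor_id(void)
{
    if (sim_preempt_count == 0)
        sim_warnings++;             /* "using smp_processor_id() in preemptible code" */
    return sim_current_cpu;
}

/* Disable preemption, then read the CPU number. */
int sim_get_cpu(void)
{
    sim_preempt_count++;
    return sim_smp_processor_id();
}

/* Re-enable preemption. */
void sim_put_cpu(void)
{
    sim_preempt_count--;
}

/* Unsafe: the task could migrate between reading the CPU and the update. */
void sim_account_unsafe(unsigned long n)
{
    sim_sectors[sim_smp_processor_id()] += n;
}

/* Safe: the CPU number stays valid for the whole update. */
void sim_account_safe(unsigned long n)
{
    int cpu = sim_get_cpu();

    sim_sectors[cpu] += n;
    sim_put_cpu();
}
```

In this model only sim_account_unsafe() raises a warning; the safe variant
updates the same counter without tripping the check.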
On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> i've released the -S7 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
>
This one was caused by amlat:
preemption latency trace v1.0.7 on 2.6.9-rc2-mm4-VP-S7
-------------------------------------------------------
latency: 264 us, entries: 5 (5) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: amlat/3921, uid:1000 nice:0 policy:1 rt_prio:99
-----------------
=> started at: rtc_open+0x12/0x270
=> ended at: rtc_open+0x13f/0x270
=======>
00000001 0.000ms (+0.264ms): rtc_open (misc_open)
00000001 0.264ms (+0.000ms): sub_preempt_count (rtc_open)
00000001 0.265ms (+0.000ms): update_max_trace (check_preempt_timing)
00000001 0.265ms (+0.000ms): _mmx_memcpy (update_max_trace)
00000001 0.265ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
Lee
On Sun, 2004-10-03 at 02:37, Lee Revell wrote:
> On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> > i've released the -S7 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
> >
>
> This one was caused by amlat:
And here's another produced by "lsof /foo":
preemption latency trace v1.0.7 on 2.6.9-rc2-mm4-VP-S7
-------------------------------------------------------
latency: 399 us, entries: 608 (608) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: lsof/4347, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: unix_seq_start+0x10/0x50
=> ended at: rtc_interrupt+0x294/0x450
=======>
00000001 0.000ms (+0.000ms): unix_seq_start (seq_read)
00000001 0.000ms (+0.039ms): unix_seq_idx (unix_seq_start)
00000001 0.040ms (+0.000ms): unix_seq_show (seq_read)
00000002 0.040ms (+0.001ms): sock_i_ino (unix_seq_show)
00000002 0.041ms (+0.000ms): seq_printf (unix_seq_show)
00000002 0.042ms (+0.002ms): vsnprintf (seq_printf)
00000002 0.045ms (+0.003ms): number (vsnprintf)
00000002 0.048ms (+0.000ms): skip_atoi (vsnprintf)
00000002 0.049ms (+0.001ms): number (vsnprintf)
00000002 0.050ms (+0.000ms): skip_atoi (vsnprintf)
00000002 0.051ms (+0.000ms): number (vsnprintf)
00000002 0.052ms (+0.000ms): skip_atoi (vsnprintf)
00000002 0.052ms (+0.000ms): number (vsnprintf)
00000002 0.053ms (+0.000ms): skip_atoi (vsnprintf)
00000002 0.053ms (+0.000ms): number (vsnprintf)
00000002 0.054ms (+0.000ms): skip_atoi (vsnprintf)
00000002 0.055ms (+0.000ms): number (vsnprintf)
00000002 0.056ms (+0.000ms): skip_atoi (vsnprintf)
00000002 0.056ms (+0.001ms): number (vsnprintf)
00000002 0.058ms (+0.000ms): seq_putc (unix_seq_show)
00000002 0.059ms (+0.000ms): seq_putc (unix_seq_show)
00000002 0.059ms (+0.000ms): seq_putc (unix_seq_show)
00000002 0.059ms (+0.000ms): seq_putc (unix_seq_show)
00000002 0.060ms (+0.000ms): seq_putc (unix_seq_show)
00000002 0.060ms (+0.000ms): seq_putc (unix_seq_show)
...
Full trace:
http://krustophenia.net/testresults.php?dataset=2.6.9-rc2-mm4-S7#/var/www/2.6.9-rc2-mm4-S7/lsof-latency-trace.txt
Lee
On Sun, 2004-10-03 at 02:50, Lee Revell wrote:
> On Sun, 2004-10-03 at 02:37, Lee Revell wrote:
> > On Mon, 2004-09-27 at 20:05, Ingo Molnar wrote:
> > > i've released the -S7 VP patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc2-mm4-S7
> > >
> >
> > This one was caused by amlat:
>
> And here's another produced by "lsof /foo":
>
Finally, there's this one which makes no sense to me:
preemption latency trace v1.0.7 on 2.6.9-rc2-mm4-VP-S7
-------------------------------------------------------
latency: 507 us, entries: 45 (45) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: events/0/3, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: kernel_fpu_begin+0x15/0x70
=> ended at: _mmx_memcpy+0x13a/0x180
=======>
00000001 0.000ms (+0.002ms): kernel_fpu_begin (_mmx_memcpy)
00010001 0.002ms (+0.000ms): do_IRQ (_mmx_memcpy)
00010001 0.002ms (+0.000ms): do_IRQ (<00000000>)
[ timer interrupt stuff ]
00010001 0.024ms (+0.000ms): preempt_schedule (do_IRQ)
00000002 0.025ms (+0.000ms): do_softirq (do_IRQ)
00000002 0.025ms (+0.000ms): __do_softirq (do_softirq)
00000002 0.025ms (+0.000ms): wake_up_process (do_softirq)
00000002 0.026ms (+0.000ms): try_to_wake_up (wake_up_process)
00000002 0.026ms (+0.000ms): task_rq_lock (try_to_wake_up)
00000003 0.026ms (+0.000ms): activate_task (try_to_wake_up)
00000003 0.027ms (+0.000ms): sched_clock (activate_task)
00000003 0.027ms (+0.000ms): recalc_task_prio (activate_task)
00000003 0.028ms (+0.000ms): effective_prio (recalc_task_prio)
00000003 0.028ms (+0.000ms): enqueue_task (activate_task)
00000002 0.028ms (+0.478ms): preempt_schedule (try_to_wake_up)
00000001 0.507ms (+0.000ms): sub_preempt_count (_mmx_memcpy)
00000001 0.508ms (+0.000ms): update_max_trace (check_preempt_timing)
00000001 0.508ms (+0.000ms): _mmx_memcpy (update_max_trace)
00000001 0.508ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
478 usecs in try_to_wake_up?
Lee
* Lee Revell <[email protected]> wrote:
> Finally, there's this one which makes no sense to me:
>
> preemption latency trace v1.0.7 on 2.6.9-rc2-mm4-VP-S7
> -------------------------------------------------------
> latency: 507 us, entries: 45 (45) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
> -----------------
> | task: events/0/3, uid:0 nice:-10 policy:0 rt_prio:0
> -----------------
> => started at: kernel_fpu_begin+0x15/0x70
> => ended at: _mmx_memcpy+0x13a/0x180
> =======>
> 00000001 0.000ms (+0.002ms): kernel_fpu_begin (_mmx_memcpy)
> 00010001 0.002ms (+0.000ms): do_IRQ (_mmx_memcpy)
> 00000002 0.028ms (+0.478ms): preempt_schedule (try_to_wake_up)
> 00000001 0.507ms (+0.000ms): sub_preempt_count (_mmx_memcpy)
> 478 usecs in try_to_wake_up?
no, 478 usecs in _mmx_memcpy - the timer interrupt interrupted that
function and we returned to it after the timer irq. We only know that
the latency happens between try_to_wake_up()'s preempt_schedule() and
_mmx_memcpy. Latency tracing does not capture function exits (would be
too costly) so effects of two functions can add up. The try_to_wake_up()
was simply the last function _entry_ that happened in the timer irq,
that's why it got 'credited' for the latency.
do you still have the stacktrace too that went to the syslog? What
codepath called _mmx_memcpy()?
Ingo
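The attribution effect described above can be sketched in a few lines: when
only function entries are timestamped, the delta printed next to an entry is
simply the gap until the next entry, whatever code actually ran in between.
The struct and function below are hypothetical, not the tracer's own data
structures, and timestamps are assumed to be whole microseconds taken from
the rounded values printed in the trace (so a recomputed gap can differ by a
microsecond from the printed one).

```c
#include <stddef.h>

/* One recorded function entry. Exits are never recorded. */
struct example_trace_entry {
    unsigned int t_us;      /* entry timestamp in microseconds */
    const char *func;       /* name of the function that was entered */
};

/*
 * Return the index of the entry followed by the longest gap and store
 * that gap in *gap_us (if non-NULL). With fewer than two entries the
 * result is index 0 with a gap of 0.
 */
size_t example_largest_gap(const struct example_trace_entry *e, size_t n,
                           unsigned int *gap_us)
{
    size_t worst = 0;
    unsigned int max = 0;
    size_t i;

    for (i = 0; i + 1 < n; i++) {
        unsigned int d = e[i + 1].t_us - e[i].t_us;

        if (d > max) {
            max = d;
            worst = i;
        }
    }
    if (gap_us)
        *gap_us = max;
    return worst;
}
```

Fed the entries at 0, 2, 28 and 507 usecs from the trace above, this picks
the preempt_schedule entry at 28 usecs, even though the time was spent after
returning from the interrupt into the interrupted function.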
* Lee Revell <[email protected]> wrote:
> This one was caused by amlat:
>
> preemption latency trace v1.0.7 on 2.6.9-rc2-mm4-VP-S7
> -------------------------------------------------------
> latency: 264 us, entries: 5 (5) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
> -----------------
> | task: amlat/3921, uid:1000 nice:0 policy:1 rt_prio:99
> -----------------
> => started at: rtc_open+0x12/0x270
> => ended at: rtc_open+0x13f/0x270
> =======>
> 00000001 0.000ms (+0.264ms): rtc_open (misc_open)
> 00000001 0.264ms (+0.000ms): sub_preempt_count (rtc_open)
> 00000001 0.265ms (+0.000ms): update_max_trace (check_preempt_timing)
> 00000001 0.265ms (+0.000ms): _mmx_memcpy (update_max_trace)
> 00000001 0.265ms (+0.000ms): kernel_fpu_begin (_mmx_memcpy)
weird, there's nothing in that function that should cause _any_ delay.
Is this drivers/char/rtc.c's rtc_open()? Also, how reproducible is this
delay?
Ingo
* Lee Revell <[email protected]> wrote:
> I believe we have found a bug. A user reported massive xruns with S7,
> they turned out to be printk() overhead from tons of "using
> smp_processor_id() in preemptible code" errors. The trace below
> repeats over and over. Looks like raid0 is the problem.
the patch below should fix this issue - it's in rc3-mm1 already.
Ingo
* Reuben Farrelly <[email protected]> wrote:
> Since upgrading from -mm3 to -mm4, I'm now getting messages like this
> logged every second or so:
>
> Sep 27 18:28:06 tornado kernel: using smp_processor_id() in preemptible code: swapper/1
> Sep 27 18:28:06 tornado kernel: [<c0104dce>] dump_stack+0x17/0x19
> Sep 27 18:28:06 tornado kernel: [<c0117fc6>] smp_processor_id+0x80/0x86
> Sep 27 18:28:06 tornado kernel: [<c0282bf3>] make_request+0x174/0x2e7
> Sep 27 18:28:06 tornado kernel: [<c02073dd>] generic_make_request+0xda/0x190
this is the remove-bkl patch's debugging feature showing that there's
more preempt-unsafe disk statistics code in the RAID code.
i've attached a patch that introduces preempt and non-preempt versions
of the statistics code and updates the block code to use the appropriate
ones - does this fix all the smp_processor_id() warnings you get?
Ingo
--- linux/drivers/block/ll_rw_blk.c.orig
+++ linux/drivers/block/ll_rw_blk.c
@@ -2099,13 +2099,13 @@ void drive_stat_acct(struct request *rq,
return;
if (rw == READ) {
- disk_stat_add(rq->rq_disk, read_sectors, nr_sectors);
+ __disk_stat_add(rq->rq_disk, read_sectors, nr_sectors);
if (!new_io)
- disk_stat_inc(rq->rq_disk, read_merges);
+ __disk_stat_inc(rq->rq_disk, read_merges);
} else if (rw == WRITE) {
- disk_stat_add(rq->rq_disk, write_sectors, nr_sectors);
+ __disk_stat_add(rq->rq_disk, write_sectors, nr_sectors);
if (!new_io)
- disk_stat_inc(rq->rq_disk, write_merges);
+ __disk_stat_inc(rq->rq_disk, write_merges);
}
if (new_io) {
disk_round_stats(rq->rq_disk);
@@ -2151,12 +2151,12 @@ void disk_round_stats(struct gendisk *di
{
unsigned long now = jiffies;
- disk_stat_add(disk, time_in_queue,
+ __disk_stat_add(disk, time_in_queue,
disk->in_flight * (now - disk->stamp));
disk->stamp = now;
if (disk->in_flight)
- disk_stat_add(disk, io_ticks, (now - disk->stamp_idle));
+ __disk_stat_add(disk, io_ticks, (now - disk->stamp_idle));
disk->stamp_idle = now;
}
@@ -2957,12 +2957,12 @@ void end_that_request_last(struct reques
unsigned long duration = jiffies - req->start_time;
switch (rq_data_dir(req)) {
case WRITE:
- disk_stat_inc(disk, writes);
- disk_stat_add(disk, write_ticks, duration);
+ __disk_stat_inc(disk, writes);
+ __disk_stat_add(disk, write_ticks, duration);
break;
case READ:
- disk_stat_inc(disk, reads);
- disk_stat_add(disk, read_ticks, duration);
+ __disk_stat_inc(disk, reads);
+ __disk_stat_add(disk, read_ticks, duration);
break;
}
disk_round_stats(disk);
--- linux/include/linux/genhd.h.orig
+++ linux/include/linux/genhd.h
@@ -112,13 +112,14 @@ struct gendisk {
/*
* Macros to operate on percpu disk statistics:
- * Since writes to disk_stats are serialised through the queue_lock,
- * smp_processor_id() should be enough to get to the per_cpu versions
- * of statistics counters
+ *
+ * The __ variants should only be called in critical sections. The full
+ * variants disable/enable preemption.
*/
#ifdef CONFIG_SMP
-#define disk_stat_add(gendiskp, field, addnd) \
+#define __disk_stat_add(gendiskp, field, addnd) \
(per_cpu_ptr(gendiskp->dkstats, smp_processor_id())->field += addnd)
+
#define disk_stat_read(gendiskp, field) \
({ \
typeof(gendiskp->dkstats->field) res = 0; \
@@ -142,7 +143,8 @@ static inline void disk_stat_set_all(str
}
#else
-#define disk_stat_add(gendiskp, field, addnd) (gendiskp->dkstats.field += addnd)
+#define __disk_stat_add(gendiskp, field, addnd) \
+ (gendiskp->dkstats.field += addnd)
#define disk_stat_read(gendiskp, field) (gendiskp->dkstats.field)
static inline void disk_stat_set_all(struct gendisk *gendiskp, int value) {
@@ -150,8 +152,21 @@ static inline void disk_stat_set_all(str
}
#endif
-#define disk_stat_inc(gendiskp, field) disk_stat_add(gendiskp, field, 1)
+#define disk_stat_add(gendiskp, field, addnd) \
+ do { \
+ preempt_disable(); \
+ __disk_stat_add(gendiskp, field, addnd); \
+ preempt_enable(); \
+ } while (0)
+
+#define __disk_stat_dec(gendiskp, field) __disk_stat_add(gendiskp, field, -1)
#define disk_stat_dec(gendiskp, field) disk_stat_add(gendiskp, field, -1)
+
+#define __disk_stat_inc(gendiskp, field) __disk_stat_add(gendiskp, field, 1)
+#define disk_stat_inc(gendiskp, field) disk_stat_add(gendiskp, field, 1)
+
+#define __disk_stat_sub(gendiskp, field, subnd) \
+ __disk_stat_add(gendiskp, field, -subnd)
#define disk_stat_sub(gendiskp, field, subnd) \
disk_stat_add(gendiskp, field, -subnd)
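
The split between the two macro families in the patch can be modelled in userspace roughly as follows. This is only a sketch under stated assumptions: the *_model names, cpu_id_checked() and raw_stat_add() are hypothetical stand-ins (raw_stat_add plays the role of the __ variant), and the "debug check" is a plain counter rather than the kernel's real smp_processor_id() warning.

```c
#include <stdio.h>

#define NR_CPUS 2

struct disk_stats { unsigned long read_sectors; unsigned long reads; };
struct gendisk_model { struct disk_stats dkstats[NR_CPUS]; };

int preempt_count_model;   /* > 0 means preemption is disabled */
int current_cpu_model;     /* CPU the task is currently running on */
int unsafe_cpu_id_uses;    /* how often the debug check would have fired */

void preempt_disable_model(void) { preempt_count_model++; }
void preempt_enable_model(void)  { preempt_count_model--; }

/* Stand-in for a debugging smp_processor_id(): complain if preemptible. */
int cpu_id_checked(void)
{
    if (preempt_count_model == 0) {
        unsafe_cpu_id_uses++;
        fprintf(stderr, "using cpu id in preemptible code\n");
    }
    return current_cpu_model;
}

/* Raw variant: caller must already be in a non-preemptible section. */
#define raw_stat_add(d, field, n) \
    ((d)->dkstats[cpu_id_checked()].field += (n))

/* Full variant: brackets the update with preempt disable/enable. */
#define stat_add(d, field, n)            \
    do {                                 \
        preempt_disable_model();         \
        raw_stat_add(d, field, n);       \
        preempt_enable_model();          \
    } while (0)
```

With this model, stat_add(&d, read_sectors, 8) updates the counter silently, while raw_stat_add(&d, reads, 1) called outside a critical section bumps unsafe_cpu_id_uses, which is the situation the warnings in this thread point at.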
i've released the -S8 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm1-S8
this iteration is mainly a merge to -rc3-mm1. The -rc3-mm1 tree now
includes the generic-irq-subsystem patch which is a prerequisite of the
threaded-irqs feature in the -VP patch. As a result of this the -VP
patch got significantly smaller, down from 224K to 89K.
also part of the patch are further refinements of the preempt-bkl
feature and a couple of bugfixes, reported for the -mm tree but not
included in -rc3-mm1 yet. (All of these were sent to Andrew too so they
should show up in -mm2.)
to build an -S8 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc3.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm1/2.6.9-rc3-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm1-S8
Ingo
On Sun, 2004-10-03 at 15:57, Ingo Molnar wrote:
> do you still have the stacktrace too that went to the syslog? What
> codepath called _mmx_memcpy()?
>
Here is an almost identical one (it's even exactly 507 usecs!). This
and the one I sent previously were apparently caused by switching from X
to a text console and back.
Sep 2 16:13:49 krustophenia kernel: (events/0/3): new 507 us maximum-latency critical section.
Sep 2 16:13:49 krustophenia kernel: => started at: <kernel_fpu_begin+0x15/0x70>
Sep 2 16:13:49 krustophenia kernel: => ended at: <_mmx_memcpy+0x13a/0x180>
Sep 2 16:13:49 krustophenia kernel: [check_preempt_timing+259/464] check_preempt_timing+0x103/0x1d0
Sep 2 16:13:49 krustophenia kernel: [_mmx_memcpy+314/384] _mmx_memcpy+0x13a/0x180
Sep 2 16:13:49 krustophenia kernel: [sub_preempt_count+70/96] sub_preempt_count+0x46/0x60
Sep 2 16:13:49 krustophenia kernel: [sub_preempt_count+70/96] sub_preempt_count+0x46/0x60
Sep 2 16:13:49 krustophenia kernel: [_mmx_memcpy+314/384] _mmx_memcpy+0x13a/0x180
Sep 2 16:13:49 krustophenia kernel: [vgacon_save_screen+120/128] vgacon_save_screen+0x78/0x80
Sep 2 16:13:49 krustophenia kernel: [redraw_screen+411/560] redraw_screen+0x19b/0x230
Sep 2 16:13:49 krustophenia kernel: [complete_change_console+44/224] complete_change_console+0x2c/0xe0
Sep 2 16:13:49 krustophenia kernel: [console_callback+258/272] console_callback+0x102/0x110
Sep 2 16:13:49 krustophenia kernel: [worker_thread+422/624] worker_thread+0x1a6/0x270
Sep 2 16:13:49 krustophenia kernel: [console_callback+0/272] console_callback+0x0/0x110
Sep 2 16:13:49 krustophenia kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
Sep 2 16:13:49 krustophenia kernel: [schedule+718/1360] schedule+0x2ce/0x550
Sep 2 16:13:49 krustophenia kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
Sep 2 16:13:49 krustophenia kernel: [schedule+718/1360] schedule+0x2ce/0x550
Sep 2 16:13:49 krustophenia kernel: [kthread+180/192] kthread+0xb4/0xc0
Sep 2 16:13:49 krustophenia kernel: [worker_thread+0/624] worker_thread+0x0/0x270
Sep 2 16:13:49 krustophenia kernel: [kthread+0/192] kthread+0x0/0xc0
Sep 2 16:13:49 krustophenia kernel: [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Lee
* Lee Revell <[email protected]> wrote:
> Here is an almost identical one (it's even exactly 507 usecs!). This
> and the one I sent previously were apparently caused by switching from
> X to a text console and back.
ah, it's the VGA console:
> Sep 2 16:13:49 krustophenia kernel: [_mmx_memcpy+314/384] _mmx_memcpy+0x13a/0x180
> Sep 2 16:13:49 krustophenia kernel: [vgacon_save_screen+120/128] vgacon_save_screen+0x78/0x80
> Sep 2 16:13:49 krustophenia kernel: [redraw_screen+411/560] redraw_screen+0x19b/0x230
now i'm wondering why that's running with preemption disabled - i
thought we fixed that.
Ingo
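
To illustrate why the copy shows up as one long critical section: the trace starts at kernel_fpu_begin() and ends inside _mmx_memcpy(), which suggests the whole copy runs between one begin/end pair with preemption off. The userspace sketch below models that, and what splitting the copy into chunks would do to the longest non-preemptible stretch. All names are hypothetical and this is not necessarily how the kernel or the VP patch handles it.

```c
#include <stddef.h>
#include <string.h>

size_t longest_section;     /* most bytes copied in one non-preemptible stretch */
static size_t in_section;   /* bytes copied since fpu_begin_model() */

/* Stands in for kernel_fpu_begin(): assumed to disable preemption. */
void fpu_begin_model(void) { in_section = 0; }

/* Stands in for kernel_fpu_end(): preemption becomes possible again. */
void fpu_end_model(void)
{
    if (in_section > longest_section)
        longest_section = in_section;
}

/* Copy len bytes, re-opening a preemption window every chunk bytes. */
void *chunked_copy(void *to, const void *from, size_t len, size_t chunk)
{
    char *d = to;
    const char *s = from;

    if (chunk == 0)
        chunk = len;   /* degenerate case: one single section */

    while (len > 0) {
        size_t n = len < chunk ? len : chunk;

        fpu_begin_model();
        memcpy(d, s, n);
        in_section += n;
        fpu_end_model();   /* a pending reschedule could run here */

        d += n;
        s += n;
        len -= n;
    }
    return to;
}
```

Copying 1000 bytes with a chunk of 256 leaves longest_section at 256, whereas a chunk equal to the full length makes the critical section grow with the size of the copy.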
On Mon, 2004-10-04 at 06:17, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > Here is an almost identical one (it's even exactly 507 usecs!). This
> > and the one I sent previously were apparently caused by switching from
> > X to a text console and back.
>
> ah, it's the VGA console:
>
> > Sep 2 16:13:49 krustophenia kernel: [_mmx_memcpy+314/384] _mmx_memcpy+0x13a/0x180
> > Sep 2 16:13:49 krustophenia kernel: [vgacon_save_screen+120/128] vgacon_save_screen+0x78/0x80
> > Sep 2 16:13:49 krustophenia kernel: [redraw_screen+411/560] redraw_screen+0x19b/0x230
>
> now i'm wondering why that's running with preemption disabled - i
> thought we fixed that.
>
Same here. But, here's the actual trace from S7 (that one was Q
something). It is indeed the VGA console.
Oct 3 02:58:08 krustophenia kernel: (events/0/3): new 507 us maximum-latency critical section.
Oct 3 02:58:08 krustophenia kernel: => started at: <kernel_fpu_begin+0x15/0x70>
Oct 3 02:58:08 krustophenia kernel: => ended at: <_mmx_memcpy+0x13a/0x180>
Oct 3 02:58:08 krustophenia kernel: [check_preempt_timing+330/480] check_preempt_timing+0x14a/0x1e0
Oct 3 02:58:08 krustophenia kernel: [common_interrupt+24/32] common_interrupt+0x18/0x20
Oct 3 02:58:08 krustophenia kernel: [sub_preempt_count+74/112] sub_preempt_count+0x4a/0x70
Oct 3 02:58:08 krustophenia kernel: [_mmx_memcpy+314/384] _mmx_memcpy+0x13a/0x180
Oct 3 02:58:08 krustophenia kernel: [_mmx_memcpy+314/384] _mmx_memcpy+0x13a/0x180
Oct 3 02:58:08 krustophenia kernel: [vgacon_save_screen+120/128] vgacon_save_screen+0x78/0x80
Oct 3 02:58:08 krustophenia kernel: [redraw_screen+411/560] redraw_screen+0x19b/0x230
Oct 3 02:58:08 krustophenia kernel: [complete_change_console+44/224] complete_change_console+0x2c/0xe0
Oct 3 02:58:08 krustophenia kernel: [console_callback+258/272] console_callback+0x102/0x110
Oct 3 02:58:08 krustophenia kernel: [worker_thread+559/1088] worker_thread+0x22f/0x440
Oct 3 02:58:08 krustophenia kernel: [console_callback+0/272] console_callback+0x0/0x110
Oct 3 02:58:08 krustophenia kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
Oct 3 02:58:08 krustophenia kernel: [schedule+829/1712] schedule+0x33d/0x6b0
Oct 3 02:58:08 krustophenia kernel: [default_wake_function+0/32] default_wake_function+0x0/0x20
Oct 3 02:58:08 krustophenia kernel: [mcount+20/24] mcount+0x14/0x18
Oct 3 02:58:08 krustophenia kernel: [kthread+180/192] kthread+0xb4/0xc0
Oct 3 02:58:08 krustophenia kernel: [worker_thread+0/1088] worker_thread+0x0/0x440
Oct 3 02:58:08 krustophenia kernel: [kthread+0/192] kthread+0x0/0xc0
Oct 3 02:58:08 krustophenia kernel: [kernel_thread_helper+5/16] kernel_thread_helper+0x5/0x10
Lee
i've released the -S9 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
Changes since -S8:
- merge to -mm2. mm2 includes another latency breaker: Hugh Dickins'
vmtruncate rework that should fix the bash-shared-mapping latency.
- fix the x64 compilation bug reported by thewade
- fix the menuconfig duplicate entry bug noticed by Florian Schmidt
- (fix two preempt bugs in -mm2 - will be in -mm3)
to build an -S9 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc3.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm2/2.6.9-rc3-mm2.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
Ingo
On Mon, 2004-10-04 at 17:53, Ingo Molnar wrote:
> i've released the -S9 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
>
Does not compile:
rlrevell@mindpipe:~/kernel-source/linux-2.6.8$ make
CHK include/linux/version.h
make[1]: `arch/i386/kernel/asm-offsets.s' is up to date.
CHK include/linux/compile.h
GEN_INITRAMFS_LIST usr/initramfs_list
Using shipped usr/initramfs_list
CC arch/i386/kernel/irq.o
arch/i386/kernel/irq.c:205: error: redefinition of 'is_irq_stack_ptr'
include/asm/hardirq.h:25: error: previous definition of 'is_irq_stack_ptr' was here
arch/i386/kernel/irq.c: In function `is_irq_stack_ptr':
arch/i386/kernel/irq.c:209: error: `hardirq_stack' undeclared (first use in this function)
arch/i386/kernel/irq.c:209: error: (Each undeclared identifier is reported only once
arch/i386/kernel/irq.c:209: error: for each function it appears in.)
arch/i386/kernel/irq.c:212: error: `softirq_stack' undeclared (first use in this function)
make[1]: *** [arch/i386/kernel/irq.o] Error 1
make: *** [arch/i386/kernel] Error 2
Lee
On Mon, 04 Oct 2004 20:31:58 -0400
Lee Revell <[email protected]> wrote:
> On Mon, 2004-10-04 at 17:53, Ingo Molnar wrote:
> > i've released the -S9 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
> >
>
> Does not compile:
The definition of hardirq_stack seems to depend on the 4k stacks option.
With this enabled it builds past irq.c.
flo
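
For readers following along, here is a guess at the shape of the problem, written as a standalone sketch. It is not the actual -S9 source: the array definitions and the body of is_irq_stack_ptr() are assumptions, reusing only the identifiers from the compiler output. If the stacks exist only with the 4k stacks option, any code touching them needs the same guard plus a fallback for the other configuration:

```c
#include <stdint.h>

/* Build with -DCONFIG_4KSTACKS to model a kernel with separate IRQ stacks. */
#ifdef CONFIG_4KSTACKS
#define IRQ_STACK_SIZE 4096
char hardirq_stack[IRQ_STACK_SIZE];
char softirq_stack[IRQ_STACK_SIZE];
#endif

/* Returns 1 if p points into one of the IRQ stacks, 0 otherwise. */
int is_irq_stack_ptr(const void *p)
{
#ifdef CONFIG_4KSTACKS
    uintptr_t a = (uintptr_t)p;
    uintptr_t h = (uintptr_t)hardirq_stack;
    uintptr_t s = (uintptr_t)softirq_stack;

    return (a >= h && a < h + IRQ_STACK_SIZE) ||
           (a >= s && a < s + IRQ_STACK_SIZE);
#else
    (void)p;   /* no separate IRQ stacks exist in this configuration */
    return 0;
#endif
}
```

Without the #ifdef around the body, a build with the option turned off fails with "undeclared" errors of the kind shown in Lee's output.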
On Mon, 2004-10-04 at 20:56, Florian Schmidt wrote:
> On Mon, 04 Oct 2004 20:31:58 -0400
> Lee Revell <[email protected]> wrote:
>
> > On Mon, 2004-10-04 at 17:53, Ingo Molnar wrote:
> > > i've released the -S9 VP patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
> > >
> >
> > Does not compile:
>
> The definition of hardirq_stack seems to depend on the 4k stacks option.
> With this enabled it builds past irq.c.
Ah, OK. I should have this enabled anyway, I think I disabled it way
back at Ingo's recommendation when we were trying to get useful traces out
of ALSA's xrun debugging feature.
Lee
Ingo Molnar wrote:
>
> i've released the -S9 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
The swapspace layout part of this is incompatible with swsusp, causing a
compile error and I don't understand the changes well enough to fix it.
Could you please provide a fix or at least provide a note, so that people
like me who depend on it know not to use this patch?
Cheers
hobbs
Ingo wrote:
>
> i've released the -S9 VP patch:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
>
Me again, with bad humor :(
My SMP/HT box is (again) terribly in that ugliness of being quite
unfriendly to -mm1, -mm2, and indirectly to -S8 and -S9 labeled kernels.
It works quite well with vanilla 2.6.9-rc3, though.
But very, very bad with those -mm1 or -mm2 patches. To get it straight,
almost all the time it hangs, randomly, but not as completely as to a
dramatic cold-reboot. It stalls on the most administrative tasks. Name
one, and it stalls! I can hardly feel lucky if it sometimes reaches the
login prompt, while boot/initing.
I know you remember this story. Yeah. This seems quite similar to some of
earlier problems, but (un/fortunately) it doesn't seem related to VP at
all. Just having -mm1 or -mm2 is enough to make this machine go astray.
However, as usual, this seems to be ix86 SMP/HT specific. On my laptop, I
get to run full 2.6.9-rc3-mm2-S9 UP very happily.
Sorry if I can't get any real or useful debug data for now. The bad
behavior I'm referring to, is terribly non-deterministic, so I couldn't
get a pattern yet.
I just wanted to let you know ;)
Sorry for the bandwidth waste,
Cheers,
--
rncbc aka Rui Nuno Capela
[email protected]
On Mon, 4 Oct 2004 23:53:15 +0200
Ingo Molnar <[email protected]> wrote:
>
> i've released the -S9 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
>
Something is fishy for me from S8 on. I just booted into S9 and i see
many many xruns under load in jackd [as i saw in S8]. Since all my ll
settings are enabled as usual, something else must be wrong.
I find the following very interesting:
mango:~# ps aux|grep IRQ
root 12 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 8]
root 14 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 12]
root 15 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 14]
root 16 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 15]
root 17 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 1]
root 314 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 10]
root 7983 0.6 0.0 0 0 ? S< 03:26 0:03 [IRQ 5]
root 14617 0.0 0.0 1576 464 pts/2 R+ 03:35 0:00 grep IRQ
mango:~# chrt -p 7938
sched_getscheduler: No such process
failed to get pid 7938's policy
For other irq threads i get normal values [SCHED_OTHER].
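
For reference, the query that chrt -p performs can be sketched with the POSIX sched_getscheduler() call; an ESRCH failure is what produces the "No such process" text above. The helper names policy_name() and report_policy() are made up for this example. (Note that the ps listing shows the IRQ 5 thread as pid 7983, while the command above asked about 7938.)

```c
#include <errno.h>
#include <sched.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>

/* Map a scheduling policy constant to a printable name. */
const char *policy_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}

/* Returns the policy (>= 0), or -1 after printing the reason to stderr.
 * A pid of 0 refers to the calling thread. */
int report_policy(pid_t pid)
{
    int policy = sched_getscheduler(pid);

    if (policy < 0) {
        /* ESRCH here corresponds to "No such process" */
        fprintf(stderr, "sched_getscheduler(%ld): %s\n",
                (long)pid, strerror(errno));
        return -1;
    }
    printf("pid %ld: %s\n", (long)pid, policy_name(policy));
    return policy;
}
```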
Here's the usual tweakables. I find it interesting that the preempt
timing stuff is seemingly not working either [see bottom]. Will boot
into S8 to see if i see similar things:
-----------
-----------
Linux mango.fruits.de 2.6.9-rc3-mm2-VP-S9LT #1 Tue Oct 5 03:07:49 CEST 2004 i686 GNU/Linux
-----------
/proc/interrupts:
CPU0
0: 1233290 XT-PIC timer 0/33290
1: 4810 XT-PIC i8042 0/4810
2: 0 XT-PIC cascade 0/0
5: 863776 XT-PIC CS46XX 0/63776
8: 4 XT-PIC rtc 0/4
10: 1008 XT-PIC eth0 0/1008
12: 34888 XT-PIC i8042 0/34888
14: 927 XT-PIC ide0 0/927
15: 33585 XT-PIC ide1 0/33585
NMI: 0
ERR: 0
-----------
/proc/irq/1/i8042/threaded:1
/proc/irq/10/eth0/threaded:1
/proc/irq/12/i8042/threaded:1
/proc/irq/14/ide0/threaded:1
/proc/irq/15/ide1/threaded:1
/proc/irq/5/CS46XX/threaded:0
/proc/irq/8/rtc/threaded:0
-----------
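
A listing like the one above is easy to post-process when comparing setups. The sketch below parses one "path:value" line of that form; the struct and function are hypothetical, and the reading of the flag (1 = handled in an IRQ thread, 0 = handled directly) is an assumption based on how the option is discussed in this thread.

```c
#include <stdio.h>

struct irq_thread_setting {
    int irq;          /* IRQ number */
    char name[32];    /* handler name, e.g. "CS46XX" */
    int threaded;     /* assumed: 1 = runs in an IRQ thread, 0 = runs directly */
};

/* Parse one "path:value" line; returns 1 on success, 0 on malformed input. */
int parse_threaded_line(const char *line, struct irq_thread_setting *out)
{
    return sscanf(line, "/proc/irq/%d/%31[^/]/threaded:%d",
                  &out->irq, out->name, &out->threaded) == 3;
}
```

Fed the CS46XX line from the listing, this yields irq 5 with threaded set to 0.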
/sys/block/hda/queue/max_sectors_kb:16
/sys/block/hdc/queue/max_sectors_kb:16
/sys/block/hdd/queue/max_sectors_kb:16
-----------
voluntary_preemption
1
kernel_preemption
1
softirq redirect
1
hardirq redirect
1
-----------
preempt_thresh
0
-----------
preempt_max_thresh
0
-----------
trace enabled
0
-----------
I also see some of these "Wait for ready failed before probe !" [in
dmesg] which have popped up somewhere along the line when VP was based
on mm [definitely in S8 and S9, dunno about earlier for sure]:
[snip]
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SIS5513: IDE controller at PCI slot 0000:00:02.5
SIS5513: chipset revision 208
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS735 ATA 100 (2nd gen) controller
ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
hda: IC35L060AVER07-0, ATA DISK drive
requesting new irq thread for IRQ14...
elevator: using anticipatory as default io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: ST340823A, ATA DISK drive
hdd: TDK CDRW121032, ATAPI CD/DVD-ROM drive
requesting new irq thread for IRQ15...
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
IRQ#14 thread started up.
hda: 120103200 sectors (61492 MB) w/1916KiB Cache, CHS=65535/16/63, UDMA(100)
hda: cache flushes not supported
hda: hda1 hda2 hda3
hdc: max request size: 128KiB
IRQ#15 thread started up.
hdc: Host Protected Area detected.
current capacity is 78165360 sectors (40020 MB)
native capacity is 78165361 sectors (40020 MB)
hdc: Host Protected Area disabled.
hdc: 78165361 sectors (40020 MB) w/1024KiB Cache, CHS=65535/16/63, UDMA(33)
hdc: cache flushes not supported
hdc: hdc1 hdc2
hdd: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache, DMA
Uniform CD-ROM driver Revision: 3.20
[snip]
On Tue, 5 Oct 2004 03:42:23 +0200
Florian Schmidt <[email protected]> wrote:
> I find the following very interesting:
>
> mango:~# ps aux|grep IRQ
> root 12 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 8]
> root 14 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 12]
> root 15 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 14]
> root 16 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 15]
> root 17 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 1]
> root 314 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 10]
> root 7983 0.6 0.0 0 0 ? S< 03:26 0:03 [IRQ 5]
> root 14617 0.0 0.0 1576 464 pts/2 R+ 03:35 0:00 grep IRQ
> mango:~# chrt -p 7938
> sched_getscheduler: No such process
> failed to get pid 7938's policy
>
> For other irq threads i get normal values [SCHED_OTHER].
Hmm, this seems to have been caused by a small program for changing the
IRQ priorities in the PIC (found here), combined with loading the CS46XX
module [irq 5] at a later time:
http://roht.informatik.uni-halle.de/~ladischc/pic_priorities.html
But i'm not sure. I will do some more experimenting. The xrun and preempt
timing issues still remain. Here's the S9 .config:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.9-rc3-mm2-VP-S9
# Tue Oct 5 03:23:14 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
#
# General setup
#
CONFIG_LOCALVERSION="LT"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_HOTPLUG is not set
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_PREEMPT_TIMING=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
# CONFIG_X86_UP_APIC is not set
CONFIG_X86_TSC=y
# CONFIG_X86_MCE is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
CONFIG_APM_REAL_MODE_POWER_OFF=y
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_DEBUG_DRIVER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
#
# Block devices
#
CONFIG_BLK_DEV_FD=m
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_LBD is not set
# CONFIG_CDROM_PKTCDVD is not set
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
CONFIG_BLK_DEV_IDESCSI=m
# CONFIG_IDE_TASK_IOCTL is not set
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
CONFIG_BLK_DEV_SIS5513=y
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=m
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=m
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA2XXX=m
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
CONFIG_NETLINK_DEV=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
# CONFIG_IP_MULTIPLE_TABLES is not set
# CONFIG_IP_ROUTE_MULTIPATH is not set
# CONFIG_IP_ROUTE_VERBOSE is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
# CONFIG_IP_NF_CT_ACCT is not set
# CONFIG_IP_NF_CT_PROTO_SCTP is not set
# CONFIG_IP_NF_FTP is not set
# CONFIG_IP_NF_IRC is not set
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
# CONFIG_IP_NF_MATCH_SCTP is not set
# CONFIG_IP_NF_MATCH_COMMENT is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
# CONFIG_IP_NF_NAT_LOCAL is not set
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
# CONFIG_NET_SCH_HFSC is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
# CONFIG_NET_SCH_NETEM is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_SIS900=m
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
CONFIG_GAMEPORT=m
CONFIG_SOUND_GAMEPORT=m
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# CONFIG_GAMEPORT_EMU10K1 is not set
# CONFIG_GAMEPORT_VORTEX is not set
# CONFIG_GAMEPORT_FM801 is not set
# CONFIG_GAMEPORT_CS461x is not set
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=y
# CONFIG_INPUT_UINPUT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
CONFIG_NVRAM=y
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_INTEL_MCH is not set
# CONFIG_AGP_NVIDIA is not set
CONFIG_AGP_SIS=m
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HANGCHECK_TIMER=m
#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
# CONFIG_I2C_ALGOPCA is not set
#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_ISA=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
# CONFIG_I2C_STUB is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
# CONFIG_I2C_PCA_ISA is not set
#
# Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_MAX1619=m
# CONFIG_SENSORS_SMSC47M1 is not set
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
#
# Other I2C Chip support
#
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_RTC8564=m
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
# CONFIG_FB is not set
CONFIG_VIDEO_SELECT=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
#
# Sound
#
CONFIG_SOUND=m
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
# CONFIG_SND_VERBOSE_PRINTK is not set
CONFIG_SND_DEBUG=y
# CONFIG_SND_DEBUG_MEMORY is not set
# CONFIG_SND_DEBUG_DETECT is not set
#
# Generic devices
#
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=m
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VX222 is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
# CONFIG_USB is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
# CONFIG_EXT2_FS_POSIX_ACL is not set
# CONFIG_EXT2_FS_SECURITY is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
CONFIG_ROMFS_FS=y
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
#
# Caches
#
# CONFIG_CACHEFS is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS_XATTR=y
# CONFIG_DEVPTS_FS_SECURITY is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=m
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_EXPORTFS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
CONFIG_AFS_FS=m
CONFIG_RXRPC=m
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=y
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
CONFIG_NLS_CODEPAGE_1250=y
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_DEBUG_SMP_PROCESSOR_ID is not set
# CONFIG_DEBUG_INFO is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_KPROBES is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_4KSTACKS=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_KGDB is not set
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=m
# CONFIG_SECURITY_SECLVL is not set
# CONFIG_SECURITY_SELINUX is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=m
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_GENERIC_HARDIRQS=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
Florian Schmidt wrote:
> On Mon, 4 Oct 2004 23:53:15 +0200
> Ingo Molnar <[email protected]> wrote:
>
>>
>> i've released the -S9 VP patch:
>>
>>
>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
>>
>
> Something is fishy for me from S8 on. I just booted into S9 and i see
> many many xruns under load in jackd [as i saw in S8]. Since all my ll
> settings are enabled as usual, something else must be wrong.
>
> I find the following very interesting:
>
> mango:~# ps aux|grep IRQ
> root 12 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 8]
> root 14 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 12]
> root 15 0.0 0.0 0 0 ? S< 04:28 0:00 [IRQ 14]
> root 16 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 15]
> root 17 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 1]
> root 314 0.0 0.0 0 0 ? S< 03:17 0:00 [IRQ 10]
> root 7983 0.6 0.0 0 0 ? S< 03:26 0:03 [IRQ 5]
> root 14617 0.0 0.0 1576 464 pts/2 R+ 03:35 0:00 grep IRQ
> mango:~# chrt -p 7938
> sched_getscheduler: No such process
> failed to get pid 7938's policy
>
> For other irq threads i get normal values [SCHED_OTHER].
>
Looks like a case of minor dyslexia to me. 7983 != 7938 :)
--hobbs
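The "No such process" failure above is easy to reproduce outside chrt. What follows is a minimal sketch, assuming a Linux/glibc system: sched_getscheduler() is the standard call named in the error message, while policy_name_of() is a hypothetical helper made up for illustration and is not how chrt itself is written.

```c
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper: look up the scheduling policy of a pid.
 * Returns a static policy name, or NULL with errno set when the
 * lookup fails (ESRCH if no process has that pid).
 */
static const char *policy_name_of(pid_t pid)
{
    int policy;

    assert(pid >= 0);               /* pid 0 means "the calling process" */

    policy = sched_getscheduler(pid);
    if (policy == -1)
        return NULL;                /* errno was set by sched_getscheduler() */

    switch (policy) {
    case SCHED_OTHER: return "SCHED_OTHER";
    case SCHED_FIFO:  return "SCHED_FIFO";
    case SCHED_RR:    return "SCHED_RR";
    default:          return "unknown";
    }
}
```

Calling policy_name_of() with a mistyped pid that no process owns returns NULL with errno set to ESRCH, which is what chrt printed as "No such process".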
Rui Nuno Capela wrote:
> Ingo wrote:
>
>>i've released the -S9 VP patch:
>>http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
>>
>
>
> Me again, we bad humor :(
>
> My SMP/HT box is (again) terribly in that ugliness of being quite
> unfriendly to -mm1, -mm2, and indirectly to -S8 and -S9 labeled kernels.
>
> It works quite well with vanilla 2.6.9-rc3, though.
>
> But very, very bad with those -mm1 or -mm2 patches. To get it straight,
> almost all the time it hangs, randomly, but not as completely as to a
> dramatic cold-reboot. It stalls on the most administrative tasks. Name
> one, and it stalls! I can hardly feel lucky if it sometimes reaches the
> login prompt, while boot/initing.
>
> I know you remember this story. Yeah. This seems quite similar to some of
> earlier problems, but (un/fortunately) it doesn't seem related to VP at
> all. Just having -mm1 or -mm2 is enough to make this machine go astray.
>
> However, as usual, this seems to be ix86 SMP/HT specific. On my laptop, I
> get to run full 2.6.9-rc3-mm2-S9 UP very happily.
>
> Sorry if I can't get any real or useful debug data for now. The bad
> behavior I'm referring to, is terribly non-deterministic, so I couldn't
> get a pattern yet.
>
> I just wanted to let you know ;)
This may be my fault. I made changes in the SMT code as part of the
ZAPHOD patch but was unable to test them on a hyperthreaded machine due
to a lack thereof (I have one on order). I've attached a patch that
reverts the changes that I made. Can you give them a try please and let
me know if they fix your problem?
Thanks
Peter
--
Peter Williams [email protected]
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
On Tue, 5 Oct 2004, Rui Nuno Capela wrote:
> Ingo wrote:
> >
> > i've released the -S9 VP patch:
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-S9
> >
>
> Me again, we bad humor :(
>
> My SMP/HT box is (again) terribly in that ugliness of being quite
> unfriendly to -mm1, -mm2, and indirectly to -S8 and -S9 labeled kernels.
could you apply this patch first:
http://redhat.com/~mingo/voluntary-preempt/zaphod-undo-2.6.9-rc3-mm2-A0
to get back to the original scheduler. Can you still see problems with
this one applied? If yes then please try to debug it a bit more.
Ingo
i've released the -T0 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T0
Changes since -S9:
- fix preempt-timing facility (reported by Florian Schmidt)
- fix !4K stack compilation breakage (reported by Lee Revell)
- revert experimental scheduler stuff from -mm. (Rui, does this fix your
problems?)
note that sw-suspend and x64+LATENCY_TRACE are still broken.
Ingo
On Mon, 04 Oct 2004 23:09:29 -0400
Andrew Rodland <[email protected]> wrote:
> Looks like a case of minor dyslexia to me. 7983 != 7938 :)
Hah, right you are.. I shouldn't do this stuff at 3 in the
morning. I rechecked like 20 times, just to type it in
wrongly every time, lol.
flo
On Tue, 5 Oct 2004 03:02:05 -0400 (EDT)
Ingo Molnar <[email protected]> wrote:
> i've released the -T0 VP patch:
> - fix !4K stack compilation breakage (reported by Lee Revell)
I still need to enable 4k stacks to get it to build [see error below w/o 4k
stacks].
But xrun hell is no more [as opposed to S8 and S9]. jackd seems to run fine
again. I suspect the reverted scheduler changes to have fixed this
[uneducated guess], since in S8 and S9 the system behaved all different
[sloppy X when compiling stuff. jackd producing xruns although everything is
setup for ll ok].
flo
CC arch/i386/kernel/irq.o
arch/i386/kernel/irq.c:205: error: redefinition of `is_irq_stack_ptr'
include/asm/hardirq.h:25: error: `is_irq_stack_ptr' previously defined here
arch/i386/kernel/irq.c: In function `is_irq_stack_ptr':
arch/i386/kernel/irq.c:209: error: `hardirq_stack' undeclared (first use in this function)
arch/i386/kernel/irq.c:209: error: (Each undeclared identifier is reported only once
arch/i386/kernel/irq.c:209: error: for each function it appears in.)
arch/i386/kernel/irq.c:212: error: `softirq_stack' undeclared (first use in this function)
make[1]: *** [arch/i386/kernel/irq.o] Error 1
make: *** [arch/i386/kernel] Error 2
* Florian Schmidt <[email protected]> wrote:
> On Tue, 5 Oct 2004 03:02:05 -0400 (EDT)
> Ingo Molnar <[email protected]> wrote:
>
> > i've released the -T0 VP patch:
>
> > - fix !4K stack compilation breakage (reported by Lee Revell)
>
> I still need to enable 4k stacks to get it to build [see error below w/o 4k
> stacks].
doh - chunk went MIA. Updated the patch, please re-download -T0.
Ingo
Ingo Molnar wrote:
>
> i've released the -T0 VP patch:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T0
>
> Changes since -S9:
>
> - fix preempt-timing facility (reported by Florian Schmidt)
>
> - fix !4K stack compilation breakage (reported by Lee Revell)
>
> - revert experimental scheduler stuff from -mm. (Rui, does this fix
> your problems?)
>
Unfortunately it doesn't seem to. Attached you may find some info I could
dump out, as a snapshot of what I'm seeing on my fresh
2.6.9-rc3-mm2-T0.0smp kernel.
Things to note:
- vp_info: the general state of VP tunables.
- dmesg: some badness occurrences, worth looking at and fixing.
- config: the kernel .config file.
- top-100: top output, showing that ksoftirqd/1 is consuming 99.9% of one,
but only one, of the virtual CPUs, permanently.
The overall system behavior is hardly better than what I complained about
before: if I can get to X, the mice don't work (either USB or PS/2);
calling /sbin/modprobe is asking for trouble; most of the time, it stalls
during boot/init. I can hardly do anything with it :(
Sorry.
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
>
> doh - chunk went MIA. Updated the patch, please re-download -T0.
>
Oops. Does it affect me?
I've already test-ran T0 and sent away my early results. Guess it was too
quick. Do I have to start all over?
--
rncbc aka Rui Nuno Capela
[email protected]
On Tue, 5 Oct 2004, Rui Nuno Capela wrote:
> Ingo Molnar wrote:
> >
> > doh - chunk went MIA. Updated the patch, please re-download -T0.
> >
>
> Oops. Does it affect me?
no, it doesn't affect you - the bug only causes build breakage, not runtime
breakage.
> I've already test-ran T0 and sent away my early results. Guess it was
> too quick. Do I have to start all over? [...]
no. I'll reply to your other mail soon.
Ingo
On Tue, 5 Oct 2004, Rui Nuno Capela wrote:
> Unfortunately it doesn't seem to. Attached you may find some info I
> could dump out, as a snapshot of what I'm seeing on my fresh
> 2.6.9-rc3-mm2-T0.0smp kernel.
thanks. Do your problems go away if you turn off the SMT scheduler, or if
you disable SMP altogether on your P4-HT box?
> - top-100: top output, showing that ksoftirqd/1 is consuming 99.9% of one,
> but only one, of the virtual CPUs, permanently.
i think this is the clearest indication that something is
fundamentally wrong - ksoftirqd must never use that much CPU time on an
idle system.
Ingo
On Tue, 5 Oct 2004, Ingo Molnar wrote:
> On Tue, 5 Oct 2004, Rui Nuno Capela wrote:
>
> i think this is the clearest indication that something is
> fundamentally wrong - ksoftirqd must never use that much CPU time on an
> idle system.
Please, would you try this patch below that I posted yesterday?
At the time I thought the trylock was hardly used so not urgent.
I've just now discovered that the standard SMP PREEMPT read_lock
- as in do_wait's read_lock(&tasklist_lock) for example - uses it
via one of those dreaded expansions that grep misses:
if (likely(_raw_##op##_trylock(lock)))
I've been suffering the occasional leftover zombie from multiple
kernel builds precisely since the preempt-smp.patch went in; been
hunting it unsuccessfully in spare moments, yesterday noticed that
bug, today realize it's probably what I've been hunting - I'm
about to start my own tests again, can't be sure until tomorrow.
Hugh
The i386 and x86_64 _raw_read_trylocks in preempt-smp.patch
are too successful: atomic_read() returns a signed integer.
Signed-off-by: Hugh Dickins <[email protected]>
--- 2.6.9-rc3-mm2/include/asm-i386/spinlock.h 2004-10-04 12:00:14.000000000 +0100
+++ linux/include/asm-i386/spinlock.h 2004-10-04 18:50:32.752864600 +0100
@@ -235,7 +235,7 @@ static inline int _raw_read_trylock(rwlo
{
atomic_t *count = (atomic_t *)lock;
atomic_dec(count);
- if (atomic_read(count) < RW_LOCK_BIAS)
+ if (atomic_read(count) >= 0)
return 1;
atomic_inc(count);
return 0;
--- 2.6.9-rc3-mm2/include/asm-x86_64/spinlock.h 2004-10-04 12:00:15.000000000 +0100
+++ linux/include/asm-x86_64/spinlock.h 2004-10-04 18:50:32.752864600 +0100
@@ -236,7 +236,7 @@ static inline int _raw_read_trylock(rwlo
{
atomic_t *count = (atomic_t *)lock;
atomic_dec(count);
- if (atomic_read(count) < RW_LOCK_BIAS)
+ if (atomic_read(count) >= 0)
return 1;
atomic_inc(count);
return 0;
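For readers following the read-lock fix, here is a minimal userspace sketch (C11 atomics, not the kernel code) of why the original comparison is "too successful". It assumes the usual biased layout: an unlocked lock holds RW_LOCK_BIAS, each reader subtracts 1, and a writer subtracts the whole bias. The value 0x01000000 and all demo_* names are assumptions made for illustration. Unlike the patched functions, the sketch uses the value returned by the atomic subtraction instead of a separate read.

```c
#include <assert.h>
#include <stdatomic.h>

#define RW_LOCK_BIAS 0x01000000     /* assumed value, for illustration */

typedef struct { atomic_int count; } demo_rwlock_t;

static void demo_rwlock_init(demo_rwlock_t *l)
{
    atomic_init(&l->count, RW_LOCK_BIAS);   /* unlocked: full bias */
}

/* A writer needs the whole bias, so it succeeds only on an idle lock. */
static int demo_write_trylock(demo_rwlock_t *l)
{
    if (atomic_fetch_sub(&l->count, RW_LOCK_BIAS) == RW_LOCK_BIAS)
        return 1;
    atomic_fetch_add(&l->count, RW_LOCK_BIAS);
    return 0;
}

/* use_fix == 0: the old "< RW_LOCK_BIAS" test; use_fix != 0: ">= 0". */
static int demo_read_trylock(demo_rwlock_t *l, int use_fix)
{
    int after = atomic_fetch_sub(&l->count, 1) - 1;
    /* After the decrement the signed count is always below the bias,
     * even when a writer has driven it to -1, so the old test never fails. */
    int ok = use_fix ? (after >= 0) : (after < RW_LOCK_BIAS);

    if (ok)
        return 1;
    atomic_fetch_add(&l->count, 1);         /* undo the failed attempt */
    return 0;
}

/* Try a read lock while a writer holds the lock; returns the result. */
static int demo_read_trylock_under_writer(int use_fix)
{
    demo_rwlock_t l;
    int got_write;

    demo_rwlock_init(&l);
    got_write = demo_write_trylock(&l);
    assert(got_write);
    (void)got_write;
    return demo_read_trylock(&l, use_fix);
}
```

With the old test the reader is let in while the writer still holds the lock; with the ">= 0" test the attempt is refused and the count is restored.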
i've released the -T1 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T1
Changes since -T0:
- added the read-lock fix from Hugh that affects SMP systems. This
could fix Rui's problem - i've checked -T1 on a P4/HT box and saw no
problems, BYMMV.
- compilation fixes (for those who downloaded T0 early)
- small tracer improvement
to build a -T1 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc3.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm2/2.6.9-rc3-mm2.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T1
Ingo
Ingo Molnar wrote:
>
> i've released the -T1 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T1
>
> Changes since -T0:
>
> - added the read-lock fix from Hugh that affects SMP systems. This
> could fix Rui's problem - i've checked -T1 on a P4/HT box and saw
> no problems, BYMMV.
>
No go here. My mileage certainly varies :)
Still the same ugliness here with T1. As before, there goes some info
attached, which I could collect while barely up and running.
I don't wanna be a PITA, but changes are hardly noticed since T0.
Next I'll try building with CONFIG_SCHED_SMT off, and later with
CONFIG_SMP off too, to just try to make some point here.
CU
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Still the same ugliness here with T1. As before, there goes some info
> attached, which I could collect while barely up and running.
the dmesg info shows you had a crash early on, in khubd:
Badness in remove_proc_entry at fs/proc/generic.c:688
[<c018c8e8>] remove_proc_entry+0x152/0x15a
[<f8b8e116>] uhci_hcd_init+0x116/0x133 [uhci_hcd]
[<c0135f0e>] sys_init_module+0x1df/0x2da
[<c01044ed>] sysenter_past_esp+0x52/0x71
usb 3-1: new low speed USB device using address 2
Unable to handle kernel paging request at virtual address a49c0e0c
i believe this is a crash present in -mm too. In theory such a crash
could mess up the kernel so best would be if you could try a kernel with
USB disabled? Hopefully none of your critical devices is on USB ...
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> Still the same ugliness here with T1. As before, there goes some info
>> attached, which I could collect while barely up and running.
>
> the dmesg info shows you had a crash early on, in khubd:
>
> Badness in remove_proc_entry at fs/proc/generic.c:688
> [<c018c8e8>] remove_proc_entry+0x152/0x15a
> [<f8b8e116>] uhci_hcd_init+0x116/0x133 [uhci_hcd]
> [<c0135f0e>] sys_init_module+0x1df/0x2da
> [<c01044ed>] sysenter_past_esp+0x52/0x71
> usb 3-1: new low speed USB device using address 2
> Unable to handle kernel paging request at virtual address a49c0e0c
>
> i believe this is a crash present in -mm too. In theory such a crash
> could mess up the kernel so best would be if you could try a kernel with
> USB disabled? Hopefully none of your critical devices is on USB ...
>
Yeah, I found that out the hard way :) One of my trials discovered just that,
by booting without any USB devices plugged in. It booted apparently fine.
Then, exactly when I plug in my USB mouse (Wacom Graphire2 tablet), I
immediately get the following kernel oops (in dmesg):
IRQ#23 thread started up.
IRQ#19 thread started up.
usb 2-1: new low speed USB device using address 2
Unable to handle kernel paging request at virtual address aaf7ee8f
printing eip:
c014284b
*pde = 00000000
Oops: 0000 [#1]
PREEMPT SMP
Modules linked in: mga ohci1394 ieee1394 ehci_hcd uhci_hcd intel_mch_agp
agpgart snd_usb_usx2y snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep
snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc
evdev sk98lin realtime commoncap w83781d i2c_sensor i2c_isa i2c_i801
i2c_core wacom usbcore subfs dm_mod
CPU: 1
EIP: 0060:[<c014284b>] Not tainted VLI
EFLAGS: 00010093 (2.6.9-rc3-mm2-T1.0smp)
EIP is at check_poison_obj+0x98/0x1d0
eax: 0000006b ebx: f6e8a780 ecx: f6d254b8 edx: ffffffa5
esi: 00000000 edi: aaf7ee8f ebp: f74d3d10 esp: f74d3ce8
ds: 007b es: 007b ss: 0068 preempt: 00000002
Process khubd (pid: 1306, threadinfo=f74d3000 task=f7df2a40)
Stack: f6e8a780 f74d3cfc c0130ce3 0000002b c0144658 00000000 0000002c f6e8a780
aaf7ee8b f6e8a780 f74d3d34 c01442f5 f6e8a780 aaf7ee8b f6d254b8 f6d254b8
f6e8a780 00000046 00000020 f74d3d58 c0144658 f6e8a780 00000020 aaf7ee8b
Call Trace:
[<c0130ce3>] __mcount+0x1d/0x1f
[<c0144658>] kmem_cache_alloc+0x6b/0xa5
[<c01442f5>] cache_alloc_debugcheck_after+0x129/0x162
[<c0144658>] kmem_cache_alloc+0x6b/0xa5
[<f8c35ff9>] uhci_alloc_urb_priv+0x26/0x81 [uhci_hcd]
[<f8c35ff9>] uhci_alloc_urb_priv+0x26/0x81 [uhci_hcd]
[<f8c36fbb>] uhci_urb_enqueue+0x5d/0x2bc [uhci_hcd]
[<c0110350>] mcount+0x14/0x18
[<f8b424f4>] hcd_submit_urb+0x127/0x194 [usbcore]
[<f8b42e38>] usb_submit_urb+0x1e2/0x244 [usbcore]
[<f8b430c7>] usb_start_wait_urb+0xe/0xe2 [usbcore]
[<f8b4310c>] usb_start_wait_urb+0x53/0xe2 [usbcore]
[<c0130ce3>] __mcount+0x1d/0x1f
[<c01ea660>] kref_init+0x8/0x13
[<f8b42ba8>] usb_init_urb+0x27/0x3c [usbcore]
[<c0110350>] mcount+0x14/0x18
[<f8b42bef>] usb_alloc_urb+0x32/0x52 [usbcore]
[<f8b43215>] usb_internal_control_msg+0x7a/0x83 [usbcore]
[<f8b432aa>] usb_control_msg+0x8c/0xa0 [usbcore]
[<f8b3fe23>] hub_set_address+0x6d/0x90 [usbcore]
[<f8b3fff7>] hub_port_init+0x1b1/0x39a [usbcore]
[<f8b404d9>] hub_port_connect_change+0xfe/0x43a [usbcore]
[<f8b409d3>] hub_events+0x1be/0x395 [usbcore]
[<f8b40be1>] hub_thread+0x37/0x109 [usbcore]
[<c0130845>] autoremove_wake_function+0x0/0x50
[<c0104416>] ret_from_fork+0x6/0x14
[<c0130845>] autoremove_wake_function+0x0/0x50
[<f8b40baa>] hub_thread+0x0/0x109 [usbcore]
[<c0102601>] kernel_thread_helper+0x5/0xb
Code: 89 44 24 08 e8 09 fe ff ff 83 45 ec 01 83 7d ec 05 7f 5b 83 c6 01 3b 75 f0 7d 53 3b 75 e4 b8 6b 00 00 00 ba a5 ff ff ff 0f 44 c2 <38> 04 3e 2e 74 e2 8b 45 ec 85 c0 75 a0 8b 55 f0 89 7c 24 04 c7
<6>note: khubd[1306] exited with preempt_count 1
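For context on where the oops lands: check_poison_obj belongs to the slab debugging code, which fills freed objects with a poison pattern and verifies it on the next allocation. The register values in the dump (eax 0x6b, edx ending in 0xa5) appear consistent with the usual poison bytes. Below is a minimal userspace sketch of the idea, assuming the pattern is 0x6b for every byte with 0xa5 as the final byte; the demo_* names are hypothetical and this is not the kernel's implementation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POISON_FREE 0x6b    /* assumed fill byte for a freed object */
#define POISON_END  0xa5    /* assumed marker for the last byte */

/* Fill a freed object with the poison pattern. */
static void demo_poison_obj(unsigned char *obj, size_t size)
{
    assert(obj != NULL && size > 0);
    memset(obj, POISON_FREE, size);
    obj[size - 1] = POISON_END;
}

/*
 * Verify the pattern before the object is handed out again.
 * Returns the offset of the first corrupted byte, or -1 if intact.
 */
static long demo_check_poison(const unsigned char *obj, size_t size)
{
    size_t i;

    for (i = 0; i < size; i++) {
        unsigned char expected =
            (i == size - 1) ? POISON_END : POISON_FREE;
        if (obj[i] != expected)
            return (long)i;
    }
    return -1;
}
```

In the trace above the check does not get as far as reporting a mismatch: it faults while reading through a bad object pointer (the address aaf7ee8f in edi), which suggests the allocator's own bookkeeping was already corrupted.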
Guess what? It's right after this crash that ksoftirqd/1 pulls up to
99.99%CPU#1 and stays still on that figure, forever. Of course, as Ingo
noted, this was happening behind the scenes every time I was boot/initing.
OTOH, I've tested T1 with CONFIG_SCHED_SMT and/or CONFIG_SMP not set, and
got similar crashes too. So this seems to be some nasty bug introduced by
-mm{1,2}, not by VP on SMP/SMT.
Yes, I do have some critical USB devices around here. One is that wacom
tablet (mouse) and the other is a tascam us-224 audio/midi control surface
that I love very much :)
Don't know if this makes me feel any better, doh.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> OTOH, I've tested T1 with CONFIG_SCHED_SMT and/or CONFIG_SMP not set,
> and got similar crashes too. So this seems to be some nasty bug
> introduced by -mm{1,2}, not by VP on SMP/SMT.
>
> Yes, I do have some critical USB devices around here. One is that
> wacom tablet (mouse) and the other is a tascam us-224 audio/midi
> control surface that I love very much :)
>
> Don't know if this makes me feel any better, doh.
i believe Andrew said that these USB problems should be fixed in the
next -mm iteration.
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> OTOH, I've tested T1 with CONFIG_SCHED_SMT and/or CONFIG_SMP not set,
>> and got similar crashes too. So this seems to be some nasty bug
>> introduced by -mm{1,2}, not by VP on SMP/SMT.
>>
>> Yes, I do have some critical USB devices around here. One is that
>> wacom tablet (mouse) and the other is a tascam us-224 audio/midi
>> control surface that I love very much :)
>>
>> Don't know if this makes me feel any better, doh.
>
> i believe Andrew said that these USB problems should be fixed in the
> next -mm iteration.
>
Oh yes, I really do hope so :)
Meanwhile, I'm stuck with 2.6.9-rc2-mm4-S7 (SMP), but happy.
Strange thing is, that on my laptop, 2.6.9-rc3-mm2-S9 (UP) is doing just
fine. Guess that ohci_hcd now makes the difference here, against the
former, where uhci_hcd is badly behaved atm.
Thanks Ingo.
--
rncbc aka Rui Nuno Capela
[email protected]
On Tue, 2004-10-05 at 15:44, Ingo Molnar wrote:
> * Rui Nuno Capela <[email protected]> wrote:
>
> > OTOH, I've tested T1 with CONFIG_SCHED_SMT and/or CONFIG_SMP not set,
> > and got similar crashes too. So this seems to be some nasty bug
> > introduced by -mm{1,2}, not by VP on SMP/SMT.
> >
> > Yes, I do have some critical USB devices around here. One is that
> > wacom tablet (mouse) and the other is a tascam us-224 audio/midi
> > control surface that I love very much :)
> >
> > Don't know if this makes me feel any better, doh.
>
> i believe Andrew said that these USB problems should be fixed in the
> next -mm iteration.
>
FWIW, this one does not work for me either, I get a USB-related Oops on
boot.
Lee
* Lee Revell <[email protected]> wrote:
> > i believe Andrew said that these USB problems should be fixed in the
> > next -mm iteration.
> >
>
> FWIW, this one does not work for me either, I get a USB-related Oops
> on boot.
by next -mm iteration i meant -rc3-mm3, which is not released yet.
Ingo
Ingo Molnar wrote:
>
>* Lee Revell <rlrevell@xxxxxxxxxxx> wrote:
>
> > > i believe Andrew said that these USB problems should be fixed in
> > > the next -mm iteration.
> > >
> >
> > FWIW, this one does not work for me either, I get a USB-related Oops
> > on boot.
>
> by next -mm iteration i meant -rc3-mm3, which is not released yet.
>
> Ingo
>
The usb-hubs problem patch in LKML by Greg KH seems to fix this problem.
The thread is 2.6.9-rc3-mm1, bk-pci patch, USB hubs
---
Kari Hämeenaho
>Meanwhile, I'm stuck with 2.6.9-rc2-mm4-S7 (SMP), but happy.
>
>Strange thing is, that on my laptop, 2.6.9-rc3-mm2-S9 (UP) is doing just
>fine. Guess that ohci_hcd now makes the difference here, against the
>former, where uhci_hcd is badly behaved atm.
I am having similar problems with -T1 and separately reported problems with
a build of rc3-mm1-S8 as well (no oops, but the USB mouse is dead).
Somewhere between those two versions (rc2-mm4-S7 and rc3-mm1-S8) is where
the problem appears to be introduced. For now I'll stay with my working -S0
kernel.
--Mark H Johnson
<mailto:[email protected]>
* [email protected] <[email protected]> wrote:
> >Meanwhile, I'm stuck with 2.6.9-rc2-mm4-S7 (SMP), but happy.
> >
> >Strange thing is, that on my laptop, 2.6.9-rc3-mm2-S9 (UP) is doing just
> >fine. Guess that ohci_hcd now makes the difference here, against the
> >former, where uhci_hcd is badly behaved atm.
>
> I am having similar problems with -T1 and separately reported problems
> with a build of rc3-mm1-S8 as well (no oops, but the USB mouse is
> dead). Somewhere between those two versions (rc2-mm4-S7 and
> rc3-mm1-S8) is where the problem appears to be introduced. For now
> I'll stay with my working -S0 kernel.
disable USB for now - it's broken in -mm and unrelated to -VP. There are
hopes that in -rc3-mm3 USB will work again.
Ingo
i've released the -T3 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
Changes since -T1 (-T2 was not announced):
- rebased to -rc3-mm3. This should fix the build problems and further
shrinks the -VP patch. Also, people who had USB problems please
re-test -T3 as -mm3 is supposed to have much of those problems fixed.
Also, the dvb-bt8xx.c build problem should be fixed in -mm3 too, plus
a number of smp_processor_id() warnings were debugged and fixed as
well.
- fixed SWSUSPEND compilation. Could someone who uses swsuspend check
whether sw-suspension works fine?
- improved CONFIG_DEBUG_PREEMPT - this could help debug any potentially
remaining unbalanced preemption counts that were reported. (but
the fixes in -mm3 could fix them as well.)
to build a -T3 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc3.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm3/2.6.9-rc3-mm3.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
Ingo
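A minimal sketch of the patching order above, written as a small Python helper that only prints the shell commands rather than running them. The URLs are copied from the announcement. The use of wget, tar, bzcat and patch -p1, the linux-2.6.8 directory name and the assumption that all downloads sit next to the unpacked tree are guesses about a typical setup, not something the announcement spells out.

```python
from posixpath import basename

# URLs copied from the -T3 announcement, in the stated patching order.
BASE_TARBALL = "http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2"
PATCHES = [
    "http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc3.bz2",
    "http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm3/2.6.9-rc3-mm3.bz2",
    "http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3",
]


def build_commands(tarball_url, patch_urls, tree_dir="linux-2.6.8"):
    """Return the shell commands (as strings) for one possible workflow.

    Assumption: files are downloaded into the current directory and the
    tarball unpacks into `tree_dir`.
    """
    cmds = ["wget " + url for url in [tarball_url] + patch_urls]
    cmds.append("tar xjf " + basename(tarball_url))
    cmds.append("cd " + tree_dir)
    for url in patch_urls:
        name = basename(url)
        if name.endswith(".bz2"):
            # compressed patches are decompressed on the fly
            cmds.append("bzcat ../%s | patch -p1" % name)
        else:
            # the VP patch itself is posted uncompressed
            cmds.append("patch -p1 < ../%s" % name)
    return cmds


if __name__ == "__main__":
    for command in build_commands(BASE_TARBALL, PATCHES):
        print(command)
```

The order of the list matters: each patch is generated against the tree produced by the previous step, so the VP patch has to go last.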
* Hanno Meyer-Thurow <[email protected]> wrote:
> > i've released the -T3 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
> i get this on compile:
>
> init/built-in.o(.text+0x18b): In function `rest_init':
> : undefined reference to `sub_preempt_count'
ok, this happens if PREEMPT_TIMING is not enabled. I've re-uploaded the
new -T3 patch, please re-download it.
Ingo
On Thu, 7 Oct 2004 13:44:34 +0200
Ingo Molnar <[email protected]> wrote:
> ok, this happens if PREEMPT_TIMING is not enabled. I've re-uploaded the
> new -T3 patch, please re-download it.
>
> Ingo
great! it works, thanks :)
Hi Ingo,
>
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
Didn't try this yet on my desktop (where those USB troubles roared), but
on my laptop there's already a showstopper while trying to start my
beloved jackd -R:
---
Unable to handle kernel paging request at virtual address 00010024
printing eip:
c011995f
*pde = 00000000
Oops: 0002 [#1]
PREEMPT
Modules linked in: realtime commoncap snd_seq_oss snd_seq_midi_event
snd_seq snd_pcm_oss snd_mixer_oss snd_usb_usx2y snd_usb_lib snd_rawmidi
snd_seq_device snd_hwdep snd_ali5451 snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd soundcore prism2_cs p80211 ds yenta_socket pcmcia_core
natsemi crc32 ohci1394 ieee1394 loop subfs evdev ohci_hcd usbcore thermal
processor fan button battery ac
CPU: 0
EIP: 0060:[<c011995f>] Not tainted VLI
EFLAGS: 00010086 (2.6.9-rc3-mm3-T3.0)
EIP is at profile_hit+0x2f/0x33
eax: 00010024 ebx: de1f8510 ecx: 00000000 edx: 00000000
esi: 00000001 edi: ffffffea ebp: db32ef88 esp: db32ef88
ds: 007b es: 007b ss: 0068 preempt: 00000003
Process jackd (pid: 6519, threadinfo=db32e000 task=dec4fa00)
Stack: db32efbc c0115431 00000002 c0104231 00000004 c0398460 c0102d2f
007d0f00
00000046 0000001e 00001979 b75a1bb0 b7fa5a4c db32e000 c0104231
00001979
00000001 b75a1dac b75a1bb0 b7fa5a4c bfffb9b8 0000009c 0000007b
0000007b
Call Trace:
[<c0115431>] setscheduler+0xc4/0x254
[<c0104231>] sysenter_past_esp+0x52/0x71
[<c0102d2f>] sys_clone+0x40/0x42
[<c0104231>] sysenter_past_esp+0x52/0x71
Code: b8 60 ff ff 8b 0d ec 16 3a c0 8b 15 e8 16 3a c0 8b 45 0c 83 ea 01 2d
28 02 10 c0 d3 e8 39 c2 0f 46 c2 8b 15 e4 16 3a c0 8d 04 82 <ff> 00 5d c3
55 89 e5 e8 85 60 ff ff b8 da ff ff ff 5d c3 55 89
<6>note: jackd[6519] exited with preempt_count 2
Debug: sleeping function called from invalid context jackd(6519) at
kernel/fork.c:421
in_atomic():1, irqs_disabled():0
[<c0116385>] __might_sleep+0xb5/0xc5
[<c011921c>] vprintk+0x135/0x182
[<c0116927>] mm_release+0x72/0xcf
[<c01190e5>] printk+0x1d/0x1f
[<c011b3a6>] do_exit+0x85/0x506
[<c0112f0e>] do_page_fault+0x0/0x67e
[<c0105539>] do_divide_error+0x0/0x13a
[<c0112f0e>] do_page_fault+0x0/0x67e
[<c01190e5>] printk+0x1d/0x1f
[<c01133a7>] do_page_fault+0x499/0x67e
[<c01c54dd>] __copy_from_user_ll+0x11/0x76
[<c012efbc>] check_preempt_timing+0x18f/0x1fb
[<c012f1ce>] sub_preempt_count+0x5c/0x8b
[<c0117869>] copy_process+0x610/0xcfe
[<c010eca6>] sched_clock+0x14/0x8d
[<c012f1ce>] sub_preempt_count+0x5c/0x8b
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c012efbc>] check_preempt_timing+0x18f/0x1fb
[<c012f1ce>] sub_preempt_count+0x5c/0x8b
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c0117869>] copy_process+0x610/0xcfe
[<c0117869>] copy_process+0x610/0xcfe
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c0112f0e>] do_page_fault+0x0/0x67e
[<c0104c8d>] error_code+0x2d/0x38
[<c011995f>] profile_hit+0x2f/0x33
[<c0115431>] setscheduler+0xc4/0x254
[<c0104231>] sysenter_past_esp+0x52/0x71
[<c0102d2f>] sys_clone+0x40/0x42
[<c0104231>] sysenter_past_esp+0x52/0x71
scheduling while atomic: jackd/0x04000002/6519
[<c02d21cc>] schedule+0x554/0x5f5
[<c02d27f4>] cond_resched+0x5f/0x7f
[<c011692c>] mm_release+0x77/0xcf
[<c01190e5>] printk+0x1d/0x1f
[<c011b3a6>] do_exit+0x85/0x506
[<c0112f0e>] do_page_fault+0x0/0x67e
[<c0105539>] do_divide_error+0x0/0x13a
[<c0112f0e>] do_page_fault+0x0/0x67e
[<c01190e5>] printk+0x1d/0x1f
[<c01133a7>] do_page_fault+0x499/0x67e
[<c01c54dd>] __copy_from_user_ll+0x11/0x76
[<c012efbc>] check_preempt_timing+0x18f/0x1fb
[<c012f1ce>] sub_preempt_count+0x5c/0x8b
[<c0117869>] copy_process+0x610/0xcfe
[<c010eca6>] sched_clock+0x14/0x8d
[<c012f1ce>] sub_preempt_count+0x5c/0x8b
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c012efbc>] check_preempt_timing+0x18f/0x1fb
[<c012f1ce>] sub_preempt_count+0x5c/0x8b
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c0117869>] copy_process+0x610/0xcfe
[<c0117869>] copy_process+0x610/0xcfe
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c01146c3>] wake_up_new_task+0x142/0x190
[<c0112f0e>] do_page_fault+0x0/0x67e
[<c0104c8d>] error_code+0x2d/0x38
[<c011995f>] profile_hit+0x2f/0x33
[<c0115431>] setscheduler+0xc4/0x254
[<c0104231>] sysenter_past_esp+0x52/0x71
[<c0102d2f>] sys_clone+0x40/0x42
[<c0104231>] sysenter_past_esp+0x52/0x71
---
This jackd crash seems to show up due to CONFIG_DEBUG_PREEMPT being set
on, but not sure yet.
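To help read the "scheduling while atomic: jackd/0x04000002/6519" line above, here is a rough sketch that splits the middle value (the preempt count) into its fields. The bit layout used here (8 bits of preemption depth, 8 bits of softirq count, 8 bits of hardirq count, and a PREEMPT_ACTIVE flag at 0x04000000) is an assumption about 2.6-era i386 kernels and is not stated anywhere in this thread, so treat the masks as hypothetical.

```python
# Assumed 2.6-era i386 layout of preempt_count; not confirmed by the thread.
PREEMPT_MASK = 0x000000FF    # nesting depth of preempt_disable()/spinlocks
SOFTIRQ_MASK = 0x0000FF00    # softirq nesting
HARDIRQ_MASK = 0x00FF0000    # hardirq (and NMI) nesting
PREEMPT_ACTIVE = 0x04000000  # set while a preemption is in progress


def decode_preempt_count(value):
    """Split a raw preempt_count value into its assumed fields."""
    return {
        "preempt_depth": value & PREEMPT_MASK,
        "softirq_count": (value & SOFTIRQ_MASK) >> 8,
        "hardirq_count": (value & HARDIRQ_MASK) >> 16,
        "preempt_active": bool(value & PREEMPT_ACTIVE),
    }


if __name__ == "__main__":
    # value taken from the "scheduling while atomic" line in the report
    print(decode_preempt_count(0x04000002))
```

Under these assumptions, 0x04000002 would mean a preemption depth of 2 with the PREEMPT_ACTIVE flag set, which is at least consistent with the "exited with preempt_count 2" note printed after the oops.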
CU
--
rncbc aka Rui Nuno Capela
[email protected]
>
> please re-download it, this is another bug i've fixed in the re-uploaded
> version. Does the new patch work?
>
OK. Now it works fine. Thanks Ingo.
Maybe I'm just a plain idiot, but wouldn't it be welcome to add another
dot number or whatever to the VP filename label? IMHO that would make it
clear which patch release we are actually applying.
The crash with jackd wasn't the only one; some other sound apps also failed
with similar kernel oops dumps.
And, just out of curiosity, I've also tested "vanilla" 2.6.9-rc3-mm3 and
it looks like it suffers from the same illness. So this has to be yet
another "feature" of the -mm line ;)
I'm glad this time VP came to the rescue :)
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
>i've released the -T3 VP patch: [snip]
Thanks. This fixes the USB boot-time problem (USB mouse not working).
I do not have the problem others have reported with audio - audio works
just fine for me.
I've made two runs using latencytest for the real time task / generating
audio. To recap, I have a dual 866 MHz Pentium III w/ 512 MB memory,
Ensoniq audio, 8139too network interface, and IDE disk. The IDE is set
to use udma2. Both tests run with both hard IRQ and soft IRQ threads
(except for the audio IRQ).
The first run, I set a limit of 500 usec and had no latencies longer than
that. Very good.
The second run, I set the limit to 200 usec and had 47 traces > 200 usec
in a 20-30 minute period. Two were VERY long, one was about 1.7 msec and
the other was over 50 msec. The material at the end summarizes the types
of traces I had. I will send the detailed traces separately.
--Mark
To summarize, the symptoms w/ trace numbers following:
[1] VERY long latencies
04 27
[2] Long traces, USB related
00 32 42 43 44
[3] Long traces, network related
03 05... 24 28 46
[4] Pruning icache
25 26
[5] clear_page_tables
01 02 29 30 31 33... 41 45
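The summary above can be cross-checked against the "47 traces" figure with a short sketch. It assumes that an entry such as "05... 24" stands for the inclusive range 05 through 24; that reading is an interpretation of the notation, not something the report states.

```python
# Trace numbers per symptom, copied from the summary above.
SUMMARY = {
    "VERY long latencies": "04 27",
    "Long traces, USB related": "00 32 42 43 44",
    "Long traces, network related": "03 05... 24 28 46",
    "Pruning icache": "25 26",
    "clear_page_tables": "01 02 29 30 31 33... 41 45",
}


def expand(spec):
    """Expand 'a b... c d' into [a, b, b+1, ..., c, d].

    Assumption: 'N...' followed by 'M' means the inclusive range N..M.
    """
    tokens = spec.split()
    numbers = []
    i = 0
    while i < len(tokens):
        token = tokens[i]
        if token.endswith("..."):
            start = int(token[:-3])
            end = int(tokens[i + 1])
            numbers.extend(range(start, end + 1))
            i += 2
        else:
            numbers.append(int(token))
            i += 1
    return numbers


if __name__ == "__main__":
    total = 0
    for symptom, spec in SUMMARY.items():
        traces = expand(spec)
        total += len(traces)
        print("%-30s %2d traces" % (symptom, len(traces)))
    print("total: %d" % total)
```

With the range reading, the five groups cover trace numbers 00 through 46 exactly once each, which matches the 47 traces mentioned for the 200 usec run.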
[1] VERY long latencies
This appears to be disk related, IRQ 14 is ide0.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 53418 us, entries: 226 (226) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: IRQ 14/203, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80
=> ended at: _spin_unlock+0x2d/0x60
=======>
00000001 0.000ms (+0.000ms): _spin_lock_irqsave (ide_intr)
00000001 0.000ms (+0.000ms): drive_is_ready (ide_intr)
00000001 0.001ms (+0.000ms): __ide_dma_test_irq (drive_is_ready)
00000001 0.002ms (+0.000ms): ide_inb (__ide_dma_test_irq)
00000001 0.002ms (+0.000ms): del_timer (ide_intr)
00000001 0.003ms (+0.422ms): _spin_lock_irqsave (del_timer)
00010002 0.425ms (+0.000ms): do_nmi (sub_preempt_count)
00010002 0.426ms (+0.003ms): do_nmi (___trace)
00010002 0.429ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 0.430ms (+0.993ms): nmi_watchdog_tick (default_do_nmi)
00010002 1.423ms (+0.000ms): do_nmi (_spin_lock_irqsave)
00010002 1.424ms (+0.001ms): do_nmi (___trace)
00010002 1.425ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 1.425ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010001 2.422ms (+0.000ms): do_nmi (_spin_lock_irqsave)
00010001 2.422ms (+0.001ms): do_nmi (check_preempt_timing)
00010001 2.424ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010001 2.424ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 3.420ms (+0.000ms): do_nmi (_spin_lock_irqsave)
00010002 3.420ms (+0.001ms): do_nmi (check_preempt_timing)
00010002 3.422ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 3.422ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 4.418ms (+0.000ms): do_nmi (_spin_lock_irqsave)
00010002 4.419ms (+0.003ms): do_nmi (___trace)
00010002 4.422ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 4.422ms (+0.994ms): nmi_watchdog_tick (default_do_nmi)
...
00010001 49.345ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 50.341ms (+0.000ms): do_nmi (_spin_lock_irqsave)
00010002 50.341ms (+0.001ms): do_nmi (___trace)
00010002 50.343ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 50.343ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 51.339ms (+0.000ms): do_nmi (add_preempt_count)
00010002 51.340ms (+0.003ms): do_nmi (cycles_to_usecs)
00010002 51.343ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 51.343ms (+0.994ms): nmi_watchdog_tick (default_do_nmi)
00010002 52.338ms (+0.000ms): do_nmi (_spin_lock_irqsave)
00010002 52.338ms (+0.003ms): do_nmi (touch_preempt_timing)
00010002 52.342ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 52.342ms (+0.994ms): nmi_watchdog_tick (default_do_nmi)
00010002 53.336ms (+0.000ms): do_nmi (add_preempt_count)
00010002 53.336ms (+0.003ms): do_nmi (check_preempt_timing)
00010002 53.340ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 53.340ms (+0.076ms): nmi_watchdog_tick (default_do_nmi)
00000002 53.417ms (+0.000ms): _spin_unlock_irqrestore (del_timer)
00000001 53.417ms (+0.000ms): _spin_unlock (ide_intr)
00000001 53.418ms (+0.000ms): sub_preempt_count (_spin_unlock)
00000001 53.419ms (+0.000ms): update_max_trace (check_preempt_timing)
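One way to read a trace like the one above is to look at how regularly nmi_watchdog_tick shows up. The sketch below parses a few lines copied from the trace and prints the spacing between successive ticks. The regular expression and the field names are my own and only assume the "count time (+delta): function (parent)" layout visible in the excerpt.

```python
import re

# A few lines copied from the 53 ms trace above.
TRACE_EXCERPT = """\
00000001 0.003ms (+0.422ms): _spin_lock_irqsave (del_timer)
00010002 0.430ms (+0.993ms): nmi_watchdog_tick (default_do_nmi)
00010002 1.425ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010001 2.424ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 3.422ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 4.422ms (+0.994ms): nmi_watchdog_tick (default_do_nmi)
"""

LINE_RE = re.compile(
    r"^([0-9a-f]{8})\s+(\d+)\.(\d{3})ms\s+\(\+(\d+)\.(\d{3})ms\):\s+(\S+)\s+\((\S+)\)$"
)


def parse_trace(text):
    """Parse trace lines into dicts; times are kept as integer microseconds."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match is None:
            continue  # skip headers, '...' markers and blank lines
        count, ms, frac, dms, dfrac, func, parent = match.groups()
        entries.append({
            "preempt_count": int(count, 16),
            "time_us": int(ms) * 1000 + int(frac),
            "delta_us": int(dms) * 1000 + int(dfrac),
            "func": func,
            "parent": parent,
        })
    return entries


def tick_intervals(entries, func="nmi_watchdog_tick"):
    """Return the gaps (in microseconds) between successive calls of func."""
    stamps = [e["time_us"] for e in entries if e["func"] == func]
    return [later - earlier for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    print(tick_intervals(parse_trace(TRACE_EXCERPT)))
```

The ticks in the excerpt are about 1 ms apart, and nothing but NMI handling is recorded between the _spin_lock_irqsave (del_timer) entry and the matching unlock. One possible reading is that the CPU spent nearly the whole 53 ms waiting inside that lock acquisition, but the trace alone does not show what was holding the lock.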
Here is the other one, same basic symptom but just not as long duration.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 1715 us, entries: 18 (18) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: IRQ 14/203, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80
=> ended at: _spin_unlock+0x2d/0x60
=======>
00000001 0.000ms (+0.000ms): _spin_lock_irqsave (ide_intr)
00000001 0.000ms (+0.000ms): drive_is_ready (ide_intr)
00000001 0.001ms (+0.000ms): __ide_dma_test_irq (drive_is_ready)
00000001 0.001ms (+0.000ms): ide_inb (__ide_dma_test_irq)
00000001 0.002ms (+0.000ms): del_timer (ide_intr)
00000001 0.002ms (+0.022ms): _spin_lock_irqsave (del_timer)
00010002 0.024ms (+0.000ms): do_nmi (sub_preempt_count)
00010002 0.025ms (+0.001ms): do_nmi (check_preempt_timing)
00010002 0.026ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 0.027ms (+0.996ms): nmi_watchdog_tick (default_do_nmi)
00010002 1.023ms (+0.000ms): do_nmi (sub_preempt_count)
00010002 1.023ms (+0.003ms): do_nmi (check_preempt_timing)
00010002 1.027ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010002 1.027ms (+0.688ms): nmi_watchdog_tick (default_do_nmi)
00000002 1.715ms (+0.000ms): _spin_unlock_irqrestore (del_timer)
00000001 1.715ms (+0.000ms): _spin_unlock (ide_intr)
00000001 1.716ms (+0.000ms): sub_preempt_count (_spin_unlock)
00000001 1.716ms (+0.000ms): update_max_trace (check_preempt_timing)
[2] Long traces, USB related
I had several traces with hundreds of lines of tracing data. This set
appears to be related to USB processing (moving the mouse).
This one is typical.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 365 us, entries: 320 (320) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: X/2815, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80
=> ended at: _spin_unlock_irqrestore+0x32/0x70
[3] Long traces, network related
Similar to the previous one, and a symptom I have reported before.
I don't see any obvious fixes for this type of problem and the overhead
of tracing may introduce a significant delay to make the problem look
much worse than it really is.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 203 us, entries: 401 (401) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: ksoftirqd/0/3, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock+0x1f/0x70
=> ended at: rtl8139_rx+0x217/0x340
OR
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 222 us, entries: 562 (562) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: rcp/7416, uid:2711 nice:0 policy:0 rt_prio:0
-----------------
=> started at: tcp_prequeue_process+0x49/0xb0
=> ended at: local_bh_enable+0x3f/0xb0
[4] Pruning icache
Don't recall seeing this one before. It was long enough to have a clock
tick come in, but with only 25 usec overhead, the clock overhead was not
significant.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 274 us, entries: 165 (165) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: kswapd0/72, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock+0x1f/0x70
=> ended at: _spin_unlock+0x2d/0x60
=======>
00000001 0.000ms (+0.002ms): _spin_lock (prune_icache)
00000001 0.002ms (+0.005ms): inode_has_buffers (prune_icache)
00000001 0.007ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.009ms (+0.002ms): inode_has_buffers (prune_icache)
00000001 0.011ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.013ms (+0.002ms): inode_has_buffers (prune_icache)
...
00000001 0.263ms (+0.002ms): inode_has_buffers (prune_icache)
00000001 0.265ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.266ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.268ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.269ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.271ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.272ms (+0.000ms): inode_has_buffers (prune_icache)
00000001 0.273ms (+0.002ms): _spin_unlock (prune_icache)
00000001 0.275ms (+0.000ms): sub_preempt_count (_spin_unlock)
00000001 0.276ms (+0.000ms): update_max_trace (check_preempt_timing)
[5] clear_page_tables
Usually short (<50-100 lines) but taking significant time to perform.
You can see the big hit at clear_page_tables in this one.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 234 us, entries: 38 (38) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: get_ltrace.sh/7082, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: unmap_vmas+0x12e/0x280
=> ended at: _spin_unlock+0x2d/0x60
=======>
00000001 0.000ms (+0.000ms): touch_preempt_timing (unmap_vmas)
00000001 0.000ms (+0.000ms): __bitmap_weight (unmap_vmas)
00000001 0.000ms (+0.000ms): unmap_page_range (unmap_vmas)
00000001 0.000ms (+0.000ms): zap_pmd_range (unmap_page_range)
00000001 0.000ms (+0.000ms): zap_pte_range (zap_pmd_range)
00000001 0.001ms (+0.000ms): kmap_atomic (zap_pte_range)
00000002 0.001ms (+0.000ms): page_address (zap_pte_range)
00000002 0.001ms (+0.000ms): set_page_dirty (zap_pte_range)
00000002 0.001ms (+0.000ms): page_remove_rmap (zap_pte_range)
00000002 0.002ms (+0.000ms): set_page_dirty (zap_pte_range)
00000002 0.002ms (+0.000ms): page_remove_rmap (zap_pte_range)
00000002 0.002ms (+0.000ms): kunmap_atomic (zap_pte_range)
00000001 0.003ms (+0.001ms): vm_acct_memory (exit_mmap)
00000001 0.004ms (+0.217ms): clear_page_tables (exit_mmap)
00000001 0.221ms (+0.003ms): flush_tlb_mm (exit_mmap)
00000001 0.225ms (+0.000ms): free_pages_and_swap_cache (exit_mmap)
00000001 0.225ms (+0.000ms): lru_add_drain (free_pages_and_swap_cache)
00000001 0.226ms (+0.000ms): release_pages (free_pages_and_swap_cache)
00000001 0.227ms (+0.000ms): _spin_lock_irq (release_pages)
00000001 0.227ms (+0.001ms): _spin_lock_irqsave (release_pages)
00000002 0.228ms (+0.000ms): _spin_unlock_irq (release_pages)
00000001 0.228ms (+0.000ms): __pagevec_free (release_pages)
00000001 0.228ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.229ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.229ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.230ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.230ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.230ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.231ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.231ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.232ms (+0.000ms): _spin_lock_irq (release_pages)
00000001 0.232ms (+0.000ms): _spin_lock_irqsave (release_pages)
00000002 0.232ms (+0.000ms): _spin_unlock_irq (release_pages)
00000001 0.233ms (+0.000ms): __pagevec_free (release_pages)
00000001 0.233ms (+0.000ms): free_hot_cold_page (__pagevec_free)
00000001 0.233ms (+0.001ms): _spin_unlock (exit_mmap)
00000001 0.235ms (+0.001ms): sub_preempt_count (_spin_unlock)
00000001 0.236ms (+0.000ms): update_max_trace (check_preempt_timing)
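To make the "big hit" easy to spot, a small sketch can pick the entry with the largest delta from a few lines of the trace above and relate it to the reported 234 us latency. It assumes the "(+delta)" on a line is the time until the next entry, which is how the timestamps in the excerpt add up. The helper name is hypothetical.

```python
import re

# Lines copied from the clear_page_tables trace above.
EXCERPT = """\
00000001 0.003ms (+0.001ms): vm_acct_memory (exit_mmap)
00000001 0.004ms (+0.217ms): clear_page_tables (exit_mmap)
00000001 0.221ms (+0.003ms): flush_tlb_mm (exit_mmap)
00000001 0.225ms (+0.000ms): free_pages_and_swap_cache (exit_mmap)
"""

ENTRY_RE = re.compile(r"\(\+(\d+)\.(\d{3})ms\):\s+(\S+)")

REPORTED_LATENCY_US = 234  # from the trace header


def worst_entry(text):
    """Return (function, delta_us) for the entry with the largest delta."""
    worst = None
    for line in text.splitlines():
        match = ENTRY_RE.search(line)
        if match is None:
            continue
        delta_us = int(match.group(1)) * 1000 + int(match.group(2))
        if worst is None or delta_us > worst[1]:
            worst = (match.group(3), delta_us)
    return worst


if __name__ == "__main__":
    func, delta_us = worst_entry(EXCERPT)
    share = delta_us * 100 // REPORTED_LATENCY_US
    print("%s: %d us (about %d%% of the reported latency)" % (func, delta_us, share))
```

On this excerpt the single clear_page_tables entry accounts for 217 of the 234 us, so shortening the rest of the path would change little.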
Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
For me, this one wants to panic on boot when trying to find the root
filesystem. Acts like either the aic7xxx module is missing (which I
don't think is the case) or hosed, or it's having trouble with the label
for the root partition (Fedora system). Will investigate further when I
get home tonight, unless something jumps out at anyone.
kr
K.R. Foley wrote:
> Ingo Molnar wrote:
>
>> i've released the -T3 VP patch:
>>
>>
>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>>
>>
>
> For me, this one wants to panic on boot when trying to find the root
> filesystem. Acts like either the aic7xxx module is missing (which I
> don't think is the case) or hosed, or it's having trouble with the label
> for the root partition (Fedora system). Will investigate further when I
> get home tonight, unless something jumps out at anyone.
>
> kr
For clarification: This appears to be a problem in 2.6.9-rc3-mm3 also.
kr
* K.R. Foley <[email protected]> wrote:
> >For me, this one wants to panic on boot when trying to find the root
> >filesystem. Acts like either the aic7xxx module is missing (which I
> >don't think is the case) or hosed, or it's having trouble with the label
> >for the root partition (Fedora system). Will investigate further when I
> >get home tonight, unless something jumps out at anyone.
> >
> >kr
>
> For clarification: This appears to be a problem in 2.6.9-rc3-mm3 also.
try root=/dev/sda3 (or wherever your root fs is) instead of
root=LABEL=/, in /etc/grub.conf.
Ingo
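As an illustration of the suggested change, the sketch below rewrites the root= argument of a kernel command line. The sample command line matches the one visible in the boot log further down the thread, /dev/sda3 is just the example device from the message above, and set_root is a hypothetical helper, not an existing tool. The actual edit would be made by hand on the kernel line in /etc/grub.conf.

```python
def set_root(cmdline, device):
    """Return cmdline with its root= argument pointing at device.

    If no root= argument is present, one is appended.
    """
    replaced = False
    new_args = []
    for arg in cmdline.split():
        if arg.startswith("root="):
            new_args.append("root=" + device)
            replaced = True
        else:
            new_args.append(arg)
    if not replaced:
        new_args.append("root=" + device)
    return " ".join(new_args)


if __name__ == "__main__":
    before = "ro root=LABEL=/ rhgb quiet noapic"
    print(set_root(before, "/dev/sda3"))
```

Naming the device directly avoids the label lookup, which helps separate "the label cannot be resolved" from "the disk driver did not find the disk".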
Ingo Molnar wrote:
>
>>
>> i've released the -T3 VP patch:
>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>>
>
OK. Just to let you know, both of my personal machines are now running on
bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
- my laptop, Mandrake 10.1c: [email protected] UP
- my desktop, SUSE 9.1 Pro: [email protected] HT/SMP
USB is fine and so is jackd, just to mention my most recent annoyances.
Even my Wacom Graphire USB is working without anything else but the kernel
supplied stuff. With most of the previous kernel installs I had to pull the
mousedev, evdev and wacom modules over from linuxwacom.sf.net, just to get
this tablet working straight on X, but now it seems pretty native :)
Good times we are living in, eh?
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>>For me, this one wants to panic on boot when trying to find the root
>>>filesystem. Acts like either the aic7xxx module is missing (which I
>>>don't think is the case) or hosed, or it's having trouble with the label
>>>for the root partition (Fedora system). Will investigate further when I
>>>get home tonight, unless something jumps out at anyone.
>>>
>>>kr
>>
>>For clarification: This appears to be a problem in 2.6.9-rc3-mm3 also.
>
>
> try root=/dev/sda3 (or wherever your root fs is) instead of
> root=LABEL=/, in /etc/grub.conf.
>
> Ingo
>
Thanks. Tried that just to be sure. However, I don't seem to be the only
one having this problem with aic7xxx.
kr
On Thu, 2004-10-07 at 19:26, Rui Nuno Capela wrote:
> Ingo Molnar wrote:
> >
> >>
> >> i've released the -T3 VP patch:
> >> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
> >>
> >
>
> OK. Just to let you know, both of my personal machines are now running on
> bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
This actually feels a _lot_ snappier than mm2, which seemed prone to
weird stalls. I don't have any numbers to back this up yet.
Lee
Lee Revell wrote:
> On Thu, 2004-10-07 at 19:26, Rui Nuno Capela wrote:
>
>>Ingo Molnar wrote:
>>
>>>>i've released the -T3 VP patch:
>>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>>>>
>>>
>>OK. Just to let you know, both of my personal machines are now running on
>>bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
>
>
> This actually feels a _lot_ snappier than mm2, which seemed prone to
> weird stalls. I don't have any numbers to back this up yet.
mm2 had a completely different cpu scheduler so no meaningful comparison
can be made. Try comparing to mm3 vanilla.
Cheers,
Con
* K.R. Foley <[email protected]> wrote:
> Ingo Molnar wrote:
> >* K.R. Foley <[email protected]> wrote:
> >
> >
> >>>For me, this one wants to panic on boot when trying to find the root
> >>>filesystem. Acts like either the aic7xxx module is missing (which I
> >>>don't think is the case) or hosed, or it's having trouble with the label
> >>>for the root partition (Fedora system). Will investigate further when I
> >>>get home tonight, unless something jumps out at anyone.
> >>>
> >>>kr
> >>
> >>For clarification: This appears to be a problem in 2.6.9-rc3-mm3 also.
> >
> >
> >try root=/dev/sda3 (or wherever your root fs is) instead of
> >root=LABEL=/, in /etc/grub.conf.
> >
> > Ingo
> >
>
> Thanks. Tried that just to be sure. However, I don't seem to be the
> only one having this problem with aic7xxx.
could you send me the following info:
- full log of a failed boot
- full log of a successful boot
- the output of 'mount'
Ingo
* Lee Revell <[email protected]> wrote:
> On Thu, 2004-10-07 at 19:26, Rui Nuno Capela wrote:
> > Ingo Molnar wrote:
> > >
> > >>
> > >> i've released the -T3 VP patch:
> > >> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
> > >>
> > >
> >
> > OK. Just to let you know, both of my personal machines are now running on
> > bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
>
> This actually feels a _lot_ snappier than mm2, which seemed prone to
> weird stalls. I don't have any numbers to back this up yet.
yeah, -mm is back to the development branch of the stock scheduler.
(i.e. the scheduler changes destined for 2.6.10.)
Ingo
Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
>
>>On Thu, 2004-10-07 at 19:26, Rui Nuno Capela wrote:
>>
>>>Ingo Molnar wrote:
>>>
>>>>>i've released the -T3 VP patch:
>>>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>>>>>
>>>>
>>>OK. Just to let you know, both of my personal machines are now running on
>>>bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
>>
>>This actually feels a _lot_ snappier than mm2, which seemed prone to
>>weird stalls. I don't have any numbers to back this up yet.
>
>
> yeah, -mm is back to the development branch of the stock scheduler.
> (i.e. the scheduler changes destined for 2.6.10.)
It's also got a fix for the cache hot timing bug which was causing havoc
with the load balancer.
Peter
--
Peter Williams [email protected]
"Learning, n. The kind of ignorance distinguishing the studious."
-- Ambrose Bierce
On Tue, Oct 05, 2004 at 03:47:07PM +0200, Ingo Molnar wrote:
> i've released the -T1 VP patch:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T1
> Changes since -T0:
> - added the read-lock fix from Hugh that affects SMP systems. This
> could fix Rui's problem - i've checked -T1 on a P4/HT box and saw no
> problems, BYMMV.
> - compilation fixes (for those who downloaded T0 early)
> - small tracer improvement
> to build a -T1 tree from scratch the patching order is:
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc3.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc3/2.6.9-rc3-mm2/2.6.9-rc3-mm2.bz2
> + http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm2-T1
The version numbers are up to Chebyshev polynomials. Now I have to try it.
-- wli
Oct 8 06:48:42 porky syslogd 1.4.1: restart.
Oct 8 06:48:42 porky syslog: syslogd startup succeeded
Oct 8 06:48:42 porky syslog: klogd startup succeeded
Oct 8 06:48:42 porky kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 8 06:48:42 porky kernel: Linux version 2.6.9-rc3-mm3-VP-T3 ([email protected]) (gcc version 3.3.3 20040412 (Red Hat Linux 3.3.3-7)) #8 SMP Thu Oct 7 23:07:40 CDT 2004
Oct 8 06:48:42 porky kernel: BIOS-provided physical RAM map:
Oct 8 06:48:42 porky kernel: BIOS-e820: 0000000000000000 - 00000000000a0000 (usable)
Oct 8 06:48:42 porky kernel: BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
Oct 8 06:48:42 porky kernel: BIOS-e820: 0000000000100000 - 000000001ff9e000 (usable)
Oct 8 06:48:42 porky kernel: BIOS-e820: 000000001ff9e000 - 0000000020000000 (reserved)
Oct 8 06:48:42 porky kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
Oct 8 06:48:42 porky kernel: BIOS-e820: 00000000fee00000 - 00000000fee10000 (reserved)
Oct 8 06:48:42 porky kernel: BIOS-e820: 00000000ffb00000 - 0000000100000000 (reserved)
Oct 8 06:48:42 porky kernel: 0MB HIGHMEM available.
Oct 8 06:48:42 porky kernel: 511MB LOWMEM available.
Oct 8 06:48:42 porky kernel: found SMP MP-table at 000fe710
Oct 8 06:48:42 porky kernel: DMI 2.3 present.
Oct 8 06:48:42 porky irqbalance: irqbalance startup succeeded
Oct 8 06:48:42 porky kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Oct 8 06:48:42 porky kernel: Processor #0 6:8 APIC version 17
Oct 8 06:48:42 porky kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Oct 8 06:48:42 porky kernel: Processor #1 6:8 APIC version 17
Oct 8 06:48:42 porky kernel: ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
Oct 8 06:48:42 porky kernel: ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
Oct 8 06:48:42 porky kernel: Using ACPI for processor (LAPIC) configuration information
Oct 8 06:48:42 porky kernel: Intel MultiProcessor Specification v1.4
Oct 8 06:48:42 porky kernel: Virtual Wire compatibility mode.
Oct 8 06:48:42 porky kernel: OEM ID: DELL Product ID: WS 620 APIC at: 0xFEE00000
Oct 8 06:48:42 porky kernel: I/O APIC #2 Version 32 at 0xFEC00000.
Oct 8 06:48:42 porky kernel: Enabling APIC mode: Flat. Using 1 I/O APICs
Oct 8 06:48:42 porky kernel: Processors: 2
Oct 8 06:48:42 porky portmap: portmap startup succeeded
Oct 8 06:48:42 porky kernel: Built 1 zonelists
Oct 8 06:48:42 porky kernel: Initializing CPU#0
Oct 8 06:48:43 porky kernel: Kernel command line: ro root=LABEL=/ rhgb quiet noapic
Oct 8 06:48:43 porky kernel: (swapper/0): new 196391 us maximum-latency critical section.
Oct 8 06:48:43 porky kernel: => started at: <start_kernel+0x48/0x1e0>
Oct 8 06:48:43 porky kernel: => ended at: <cond_resched+0x25/0x80>
Oct 8 06:48:43 porky kernel: [<c0137a38>] touch_preempt_timing+0x48/0x50
Oct 8 06:48:43 porky kernel: [<c0137952>] check_preempt_timing+0x162/0x200
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky kernel: [<c0137a38>] touch_preempt_timing+0x48/0x50
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky rpc.statd[2435]: Version 1.0.6 Starting
Oct 8 06:48:43 porky kernel: [<c0138e58>] register_cpu_notifier+0x18/0x60
Oct 8 06:48:43 porky kernel: [<c0133688>] rcu_cpu_notify+0x38/0x40
Oct 8 06:48:43 porky kernel: [<c0381d0d>] rcu_init+0x6d/0x80
Oct 8 06:48:43 porky kernel: [<c037097c>] start_kernel+0xbc/0x1e0
Oct 8 06:48:43 porky nfslock: rpc.statd startup succeeded
Oct 8 06:48:43 porky kernel: [<c0370440>] unknown_bootoption+0x0/0x190
Oct 8 06:48:43 porky kernel: PID hash table entries: 2048 (order: 11, 32768 bytes)
Oct 8 06:48:43 porky kernel: (swapper/0): new 205265 us maximum-latency critical section.
Oct 8 06:48:43 porky kernel: => started at: <cond_resched+0x25/0x80>
Oct 8 06:48:43 porky kernel: => ended at: <cond_resched+0x25/0x80>
Oct 8 06:48:43 porky kernel: [<c0137a38>] touch_preempt_timing+0x48/0x50
Oct 8 06:48:43 porky kernel: [<c0137952>] check_preempt_timing+0x162/0x200
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky kernel: [<c0137a38>] touch_preempt_timing+0x48/0x50
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky kernel: [<c0138e58>] register_cpu_notifier+0x18/0x60
Oct 8 06:48:43 porky kernel: [<c012b21b>] timer_cpu_notify+0x2b/0x30
Oct 8 06:48:43 porky kernel: [<c03819d5>] init_timers+0x35/0x60
Oct 8 06:48:43 porky kernel: [<c037098b>] start_kernel+0xcb/0x1e0
Oct 8 06:48:43 porky kernel: [<c0370440>] unknown_bootoption+0x0/0x190
Oct 8 06:48:43 porky kernel: Detected 931.130 MHz processor.
Oct 8 06:48:43 porky kernel: Using tsc for high-res timesource
Oct 8 06:48:43 porky kernel: (swapper/0): new 502992 us maximum-latency critical section.
Oct 8 06:48:43 porky kernel: => started at: <cond_resched+0x25/0x80>
Oct 8 06:48:43 porky kernel: => ended at: <cond_resched+0x25/0x80>
Oct 8 06:48:43 porky kernel: [<c0137a38>] touch_preempt_timing+0x48/0x50
Oct 8 06:48:43 porky kernel: [<c0137952>] check_preempt_timing+0x162/0x200
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:43 porky kernel: [<c0137a38>] touch_preempt_timing+0x48/0x50
Oct 8 06:48:43 porky rpcidmapd: rpc.idmapd startup succeeded
Oct 8 06:48:43 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:44 porky kernel: [<c02b67b5>] cond_resched+0x25/0x80
Oct 8 06:48:44 porky kernel: [<c01212fb>] acquire_console_sem+0x2b/0x60
Oct 8 06:48:44 porky kernel: [<c038a4e3>] con_init+0x13/0x2b0
Oct 8 06:48:44 porky kernel: [<c0114df0>] mcount+0x14/0x18
Oct 8 06:48:44 porky kernel: [<c0389a12>] console_init+0x42/0x50
Oct 8 06:48:44 porky kernel: [<c037099a>] start_kernel+0xda/0x1e0
Oct 8 06:48:44 porky kernel: [<c0370440>] unknown_bootoption+0x0/0x190
Oct 8 06:48:44 porky kernel: Console: colour VGA+ 80x25
Oct 8 06:48:44 porky random: Initializing random number generator: succeeded
Oct 8 06:48:44 porky kernel: Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Oct 8 06:48:44 porky kernel: Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Oct 8 06:48:44 porky kernel: Memory: 513384k/523896k available (1762k kernel code, 9892k reserved, 725k data, 284k init, 0k highmem)
Oct 8 06:48:44 porky kernel: Checking if this processor honours the WP bit even in supervisor mode... Ok.
Oct 8 06:48:44 porky kernel: Security Scaffold v1.0.0 initialized
Oct 8 06:48:44 porky kernel: Capability LSM initialized
Oct 8 06:48:44 porky kernel: Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Oct 8 06:48:44 porky rc: Starting pcmcia: succeeded
Oct 8 06:48:44 porky kernel: CPU: L1 I cache: 16K, L1 D cache: 16K
Oct 8 06:48:44 porky kernel: CPU: L2 cache: 256K
Oct 8 06:48:44 porky kernel: Intel machine check architecture supported.
Oct 8 06:48:44 porky kernel: Intel machine check reporting enabled on CPU#0.
Oct 8 06:48:44 porky kernel: Enabling fast FPU save and restore... done.
Oct 8 06:48:44 porky kernel: Enabling unmasked SIMD FPU exception support... done.
Oct 8 06:48:44 porky kernel: Checking 'hlt' instruction... OK.
Oct 8 06:48:44 porky kernel: CPU0: Intel Pentium III (Coppermine) stepping 06
Oct 8 06:48:44 porky kernel: per-CPU timeslice cutoff: 731.28 usecs.
Oct 8 06:48:39 porky sysctl: kernel.sysrq = 0
Oct 8 06:48:44 porky kernel: task migration cache decay timeout: 1 msecs.
Oct 8 06:48:39 porky sysctl: kernel.core_uses_pid = 1
Oct 8 06:48:44 porky kernel: Booting processor 1/1 eip 2000
Oct 8 06:48:39 porky network: Setting network parameters: succeeded
Oct 8 06:48:44 porky kernel: Initializing CPU#1
Oct 8 06:48:39 porky network: Bringing up loopback interface: succeeded
Oct 8 06:48:44 porky netfs: Mounting other filesystems: succeeded
Oct 8 06:48:44 porky kernel: CPU: L1 I cache: 16K, L1 D cache: 16K
Oct 8 06:48:44 porky kernel: CPU: L2 cache: 256K
Oct 8 06:48:44 porky kernel: Intel machine check architecture supported.
Oct 8 06:48:44 porky kernel: Intel machine check reporting enabled on CPU#1.
Oct 8 06:48:44 porky kernel: CPU1: Intel Pentium III (Coppermine) stepping 06
Oct 8 06:48:44 porky kernel: Total of 2 processors activated (3682.30 BogoMIPS).
Oct 8 06:48:44 porky kernel: checking TSC synchronization across 2 CPUs: passed.
Oct 8 06:48:44 porky kernel: ksoftirqd started up.
Oct 8 06:48:44 porky kernel: Brought up 2 CPUs
Oct 8 06:48:44 porky kernel: ksoftirqd started up.
Oct 8 06:48:44 porky kernel: checking if image is initramfs...it isn't (no cpio magic); looks like an initrd
Oct 8 06:48:44 porky kernel: Freeing initrd memory: 355k freed
Oct 8 06:48:44 porky kernel: NET: Registered protocol family 16
Oct 8 06:48:44 porky autofs: automount startup succeeded
Oct 8 06:48:44 porky kernel: PCI: PCI BIOS revision 2.10 entry at 0xfc03e, last bus=4
Oct 8 06:48:44 porky kernel: PCI: Using configuration type 1
Oct 8 06:48:44 porky kernel: mtrr: v2.0 (20020519)
Oct 8 06:48:44 porky kernel: Linux Plug and Play Support v0.97 (c) Adam Belay
Oct 8 06:48:44 porky kernel: PCI: Probing PCI hardware
Oct 8 06:48:44 porky kernel: PCI: Probing PCI hardware (bus 00)
Oct 8 06:48:44 porky kernel: PCI: Transparent bridge - 0000:00:1e.0
Oct 8 06:48:44 porky kernel: PCI: Using IRQ router PIIX/ICH [8086/2410] at 0000:00:1f.0
Oct 8 06:48:45 porky smartd[2575]: smartd version 5.21 Copyright (C) 2002-3 Bruce Allen
Oct 8 06:48:45 porky kernel: PCI: Failed to allocate mem resource #0:1000@0 for 0000:03:00.0
Oct 8 06:48:45 porky smartd[2575]: Home page is http://smartmontools.sourceforge.net/
Oct 8 06:48:45 porky kernel: apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
Oct 8 06:48:45 porky smartd[2575]: Opened configuration file /etc/smartd.conf
Oct 8 06:48:45 porky kernel: apm: disabled - APM is not SMP safe.
Oct 8 06:48:45 porky smartd[2575]: Configuration file /etc/smartd.conf parsed.
Oct 8 06:48:45 porky kernel: Starting balanced_irq
Oct 8 06:48:45 porky smartd[2575]: Device: /dev/hda, opened
Oct 8 06:48:45 porky kernel: VFS: Disk quotas dquot_6.5.1
Oct 8 06:48:45 porky smartd[2575]: Device: /dev/hda, unable to read Device Identity Structure
Oct 8 06:48:45 porky kernel: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Oct 8 06:48:45 porky smartd[2575]: Unable to register ATA device /dev/hda at line 30 of file /etc/smartd.conf
Oct 8 06:48:45 porky kernel: Initializing Cryptographic API
Oct 8 06:48:45 porky smartd[2575]: Unable to register device /dev/hda (no Directive -d removable). Exiting.
Oct 8 06:48:45 porky kernel: vesafb: probe of vesafb0 failed with error -6
Oct 8 06:48:45 porky kernel: isapnp: Scanning for PnP cards...
Oct 8 06:48:45 porky smartd: smartd startup failed
Oct 8 06:48:45 porky kernel: isapnp: No Plug & Play device found
Oct 8 06:48:45 porky kernel: requesting new irq thread for IRQ8...
Oct 8 06:48:45 porky kernel: Real Time Clock Driver v1.12
Oct 8 06:48:45 porky kernel: requesting new irq thread for IRQ12...
Oct 8 06:48:45 porky kernel: serio: i8042 AUX port at 0x60,0x64 irq 12
Oct 8 06:48:45 porky kernel: serio: i8042 KBD port at 0x60,0x64 irq 1
Oct 8 06:48:45 porky kernel: io scheduler noop registered
Oct 8 06:48:45 porky kernel: io scheduler anticipatory registered
Oct 8 06:48:45 porky kernel: io scheduler deadline registered
Oct 8 06:48:45 porky kernel: io scheduler cfq registered
Oct 8 06:48:45 porky kernel: RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Oct 8 06:48:45 porky kernel: Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
Oct 8 06:48:45 porky kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Oct 8 06:48:45 porky kernel: ICH: IDE controller at PCI slot 0000:00:1f.1
Oct 8 06:48:45 porky kernel: ICH: chipset revision 2
Oct 8 06:48:45 porky kernel: ICH: not 100%% native mode: will probe irqs later
Oct 8 06:48:45 porky kernel: ide0: BM-DMA at 0xffa0-0xffa7, BIOS settings: hda:DMA, hdb:pio
Oct 8 06:48:45 porky kernel: ide1: BM-DMA at 0xffa8-0xffaf, BIOS settings: hdc:DMA, hdd:pio
Oct 8 06:48:45 porky kernel: hda: SAMSUNG CD-R/RW SW-248F, ATAPI CD/DVD-ROM drive
Oct 8 06:48:45 porky kernel: requesting new irq thread for IRQ14...
Oct 8 06:48:45 porky kernel: elevator: using anticipatory as default io scheduler
Oct 8 06:48:45 porky kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Oct 8 06:48:45 porky kernel: hdc: Lite-On LTN483S 48x Max, ATAPI CD/DVD-ROM drive
Oct 8 06:48:45 porky kernel: requesting new irq thread for IRQ15...
Oct 8 06:48:45 porky kernel: ide1 at 0x170-0x177,0x376 on irq 15
Oct 8 06:48:46 porky kernel: mice: PS/2 mouse device common for all mice
Oct 8 06:48:46 porky kernel: IRQ#12 thread started up.
Oct 8 06:48:46 porky kernel: requesting new irq thread for IRQ1...
Oct 8 06:48:46 porky kernel: IRQ#1 thread started up.
Oct 8 06:48:46 porky kernel: input: AT Translated Set 2 keyboard on isa0060/serio0
Oct 8 06:48:46 porky kernel: input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
Oct 8 06:48:46 porky kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
Oct 8 06:48:46 porky kernel: NET: Registered protocol family 2
Oct 8 06:48:46 porky kernel: IP: routing cache hash table of 2048 buckets, 32Kbytes
Oct 8 06:48:46 porky kernel: TCP: Hash tables configured (established 16384 bind 21845)
Oct 8 06:48:46 porky kernel: NET: Registered protocol family 1
Oct 8 06:48:46 porky kernel: NET: Registered protocol family 17
Oct 8 06:48:46 porky kernel: NET: Registered protocol family 8
Oct 8 06:48:46 porky kernel: NET: Registered protocol family 20
Oct 8 06:48:46 porky kernel: md: Autodetecting RAID arrays.
Oct 8 06:48:46 porky kernel: md: autorun ...
Oct 8 06:48:46 porky rc: Starting hpoj: succeeded
Oct 8 06:48:46 porky kernel: md: ... autorun DONE.
Oct 8 06:48:46 porky kernel: RAMDISK: Compressed image found at block 0
Oct 8 06:48:46 porky kernel: VFS: Mounted root (ext2 filesystem).
Oct 8 06:48:46 porky kernel: SCSI subsystem initialized
Oct 8 06:48:46 porky kernel: PCI: Found IRQ 10 for device 0000:04:05.0
Oct 8 06:48:46 porky kernel: PCI: Sharing IRQ 10 with 0000:00:1f.3
Oct 8 06:48:47 porky kernel: requesting new irq thread for IRQ10...
Oct 8 06:48:47 porky kernel: PCI: Found IRQ 5 for device 0000:04:05.1
Oct 8 06:48:47 porky kernel: PCI: Sharing IRQ 5 with 0000:04:0a.0
Oct 8 06:48:47 porky kernel: requesting new irq thread for IRQ5...
Oct 8 06:48:47 porky kernel: scsi0 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
Oct 8 06:48:47 porky kernel: <Adaptec aic7899 Ultra160 SCSI adapter>
Oct 8 06:48:47 porky kernel: aic7899: Ultra160 Wide Channel A, SCSI Id=7, 32/253 SCBs
Oct 8 06:48:47 porky kernel:
Oct 8 06:48:47 porky kernel: IRQ#10 thread started up.
Oct 8 06:48:47 porky kernel: (scsi0:A:0): 160.000MB/s transfers (80.000MHz DT, offset 127, 16bit)
Oct 8 06:48:47 porky kernel: Vendor: QUANTUM Model: ATLAS10K2-TY092L Rev: DA40
Oct 8 06:48:47 porky kernel: Type: Direct-Access ANSI SCSI revision: 03
Oct 8 06:48:48 porky kernel: scsi0:A:0:0: Tagged Queuing enabled. Depth 32
Oct 8 06:48:48 porky kernel: SCSI device sda: 17783239 512-byte hdwr sectors (9105 MB)
Oct 8 06:48:48 porky kernel: SCSI device sda: drive cache: write back
Oct 8 06:48:48 porky kernel: sda: sda1 sda2 sda3
Oct 8 06:48:48 porky kernel: Attached scsi disk sda at scsi0, channel 0, id 0, lun 0
Oct 8 06:48:48 porky kernel: scsi1 : Adaptec AIC7XXX EISA/VLB/PCI SCSI HBA DRIVER, Rev 6.2.36
Oct 8 06:48:48 porky kernel: <Adaptec aic7899 Ultra160 SCSI adapter>
Oct 8 06:48:48 porky kernel: aic7899: Ultra160 Wide Channel B, SCSI Id=7, 32/253 SCBs
Oct 8 06:48:48 porky kernel:
Oct 8 06:48:48 porky kernel: IRQ#5 thread started up.
Oct 8 06:48:48 porky kernel: (scsi1:A:0): 20.000MB/s transfers (20.000MHz, offset 15)
Oct 8 06:48:48 porky kernel: Vendor: SEAGATE Model: SX118273LC Rev: 6679
Oct 8 06:48:48 porky kernel: Type: Direct-Access ANSI SCSI revision: 02
Oct 8 06:48:48 porky kernel: scsi1:A:0:0: Tagged Queuing enabled. Depth 32
Oct 8 06:48:48 porky kernel: SCSI device sdb: 35566480 512-byte hdwr sectors (18210 MB)
Oct 8 06:48:48 porky kernel: SCSI device sdb: drive cache: write through
Oct 8 06:48:48 porky kernel: sdb: sdb1
Oct 8 06:48:48 porky kernel: Attached scsi disk sdb at scsi1, channel 0, id 0, lun 0
Oct 8 06:48:48 porky kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 8 06:48:48 porky kernel: kjournald starting. Commit interval 5 seconds
Oct 8 06:48:48 porky kernel: Freeing unused kernel memory: 284k freed
Oct 8 06:48:48 porky kernel: NET: Registered protocol family 10
Oct 8 06:48:48 porky kernel: IPv6 over IPv4 tunneling driver
Oct 8 06:48:48 porky kernel: IRQ#8 thread started up.
Oct 8 06:48:48 porky kernel: usbcore: registered new driver usbfs
Oct 8 06:48:48 porky kernel: usbcore: registered new driver hub
Oct 8 06:48:48 porky kernel: USB Universal Host Controller Interface driver v2.2
Oct 8 06:48:48 porky kernel: PCI: Found IRQ 11 for device 0000:00:1f.2
Oct 8 06:48:48 porky kernel: uhci_hcd 0000:00:1f.2: Intel Corp. 82801AA USB
Oct 8 06:48:48 porky kernel: requesting new irq thread for IRQ11...
Oct 8 06:48:48 porky kernel: uhci_hcd 0000:00:1f.2: irq 11, io base 0xff80
Oct 8 06:48:48 porky kernel: uhci_hcd 0000:00:1f.2: new USB bus registered, assigned bus number 1
Oct 8 06:48:48 porky kernel: hub 1-0:1.0: USB hub found
Oct 8 06:48:48 porky kernel: hub 1-0:1.0: 2 ports detected
Oct 8 06:48:48 porky kernel: EXT3 FS on sda2, internal journal
Oct 8 06:48:48 porky kernel: device-mapper: 4.1.0-ioctl (2003-12-10) initialised: [email protected]
Oct 8 06:48:48 porky kernel: Adding 1044216k swap on /dev/sda3. Priority:-1 extents:1
Oct 8 06:48:48 porky kernel: program scsi_unique_id is using a deprecated SCSI ioctl, please convert it to SG_IO
Oct 8 06:48:49 porky last message repeated 11 times
Oct 8 06:48:49 porky kernel: kjournald starting. Commit interval 5 seconds
Oct 8 06:48:49 porky kernel: EXT3 FS on sda1, internal journal
Oct 8 06:48:49 porky kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 8 06:48:49 porky kernel: kjournald starting. Commit interval 5 seconds
Oct 8 06:48:49 porky kernel: EXT3 FS on sdb1, internal journal
Oct 8 06:48:49 porky kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 8 06:48:49 porky kernel: IA-32 Microcode Update Driver: v1.14 <[email protected]>
Oct 8 06:48:49 porky kernel: microcode: CPU0 already at revision 0x2 (current=0x2)
Oct 8 06:48:49 porky kernel: microcode: CPU1 already at revision 0x2 (current=0x2)
Oct 8 06:48:49 porky kernel: microcode: No suitable data for CPU0
Oct 8 06:48:49 porky kernel: microcode: No suitable data for CPU1
Oct 8 06:48:49 porky kernel: parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
Oct 8 06:48:49 porky kernel: parport0: irq 7 detected
Oct 8 06:48:49 porky kernel: Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0
Oct 8 06:48:49 porky kernel: Attached scsi generic sg1 at scsi1, channel 0, id 0, lun 0, type 0
Oct 8 06:48:49 porky kernel: inserting floppy driver for 2.6.9-rc3-mm3-VP-T3
Oct 8 06:48:49 porky kernel: Floppy drive(s): fd0 is 1.44M
Oct 8 06:48:49 porky kernel: requesting new irq thread for IRQ6...
Oct 8 06:48:49 porky kernel: IRQ#6 thread started up.
Oct 8 06:48:49 porky kernel: FDC 0 is a National Semiconductor PC87306
Oct 8 06:48:49 porky kernel: Linux Tulip driver version 1.1.13 (May 11, 2002)
Oct 8 06:48:49 porky kernel: PCI: Found IRQ 5 for device 0000:04:0a.0
Oct 8 06:48:49 porky kernel: PCI: Sharing IRQ 5 with 0000:04:05.1
Oct 8 06:48:49 porky kernel: tulip0: EEPROM default media type Autosense.
Oct 8 06:48:49 porky kernel: tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block.
Oct 8 06:48:49 porky kernel: tulip0: MII transceiver #3 config 3100 status 7809 advertising 01e1.
Oct 8 06:48:49 porky kernel: eth0: Digital DS21140 Tulip rev 32 at 0xe480, 00:00:C0:7F:A0:E9, IRQ 5.
Oct 8 06:48:49 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
Oct 8 06:48:49 porky kernel: IRQ#14 thread started up.
Oct 8 06:48:49 porky kernel: hda: ATAPI 48X CD-ROM CD-R/RW drive, 8192kB Cache, UDMA(33)
Oct 8 06:48:49 porky kernel: Uniform CD-ROM driver Revision: 3.20
Oct 8 06:48:49 porky kernel: IRQ#15 thread started up.
Oct 8 06:48:49 porky kernel: hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
Oct 8 06:48:49 porky kernel: ip_tables: (C) 2000-2002 Netfilter core team
Oct 8 06:48:49 porky kernel: Linux Tulip driver version 1.1.13 (May 11, 2002)
Oct 8 06:48:49 porky kernel: PCI: Found IRQ 5 for device 0000:04:0a.0
Oct 8 06:48:49 porky kernel: PCI: Sharing IRQ 5 with 0000:04:05.1
Oct 8 06:48:49 porky kernel: tulip0: EEPROM default media type Autosense.
Oct 8 06:48:49 porky kernel: tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block.
Oct 8 06:48:49 porky kernel: tulip0: MII transceiver #3 config 3100 status 7809 advertising 01e1.
Oct 8 06:48:49 porky kernel: eth0: Digital DS21140 Tulip rev 32 at 0xe480, 00:00:C0:7F:A0:E9, IRQ 5.
Oct 8 06:48:49 porky kernel: ip_tables: (C) 2000-2002 Netfilter core team
Oct 8 06:48:49 porky kernel: eth0: Setting full-duplex based on MII#3 link partner capability of 05e1.
Oct 8 06:48:49 porky kernel: parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
Oct 8 06:48:49 porky kernel: parport0: irq 7 detected
Oct 8 06:48:49 porky kernel: lp0: using parport0 (polling).
Oct 8 06:48:50 porky cups: cupsd startup succeeded
Oct 8 06:48:50 porky sshd: succeeded
Oct 8 06:48:50 porky xinetd: xinetd startup succeeded
Oct 8 06:48:50 porky ntpdate[2906]: step time server 192.168.36.1 offset -0.498488 sec
Oct 8 06:48:50 porky ntpd: succeeded
Oct 8 06:48:50 porky ntpd[2910]: ntpd [email protected] Thu Mar 11 11:46:39 EST 2004 (1)
Oct 8 06:48:50 porky ntpd: ntpd startup succeeded
Oct 8 06:48:50 porky ntpd[2910]: precision = 1.000 usec
Oct 8 06:48:50 porky ntpd[2910]: kernel time sync status 0040
Oct 8 06:48:50 porky ntpd[2910]: frequency initialized 16.655 PPM from /var/lib/ntp/drift
Oct 8 06:48:50 porky ntpd[2910]: configure: keyword "authenticate" unknown, line ignored
Oct 8 06:48:50 porky vsftpd: vsftpd vsftpd succeeded
Oct 8 06:48:50 porky sendmail: sendmail startup succeeded
Oct 8 06:48:50 porky sendmail: sm-client startup succeeded
Oct 8 06:48:50 porky gpm[2959]: *** info [startup.c(95)]:
Oct 8 06:48:50 porky gpm[2959]: Started gpm successfully. Entered daemon mode.
Oct 8 06:48:50 porky xinetd[2896]: xinetd Version 2.3.13 started with libwrap loadavg options compiled in.
Oct 8 06:48:50 porky xinetd[2896]: Started working: 2 available services
Oct 8 06:48:51 porky gpm[2959]: *** info [mice.c(1766)]:
Oct 8 06:48:51 porky gpm[2959]: imps2: Auto-detected intellimouse PS/2
Oct 8 06:48:51 porky gpm: gpm startup succeeded
Oct 8 06:48:51 porky crond: crond startup succeeded
Oct 8 06:48:51 porky xfs: xfs startup succeeded
Oct 8 06:48:52 porky anacron: anacron startup succeeded
Oct 8 06:48:52 porky atd: atd startup succeeded
Oct 8 06:48:52 porky readahead: Starting background readahead:
Oct 8 06:48:52 porky rc: Starting readahead: succeeded
Oct 8 06:48:53 porky messagebus: messagebus startup succeeded
Oct 8 06:48:53 porky mdmonitor: mdadm succeeded
Oct 8 06:48:53 porky mdmpd: mdmpd succeeded
* K.R. Foley <[email protected]> wrote:
> First let me say that, in case you haven't been following the other
> thread about this "2.6.9-rc3-mm3 fails to detect aic7xxx", I resolved
> this by backing out the bk-scsi.patch and bk-scsi-target.patch.
> Without those everything works fine.
> >could you send me the following info:
> >
> > - full log of a failed boot
>
> I would like to be able to send you this, but it doesn't
> get to the point of logging. [...]
meanwhile i could reproduce an aic79xx detection problem on a
testsystem, so no need to send the log.
Ingo
>i've released the -T3 VP patch:
I ran another series of tests, this time without using threaded IRQ's. Both
/proc/sys/kernel/softirq_preemption and
/proc/sys/kernel/hardirq_preemption
were zero.
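For anyone repeating this setup, the two switches can be checked with a short script. This is only a minimal sketch in Python: the two file names are the ones given above and exist only on a kernel carrying the voluntary-preempt patch, while the helper name `read_preempt_flags` and its `base` parameter are made up for illustration.

```python
from pathlib import Path

# The two switches named above; present only on a VP-patched kernel.
FLAGS = ("softirq_preemption", "hardirq_preemption")

def read_preempt_flags(base="/proc/sys/kernel"):
    """Return {flag name: integer value, or None if it cannot be read}."""
    result = {}
    for name in FLAGS:
        path = Path(base) / name
        try:
            result[name] = int(path.read_text().strip())
        except (OSError, ValueError):
            # Missing file (unpatched kernel) or unexpected content.
            result[name] = None
    return result

if __name__ == "__main__":
    for name, value in read_preempt_flags().items():
        print(f"{name} = {value}")
```

On an unpatched kernel both entries come back as None rather than raising, so the script can be run safely before and after booting the test kernel.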
The results were somewhat similar to what I saw yesterday when both types of
IRQ's were threaded. The number of latencies > 200 usec was higher:
threaded IRQ's - 47
unthreaded IRQ's - 128
However, the application level overhead (as measured by latencytest)
appears to be less without threaded IRQ's than with. For example,
during the X11 stress test:
                             CPU task     samples within 0.1 msec
nominal duration:            1.16 msec    n/a
Max with threaded IRQ's:     1.38 msec    99.97%
Max with unthreaded IRQ's:   1.25 msec    100.00%
The green line in the chart is MUCH thinner in the unthreaded test
as well indicating much less overhead with unthreaded IRQ's. Please
note that in both tests, the audio IRQ was not threaded (all others
were...).
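To put the X11 figures above in relative terms, the worst-case excess over the nominal CPU task duration can be worked out directly. This is a small Python sketch using only the three durations quoted above. Treating "max minus nominal" as the overhead is an assumption about how to read the latencytest numbers, and the function name `overhead_pct` is made up for illustration.

```python
NOMINAL_MS = 1.16  # nominal CPU task duration quoted above

def overhead_pct(max_ms, nominal_ms=NOMINAL_MS):
    """Worst-case excess over the nominal duration, in percent."""
    return (max_ms - nominal_ms) / nominal_ms * 100.0

# Maximum durations quoted above for the X11 stress test.
for label, max_ms in (("threaded", 1.38), ("unthreaded", 1.25)):
    extra = max_ms - NOMINAL_MS
    print(f"{label:>10}: +{extra:.2f} msec "
          f"({overhead_pct(max_ms):.1f}% over nominal)")
```

By this measure the worst case is roughly 19% over nominal with threaded IRQ's and roughly 8% without. It only compares the maxima and says nothing about the average case.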
This trend continued until the disk tests were performed. In those
cases, the threaded overhead was less but I believe that is due to
the setting of the audio IRQ to be non-threaded.
The types of latency traces I saw are summarized as follows. Some
details are at the end and if anyone wants full traces, please let
me know. The numbers that follow refer to the trace numbers.
[1] rt_check_expire - hundreds of traces of _spin_lock and _spin_unlock
with the preempt count >1
00 02 78 82 87 92 101 110 126
[2] do_wait - preempt count bounces up / down many cycles
01
[3] do_IRQ - appears to be chaining of hard and soft IRQ's without any
opportunity for preemption. May include a timer tick as well. There are a few
different ways to start this symptom. Most traces are like one of these.
03... 37 39... 73 76 79 81 83... 86 88 89 94 96 97 99 100 102 103 106 107
109 111 114 119 122 123
[4] rt_run_flush - a VERY long trace (> 4000 samples)
38 106
[5] rcu / cache actions
74 75 77 90 95 113 124
[6] prune_icache - an interrupt causes some delays
80 91 98
[7] clear_page_tables - also seen without threaded IRQ's
92 108 112 115 ... 118 120 121 125 127
[8] avc_insert - a long delay at one step...
105
--Mark
[1] rt_check_expire
The following is typical of this kind of trace.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 295 us, entries: 882 (882) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: X/2815, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: smp_apic_timer_interrupt+0x43/0xf0
=> ended at: irq_exit+0x33/0x50
=======>
00010000 0.000ms (+0.000ms): smp_apic_timer_interrupt (apic_timer_interrupt)
00010000 0.000ms (+0.000ms): profile_tick (smp_apic_timer_interrupt)
00010000 0.000ms (+0.000ms): profile_hook (profile_tick)
00010000 0.000ms (+0.000ms): _read_lock (profile_hook)
00010001 0.001ms (+0.000ms): notifier_call_chain (profile_hook)
00010001 0.001ms (+0.000ms): _read_unlock (profile_tick)
00010000 0.001ms (+0.000ms): update_process_times (smp_apic_timer_interrupt)
00010000 0.001ms (+0.000ms): update_one_process (update_process_times)
00010000 0.002ms (+0.000ms): run_local_timers (update_process_times)
00010000 0.002ms (+0.000ms): raise_softirq (update_process_times)
00010000 0.002ms (+0.000ms): scheduler_tick (update_process_times)
00010000 0.002ms (+0.000ms): sched_clock (scheduler_tick)
00010000 0.003ms (+0.000ms): _spin_lock (scheduler_tick)
00010001 0.003ms (+0.000ms): _spin_unlock (scheduler_tick)
00010000 0.003ms (+0.000ms): rebalance_tick (scheduler_tick)
00010000 0.004ms (+0.000ms): irq_exit (smp_apic_timer_interrupt)
00000001 0.004ms (+0.000ms): do_softirq (irq_exit)
00000001 0.004ms (+0.000ms): __do_softirq (do_softirq)
00000101 0.004ms (+0.000ms): ___do_softirq (__do_softirq)
00000101 0.005ms (+0.000ms): run_timer_softirq (___do_softirq)
00000101 0.005ms (+0.000ms): _spin_lock_irq (run_timer_softirq)
00000101 0.005ms (+0.000ms): _spin_lock_irqsave (run_timer_softirq)
00000102 0.006ms (+0.000ms): _spin_unlock_irq (run_timer_softirq)
00000101 0.006ms (+0.000ms): peer_check_expire (run_timer_softirq)
00000101 0.007ms (+0.000ms): cleanup_once (peer_check_expire)
00000101 0.007ms (+0.000ms): _spin_lock_bh (cleanup_once)
00000101 0.008ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 0.008ms (+0.000ms): _spin_unlock_bh (cleanup_once)
00000201 0.008ms (+0.000ms): local_bh_enable (cleanup_once)
00000101 0.009ms (+0.000ms): __mod_timer (peer_check_expire)
00000101 0.009ms (+0.000ms): _spin_lock_irqsave (__mod_timer)
00000102 0.010ms (+0.000ms): _spin_lock (__mod_timer)
00000103 0.010ms (+0.000ms): internal_add_timer (__mod_timer)
00000103 0.011ms (+0.000ms): _spin_unlock (__mod_timer)
00000102 0.011ms (+0.000ms): _spin_unlock_irqrestore (__mod_timer)
00000101 0.012ms (+0.000ms): cond_resched_all (run_timer_softirq)
00000101 0.012ms (+0.000ms): cond_resched_softirq (run_timer_softirq)
00000101 0.012ms (+0.000ms): _spin_lock_irq (run_timer_softirq)
00000101 0.012ms (+0.000ms): _spin_lock_irqsave (run_timer_softirq)
00000102 0.013ms (+0.000ms): _spin_unlock_irq (run_timer_softirq)
00000101 0.013ms (+0.000ms): rt_check_expire (run_timer_softirq)
00000101 0.014ms (+0.000ms): _spin_lock (rt_check_expire)
00000102 0.014ms (+0.000ms): _spin_unlock (rt_check_expire)
00000101 0.015ms (+0.000ms): _spin_lock (rt_check_expire)
00000102 0.015ms (+0.000ms): _spin_unlock (rt_check_expire)
... MANY repetitions ...
00000101 0.151ms (+0.000ms): _spin_lock (rt_check_expire)
00000102 0.152ms (+0.000ms): _spin_unlock (rt_check_expire)
00000101 0.152ms (+0.001ms): _spin_lock (rt_check_expire)
00000102 0.153ms (+0.000ms): rt_may_expire (rt_check_expire)
00000102 0.153ms (+0.000ms): _spin_unlock (rt_check_expire)
00000101 0.154ms (+0.000ms): _spin_lock (rt_check_expire)
00000102 0.154ms (+0.000ms): _spin_unlock (rt_check_expire)
... MANY more repetitions ...
00000101 0.289ms (+0.000ms): _spin_lock (rt_check_expire)
00000102 0.289ms (+0.000ms): _spin_unlock (rt_check_expire)
00000101 0.290ms (+0.000ms): mod_timer (rt_check_expire)
00000101 0.290ms (+0.000ms): __mod_timer (rt_check_expire)
00000101 0.290ms (+0.000ms): _spin_lock_irqsave (__mod_timer)
00000102 0.291ms (+0.000ms): _spin_lock (__mod_timer)
00000103 0.291ms (+0.000ms): internal_add_timer (__mod_timer)
00000103 0.291ms (+0.000ms): _spin_unlock (__mod_timer)
00000102 0.292ms (+0.000ms): _spin_unlock_irqrestore (__mod_timer)
00000101 0.292ms (+0.000ms): cond_resched_all (run_timer_softirq)
00000101 0.292ms (+0.000ms): cond_resched_softirq (run_timer_softirq)
00000101 0.292ms (+0.000ms): _spin_lock_irq (run_timer_softirq)
00000101 0.292ms (+0.000ms): _spin_lock_irqsave (run_timer_softirq)
00000102 0.293ms (+0.000ms): _spin_unlock_irq (run_timer_softirq)
00000101 0.293ms (+0.000ms): __wake_up (run_timer_softirq)
00000101 0.293ms (+0.000ms): _spin_lock_irqsave (__wake_up)
00000102 0.293ms (+0.000ms): __wake_up_common (__wake_up)
00000102 0.294ms (+0.000ms): _spin_unlock_irqrestore (run_timer_softirq)
00000101 0.294ms (+0.000ms): cond_resched_all (___do_softirq)
00000101 0.294ms (+0.000ms): cond_resched_softirq (___do_softirq)
00000001 0.295ms (+0.000ms): sub_preempt_count (irq_exit)
00000001 0.295ms (+0.000ms): update_max_trace (check_preempt_timing)
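The first column of each trace line is the preempt count in hex. A rough way to read it is sketched below in Python. The bit layout (low 8 bits preempt depth, next 8 bits softirq count, next 12 bits hardirq count) is an assumption based on the usual i386 layout in 2.6-era kernels and is not stated anywhere in the trace itself. The function name `decode_preempt_count` is made up for illustration.

```python
# Assumed layout (i386, 2.6-era kernels), not taken from the trace:
#   bits  0-7   preempt depth
#   bits  8-15  softirq count
#   bits 16-27  hardirq count
def decode_preempt_count(field):
    """Split the 8-digit hex field of a trace line into its counters."""
    value = int(field, 16)
    return {
        "preempt": value & 0xFF,
        "softirq": (value >> 8) & 0xFF,
        "hardirq": (value >> 16) & 0xFFF,
    }

# Values that appear in the trace above.
for field in ("00010000", "00000101", "00000202", "00000103"):
    print(field, decode_preempt_count(field))
```

Under that reading, the 00010000 entries are in hard interrupt context, and the long 00000101 / 00000102 run in rt_check_expire is softirq context with the preempt depth going between 1 and 2 on each lock / unlock pair.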
[2] do_wait
Another example of several cycles up / down from preempt 1 to 2 to 1 ...
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 228 us, entries: 572 (572) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: init/1, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _read_lock+0x1b/0x80
=> ended at: _read_unlock+0x1a/0x40
=======>
00000001 0.000ms (+0.000ms): _read_lock (do_wait)
00000001 0.000ms (+0.001ms): eligible_child (do_wait)
00000001 0.002ms (+0.000ms): selinux_task_wait (eligible_child)
00000001 0.003ms (+0.000ms): task_has_perm (selinux_task_wait)
00000001 0.004ms (+0.000ms): avc_has_perm (task_has_perm)
00000001 0.004ms (+0.000ms): avc_has_perm_noaudit (avc_has_perm)
00000001 0.004ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000002 0.005ms (+0.000ms): avc_lookup (avc_has_perm_noaudit)
00000002 0.006ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
00000001 0.007ms (+0.000ms): security_compute_av (avc_has_perm_noaudit)
00000001 0.007ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000002 0.007ms (+0.000ms): avc_insert (avc_has_perm_noaudit)
00000002 0.008ms (+0.000ms): memcpy (avc_has_perm_noaudit)
00000002 0.008ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
00000001 0.009ms (+0.000ms): avc_audit (avc_has_perm)
00000001 0.009ms (+0.000ms): eligible_child (do_wait)
00000001 0.010ms (+0.000ms): selinux_task_wait (eligible_child)
00000001 0.010ms (+0.000ms): task_has_perm (selinux_task_wait)
00000001 0.011ms (+0.000ms): avc_has_perm (task_has_perm)
00000001 0.011ms (+0.000ms): avc_has_perm_noaudit (avc_has_perm)
00000001 0.012ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
...
00000001 0.221ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000002 0.221ms (+0.000ms): memcpy (avc_has_perm_noaudit)
00000002 0.221ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
00000001 0.222ms (+0.000ms): avc_audit (avc_has_perm)
00000001 0.222ms (+0.000ms): eligible_child (do_wait)
00000001 0.222ms (+0.000ms): selinux_task_wait (eligible_child)
00000001 0.223ms (+0.000ms): task_has_perm (selinux_task_wait)
00000001 0.223ms (+0.000ms): avc_has_perm (task_has_perm)
00000001 0.223ms (+0.000ms): avc_has_perm_noaudit (avc_has_perm)
00000001 0.224ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000002 0.224ms (+0.000ms): memcpy (avc_has_perm_noaudit)
00000002 0.224ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
00000001 0.224ms (+0.000ms): avc_audit (avc_has_perm)
00000001 0.225ms (+0.000ms): wait_task_zombie (do_wait)
00000001 0.226ms (+0.000ms): _spin_lock_irq (wait_task_zombie)
00000001 0.226ms (+0.000ms): _spin_lock_irqsave (wait_task_zombie)
00000002 0.227ms (+0.000ms): _spin_unlock_irq (wait_task_zombie)
00000001 0.227ms (+0.000ms): _read_unlock (wait_task_zombie)
00000001 0.228ms (+0.000ms): sub_preempt_count (_read_unlock)
00000001 0.228ms (+0.000ms): update_max_trace (check_preempt_timing)
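For anyone following the first column of these traces, here is a minimal sketch of how it could be decoded. It assumes the column is the preempt count printed in hex with the usual 2.6-era layout (low 8 bits preempt depth, next 8 bits softirq count, the bits above that hardirq count). The bit widths and the helper name decode_preempt_count are assumptions for illustration, not taken from the tracer source.

```python
# Assumed layout of the first trace column (not confirmed by the tracer source):
#   bits 0-7   preempt depth
#   bits 8-15  softirq count
#   bits 16-27 hardirq count
def decode_preempt_count(field):
    """Split an 8-digit hex preempt-count field into its three counters."""
    value = int(field, 16)
    return {
        "preempt": value & 0xFF,
        "softirq": (value >> 8) & 0xFF,
        "hardirq": (value >> 16) & 0xFFF,
    }


# Values copied from the first column of the traces in this thread.
for field in ("00000001", "00000002", "00010102", "00000202"):
    print(field, decode_preempt_count(field))
```

Under these assumptions, 00000001 and 00000002 in the trace above differ only in preempt depth, while a value such as 00010102 would mean one hardirq level, one softirq level and a preempt depth of 2.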
[3] do_IRQ
Several traces look similar to those I reported previously, but have
some additional latency at the start and end. For example:
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 450 us, entries: 898 (898) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: cpu_burn/13771, uid:0 nice:10 policy:0 rt_prio:0
-----------------
=> started at: do_IRQ+0x19/0x90
=> ended at: irq_exit+0x33/0x50
=======>
00010000 0.000ms (+0.000ms): do_IRQ (common_interrupt)
00010000 0.000ms (+0.000ms): do_IRQ (<08048340>)
00010000 0.000ms (+0.000ms): do_IRQ (<0000000b>)
00010000 0.000ms (+0.000ms): _spin_lock (__do_IRQ)
00010001 0.000ms (+0.000ms): mask_and_ack_level_ioapic_irq (__do_IRQ)
...
00020000 0.041ms (+0.000ms): irq_exit (do_IRQ)
00010000 0.041ms (+0.000ms): rtl8139_interrupt (handle_IRQ_event)
00010000 0.041ms (+0.001ms): _spin_lock (rtl8139_interrupt)
00010001 0.042ms (+0.000ms): _spin_unlock (rtl8139_interrupt)
00010000 0.043ms (+0.000ms): _spin_lock (__do_IRQ)
00010001 0.043ms (+0.000ms): note_interrupt (__do_IRQ)
00010001 0.043ms (+0.000ms): end_level_ioapic_irq (__do_IRQ)
00010001 0.043ms (+0.000ms): unmask_IO_APIC_irq (__do_IRQ)
00010001 0.044ms (+0.000ms): _spin_lock_irqsave (unmask_IO_APIC_irq)
00010002 0.044ms (+0.000ms): __unmask_IO_APIC_irq (unmask_IO_APIC_irq)
00010002 0.044ms (+0.013ms): __modify_IO_APIC_irq (__unmask_IO_APIC_irq)
00010002 0.058ms (+0.000ms): _spin_unlock_irqrestore (__do_IRQ)
00010001 0.058ms (+0.000ms): _spin_unlock (__do_IRQ)
00010000 0.058ms (+0.000ms): irq_exit (do_IRQ)
00000001 0.059ms (+0.000ms): do_softirq (irq_exit)
00000001 0.059ms (+0.000ms): __do_softirq (do_softirq)
00000101 0.059ms (+0.000ms): ___do_softirq (__do_softirq)
00000101 0.059ms (+0.000ms): net_rx_action (___do_softirq)
... if I read this right, about 60 usec elapsed before reaching
the softirq, without any preemption opportunity ...
... many more traces ...
00010101 0.427ms (+0.000ms): rtl8139_interrupt (handle_IRQ_event)
00010101 0.427ms (+0.001ms): _spin_lock (rtl8139_interrupt)
00010102 0.429ms (+0.000ms): rtl8139_tx_interrupt (rtl8139_interrupt)
00010102 0.430ms (+0.000ms): _spin_unlock (rtl8139_interrupt)
00010101 0.430ms (+0.000ms): preempt_schedule (rtl8139_interrupt)
00010101 0.431ms (+0.000ms): _spin_lock (__do_IRQ)
00010102 0.431ms (+0.000ms): note_interrupt (__do_IRQ)
00010102 0.431ms (+0.000ms): end_level_ioapic_irq (__do_IRQ)
00010102 0.431ms (+0.000ms): unmask_IO_APIC_irq (__do_IRQ)
00010102 0.432ms (+0.000ms): _spin_lock_irqsave (unmask_IO_APIC_irq)
00010103 0.432ms (+0.000ms): __unmask_IO_APIC_irq (unmask_IO_APIC_irq)
00010103 0.432ms (+0.013ms): __modify_IO_APIC_irq (__unmask_IO_APIC_irq)
00010103 0.446ms (+0.000ms): _spin_unlock_irqrestore (__do_IRQ)
00010102 0.446ms (+0.000ms): preempt_schedule (__do_IRQ)
00010102 0.447ms (+0.000ms): _spin_unlock (__do_IRQ)
00010101 0.447ms (+0.000ms): preempt_schedule (__do_IRQ)
00010101 0.447ms (+0.000ms): irq_exit (do_IRQ)
00000101 0.448ms (+0.000ms): __wake_up (run_timer_softirq)
00000101 0.448ms (+0.000ms): _spin_lock_irqsave (__wake_up)
00000102 0.448ms (+0.000ms): __wake_up_common (__wake_up)
00000102 0.448ms (+0.000ms): _spin_unlock_irqrestore (run_timer_softirq)
00000101 0.449ms (+0.000ms): preempt_schedule (run_timer_softirq)
00000101 0.449ms (+0.000ms): cond_resched_all (___do_softirq)
00000101 0.449ms (+0.001ms): cond_resched_softirq (___do_softirq)
00000001 0.450ms (+0.000ms): sub_preempt_count (irq_exit)
00000001 0.450ms (+0.000ms): update_max_trace (check_preempt_timing)
[not quite sure if the same symptom but has many of the same
features as above...]
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 507 us, entries: 609 (609) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: rcp/13989, uid:2711 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80
=> ended at: local_bh_enable+0x3f/0xb0
[16 - ditto, providing some detail]
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 474 us, entries: 919 (919) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: klogd/1953, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: schedule+0x43/0x810
=> ended at: schedule+0x412/0x810
=======>
00000001 0.000ms (+0.000ms): schedule (work_resched)
00000001 0.000ms (+0.000ms): sched_clock (schedule)
00000001 0.001ms (+0.000ms): _spin_lock_irq (schedule)
00000001 0.001ms (+0.001ms): _spin_lock_irqsave (schedule)
00000002 0.002ms (+0.000ms): dequeue_task (schedule)
00000002 0.003ms (+0.000ms): recalc_task_prio (schedule)
00000002 0.003ms (+0.000ms): effective_prio (recalc_task_prio)
00000002 0.003ms (+0.005ms): enqueue_task (schedule)
00000002 0.008ms (+0.001ms): __switch_to (schedule)
00000002 0.010ms (+0.000ms): finish_task_switch (schedule)
00000002 0.010ms (+0.001ms): _spin_unlock_irq (finish_task_switch)
00000002 0.011ms (+0.000ms): smp_apic_timer_interrupt (_spin_unlock_irq)
00010002 0.012ms (+0.000ms): profile_tick (smp_apic_timer_interrupt)
00010002 0.013ms (+0.000ms): profile_hook (profile_tick)
00010002 0.013ms (+0.001ms): _read_lock (profile_hook)
00010003 0.014ms (+0.000ms): notifier_call_chain (profile_hook)
00010003 0.015ms (+0.000ms): _read_unlock (profile_tick)
00010002 0.015ms (+0.000ms): profile_hit (smp_apic_timer_interrupt)
00010002 0.016ms (+0.000ms): update_process_times (smp_apic_timer_interrupt)
00010002 0.017ms (+0.001ms): update_one_process (update_process_times)
00010002 0.018ms (+0.000ms): run_local_timers (update_process_times)
00010002 0.018ms (+0.000ms): raise_softirq (update_process_times)
00010002 0.019ms (+0.000ms): scheduler_tick (update_process_times)
00010002 0.019ms (+0.002ms): sched_clock (scheduler_tick)
00010002 0.021ms (+0.000ms): _spin_lock (scheduler_tick)
00010003 0.022ms (+0.000ms): task_timeslice (scheduler_tick)
00010003 0.022ms (+0.001ms): __bitmap_weight (scheduler_tick)
00010003 0.024ms (+0.000ms): __bitmap_weight (scheduler_tick)
00010003 0.024ms (+0.000ms): dequeue_task (scheduler_tick)
00010003 0.024ms (+0.000ms): effective_prio (scheduler_tick)
00010003 0.025ms (+0.000ms): enqueue_task (scheduler_tick)
00010003 0.025ms (+0.000ms): _spin_unlock (scheduler_tick)
00010002 0.025ms (+0.000ms): preempt_schedule (scheduler_tick)
00010002 0.025ms (+0.000ms): rebalance_tick (scheduler_tick)
00010002 0.026ms (+0.000ms): irq_exit (smp_apic_timer_interrupt)
00000003 0.026ms (+0.000ms): do_softirq (irq_exit)
00000003 0.027ms (+0.000ms): __do_softirq (do_softirq)
00000103 0.027ms (+0.000ms): ___do_softirq (__do_softirq)
00010103 0.028ms (+0.000ms): do_IRQ (___do_softirq)
00010103 0.028ms (+0.000ms): do_IRQ (<0000000b>)
00010103 0.029ms (+0.000ms): _spin_lock (__do_IRQ)
00010104 0.029ms (+0.000ms): mask_and_ack_level_ioapic_irq (__do_IRQ)
00010104 0.029ms (+0.000ms): mask_IO_APIC_irq (mask_and_ack_level_ioapic_irq)
00010104 0.030ms (+0.000ms): _spin_lock_irqsave (mask_IO_APIC_irq)
00010105 0.030ms (+0.000ms): __mask_IO_APIC_irq (mask_IO_APIC_irq)
00010105 0.030ms (+0.014ms): __modify_IO_APIC_irq (__mask_IO_APIC_irq)
00010105 0.045ms (+0.000ms): _spin_unlock_irqrestore (mask_and_ack_level_ioapic_irq)
00010104 0.045ms (+0.000ms): preempt_schedule (mask_and_ack_level_ioapic_irq)
00010104 0.045ms (+0.000ms): redirect_hardirq (__do_IRQ)
00010104 0.045ms (+0.000ms): _spin_unlock (__do_IRQ)
00010103 0.046ms (+0.000ms): preempt_schedule (__do_IRQ)
00010103 0.046ms (+0.000ms): handle_IRQ_event (__do_IRQ)
00010103 0.047ms (+0.000ms): rtl8139_interrupt (handle_IRQ_event)
... pretty deep nesting ...
00000109 0.464ms (+0.000ms): activate_task (try_to_wake_up)
00000109 0.465ms (+0.000ms): sched_clock (activate_task)
00000109 0.465ms (+0.000ms): recalc_task_prio (activate_task)
00000109 0.465ms (+0.000ms): effective_prio (recalc_task_prio)
00000109 0.466ms (+0.000ms): enqueue_task (activate_task)
00000109 0.466ms (+0.000ms): _spin_unlock_irqrestore (try_to_wake_up)
00000108 0.466ms (+0.000ms): preempt_schedule (try_to_wake_up)
00000108 0.466ms (+0.000ms): _spin_unlock_irqrestore (sk_stream_write_space)
00000107 0.467ms (+0.000ms): preempt_schedule (sk_stream_write_space)
00000107 0.467ms (+0.000ms): _spin_unlock (tcp_v4_rcv)
00000106 0.467ms (+0.000ms): preempt_schedule (tcp_v4_rcv)
00000105 0.468ms (+0.000ms): preempt_schedule (ip_local_deliver)
00000104 0.468ms (+0.000ms): preempt_schedule (netif_receive_skb)
00000104 0.469ms (+0.001ms): rtl8139_isr_ack (rtl8139_rx)
00000104 0.470ms (+0.000ms): _spin_unlock (rtl8139_poll)
00000103 0.471ms (+0.000ms): preempt_schedule (rtl8139_poll)
00000103 0.471ms (+0.000ms): cond_resched_all (___do_softirq)
00000103 0.471ms (+0.001ms): cond_resched_softirq (___do_softirq)
00000001 0.473ms (+0.001ms): preempt_schedule (finish_task_switch)
00000001 0.474ms (+0.000ms): sub_preempt_count (schedule)
00000001 0.474ms (+0.000ms): update_max_trace (check_preempt_timing)
[79 - ditto, different way to get started]
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 209 us, entries: 275 (275) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: sleep/15758, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: unmap_vmas+0x12e/0x280
=> ended at: _spin_unlock+0x2d/0x60
[4] rt_run_flush
The trace buffer filled up before this completed so I don't have
the sequence that stopped the trace.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 2586 us, entries: 4000 (12487) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: cpu_burn/13771, uid:0 nice:10 policy:0 rt_prio:0
-----------------
=> started at: smp_apic_timer_interrupt+0x43/0xf0
=> ended at: irq_exit+0x33/0x50
=======>
00010000 0.000ms (+0.000ms): smp_apic_timer_interrupt (apic_timer_interrupt)
00010000 0.000ms (+0.000ms): profile_tick (smp_apic_timer_interrupt)
00010000 0.000ms (+0.000ms): profile_hook (profile_tick)
00010000 0.000ms (+0.000ms): _read_lock (profile_hook)
00010001 0.001ms (+0.000ms): notifier_call_chain (profile_hook)
00010001 0.001ms (+0.000ms): _read_unlock (profile_tick)
00010000 0.001ms (+0.000ms): update_process_times (smp_apic_timer_interrupt)
00010000 0.001ms (+0.000ms): update_one_process (update_process_times)
00010000 0.002ms (+0.000ms): run_local_timers (update_process_times)
00010000 0.002ms (+0.000ms): raise_softirq (update_process_times)
00010000 0.002ms (+0.000ms): scheduler_tick (update_process_times)
00010000 0.002ms (+0.000ms): sched_clock (scheduler_tick)
00010000 0.003ms (+0.000ms): _spin_lock (scheduler_tick)
00010001 0.003ms (+0.000ms): _spin_unlock (scheduler_tick)
00010000 0.003ms (+0.000ms): rebalance_tick (scheduler_tick)
00010000 0.004ms (+0.000ms): irq_exit (smp_apic_timer_interrupt)
00000001 0.004ms (+0.000ms): do_softirq (irq_exit)
00000001 0.004ms (+0.000ms): __do_softirq (do_softirq)
00000101 0.004ms (+0.000ms): ___do_softirq (__do_softirq)
00000101 0.005ms (+0.000ms): run_timer_softirq (___do_softirq)
00000101 0.005ms (+0.000ms): _spin_lock_irq (run_timer_softirq)
00000101 0.005ms (+0.000ms): _spin_lock_irqsave (run_timer_softirq)
00000102 0.005ms (+0.000ms): _spin_unlock_irq (run_timer_softirq)
00000101 0.006ms (+0.000ms): rt_secret_rebuild (run_timer_softirq)
00000101 0.006ms (+0.000ms): rt_cache_flush (rt_secret_rebuild)
00000101 0.007ms (+0.000ms): _spin_lock_bh (rt_cache_flush)
00000101 0.007ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 0.007ms (+0.000ms): del_timer (rt_cache_flush)
00000202 0.008ms (+0.000ms): _spin_unlock_bh (rt_cache_flush)
00000201 0.008ms (+0.000ms): local_bh_enable (rt_cache_flush)
00000101 0.009ms (+0.000ms): rt_run_flush (rt_secret_rebuild)
00000101 0.009ms (+0.000ms): get_random_bytes (rt_run_flush)
00000101 0.009ms (+0.000ms): extract_entropy (get_random_bytes)
00000101 0.009ms (+0.000ms): _spin_lock_irqsave (extract_entropy)
00000102 0.010ms (+0.000ms): __wake_up (extract_entropy)
00000102 0.010ms (+0.000ms): _spin_lock_irqsave (__wake_up)
00000103 0.010ms (+0.000ms): __wake_up_common (__wake_up)
00000103 0.010ms (+0.000ms): _spin_unlock_irqrestore (extract_entropy)
00000102 0.011ms (+0.000ms): _spin_unlock_irqrestore (extract_entropy)
00000101 0.011ms (+0.000ms): SHATransform (extract_entropy)
00000101 0.011ms (+0.002ms): memcpy (SHATransform)
00000101 0.014ms (+0.000ms): add_entropy_words (extract_entropy)
00000101 0.014ms (+0.000ms): _spin_lock_irqsave (add_entropy_words)
00000102 0.014ms (+0.000ms): _spin_unlock_irqrestore (extract_entropy)
00000101 0.015ms (+0.000ms): SHATransform (extract_entropy)
00000101 0.015ms (+0.002ms): memcpy (SHATransform)
00000101 0.018ms (+0.000ms): add_entropy_words (extract_entropy)
00000101 0.018ms (+0.000ms): _spin_lock_irqsave (add_entropy_words)
00000102 0.018ms (+0.000ms): _spin_unlock_irqrestore (extract_entropy)
00000101 0.019ms (+0.000ms): _spin_lock_bh (rt_run_flush)
00000101 0.019ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 0.020ms (+0.000ms): _spin_unlock_bh (rt_run_flush)
00000201 0.020ms (+0.000ms): local_bh_enable (rt_run_flush)
00000101 0.020ms (+0.000ms): cond_resched_all (rt_run_flush)
00000101 0.020ms (+0.000ms): cond_resched_softirq (rt_run_flush)
00000101 0.021ms (+0.000ms): _spin_lock_bh (rt_run_flush)
00000101 0.021ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 0.021ms (+0.000ms): _spin_unlock_bh (rt_run_flush)
00000201 0.021ms (+0.000ms): local_bh_enable (rt_run_flush)
... this kind of cycle repeats for a VERY long time, and the watchdog
provides this piece of information ...
00000101 0.769ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 0.770ms (+0.000ms): _spin_unlock_bh (rt_run_flush)
00000201 0.770ms (+0.000ms): local_bh_enable (rt_run_flush)
00000101 0.770ms (+0.000ms): cond_resched_all (rt_run_flush)
00000101 0.771ms (+0.000ms): do_nmi (___trace)
00010101 0.771ms (+0.002ms): do_nmi (<08049b20>)
00010101 0.773ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010101 0.773ms (+464353.452ms): nmi_watchdog_tick (default_do_nmi)
00000002 464354.226ms (+12313.439ms): _spin_unlock (release_console_sem)
00010101 0.774ms (+0.000ms): do_IRQ (___trace)
00010101 0.774ms (+0.000ms): do_IRQ (<00000000>)
00010101 0.774ms (+0.000ms): _spin_lock (__do_IRQ)
00010102 0.775ms (+0.000ms): ack_edge_ioapic_irq (__do_IRQ)
00010102 0.775ms (+0.000ms): redirect_hardirq (__do_IRQ)
00010102 0.775ms (+0.000ms): _spin_unlock (__do_IRQ)
00010101 0.776ms (+0.000ms): handle_IRQ_event (__do_IRQ)
00010101 0.776ms (+0.000ms): timer_interrupt (handle_IRQ_event)
00010101 0.776ms (+0.000ms): _spin_lock (timer_interrupt)
00010102 0.776ms (+0.000ms): mark_offset_tsc (timer_interrupt)
00010102 0.777ms (+0.000ms): _spin_lock (mark_offset_tsc)
00010103 0.777ms (+0.010ms): _spin_lock (mark_offset_tsc)
00010104 0.787ms (+0.000ms): _spin_unlock (mark_offset_tsc)
00010103 0.788ms (+0.000ms): _spin_unlock (mark_offset_tsc)
00010102 0.788ms (+0.003ms): _spin_lock (timer_interrupt)
00010103 0.792ms (+0.000ms): _spin_unlock (timer_interrupt)
00010102 0.792ms (+0.000ms): do_timer (timer_interrupt)
00010102 0.792ms (+0.000ms): update_wall_time (do_timer)
... more traces ...
00000101 1.023ms (+0.000ms): cond_resched_all (rt_run_flush)
00000101 1.023ms (+0.000ms): cond_resched_softirq (rt_run_flush)
00000101 1.023ms (+0.000ms): _spin_lock_bh (rt_run_flush)
00000101 1.024ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 1.024ms (+0.000ms): _spin_unlock_bh (rt_run_flush)
00000201 1.024ms (+0.000ms): local_bh_enable (rt_run_flush)
00000101 1.025ms (+0.000ms): cond_resched_all (rt_run_flush)
00000101 1.025ms (+0.000ms): cond_resched_softirq (rt_run_flush)
00000101 1.025ms (+0.000ms): _spin_lock_bh (rt_run_flush)
00000101 1.025ms (+0.000ms): _spin_lock_irqsave (_spin_lock_bh)
00000202 1.026ms (+0.000ms): _spin_unlock_bh (rt_run_flush)
00000201 1.026ms (+0.000ms): local_bh_enable (rt_run_flush)
00000101 1.026ms (+0.000ms): cond_resched_all (rt_run_flush)
00000101 1.026ms (+2915178.859ms): cond_resched_softirq (rt_run_flush)
[I am also not sure why some of the traces show such odd values]
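A rough sketch of how such odd entries could be picked out automatically: flag any line whose timestamp or delta exceeds the total latency reported in the trace header. This says nothing about where the values come from; it only makes them easy to spot. The function name suspicious_entries and the regular expression are illustrative assumptions based on the line format shown in this mail.

```python
import re

# Matches e.g. "00010101 0.773ms (+464353.452ms): nmi_watchdog_tick (default_do_nmi)"
ENTRY_RE = re.compile(
    r"^[0-9a-f]{8}\s+(\d+\.\d+)ms\s+\(\+(\d+\.\d+)ms\):\s+(\S+)"
)


def suspicious_entries(lines, latency_us):
    """Return (function, timestamp_ms, delta_ms) for entries that cannot
    fit inside the latency reported in the trace header."""
    limit_ms = latency_us / 1000.0
    flagged = []
    for line in lines:
        match = ENTRY_RE.match(line.strip())
        if not match:
            continue  # header, comment or elided "..." line
        timestamp_ms = float(match.group(1))
        delta_ms = float(match.group(2))
        if timestamp_ms > limit_ms or delta_ms > limit_ms:
            flagged.append((match.group(3), timestamp_ms, delta_ms))
    return flagged


# Lines copied from the rt_run_flush trace; its header reports 2586 us.
sample = [
    "00010101 0.773ms (+0.000ms): notifier_call_chain (default_do_nmi)",
    "00010101 0.773ms (+464353.452ms): nmi_watchdog_tick (default_do_nmi)",
    "00000002 464354.226ms (+12313.439ms): _spin_unlock (release_console_sem)",
    "00010101 0.774ms (+0.000ms): do_IRQ (___trace)",
]
flagged = suspicious_entries(sample, latency_us=2586)
for func, timestamp_ms, delta_ms in flagged:
    print(func, timestamp_ms, delta_ms)
```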
[5] rcu / cache actions
May be a false positive due to latency tracing overhead.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 207 us, entries: 431 (431) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: kswapd0/72, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: cond_resched_lock+0x24/0xf0
=> ended at: cond_resched_lock+0x60/0xf0
=======>
00000001 0.000ms (+0.000ms): touch_preempt_timing (cond_resched_lock)
00000001 0.000ms (+0.000ms): smp_apic_timer_interrupt (touch_preempt_timing)
00010001 0.000ms (+0.000ms): profile_tick (smp_apic_timer_interrupt)
00010001 0.000ms (+0.000ms): profile_hook (profile_tick)
00010001 0.001ms (+0.000ms): _read_lock (profile_hook)
00010002 0.001ms (+0.000ms): notifier_call_chain (profile_hook)
00010002 0.001ms (+0.000ms): _read_unlock (profile_tick)
00010001 0.001ms (+0.000ms): profile_hit (smp_apic_timer_interrupt)
00010001 0.002ms (+0.000ms): update_process_times (smp_apic_timer_interrupt)
00010001 0.002ms (+0.000ms): update_one_process (update_process_times)
00010001 0.002ms (+0.000ms): run_local_timers (update_process_times)
00010001 0.002ms (+0.000ms): raise_softirq (update_process_times)
00010001 0.003ms (+0.000ms): scheduler_tick (update_process_times)
00010001 0.003ms (+0.000ms): sched_clock (scheduler_tick)
00010001 0.003ms (+0.000ms): rcu_check_callbacks (scheduler_tick)
00010001 0.003ms (+0.000ms): idle_cpu (rcu_check_callbacks)
00010001 0.004ms (+0.000ms): __tasklet_schedule (scheduler_tick)
00010001 0.004ms (+0.000ms): _spin_lock (scheduler_tick)
00010002 0.004ms (+0.000ms): task_timeslice (scheduler_tick)
00010002 0.004ms (+0.000ms): __bitmap_weight (scheduler_tick)
00010002 0.005ms (+0.000ms): _spin_unlock (scheduler_tick)
00010001 0.005ms (+0.000ms): rebalance_tick (scheduler_tick)
00010001 0.005ms (+0.000ms): irq_exit (smp_apic_timer_interrupt)
00000002 0.006ms (+0.000ms): do_softirq (irq_exit)
00000002 0.006ms (+0.000ms): __do_softirq (do_softirq)
00000102 0.006ms (+0.000ms): ___do_softirq (__do_softirq)
00000102 0.006ms (+0.000ms): run_timer_softirq (___do_softirq)
00000102 0.006ms (+0.000ms): _spin_lock_irq (run_timer_softirq)
00000102 0.007ms (+0.000ms): _spin_lock_irqsave (run_timer_softirq)
00000103 0.007ms (+0.000ms): _spin_unlock_irq (run_timer_softirq)
00000102 0.007ms (+0.000ms): __wake_up (run_timer_softirq)
00000102 0.007ms (+0.000ms): _spin_lock_irqsave (__wake_up)
00000103 0.008ms (+0.000ms): __wake_up_common (__wake_up)
00000103 0.008ms (+0.000ms): _spin_unlock_irqrestore (run_timer_softirq)
00000102 0.008ms (+0.000ms): cond_resched_all (___do_softirq)
00000102 0.008ms (+0.000ms): cond_resched_softirq (___do_softirq)
00000102 0.009ms (+0.000ms): tasklet_action (___do_softirq)
00000102 0.009ms (+0.000ms): rcu_process_callbacks (tasklet_action)
00000102 0.009ms (+0.000ms): __rcu_process_callbacks (rcu_process_callbacks)
00000102 0.010ms (+0.000ms): _spin_lock (__rcu_process_callbacks)
00000103 0.010ms (+0.000ms): rcu_start_batch (__rcu_process_callbacks)
00000103 0.010ms (+0.000ms): _spin_unlock (__rcu_process_callbacks)
00000102 0.010ms (+0.000ms): rcu_check_quiescent_state (__rcu_process_callbacks)
00000102 0.011ms (+0.000ms): rcu_do_batch (rcu_process_callbacks)
00000102 0.012ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.012ms (+0.001ms): kmem_cache_free (d_callback)
00000102 0.013ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.014ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.014ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.015ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.016ms (+0.000ms): cache_flusharray (kmem_cache_free)
00000102 0.016ms (+0.001ms): _spin_lock (cache_flusharray)
00000103 0.017ms (+0.003ms): free_block (cache_flusharray)
00000103 0.021ms (+0.000ms): _spin_unlock (cache_flusharray)
00000102 0.021ms (+0.000ms): memmove (cache_flusharray)
00000102 0.021ms (+0.000ms): memcpy (memmove)
00000102 0.022ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.022ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.022ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.023ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.023ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.024ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.024ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.025ms (+0.000ms): kmem_cache_free (d_callback)
...
00000102 0.190ms (+0.000ms): cache_flusharray (kmem_cache_free)
00000102 0.190ms (+0.000ms): _spin_lock (cache_flusharray)
00000103 0.191ms (+0.003ms): free_block (cache_flusharray)
00000103 0.194ms (+0.000ms): _spin_unlock (cache_flusharray)
00000102 0.194ms (+0.000ms): memmove (cache_flusharray)
00000102 0.195ms (+0.000ms): memcpy (memmove)
00000102 0.195ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.195ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.196ms (+0.000ms): __tasklet_schedule (rcu_process_callbacks)
00000102 0.196ms (+0.000ms): __rcu_process_callbacks (rcu_process_callbacks)
00000102 0.197ms (+0.000ms): rcu_check_quiescent_state (__rcu_process_callbacks)
00000102 0.197ms (+0.000ms): cond_resched_all (___do_softirq)
00000102 0.197ms (+0.000ms): cond_resched_softirq (___do_softirq)
00000102 0.198ms (+0.000ms): wake_up_process (__do_softirq)
00000102 0.198ms (+0.000ms): try_to_wake_up (wake_up_process)
00000102 0.199ms (+0.000ms): task_rq_lock (try_to_wake_up)
00000102 0.199ms (+0.001ms): _spin_lock (task_rq_lock)
00000103 0.201ms (+0.000ms): activate_task (try_to_wake_up)
00000103 0.201ms (+0.000ms): sched_clock (activate_task)
00000103 0.202ms (+0.001ms): recalc_task_prio (activate_task)
00000103 0.203ms (+0.000ms): effective_prio (recalc_task_prio)
00000103 0.203ms (+0.001ms): enqueue_task (activate_task)
00000103 0.204ms (+0.000ms): resched_task (try_to_wake_up)
00000103 0.205ms (+0.001ms): _spin_unlock_irqrestore (try_to_wake_up)
00000102 0.206ms (+0.001ms): preempt_schedule (try_to_wake_up)
00000001 0.207ms (+0.000ms): sub_preempt_count (cond_resched_lock)
00000001 0.208ms (+0.000ms): update_max_trace (check_preempt_timing)
[6] prune_icache
If I read this right, an operation that should have taken 100 usec or
so was preempted by an even longer series of operations.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 263 us, entries: 597 (597) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: kswapd0/72, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock+0x1f/0x70
=> ended at: _spin_unlock+0x2d/0x60
=======>
00000001 0.000ms (+0.000ms): _spin_lock (prune_icache)
00000001 0.000ms (+0.000ms): inode_has_buffers (prune_icache)
00000001 0.001ms (+0.000ms): inode_has_buffers (prune_icache)
00000001 0.001ms (+0.000ms): inode_has_buffers (prune_icache)
00000001 0.002ms (+0.000ms): inode_has_buffers (prune_icache)
...
00000001 0.075ms (+0.000ms): smp_apic_timer_interrupt (prune_icache)
00010001 0.075ms (+0.000ms): profile_tick (smp_apic_timer_interrupt)
00010001 0.075ms (+0.000ms): profile_hook (profile_tick)
00010001 0.075ms (+0.000ms): _read_lock (profile_hook)
00010002 0.076ms (+0.000ms): notifier_call_chain (profile_hook)
00010002 0.076ms (+0.000ms): _read_unlock (profile_tick)
00010001 0.076ms (+0.000ms): profile_hit (smp_apic_timer_interrupt)
00010001 0.077ms (+0.000ms): update_process_times (smp_apic_timer_interrupt)
00010001 0.077ms (+0.000ms): update_one_process (update_process_times)
00010001 0.078ms (+0.000ms): run_local_timers (update_process_times)
00010001 0.078ms (+0.000ms): raise_softirq (update_process_times)
00010001 0.078ms (+0.000ms): scheduler_tick (update_process_times)
00010001 0.078ms (+0.000ms): sched_clock (scheduler_tick)
00010001 0.079ms (+0.000ms): rcu_check_callbacks (scheduler_tick)
00010001 0.080ms (+0.000ms): idle_cpu (rcu_check_callbacks)
... it then runs later ...
00000102 0.097ms (+0.001ms): tasklet_action (___do_softirq)
00000102 0.098ms (+0.000ms): rcu_process_callbacks (tasklet_action)
00000102 0.098ms (+0.000ms): __rcu_process_callbacks (rcu_process_callbacks)
00000102 0.099ms (+0.000ms): rcu_check_quiescent_state (__rcu_process_callbacks)
00000102 0.100ms (+0.000ms): rcu_do_batch (rcu_process_callbacks)
00000102 0.100ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.101ms (+0.000ms): kmem_cache_free (d_callback)
00000102 0.101ms (+0.000ms): d_callback (rcu_do_batch)
00000102 0.102ms (+0.000ms): kmem_cache_free (d_callback)
... finally getting back to the original work ...
00000103 0.237ms (+0.000ms): resched_task (try_to_wake_up)
00000103 0.238ms (+0.000ms): _spin_unlock_irqrestore (try_to_wake_up)
00000102 0.238ms (+0.001ms): preempt_schedule (try_to_wake_up)
00000001 0.239ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.240ms (+0.000ms): inode_has_buffers (prune_icache)
00000001 0.241ms (+0.000ms): inode_has_buffers (prune_icache)
...
00000001 0.262ms (+0.000ms): inode_has_buffers (prune_icache)
00000001 0.263ms (+0.001ms): _spin_unlock (prune_icache)
00000001 0.264ms (+0.000ms): sub_preempt_count (_spin_unlock)
00000001 0.264ms (+0.000ms): update_max_trace (check_preempt_timing)
[7] clear_page_tables
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 208 us, entries: 67 (67) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: get_ltrace.sh/16132, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: unmap_vmas+0x12e/0x280
=> ended at: _spin_unlock+0x2d/0x60
=======>
00000001 0.000ms (+0.000ms): touch_preempt_timing (unmap_vmas)
00000001 0.000ms (+0.000ms): __bitmap_weight (unmap_vmas)
00000001 0.000ms (+0.000ms): unmap_page_range (unmap_vmas)
00000001 0.000ms (+0.000ms): zap_pmd_range (unmap_page_range)
00000001 0.000ms (+0.000ms): zap_pte_range (zap_pmd_range)
00000001 0.001ms (+0.000ms): kmap_atomic (zap_pte_range)
00000002 0.001ms (+0.001ms): page_address (zap_pte_range)
00000002 0.002ms (+0.000ms): set_page_dirty (zap_pte_range)
00000002 0.002ms (+0.000ms): page_remove_rmap (zap_pte_range)
00000002 0.002ms (+0.000ms): set_page_dirty (zap_pte_range)
00000002 0.003ms (+0.000ms): page_remove_rmap (zap_pte_range)
00000002 0.003ms (+0.000ms): set_page_dirty (zap_pte_range)
00000002 0.003ms (+0.000ms): page_remove_rmap (zap_pte_range)
00000002 0.004ms (+0.000ms): kunmap_atomic (zap_pte_range)
00000001 0.004ms (+0.002ms): vm_acct_memory (exit_mmap)
00000001 0.007ms (+0.181ms): clear_page_tables (exit_mmap)
00000001 0.189ms (+0.000ms): flush_tlb_mm (exit_mmap)
...
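To find the long step in a trace like the one above without reading every line, the entries can be parsed and sorted by their delta. This is only a sketch: the regular expression and the names parse_trace and LINE_RE are assumptions based on the line format shown in this mail, not part of the latency tracer.

```python
import re

# One trace entry: preempt count, timestamp, delta, function, (caller).
LINE_RE = re.compile(
    r"^(?P<count>[0-9a-f]{8})\s+(?P<ts>\d+\.\d+)ms\s+"
    r"\(\+(?P<delta>\d+\.\d+)ms\):\s+(?P<func>\S+)\s+\((?P<caller>[^)]*)\)$"
)


def parse_trace(text):
    """Parse trace entries; lines that do not match (headers, '...') are skipped."""
    entries = []
    for line in text.splitlines():
        match = LINE_RE.match(line.strip())
        if match:
            entries.append({
                "count": match.group("count"),
                "timestamp_ms": float(match.group("ts")),
                "delta_ms": float(match.group("delta")),
                "func": match.group("func"),
                "caller": match.group("caller"),
            })
    return entries


# Entries copied from the clear_page_tables trace above.
sample = """\
00000002 0.004ms (+0.000ms): kunmap_atomic (zap_pte_range)
00000001 0.004ms (+0.002ms): vm_acct_memory (exit_mmap)
00000001 0.007ms (+0.181ms): clear_page_tables (exit_mmap)
00000001 0.189ms (+0.000ms): flush_tlb_mm (exit_mmap)
...
"""
entries = parse_trace(sample)
worst = max(entries, key=lambda entry: entry["delta_ms"])
print(worst["func"], worst["delta_ms"])
```

On the sample lines this picks out clear_page_tables, which matches the entry with the +0.181ms delta in the trace.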
[8] avc_insert
The watchdog timer woke this up but avc_insert appears to be
the long step.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 248 us, entries: 35 (35) | [VP:1 KP:1 SP:0 HP:0 #CPUS:2]
-----------------
| task: fam/2929, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80
=> ended at: _spin_unlock_irqrestore+0x32/0x70
=======>
00000001 0.000ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000001 0.000ms (+0.020ms): avc_insert (avc_has_perm_noaudit)
00010001 0.020ms (+0.001ms): do_nmi (avc_insert)
00010001 0.022ms (+0.005ms): do_nmi (<08049b20>)
00010001 0.027ms (+0.000ms): notifier_call_chain (default_do_nmi)
00010001 0.027ms (+0.185ms): nmi_watchdog_tick (default_do_nmi)
00000001 0.213ms (+0.000ms): memcpy (avc_has_perm_noaudit)
00000001 0.213ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
00010001 0.214ms (+0.000ms): do_IRQ (_spin_unlock_irqrestore)
00010001 0.214ms (+0.000ms): do_IRQ (<00000000>)
00010001 0.215ms (+0.000ms): _spin_lock (__do_IRQ)
00010002 0.215ms (+0.000ms): ack_edge_ioapic_irq (__do_IRQ)
00010002 0.216ms (+0.000ms): redirect_hardirq (__do_IRQ)
00010002 0.216ms (+0.000ms): _spin_unlock (__do_IRQ)
00010001 0.216ms (+0.000ms): handle_IRQ_event (__do_IRQ)
00010001 0.217ms (+0.000ms): timer_interrupt (handle_IRQ_event)
00010001 0.217ms (+0.000ms): _spin_lock (timer_interrupt)
00010002 0.218ms (+0.000ms): mark_offset_tsc (timer_interrupt)
00010002 0.218ms (+0.000ms): _spin_lock (mark_offset_tsc)
00010003 0.219ms (+0.013ms): _spin_lock (mark_offset_tsc)
00010004 0.232ms (+0.000ms): _spin_unlock (mark_offset_tsc)
00010003 0.233ms (+0.000ms): _spin_unlock (mark_offset_tsc)
00010002 0.234ms (+0.007ms): _spin_lock (timer_interrupt)
00010003 0.242ms (+0.000ms): _spin_unlock (timer_interrupt)
00010002 0.242ms (+0.000ms): do_timer (timer_interrupt)
00010002 0.243ms (+0.000ms): update_wall_time (do_timer)
00010002 0.243ms (+0.003ms): update_wall_time_one_tick (update_wall_time)
00010002 0.246ms (+0.000ms): _spin_unlock (timer_interrupt)
00010001 0.247ms (+0.000ms): _spin_lock (__do_IRQ)
00010002 0.247ms (+0.000ms): note_interrupt (__do_IRQ)
00010002 0.247ms (+0.000ms): end_edge_ioapic_irq (__do_IRQ)
00010002 0.247ms (+0.000ms): _spin_unlock (__do_IRQ)
00010001 0.248ms (+0.001ms): irq_exit (do_IRQ)
00000001 0.249ms (+0.001ms): sub_preempt_count (_spin_unlock_irqrestore)
00000001 0.250ms (+0.000ms): update_max_trace (check_preempt_timing)
On Fri, 2004-10-08 at 03:36, Peter Williams wrote:
> Ingo Molnar wrote:
> > * Lee Revell <[email protected]> wrote:
> >
> >
> >>On Thu, 2004-10-07 at 19:26, Rui Nuno Capela wrote:
> >>
> >>>Ingo Molnar wrote:
> >>>
> >>>>>i've released the -T3 VP patch:
> >>>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
> >>>>>
> >>>>
> >>>OK. Just to let you know, both of my personal machines are now running on
> >>>bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
> >>
> >>This actually feels a _lot_ snappier than mm2, which seemed prone to
> >>weird stalls. I don't have any numbers to back this up yet.
> >
> >
> > yeah, -mm is back to the development branch of the stock scheduler.
> > (i.e. the scheduler changes destined for 2.6.10.)
>
> It's also got a fix for the cache hot timing bug which was causing havoc
> with the load balancer.
Wouldn't this only be an issue on SMP? I am on a UP system.
Lee
On Fri, 2004-10-08 at 02:49, Con Kolivas wrote:
> Lee Revell wrote:
> > On Thu, 2004-10-07 at 19:26, Rui Nuno Capela wrote:
> >
> >>Ingo Molnar wrote:
> >>
> >>>>i've released the -T3 VP patch:
> >>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
> >>>>
> >>>
> >>OK. Just to let you know, both of my personal machines are now running on
> >>bleeding-edge 2.6.9-rc3-mm3-T3, and very happily may I assure :)
> >
> >
> > This actually feels a _lot_ snappier than mm2, which seemed prone to
> > weird stalls. I don't have any numbers to back this up yet.
>
> mm2 had a completely different cpu scheduler so no meaningful comparison
> can be made. Try comparing to mm3 vanilla.
Well, I figured the change from -mm2 to -mm3 was responsible, as I have
never seen the VP patches make a perceptible difference in system
response time. The VP effect only becomes apparent when you do
something that really needs millisecond or sub-ms latency. I guess a
bug in the VP patch could cause performance regressions though. However
no one reported sluggishness with mm2+S7, but it's apparent when you try
mm3+T3 that it feels a lot more responsive.
Anyway I was just wondering if there was an obvious change that would
cause this.
Lee
On Thu, 2004-10-07 at 06:52, Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
[adding Andrew Morton to the cc: list as these issues are increasingly
relevant to -mm and not VP specific]
I am seeing the same prune_icache latency that Mark reported. I have
never seen this one at all before T3. This one seems very frequent,
enough so to overtake the netif_skb single-packet processing latency
that seems to be our lower bound.
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 242 us, entries: 178 (178) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: kswapd0/54, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: prune_icache+0x52/0x460
=> ended at: prune_icache+0x162/0x460
=======>
00000001 0.000ms (+0.001ms): prune_icache (shrink_icache_memory)
00000001 0.001ms (+0.002ms): inode_has_buffers (prune_icache)
00000001 0.004ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.005ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.007ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.008ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.010ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.011ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.012ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.014ms (+0.001ms): inode_has_buffers (prune_icache)
00000001 0.015ms (+0.001ms): inode_has_buffers (prune_icache)
[more of same interrupted by the timer a few times]
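For anyone post-processing traces like the one above, here is a minimal sketch of splitting a single entry line into its fields. The struct, the function names, and the reading of the first column as a hex state (preempt-count) word are assumptions made for illustration; the tracer's own output code is not shown in this thread.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical record for one trace entry line. */
struct trace_entry {
	unsigned int state;	/* first column, hex (assumed: preempt count) */
	double at_ms;		/* time since the start of the critical section */
	double delta_ms;	/* the "(+N.NNNms)" column */
	char func[64];		/* traced function */
	char parent[64];	/* function named in parentheses */
};

/* Returns 0 if 'line' looks like a trace entry, -1 otherwise. */
static int parse_trace_line(const char *line, struct trace_entry *e)
{
	int n = sscanf(line, "%x %lfms (+%lfms): %63s (%63[^)])",
		       &e->state, &e->at_ms, &e->delta_ms,
		       e->func, e->parent);

	return n == 5 ? 0 : -1;
}

/* True if the entry's parenthesised function is 'name'. */
static int entry_called_from(const struct trace_entry *e, const char *name)
{
	return strcmp(e->parent, name) == 0;
}
```

Counting how many parsed entries satisfy entry_called_from(&e, "prune_icache") gives a rough idea of how many loop iterations ran inside one critical section.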
Workload is just a kernel compile and an RT task (jackd).
Interestingly, kswapd seems to have triggered the above, but I should
not be hitting swap! I have swappiness set to 0, and here is what
vmstat showed:
procs -----------memory---------- ---swap-- -----io---- --system-- ----cpu----
r b swpd free buff cache si so bi bo in cs us sy id wa
1 0 4 30608 54476 221680 0 0 9 12 310 600 36 8 56 0
1 0 4 41424 54476 221796 0 0 0 0 1109 2577 88 12 0 0
1 0 4 41168 54484 221808 0 0 24 0 1005 2158 83 17 0 0
1 0 4 34704 54496 221808 0 0 0 192 1015 2063 92 8 0 0
1 0 4 32208 54496 221808 0 0 0 0 1003 2045 96 4 0 0
1 0 4 30928 54496 221808 0 0 0 0 1004 2090 98 2 0 0
Lee
On Thu, 2004-10-07 at 06:52, Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
Also, I am still seeing some long latencies in the ext3 journaling code:
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 607 us, entries: 1087 (1087) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: kjournald/687, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: journal_commit_transaction+0x75/0x2830
=> ended at: __journal_clean_checkpoint_list+0xb2/0xf0
=======>
00000001 0.000ms (+0.003ms): journal_commit_transaction (kjournald)
Here is the loop:
00000002 0.003ms (+0.001ms): kfree (journal_commit_transaction)
00000001 0.004ms (+0.001ms): journal_refile_buffer (journal_commit_transaction)
00000003 0.006ms (+0.000ms): __journal_refile_buffer (journal_refile_buffer)
00000003 0.006ms (+0.001ms): __journal_unfile_buffer (journal_refile_buffer)
00000002 0.008ms (+0.000ms): journal_remove_journal_head (journal_refile_buffer)
00000003 0.008ms (+0.000ms): __journal_remove_journal_head (journal_remove_journal_head)
00000003 0.009ms (+0.000ms): __brelse (__journal_remove_journal_head)
00000003 0.010ms (+0.000ms): journal_free_journal_head (journal_remove_journal_head)
00000003 0.010ms (+0.001ms): kmem_cache_free (journal_free_journal_head)
00000001 0.011ms (+0.000ms): __brelse (journal_commit_transaction)
[end loop]
00000002 0.012ms (+0.000ms): kfree (journal_commit_transaction)
00000001 0.013ms (+0.000ms): journal_refile_buffer (journal_commit_transaction)
I think I already reported this one with S7.
Lee
The HPET config options are not set up correctly in arch/x86_64/Kconfig.
CONFIG_HPET_TIMER=y and CONFIG_HPET_EMULATE_RTC=y are the only options
that appear, and hpet.o is not built. The build also fails with
"# CONFIG_HPET_TIMER is not set".
CHK include/linux/version.h
make[1]: `arch/x86_64/kernel/asm-offsets.s' is up to date.
CHK include/linux/compile.h
GEN_INITRAMFS_LIST usr/initramfs_list
Using shipped usr/initramfs_list
GEN .version
CHK include/linux/compile.h
UPD include/linux/compile.h
CC init/version.o
LD init/built-in.o
LD .tmp_vmlinux1
arch/x86_64/kernel/built-in.o(.init.text+0x1dc1): In function
`late_hpet_init':
: undefined reference to `hpet_alloc'
--------------------------------------
x86 builds OK with:-
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
CONFIG_HPET_MMAP=y
Regards
Sid.
--
Sid Boyce .... Hamradio G3VBV and keen Flyer
=====LINUX ONLY USED HERE=====
On Thu, 2004-10-07 at 06:52, Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
Wow, this is a really bad one - 5576 usecs in shmem_truncate! I think
this was triggered by Mozilla deleting a large video file from /tmp
which is a tmpfs mount.
(mozilla-bin/16141): new 5576 us maximum-latency critical section.
=> started at: <shmem_truncate+0x67/0x630>
=> ended at: <shmem_truncate+0x3bc/0x630>
[<c013d96a>] check_preempt_timing+0x14a/0x1e0
[<c013d2f5>] __mcount+0x15/0x20
[<c02dd08a>] preempt_schedule+0xa/0x70
[<c013dbba>] sub_preempt_count+0x5a/0x90
[<c016b39c>] shmem_truncate+0x3bc/0x630
[<c016b39c>] shmem_truncate+0x3bc/0x630
[<c016b964>] shmem_delete_inode+0x134/0x340
[<c016b830>] shmem_delete_inode+0x0/0x340
[<c01991ee>] generic_delete_inode+0xde/0x300
[<c0189615>] sys_unlink+0xf5/0x150
[<c0106b47>] syscall_call+0x7/0xb
Full trace:
http://krustophenia.net/testresults.php?dataset=2.6.9-rc3-mm3-VP-T3#/var/www/2.6.9-rc3-mm3-VP-T3/shmem-truncate-latency-trace.txt
Lee
On Thu, 2004-10-07 at 06:52, Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
With VP and PREEMPT in general, does the scheduler always run the
highest priority process, or do we only preempt if a SCHED_FIFO process
is runnable?
Lee
On Thu, 7 Oct 2004 12:52:30 +0200
Ingo Molnar <[email protected]> wrote:
>
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
Hi,
i just wanted to report that audio usage has become quite a bit worse
when compared to T1. I get more xruns (40 to 80 per night as opposed to 5 to
10 on T1), RT apps like ardour seem to be much more unstable with T3 than T1
(ardour gets kicked off the jack graph regularly at 64 frames on T3 which
doesn't happen with T1).
This goes together with a general increase of > 200us non preempt. crit.
sect. which were very seldom in T1 (at least for the work i do) but appear
rather regularly in T3.
Flo
P.S.: Are there tools available which can check the "correctness" of the
interplay of nptl libc and kernel wrt to threading? Especially when it comes
to wakeup order of blocking threads in different scheduler classes and with
different priorities.
i've released the -T4 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T4
the big change in this release is the addition of PREEMPT_REALTIME,
which is a new implementation of a fully preemptible kernel model:
- spinlocks and rwlocks use semaphores and are preemptible by default
- the _irq variants of these locks do not disable interrupts but rely
on IRQ threading to exclude against interrupt contexts.
note that this implementation is different from other kernel-preemption
patches, in a number of key areas. Initially i looked at merging the
MontaVista patchset from two days ago but decided to implement a new one
from scratch to cure a number of conceptual problems:
- this patch auto-detects the 'type' of the lock at compilation time.
All fully-preemptible kernel patches i've seen so far suffer from one
nasty problem: they are very large because they redefine _all_ the
spinlock APIs to provide separation between 'mutex based' and
'original' spinlocks. E.g. check out the sheer size of the MontaVista
patchset: Linux-2.6.9-rc3-RT_spinlock1.patch and
Linux-2.6.9-rc3-RT_spinlock2.patch are 84K and 92K and they convert
~30 core spinlocks to new APIs.
OTOH this patch converts _90_ spinlocks in roughly half the
patchsize, which makes a large difference in maintainability.
How it works: this implementation uses a gcc feature to detect the
type of the spinlock compile-time and to switch to the mutex or
raw_spinlock API accordingly. Only one, very isolated change has to
be done to switch a generic spinlock to a spin-only lock: spinlock_t
is changed to raw_spinlock_t and the initializer is fixed up. All the
other code remains untouched - and this even if a single C module
contains both mutex-based and spinlock-based API calls. This approach
is quite close to a simple object-oriented lock type - but written in
C and compatible with the existing spinlock APIs.
- i used the native Linux semaphores/rwsems to implement
spinlock/rwlock preemption. E.g. the MontaVista patches use separate
synchronization objects (kmutex/pmutex) to implement this.
I believe using native semaphores is the better approach
architecturally because this means that we have to add priority
inheritance handling only once and to the native Linux semaphores.
This has the additional benefit of fixing all mutex-using
kernel code's priority inheritance problems. (which kmutex/pmutex
does not solve.)
OTOH the MontaVista patches naturally have the advantage of having a
working priority-inheritance mechanism in the pmutex code, right now.
(I did a brief attempt to plug the pmutex code into this patch but it
didn't look like a good match - but others might want to try to
integrate it nevertheless.)
also, another bad property of the kmutex/pmutex code is that it uses
assembly which makes it quite hard to port to non-x86 architectures.
OTOH, the native Linux semaphores and rwsems work on every
architecture.
- the patch converts rwlocks too, while e.g. the MontaVista patchset
still keeps rwlocks as spinlocks. It is important to convert rwlocks
to rw-semaphores; most notably this allows the conversion of the
tasklist and signal spinlocks.
- finally, i went for correctness primarily, not latencies. I checked
out the MontaVista patches and they categorize roughly 30 spinlocks
as the ones that are necessary to be 'raw'. Unfortunately this is
inadequate: my patch excludes 90 such locks and it's still probably
not a 100% correct conversion. The core kernel needs changes in the
locking infrastructure to get rid of most of these 90 non-mutex
locks.
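To illustrate the compile-time lock-type detection described in the list above, here is a minimal user-space sketch. It assumes the gcc feature in question is __builtin_types_compatible_p (the text above does not name it). The two lock structs, the *_impl helpers and the counters are hypothetical stand-ins, not code from the patch.

```c
#include <assert.h>

/* Hypothetical stand-ins for the two lock types; not the kernel's definitions. */
typedef struct { int locked; } raw_spinlock_t;	/* spin-only lock */
typedef struct { int held; } spinlock_t;	/* mutex-backed lock */

/* Counters so the sketch can show which implementation was picked. */
static int raw_calls, mutex_calls;

static void raw_lock_impl(raw_spinlock_t *l)
{
	assert(!l->locked);	/* single-threaded sketch: never contended */
	l->locked = 1;
	raw_calls++;
}

static void raw_unlock_impl(raw_spinlock_t *l)
{
	l->locked = 0;
}

static void mutex_lock_impl(spinlock_t *l)
{
	assert(!l->held);	/* a real mutex would sleep here instead */
	l->held = 1;
	mutex_calls++;
}

static void mutex_unlock_impl(spinlock_t *l)
{
	l->held = 0;
}

/* Assumed gcc builtin: constant 1 if 'lock' has type 'type *', else 0. */
#define TYPE_EQUAL(lock, type) \
	__builtin_types_compatible_p(__typeof__(lock), type *)

/*
 * One API name for both lock types. The condition is a compile-time
 * constant, so the compiler can drop the branch that is not taken;
 * the casts only keep that dead branch well-typed.
 */
#define spin_lock(lock)						\
do {								\
	if (TYPE_EQUAL((lock), raw_spinlock_t))			\
		raw_lock_impl((raw_spinlock_t *)(lock));	\
	else							\
		mutex_lock_impl((spinlock_t *)(lock));		\
} while (0)

#define spin_unlock(lock)					\
do {								\
	if (TYPE_EQUAL((lock), raw_spinlock_t))			\
		raw_unlock_impl((raw_spinlock_t *)(lock));	\
	else							\
		mutex_unlock_impl((spinlock_t *)(lock));	\
} while (0)
```

With a scheme like this, switching a lock from one kind to the other only means changing its declared type and initializer; every spin_lock()/spin_unlock() call site stays as it is.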
it is highly recommended to enable DEBUG_PREEMPT when enabling
PREEMPT_REALTIME. It will warn about all the places that are unsafe. The
patch is x86-only for the time being, but the changes necessary for
other architectures should be relatively low.
NOTE: CONFIG_PREEMPT_REALTIME is default-off and i'd only suggest to
enable it on non-critical systems. It is the first iteration of this
feature and it will surely have rough edges. Not for the faint hearted!
NOTE2: some of the lock-break functionality offered by the -VP patchset
is disabled if PREEMPT_REALTIME is enabled - this is temporary. This
will likely result in an increase of the maximum measured latencies.
NOTE3: since so many spinlocks are still non-mutex, even average
latencies will be well above what we could achieve - but i wanted to
reach a known-correct codebase first. For example, most of the
networking spinlocks had to be made non-mutex because of networking's
use of RCU locking primitives and per-CPU data structures. The same is
true for the VFS - many of its locks are non-mutex still due to RCU.
Once this infrastructure work is done the size of the patch will
decrease significantly.
to build a -T4 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T4
Ingo
On Mon, 11 Oct 2004 16:29:53 +0200
Ingo Molnar <[email protected]> wrote:
>
> i've released the -T4 VP patch:
Hi,
doesn't build here:
CHK include/linux/version.h
HOSTCC scripts/basic/fixdep
HOSTCC scripts/basic/split-include
HOSTCC scripts/basic/docproc
HOSTCC scripts/genksyms/genksyms.o
HOSTCC scripts/genksyms/lex.o
HOSTCC scripts/genksyms/parse.o
HOSTLD scripts/genksyms/genksyms
CC scripts/mod/empty.o
HOSTCC scripts/mod/mk_elfconfig
MKELF scripts/mod/elfconfig.h
HOSTCC scripts/mod/file2alias.o
HOSTCC scripts/mod/modpost.o
HOSTCC scripts/mod/sumversion.o
HOSTLD scripts/mod/modpost
HOSTCC scripts/kallsyms
HOSTCC scripts/conmakehash
HOSTCC scripts/bin2c
CC arch/i386/kernel/asm-offsets.s
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:419: error: parse error before '*' token
include/linux/spinlock.h:419: warning: function declaration isn't a prototype
include/linux/spinlock.h:420: error: parse error before '*' token
include/linux/spinlock.h:420: warning: function declaration isn't a prototype
include/linux/spinlock.h:421: error: parse error before '*' token
include/linux/spinlock.h:421: warning: function declaration isn't a prototype
include/linux/spinlock.h:422: error: parse error before '*' token
include/linux/spinlock.h:422: warning: function declaration isn't a prototype
include/linux/spinlock.h:423: error: parse error before '*' token
include/linux/spinlock.h:423: warning: function declaration isn't a prototype
include/linux/spinlock.h:424: error: parse error before '*' token
include/linux/spinlock.h:424: warning: function declaration isn't a prototype
include/linux/spinlock.h:425: error: parse error before '*' token
include/linux/spinlock.h:425: warning: function declaration isn't a prototype
include/linux/spinlock.h:426: error: parse error before '*' token
include/linux/spinlock.h:426: warning: function declaration isn't a prototype
include/linux/spinlock.h:427: error: parse error before '*' token
include/linux/spinlock.h:427: warning: function declaration isn't a prototype
include/linux/spinlock.h:428: error: parse error before '*' token
include/linux/spinlock.h:428: warning: function declaration isn't a prototype
include/linux/spinlock.h:429: error: parse error before '*' token
include/linux/spinlock.h:429: warning: function declaration isn't a prototype
include/linux/spinlock.h:430: error: parse error before '*' token
include/linux/spinlock.h:430: warning: function declaration isn't a prototype
include/linux/spinlock.h:431: error: parse error before '*' token
include/linux/spinlock.h:431: warning: function declaration isn't a prototype
include/linux/spinlock.h:467: error: parse error before '*' token
include/linux/spinlock.h:467: warning: function declaration isn't a prototype
include/linux/spinlock.h:468: error: parse error before '*' token
include/linux/spinlock.h:468: warning: function declaration isn't a prototype
include/linux/spinlock.h:469: error: parse error before '*' token
include/linux/spinlock.h:469: warning: function declaration isn't a prototype
include/linux/spinlock.h:470: error: parse error before '*' token
include/linux/spinlock.h:470: warning: function declaration isn't a prototype
include/linux/spinlock.h:471: error: parse error before '*' token
include/linux/spinlock.h:471: warning: function declaration isn't a prototype
include/linux/spinlock.h:472: error: parse error before '*' token
include/linux/spinlock.h:472: warning: function declaration isn't a prototype
include/linux/spinlock.h:473: error: parse error before '*' token
include/linux/spinlock.h:473: warning: function declaration isn't a prototype
include/linux/spinlock.h:474: error: parse error before '*' token
include/linux/spinlock.h:474: warning: function declaration isn't a prototype
include/linux/spinlock.h:475: error: parse error before '*' token
include/linux/spinlock.h:475: warning: function declaration isn't a prototype
include/linux/spinlock.h:476: error: parse error before '*' token
include/linux/spinlock.h:476: warning: function declaration isn't a prototype
include/linux/spinlock.h:477: error: parse error before '*' token
include/linux/spinlock.h:477: warning: function declaration isn't a prototype
include/linux/spinlock.h:478: error: parse error before '*' token
include/linux/spinlock.h:478: warning: function declaration isn't a prototype
include/linux/spinlock.h:479: error: parse error before '*' token
include/linux/spinlock.h:479: warning: function declaration isn't a prototype
include/linux/spinlock.h:480: error: parse error before '*' token
include/linux/spinlock.h:480: warning: function declaration isn't a prototype
include/linux/spinlock.h:481: error: parse error before '*' token
include/linux/spinlock.h:481: warning: function declaration isn't a prototype
include/linux/spinlock.h:482: error: parse error before '*' token
include/linux/spinlock.h:482: warning: function declaration isn't a prototype
include/linux/spinlock.h:483: error: parse error before '*' token
include/linux/spinlock.h:483: warning: function declaration isn't a prototype
include/linux/spinlock.h:484: error: parse error before '*' token
include/linux/spinlock.h:484: warning: function declaration isn't a prototype
include/linux/spinlock.h:485: error: parse error before '*' token
include/linux/spinlock.h:485: warning: function declaration isn't a prototype
include/linux/spinlock.h:486: error: parse error before '*' token
include/linux/spinlock.h:486: warning: function declaration isn't a prototype
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:543:1: warning: "spin_lock_init" redefined
include/linux/spinlock.h:208:1: warning: this is the location of the previous definition
include/linux/spinlock.h:548:1: warning: "rwlock_init" redefined
include/linux/spinlock.h:226:1: warning: this is the location of the previous definition
include/linux/spinlock.h:553:1: warning: "spin_is_locked" redefined
include/linux/spinlock.h:210:1: warning: this is the location of the previous definition
include/linux/spinlock.h:563:1: warning: "spin_unlock_wait" redefined
include/linux/spinlock.h:212:1: warning: this is the location of the previous definition
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:669: error: parse error before "raw_spinlock_t"
include/linux/spinlock.h:669: warning: function declaration isn't a prototype
In file included from include/linux/time.h:7,
from include/linux/timex.h:58,
from include/linux/sched.h:11,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/seqlock.h:35: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:35: warning: no semicolon at end of struct or union
include/linux/seqlock.h:36: warning: type defaults to `int' in declaration of `seqlock_t'
include/linux/seqlock.h:36: warning: data definition has no type or storage class
include/linux/seqlock.h:50: error: parse error before '*' token
include/linux/seqlock.h:51: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `write_seqlock':
include/linux/seqlock.h:52: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h:52: error: (Each undeclared identifier is reported only once
include/linux/seqlock.h:52: error: for each function it appears in.)
include/linux/seqlock.h:52: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:52: error: `raw_spinlock_t' undeclared (first use in this function)
include/linux/seqlock.h:52: error: parse error before ')' token
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:52: error: parse error before "while"
include/linux/seqlock.h:57: error: parse error before '*' token
include/linux/seqlock.h:58: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `write_sequnlock':
include/linux/seqlock.h:60: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h:61: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:61: error: parse error before "while"
include/linux/seqlock.h:64: error: parse error before '*' token
include/linux/seqlock.h:65: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `write_tryseqlock':
include/linux/seqlock.h:66: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h:66: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:66: warning: unused variable `__ret'
include/linux/seqlock.h:66: error: parse error before "while"
include/linux/seqlock.h:66: error: `raw_spinlock_t' undeclared (first use in this function)
include/linux/seqlock.h:66: error: parse error before ')' token
include/linux/seqlock.h:66: warning: left-hand operand of comma expression has no effect
include/linux/seqlock.h:66: warning: unused variable `ret'
include/linux/seqlock.h:66: warning: no return statement in function returning non-void
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:66: error: parse error before ')' token
include/linux/seqlock.h:66: warning: type defaults to `int' in declaration of `__ret'
include/linux/seqlock.h:66: warning: data definition has no type or storage class
include/linux/seqlock.h:66: error: parse error before '}' token
include/linux/seqlock.h:76: warning: type defaults to `int' in declaration of `seqlock_t'
include/linux/seqlock.h:76: error: parse error before '*' token
include/linux/seqlock.h:77: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `read_seqbegin':
include/linux/seqlock.h:78: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:91: warning: type defaults to `int' in declaration of `seqlock_t'
include/linux/seqlock.h:91: error: parse error before '*' token
include/linux/seqlock.h:92: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `read_seqretry':
include/linux/seqlock.h:94: error: `iv' undeclared (first use in this function)
include/linux/seqlock.h:94: error: `sl' undeclared (first use in this function)
In file included from include/linux/timex.h:58,
from include/linux/sched.h:11,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/time.h: At top level:
include/linux/time.h:83: error: parse error before "xtime_lock"
include/linux/time.h:83: warning: type defaults to `int' in declaration of `xtime_lock'
include/linux/time.h:83: warning: data definition has no type or storage class
In file included from include/asm/semaphore.h:41,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/wait.h:82: error: parse error before '*' token
include/linux/wait.h:83: warning: function declaration isn't a prototype
include/linux/wait.h: In function `init_waitqueue_head':
include/linux/wait.h:84: error: `q' undeclared (first use in this function)
include/linux/wait.h:84: error: `RAW_SPIN_LOCK_UNLOCKED' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:103: error: parse error before '*' token
include/linux/wait.h:104: warning: function declaration isn't a prototype
include/linux/wait.h: In function `waitqueue_active':
include/linux/wait.h:105: error: `q' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:117: error: parse error before '*' token
include/linux/wait.h:117: warning: function declaration isn't a prototype
include/linux/wait.h:118: error: parse error before '*' token
include/linux/wait.h:118: warning: function declaration isn't a prototype
include/linux/wait.h:119: error: parse error before '*' token
include/linux/wait.h:119: warning: function declaration isn't a prototype
include/linux/wait.h:121: error: parse error before '*' token
include/linux/wait.h:122: warning: function declaration isn't a prototype
include/linux/wait.h: In function `__add_wait_queue':
include/linux/wait.h:123: error: `new' undeclared (first use in this function)
include/linux/wait.h:123: error: `head' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:129: error: parse error before '*' token
include/linux/wait.h:131: warning: function declaration isn't a prototype
include/linux/wait.h: In function `__add_wait_queue_tail':
include/linux/wait.h:132: error: `new' undeclared (first use in this function)
include/linux/wait.h:132: error: `head' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:135: error: parse error before '*' token
include/linux/wait.h:137: warning: function declaration isn't a prototype
include/linux/wait.h: In function `__remove_wait_queue':
include/linux/wait.h:138: error: `old' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:141: error: parse error before '*' token
include/linux/wait.h:141: warning: function declaration isn't a prototype
include/linux/wait.h:142: error: parse error before '*' token
include/linux/wait.h:142: warning: function declaration isn't a prototype
include/linux/wait.h:143: error: parse error before '*' token
include/linux/wait.h:143: warning: function declaration isn't a prototype
include/linux/wait.h:144: error: parse error before '*' token
include/linux/wait.h:144: warning: function declaration isn't a prototype
include/linux/wait.h:145: error: parse error before '*' token
include/linux/wait.h:145: error: `__wait_on_bit' declared as function returning a function
include/linux/wait.h:145: warning: function declaration isn't a prototype
include/linux/wait.h:145: error: parse error before "unsigned"
include/linux/wait.h:146: error: parse error before '*' token
include/linux/wait.h:146: error: `__wait_on_bit_lock' declared as function returning a function
include/linux/wait.h:146: warning: function declaration isn't a prototype
include/linux/wait.h:146: error: parse error before "unsigned"
include/linux/wait.h:150: error: parse error before '*' token
include/linux/wait.h:150: warning: type defaults to `int' in declaration of `bit_waitqueue'
include/linux/wait.h:150: warning: data definition has no type or storage class
include/linux/wait.h:288: error: parse error before '*' token
include/linux/wait.h:290: warning: function declaration isn't a prototype
include/linux/wait.h: In function `add_wait_queue_exclusive_locked':
include/linux/wait.h:291: error: `wait' undeclared (first use in this function)
include/linux/wait.h:292: error: `q' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:298: error: parse error before '*' token
include/linux/wait.h:300: warning: function declaration isn't a prototype
include/linux/wait.h: In function `remove_wait_queue_locked':
include/linux/wait.h:301: error: `q' undeclared (first use in this function)
include/linux/wait.h:301: error: `wait' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:309: error: parse error before '*' token
include/linux/wait.h:309: warning: function declaration isn't a prototype
include/linux/wait.h:310: error: parse error before '*' token
include/linux/wait.h:310: warning: function declaration isn't a prototype
include/linux/wait.h:312: error: parse error before '*' token
include/linux/wait.h:312: warning: function declaration isn't a prototype
include/linux/wait.h:313: error: parse error before '*' token
include/linux/wait.h:313: warning: function declaration isn't a prototype
include/linux/wait.h:319: error: parse error before '*' token
include/linux/wait.h:319: warning: function declaration isn't a prototype
include/linux/wait.h:321: error: parse error before '*' token
include/linux/wait.h:321: warning: function declaration isn't a prototype
include/linux/wait.h:323: error: parse error before '*' token
include/linux/wait.h:323: warning: function declaration isn't a prototype
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:26: error: `raw_spinlock_t' used prior to declaration
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:80:1: warning: "SPIN_LOCK_UNLOCKED" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:199:1: warning: this is the location of the previous definition
include/asm/spinlock.h:86: error: conflicting types for `spinlock_t'
include/linux/spinlock.h:198: error: previous declaration of `spinlock_t'
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:93:1: warning: "RW_LOCK_UNLOCKED" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:220:1: warning: this is the location of the previous definition
include/asm/spinlock.h:99: error: conflicting types for `rwlock_t'
include/linux/spinlock.h:219: error: previous declaration of `rwlock_t'
include/asm/spinlock.h:165: error: parse error before "do"
include/asm/spinlock.h:197: error: parse error before "void"
include/asm/spinlock.h:207: error: parse error before "do"
include/asm/spinlock.h:259: error: parse error before "do"
include/asm/spinlock.h:267: error: parse error before "do"
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:275:1: warning: "_raw_read_unlock" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:228:1: warning: this is the location of the previous definition
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:276:1: warning: "_raw_write_unlock" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:230:1: warning: this is the location of the previous definition
include/asm/spinlock.h:278: error: parse error before '{' token
include/asm/spinlock.h:281: warning: type defaults to `int' in declaration of `atomic_dec'
include/asm/spinlock.h:281: warning: parameter names (without types) in function declaration
include/asm/spinlock.h:281: error: conflicting types for `atomic_dec'
include/asm/atomic.h:116: error: previous declaration of `atomic_dec'
include/asm/spinlock.h:281: warning: data definition has no type or storage class
include/asm/spinlock.h:282: error: parse error before "if"
include/asm/spinlock.h:284: warning: type defaults to `int' in declaration of `atomic_inc'
include/asm/spinlock.h:284: warning: parameter names (without types) in function declaration
include/asm/spinlock.h:284: error: conflicting types for `atomic_inc'
include/asm/atomic.h:102: error: previous declaration of `atomic_inc'
include/asm/spinlock.h:284: warning: data definition has no type or storage class
include/asm/spinlock.h:285: error: parse error before "return"
include/asm/spinlock.h:288: error: parse error before '{' token
include/asm/spinlock.h:293: error: parse error before numeric constant
include/asm/spinlock.h:293: warning: type defaults to `int' in declaration of `atomic_add'
include/asm/spinlock.h:293: warning: function declaration isn't a prototype
include/asm/spinlock.h:293: error: conflicting types for `atomic_add'
include/asm/atomic.h:53: error: previous declaration of `atomic_add'
include/asm/spinlock.h:293: warning: data definition has no type or storage class
In file included from arch/i386/kernel/asm-offsets.c:7:
include/linux/sched.h:847:56: macro "_spin_lock_irqsave" requires 2 arguments, but only 1 given
In file included from arch/i386/kernel/asm-offsets.c:7:
include/linux/sched.h: In function `dequeue_signal_lock':
include/linux/sched.h:847: error: `_spin_lock_irqsave' undeclared (first use in this function)
include/linux/seqlock.h: In function `write_tryseqlock':
include/linux/seqlock.h:66: warning: statement with no effect
make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1
make: *** [arch/i386/kernel/asm-offsets.s] Error 2
.config:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.9-rc4-mm1-VP-T4
# Mon Oct 11 19:40:39 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
#
# General setup
#
CONFIG_LOCALVERSION="-LT"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_HOTPLUG is not set
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_PREEMPT_TIMING=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_REALTIME=y
# CONFIG_X86_UP_APIC is not set
CONFIG_X86_TSC=y
# CONFIG_X86_MCE is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
CONFIG_APM_REAL_MODE_POWER_OFF=y
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_DEBUG_DRIVER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
#
# Block devices
#
CONFIG_BLK_DEV_FD=m
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_LBD is not set
# CONFIG_CDROM_PKTCDVD is not set
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
CONFIG_BLK_DEV_IDESCSI=m
# CONFIG_IDE_TASK_IOCTL is not set
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
CONFIG_BLK_DEV_SIS5513=y
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=m
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=m
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLOGIC_1280_1040 is not set
CONFIG_SCSI_QLA2XXX=m
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
CONFIG_NETLINK_DEV=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
# CONFIG_IP_MULTIPLE_TABLES is not set
# CONFIG_IP_ROUTE_MULTIPATH is not set
# CONFIG_IP_ROUTE_VERBOSE is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
# CONFIG_IP_NF_CT_ACCT is not set
# CONFIG_IP_NF_CT_PROTO_SCTP is not set
# CONFIG_IP_NF_FTP is not set
# CONFIG_IP_NF_IRC is not set
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
# CONFIG_IP_NF_MATCH_SCTP is not set
# CONFIG_IP_NF_MATCH_COMMENT is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
# CONFIG_IP_NF_NAT_LOCAL is not set
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
# CONFIG_NET_SCH_HFSC is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
# CONFIG_NET_SCH_NETEM is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_SIS900=m
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
CONFIG_GAMEPORT=m
CONFIG_SOUND_GAMEPORT=m
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# CONFIG_GAMEPORT_EMU10K1 is not set
# CONFIG_GAMEPORT_VORTEX is not set
# CONFIG_GAMEPORT_FM801 is not set
# CONFIG_GAMEPORT_CS461x is not set
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=y
# CONFIG_INPUT_UINPUT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
CONFIG_NVRAM=y
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_INTEL_MCH is not set
# CONFIG_AGP_NVIDIA is not set
CONFIG_AGP_SIS=m
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HANGCHECK_TIMER=m
#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
# CONFIG_I2C_ALGOPCA is not set
#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_ISA=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
# CONFIG_I2C_STUB is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
# CONFIG_I2C_PCA_ISA is not set
#
# Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_MAX1619=m
# CONFIG_SENSORS_SMSC47M1 is not set
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
#
# Other I2C Chip support
#
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_RTC8564=m
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
# CONFIG_FB is not set
CONFIG_VIDEO_SELECT=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
#
# Sound
#
CONFIG_SOUND=m
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
# CONFIG_SND_VERBOSE_PRINTK is not set
CONFIG_SND_DEBUG=y
# CONFIG_SND_DEBUG_MEMORY is not set
# CONFIG_SND_DEBUG_DETECT is not set
#
# Generic devices
#
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=m
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VX222 is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
# CONFIG_USB is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
# CONFIG_EXT2_FS_POSIX_ACL is not set
# CONFIG_EXT2_FS_SECURITY is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISER4_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
CONFIG_ROMFS_FS=y
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
#
# Caches
#
# CONFIG_FSCACHE is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS_XATTR=y
# CONFIG_DEVPTS_FS_SECURITY is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=m
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_EXPORTFS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
CONFIG_AFS_FS=m
CONFIG_RXRPC=m
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=y
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
CONFIG_NLS_CODEPAGE_1250=y
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
CONFIG_DEBUG_PREEMPT=y
# CONFIG_DEBUG_INFO is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_KPROBES is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_4KSTACKS is not set
# CONFIG_SCHEDSTATS is not set
# CONFIG_KGDB is not set
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=m
# CONFIG_SECURITY_SECLVL is not set
# CONFIG_SECURITY_SELINUX is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=m
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_GENERIC_HARDIRQS=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
> i've released the -T4 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T4
I would have to say this is "very rough" at this point. I had the following
problems in the build:
[1] kernel/ksyms.c - undefined symbols
[2] kernel/mutex.c - obvious cut / paste problems
[3] XFS has incompatible mutex definition
[4] suspicious warnings
[5] missing symbols for modules
Details at the end.
I booted w/ SMP and the machine threw a lot of error messages about
sleeping in an invalid context. For example:
include/linux/rwsem.h:43
in_atomic():1 [00010001], irqs_disabled():1
[<c011f0ea>] __might_sleep+0xca/0xe0
[<c01390d4>] rw_mutex_read_lock+0x34/0x50
[<c0122dbd>] profile_hook+0x1d/0x50
[<c0123338>] profile_tick+0x68/0x70
[<c01150ad>] smp_apic_timer_interrupt+0x5d/0xf0
[<c0105820>] default_idle+0x0/0x40
[<c010854a>] apic_timer_interrupt+0x1a/0x20
[<c0105820>] default_idle+0x0/0x40
[<c011007b>] dmi_get_system_info+0xb/0x20
[<c010585a>] default_idle+0x3a/0x40
[<c03b4a4d>] start_kernel+0x19d/0x1e0
[<c03b4440>] unknown_bootoption+0x0/0x180
(somehow managed to stop the scrolling console with the above message
displayed...)
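For anyone reading the bracketed value in the message above, here is a minimal sketch of how it could be decoded. It assumes that [00010001] is the raw preempt count and that the fields follow the stock 2.6 i386 layout (8 preempt bits, 8 softirq bits, hardirq bits starting at bit 16). Neither assumption is confirmed by this report, and the -T4 patch may lay the fields out differently. The helper names are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Assumed field layout (stock 2.6 i386); not verified against -T4. */
#define DEMO_PREEMPT_BITS   8
#define DEMO_SOFTIRQ_BITS   8
#define DEMO_HARDIRQ_BITS  12

#define DEMO_PREEMPT_SHIFT  0
#define DEMO_SOFTIRQ_SHIFT  (DEMO_PREEMPT_SHIFT + DEMO_PREEMPT_BITS)
#define DEMO_HARDIRQ_SHIFT  (DEMO_SOFTIRQ_SHIFT + DEMO_SOFTIRQ_BITS)

/* Extract a bit field of the given width starting at the given shift. */
static unsigned int demo_field(unsigned int count, unsigned int shift,
                               unsigned int bits)
{
    assert(bits > 0 && bits < 32);
    return (count >> shift) & ((1u << bits) - 1u);
}

/* Hypothetical helpers, one per field of the count word. */
unsigned int demo_preempt_depth(unsigned int count)
{
    return demo_field(count, DEMO_PREEMPT_SHIFT, DEMO_PREEMPT_BITS);
}

unsigned int demo_softirq_depth(unsigned int count)
{
    return demo_field(count, DEMO_SOFTIRQ_SHIFT, DEMO_SOFTIRQ_BITS);
}

unsigned int demo_hardirq_depth(unsigned int count)
{
    return demo_field(count, DEMO_HARDIRQ_SHIFT, DEMO_HARDIRQ_BITS);
}

/* Print the three fields of a count word on one line. */
void demo_describe_count(unsigned int count)
{
    printf("hardirq=%u softirq=%u preempt=%u\n",
           demo_hardirq_depth(count),
           demo_softirq_depth(count),
           demo_preempt_depth(count));
}
```

Under those assumptions, 0x00010001 decodes to a hardirq depth of 1, a softirq depth of 0 and a preempt depth of 1, which would be consistent with the check firing from inside the APIC timer interrupt shown in the trace.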
Finally died with a kernel BUG
kernel BUG at kernel/latenc.c:419!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in microcode dm_mod uhci_hcd ext3 jbd
CPU: 1
EIP: 0001:[<00000000>] Not tainted VLI
EFLAGS: c1663f38 (2.6.9-rc4-mm1-VP-T4)
EIP is at 0x0
eax: 00000000 ebx: c1663f54 ecx: c0109a8c edx: c1663f78
esi: 0000000c edi: c1663f28 ebp: c1663f3c esp: c1663f78
ds: 007b es: 07b ss: 4f03 preempt: 00000001
Process swapper (pid: 0, threadinfo=c1662000 task=c165e550)
<0> Kernel panic - not syncing: Attempted to kill the idle task!
Rebooting with num_cpus=1 appeared to make it farther, but then the
console scrolled like crazy and finally said "console shuts up ..."
and the machine appeared to be hung. I could not scroll the window up
or down to see the full message, and had to power off / on to get the
machine back up. Going back to -T3 until I see some fixes.
If the machine managed to record some good data in /var/log/messages
I will send it separately to you for reference.
--Mark H Johnson
<mailto:[email protected]>
Details on build problems / work arounds follow:
[1] ksyms.c - I commented these lines out to get a complete build, but
there appears to be code that expects rtc_lock to be defined. See #5.
arch/i386/kernel/i386_ksyms.c:166 error: `rtc_lock' undeclared here (not in a function)
arch/i386/kernel/i386_ksyms.c:166 error: initializer element is not constant
arch/i386/kernel/i386_ksyms.c:166 error: (near initialization for `__ksymtab_rtc_lock.value')
followed by a similar error for atomic_dec_and_lock at line 177.
[2] mutex.c - several symbols were defined twice; fixed by changing the
names to those of the functions preceding them. See lines 108, 201, 213, 297.
[3] XFS compile failed as follows:
CC [M] fs/xfs/quota/xfs_dquot.o
In file included from fs/xfs/linux-2.6/xfs_linux.h:63,
from fs/xfs/xfs.h:35,
from fs/xfs/quota/xfs_dquot.c:33:
fs/xfs/linux-2.6/mutex.h:45: error: conflicting types for `mutex_t'
include/asm/spinlock.h:79: error: previous declaration of `mutex_t'
In file included from fs/xfs/linux-2.6/xfs_linux.h:102,
from fs/xfs/xfs.h:35,
from fs/xfs/quota/xfs_dquot.c:33:
fs/xfs/linux-2.6/xfs_vnode.h:578:30: macro "mutex_lock" requires 2 arguments, but only 1 given
fs/xfs/linux-2.6/xfs_vnode.h:585:30: macro "mutex_lock" requires 2 arguments, but only 1 given
fs/xfs/quota/xfs_dquot.c:1327:23: macro "mutex_lock" requires 2 arguments, but only 1 given
fs/xfs/quota/xfs_dquot.c:1390:41: macro "mutex_lock" requires 2 arguments, but only 1 given
fs/xfs/linux-2.6/xfs_vnode.h: In function `vn_flagset':
fs/xfs/linux-2.6/xfs_vnode.h:578: warning: statement with no effect
fs/xfs/linux-2.6/xfs_vnode.h: In function `vn_flagclr':
fs/xfs/linux-2.6/xfs_vnode.h:585: warning: statement with no effect
Turned off XFS in the build.
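The "requires 2 arguments, but only 1 given" errors look like a name collision between a function-like macro and a same-named one-argument call. Below is a minimal sketch of that kind of collision, with made-up names (demo_mutex_t, demo_mutex_lock); it is not the actual definition from the VP patch or from XFS. It also shows one standard C way to reach the function despite the macro, by parenthesizing the name so the preprocessor does not expand it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical lock type, for illustration only. */
typedef struct {
    int held;
    int last_flags;
} demo_mutex_t;

/* One-argument function, standing in for a subsystem's own lock routine.
 * It is defined before the macro below so its definition is not expanded. */
int demo_mutex_lock(demo_mutex_t *m)
{
    assert(m != NULL);
    m->held = 1;
    m->last_flags = 0;
    return 1;
}

/* Two-argument function-like macro with the same name, standing in for a
 * definition pulled in from a core header. */
#define demo_mutex_lock(m, flags) \
    ((m)->held = 1, (m)->last_flags = (flags), 2)

/* Uses the macro: two arguments, so the preprocessor expands it. */
int demo_lock_via_macro(demo_mutex_t *m)
{
    return demo_mutex_lock(m, 7);
}

/* Writing demo_mutex_lock(m) here would fail to preprocess with
 * 'macro "demo_mutex_lock" requires 2 arguments, but only 1 given'.
 * Parenthesizing the name suppresses macro expansion and calls the
 * function instead. */
int demo_lock_via_function(demo_mutex_t *m)
{
    return (demo_mutex_lock)(m);
}
```

Whether a rename, an #undef, or a change on the XFS side is the right fix for the real headers is a separate question; the sketch only illustrates why the one-argument call sites stop compiling.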
[4] I considered the following warnings to be "suspicious" but am not sure
if they are really problems or not.
CC security/selinux/ss/policydb.o
fs/dcache.c: In function `prune_dcache':
fs/dcache.c:391: warning: passing arg 1 of `cond_resched_lock' from incompatible pointer type
CC security/selinux/ss/services.o
CC fs/inode.o
fs/inode.c: In function `invalidate_list':
fs/inode.c:317: warning: passing arg 1 of `cond_resched_lock' from incompatible pointer type
[5] Several modules had undefined symbols. The messages were...
Kernel: arch/i386/boot/bzImage is ready
*** Warning: "mutex_trylock_bh" [drivers/net/ppp_synctty.ko] undefined!
*** Warning: "del_mtd_partitions" [drivers/mtd/maps/scx200_docflash.ko] undefined!
*** Warning: "add_mtd_partitions" [drivers/mtd/maps/scx200_docflash.ko] undefined!
*** Warning: "i2o_msg_in_to_virt" [drivers/message/i2o/i2o_scsi.ko] undefined!
*** Warning: "i2o_msg_out_to_virt" [drivers/message/i2o/i2o_core.ko] undefined!
*** Warning: "i2o_msg_in_to_virt" [drivers/message/i2o/i2o_core.ko] undefined!
*** Warning: "i2o_msg_in_to_virt" [drivers/message/i2o/i2o_block.ko] undefined!
*** Warning: "rtc_lock" [drivers/char/nvram.ko] undefined!
*** Warning: "rtc_lock" [drivers/char/mwave/mwave.ko] undefined!
*** Warning: "rtc_lock" [drivers/block/floppy.ko] undefined!
...
if [ -r System.map ]; then /sbin/depmod -ae -F System.map -b /var/tmp/kernel-2.6.9rc4mm1VPT4-root -r 2.6.9-rc4-mm1-VP-T4; fi
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/net/ppp_synctty.ko needs unknown symbol mutex_trylock_bh
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/mtd/maps/scx200_docflash.ko needs unknown symbol del_mtd_partitions
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/mtd/maps/scx200_docflash.ko needs unknown symbol add_mtd_partitions
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/message/i2o/i2o_scsi.ko needs unknown symbol i2o_msg_in_to_virt
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/message/i2o/i2o_core.ko needs unknown symbol i2o_msg_in_to_virt
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/message/i2o/i2o_core.ko needs unknown symbol i2o_msg_out_to_virt
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/message/i2o/i2o_block.ko needs unknown symbol i2o_msg_in_to_virt
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/char/nvram.ko needs unknown symbol rtc_lock
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/char/mwave/mwave.ko needs unknown symbol rtc_lock
WARNING: /var/tmp/kernel-2.6.9rc4mm1VPT4-root/lib/modules/2.6.9-rc4-mm1-VP-T4/kernel/drivers/block/floppy.ko needs unknown symbol rtc_lock
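For triage, the modpost warnings in [5] can be grouped by symbol, which
shows at a glance which missing export affects which modules. What follows
is only a rough Python sketch: it assumes the build output was saved as
text and that each warning has the `*** Warning: "sym" [module.ko]
undefined!` shape seen above. The helper name group_undefined is made up
for illustration and is not part of any kernel tooling.

```python
import re
from collections import defaultdict

# Matches modpost lines such as:
#   *** Warning: "rtc_lock" [drivers/char/nvram.ko] undefined!
# \s+ also matches a newline, so warnings wrapped by a mail client still match.
WARNING_RE = re.compile(r'\*\*\* Warning: "(\w+)"\s+\[([^\]]+)\]\s+undefined!')

def group_undefined(log_text):
    """Return {symbol: sorted list of modules that reference it}."""
    by_symbol = defaultdict(set)
    for symbol, module in WARNING_RE.findall(log_text):
        by_symbol[symbol].add(module)
    return {sym: sorted(mods) for sym, mods in by_symbol.items()}

# Sample input copied from the warnings above (one of them left wrapped).
sample = '''\
*** Warning: "mutex_trylock_bh" [drivers/net/ppp_synctty.ko] undefined!
*** Warning: "del_mtd_partitions" [drivers/mtd/maps/scx200_docflash.ko]
undefined!
*** Warning: "rtc_lock" [drivers/char/nvram.ko] undefined!
*** Warning: "rtc_lock" [drivers/block/floppy.ko] undefined!
'''

for symbol, modules in sorted(group_undefined(sample).items()):
    print(symbol, "->", ", ".join(modules))
```

Grouping this way makes it easier to see that a single missing export
(rtc_lock, for instance) accounts for several of the module warnings.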
> Ingo Molnar
>
> i've released the -T4 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T4
>
Very rough indeed. First and only attempt on my SMP/HT box stumbled very
early at make time (CONFIG_PREEMPT_REALTIME=y):
...
arch/i386/kernel/i386_ksyms.c:166: error: `rtc_lock' undeclared here (not
in a function)
arch/i386/kernel/i386_ksyms.c:166: error: initializer element is not constant
arch/i386/kernel/i386_ksyms.c:166: error: (near initialization for
`__ksymtab_rtc_lock.value')
arch/i386/kernel/i386_ksyms.c:177: error: `atomic_dec_and_lock' undeclared
here(not in a function)
arch/i386/kernel/i386_ksyms.c:177: error: initializer element is not constant
arch/i386/kernel/i386_ksyms.c:177: error: (near initialization for
`__ksymtab_atomic_dec_and_lock.value')
arch/i386/kernel/i386_ksyms.c:166: error: __ksymtab_rtc_lock causes a
section type conflict
arch/i386/kernel/i386_ksyms.c:177: error: __ksymtab_atomic_dec_and_lock
causes a section type conflict
make[1]: *** [arch/i386/kernel/i386_ksyms.o] Error 1
make: *** [arch/i386/kernel] Error 2
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
On Mon, 2004-10-11 at 17:22, Rui Nuno Capela wrote:
> > Ingo Molnar
> >
> > i've released the -T4 VP patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T4
> >
>
> Very rough indeed. First and only attempt on my SMP/HT box stumbled very
> early at make time (CONFIG_PREEMPT_REALTIME=y):
Ingo, any reason you didn't bump the major version number to U?
Maybe it's my problem, I just posted yesterday on LAU that number
changes are usually bugfix releases, and letter changes represent big,
possibly untested changes. So we had a few users who tried T4 and were
surprised it didn't work ;-)
Lee
* [email protected] <[email protected]> wrote:
> I would have to say this is "very rough" at this point. I had the
> following problems in the build:
i've uploaded -T5 which should fix most of the build issues:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T5
CONFIG_PREEMPT_REALTIME is still an experimental feature and defaults to
'n'.
-T5 will likely not fix the exit.c warnings, which, unless they are
accompanied by real crashes, should be mostly harmless. (famous last
words.) (The zombie task and self-reaping thread handling is a really
hard nut to crack, and i have nobody but me to blame for that code ...)
Ingo
On Mon, 11 Oct 2004 23:59:09 +0200
Ingo Molnar <[email protected]> wrote:
> i've uploaded -T5 which should fix most of the build issues:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T5
>
hi,
i still can't build it. First i reverse applied T4, then applied T5 and tried
a make bzImage. I'll try from scratch though to make sure, cause these
errors look identical to the T4 ones.
CLEAN .tmp_versions
CHK include/linux/version.h
HOSTCC scripts/basic/fixdep
HOSTCC scripts/basic/split-include
HOSTCC scripts/basic/docproc
HOSTCC scripts/genksyms/genksyms.o
HOSTCC scripts/genksyms/lex.o
HOSTCC scripts/genksyms/parse.o
HOSTLD scripts/genksyms/genksyms
CC scripts/mod/empty.o
HOSTCC scripts/mod/mk_elfconfig
MKELF scripts/mod/elfconfig.h
HOSTCC scripts/mod/file2alias.o
HOSTCC scripts/mod/modpost.o
HOSTCC scripts/mod/sumversion.o
HOSTLD scripts/mod/modpost
HOSTCC scripts/kallsyms
HOSTCC scripts/conmakehash
HOSTCC scripts/bin2c
CC arch/i386/kernel/asm-offsets.s
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:419: error: parse error before '*' token
include/linux/spinlock.h:419: warning: function declaration isn't a prototype
include/linux/spinlock.h:420: error: parse error before '*' token
include/linux/spinlock.h:420: warning: function declaration isn't a prototype
include/linux/spinlock.h:421: error: parse error before '*' token
include/linux/spinlock.h:421: warning: function declaration isn't a prototype
include/linux/spinlock.h:422: error: parse error before '*' token
include/linux/spinlock.h:422: warning: function declaration isn't a prototype
include/linux/spinlock.h:423: error: parse error before '*' token
include/linux/spinlock.h:423: warning: function declaration isn't a prototype
include/linux/spinlock.h:424: error: parse error before '*' token
include/linux/spinlock.h:424: warning: function declaration isn't a prototype
include/linux/spinlock.h:425: error: parse error before '*' token
include/linux/spinlock.h:425: warning: function declaration isn't a prototype
include/linux/spinlock.h:426: error: parse error before '*' token
include/linux/spinlock.h:426: warning: function declaration isn't a prototype
include/linux/spinlock.h:427: error: parse error before '*' token
include/linux/spinlock.h:427: warning: function declaration isn't a prototype
include/linux/spinlock.h:428: error: parse error before '*' token
include/linux/spinlock.h:428: warning: function declaration isn't a prototype
include/linux/spinlock.h:429: error: parse error before '*' token
include/linux/spinlock.h:429: warning: function declaration isn't a prototype
include/linux/spinlock.h:430: error: parse error before '*' token
include/linux/spinlock.h:430: warning: function declaration isn't a prototype
include/linux/spinlock.h:431: error: parse error before '*' token
include/linux/spinlock.h:431: warning: function declaration isn't a prototype
include/linux/spinlock.h:467: error: parse error before '*' token
include/linux/spinlock.h:467: warning: function declaration isn't a prototype
include/linux/spinlock.h:468: error: parse error before '*' token
include/linux/spinlock.h:468: warning: function declaration isn't a prototype
include/linux/spinlock.h:469: error: parse error before '*' token
include/linux/spinlock.h:469: warning: function declaration isn't a prototype
include/linux/spinlock.h:470: error: parse error before '*' token
include/linux/spinlock.h:470: warning: function declaration isn't a prototype
include/linux/spinlock.h:471: error: parse error before '*' token
include/linux/spinlock.h:471: warning: function declaration isn't a prototype
include/linux/spinlock.h:472: error: parse error before '*' token
include/linux/spinlock.h:472: warning: function declaration isn't a prototype
include/linux/spinlock.h:473: error: parse error before '*' token
include/linux/spinlock.h:473: warning: function declaration isn't a prototype
include/linux/spinlock.h:474: error: parse error before '*' token
include/linux/spinlock.h:474: warning: function declaration isn't a prototype
include/linux/spinlock.h:475: error: parse error before '*' token
include/linux/spinlock.h:475: warning: function declaration isn't a prototype
include/linux/spinlock.h:476: error: parse error before '*' token
include/linux/spinlock.h:476: warning: function declaration isn't a prototype
include/linux/spinlock.h:477: error: parse error before '*' token
include/linux/spinlock.h:477: warning: function declaration isn't a prototype
include/linux/spinlock.h:478: error: parse error before '*' token
include/linux/spinlock.h:478: warning: function declaration isn't a prototype
include/linux/spinlock.h:479: error: parse error before '*' token
include/linux/spinlock.h:479: warning: function declaration isn't a prototype
include/linux/spinlock.h:480: error: parse error before '*' token
include/linux/spinlock.h:480: warning: function declaration isn't a prototype
include/linux/spinlock.h:481: error: parse error before '*' token
include/linux/spinlock.h:481: warning: function declaration isn't a prototype
include/linux/spinlock.h:482: error: parse error before '*' token
include/linux/spinlock.h:482: warning: function declaration isn't a prototype
include/linux/spinlock.h:483: error: parse error before '*' token
include/linux/spinlock.h:483: warning: function declaration isn't a prototype
include/linux/spinlock.h:484: error: parse error before '*' token
include/linux/spinlock.h:484: warning: function declaration isn't a prototype
include/linux/spinlock.h:485: error: parse error before '*' token
include/linux/spinlock.h:485: warning: function declaration isn't a prototype
include/linux/spinlock.h:486: error: parse error before '*' token
include/linux/spinlock.h:486: warning: function declaration isn't a prototype
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:543:1: warning: "spin_lock_init" redefined
include/linux/spinlock.h:208:1: warning: this is the location of the previous definition
include/linux/spinlock.h:548:1: warning: "rwlock_init" redefined
include/linux/spinlock.h:226:1: warning: this is the location of the previous definition
include/linux/spinlock.h:553:1: warning: "spin_is_locked" redefined
include/linux/spinlock.h:210:1: warning: this is the location of the previous definition
include/linux/spinlock.h:563:1: warning: "spin_unlock_wait" redefined
include/linux/spinlock.h:212:1: warning: this is the location of the previous definition
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:669: error: parse error before "raw_spinlock_t"
include/linux/spinlock.h:669: warning: function declaration isn't a prototype
In file included from include/linux/time.h:7,
from include/linux/timex.h:58,
from include/linux/sched.h:11,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/seqlock.h:35: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:35: warning: no semicolon at end of struct or union
include/linux/seqlock.h:36: warning: type defaults to `int' in declaration of `seqlock_t'
include/linux/seqlock.h:36: warning: data definition has no type or storage class
include/linux/seqlock.h:50: error: parse error before '*' token
include/linux/seqlock.h:51: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `write_seqlock':
include/linux/seqlock.h:52: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h:52: error: (Each undeclared identifier is reported only once
include/linux/seqlock.h:52: error: for each function it appears in.)
include/linux/seqlock.h:52: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:52: error: `raw_spinlock_t' undeclared (first use in this function)
include/linux/seqlock.h:52: error: parse error before ')' token
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:52: error: parse error before "while"
include/linux/seqlock.h:57: error: parse error before '*' token
include/linux/seqlock.h:58: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `write_sequnlock':
include/linux/seqlock.h:60: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h:61: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:61: error: parse error before "while"
include/linux/seqlock.h:64: error: parse error before '*' token
include/linux/seqlock.h:65: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `write_tryseqlock':
include/linux/seqlock.h:66: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h:66: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:66: warning: unused variable `__ret'
include/linux/seqlock.h:66: error: parse error before "while"
include/linux/seqlock.h:66: error: `raw_spinlock_t' undeclared (first use in this function)
include/linux/seqlock.h:66: error: parse error before ')' token
include/linux/seqlock.h:66: warning: left-hand operand of comma expression has no effect
include/linux/seqlock.h:66: warning: unused variable `ret'
include/linux/seqlock.h:66: warning: no return statement in function returning non-void
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:66: error: parse error before ')' token
include/linux/seqlock.h:66: warning: type defaults to `int' in declaration of `__ret'
include/linux/seqlock.h:66: warning: data definition has no type or storage class
include/linux/seqlock.h:66: error: parse error before '}' token
include/linux/seqlock.h:76: warning: type defaults to `int' in declaration of `seqlock_t'
include/linux/seqlock.h:76: error: parse error before '*' token
include/linux/seqlock.h:77: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `read_seqbegin':
include/linux/seqlock.h:78: error: `sl' undeclared (first use in this function)
include/linux/seqlock.h: At top level:
include/linux/seqlock.h:91: warning: type defaults to `int' in declaration of `seqlock_t'
include/linux/seqlock.h:91: error: parse error before '*' token
include/linux/seqlock.h:92: warning: function declaration isn't a prototype
include/linux/seqlock.h: In function `read_seqretry':
include/linux/seqlock.h:94: error: `iv' undeclared (first use in this function)
include/linux/seqlock.h:94: error: `sl' undeclared (first use in this function)
In file included from include/linux/timex.h:58,
from include/linux/sched.h:11,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/time.h: At top level:
include/linux/time.h:83: error: parse error before "xtime_lock"
include/linux/time.h:83: warning: type defaults to `int' in declaration of `xtime_lock'
include/linux/time.h:83: warning: data definition has no type or storage class
In file included from include/asm/semaphore.h:41,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/wait.h:82: error: parse error before '*' token
include/linux/wait.h:83: warning: function declaration isn't a prototype
include/linux/wait.h: In function `init_waitqueue_head':
include/linux/wait.h:84: error: `q' undeclared (first use in this function)
include/linux/wait.h:84: error: `RAW_SPIN_LOCK_UNLOCKED' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:103: error: parse error before '*' token
include/linux/wait.h:104: warning: function declaration isn't a prototype
include/linux/wait.h: In function `waitqueue_active':
include/linux/wait.h:105: error: `q' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:117: error: parse error before '*' token
include/linux/wait.h:117: warning: function declaration isn't a prototype
include/linux/wait.h:118: error: parse error before '*' token
include/linux/wait.h:118: warning: function declaration isn't a prototype
include/linux/wait.h:119: error: parse error before '*' token
include/linux/wait.h:119: warning: function declaration isn't a prototype
include/linux/wait.h:121: error: parse error before '*' token
include/linux/wait.h:122: warning: function declaration isn't a prototype
include/linux/wait.h: In function `__add_wait_queue':
include/linux/wait.h:123: error: `new' undeclared (first use in this function)
include/linux/wait.h:123: error: `head' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:129: error: parse error before '*' token
include/linux/wait.h:131: warning: function declaration isn't a prototype
include/linux/wait.h: In function `__add_wait_queue_tail':
include/linux/wait.h:132: error: `new' undeclared (first use in this function)
include/linux/wait.h:132: error: `head' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:135: error: parse error before '*' token
include/linux/wait.h:137: warning: function declaration isn't a prototype
include/linux/wait.h: In function `__remove_wait_queue':
include/linux/wait.h:138: error: `old' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:141: error: parse error before '*' token
include/linux/wait.h:141: warning: function declaration isn't a prototype
include/linux/wait.h:142: error: parse error before '*' token
include/linux/wait.h:142: warning: function declaration isn't a prototype
include/linux/wait.h:143: error: parse error before '*' token
include/linux/wait.h:143: warning: function declaration isn't a prototype
include/linux/wait.h:144: error: parse error before '*' token
include/linux/wait.h:144: warning: function declaration isn't a prototype
include/linux/wait.h:145: error: parse error before '*' token
include/linux/wait.h:145: error: `__wait_on_bit' declared as function returning a function
include/linux/wait.h:145: warning: function declaration isn't a prototype
include/linux/wait.h:145: error: parse error before "unsigned"
include/linux/wait.h:146: error: parse error before '*' token
include/linux/wait.h:146: error: `__wait_on_bit_lock' declared as function returning a function
include/linux/wait.h:146: warning: function declaration isn't a prototype
include/linux/wait.h:146: error: parse error before "unsigned"
include/linux/wait.h:150: error: parse error before '*' token
include/linux/wait.h:150: warning: type defaults to `int' in declaration of `bit_waitqueue'
include/linux/wait.h:150: warning: data definition has no type or storage class
include/linux/wait.h:288: error: parse error before '*' token
include/linux/wait.h:290: warning: function declaration isn't a prototype
include/linux/wait.h: In function `add_wait_queue_exclusive_locked':
include/linux/wait.h:291: error: `wait' undeclared (first use in this function)
include/linux/wait.h:292: error: `q' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:298: error: parse error before '*' token
include/linux/wait.h:300: warning: function declaration isn't a prototype
include/linux/wait.h: In function `remove_wait_queue_locked':
include/linux/wait.h:301: error: `q' undeclared (first use in this function)
include/linux/wait.h:301: error: `wait' undeclared (first use in this function)
include/linux/wait.h: At top level:
include/linux/wait.h:309: error: parse error before '*' token
include/linux/wait.h:309: warning: function declaration isn't a prototype
include/linux/wait.h:310: error: parse error before '*' token
include/linux/wait.h:310: warning: function declaration isn't a prototype
include/linux/wait.h:312: error: parse error before '*' token
include/linux/wait.h:312: warning: function declaration isn't a prototype
include/linux/wait.h:313: error: parse error before '*' token
include/linux/wait.h:313: warning: function declaration isn't a prototype
include/linux/wait.h:319: error: parse error before '*' token
include/linux/wait.h:319: warning: function declaration isn't a prototype
include/linux/wait.h:321: error: parse error before '*' token
include/linux/wait.h:321: warning: function declaration isn't a prototype
include/linux/wait.h:323: error: parse error before '*' token
include/linux/wait.h:323: warning: function declaration isn't a prototype
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:26: error: `raw_spinlock_t' used prior to declaration
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:80:1: warning: "SPIN_LOCK_UNLOCKED" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:199:1: warning: this is the location of the previous definition
include/asm/spinlock.h:86: error: conflicting types for `spinlock_t'
include/linux/spinlock.h:198: error: previous declaration of `spinlock_t'
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:93:1: warning: "RW_LOCK_UNLOCKED" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:220:1: warning: this is the location of the previous definition
include/asm/spinlock.h:99: error: conflicting types for `rwlock_t'
include/linux/spinlock.h:219: error: previous declaration of `rwlock_t'
include/asm/spinlock.h:165: error: parse error before "do"
include/asm/spinlock.h:197: error: parse error before "void"
include/asm/spinlock.h:207: error: parse error before "do"
include/asm/spinlock.h:259: error: parse error before "do"
include/asm/spinlock.h:267: error: parse error before "do"
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:275:1: warning: "_raw_read_unlock" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:228:1: warning: this is the location of the previous definition
In file included from include/asm/rwsem.h:42,
from include/linux/rwsem.h:27,
from include/asm/semaphore.h:42,
from include/linux/sched.h:19,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/spinlock.h:276:1: warning: "_raw_write_unlock" redefined
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:230:1: warning: this is the location of the previous definition
include/asm/spinlock.h:278: error: parse error before '{' token
include/asm/spinlock.h:281: warning: type defaults to `int' in declaration of `atomic_dec'
include/asm/spinlock.h:281: warning: parameter names (without types) in function declaration
include/asm/spinlock.h:281: error: conflicting types for `atomic_dec'
include/asm/atomic.h:116: error: previous declaration of `atomic_dec'
include/asm/spinlock.h:281: warning: data definition has no type or storage class
include/asm/spinlock.h:282: error: parse error before "if"
include/asm/spinlock.h:284: warning: type defaults to `int' in declaration of `atomic_inc'
include/asm/spinlock.h:284: warning: parameter names (without types) in function declaration
include/asm/spinlock.h:284: error: conflicting types for `atomic_inc'
include/asm/atomic.h:102: error: previous declaration of `atomic_inc'
include/asm/spinlock.h:284: warning: data definition has no type or storage class
include/asm/spinlock.h:285: error: parse error before "return"
include/asm/spinlock.h:288: error: parse error before '{' token
include/asm/spinlock.h:293: error: parse error before numeric constant
include/asm/spinlock.h:293: warning: type defaults to `int' in declaration of `atomic_add'
include/asm/spinlock.h:293: warning: function declaration isn't a prototype
include/asm/spinlock.h:293: error: conflicting types for `atomic_add'
include/asm/atomic.h:53: error: previous declaration of `atomic_add'
include/asm/spinlock.h:293: warning: data definition has no type or storage class
In file included from arch/i386/kernel/asm-offsets.c:7:
include/linux/sched.h:847:56: macro "_spin_lock_irqsave" requires 2 arguments, but only 1 given
In file included from arch/i386/kernel/asm-offsets.c:7:
include/linux/sched.h: In function `dequeue_signal_lock':
include/linux/sched.h:847: error: `_spin_lock_irqsave' undeclared (first use in this function)
include/linux/seqlock.h: In function `write_tryseqlock':
include/linux/seqlock.h:66: warning: statement with no effect
make[1]: *** [arch/i386/kernel/asm-offsets.s] Error 1
make: *** [arch/i386/kernel/asm-offsets.s] Error 2
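A cascade this long usually hangs off a handful of early failures, so it
can help to reduce the log to the first error reported in each file before
comparing a T4 build against a T5 build. A minimal Python sketch, assuming
diagnostics of the `path:line: error: text` form shown above; the helper
name first_errors is hypothetical.

```python
import re

# Diagnostics tagged "error:" look like "path:line: error: text"; an optional
# column ("path:line:col: ...") is tolerated. Warnings and untagged lines
# (such as the 'macro ... requires 2 arguments' messages) are skipped.
ERROR_RE = re.compile(r'^(\S+?):(\d+)(?::\d+)?: error: (.*)$')

def first_errors(log_text):
    """Return [(file, line, message)] holding the first error seen per file."""
    seen = {}
    for raw in log_text.splitlines():
        m = ERROR_RE.match(raw.strip())
        if m and m.group(1) not in seen:
            seen[m.group(1)] = (m.group(1), int(m.group(2)), m.group(3))
    # dicts keep insertion order (Python 3.7+), so this is log order
    return list(seen.values())

# Sample lines copied from the build output above.
sample = """\
include/linux/spinlock.h:419: error: parse error before '*' token
include/linux/spinlock.h:419: warning: function declaration isn't a prototype
include/linux/spinlock.h:669: error: parse error before "raw_spinlock_t"
include/linux/seqlock.h:35: error: parse error before "raw_spinlock_t"
include/asm/spinlock.h:26: error: `raw_spinlock_t' used prior to declaration
"""

for path, line, msg in first_errors(sample):
    print(f"{path}:{line}: {msg}")
```

Running the same reduction over two build logs and diffing the results is
one way to check whether the errors really are identical between patches.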
On Tue, 12 Oct 2004 00:57:54 +0200
Florian Schmidt <[email protected]> wrote:
> hi,
>
> i still can't build it. First i reverse applied T4, then applied T5 and tried
> a make bzImage. I'll try from scratch though to make sure, cause these
> errors look identical to the T4 ones.
>
same errors.. Both with the preemptible real time thingy and without..
flo
Ingo Molnar wrote:
>
> i've uploaded -T5 which should fix most of the build issues:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T5
>
Sharp roughness is the name of the game ;)
Built fine here, for P4 SMP/HT, and CONFIG_PREEMPT_REALTIME=y.
But, not surprisingly, booting/initing gives me lots of fireworks. This is
just some (tiny) sample taken from syslog:
Oct 12 00:15:26 gamma-suse1 kernel: Debug: sleeping function called from
invalid context pdflush(120) at kernel/mutex.c:23
Oct 12 00:15:26 gamma-suse1 kernel: in_atomic():1 [00000001],
irqs_disabled():1
Oct 12 00:15:26 gamma-suse1 kernel: [__might_sleep+193/212]
__might_sleep+0xc1/0xd4
Oct 12 00:15:26 gamma-suse1 kernel: [<c0118fca>] __might_sleep+0xc1/0xd4
Oct 12 00:15:26 gamma-suse1 kernel: [_mutex_lock+39/96]
_mutex_lock+0x27/0x60
Oct 12 00:15:26 gamma-suse1 kernel: [<c0130d0c>] _mutex_lock+0x27/0x60
Oct 12 00:15:26 gamma-suse1 kernel: [_mutex_lock_irqsave+22/26]
_mutex_lock_irqsave+0x16/0x1a
Oct 12 00:15:26 gamma-suse1 kernel: [<c0130d7d>]
_mutex_lock_irqsave+0x16/0x1a
Oct 12 00:15:26 gamma-suse1 kernel: [page_address+78/158]
page_address+0x4e/0x9e
Oct 12 00:15:26 gamma-suse1 kernel: [<c0149c97>] page_address+0x4e/0x9e
Oct 12 00:15:26 gamma-suse1 kernel: [bio_hw_segments+12/48]
bio_hw_segments+0xc/0x30
Oct 12 00:15:26 gamma-suse1 kernel: [<c01617ca>] bio_hw_segments+0xc/0x30
Oct 12 00:15:26 gamma-suse1 kernel: [__make_request+1010/1251]
__make_request+0x3f2/0x4e3
Oct 12 00:15:26 gamma-suse1 kernel: [<c023db69>] __make_request+0x3f2/0x4e3
Oct 12 00:15:26 gamma-suse1 kernel: [generic_make_request+298/662]
generic_make_request+0x12a/0x296
Oct 12 00:15:26 gamma-suse1 kernel: [<c023dee4>]
generic_make_request+0x12a/0x296
...
Oct 12 00:15:26 gamma-suse1 kernel: Debug: sleeping function called from
invalid context modprobe(1666) at kernel/mutex.c:23
Oct 12 00:15:26 gamma-suse1 kernel: in_atomic():1 [00000001],
irqs_disabled():1
Oct 12 00:15:26 gamma-suse1 kernel: [__might_sleep+193/212]
__might_sleep+0xc1/0xd4
Oct 12 00:15:26 gamma-suse1 kernel: [<c0118fca>] __might_sleep+0xc1/0xd4
Oct 12 00:15:26 gamma-suse1 kernel: [_mutex_lock+39/96]
_mutex_lock+0x27/0x60
Oct 12 00:15:26 gamma-suse1 kernel: [<c0130d0c>] _mutex_lock+0x27/0x60
Oct 12 00:15:26 gamma-suse1 kernel: [_mutex_lock_irqsave+22/26]
_mutex_lock_irqsave+0x16/0x1a
Oct 12 00:15:26 gamma-suse1 kernel: [<c0130d7d>]
_mutex_lock_irqsave+0x16/0x1a
Oct 12 00:15:26 gamma-suse1 kernel: [page_address+78/158]
page_address+0x4e/0x9e
Oct 12 00:15:26 gamma-suse1 kernel: [<c0149c97>] page_address+0x4e/0x9e
Oct 12 00:15:26 gamma-suse1 kernel: [bio_hw_segments+12/48]
bio_hw_segments+0xc/0x30
Oct 12 00:15:26 gamma-suse1 kernel: [<c01617ca>] bio_hw_segments+0xc/0x30
Oct 12 00:15:26 gamma-suse1 kernel: [__make_request+1010/1251]
__make_request+0x3f2/0x4e3
Oct 12 00:15:26 gamma-suse1 kernel: [<c023db69>] __make_request+0x3f2/0x4e3
Oct 12 00:15:26 gamma-suse1 kernel: [generic_make_request+298/662]
generic_make_request+0x12a/0x296
Oct 12 00:15:26 gamma-suse1 kernel: [<c023dee4>]
generic_make_request+0x12a/0x296
...
Oct 12 00:15:26 gamma-suse1 kernel: Debug: sleeping function called from
invalid context blogd(1851) at kernel/mutex.c:23
Oct 12 00:15:26 gamma-suse1 kernel: in_atomic():1 [00000001],
irqs_disabled():1
Oct 12 00:15:26 gamma-suse1 kernel: [__might_sleep+193/212]
__might_sleep+0xc1/0xd4
Oct 12 00:15:26 gamma-suse1 kernel: [<c0118fca>] __might_sleep+0xc1/0xd4
Oct 12 00:15:26 gamma-suse1 kernel: [_mutex_lock+39/96]
_mutex_lock+0x27/0x60
Oct 12 00:15:26 gamma-suse1 kernel: [<c0130d0c>] _mutex_lock+0x27/0x60
Oct 12 00:15:26 gamma-suse1 kernel: [_mutex_lock_irqsave+22/26]
_mutex_lock_irqsave+0x16/0x1a
Oct 12 00:15:26 gamma-suse1 kernel: [<c0130d7d>]
_mutex_lock_irqsave+0x16/0x1a
Oct 12 00:15:26 gamma-suse1 kernel: [page_address+78/158]
page_address+0x4e/0x9e
Oct 12 00:15:26 gamma-suse1 kernel: [<c0149c97>] page_address+0x4e/0x9e
Oct 12 00:15:26 gamma-suse1 kernel: [bio_hw_segments+12/48]
bio_hw_segments+0xc/0x30
Oct 12 00:15:26 gamma-suse1 kernel: [<c01617ca>] bio_hw_segments+0xc/0x30
Oct 12 00:15:26 gamma-suse1 kernel: [__make_request+1010/1251]
__make_request+0x3f2/0x4e3
Oct 12 00:15:26 gamma-suse1 kernel: [<c023db69>] __make_request+0x3f2/0x4e3
Oct 12 00:15:26 gamma-suse1 kernel: [generic_make_request+298/662]
generic_make_request+0x12a/0x296
Oct 12 00:15:26 gamma-suse1 kernel: [<c023dee4>]
generic_make_request+0x12a/0x296
...
And so on, and so on...
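Since the same trace repeats for several processes, a quick tally of the
report headers shows whether they all point at one call site. This is only
a sketch in Python, assuming the syslog text is available as a string and
that each report starts with the "sleeping function called from invalid
context" header seen above; count_reports is a made-up helper name.

```python
import re
from collections import Counter

# One report header looks like this (wrapped across two lines in the mail):
#   Debug: sleeping function called from
#   invalid context pdflush(120) at kernel/mutex.c:23
REPORT_RE = re.compile(
    r'sleeping function called from\s+invalid context\s+'
    r'([^\s(]+)\((\d+)\) at (\S+):(\d+)'
)

def count_reports(log_text):
    """Count reports per (process name, 'file:line' of the check that fired)."""
    counts = Counter()
    for comm, _pid, path, line in REPORT_RE.findall(log_text):
        counts[(comm, f"{path}:{line}")] += 1
    return counts

# Sample lines copied from the syslog excerpt above.
sample = """\
Oct 12 00:15:26 gamma-suse1 kernel: Debug: sleeping function called from
invalid context pdflush(120) at kernel/mutex.c:23
Oct 12 00:15:26 gamma-suse1 kernel: in_atomic():1 [00000001],
irqs_disabled():1
Oct 12 00:15:26 gamma-suse1 kernel: Debug: sleeping function called from
invalid context modprobe(1666) at kernel/mutex.c:23
"""

for (comm, where), n in sorted(count_reports(sample).items()):
    print(f"{comm:10s} {where}  x{n}")
```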
--
rncbc aka Rui Nuno Capela
[email protected]
On Mon, 2004-10-11 at 19:14, Florian Schmidt wrote:
> On Tue, 12 Oct 2004 00:57:54 +0200
> Florian Schmidt <[email protected]> wrote:
>
> > hi,
> >
> > i still can't build it. First i reverse applied T4, then applied T5 and tried
> > a make bzImage. I'll try from scratch though to make sure, cause these
> > errors look identical to the T4 ones.
> >
>
> same errors.. Both with the preemptible real time thingy and without..
>
Try building for SMP. I suspect this is a UP build problem.
Lee
On Mon, 2004-10-11 at 17:59, Ingo Molnar wrote:
> * [email protected] <[email protected]> wrote:
>
> > I would have to say this is "very rough" at this point. I had the
> > following problems in the build:
>
> i've uploaded -T5 which should fix most of the build issues:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T5
>
Ingo, are any of the VP patches known to work on x86_64? Here is thewade's
latest report:
--
I applied the patch and the kernel built, but like 2.6.9-mm2-VP-S9 it
crashed before it could load. The last bit of the message was a lot of
what I guess are frame pointers, but there were a few lines that had
info.
For example, an RIP message having to do with add_preempt_count+16.
But it all ended with Aieee...
I have yet to get any VP kernel to run on my x86_64. I suppose I should
try just the mm3 or mm4 patches without the VP portion, so that is what
I will do.
--
Lee
Ingo Molnar wrote:
> * [email protected] <[email protected]> wrote:
>
>
>>I would have to say this is "very rough" at this point. I had the
>>following problems in the build:
>
>
> i've uploaded -T5 which should fix most of the build issues:
>
This fixed the build problems for me (SMP). I did get one unresolved
symbol when building this with REALTIME enabled. Also got error messages
scrolling up the screen when I tried to boot it (looked very much like
Mark's problem with T4) and it never made it. :( If I had to guess, it
might be related to APICs? I always have to use "noapic" boot parameter.
Ingo what are you running this on? I don't have the exact error
messages, but I'm rebuilding it now to try to get those. Without RT
Preemption it seems to be running very nicely.
kr
On Thu, 2004-10-07 at 06:52, Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
Just to recap, these are the three problem areas that still produce
latencies over 500 usec on my machine.
journal_clean_checkpoint_list
rt_garbage_collect
vga console
I have found that the latter does not require switching back and forth
to X; anything that produces a lot of console output can trigger 500
usec latencies.
Lee
* K.R. Foley <[email protected]> wrote:
> >i've uploaded -T5 which should fix most of the build issues:
> >
>
> This fixed the build problems for me (SMP). I did get one unresolved
> symbol when building this with REALTIME enabled.
(which symbol was this?)
> [...] Also got error messages scrolling up the screen when I tried to
> boot it (looked very much like Mark's problem with T4) and it never
> made it. :( If I had to guess, it might be related to APICs? I always
> have to use "noapic" boot parameter. Ingo what are you running this
> on? I don't have the exact error messages, but I'm rebuilding it now
> to try to get those. Without RT Preemption it seems to be running very
> nicely.
don't worry about it not booting on your setup with PREEMPT_REALTIME, as
long as it boots with !PREEMPT_REALTIME - i only really converted my
testsystems which are basically IDE + e100/e1000/rtl8139, ext3 and the
bare minimum that is needed to run Fedora. It might be useful to send me
a bootlog if you have any easy way to capture it - if not it's not a big
problem either.
Ingo
* Florian Schmidt <[email protected]> wrote:
> On Tue, 12 Oct 2004 00:57:54 +0200
> Florian Schmidt <[email protected]> wrote:
>
> > hi,
> >
> > i still can't build it. First i reverse applied T4, then applied T5 and tried
> > a make bzImage. I'll try from scratch though to make sure, cause these
> > errors look identical to the T4 ones.
> >
>
> same errors.. Both with the preemptible real time thingy and without..
could you send me your .config? Had to do some wacky include file magic
to be able to use semaphores in spinlocks, but could easily have missed
some .config variations.
Ingo
* Lee Revell <[email protected]> wrote:
> On Mon, 2004-10-11 at 17:59, Ingo Molnar wrote:
> > * [email protected] <[email protected]> wrote:
> >
> > > I would have to say this is "very rough" at this point. I had the
> > > following problems in the build:
> >
> > i've uploaded -T5 which should fix most of the build issues:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T5
> >
>
> Ingo, are any of the VP patches known to work on x64? Here is
> thewade's latest report:
i have two x64 boxes which i tested with S* -VP and i also did a quick
testboot of -T3 on one of them but YMMV. (There's one caveat: latency
tracing must not be enabled, x64 gcc has a nastiness that makes -pg
unusable for tracing purposes on x64.)
-T4 and upwards will likely not even compile on x64 - i'll fix it up
once the rate of PREEMPT_REALTIME changes calms down.
Ingo
Lee Revell wrote:
> On Mon, 2004-10-11 at 19:14, Florian Schmidt wrote:
> > On Tue, 12 Oct 2004 00:57:54 +0200
> > Florian Schmidt <[email protected]> wrote:
> >
> > > hi,
> > >
> > > i still can't build it. First i reverse applied T4, then applied T5 and tried
> > > a make bzImage. I'll try from scratch though to make sure, cause these
> > > errors look identical to the T4 ones.
> > >
> >
> > same errors.. Both with the preemptible real time thingy and without..
> >
>
> Try building for SMP. I suspect this is a UP build problem.
I got the same errors...
Struct mutex_t is defined in include/asm-i386/spinlock.h. That header is
only included from include/linux/spinlock.h if CONFIG_SMP is set, but
mutex_t is used at include/linux/spinlock.h:419 regardless. With
CONFIG_SMP=y the kernel builds successfully here.
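To check quickly whether a given .config is a UP build (and so likely to hit this), something like the following Python sketch could be used. It assumes the usual .config layout of CONFIG_FOO=value lines and "# CONFIG_FOO is not set" comments; the function names are hypothetical.

```python
def read_config_options(text):
    """Map option names to values; options marked "is not set" map to None."""
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # Drop the leading "# " and the trailing " is not set".
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value
    return options

def is_up_build(options):
    """True when CONFIG_SMP is absent or not set, i.e. a uniprocessor build."""
    return options.get("CONFIG_SMP") is None
```

Usage: pass the contents of the .config file to read_config_options() and then call is_up_build() on the result.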
--
Best Regards,
Wen-chien Jesse Sung
i've uploaded -T6:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T6
this should fix the UP build issues reported by many. -T6 also brings
back the ->break_lock framework and converts a few more locks to raw.
SMP is still expected to be flaky due to the zombie-task problem(s). But
UP is not out of the 'extremely experimental' status either.
to create a -T6 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T6
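The patching order above can be sketched as a list of shell commands. This is only an illustration: it assumes all four files were downloaded into the current directory, that the tarball unpacks into a directory named after it, and that every patch applies with the conventional -p1 strip level. The helper name patch_commands is hypothetical.

```python
def patch_commands(tarball, patches):
    """Return shell commands that unpack the tarball and apply patches in order."""
    tree = tarball[:-len(".tar.bz2")]  # e.g. "linux-2.6.8"
    cmds = ["tar xjf " + tarball, "cd " + tree]
    for name in patches:
        if name.endswith(".bz2"):
            # Compressed patches are decompressed on the fly.
            cmds.append("bzcat ../%s | patch -p1" % name)
        else:
            cmds.append("patch -p1 < ../%s" % name)
    return cmds

for cmd in patch_commands(
    "linux-2.6.8.tar.bz2",
    ["patch-2.6.9-rc4.bz2", "2.6.9-rc4-mm1.bz2",
     "voluntary-preempt-2.6.9-rc4-mm1-T6"],
):
    print(cmd)
```

The order matters: each patch is generated against the tree produced by the previous step, so applying them out of sequence will produce rejects.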
Ingo
* Lee Revell <[email protected]> wrote:
> Just to recap, these are the three problem areas that still produce
> latencies over 500 usec on my machine.
>
> journal_clean_checkpoint_list
you might want to send this trace to Andrew too - the primary master of
ext3 latency-breaking.
> rt_garbage_collect
this one is still nasty and needs revisiting.
> vga console
>
> I have found that the latter does not require switching back and forth
> to X; anything that produces a lot of console output can trigger 500
> usec latencies.
the vga console one we got rid of at a certain stage and it now
resurfaced. The issue was doing VGA-text-RAM copies/memsets under the
vga_lock. Maybe there were changes in vgacon recently that moved some of
those back under the lock?
Ingo
Ingo Molnar wrote:
> i've uploaded -T6:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T6
>
> this should fix the UP build issues reported by many. -T6 also brings
> back the ->break_lock framework and converts a few more locks to raw.
The UP build still fails:
arch/i386/kernel/vm86.c:707: error: `__RAW_SPIN_LOCK_UNLOCKED' undeclared here (not in a function)
--
Best Regards,
Wen-chien Jesse Sung
* Ingo Molnar <[email protected]> wrote:
> i've uploaded -T6:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T6
>
> this should fix the UP build issues reported by many. -T6 also brings
> back the ->break_lock framework and converts a few more locks to raw.
>
> SMP is still expected to be flaky due to the zombie-task problem(s).
> But UP is not out of the 'extremely experimental' status either.
one more warning wrt. PREEMPT_REALTIME: if this option is enabled then
it is not safe to make interrupts non-threaded via
/proc/irq/*/*/threaded. If you need to turn an interrupt into a
high-prio event then its irq thread should be set to RT priority via
'chrt'. (-T7 will turn off /proc/irq/*/*/threaded altogether, to make
sure it's not set accidentally.)
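As a rough illustration of the 'chrt' route, the Python sketch below builds the command line that gives an existing task a SCHED_FIFO priority. The helper name chrt_fifo_argv is hypothetical, and finding the pid of the irq thread (e.g. via ps) is left out because the thread naming is not described here.

```python
def chrt_fifo_argv(pid, priority):
    """Build the argv for `chrt -f -p PRIORITY PID` (SCHED_FIFO on an existing task)."""
    if not 1 <= priority <= 99:
        raise ValueError("SCHED_FIFO priorities on Linux range from 1 to 99")
    return ["chrt", "-f", "-p", str(priority), str(pid)]

# Example (needs root; the pid is a placeholder for the irq thread's pid):
#   import subprocess
#   subprocess.run(chrt_fifo_argv(1234, 50), check=True)
```

The priority value is a matter of policy: it only needs to be higher than that of the other RT tasks the interrupt is supposed to preempt.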
Ingo
* Ingo Molnar <[email protected]> wrote:
> one more warning wrt. PREEMPT_REALTIME: if this option is enabled then
> it is not safe to make interrupts non-threaded via
> /proc/irq/*/*/threaded. If you need to turn an interrupt into a
> high-prio event then its irq thread should be set to RT priority via
> 'chrt'. (-T7 will turn off /proc/irq/*/*/threaded altogether, to make
> sure it's not set accidentally.)
in fact i've re-uploaded a new version of the -T6 patch to disable
direct interrupts under PREEMPT_REALTIME kernels. The only exception is
IRQ1 on PCs (the keyboard irq), which can be useful for debugging
purposes (SysRq, etc.). I turned the keyboard related locks into raw
spinlocks to make this safe.
Ingo
On Tue, 12 Oct 2004 08:12:01 +0200
Ingo Molnar <[email protected]> wrote:
> > same errors.. Both with the preemptible real time thingy and without..
>
> could you send me your .config? Had to do some wacky include file magic
> to be able to use semaphores in spinlocks, but could easily have missed
> some .config variations.
Sure,
you released T6 already, but here's my T5 config anyway (this one w/o CONFIG_PREEMPT_REALTIME, same result with it enabled though). Gonna try building T6 now:
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.9-rc4-mm1-VP-T5
# Tue Oct 12 01:13:25 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
#
# General setup
#
CONFIG_LOCALVERSION="-LT"
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_HOTPLUG is not set
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_PREEMPT_TIMING=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
# CONFIG_PREEMPT_REALTIME is not set
# CONFIG_X86_UP_APIC is not set
CONFIG_X86_TSC=y
# CONFIG_X86_MCE is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
CONFIG_APM_REAL_MODE_POWER_OFF=y
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_DEBUG_DRIVER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
#
# Block devices
#
CONFIG_BLK_DEV_FD=m
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_LBD is not set
# CONFIG_CDROM_PKTCDVD is not set
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
CONFIG_BLK_DEV_IDESCSI=m
# CONFIG_IDE_TASK_IOCTL is not set
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
CONFIG_BLK_DEV_SIS5513=y
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=m
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=m
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLOGIC_1280_1040 is not set
CONFIG_SCSI_QLA2XXX=m
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
CONFIG_NETLINK_DEV=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
# CONFIG_IP_MULTIPLE_TABLES is not set
# CONFIG_IP_ROUTE_MULTIPATH is not set
# CONFIG_IP_ROUTE_VERBOSE is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
# CONFIG_IP_NF_CT_ACCT is not set
# CONFIG_IP_NF_CT_PROTO_SCTP is not set
# CONFIG_IP_NF_FTP is not set
# CONFIG_IP_NF_IRC is not set
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
# CONFIG_IP_NF_MATCH_SCTP is not set
# CONFIG_IP_NF_MATCH_COMMENT is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
# CONFIG_IP_NF_NAT_LOCAL is not set
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
# CONFIG_NET_SCH_HFSC is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
# CONFIG_NET_SCH_NETEM is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_SIS900=m
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
CONFIG_GAMEPORT=m
CONFIG_SOUND_GAMEPORT=m
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# CONFIG_GAMEPORT_EMU10K1 is not set
# CONFIG_GAMEPORT_VORTEX is not set
# CONFIG_GAMEPORT_FM801 is not set
# CONFIG_GAMEPORT_CS461x is not set
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=y
# CONFIG_INPUT_UINPUT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
CONFIG_NVRAM=y
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_INTEL_MCH is not set
# CONFIG_AGP_NVIDIA is not set
CONFIG_AGP_SIS=m
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HANGCHECK_TIMER=m
#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
# CONFIG_I2C_ALGOPCA is not set
#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
CONFIG_I2C_ALI1563=m
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
CONFIG_I2C_AMD8111=m
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_ISA=m
CONFIG_I2C_NFORCE2=m
CONFIG_I2C_PARPORT_LIGHT=m
CONFIG_I2C_PIIX4=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
CONFIG_SCx200_ACB=m
CONFIG_I2C_SIS5595=m
CONFIG_I2C_SIS630=m
CONFIG_I2C_SIS96X=m
# CONFIG_I2C_STUB is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
# CONFIG_I2C_PCA_ISA is not set
#
# Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_ADM1021=m
CONFIG_SENSORS_ADM1025=m
CONFIG_SENSORS_ADM1031=m
CONFIG_SENSORS_ASB100=m
CONFIG_SENSORS_DS1621=m
CONFIG_SENSORS_FSCHER=m
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_IT87=m
CONFIG_SENSORS_LM75=m
CONFIG_SENSORS_LM77=m
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
CONFIG_SENSORS_LM83=m
CONFIG_SENSORS_LM85=m
CONFIG_SENSORS_LM90=m
CONFIG_SENSORS_MAX1619=m
# CONFIG_SENSORS_SMSC47M1 is not set
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
CONFIG_SENSORS_W83L785TS=m
CONFIG_SENSORS_W83627HF=m
#
# Other I2C Chip support
#
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCF8591=m
CONFIG_SENSORS_RTC8564=m
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
# CONFIG_FB is not set
CONFIG_VIDEO_SELECT=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
#
# Sound
#
CONFIG_SOUND=m
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
# CONFIG_SND_VERBOSE_PRINTK is not set
CONFIG_SND_DEBUG=y
# CONFIG_SND_DEBUG_MEMORY is not set
# CONFIG_SND_DEBUG_DETECT is not set
#
# Generic devices
#
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=m
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
# CONFIG_SND_INTEL8X0 is not set
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VX222 is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
# CONFIG_USB is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
# CONFIG_EXT2_FS_POSIX_ACL is not set
# CONFIG_EXT2_FS_SECURITY is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISER4_FS is not set
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
CONFIG_ROMFS_FS=y
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
#
# Caches
#
# CONFIG_FSCACHE is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS_XATTR=y
# CONFIG_DEVPTS_FS_SECURITY is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=m
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_EXPORTFS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
CONFIG_AFS_FS=m
CONFIG_RXRPC=m
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=y
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
CONFIG_NLS_CODEPAGE_1250=y
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_SPINLOCK is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
CONFIG_DEBUG_PREEMPT=y
# CONFIG_DEBUG_INFO is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_KPROBES is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_4KSTACKS is not set
# CONFIG_SCHEDSTATS is not set
# CONFIG_KGDB is not set
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=m
# CONFIG_SECURITY_SECLVL is not set
# CONFIG_SECURITY_SELINUX is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=m
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_GENERIC_HARDIRQS=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
* Wen-chien Jesse Sung <[email protected]> wrote:
> > this should fix the UP build issues reported by many. -T6 also brings
> > back the ->break_lock framework and converts a few more locks to raw.
>
> UP build still fails:
> arch/i386/kernel/vm86.c:707: error: `__RAW_SPIN_LOCK_UNLOCKED'
> undeclared here (not in a function)
ok, fixed this one too and re-uploaded -T6 - please check whether it
builds for you now.
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>>i've uploaded -T5 which should fix most of the build issues:
>>>
>>
>>This fixed the build problems for me (SMP). I did get one unresolved
>>symbol when building this with REALTIME enabled.
>
>
> (which symbol was this?)
WARNING:
/lib/modules/2.6.9-rc4-mm1-VP-T5-RT/kernel/drivers/net/ppp_synctty.ko
needs unknown symbol _mutex_trylock_bh
I shouldn't have even had ppp enabled and then wouldn't have noticed it,
but...
>
>
>>[...] Also got error messages scrolling up the screen when I tried to
>>boot it (looked very much like Mark's problem with T4) and it never
>>made it. :( If I had to guess, it might be related to APICs? I always
>>have to use "noapic" boot parameter. Ingo what are you running this
>>on? I don't have the exact error messages, but I'm rebuilding it now
>>to try to get those. Without RT Preemption it seems to be running very
>>nicely.
>
>
> don't worry about it not booting on your setup with PREEMPT_REALTIME, as
> long as it boots with !PREEMPT_REALTIME - i only really converted my
> test systems which are basically IDE + e100/e1000/rtl8139, ext3 and the
> bare minimum that is needed to run Fedora. It might be useful to send me
> a bootlog if you have any easy way to capture it - if not it's not a big
> problem either.
>
> Ingo
>
Ingo Molnar wrote:
> * Wen-chien Jesse Sung <[email protected]> wrote:
>
> > > this should fix the UP build issues reported by many. -T6 also brings
> > > back the ->break_lock framework and converts a few more locks to raw.
> >
> > > UP build still fails:
> > arch/i386/kernel/vm86.c:707: error: `__RAW_SPIN_LOCK_UNLOCKED'
> > undeclared here (not in a function)
>
> ok, fixed this one too and re-uploaded -T6 - please check whether it
> builds for you now.
Yes, it works now! Thanks a lot! :)
--
Best Regards,
Wen-chien Jesse Sung
* K.R. Foley <[email protected]> wrote:
> >(which symbol was this?)
>
> WARNING:
> /lib/modules/2.6.9-rc4-mm1-VP-T5-RT/kernel/drivers/net/ppp_synctty.ko
> needs unknown symbol _mutex_trylock_bh
thx - fix will be in -T7.
Ingo
i've uploaded -T7:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T7
Changes since -T6:
- further stabilization of PREEMPT_REALTIME: fixed the task-reaping
  problem by moving TASK_ZOMBIE out of p->state and thus completely
  separating preemption from the child-exit mechanism. This got rid of
  the 'Badness in exit.c' warnings on my SMP testbox (and related
  crashes).
- fixed the _mutex_trylock_bh missing symbol problem reported by K.R.
  Foley and Florian Schmidt.
- turned the sysrq lock into a raw spinlock, to enable direct keyboard
  irqs.
PREEMPT_REALTIME is still experimental, but it's already looking much
better on my testboxes.
to create a -T7 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T7
Ingo
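As an illustration of the first change listed in the -T7 announcement, here is a minimal userspace sketch (not the actual -T7 code) of why keeping the zombie marker outside p->state shields it from preemption. The struct, the `exit_state` field name, the numeric values and the helper functions are all assumptions made up for this example; the only behavior taken from the thread is that a preemption path can rewrite the task state to TASK_RUNNING.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative values and names only; not copied from the -T7 patch. */
enum { TASK_RUNNING = 0, TASK_ZOMBIE = 8 };

struct task {
	long state;      /* run state; the preemption path rewrites it */
	long exit_state; /* hypothetical separate home for the zombie marker */
};

/* Model of a preemption path that marks the task runnable again. */
void preempt(struct task *t)
{
	assert(t != NULL);
	t->state = TASK_RUNNING;
}

/* Old scheme: the reaper looks for the zombie marker in ->state. */
int is_zombie_old(const struct task *t)
{
	assert(t != NULL);
	return t->state == TASK_ZOMBIE;
}

/* New scheme: the reaper looks in ->exit_state, which preempt() never touches. */
int is_zombie_new(const struct task *t)
{
	assert(t != NULL);
	return t->exit_state == TASK_ZOMBIE;
}
```

In this model, a task that stored TASK_ZOMBIE in `state` no longer looks like a zombie after `preempt()`, while a task that stored it in `exit_state` still does. That is the separation between preemption and child-exit handling that the changelog entry describes, reduced to its simplest form.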
Ingo Molnar wrote:
>
> i've uploaded -T7:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T7
>
OK. 2.6.9-rc4-mm1-T7 builds and runs on my laptop (P4/UP), apparently
fine. I know it's probably too early to complain, but I'm sending a couple
of dmesg's that I took right after init, showing some badness going on.
.config file is also attached.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> OK. 2.6.9-rc4-mm1-T7 builds and runs on my laptop (P4/UP), apparently
> fine. I know it's probably too early to complain, but I'm sending a
> couple of dmesg's that I took right after init, showing some badness
> going on.
does the patch below, applied on top of -T7, fix these messages for you?
Ingo
--- linux/sound/pci/ali5451/ali5451.c.orig
+++ linux/sound/pci/ali5451/ali5451.c
@@ -261,8 +261,8 @@ struct snd_stru_ali {
 	unsigned short ac97_ext_id;
 	unsigned short ac97_ext_status;
 
-	spinlock_t reg_lock;
-	spinlock_t voice_alloc;
+	raw_spinlock_t reg_lock;
+	raw_spinlock_t voice_alloc;
 
 #ifdef CONFIG_PM
 	ali_image_t *image;
--- linux/fs/fcntl.c.orig
+++ linux/fs/fcntl.c
@@ -541,7 +541,7 @@ int send_sigurg(struct fown_struct *fown
 	return ret;
 }
 
-static rwlock_t fasync_lock = RW_LOCK_UNLOCKED;
+static DECLARE_RAW_RWLOCK(fasync_lock);
 static kmem_cache_t *fasync_cache;
 
 /*
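For reading the "scheduling while atomic: comm/0x.../pid" lines in the boot log that follows, it can help to split the hex value (the preempt count) into its fields. The sketch below assumes the 2.6.9-era i386 layout: bits 0-7 preempt depth, bits 8-15 softirq count, bits 16-23 hardirq count, bit 26 PREEMPT_ACTIVE. That layout is an assumption here, not something stated in this thread, and the voluntary-preempt tree may define the masks differently. The struct and function names are made up for this example.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Assumed layout (2.6.9-era i386 headers); the voluntary-preempt tree
 * may differ, so check its own definitions before relying on this.
 */
#define PREEMPT_MASK   0x000000ffUL /* bits 0-7: preempt_disable() depth */
#define SOFTIRQ_MASK   0x0000ff00UL /* bits 8-15: softirq nesting */
#define HARDIRQ_MASK   0x00ff0000UL /* bits 16-23: hardirq nesting */
#define PREEMPT_ACTIVE 0x04000000UL /* bit 26: preemption in progress */

struct preempt_fields {
	unsigned preempt_depth;
	unsigned softirq_count;
	unsigned hardirq_count;
	int preempt_active;
};

/* Split a raw preempt count into the fields assumed above. */
struct preempt_fields decode_preempt_count(unsigned long count)
{
	struct preempt_fields f;

	f.preempt_depth = (unsigned)(count & PREEMPT_MASK);
	f.softirq_count = (unsigned)((count & SOFTIRQ_MASK) >> 8);
	f.hardirq_count = (unsigned)((count & HARDIRQ_MASK) >> 16);
	f.preempt_active = (count & PREEMPT_ACTIVE) != 0;
	return f;
}

/* Print one decoded value in a single line. */
void print_preempt_fields(FILE *out, unsigned long count)
{
	struct preempt_fields f = decode_preempt_count(count);

	assert(out != NULL);
	fprintf(out, "0x%08lx: preempt=%u softirq=%u hardirq=%u active=%d\n",
		count, f.preempt_depth, f.softirq_count, f.hardirq_count,
		f.preempt_active);
}
```

Under these assumptions, 0x04010000 decodes to a hardirq count of 1 with PREEMPT_ACTIVE set, and 0x04010002 additionally carries a preempt depth of 2. A hardirq count of 1 would be consistent with the traces below that pass through apic_timer_interrupt before reaching cond_resched().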
Oct 12 09:52:58 swdev14 syslogd 1.4.1: restart.
Oct 12 09:52:58 swdev14 syslog: syslogd startup succeeded
Oct 12 09:52:59 swdev14 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 12 09:52:59 swdev14 syslog: klogd startup succeeded
Oct 12 09:52:59 swdev14 kernel: sys_init_module+0x68/0x1ce
Oct 12 09:52:59 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:52:59 swdev14 kernel: scheduling while atomic: usb.agent/0x04010000/1298
Oct 12 09:52:59 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:52:59 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:52:59 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:52:59 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:52:59 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:52:59 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:52:59 swdev14 irqbalance: irqbalance startup succeeded
Oct 12 09:52:59 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:52:59 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:52:59 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:52:59 swdev14 kernel: EXT3 FS on hda6, internal journal
Oct 12 09:52:59 swdev14 kernel: device-mapper: 4.1.0-ioctl (2003-12-10) initialised: [email protected]
Oct 12 09:52:59 swdev14 kernel: Adding 2048216k swap on /dev/hda5. Priority:-1 extents:1
Oct 12 09:52:59 swdev14 kernel: scheduling while atomic: rc.sysinit/0x04010001/1561
Oct 12 09:52:59 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:52:59 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:52:59 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:52:59 swdev14 portmap: portmap startup succeeded
Oct 12 09:52:59 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:52:59 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:52:59 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:52:59 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:52:59 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:52:59 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:52:59 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 09:52:59 swdev14 kernel: [<c029a35d>] _spin_unlock_irqrestore+0x10/0x36
Oct 12 09:52:59 swdev14 rpc.statd[2649]: Version 1.0.6 Starting
Oct 12 09:52:59 swdev14 kernel: [<c0118895>] try_to_wake_up+0x1e8/0x270
Oct 12 09:52:59 swdev14 kernel: [<c0118940>] wake_up_process+0x23/0x27
Oct 12 09:52:59 swdev14 kernel: [<c011923d>] sched_migrate_task+0x7e/0x9d
Oct 12 09:52:59 swdev14 nfslock: rpc.statd startup succeeded
Oct 12 09:52:59 swdev14 kernel: [<c01192e6>] sched_exec+0x8a/0xd4
Oct 12 09:52:59 swdev14 kernel: [<c01192fb>] sched_exec+0x9f/0xd4
Oct 12 09:52:59 swdev14 kernel: [<c0166f3b>] do_execve+0x3e/0x249
Oct 12 09:52:59 swdev14 kernel: [<c0168881>] getname+0x91/0xbc
Oct 12 09:52:59 swdev14 kernel: [<c0104d5d>] sys_execve+0x47/0x9a
Oct 12 09:52:59 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:52:59 swdev14 kernel: scheduling while atomic: grep/0x04010000/1569
Oct 12 09:52:59 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:52:59 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:52:59 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:52:59 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:00 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:00 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:00 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:00 swdev14 rpcidmapd: rpc.idmapd startup succeeded
Oct 12 09:53:00 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:00 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:00 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:00 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 09:53:00 swdev14 kernel: [<c02996c7>] cond_resched+0x14/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0133822>] _mutex_lock+0x19/0x3f
Oct 12 09:53:00 swdev14 kernel: [<c014d347>] handle_mm_fault+0x54/0x18a
Oct 12 09:53:00 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:00 swdev14 random: Initializing random number generator: succeeded
Oct 12 09:53:00 swdev14 kernel: [<c0116fa8>] do_page_fault+0x20b/0x662
Oct 12 09:53:00 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:00 swdev14 kernel: [<c02996c1>] cond_resched+0xe/0x83
Oct 12 09:53:00 swdev14 kernel: [<c014e08f>] sys_brk+0x28/0x10f
Oct 12 09:53:00 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:00 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:00 swdev14 kernel: [<c014e08f>] sys_brk+0x28/0x10f
Oct 12 09:53:00 swdev14 kernel: [<c014fd3a>] sys_munmap+0x59/0x7b
Oct 12 09:53:00 swdev14 rc: Starting pcmcia: succeeded
Oct 12 09:53:00 swdev14 kernel: [<c0116d9d>] do_page_fault+0x0/0x662
Oct 12 09:53:00 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 09:53:00 swdev14 kernel: scheduling while atomic: cat/0x04010000/1752
Oct 12 09:53:00 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:00 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:00 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:00 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:00 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:00 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:00 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:00 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:00 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:00 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:00 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 09:53:00 swdev14 kernel: [<c015b55f>] vfs_read+0x0/0x134
Oct 12 09:53:00 swdev14 kernel: [<c015b8ed>] sys_read+0x50/0x7a
Oct 12 09:53:00 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:00 swdev14 kernel: schedulingro0x603>] er_i> [ic_<4c0134_pre<4> pr<c01tr88>] checkpt1fd9>
Oct 12 09:53:00 swdev14 kernel: [<c0299] 49/0x1pr19136>] pt_ [<mc4> _spore+0xb[<c7
Oct 12 09:53:00 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:00 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:00 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:00 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:00 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:00 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:00 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:00 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:00 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:00 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:00 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:00 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:00 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:00 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:00 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:00 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:00 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:01 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:01 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:01 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:01 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:01 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:01 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:01 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:01 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: ]97a>]6>0dae+0>+0c0+0xc0_wxumi5e+] v/t[x010xdux0>0c0>on> [o0c0/41cre46/nd_x8he> _mu+pd/ro03>e [i
Oct 12 09:53:01 swdev14 kernel: [<]9/0r1936pt> [m4>_sre+[07
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:01 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:01 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:01 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:01 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:01 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:01 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:01 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:01 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:01 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:01 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:01 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:01 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:01 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:01 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:01 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:01 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:01 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:01 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:01 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 last message repeated 2 times
Oct 12 09:53:01 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:01 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:01 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:01 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:01 swdev14 kernel: [<c01a9bd8>] dummy_file_permiansm026chec> [_skb<4>x3c
Oct 12 09:53:01 swdev14 kernel: 3>] tc4a8eg+0[<nt+54sg+04aft_
Oct 12 09:53:01 swdev14 kernel: su7
Oct 12 09:53:01 swdev14 kernel: ] x97
Oct 12 09:53:01 swdev14 kernel: ref>] c01348_pr
Oct 12 09:53:01 swdev14 kernel: pr<c01tra88>] check_pt1f9d9>7
Oct 12 09:53:01 swdev14 kernel: [<c02992] 496/0x1pr19136>]pt_ [<mc4> _spore+0x[<7
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x1>] ] dunc[<cwn_<c013.lo338ck_ [omec514d>c0unl02404f3>] qest[<c0nt+qdisc56/0>] de+outd1e
Oct 12 09:53:01 swdev14 kernel: ] 0x239>[<cout>] suntsub_pree+0xpin_unlox34 subt+0xpree2/0a2c7ck+134eem29a2clocc013nt+265end
Oct 12 09:53:01 swdev14 kernel: <4 tc [unttcp_v4_s+0x5fmit6>x3c>] +0x[<c025sen8d
Oct 12 09:53:01 swdev14 kernel: ] a65>] t+0x134ampt0134emp34apt_caio_x13>]x50a>e+0d4>0x1d>] e+01a9bile_cricrit<4> [<mco[<cwri10617_pa3>scle 2/27is /0x2674> [ondc0pre
Oct 12 09:53:01 swdev14 kernel: <4] c [<ck_
Oct 12 09:53:01 swdev14 kernel: <029<4con<4> [<c_rw+0c01/fde
Oct 12 09:53:01 swdev14 kernel: < smnter4
Oct 12 09:53:01 swdev14 kernel: <er_in/0xe_con9/fb2>]12cheing1e1d6edce+08>] _ti106fck+] 0xn+0x8a/[<ctimi9
Oct 12 09:53:01 swdev14 kernel: <_pr0x_mcount+0x11
Oct 12 09:53:01 swdev14 kernel: <] _sqrec0298491>] __d+0+0<c04def0x0/029+0x01ock.46
Oct 12 09:53:01 swdev14 kernel: mutexve+ boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:01 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:01 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:01 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:01 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:01 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:01 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:01 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:01 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:01 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:01 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:01 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:02 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:02 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:02 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:02 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:02 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:02 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:02 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:02 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:02 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 last message repeated 2 times
Oct 12 09:53:02 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:02 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:02 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:02 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:02 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:02 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:02 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:02 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:02 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:02 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:02 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:02 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:02 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:02 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:02 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:02 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:02 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:02 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:02 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:02 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:02 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:02 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:02 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:02 swdev14 kernel: [<c011f0b2>] release_console_sem+0x59/0xcf
Oct 12 09:53:02 swdev14 kernel: [<c011efb2>] vprintk+0x128/0x16f
Oct 12 09:53:02 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:02 swdev14 kernel: [<c011ee86>] printk+0x1d/0x21
Oct 12 09:53:02 swdev14 kernel: [<c0106edc>] show_trace+0x4e/0x8d
Oct 12 09:53:02 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:02 swdev14 kernel: [<c0106fd9>] dump_stack+0x23/0x27
Oct 12 09:53:02 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:02 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:02 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:02 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:02 swdev14 kernel: [<c01347
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:02 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:02 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:02 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:02 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:02 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:02 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:02 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:02 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:02 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:02 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:02 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:02 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:02 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:02 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:02 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:02 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:02 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:02 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:02 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:02 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:02 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:02 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:02 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:02 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:02 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:02 swdev14 last message repeated 2 times
Oct 12 09:53:02 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:02 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:02 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:02 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:02 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:02 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:03 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:03 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:03 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:03 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:03 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:03 swdev14 kernel: c0c>o4> [ou0c0/41c0r4d_xche> w_mu+pdo03>er i< ><4 c0t88>] cp1fd97
Oct 12 09:53:03 swdev14 kernel: [<]49/0r136>p [m<4_sore+[7
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:03 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:03 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:03 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:03 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:03 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:03 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:03 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:03 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:03 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:03 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:03 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:03 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:03 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:03 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:03 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:03 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:03 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:03 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:03 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:03 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:03 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:03 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0254sg4at<4 97
Oct 12 09:53:03 swdev14 kernel: >]97a6>xde+0>0c0xc_wx0mmi5e p<c01tra88>] checkpt_x
Oct 12 09:53:03 swdev14 kernel: [<] s96/0x1pr19936>]pt_> [<mc<4> _spore+0x[<07
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:03 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:03 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:03 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:03 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:03 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:03 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:03 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:03 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:03 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:03 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:03 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:03 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:03 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:03 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:03 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:03 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:03 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:03 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:03 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:03 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:03 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:03 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:03 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:03 swdev14 last message repeated 2 times
Oct 12 09:53:03 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:03 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:03 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:03 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:03 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:03 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:03 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:03 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:03 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:03 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:03 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:04 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:04 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:04 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:04 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:04 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:04 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:04 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:04 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:04 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:04 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:04 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:04 swdev14 kernel: [ro0x603>] er_> [<ic_
Oct 12 09:53:04 swdev14 kernel: recf>] <c013488_pr<4> pr<c010tra888>] check_pt_x1f9d9>
Oct 12 09:53:04 swdev14 kernel: [<c029926] s496/0xr19136>] pt [<mco<4> _spre+0xb[<7
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x1] dunc[<cwn_<c013.lo338ck_> [omec514dc0unl02404f3>] qest[<c0nt56/0>] de+0xecfa>outdst_oux2f4/0x18148
Oct 12 09:53:04 swdev14 kernel: <4 nf_11e
Oct 12 09:53:04 swdev14 kernel: 4>] 0x239>[<out>] suuntsub_pree+0xpin_unlox34 subt+0pre2/0a2c7ck+134eemp29a2clocc013nt+265end
Oct 12 09:53:04 swdev14 kernel: <4 tcp [unttcp_v4_s+0x5fmit>x3c>] +0x<c025sen8d
Oct 12 09:53:04 swdev14 kernel: ] a65>] t+0x34mpt13emp34apt_ciox1]x50>e+040x13d>] e+0>] aue_fritcri<4> [<mc<cwri1061_pa3>scle 2/27is /0267> ond0pre
Oct 12 09:53:04 swdev14 kernel: <] c [<ck_
Oct 12 09:53:04 swdev14 kernel: <4con<4> [<c_r0c0/fde
Oct 12 09:53:04 swdev14 kernel: < smter4
Oct 12 09:53:04 swdev14 kernel: <er_in/0_con9/efb2>]12cheing1e1d6edce+08>] _ti106fck] 0xb+0x8a/[<ctimi9
Oct 12 09:53:04 swdev14 kernel: _pr0x4__mcount+0x1
Oct 12 09:53:04 swdev14 kernel: <] _sqre0298491>] __+0x+0<c04de+0x0/029+0x01ck.46
Oct 12 09:53:04 swdev14 kernel: mutexve] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:04 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:04 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:04 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:04 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:04 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:04 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:04 swdev14 kernel: [<cg[<nt4sgat<4 s7
Oct 12 09:53:04 swdev14 kernel: <>97a>6>]0xdte+0>0c00xce_wx0mi15e /0n[<x01xd0>+0c4>o> [ouc6/481cr46d_xhed>_mu+pd/ro03>e i< c>]c01_p
Oct 12 09:53:04 swdev14 kernel: <4 <c0tr81fd9
Oct 12 09:53:04 swdev14 kernel: [<]9/0r1936pt [m4>_re+[<7
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:04 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:04 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:04 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:04 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:04 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:04 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:04 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:04 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:04 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:04 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:04 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:04 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:04 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:04 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:04 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:04 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:04 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:04 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:04 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:04 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:04 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:04 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:04 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:04 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0254a65>] tcp_sendmsg+at<4 7
Oct 12 09:53:04 swdev14 kernel: <>x97a>6>xde+0>0c0xc_wx0ummio15e v/t[<x00xdx0>cc04>o4> [ou0c02/41cr4d_x8che>w_mu+pd/o> [<ic_
Oct 12 09:53:04 swdev14 kernel: rec>]c01348_pre<4> pr<c01tra88>] checkptx1f9d9>7
Oct 12 09:53:04 swdev14 kernel: [<c02992] s49/0x10r191936>]pt [mc4>_re+[<07
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:04 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:04 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:04 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:04 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:04 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:04 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:04 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:04 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:04 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:04 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:04 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:04 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:04 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:04 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:04 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:04 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:05 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:05 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:05 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:05 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:05 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:05 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0254a65>] tcp_sgat<4> s97
Oct 12 09:53:05 swdev14 kernel: <>]97a>]6>xde+0> 0c00xc0mi5e] v/t[x010xd0>0c0>o> [o0c02/4x1cr46d_xched> _mut+pd/o03>er ic
Oct 12 09:53:05 swdev14 kernel: < f>c013_p<4 <c0t88>] cpx1d9
Oct 12 09:53:05 swdev14 kernel: [<c]9/0136>p [mc4_sre+[7
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:05 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:05 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:05 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:05 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:05 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c5<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:05 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:05 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:05 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:05 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:05 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:05 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 last message repeated 2 times
Oct 12 09:53:05 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:05 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:05 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:05 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:05 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:05 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:05 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:05 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:05 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:05 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:05 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:05 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:05 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:05 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:05 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:05 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:05 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:05 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:05 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:05 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:05 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:05 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:05 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:05 swdev14 kernel: [<c011f0b2>] release_console_sem+0x59/0xcf
Oct 12 09:53:05 swdev14 kernel: [<c011efb2>] vprintk+0x128/0x16f
Oct 12 09:53:05 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:05 swdev14 kernel: [<c011ee86>] printk+0x1d/0x21
Oct 12 09:53:05 swdev14 kernel: [<c0106edc>] show_trace+0x4e/0x8d
Oct 12 09:53:05 swdev14 kernel: 134g+0 [<k+<c0<4__c01348_p91134ng[<21>] rqre6
Oct 12 09:53:05 swdev14 kernel: x85/001> [<cdo> [<cef1c
Oct 12 09:53:05 swdev14 kernel: >] <4.te0x5/001save> [rt_xeb _1
Oct 12 09:53:05 swdev14 kernel: ck+0xb/[<c0x132<c/0x15171e6
Oct 12 09:53:05 swdev14 kernel: [<dd>it+0> [_output+0xd5/6
Oct 12 09:53:05 swdev14 kernel: </06d402pu023bf_s<c02ut> [<_q9e<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:05 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:05 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:05 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:05 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:05 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:05 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:05 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:05 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:05 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:05 swdev14 last message repeated 2 times
Oct 12 09:53:05 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:05 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:05 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:05 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:05 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:05 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:05 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:05 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:05 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:05 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:05 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:05 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:05 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:05 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:06 swdev14 kernel: [<c0134936>] touch_preempt_ti0x9 _spina/0] chimi
Oct 12 09:53:06 swdev14 kernel: _s34
Oct 12 09:53:06 swdev14 kernel: <4d><c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:06 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:06 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:06 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:06 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:06 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:06 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 last message repeated 2 times
Oct 12 09:53:06 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:06 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:06 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:06 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:06 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:06 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:06 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:06 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:06 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:06 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:06 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:06 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:06 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:06 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:06 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:06 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:06 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:06 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:06 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:06 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:06 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:06 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:06 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:06 swdev14 kernel: [<c011f0b2>] release_console_sem+0x59/0xcf
Oct 12 09:53:06 swdev14 kernel: [<c011efb2>] vprintk+0x128/0x16f
Oct 12 09:53:06 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:06 swdev14 kernel: [<c011ee86>] printk+0x1d/0x21
Oct 12 09:53:06 swdev14 kernel: [<c0106edc>] show_trace+0x4e/0x8d
Oct 12 09:53:06 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:06 swdev14 kernel: [<c0106fd9>] dump_stack+0x23/0x27
Oct 12 09:53:06 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:06 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:06 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:06 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:06 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:06 swdev14 kernel: [<c029a358>] _spin_unlock_irqrestore+0xb/0x36
Oct 12 09:53:06 swdev14 kernel: [<c0298491>] __down+0x85/0x107
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:06 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:06 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:06 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:06 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:06 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:06 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:06 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:06 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:06 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:06 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:06 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mco[<c_obfow+while atfti<c02 i83
Oct 12 09:53:06 swdev14 kernel: t+0 schedule+0xxbe<4>t+0>]
Oct 12 09:53:06 swdev14 kernel: cond_resed+out<c0ti[<cpreem2/0> co26/1>]
Oct 12 09:53:06 swdev14 kernel: pt<c0134<cing+0x191/0x1f9 _spin_unlock+0x1a/0x34
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c0134936>] [<c0134af1>] touch_preempt_timing+0x46/0x4a sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c02996d9>] [<c0134af1>] cond_resched+0x26/0x83 sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c029a2c7>] [<c0299716>] _spin_unlock+0x1a/0x34 cond_resched+0x63/0x83
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c0134888>] [<c0133b8e>] check_preempt_timing+0x191/0x1f9 _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c029a2c7>] [<c011f8f1>] _spin_unlock+0x1a/0x34 profile_hook+0x1d/0x47
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c013414d>] [<c011fde3>] __mcount+0x1d/0x21 profile_tick+0x63/0x65
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c0265d8c>] [<c0113d03>] tcp_v4_send_check+0xe/0xe2 smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c025fbf1>] [<c0106bb6>] tcp_transmit_skb+0x432/0x852 apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] [<c029007b>] mcount+0x14/0x18 packet_sendmsg+0x210/0x280
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c0265d8c>] [<c0299fdb>] tcp_v4_send_check+0xe/0xe2 _spin_lock+0x56/0x78
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c025fc9d>] [<c02405a7>] tcp_transmit_skb+0x4de/0x852 dev_watchdog+0x0/0xb7
Oct 12 09:53:06 swdev14 kernel:
Oct 12 09:53:06 swdev14 kernel: [<c01b2056>] [<c02405c3>] memcpy+x2eb [3c59x]
Oct 12 09:53:06 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:06 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:06 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:06 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:06 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:06 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:06 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:06 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:06 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:06 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:06 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:06 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:06 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:06 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:06 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:06 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:06 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:06 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:07 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:07 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:07 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:07 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:07 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:07 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:07 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:07 swdev14 last message repeated 2 times
Oct 12 09:53:07 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:07 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:07 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:07 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:07 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:07 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:07 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:07 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:07 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:07 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:07 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:07 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:07 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:07 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:07 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:07 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:07 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:07 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:07 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:07 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:07 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:07 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:07 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:07 swdev14 kernel: [<c011f0b2>] release_console_sem+0x59/0xcf
Oct 12 09:53:07 swdev14 kernel: [<c011efb2>] vprintk+0x128/0x16f
Oct 12 09:53:07 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:07 swdev14 kernel: [<c011ee86>] printk+0x1d/0x21
Oct 12 09:53:07 swdev14 kernel: [<c0106edc>] show_trace+0x4e/0x8d
Oct 12 09:53:07 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:07 swdev14 kernel: [<c0106fd9>] dump_stack+0x23/0x27
Oct 12 09:53:07 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:07 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:07 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:07 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:07 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:07 swdev14 kernel: [<c029a358>] _spin_unlock_irqrestore+0xb/0x36
Oct 12 09:53:07 swdev14 kernel: [<c0298491>] __down+0x85/0x107
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:07 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:07 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:07 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:07 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:07 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:07 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:07 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:07 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:07 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:07 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:07 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:07 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:07 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:07 swdev14 kernel: [>] de+0xecfa>outpdst_ox2f4/0x18
Oct 12 09:53:07 swdev14 kernel: <148> [<c02_hoe
Oct 12 09:53:07 swdev14 kernel: <4] d539>it+> [t_o [<_pr134afmpt4> [spi
Oct 12 09:53:07 swdev14 kernel: +0x8c01/02c7> [<c0134eck0x<c04
Oct 12 09:53:07 swdev14 kernel: ] 1>] eck+0[<rans/0x14/0026che> [_skb<43c3>] 4a8g+0[<nt54sg+4at_c
Oct 12 09:53:07 swdev14 kernel: su97
Oct 12 09:53:07 swdev14 kernel: >] x97
Oct 12 09:53:07 swdev14 kernel: <4a>36
Oct 12 09:53:07 swdev14 kernel: >] 0x5da>te+0x12> [0xc01+0xac01e_wakx0/ummyion15e+0] vf[<c0x500100x5dulx0>c+0xc02>on4> [<c0ouc+0c06/488x1c01pree46nd_rex8hed+0> w_mutex+0 prd/0ro0x603>] er_ [<ic_
Oct 12 09:53:07 swdev14 kernel: rcf>]<c013488_pr
Oct 12 09:53:07 swdev14 kernel: p<c010tra88>] check_pptx1f9d9>7
Oct 12 09:53:07 swdev14 kernel: [<c0299267] s49/0xp36>]pt_> [<cmco4> _spiore+0xb[<07
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:07 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:07 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:07 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:07 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:07 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:07 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:07 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:07 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:07 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:07 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:07 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:07 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:07 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:07 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:07 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:07 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:07 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:08 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:08 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:08 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:08 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:08 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:08 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:08 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:08 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:08 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 last message repeated 2 times
Oct 12 09:53:08 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:08 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:08 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:08 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:08 swdev14 kernel: ummyio15e+0] vf/0xnt[<0x50100x5dux0>c+0c02>on> [<c0ou0c026/48x1c01re46/nd_rex8ched+> _mute+0 pd/0o0x03>] er [ic
Oct 12 09:53:08 swdev14 kernel: <4 rcf>]<c0134_pr
Oct 12 09:53:08 swdev14 kernel: p<c01tra88>] check_pt_1fd97
Oct 12 09:53:08 swdev14 kernel: [<c029996/0xpr191936>]pt_> [<mc<4> _spore+0x[<7
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:08 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:08 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:08 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:08 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:08 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:08 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:08 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:08 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:08 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:08 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:08 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:08 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:08 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:08 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:08 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:08 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:08 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:08 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:08 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:08 swdev14 kernel: [<c025fbf1>] tcp_transmit_skb+0x432/0x852
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>cp_transmit_skb+0x432/0x852
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c0265d8c>] tcp_v4_send_check+0xe/0xe2
Oct 12 09:53:08 swdev14 kernel: [<c025fc9d>] tcp_transmit_skb+0x4de/0x852
Oct 12 09:53:08 swdev14 kernel: [<c01b2056>] memcpy+0x12/0x3c
Oct 12 09:53:08 swdev14 kernel: [<c0260a83>] tcp_write_xmit+0x149/0x2c0
Oct 12 09:53:08 swdev14 kernel: [<c0254a8e>] tcp_sendmsg+0x4ff/0x108d
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c0254a65>] tcp_sendmsg+0x4d6/0x108d
Oct 12 09:53:08 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:08 swdev14 last message repeated 2 times
Oct 12 09:53:08 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:08 swdev14 kernel: [<c02764e0>] inet_sendmsg+0x50/0x5b
Oct 12 09:53:08 swdev14 kernel: [<c0227bda>] sock_aio_write+0x124/0x136
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c015b73d>] do_sync_write+0xaa/0xd6
Oct 12 09:53:08 swdev14 kernel: [<c01333e3>] autoremove_wake_function+0x0/0x57
Oct 12 09:53:08 swdev14 kernel: [<c01a9bd8>] dummy_file_permission+0x8/0xc
Oct 12 09:53:08 swdev14 kernel: [<c015b80b>] vfs_write+0xa2/0x134
Oct 12 09:53:08 swdev14 kernel: [<c015b869>] vfs_write+0x100/0x134
Oct 12 09:53:08 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:08 swdev14 kernel: [<c015b967>] sys_write+0x50/0x7a
Oct 12 09:53:08 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 09:53:08 swdev14 kernel: scheduling while atomic: mount/0x04010002/2732
Oct 12 09:53:08 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:08 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:08 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:08 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:08 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0prx02> d91p[l>c.4av/p4> p> [/0x48e0x1a/023c/0x11514
Oct 12 09:53:08 swdev14 kernel: i>st> b_p82/0134cou4> ock> [reem/0x_pre82/7+0xchecingc7+0x4d>d/0v4_/0ra/0 m
Oct 12 09:53:08 swdev14 kernel: <4ck<c0t_skb+0
Oct 12 09:53:08 swdev14 kernel: <4em4>i4>x4<c+0<c6/0x01x82/013pt_co97
Oct 12 09:53:08 swdev14 kernel: mpx97_a0xe0>]g+0>] m18
Oct 12 09:53:08 swdev14 kernel: wr4>ake_fun0x5_fil0x vfs0xrite+0x100/
Oct 12 09:53:08 swdev14 kernel: <4 mco
Oct 12 09:53:08 swdev14 kernel: <4ite+0x50/0x7a
Oct 12 09:53:08 swdev14 kernel: [52/lin10lerx63/0299/0x99ed [<preemp46d_resched+0x26/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:08 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:08 swdev14 kernel: [<c02996d9>] cond_resched+0x26/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0299716>] cond_resched+0x63/0x83
Oct 12 09:53:08 swdev14 kernel: [<c0133b8e>] _rw_mutex_read_lock+0x24/0x39
Oct 12 09:53:08 swdev14 kernel: [<c011f8f1>] profile_hook+0x1d/0x47
Oct 12 09:53:08 swdev14 kernel: [<c011fde3>] profile_tick+0x63/0x65
Oct 12 09:53:08 swdev14 kernel: [<c0113d03>] smp_apic_timer_interrupt+0x60/0xe4
Oct 12 09:53:08 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 09:53:08 swdev14 kernel: [<c011f0b2>] release_console_sem+0x59/0xcf
Oct 12 09:53:08 swdev14 kernel: [<c011efb2>] vprintk+0x128/0x16f
Oct 12 09:53:08 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:08 swdev14 kernel: [<c011ee86>] printk+0x1d/0x21
Oct 12 09:53:09 swdev14 kernel: [<c0106edc>] show_trace+0x4e/0x8d
Oct 12 09:53:09 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:09 swdev14 kernel: [<c0106fd9>] dump_stack+0x23/0x27
Oct 12 09:53:09 swdev14 kernel: [<c0299267>] schedule+0xbaf/0xbe2
Oct 12 09:53:09 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:09 swdev14 kernel: [<c0134888>] check_preempt_timing+0x191/0x1f9
Oct 12 09:53:09 swdev14 kernel: [<c0134936>] touch_preempt_timing+0x46/0x4a
Oct 12 09:53:09 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:09 swdev14 kernel: [<c029a358>] _spin_unlock_irqrestore+0xb/0x36
Oct 12 09:53:09 swdev14 kernel: [<c0298491>] __down+0x85/0x107
Oct 12 09:53:09 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:09 swdev14 kernel: [<c0298496>] __down+0x8a/0x107
Oct 12 09:53:09 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 09:53:09 swdev14 kernel: [<c029865c>] __down_failed+0x8/0xc
Oct 12 09:53:09 swdev14 kernel: [<c0133ee3>] .text.lock.mutex+0x5/0x146
Oct 12 09:53:09 swdev14 kernel: [<c0133884>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 09:53:09 swdev14 kernel: [<e0840abe>] boomerang_start_xmit+0x123/0x2eb [3c59x]
Oct 12 09:53:09 swdev14 kernel: [<c013414d>] __mcount+0x1d/0x21
Oct 12 09:53:09 swdev14 kernel: [<c029a2b8>] _spin_unlock+0xb/0x34
Oct 12 09:53:09 swdev14 kernel: [<c02404f3>] qdisc_restart+0x132/0x1e6
Oct 12 09:53:09 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:09 swdev14 kernel: [<c0240517>] qdisc_restart+0x156/0x1e6
Oct 12 09:53:09 swdev14 kernel: [<c02313dd>] dev_queue_xmit+0x239/0x2d9
Oct 12 09:53:09 swdev14 kernel: [<c024ecfa>] ip_finish_output+0xd5/0x216
Oct 12 09:53:09 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:09 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 09:53:09 swdev14 kernel: [<c025148e>] dst_output+0x1a/0x2f
Oct 12 09:53:09 swdev14 kernel: [<c023bf2c>] nf_hook_slow+0xec/0x11e
Oct 12 09:53:09 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:09 swdev14 kernel: [<c024f539>] ip_queue_xmit+0x495/0x59e
Oct 12 09:53:09 swdev14 kernel: [<c0251474>] dst_output+0x0/0x2f
Oct 12 09:53:09 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:09 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:53:09 swdev14 kernel: [<c029a2c7>] _spin_unlock+0x1a/0x34
Oct 12 09:53:09 swdev14 kernel: [<c0134af1>] sub_preempt_count+0x82/0x97
Oct 12 09:56:00 swdev14 syslogd 1.4.1: restart.
* K.R. Foley <[email protected]> wrote:
> OK. This one builds just fine here. Again I tried booting preempt
> realtime. We were going along fine and then all hell broke loose on
> the console. Pressed Ctrl-s to stop the scrolling and it then bit the
> dust. It did manage to get into the logs this time and I am attaching
> that. This is a different SMP system that I use as a workstation at a
> client site. Dual 2.6GHz Xeons (with HT) 512MB
does the patch below make your system bootable? It should fix the two
most common messages you got.
Ingo
--- linux/kernel/profile.c.orig
+++ linux/kernel/profile.c
@@ -169,7 +169,7 @@ int profile_event_unregister(enum profil
 }
 
 static struct notifier_block * profile_listeners;
-static rwlock_t profile_lock = RW_LOCK_UNLOCKED;
+static raw_rwlock_t profile_lock = RAW_RW_LOCK_UNLOCKED;
 
 int register_profile_notifier(struct notifier_block * nb)
 {
--- linux/drivers/net/3c59x.c.orig
+++ linux/drivers/net/3c59x.c
@@ -832,8 +832,8 @@ struct vortex_private {
 	u16 deferred;		/* Resend these interrupts when we
 				 * bale from the ISR */
 	u16 io_size;		/* Size of PCI region (for release_region) */
-	spinlock_t lock;	/* Serialise access to device & its vortex_private */
-	spinlock_t mdio_lock;	/* Serialise access to mdio hardware */
+	raw_spinlock_t lock;	/* Serialise access to device & its vortex_private */
+	raw_spinlock_t mdio_lock;	/* Serialise access to mdio hardware */
 	struct mii_if_info mii;	/* MII lib hooks/info */
 };
 
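For readers following the log, the middle field of each "scheduling while atomic: <task>/<hex>/<pid>" line is the preempt count at the time of the report. The sketch below decodes it. The bit layout is an assumption based on the i386 headers of this kernel generation (8 bits each for the preemption, softirq and hardirq depths, with bit 26 as the PREEMPT_ACTIVE flag); it is not stated in the thread, and the struct and function names are hypothetical.

```c
#include <stdint.h>

/*
 * ASSUMED layout of the preempt count word (i386, 2.6.9-era headers;
 * not confirmed anywhere in this thread):
 *   bits  0-7   preemption-disable depth
 *   bits  8-15  softirq depth
 *   bits 16-23  hardirq depth
 *   bit  26     PREEMPT_ACTIVE flag (0x04000000)
 */
struct preempt_fields {
    unsigned preempt_depth;   /* nested preempt_disable()/spin_lock() count */
    unsigned softirq_depth;   /* nested softirq-disable/softirq context count */
    unsigned hardirq_depth;   /* nested hardware interrupt count */
    int preempt_active;       /* 1 if the PREEMPT_ACTIVE bit is set */
};

/* Hypothetical helper: split a raw preempt count into its fields. */
static struct preempt_fields decode_preempt_count(uint32_t count)
{
    struct preempt_fields f;

    f.preempt_depth  = count & 0xffu;
    f.softirq_depth  = (count >> 8) & 0xffu;
    f.hardirq_depth  = (count >> 16) & 0xffu;
    f.preempt_active = (count & 0x04000000u) != 0;
    return f;
}
```

Under that assumed layout, the value 0x04010002 reported for the mount task decodes to a preemption depth of 2 inside one hardirq with PREEMPT_ACTIVE set, which is consistent with the trace passing through apic_timer_interrupt and profile_hook. The value 0x04000001 reported for swapper decodes to a preemption depth of 1 outside interrupt context.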
> i've uploaded -T7:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T7
>
This crashes at boot time again. Several more messages scroll by, ending
with the following (all I can see on the screen):
Modules linked in: ext3 jbd
CPU: 1
EIP: 0060:[<c031563e>] Not tainted VLI
EFLAGS: 00000046 (2.6.9-rc4-mm1-VP-T7)
EIP is at _spin_lock+0x43/0x70
eax: 00000001 ebx: c1406020 ecx: 0104ef60 edx: 00000001
esi: c166a000 edi: 00000002 ebp: c166bf04 esp: c166bef8
ds: 007b es: 007b ss: 0068 preempt: 00010002
Process ksoftirqd/1 (pid: 5, threadinfo=c166a000 task=c1659000)
Stack: 00000001 c1406020 c1436020 c166bf18 c011bbf0 c1436a00 c1406020 c1436a00
       c166bf48 c011c726 c1436020 c1406020 c166bf38 00000002 c1659000 c166bf48
       00000001 c1436a00 00000001 0104ef60 c166bfa4 c03146e3 00000001 c1436020
Call Trace:
[<c011bbf0>] double_lock_balance+0x40/0x50
[<c011c726>] load_balance_newidle+0x66/0xc0
[<c03146e3>] schedule+0x733/0x830
[<c0114b60>] mcount+0x14/0x18
[<c01280c4>] ksoftirqd+0xd4/0xf0
[<c013837b>] kthread+0xbb/0xc0
[<c0127ff0>] ksoftirqd+0x0/0xf0
[<c01382c0>] kthread+0x0/0xc0
[<c0105b19>] kernel_thread_helper+0x5/0xc
Code: ff 21 e6 31 c0 86 03 84 c0 7e 0a 8b 5d f8 8b 75 fc 89 ec 5d c3 c7 04 24 01 00 00 00 e8 0c 4d e2 ff 8b 46 08 a8 08 75 1e 8b 43 08 <85> c0 75 07 c7 43 08 01
I had to cycle power to get the machine back. Rebooting with max_cpus=1
crashed in a different way: it got past mounting the disks and ran some
of the init script items before stopping at the same location with a
different call trace:
Call Trace:
[<c011cb58>] scheduler_tick+0x148/0x490
[<c012bee3>] update_process_times+0x43/0x60
[<c0114b60>] mcount+0x14/0x18
[<c012beef>] update_process_times+0x4f/0x60
[<c0115141>] smp_apic_timer_interrupt+0xe1/0xf0
[<c011cb73>] scheduler_tick+0x16e/0x490
[<c010854a>] apic_timer_interrupt+0x1a/0x20
[<c031007b>] unix_stream_recvmsg+0x5b/0x450
[<c011cb7e>] scheduler_tick+0x16e/0x490
[<c012bee3>] update_process_times+0x43/0x60
[<c0114b60>] mcount+0x14/0x18
[<c012beef>] update_process_times+0x4f/0x60
[<c0115141>] smp_apic_timer_interrupt+0xe1/0xf0
[<c01225d4>] release_console_sem+0x64/0xe0
[<c012236d>] printk+0x1d/0x30
I will send the other messages separately if they made it to disk.
--Mark H Johnson
<mailto:[email protected]>
Oct 12 11:46:34 swdev14 syslogd 1.4.1: restart.
Oct 12 11:46:34 swdev14 syslog: syslogd startup succeeded
Oct 12 11:46:34 swdev14 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 12 11:46:34 swdev14 syslog: klogd startup succeeded
Oct 12 11:46:34 swdev14 kernel: 26/0x83>
Oct 12 11:46:34 swdev14 kernel: => ended at: <cond_resched+0x26/0x83>
Oct 12 11:46:34 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:34 swdev14 kernel: [<c013484c>] check_preempt_timing+0x161/0x1f9
Oct 12 11:46:34 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:34 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:34 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:34 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:34 swdev14 kernel: [<c0133816>] _mutex_lock+0x19/0x3f
Oct 12 11:46:34 swdev14 kernel: [<c0133878>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 11:46:34 swdev14 kernel: [<c01cbe88>] tty_register_ldisc+0x37/0xa4
Oct 12 11:46:34 swdev14 kernel: [<c036be3e>] console_init+0x27/0x4a
Oct 12 11:46:34 swdev14 kernel: [<c035487a>] start_kernel+0xd7/0x1c6
Oct 12 11:46:34 swdev14 kernel: [<c03543b0>] unknown_bootoption+0x0/0x15d
Oct 12 11:46:34 swdev14 irqbalance: irqbalance startup succeeded
Oct 12 11:46:34 swdev14 kernel: Console: colour VGA+ 80x25
Oct 12 11:46:34 swdev14 kernel: Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Oct 12 11:46:34 swdev14 kernel: Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Oct 12 11:46:34 swdev14 kernel: Memory: 513500k/523712k available (1645k kernel code, 9608k reserved, 726k data, 272k init, 0k highmem)
Oct 12 11:46:34 swdev14 kernel: Checking if this processor honours the WP bit even in supervisor mode... Ok.
Oct 12 11:46:34 swdev14 kernel: Security Scaffold v1.0.0 initialized
Oct 12 11:46:34 swdev14 kernel: Capability LSM initialized
Oct 12 11:46:34 swdev14 kernel: Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Oct 12 11:46:34 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 11:46:34 swdev14 portmap: portmap startup succeeded
Oct 12 11:46:34 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 11:46:34 swdev14 kernel: CPU: Physical Processor ID: 0
Oct 12 11:46:34 swdev14 kernel: Intel machine check architecture supported.
Oct 12 11:46:34 swdev14 kernel: Intel machine check reporting enabled on CPU#0.
Oct 12 11:46:34 swdev14 kernel: CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 11:46:34 swdev14 kernel: Enabling fast FPU save and restore... done.
Oct 12 11:46:34 swdev14 kernel: Enabling unmasked SIMD FPU exception support... done.
Oct 12 11:46:34 swdev14 kernel: Checking 'hlt' instruction... OK.
Oct 12 11:46:34 swdev14 kernel: CPU0: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 11:46:34 swdev14 kernel: per-CPU timeslice cutoff: 1462.71 usecs.
Oct 12 11:46:34 swdev14 kernel: task migration cache decay timeout: 2 msecs.
Oct 12 11:46:34 swdev14 kernel: Booting processor 1/1 eip 2000
Oct 12 11:46:34 swdev14 kernel: Initializing CPU#1
Oct 12 11:46:34 swdev14 rpc.statd[2649]: Version 1.0.6 Starting
Oct 12 11:46:34 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 11:46:34 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 11:46:34 swdev14 nfslock: rpc.statd startup succeeded
Oct 12 11:46:34 swdev14 kernel: CPU: Physical Processor ID: 0
Oct 12 11:46:34 swdev14 kernel: Intel machine check architecture supported.
Oct 12 11:46:34 swdev14 kernel: Intel machine check reporting enabled on CPU#1.
Oct 12 11:46:34 swdev14 kernel: CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 11:46:34 swdev14 kernel: CPU1: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 11:46:34 swdev14 kernel: Booting processor 2/6 eip 2000
Oct 12 11:46:34 swdev14 kernel: Initializing CPU#2
Oct 12 11:46:34 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 11:46:34 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 11:46:34 swdev14 kernel: CPU: Physical Processor ID: 3
Oct 12 11:46:35 swdev14 kernel: Intel machine check architecture supported.
Oct 12 11:46:35 swdev14 kernel: Intel machine check reporting enabled on CPU#2.
Oct 12 11:46:35 swdev14 kernel: CPU2: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 11:46:35 swdev14 kernel: CPU2: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 11:46:35 swdev14 kernel: Booting processor 3/7 eip 2000
Oct 12 11:46:35 swdev14 kernel: Initializing CPU#3
Oct 12 11:46:35 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 11:46:35 swdev14 rpcidmapd: rpc.idmapd startup succeeded
Oct 12 11:46:35 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 11:46:35 swdev14 kernel: CPU: Physical Processor ID: 3
Oct 12 11:46:35 swdev14 kernel: Intel machine check architecture supported.
Oct 12 11:46:35 swdev14 kernel: Intel machine check reporting enabled on CPU#3.
Oct 12 11:46:35 swdev14 kernel: CPU3: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 11:46:35 swdev14 kernel: CPU3: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 11:46:35 swdev14 kernel: Total of 4 processors activated (20611.07 BogoMIPS).
Oct 12 11:46:35 swdev14 kernel: checking TSC synchronization across 4 CPUs: passed.
Oct 12 11:46:35 swdev14 kernel: ksoftirqd started up.
Oct 12 11:46:35 swdev14 last message repeated 2 times
Oct 12 11:46:35 swdev14 kernel: Brought up 4 CPUs
Oct 12 11:46:35 swdev14 random: Initializing random number generator: succeeded
Oct 12 11:46:35 swdev14 kernel: ksoftirqd started up.
Oct 12 11:46:35 swdev14 rc: Starting pcmcia: succeeded
Oct 12 11:46:35 swdev14 kernel: checking if image is initramfs...it isn't (no cpio magic); looks like an initrd
Oct 12 11:46:35 swdev14 kernel: Freeing initrd memory: 207k freed
Oct 12 11:46:35 swdev14 kernel: NET: Registered protocol family 16
Oct 12 11:46:35 swdev14 kernel: PCI: PCI BIOS revision 2.10 entry at 0xfd915, last bus=5
Oct 12 11:46:35 swdev14 kernel: PCI: Using configuration type 1
Oct 12 11:46:35 swdev14 kernel: mtrr: v2.0 (20020519)
Oct 12 11:46:35 swdev14 kernel: Linux Plug and Play Support v0.97 (c) Adam Belay
Oct 12 11:46:35 swdev14 kernel: PCI: Probing PCI hardware
Oct 12 11:46:35 swdev14 kernel: PCI: Probing PCI hardware (bus 00)
Oct 12 11:46:35 swdev14 kernel: PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
Oct 12 11:46:35 swdev14 kernel: PCI: Transparent bridge - 0000:00:1e.0
Oct 12 11:46:35 swdev14 kernel: Simple Boot Flag at 0x36 set to 0x1
Oct 12 11:46:35 swdev14 kernel: apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
Oct 12 11:46:35 swdev14 kernel: apm: disabled - APM is not SMP safe.
Oct 12 11:46:35 swdev14 kernel: VFS: Disk quotas dquot_6.5.1
Oct 12 11:46:35 swdev14 kernel: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Oct 12 11:46:35 swdev14 kernel: Initializing Cryptographic API
Oct 12 11:46:35 swdev14 kernel: vesafb: probe of vesafb0 failed with error -6
Oct 12 11:46:35 swdev14 kernel: isapnp: Scanning for PnP cards...
Oct 12 11:46:35 swdev14 kernel: isapnp: No Plug & Play device found
Oct 12 11:46:35 swdev14 kernel: scheduling while atomic: swapper/0x04000001/1
Oct 12 11:46:35 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:35 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:35 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:35 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:35 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:36 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:36 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:36 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:36 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:36 swdev14 kernel: [<c0133b82>] _rw_mutex_read_lock+0x24/0x39
Oct 12 11:46:36 swdev14 kernel: [<c011f62c>] profile_handoff_task+0x1a/0x52
Oct 12 11:46:36 swdev14 netfs: Mounting NFS filesystems: succeeded
Oct 12 11:46:36 swdev14 kernel: [<c011c508>] __put_task_struct+0x66/0x119
Oct 12 11:46:36 swdev14 kernel: [<c0298a0b>] schedule+0x35f/0xbe2
Oct 12 11:46:36 swdev14 kernel: [<c029a392>] _spin_unlock_irq+0x1b/0x35
Oct 12 11:46:36 swdev14 netfs: Mounting other filesystems: succeeded
Oct 12 11:46:36 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:36 swdev14 kernel: [<c0134ae5>] sub_preempt_count+0x82/0x97
Oct 12 11:46:36 swdev14 kernel: [<c029a392>] _spin_unlock_irq+0x1b/0x35
Oct 12 11:46:36 swdev14 kernel: [<c029938c>] wait_for_completion+0x84/0xe3
Oct 12 11:46:36 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:36 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:36 swdev14 kernel: [<c012eab2>] queue_work+0x72/0xa0
Oct 12 11:46:36 swdev14 kernel: [<c012e9cd>] call_usermodehelper+0xc7/0xce
Oct 12 11:46:36 swdev14 kernel: [<c012e89d>] __call_usermodehelper+0x0/0x69
Oct 12 11:46:36 swdev14 kernel: [<c01e54d5>] class_hotplug+0x0/0x44
Oct 12 11:46:36 swdev14 kernel: [<c01aeeb4>] kobject_hotplug+0x27e/0x2e2
Oct 12 11:46:36 swdev14 kernel: [<c01ae0f0>] create_dir+0x3e/0x4e
Oct 12 11:46:36 swdev14 autofs: automount startup succeeded
Oct 12 11:46:36 swdev14 kernel: [<c01ae34e>] kobject_add+0x8c/0xfa
Oct 12 11:46:36 swdev14 kernel: [<c01e56b7>] class_device_add+0x8d/0x15e
Oct 12 11:46:36 swdev14 kernel: [<c01e5d0b>] class_simple_device_add+0xa3/0x104
Oct 12 11:46:36 swdev14 kernel: [<c01cfb4d>] tty_register_device+0x73/0xdd
Oct 12 11:46:36 swdev14 kernel: [<c01e64a8>] kobj_map+0xa0/0x136
Oct 12 11:46:36 swdev14 kernel: [<c0164ba4>] cdev_add+0x4b/0x4f
Oct 12 11:46:36 swdev14 kernel: [<c01cfe6a>] tty_register_driver+0x14c/0x243
Oct 12 11:46:36 swdev14 smartd[2807]: smartd version 5.21 Copyright (C) 2002-3 Bruce Allen
Oct 12 11:46:36 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 11:46:36 swdev14 smartd[2807]: Home page is http://smartmontools.sourceforge.net/
Oct 12 11:46:36 swdev14 kernel: [<c036c34d>] legacy_pty_init+0x28a/0x2c8
Oct 12 11:46:36 swdev14 smartd[2807]: Opened configuration file /etc/smartd.conf
Oct 12 11:46:36 swdev14 kernel: [<c036c657>] pty_init+0xd/0x16
Oct 12 11:46:36 swdev14 smartd[2807]: Configuration file /etc/smartd.conf parsed.
Oct 12 11:46:36 swdev14 kernel: [<c03549b2>] do_initcalls+0x30/0xbd
Oct 12 11:46:36 swdev14 smartd[2807]: Device: /dev/hda, opened
Oct 12 11:46:36 swdev14 kernel: [<c0100541>] init+0x87/0x19a
Oct 12 11:46:36 swdev14 kernel: [<c01004ba>] init+0x0/0x19a
Oct 12 11:46:36 swdev14 smartd[2807]: Device: /dev/hda, not found in smartd database.
Oct 12 11:46:36 swdev14 kernel: [<c01042c9>] kernel_thread_helper+0x5/0xb
Oct 12 11:46:36 swdev14 smartd[2807]: Device: /dev/hda, is SMART capable. Adding to "monitor" list.
Oct 12 11:46:36 swdev14 kernel: scheduling while atomic: swapper/0x04000001/1
Oct 12 11:46:36 swdev14 smartd[2807]: Monitoring 1 ATA and 0 SCSI devices
Oct 12 11:46:36 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:37 swdev14 smartd[2809]: smartd has fork()ed into background mode. New PID=2809.
Oct 12 11:46:37 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:37 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:37 swdev14 smartd: smartd startup succeeded
Oct 12 11:46:37 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:37 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:37 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:37 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:37 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:37 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:37 swdev14 kernel: [<c0133b82>] _rw_mutex_read_lock+0x24/0x39
Oct 12 11:46:37 swdev14 kernel: [<c011f62c>] profile_handoff_task+0x1a/0x52
Oct 12 11:46:37 swdev14 kernel: [<c011c508>] __put_task_struct+0x66/0x119
Oct 12 11:46:37 swdev14 kernel: [<c0298a0b>] schedule+0x35f/0xbe2
Oct 12 11:46:37 swdev14 kernel: [<c029a392>] _spin_unlock_irq+0x1b/0x35
Oct 12 11:46:37 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:37 swdev14 sshd: succeeded
Oct 12 11:46:37 swdev14 kernel: [<c0134ae5>] sub_preempt_count+0x82/0x97
Oct 12 11:46:37 swdev14 kernel: [<c029a392>] _spin_unlock_irq+0x1b/0x35
Oct 12 11:46:37 swdev14 kernel: [<c029938c>] wait_for_completion+0x84/0xe3
Oct 12 11:46:37 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:37 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:37 swdev14 kernel: [<c012eab2>] queue_work+0x72/0xa0
Oct 12 11:46:37 swdev14 kernel: [<c012e9cd>] call_usermodehelper+0xc7/0xce
Oct 12 11:46:37 swdev14 kernel: [<c012e89d>] __call_usermodehelper+0x0/0x69
Oct 12 11:46:37 swdev14 kernel: [<c01e54d5>] class_hotplug+0x0/0x44
Oct 12 11:46:37 swdev14 kernel: [<c01aeeb4>.20
Oct 12 11:46:37 swdev14 xinetd: xinetd startup succeeded
Oct 12 11:46:37 swdev14 kernel: ip_tables: (C) 2000-2002 Netfilter core team
Oct 12 11:46:37 swdev14 kernel: NET: Registered protocol family 10
Oct 12 11:46:37 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:37 swdev14 kernel: caller is raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:37 swdev14 ntpdate[2873]: can't find host wizard
Oct 12 11:46:37 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:37 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:37 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:37 swdev14 kernel: [<e0b1d282>] inet6_create+0x26f/0x315 [ipv6]
Oct 12 11:46:37 swdev14 kernel: [<c02284aa>] __sock_create+0xd5/0x19d
Oct 12 11:46:37 swdev14 kernel: [<c02285dc>] sock_create_kern+0x33/0x37
Oct 12 11:46:37 swdev14 kernel: [<e09bc824>] icmpv6_init+0xc4/0x110 [ipv6]
Oct 12 11:46:37 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:37 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:37 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc856>] icmpv6_init+0xf6/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b1d282>] inet6_create+0x26f/0x315 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c02284aa>] __sock_create+0xd5/0x19d
Oct 12 11:46:38 swdev14 kernel: [<c02285dc>] sock_create_kern+0x33/0x37
Oct 12 11:46:38 swdev14 kernel: [<e09bc824>] icmpv6_init+0xc4/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc856>] icmpv6_init+0xf6/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b1d282>] inet6_create+0x26f/0x315 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c02284aa>] __sock_create+0xd5/0x19d
Oct 12 11:46:38 swdev14 kernel: [<c02285dc>] sock_create_kern+0x33/0x37
Oct 12 11:46:38 swdev14 kernel: [<e09bc824>] icmpv6_init+0xc4/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc856>] icmpv6_init+0xf6/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b1d282>] inet6_create+0x26f/0x315 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c02284aa>] __sock_create+0xd5/0x19d
Oct 12 11:46:38 swdev14 kernel: [<c02285dc>] sock_create_kern+0x33/0x37
Oct 12 11:46:38 swdev14 kernel: [<e09bc824>] icmpv6_init+0xc4/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc856>] icmpv6_init+0xf6/0x110 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc14b>] inet6_init+0xb9/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32f56>] raw_v6_hash+0x62/0x85 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b1d282>] inet6_create+0x26f/0x315 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c02284aa>] __sock_create+0xd5/0x19d
Oct 12 11:46:38 swdev14 kernel: [<c02285dc>] sock_create_kern+0x33/0x37
Oct 12 11:46:38 swdev14 kernel: [<e09bc60d>] ndisc_init+0x32/0xe9 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc161>] inet6_init+0xcf/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: modprobe/2819
Oct 12 11:46:38 swdev14 kernel: caller is raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c011a2aa>] smp_processor_id+0xa8/0xb9
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e0b32fda>] raw_v6_unhash+0x61/0xad [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc646>] ndisc_init+0x6b/0xe9 [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<e09bc161>] inet6_init+0xcf/0x20c [ipv6]
Oct 12 11:46:38 swdev14 kernel: [<c0138c03>] sys_init_module+0x15c/0x1ce
Oct 12 11:46:38 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:38 swdev14 kernel: IPv6 over IPv4 tunneling driver
Oct 12 11:46:38 swdev14 kernel: ------------[ cut here ]------------
Oct 12 11:46:38 swdev14 kernel: kernel BUG at kernel/mutex.c:185!
Oct 12 11:46:38 swdev14 kernel: invalid operand: 0000 [#1]
Oct 12 11:46:38 swdev14 kernel: PREEMPT SMP
Oct 12 11:46:38 swdev14 kernel: Modules linked in: ipv6 autofs4 nfs lockd sunrpc iptable_filter ip_tables ide_cd cdrom 3c59x mii tg3 floppy sg scsi_mod parport_pc parport microcode dm_mod evdev usbhid uhci_hcd usbcore ext3 jbd
Oct 12 11:46:38 swdev14 kernel: CPU: 2
Oct 12 11:46:38 swdev14 kernel: EIP: 0060:[<c0133bbc>] Not tainted VLI
Oct 12 11:46:38 swdev14 kernel: EFLAGS: 00010246 (2.6.9-rc4-mm1-VP-T7)
Oct 12 11:46:38 swdev14 kernel: EIP is at _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:38 swdev14 kernel: eax: 00000000 ebx: 00000000 ecx: c0350e00 edx: c1466b80
Oct 12 11:46:38 swdev14 kernel: esi: ddef3ac8 edi: dd878814 ebp: de5d7f18 esp: de5d7f18
Oct 12 11:46:38 swdev14 kernel: ds: 007b es: 007b ss: 0068 preempt: 00000002
Oct 12 11:46:38 swdev14 kernel: Process sshd (pid: 2849, threadinfo=de5d6000 task=de4a7a80)
Oct 12 11:46:38 swdev14 kernel: Stack: de5d7f44 c02537ec c0350e00 00000016 c029a3c6 dfe12a80 ddef3c94 ddef3bfc
Oct 12 11:46:38 swdev14 kernel: dfe12a80 ddef3a80 ffffffea de5d7f5c c0275b18 ddef3a80 00000005 dfe12a80
Oct 12 11:46:38 swdev14 kernel: 00000003 de5d7f78 c022889e dfe12a80 00000005 00000000 00000004 08090bf0
Oct 12 11:46:38 swdev14 kernel: Call Trace:
Oct 12 11:46:38 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:38 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:38 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:39 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:39 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:39 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:39 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:39 swdev14 kernel: Code: 75 fc 89 ec 5d c3 55 89 e5 e8 21 fb fd ff 8b 4d 08 8b 01 85 c0 74 14 8d 41 04 ba ff ff 00 00 f0 0f c1 10 0f 85 6e 03 00 00 5d c3 <0f> 0b b9 00 1a c7 2a c0 eb e2 55 89 e5 e8 f2 fa fd ff 8b 4d 08
Oct 12 11:46:39 swdev14 kernel: <3>scheduling while atomic: sshd/0x04000001/2849
Oct 12 11:46:39 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c011f5d4>] profile_task_exit+0x18/0x56
Oct 12 11:46:39 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 11:46:39 swdev14 kernel: [<c01216f8>] do_exit+0x1f/0x3bd
Oct 12 11:46:39 swdev14 kernel: [<c029929f>] preempt_schedule+0x11/0x7a
Oct 12 11:46:39 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:39 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:39 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:39 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:39 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:39 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:39 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:39 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:39 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:39 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:39 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:39 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:39 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:39 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:39 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:39 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:39 swdev14 kernel: note: sshd[2849] exited with preempt_count 1
Oct 12 11:46:39 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2849
Oct 12 11:46:39 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c014b81d>] unmap_vmas+0x190/0x29b
Oct 12 11:46:39 swdev14 kernel: [<c015006f>] exit_mmap+0xb2/0x1cc
Oct 12 11:46:39 swdev14 kernel: [<c011c979>] mmput+0x3b/0xb9
Oct 12 11:46:39 swdev14 kernel: [<c0121807>] do_exit+0x12e/0x3bd
Oct 12 11:46:39 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:39 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:39 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:39 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:39 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:39 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:39 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:39 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:39 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:39 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:39 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:39 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:39 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:39 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:39 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:39 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:39 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2849
Oct 12 11:46:39 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c0133816>] _mutex_lock+0x19/0x3f
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c014dff0>] remove_vm_struct+0x34/0x9f
Oct 12 11:46:39 swdev14 kernel: [<c015012f>] exit_mmap+0x172/0x1cc
Oct 12 11:46:39 swdev14 kernel: [<c011c979>] mmput+0x3b/0xb9
Oct 12 11:46:39 swdev14 kernel: [<c0121807>] do_exit+0x12e/0x3bd
Oct 12 11:46:39 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:39 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:39 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:39 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:39 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:39 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:39 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:39 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:39 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:39 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:39 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:39 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:39 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:39 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:39 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:39 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:39 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2849
Oct 12 11:46:39 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:39 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:39 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:39 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:39 swdev14 kernel: [<c0133816>] _mutex_lock+0x19/0x3f
Oct 12 11:46:39 swdev14 kernel: [<c015197a>] anon_vma_unlink+0x23/0x90
Oct 12 11:46:39 swdev14 kernel: [<c015c50e>] fput+0xe/0x1f
Oct 12 11:46:39 swdev14 kernel: [<c014e032>] remove_vm_struct+0x76/0x9f
Oct 12 11:46:39 swdev14 kernel: [<c015012f>] exit_mmap+0x172/0x1cc
Oct 12 11:46:39 swdev14 kernel: [<c011c979>] mmput+0x3b/0xb9
Oct 12 11:46:39 swdev14 kernel: [<c0121807>] do_exit+0x12e/0x3bd
Oct 12 11:46:39 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:39 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:39 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:40 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:40 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:40 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:40 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:40 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:40 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:40 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:40 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2849
Oct 12 11:46:40 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:40 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:40 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:40 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:40 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:40 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:40 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:40 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:40 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:40 swdev14 kernel: [<c014dfdb>] remove_vm_struct+0x1f/0x9f
Oct 12 11:46:40 swdev14 kernel: [<c015012f>] exit_mmap+0x172/0x1cc
Oct 12 11:46:40 swdev14 kernel: [<c011c979>] mmput+0x3b/0xb9
Oct 12 11:46:40 swdev14 kernel: [<c0121807>] do_exit+0x12e/0x3bd
Oct 12 11:46:40 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:40 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:40 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:40 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:40 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:40 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:40 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:40 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:40 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:40 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2849
Oct 12 11:46:40 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:40 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:40 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:40 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:40 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:40 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:40 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:40 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:40 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:40 swdev14 kernel: [<c014dfdb>] remove_vm_struct+0x1f/0x9f
Oct 12 11:46:40 swdev14 kernel: [<c015012f>] exit_mmap+0x172/0x1cc
Oct 12 11:46:40 swdev14 kernel: [<c011c979>] mmput+0x3b/0xb9
Oct 12 11:46:40 swdev14 kernel: [<c0121807>] do_exit+0x12e/0x3bd
Oct 12 11:46:40 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:40 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:40 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:40 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:40 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:40 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:40 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:40 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:40 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:40 swdev14 kernel: scheduling while atomic: sshd/0x00000001/2849
Oct 12 11:46:40 swdev14 kernel: caller is __down+0x8a/0x107
Oct 12 11:46:40 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:40 swdev14 kernel: [<c029848a>] __down+0x8a/0x107
Oct 12 11:46:40 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:40 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:40 swdev14 kernel: [<c029a34c>] _spin_unlock_irqrestore+0xb/0x36
Oct 12 11:46:40 swdev14 kernel: [<c0298485>] __down+0x85/0x107
Oct 12 11:46:40 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 11:46:40 swdev14 kernel: [<c029848a>] __down+0x8a/0x107
Oct 12 11:46:40 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:40 swdev14 kernel: [<c0298650>] __down_failed+0x8/0xc
Oct 12 11:46:40 swdev14 kernel: [<c011c3d8>] .text.lock.sched+0x5/0x15
Oct 12 11:46:40 swdev14 kernel: [<c01ccbc1>] disassociate_ctty+0x1d/0x16d
Oct 12 11:46:40 swdev14 kernel: [<c01218f1>] do_exit+0x218/0x3bd
Oct 12 11:46:40 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 11:46:40 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 11:46:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:40 swdev14 kernel: [<c029a1aa>] _write_lock+0x1b/0x76
Oct 12 11:46:40 swdev14 kernel: [<c0264422>] tcp_listen_wlock+0x16/0xac
Oct 12 11:46:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 11:46:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 11:46:40 swdev14 kernel: [<c0133bbc>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 11:46:40 swdev14 kernel: [<c02537ec>] tcp_listen_start+0x175/0x1d1
Oct 12 11:46:40 swdev14 kernel: [<c029a3c6>] _spin_unlock_bh+0x1a/0x34
Oct 12 11:46:40 swdev14 kernel: [<c0275b18>] inet_listen+0x65/0x7a
Oct 12 11:46:40 swdev14 kernel: [<c022889e>] sys_listen+0x5c/0x74
Oct 12 11:46:40 swdev14 kernel: [<c0229558>] sys_socketcall+0xb1/0x239
Oct 12 11:46:40 swdev14 kernel: [<c015aeb9>] sys_close+0x75/0x91
Oct 12 11:46:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:46 swdev14 ntpdate[2873]: step time server 159.82.80.104 offset -0.498445 sec
Oct 12 11:46:46 swdev14 ntpd: succeeded
Oct 12 11:46:46 swdev14 ntpd[2877]: ntpd [email protected] Thu Mar 11 11:46:39 EST 2004 (1)
Oct 12 11:46:46 swdev14 ntpd: ntpd startup succeeded
Oct 12 11:46:46 swdev14 ntpd[2877]: precision = 1.000 usec
Oct 12 11:46:46 swdev14 ntpd[2877]: kernel time sync status 0040
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "opus" unknown, line ignored
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "hal" unknown, line ignored
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "wizard" unknown, line ignored
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "time1.utc.com" unknown, line ignored
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "time2.utc.com" unknown, line ignored
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "time3.utc.com" unknown, line ignored
Oct 12 11:46:46 swdev14 vsftpd: vsftpd vsftpd succeeded
Oct 12 11:46:46 swdev14 ntpd[2877]: frequency initialized 124.193 PPM from /var/lib/ntp/drift
Oct 12 11:46:46 swdev14 ntpd[2877]: configure: keyword "authenticate" unknown, line ignored
Oct 12 11:46:46 swdev14 gpm[2896]: *** info [startup.c(95)]:
Oct 12 11:46:46 swdev14 gpm[2896]: Started gpm successfully. Entered daemon mode.
Oct 12 11:46:46 swdev14 gpm[2896]: *** info [mice.c(1766)]:
Oct 12 11:46:46 swdev14 gpm[2896]: imps2: Auto-detected intellimouse PS/2
Oct 12 11:46:47 swdev14 gpm: gpm startup succeeded
Oct 12 11:46:47 swdev14 crond: crond startup succeeded
Oct 12 11:46:49 swdev14 kernel: lp0: using parport0 (polling).
Oct 12 11:46:49 swdev14 kernel: lp0: console ready
Oct 12 11:46:49 swdev14 kernel: scheduling while atomic: serial/0x04000001/2947
Oct 12 11:46:49 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:49 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:49 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:49 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:49 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:49 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:49 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:49 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:49 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:49 swdev14 kernel: [<c0133b82>] _rw_mutex_read_lock+0x24/0x39
Oct 12 11:46:49 swdev14 kernel: [<c011f62c>] profile_handoff_task+0x1a/0x52
Oct 12 11:46:49 swdev14 kernel: [<c011c508>] __put_task_struct+0x66/0x119
Oct 12 11:46:49 swdev14 kernel: [<c0298a0b>] schedule+0x35f/0xbe2
Oct 12 11:46:49 swdev14 kernel: [<c029a392>] _spin_unlock_irq+0x1b/0x35
Oct 12 11:46:49 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:49 swdev14 kernel: [<c0134ae5>] sub_preempt_count+0x82/0x97
Oct 12 11:46:49 swdev14 kernel: [<c029a392>] _spin_unlock_irq+0x1b/0x35
Oct 12 11:46:49 swdev14 kernel: [<c029938c>] wait_for_completion+0x84/0xe3
Oct 12 11:46:49 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:49 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:49 swdev14 kernel: [<c012ead4>] queue_work+0x94/0xa0
Oct 12 11:46:49 swdev14 kernel: [<c012e9cd>] call_usermodehelper+0xc7/0xce
Oct 12 11:46:49 swdev14 kernel: [<c012e89d>] __call_usermodehelper+0x0/0x69
Oct 12 11:46:49 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 11:46:49 swdev14 kernel: [<c012e6e1>] request_module+0xa1/0xe5
Oct 12 11:46:49 swdev14 kernel: [<c0134141>] __mcount+0x1d/0x21
Oct 12 11:46:49 swdev14 kernel: [<c0164d2c>] base_probe+0xe/0x52
Oct 12 11:46:49 swdev14 kernel: [<c01e672b>] kobj_lookup+0xf8/0x105
Oct 12 11:46:49 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 11:46:49 swdev14 kernel: [<c0164d50>] base_probe+0x32/0x52
Oct 12 11:46:49 swdev14 kernel: [<c01e672b>] kobj_lookup+0xf8/0x105
Oct 12 11:46:49 swdev14 kernel: [<c0164d1e>] base_probe+0x0/0x52
Oct 12 11:46:49 swdev14 kernel: [<c01649ff>] chrdev_open+0xf2/0x16f
Oct 12 11:46:49 swdev14 kernel: [<c016490d>] chrdev_open+0x0/0x16f
Oct 12 11:46:49 swdev14 kernel: [<c015aafa>] dentry_open+0x106/0x180
Oct 12 11:46:49 swdev14 kernel: [<c015a9f2>] filp_open+0x62/0x64
Oct 12 11:46:49 swdev14 kernel: [<c013388c>] _mutex_unlock+0xe/0x5e
Oct 12 11:46:49 swdev14 kernel: [<c01b1bfa>] find_next_zero_bit+0x14/0xa6
Oct 12 11:46:49 swdev14 kernel: [<c015ac04>] get_unused_fd+0x90/0xe4
Oct 12 11:46:49 swdev14 kernel: [<c015ad58>] sys_open+0x4b/0x88
Oct 12 11:46:49 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 11:46:50 swdev14 kernel: scheduling while atomic: khelper/0x04000001/14
Oct 12 11:46:50 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 11:46:50 swdev14 kernel: [<c029925b>] schedule+0xbaf/0xbe2
Oct 12 11:46:50 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:50 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:50 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:50 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:50 swdev14 kernel: [<c013492a>] touch_preempt_timing+0x46/0x4a
Oct 12 11:46:50 swdev14 kernel: [<c02996cd>] cond_resched+0x26/0x83
Oct 12 11:46:50 swdev14 kernel: [<c029970a>] cond_resched+0x63/0x83
Oct 12 11:46:50 swdev14 kernel: [<c0133b82>] _rw_mutex_read_lock+0x24/0x39
Oct 12 11:46:50 swdev14 kernel: [<c011f62c>] profile_handoff_task+0x1a/0x52
Oct 12 11:46:50 swdev14 kernel: [<c011c508>] __put_task_struct+0x66/0x119
Oct 12 11:46:50 swdev14 kernel: [<c0298a0b>] schedule+0x35f/0xbe2
Oct 12 11:46:50 swdev14 kernel: [<c029a35d>] _spin_unlock_irqrestore+0x1c/0x36
Oct 12 11:46:50 swdev14 kernel: [<c013487c>] check_preempt_timing+0x191/0x1f9
Oct 12 11:46:50 swdev14 kernel: [<c0134ae5>] sub_preempt_count+0x82/0x97
Oct 12 11:46:50 swdev14 kernel: [<c029a35d>] _spin_unlock_irqrestore+0x1c/0x36
Oct 12 11:46:50 swdev14 kernel: [<c012edbf>] worker_thread+0x20c/0x22a
Oct 12 11:46:50 swdev14 kernel: [<c012e89d>] __call_usermodehelper+0x0/0x69
Oct 12 11:46:50 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:50 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 11:46:50 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 11:46:50 swdev14 kernel: [<c0132f2b>] kthread+0xbc/0xc1
Oct 12 11:46:50 swdev14 kernel: [<c012ebb3>] worker_thread+0x0/0x22a
Oct 12 11:46:50 swdev14 kernel: [<c0132e6f>] kthread+0x0/0xc1
Oct 12 11:46:50 swdev14 kernel: [<c01042c9>] kernel_thread_helper+0x5/0xb
Oct 12 11:48:12 swdev14 syslogd 1.4.1: restart.
On Tue, 2004-10-12 at 09:39, [email protected] wrote:
> Had to cycle power to get the machine back. Rebooting with max_cpus=1
> crashed in a different way. Was able to get past mounting the disks and
> some of the init script items before stopping at the same location with
> a different call trace:
>
> Call Trace:
> [<c011cb58>] scheduler_tick+0x148/0x490
> [<c012bee3>] update_process_times+0x43/0x60
> [<c0114b60>] mcount+0x14/0x18
> [<c012beef>] update_process_times+0x4f/0x60
> [<c0115141>] smp_apic_timer_interrupt+0xe1/0xf0
> [<c011cb73>] scheduler_tick+0x16e/0x490
> [<c010854a>] apic_timer_interrupt+0x1a/0x20
> [<c031007b>] unix_stream_recvmsg+0x5b/0x450
> [<c011cb7e>] scheduler_tick+0x16e/0x490
> [<c012bee3>] update_process_times+0x43/0x60
> [<c0114b60>] mcount+0x14/0x18
> [<c012beef>] update_process_times+0x4f/0x60
> [<c0115141>] smp_apic_timer_interrupt+0xe1/0xf0
> [<c01225d4>] release_console_sem+0x64/0xe0
> [<c012236d>] printk+0x1d/0x30
Do you have hyperthreading turned on? I think I've seen this trace a
few times before.
Daniel Walker
i've uploaded the -T8 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T8
lots of stabilization of CONFIG_PREEMPT_REALTIME. It's still in
experimental status but general stability is improving.
Changes since -T7:
- fixed a nasty category of bugs that were introduced by the use of
rwsems for rwlocks. rwsems are not read-recursive, while rwlocks are.
Fortunately it was not too hard to identify & fix recursive users of
tasklist_lock, in fact each of these also qualifies as a cleanup. The
symptom of this bug was a soft-deadlocking of the system.
- fixed profiler locks, i believe this could resolve the bootup crash
reported by K.R. Foley.
- fixed VP / XFS namespace collision reported by Mark H. Johnson
- fixed one more detail of the new zombie-task handling code
- fixed 3c59x.c, fasync-handling, ipv6, module-loader runtime
warnings reported by K.R. Foley.
- fixed the ali5451 locking
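
The first changelog item (rwsems are not read-recursive, rwlocks are) can be
illustrated with a toy model. This is only a sketch, not kernel code: the
class ToyRWLock and the function nested_read_deadlocks are hypothetical
names, and the model assumes that an rwsem-like lock makes every new read
acquisition wait once a writer is queued, while an rwlock-like lock lets
readers in regardless.

```python
class ToyRWLock:
    """Toy, non-blocking model of a reader/writer lock (not kernel code)."""

    def __init__(self, readers_may_recurse):
        # True models the rwlock behaviour described above; False models an
        # rwsem-like lock where a queued writer blocks new read acquisitions
        # (an assumption of this sketch).
        self.readers_may_recurse = readers_may_recurse
        self.readers = 0
        self.writer_waiting = False

    def try_read(self):
        """Return True if a read acquisition would succeed right now."""
        if self.writer_waiting and not self.readers_may_recurse:
            return False  # the reader would sleep behind the queued writer
        self.readers += 1
        return True

    def queue_writer(self):
        """A writer arrives; it has to wait while readers hold the lock."""
        if self.readers > 0:
            self.writer_waiting = True


def nested_read_deadlocks(readers_may_recurse):
    """Model one task taking the same lock for reading twice."""
    lock = ToyRWLock(readers_may_recurse)
    assert lock.try_read()   # outer read acquisition, e.g. of tasklist_lock
    lock.queue_writer()      # meanwhile another task asks for the write lock
    # Nested read by the same task: if it cannot proceed, the reader waits
    # for the writer while the writer waits for the reader.
    return not lock.try_read()


print("rwlock-like lock deadlocks:", nested_read_deadlocks(True))
print("rwsem-like lock deadlocks: ", nested_read_deadlocks(False))
```

In this model only the rwsem-like variant gets stuck, which would match the
soft-deadlock symptom described above if a writer shows up between the two
read acquisitions.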
to create a -T8 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T8
Ingo
* [email protected] <[email protected]> wrote:
> >i've uploaded -T7:
> This crashes at boot time again. Several more scrolling messages that
> end with (all I can see on the screen)
could you try to capture the full bootlog of the -T8 kernel (which i've
just released)? The reason is that often the crash you get is just a
side-effect of a problem that was warned about in one of the 'scrolling
by' messages. In particular the scheduler crash you got seems to be of
that kind.
Ingo
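
Once a full bootlog has been captured, finding the earliest warning in it
can be scripted. The following is a minimal sketch, assuming the log is
available as a list of text lines; the helper name first_warning and the
pattern list are made up for illustration, and the patterns are simply
strings that show up in the logs posted in this thread.

```python
import re

# Strings that mark trouble in the logs posted in this thread. The list is
# illustrative, not a complete set of kernel warning messages.
WARNING_PATTERNS = [
    re.compile(r"scheduling while atomic"),
    re.compile(r"kernel BUG at"),
    re.compile(r"invalid operand"),
]


def first_warning(lines):
    """Return (index, line) of the earliest matching line, or None."""
    for index, line in enumerate(lines):
        if any(pattern.search(line) for pattern in WARNING_PATTERNS):
            return index, line.rstrip("\n")
    return None


sample = [
    "Oct 12 15:33:39 swdev14 kernel: IPv6 over IPv4 tunneling driver",
    "Oct 12 15:33:40 swdev14 kernel: kernel BUG at kernel/mutex.c:185!",
    "Oct 12 15:33:40 swdev14 kernel: <3>scheduling while atomic: sshd/0x04000001/2835",
]
print(first_warning(sample))
```

The same function accepts an open file object, so it could be pointed at
wherever syslogd writes kernel messages on the test box.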
Oct 12 15:33:34 swdev14 syslogd 1.4.1: restart.
Oct 12 15:33:34 swdev14 syslog: syslogd startup succeeded
Oct 12 15:33:34 swdev14 kernel: klogd 1.4.1, log source = /proc/kmsg started.
Oct 12 15:33:34 swdev14 syslog: klogd startup succeeded
Oct 12 15:33:34 swdev14 kernel: Linux version 2.6.9-rc4-mm1-VP-T8 ([email protected]) (gcc version 3.3.3 20040412 (Red Hat Linux 3.3.3-7)) #11 SMP Tue Oct 12 15:25:18 CDT 2004
Oct 12 15:33:34 swdev14 kernel: BIOS-provided physical RAM map:
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 00000000000e0000 - 0000000000100000 (reserved)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 0000000000100000 - 000000001ff70000 (usable)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 000000001ff70000 - 000000001ff77000 (ACPI data)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 000000001ff77000 - 000000001ff80000 (ACPI NVS)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 000000001ff80000 - 0000000020000000 (reserved)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 00000000fec00000 - 00000000fec10000 (reserved)
Oct 12 15:33:34 swdev14 kernel: BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
Oct 12 15:33:35 swdev14 kernel: BIOS-e820: 00000000ff800000 - 00000000ffc00000 (reserved)
Oct 12 15:33:35 swdev14 kernel: BIOS-e820: 00000000fff00000 - 0000000100000000 (reserved)
Oct 12 15:33:35 swdev14 kernel: 0MB HIGHMEM available.
Oct 12 15:33:35 swdev14 irqbalance: irqbalance startup succeeded
Oct 12 15:33:35 swdev14 kernel: 511MB LOWMEM available.
Oct 12 15:33:35 swdev14 kernel: found SMP MP-table at 000f6b00
Oct 12 15:33:35 swdev14 kernel: DMI present.
Oct 12 15:33:35 swdev14 portmap: portmap startup succeeded
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC (acpi_id[0x00] lapic_id[0x00] enabled)
Oct 12 15:33:35 swdev14 kernel: Processor #0 15:2 APIC version 20
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC (acpi_id[0x01] lapic_id[0x06] enabled)
Oct 12 15:33:35 swdev14 kernel: Processor #6 15:2 APIC version 20
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Oct 12 15:33:35 swdev14 kernel: Processor #1 15:2 APIC version 20
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC (acpi_id[0x03] lapic_id[0x07] enabled)
Oct 12 15:33:35 swdev14 kernel: Processor #7 15:2 APIC version 20
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC_NMI (acpi_id[0x00] high edge lint[0x1])
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC_NMI (acpi_id[0x01] high edge lint[0x1])
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC_NMI (acpi_id[0x02] high edge lint[0x1])
Oct 12 15:33:35 swdev14 kernel: ACPI: LAPIC_NMI (acpi_id[0x03] high edge lint[0x1])
Oct 12 15:33:35 swdev14 kernel: Using ACPI for processor (LAPIC) configuration information
Oct 12 15:33:35 swdev14 kernel: Intel MultiProcessor Specification v1.4
Oct 12 15:33:35 swdev14 kernel: Virtual Wire compatibility mode.
Oct 12 15:33:35 swdev14 kernel: OEM ID: Product ID: PLACER CRB APIC at: 0xFEE00000
Oct 12 15:33:35 swdev14 kernel: I/O APIC #2 Version 32 at 0xFEC00000.
Oct 12 15:33:35 swdev14 kernel: I/O APIC #3 Version 32 at 0xFEC80000.
Oct 12 15:33:35 swdev14 kernel: I/O APIC #4 Version 32 at 0xFEC80100.
Oct 12 15:33:35 swdev14 rpc.statd[2649]: Version 1.0.6 Starting
Oct 12 15:33:35 swdev14 kernel: Enabling APIC mode: Flat. Using 3 I/O APICs
Oct 12 15:33:35 swdev14 kernel: Processors: 4
Oct 12 15:33:35 swdev14 kernel: Built 1 zonelists
Oct 12 15:33:35 swdev14 kernel: Initializing CPU#0
Oct 12 15:33:35 swdev14 nfslock: rpc.statd startup succeeded
Oct 12 15:33:35 swdev14 kernel: Kernel command line: ro root=LABEL=/ noapic
Oct 12 15:33:35 swdev14 kernel: (swapper/0): new 324744 us maximum-latency critical section.
Oct 12 15:33:35 swdev14 kernel: => started at: <start_kernel+0x48/0x1c6>
Oct 12 15:33:35 swdev14 kernel: => ended at: <cond_resched+0x26/0x83>
Oct 12 15:33:35 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:35 swdev14 kernel: [<c0134834>] check_preempt_timing+0x161/0x1f9
Oct 12 15:33:35 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:35 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:35 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:35 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:35 swdev14 kernel: [<c0135c0c>] register_cpu_notifier+0x18/0x58
Oct 12 15:33:36 swdev14 kernel: [<c01301b4>] rcu_cpu_notify+0x36/0x38
Oct 12 15:33:36 swdev14 kernel: [<c03646c7>] rcu_init+0x70/0x74
Oct 12 15:33:36 swdev14 kernel: [<c035485c>] start_kernel+0xb9/0x1c6
Oct 12 15:33:36 swdev14 kernel: [<c03543b0>] unknown_bootoption+0x0/0x15d
Oct 12 15:33:36 swdev14 kernel: PID hash table entries: 2048 (order: 11, 32768 bytes)
Oct 12 15:33:36 swdev14 kernel: (swapper/0): new 354276 us maximum-latency critical section.
Oct 12 15:33:36 swdev14 rpcidmapd: rpc.idmapd startup succeeded
Oct 12 15:33:36 swdev14 kernel: => started at: <cond_resched+0x26/0x83>
Oct 12 15:33:36 swdev14 kernel: => ended at: <cond_resched+0x26/0x83>
Oct 12 15:33:36 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:36 swdev14 kernel: [<c0134834>] check_preempt_timing+0x161/0x1f9
Oct 12 15:33:36 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:36 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:36 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:36 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:36 swdev14 random: Initializing random number generator: succeeded
Oct 12 15:33:36 swdev14 kernel: [<c0135c0c>] register_cpu_notifier+0x18/0x58
Oct 12 15:33:36 swdev14 kernel: [<c0128433>] timer_cpu_notify+0x25/0x27
Oct 12 15:33:36 swdev14 kernel: [<c03643cb>] init_timers+0x34/0x54
Oct 12 15:33:36 swdev14 kernel: [<c035486b>] start_kernel+0xc8/0x1c6
Oct 12 15:33:36 swdev14 kernel: [<c03543b0>] unknown_bootoption+0x0/0x15d
Oct 12 15:33:36 swdev14 kernel: Detected 2592.193 MHz processor.
Oct 12 15:33:36 swdev14 kernel: Using tsc for high-res timesource
Oct 12 15:33:36 swdev14 kernel: (swapper/0): new 705817 us maximum-latency critical section.
Oct 12 15:33:36 swdev14 rc: Starting pcmcia: succeeded
Oct 12 15:33:36 swdev14 kernel: => started at: <cond_resched+0x26/0x83>
Oct 12 15:33:36 swdev14 kernel: => ended at: <cond_resched+0x26/0x83>
Oct 12 15:33:36 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:36 swdev14 kernel: [<c0134834>] check_preempt_timing+0x161/0x1f9
Oct 12 15:33:36 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:36 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:36 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:36 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:36 swdev14 kernel: [<c01337fe>] _mutex_lock+0x19/0x3f
Oct 12 15:33:36 swdev14 kernel: [<c0133860>] _mutex_lock_irqsave+0x16/0x1c
Oct 12 15:33:36 swdev14 kernel: [<c01cbdd4>] tty_register_ldisc+0x37/0xa4
Oct 12 15:33:36 swdev14 kernel: [<c036be3e>] console_init+0x27/0x4a
Oct 12 15:33:36 swdev14 kernel: [<c035487a>] start_kernel+0xd7/0x1c6
Oct 12 15:33:36 swdev14 kernel: [<c03543b0>] unknown_bootoption+0x0/0x15d
Oct 12 15:33:36 swdev14 kernel: Console: colour VGA+ 80x25
Oct 12 15:33:36 swdev14 kernel: Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Oct 12 15:33:36 swdev14 kernel: Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Oct 12 15:33:36 swdev14 kernel: Memory: 513504k/523712k available (1645k kernel code, 9604k reserved, 726k data, 272k init, 0k highmem)
Oct 12 15:33:36 swdev14 netfs: Mounting NFS filesystems: succeeded
Oct 12 15:33:36 swdev14 kernel: Checking if this processor honours the WP bit even in supervisor mode... Ok.
Oct 12 15:33:36 swdev14 kernel: Security Scaffold v1.0.0 initialized
Oct 12 15:33:36 swdev14 netfs: Mounting other filesystems: succeeded
Oct 12 15:33:36 swdev14 kernel: Capability LSM initialized
Oct 12 15:33:36 swdev14 kernel: Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
Oct 12 15:33:37 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 15:33:37 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 15:33:37 swdev14 kernel: CPU: Physical Processor ID: 0
Oct 12 15:33:37 swdev14 kernel: Intel machine check architecture supported.
Oct 12 15:33:37 swdev14 kernel: Intel machine check reporting enabled on CPU#0.
Oct 12 15:33:37 swdev14 kernel: CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 15:33:37 swdev14 kernel: Enabling fast FPU save and restore... done.
Oct 12 15:33:37 swdev14 kernel: Enabling unmasked SIMD FPU exception support... done.
Oct 12 15:33:37 swdev14 kernel: Checking 'hlt' instruction... OK.
Oct 12 15:33:37 swdev14 kernel: CPU0: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 15:33:37 swdev14 kernel: per-CPU timeslice cutoff: 1462.71 usecs.
Oct 12 15:33:37 swdev14 kernel: task migration cache decay timeout: 2 msecs.
Oct 12 15:33:37 swdev14 kernel: Booting processor 1/1 eip 2000
Oct 12 15:33:37 swdev14 autofs: automount startup succeeded
Oct 12 15:33:37 swdev14 kernel: Initializing CPU#1
Oct 12 15:33:37 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 15:33:37 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 15:33:37 swdev14 smartd[2807]: smartd version 5.21 Copyright (C) 2002-3 Bruce Allen
Oct 12 15:33:37 swdev14 kernel: CPU: Physical Processor ID: 0
Oct 12 15:33:37 swdev14 smartd[2807]: Home page is http://smartmontools.sourceforge.net/
Oct 12 15:33:37 swdev14 smartd[2807]: Opened configuration file /etc/smartd.conf
Oct 12 15:33:37 swdev14 kernel: Intel machine check architecture supported.
Oct 12 15:33:37 swdev14 smartd[2807]: Configuration file /etc/smartd.conf parsed.
Oct 12 15:33:37 swdev14 smartd[2807]: Device: /dev/hda, opened
Oct 12 15:33:37 swdev14 kernel: Intel machine check reporting enabled on CPU#1.
Oct 12 15:33:37 swdev14 kernel: CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 15:33:37 swdev14 kernel: CPU1: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 15:33:37 swdev14 kernel: Booting processor 2/6 eip 2000
Oct 12 15:33:37 swdev14 smartd[2807]: Device: /dev/hda, not found in smartd database.
Oct 12 15:33:37 swdev14 kernel: Initializing CPU#2
Oct 12 15:33:37 swdev14 smartd[2807]: Device: /dev/hda, is SMART capable. Adding to "monitor" list.
Oct 12 15:33:37 swdev14 smartd[2807]: Monitoring 1 ATA and 0 SCSI devices
Oct 12 15:33:37 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 15:33:37 swdev14 smartd[2809]: smartd has fork()ed into background mode. New PID=2809.
Oct 12 15:33:37 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 15:33:37 swdev14 smartd: smartd startup succeeded
Oct 12 15:33:37 swdev14 kernel: CPU: Physical Processor ID: 3
Oct 12 15:33:38 swdev14 kernel: Intel machine check architecture supported.
Oct 12 15:33:38 swdev14 kernel: Intel machine check reporting enabled on CPU#2.
Oct 12 15:33:38 swdev14 kernel: CPU2: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 15:33:38 swdev14 kernel: CPU2: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 15:33:38 swdev14 kernel: Booting processor 3/7 eip 2000
Oct 12 15:33:38 swdev14 kernel: Initializing CPU#3
Oct 12 15:33:38 swdev14 kernel: CPU: Trace cache: 12K uops, L1 D cache: 8K
Oct 12 15:33:38 swdev14 sshd: succeeded
Oct 12 15:33:38 swdev14 kernel: CPU: L2 cache: 512K
Oct 12 15:33:38 swdev14 kernel: CPU: Physical Processor ID: 3
Oct 12 15:33:38 swdev14 kernel: Intel machine check architecture supported.
Oct 12 15:33:38 swdev14 kernel: Intel machine check reporting enabled on CPU#3.
Oct 12 15:33:38 swdev14 kernel: CPU3: Intel P4/Xeon Extended MCE MSRs (12) available
Oct 12 15:33:38 swdev14 kernel: CPU3: Intel(R) Xeon(TM) CPU 2.60GHz stepping 07
Oct 12 15:33:38 swdev14 kernel: Total of 4 processors activated (20594.68 BogoMIPS).
Oct 12 15:33:38 swdev14 kernel: checking TSC synchronization across 4 CPUs:
Oct 12 15:33:38 swdev14 kernel: CPU#0 had 0 usecs TSC skew, fixed it up.
Oct 12 15:33:38 swdev14 kernel: CPU#2 had 0 usecs TSC skeFlag at 0x36 set to 0x1
Oct 12 15:33:38 swdev14 kernel: apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
Oct 12 15:33:38 swdev14 xinetd: xinetd startup succeeded
Oct 12 15:33:38 swdev14 kernel: apm: disabled - APM is not SMP safe.
Oct 12 15:33:38 swdev14 kernel: VFS: Disk quotas dquot_6.5.1
Oct 12 15:33:38 swdev14 kernel: Dquot-cache hash table entries: 1024 (order 0, 4096 bytes)
Oct 12 15:33:38 swdev14 kernel: Initializing Cryptographic API
Oct 12 15:33:38 swdev14 kernel: vesafb: probe of vesafb0 failed with error -6
Oct 12 15:33:38 swdev14 kernel: isapnp: Scanning for PnP cards...
Oct 12 15:33:38 swdev14 kernel: isapnp: No Plug & Play device found
Oct 12 15:33:38 swdev14 kernel: requesting new irq thread for IRQ8...
Oct 12 15:33:38 swdev14 kernel: Real Time Clock Driver v1.12
Oct 12 15:33:38 swdev14 kernel: requesting new irq thread for IRQ12...
Oct 12 15:33:38 swdev14 kernel: Failed to disable AUX port, but continuing anyway... Is this a SiS?
Oct 12 15:33:38 swdev14 kernel: If AUX port is really absent please use the 'i8042.noaux' option.
Oct 12 15:33:38 swdev14 kernel: serio: i8042 AUX port at 0x60,0x64 irq 12
Oct 12 15:33:38 swdev14 kernel: serio: i8042 KBD port at 0x60,0x64 irq 1
Oct 12 15:33:38 swdev14 ntpdate[2873]: can't find host wizard
Oct 12 15:33:38 swdev14 kernel: io scheduler noop registered
Oct 12 15:33:39 swdev14 kernel: io scheduler anticipatory registered
Oct 12 15:33:39 swdev14 kernel: io scheduler deadline registered
Oct 12 15:33:39 swdev14 kernel: io scheduler cfq registered
Oct 12 15:33:39 swdev14 kernel: RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Oct 12 15:33:39 swdev14 kernel: Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
Oct 12 15:33:39 swdev14 kernel: ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
Oct 12 15:33:39 swdev14 kernel: ICH4: IDE controller at PCI slot 0000:00:1f.1
Oct 12 15:33:39 swdev14 kernel: PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
Oct 12 15:33:39 swdev14 kernel: ICH4: chipset revision 2
Oct 12 15:33:39 swdev14 kernel: ICH4: not 100%% native mode: will probe irqs later
Oct 12 15:33:39 swdev14 kernel: ide0: BM-DMA at 0x1460-0x1467, BIOS settings: hda:DMA, hdb:pio
Oct 12 15:33:39 swdev14 kernel: ide1: BM-DMA at 0x1468-0x146f, BIOS settings: hdc:DMA, hdd:pio
Oct 12 15:33:39 swdev14 kernel: hda: WDC WD800BB-75CAA0, ATA DISK drive
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ14...
Oct 12 15:33:39 swdev14 kernel: elevator: using anticipatory as default io scheduler
Oct 12 15:33:39 swdev14 kernel: ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Oct 12 15:33:39 swdev14 kernel: hdc: SONY DVD RW DW-U18A, ATAPI CD/DVD-ROM drive
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ15...
Oct 12 15:33:39 swdev14 kernel: ide1 at 0x170-0x177,0x376 on irq 15
Oct 12 15:33:39 swdev14 kernel: hda: max request size: 128KiB
Oct 12 15:33:39 swdev14 kernel: IRQ#14 thread started up.
Oct 12 15:33:39 swdev14 kernel: hda: Host Protected Area detected.
Oct 12 15:33:39 swdev14 kernel: ^Icurrent capacity is 156250000 sectors (80000 MB)
Oct 12 15:33:39 swdev14 kernel: ^Inative capacity is 156301488 sectors (80026 MB)
Oct 12 15:33:39 swdev14 kernel: hda: Host Protected Area disabled.
Oct 12 15:33:39 swdev14 kernel: hda: 156301488 sectors (80026 MB) w/2048KiB Cache, CHS=65535/16/63, UDMA(100)
Oct 12 15:33:39 swdev14 kernel: hda: hda1 hda2 < hda5 hda6 hda7 >
Oct 12 15:33:39 swdev14 kernel: mice: PS/2 mouse device common for all mice
Oct 12 15:33:39 swdev14 kernel: IRQ#12 thread started up.
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ1...
Oct 12 15:33:39 swdev14 kernel: IRQ#1 thread started up.
Oct 12 15:33:39 swdev14 kernel: input: AT Translated Set 2 keyboard on isa0060/serio0
Oct 12 15:33:39 swdev14 kernel: md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 2
Oct 12 15:33:39 swdev14 kernel: IP: routing cache hash table of 2048 buckets, 32Kbytes
Oct 12 15:33:39 swdev14 kernel: TCP: Hash tables configured (established 16384 bind 21845)
Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 1
Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 17
Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 8
Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 20
Oct 12 15:33:39 swdev14 kernel: Starting balanced_irq
Oct 12 15:33:39 swdev14 kernel: md: Autodetecting RAID arrays.
Oct 12 15:33:39 swdev14 kernel: md: autorun ...
Oct 12 15:33:39 swdev14 kernel: md: ... autorun DONE.
Oct 12 15:33:39 swdev14 kernel: RAMDISK: Compressed image found at block 0
Oct 12 15:33:39 swdev14 kernel: VFS: Mounted root (ext2 filesystem).
Oct 12 15:33:39 swdev14 kernel: kjournald starting. Commit interval 5 seconds
Oct 12 15:33:39 swdev14 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 12 15:33:39 swdev14 kernel: Freeing unused kernel memory: 272k freed
Oct 12 15:33:39 swdev14 kernel: IRQ#8 thread started up.
Oct 12 15:33:39 swdev14 kernel: usbcore: registered new driver usbfs
Oct 12 15:33:39 swdev14 kernel: usbcore: registered new driver hub
Oct 12 15:33:39 swdev14 kernel: USB Universal Host Controller Interface driver v2.2
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.0: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #1
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ11...
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.0: irq 11, io base 0x1400
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
Oct 12 15:33:39 swdev14 kernel: hub 1-0:1.0: USB hub found
Oct 12 15:33:39 swdev14 kernel: hub 1-0:1.0: 2 ports detected
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.1: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #2
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ10...
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.1: irq 10, io base 0x1420
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2
Oct 12 15:33:39 swdev14 kernel: hub 2-0:1.0: USB hub found
Oct 12 15:33:39 swdev14 kernel: hub 2-0:1.0: 2 ports detected
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.2: Intel Corp. 82801DB/DBL/DBM (ICH4/ICH4-L/ICH4-M) USB UHCI Controller #3
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ5...
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.2: irq 5, io base 0x1440
Oct 12 15:33:39 swdev14 kernel: uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 3
Oct 12 15:33:39 swdev14 kernel: hub 3-0:1.0: USB hub found
Oct 12 15:33:39 swdev14 kernel: hub 3-0:1.0: 2 ports detected
Oct 12 15:33:39 swdev14 kernel: usb 1-2: new low speed USB device using address 2
Oct 12 15:33:39 swdev14 kernel: IRQ#11 thread started up.
Oct 12 15:33:39 swdev14 kernel: input: USB HID v1.00 Mouse [Microsoft Microsoft IntelliMouse? Optical] on usb-0000:00:1d.0-2
Oct 12 15:33:39 swdev14 kernel: usbcore: registered new driver usbhid
Oct 12 15:33:39 swdev14 kernel: drivers/usb/input/hid-core.c: v2.0:USB HID core driver
Oct 12 15:33:39 swdev14 kernel: EXT3 FS on hda6, internal journal
Oct 12 15:33:39 swdev14 kernel: device-mapper: 4.1.0-ioctl (2003-12-10) initialised: [email protected]
Oct 12 15:33:39 swdev14 kernel: Adding 2048216k swap on /dev/hda5. Priority:-1 extents:1
Oct 12 15:33:39 swdev14 kernel: kjournald starting. Commit interval 5 seconds
Oct 12 15:33:39 swdev14 kernel: EXT3 FS on hda1, internal journal
Oct 12 15:33:39 swdev14 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 12 15:33:39 swdev14 kernel: kjournald starting. Commit interval 5 seconds
Oct 12 15:33:39 swdev14 kernel: EXT3 FS on hda7, internal journal
Oct 12 15:33:39 swdev14 kernel: EXT3-fs: mounted filesystem with ordered data mode.
Oct 12 15:33:39 swdev14 kernel: IA-32 Microcode Update Driver: v1.14 <[email protected]>
Oct 12 15:33:39 swdev14 kernel: microcode: No suitable data for CPU0
Oct 12 15:33:39 swdev14 kernel: microcode: No suitable data for CPU2
Oct 12 15:33:39 swdev14 kernel: microcode: No suitable data for CPU3
Oct 12 15:33:39 swdev14 kernel: microcode: No suitable data for CPU1
Oct 12 15:33:39 swdev14 kernel: parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
Oct 12 15:33:39 swdev14 kernel: parport0: irq 7 detected
Oct 12 15:33:39 swdev14 kernel: SCSI subsystem initialized
Oct 12 15:33:39 swdev14 kernel: inserting floppy driver for 2.6.9-rc4-mm1-VP-T8
Oct 12 15:33:39 swdev14 kernel: Floppy drive(s): fd0 is 1.44M
Oct 12 15:33:39 swdev14 kernel: requesting new irq thread for IRQ6...
Oct 12 15:33:39 swdev14 kernel: IRQ#6 thread started up.
Oct 12 15:33:39 swdev14 kernel: FDC 0 is a National Semiconductor PC87306
Oct 12 15:33:39 swdev14 kernel: tg3.c:v3.10 (September 14, 2004)
Oct 12 15:33:39 swdev14 kernel: eth0: Tigon3 [partno(BCM95702A20) rev 1002 PHY(5703)] (PCI:66MHz:32-bit) 10/100/1000BaseT Ethernet 00:50:45:00:9b:33
Oct 12 15:33:39 swdev14 kernel: eth0: RXcsums[1] LinkChgREG[0] MIirq[0] ASF[0] Split[0] WireSpeed[1] TSOcap[1]
Oct 12 15:33:39 swdev14 kernel: 3c59x: Donald Becker and others. http://www.scyld.com/network/vortex.html
Oct 12 15:33:39 swdev14 kernel: 0000:05:01.0: 3Com PCI 3c905C Tornado at 0x2000. Vers LK1.1.19
Oct 12 15:33:39 swdev14 kernel: IRQ#15 thread started up.
Oct 12 15:33:39 swdev14 kernel: hdc: ATAPI 40X DVD-ROM DVD-R CD-R/RW drive, 2048kB Cache, UDMA(33)
Oct 12 15:33:39 swdev14 kernel: Uniform CD-ROM driver Revision: 3.20
Oct 12 15:33:39 swdev14 kernel: ip_tables: (C) 2000-2002 Netfilter core team
Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 10
Oct 12 15:33:39 swdev14 kernel: IPv6 over IPv4 tunneling driver
Oct 12 15:33:40 swdev14 kernel: ------------[ cut here ]------------
Oct 12 15:33:40 swdev14 kernel: kernel BUG at kernel/mutex.c:185!
Oct 12 15:33:40 swdev14 kernel: invalid operand: 0000 [#1]
Oct 12 15:33:40 swdev14 kernel: PREEMPT SMP
Oct 12 15:33:40 swdev14 kernel: Modules linked in: ipv6 autofs4 nfs lockd sunrpc iptable_filter ip_tables ide_cd cdrom 3c59x mii tg3 floppy sg scsi_mod parport_pc parport microcode dm_mod evdev usbhid uhci_hcd usbcore ext3 jbd
Oct 12 15:33:40 swdev14 kernel: CPU: 0
Oct 12 15:33:40 swdev14 kernel: EIP: 0060:[<c0133ba4>] Not tainted VLI
Oct 12 15:33:40 swdev14 kernel: EFLAGS: 00010246 (2.6.9-rc4-mm1-VP-T8)
Oct 12 15:33:40 swdev14 kernel: EIP is at _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: eax: 00000000 ebx: 00000000 ecx: c0350f80 edx: c1406b80
Oct 12 15:33:40 swdev14 kernel: esi: da261ac8 edi: da7b1814 ebp: de661f18 esp: de661f18
Oct 12 15:33:40 swdev14 kernel: ds: 007b es: 007b ss: 0068 preempt: 00000002
Oct 12 15:33:40 swdev14 kernel: Process sshd (pid: 2835, threadinfo=de660000 task=debaea80)
Oct 12 15:33:40 swdev14 kernel: Stack: de661f44 c0253738 c0350f80 00000016 c029a33a c16caa80 da261c94 da261bfc
Oct 12 15:33:40 swdev14 kernel: c16caa80 da261a80 ffffffea de661f5c c0275a8c da261a80 00000005 c16caa80
Oct 12 15:33:40 swdev14 kernel: 00000003 de661f78 c02287ea c16caa80 00000005 00000000 00000004 08090bf0
Oct 12 15:33:40 swdev14 kernel: Call Trace:
Oct 12 15:33:40 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:40 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:40 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:40 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:40 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:40 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:40 swdev14 kernel: Code: 75 fc 89 ec 5d c3 55 89 e5 e8 39 fb fd ff 8b 4d 08 8b 01 85 c0 74 14 8d 41 04 ba ff ff 00 00 f0 0f c1 10 0f 85 6e 03 00 00 5d c3 <0f> 0b b9 00 9a c6 2a c0 eb e2 55 89 e5 e8 0a fb fd ff 8b 4d 08
Oct 12 15:33:40 swdev14 kernel: <3>scheduling while atomic: sshd/0x04000001/2835
Oct 12 15:33:40 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c02991cf>] schedule+0xbaf/0xbe2
Oct 12 15:33:40 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:40 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:40 swdev14 kernel: [<c0134864>] check_preempt_timing+0x191/0x1f9
Oct 12 15:33:40 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:40 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:40 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c011f5c8>] profile_task_exit+0x18/0x56
Oct 12 15:33:40 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 15:33:40 swdev14 kernel: [<c01216e0>] do_exit+0x1f/0x3bd
Oct 12 15:33:40 swdev14 kernel: [<c0299213>] preempt_schedule+0x11/0x7a
Oct 12 15:33:40 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 15:33:40 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 15:33:40 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 15:33:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 15:33:40 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:40 swdev14 kernel: [<c029a11e>] _write_lock+0x1b/0x76
Oct 12 15:33:40 swdev14 kernel: [<c026436e>] tcp_listen_wlock+0x16/0xac
Oct 12 15:33:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 15:33:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 15:33:40 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:40 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:40 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:40 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:40 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:40 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:40 swdev14 kernel: note: sshd[2835] exited with preempt_count 1
Oct 12 15:33:40 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2835
Oct 12 15:33:40 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c02991cf>] schedule+0xbaf/0xbe2
Oct 12 15:33:40 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:40 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:40 swdev14 kernel: [<c0134864>] check_preempt_timing+0x191/0x1f9
Oct 12 15:33:40 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:40 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:40 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c014b7e5>] unmap_vmas+0x190/0x29b
Oct 12 15:33:40 swdev14 kernel: [<c0150037>] exit_mmap+0xb2/0x1cc
Oct 12 15:33:40 swdev14 kernel: [<c011c96d>] mmput+0x3b/0xb9
Oct 12 15:33:40 swdev14 kernel: [<c01217ef>] do_exit+0x12e/0x3bd
Oct 12 15:33:40 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 15:33:40 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 15:33:40 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 15:33:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 15:33:40 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:40 swdev14 kernel: [<c029a11e>] _write_lock+0x1b/0x76
Oct 12 15:33:40 swdev14 kernel: [<c026436e>] tcp_listen_wlock+0x16/0xac
Oct 12 15:33:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 15:33:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 15:33:40 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:40 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:40 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:40 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:40 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:40 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:40 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:40 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2835
Oct 12 15:33:40 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c02991cf>] schedule+0xbaf/0xbe2
Oct 12 15:33:40 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:40 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:40 swdev14 kernel: [<c0134864>] check_preempt_timing+0x191/0x1f9
Oct 12 15:33:40 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:40 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:40 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:40 swdev14 kernel: [<c014b7e5>] unmap_vmas+0x190/0x29b
Oct 12 15:33:40 swdev14 kernel: [<c0150037>] exit_mmap+0xb2/0x1cc
Oct 12 15:33:40 swdev14 kernel: [<c011c96d>] mmput+0x3b/0xb9
Oct 12 15:33:40 swdev14 kernel: [<c01217ef>] do_exit+0x12e/0x3bd
Oct 12 15:33:40 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 15:33:40 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 15:33:40 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 15:33:40 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 15:33:40 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:40 swdev14 kernel: [<c029a11e>] _write_lock+0x1b/0x76
Oct 12 15:33:40 swdev14 kernel: [<c026436e>] tcp_listen_wlock+0x16/0xac
Oct 12 15:33:40 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 15:33:40 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 15:33:40 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:40 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:41 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:41 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:41 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:41 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:41 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:41 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:41 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2835
Oct 12 15:33:41 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 15:33:41 swdev14 kernel: [<c02991cf>] schedule+0xbaf/0xbe2
Oct 12 15:33:41 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:41 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:41 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:41 swdev14 kernel: [<c0134864>] check_preempt_timing+0x191/0x1f9
Oct 12 15:33:41 swdev14 kernel: [<c0134912>] touch_preempt_timing+0x46/0x4a
Oct 12 15:33:41 swdev14 kernel: [<c0299641>] cond_resched+0x26/0x83
Oct 12 15:33:41 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:41 swdev14 kernel: [<c014b7e5>] unmap_vmas+0x190/0x29b
Oct 12 15:33:41 swdev14 kernel: [<c0150037>] exit_mmap+0xb2/0x1cc
Oct 12 15:33:41 swdev14 kernel: [<c011c96d>] mmput+0x3b/0xb9
Oct 12 15:33:41 swdev14 kernel: [<c01217ef>] do_exit+0x12e/0x3bd
Oct 12 15:33:41 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 15:33:41 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 15:33:41 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 15:33:41 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 15:33:41 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:41 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:41 swdev14 kernel: [<c029a11e>] _write_lock+0x1b/0x76
Oct 12 15:33:41 swdev14 kernel: [<c026436e>] tcp_listen_wlock+0x16/0xac
Oct 12 15:33:41 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 15:33:41 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 15:33:41 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:41 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:41 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:41 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:41 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:41 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:41 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:41 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:41 swdev14 kernel: scheduling while atomic: sshd/0x04000001/2835
Oct 12 15:33:41 swdev14 kernel: caller is cond_resched+0x63/0x83
Oct 12 15:33:41 swdev14 kernel: [<c02991cf>] schedule+0xbaf/0xbe2
Oct 12 15:33:41 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:41 swdev14 kernel: [<c0113d1c>] smp_apic_timer_interrupt+0x79/0xe4
Oct 12 15:33:41 swdev14 kernel: [<c0134914>] touch_preempt_timing+0x48/0x4a
Oct 12 15:33:41 swdev14 kernel: [<c0106bb6>] apic_timer_interrupt+0x1a/0x20
Oct 12 15:33:41 swdev14 kernel: [<c0134914>] touch_preempt_timing+0x48/0x4a
Oct 12 15:33:41 swdev14 kernel: [<c029967e>] cond_resched+0x63/0x83
Oct 12 15:33:41 swdev14 kernel: [<c014dfa3>] remove_vm_struct+0x1f/0x9f
Oct 12 15:33:41 swdev14 kernel: [<c01500f7>] exit_mmap+0x172/0x1cc
Oct 12 15:33:41 swdev14 kernel: [<c011c96d>] mmput+0x3b/0xb9
Oct 12 15:33:41 swdev14 kernel: [<c01217ef>] do_exit+0x12e/0x3bd
Oct 12 15:33:41 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 15:33:41 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 15:33:41 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 15:33:41 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 15:33:41 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:41 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:41 swdev14 kernel: [<c029a11e>] _write_lock+0x1b/0x76
Oct 12 15:33:41 swdev14 kernel: [<c026436e>] tcp_listen_wlock+0x16/0xac
Oct 12 15:33:41 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 15:33:41 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 15:33:41 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:41 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:41 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:41 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:41 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:41 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:41 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:41 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:41 swdev14 kernel: scheduling while atomic: sshd/0x00000001/2835
Oct 12 15:33:41 swdev14 kernel: caller is __down+0x8a/0x107
Oct 12 15:33:41 swdev14 kernel: [<c02991cf>] schedule+0xbaf/0xbe2
Oct 12 15:33:41 swdev14 kernel: [<c02983fe>] __down+0x8a/0x107
Oct 12 15:33:41 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:41 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:41 swdev14 kernel: [<c029a2c0>] _spin_unlock_irqrestore+0xb/0x36
Oct 12 15:33:41 swdev14 kernel: [<c02983f9>] __down+0x85/0x107
Oct 12 15:33:41 swdev14 kernel: [<c01136d4>] mcount+0x14/0x18
Oct 12 15:33:41 swdev14 kernel: [<c02983fe>] __down+0x8a/0x107
Oct 12 15:33:41 swdev14 kernel: [<c011a368>] default_wake_function+0x0/0x1c
Oct 12 15:33:41 swdev14 kernel: [<c02985c4>] __down_failed+0x8/0xc
Oct 12 15:33:41 swdev14 kernel: [<c011c3cc>] .text.lock.sched+0x5/0x15
Oct 12 15:33:41 swdev14 kernel: [<c01ccb0d>] disassociate_ctty+0x1d/0x16d
Oct 12 15:33:41 swdev14 kernel: [<c01218d9>] do_exit+0x218/0x3bd
Oct 12 15:33:41 swdev14 kernel: [<c0107780>] do_invalid_op+0x0/0x10b
Oct 12 15:33:41 swdev14 kernel: [<c01073e2>] do_divide_error+0x0/0x131
Oct 12 15:33:41 swdev14 kernel: [<c0117898>] fixup_exception+0x1c/0x38
Oct 12 15:33:41 swdev14 kernel: [<c0107889>] do_invalid_op+0x109/0x10b
Oct 12 15:33:41 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:41 swdev14 kernel: [<c0134129>] __mcount+0x1d/0x21
Oct 12 15:33:41 swdev14 kernel: [<c029a11e>] _write_lock+0x1b/0x76
Oct 12 15:33:41 swdev14 kernel: [<c026436e>] tcp_listen_wlock+0x16/0xac
Oct 12 15:33:41 swdev14 kernel: [<c0106c31>] error_code+0x2d/0x38
Oct 12 15:33:41 swdev14 kernel: [<c011007b>] generic_set_mtrr+0x68/0x9c
Oct 12 15:33:41 swdev14 kernel: [<c0133ba4>] _rw_mutex_write_unlock+0x25/0x2f
Oct 12 15:33:41 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
Oct 12 15:33:41 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
Oct 12 15:33:41 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
Oct 12 15:33:41 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
Oct 12 15:33:41 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
Oct 12 15:33:41 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
Oct 12 15:33:41 swdev14 kernel: [<c0106175>] sysenter_past_esp+0x52/0x71
Oct 12 15:33:47 swdev14 ntpdate[2873]: step time server 159.82.80.54 offset -0.490683 sec
Oct 12 15:33:47 swdev14 ntpd: succeeded
Oct 12 15:33:47 swdev14 ntpd[2877]: ntpd [email protected] Thu Mar 11 11:46:39 EST 2004 (1)
Oct 12 15:33:47 swdev14 ntpd: ntpd startup succeeded
Oct 12 15:33:47 swdev14 ntpd[2877]: precision = 1.000 usec
Oct 12 15:33:47 swdev14 ntpd[2877]: kernel time sync status 0040
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "opus" unknown, line ignored
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "hal" unknown, line ignored
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "wizard" unknown, line ignored
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "time1.utc.com" unknown, line ignored
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "time2.utc.com" unknown, line ignored
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "time3.utc.com" unknown, line ignored
Oct 12 15:33:47 swdev14 vsftpd: vsftpd vsftpd succeeded
Oct 12 15:33:47 swdev14 ntpd[2877]: frequency initialized 117.670 PPM from /var/lib/ntp/drift
Oct 12 15:33:47 swdev14 ntpd[2877]: configure: keyword "authenticate" unknown, line ignored
Oct 12 15:33:47 swdev14 gpm[2896]: *** info [startup.c(95)]:
Oct 12 15:33:47 swdev14 gpm[2896]: Started gpm successfully. Entered daemon mode.
Oct 12 15:33:47 swdev14 gpm[2896]: *** info [mice.c(1766)]:
Oct 12 15:33:47 swdev14 gpm[2896]: imps2: Auto-detected intellimouse PS/2
Oct 12 15:33:48 swdev14 gpm: gpm startup succeeded
Oct 12 15:33:48 swdev14 crond: crond startup succeeded
Oct 12 15:33:50 swdev14 kernel: lp0: using parport0 (polling).
Oct 12 15:33:50 swdev14 kernel: lp0: console ready
Oct 12 15:36:04 swdev14 syslogd 1.4.1: restart.
* K.R. Foley <[email protected]> wrote:
> Booted part way this time but then soft locked. [...]
the soft lock it seems was due to a mutex assert in the TCP code. Now i
see that you have the ipv6 module loaded. Would it be possible to
disable ipv6 briefly, to see whether there are other issues?
> Mouse was still working but not KB. Any idea why the KB is not working
> with these? I am rebuilding with all of the pwr management off because
> I am still getting acpi messages in the boot :-/ [...]
hm, the keyboard should be working - i have not seen any specific
keyboard problems with PREEMPT_REALTIME enabled (except in the very
early stages of this feature). So it must be something else. Is this a
PS2 or a USB keyboard?
> Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 10
> Oct 12 15:33:39 swdev14 kernel: IPv6 over IPv4 tunneling driver
> Oct 12 15:33:40 swdev14 kernel: ------------[ cut here ]------------
> Oct 12 15:33:40 swdev14 kernel: kernel BUG at kernel/mutex.c:185!
> Oct 12 15:33:40 swdev14 kernel: EIP is at _rw_mutex_write_unlock+0x25/0x2f
> Oct 12 15:33:40 swdev14 kernel: Call Trace:
> Oct 12 15:33:40 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
> Oct 12 15:33:40 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
> Oct 12 15:33:40 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
> Oct 12 15:33:40 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
> Oct 12 15:33:40 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
> Oct 12 15:33:40 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
this assert says that a write_unlock() was done before the mutex was
initialized - which is a no-no. Note that the stock kernel does not do
this checking and there's a chance that it has a true bug here which it
ignores silently. The more likely scenario is that the kernel mutex code
somewhere changed an assumption which broke the code. I'll try to
reproduce this.
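
To illustrate the kind of check described above, here is a minimal sketch of a debugging lock that refuses an unlock before initialization. It is not the code at kernel/mutex.c:185 (which is not shown in this thread); the class, method and helper names are hypothetical, and the model ignores contention entirely.

```python
class DebugRwMutex:
    """Toy write lock that remembers whether init() was ever called."""

    def __init__(self):
        self.initialized = False
        self.write_locked = False

    def init(self):
        self.initialized = True
        self.write_locked = False

    def write_lock(self):
        if not self.initialized:
            raise RuntimeError("write_lock() on uninitialized mutex")
        self.write_locked = True

    def write_unlock(self):
        # The situation described above: unlocking a lock that was never set up.
        if not self.initialized:
            raise RuntimeError("write_unlock() on uninitialized mutex")
        if not self.write_locked:
            raise RuntimeError("write_unlock() without write_lock()")
        self.write_locked = False


def unlock_is_rejected(mutex):
    """Return True if write_unlock() trips one of the debugging checks."""
    try:
        mutex.write_unlock()
    except RuntimeError:
        return True
    return False
```

A lock type without such a check would simply carry on, which is the "ignores silently" case mentioned above.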
Ingo
i've uploaded the -T9 VP patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T9
this is a bugfixes-only release that should fix the highmem-related
issues reported by K.R. Foley and Mark H. Johnson: 3 more locks had to
be converted to raw.
to create a -T9 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T9
Ingo
Ingo Molnar wrote:
>
> i've uploaded the -T9 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T9
>
Built and running on my laptop (P4/UP). Usual dmesg info goes attached,
where some exposed traces need a look.
OTOH, my desktop (P4/SMP/HT) refuses to boot/init all the way up. I guess
you know the picture ;) Couldn't get anything out of the mess. Will try
later.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Booted part way this time but then soft locked. [...]
>
>
> the soft lock it seems was due to a mutex assert in the TCP code. Now i
> see that you have the ipv6 module loaded. Would it be possible to
> disable ipv6 briefly, to see whether there are other issues?
Yes. I don't need it and will disable it when building T9.
>
>
>>Mouse was still working but not KB. Any idea why the KB is not working
>>with these? I am rebuilding with all of the pwr management off because
>>I am still getting acpi messages in the boot :-/ [...]
>
>
> hm, the keyboard should be working - i have not seen any specific
> keyboard problems with PREEMPT_REALTIME enabled (except in the very
> early stages of this feature). So it must be something else. Is this a
> PS2 or a USB keyboard?
PS2 keyboard. For clarification, I have intermittent problems with the
keyboard on this machine with anything to do with preempt. Sometimes it
works on boot sometimes it does not. Actually since it is so
unpredictable, I can't even say for sure that it has to do with preempt,
but it hasn't failed yet on rc4-mm1. I have tried to find a pattern and
have yet to do so. Possibly pertinent info:
Iwill MB dual 2.6G Xeon 512 ram
00:00.0 Host bridge: Intel Corp. E7505 Memory Controller Hub (rev 03)
00:00.1 Class ff00: Intel Corp. E7000 Series RAS Controller (rev 03)
00:01.0 PCI bridge: Intel Corp. E7000 Series Processor to AGP Controller (rev 03)
00:02.0 PCI bridge: Intel Corp. E7000 Series Hub Interface B PCI-to-PCI Bridge (rev 03)
00:1d.0 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #1 (rev 02)
00:1d.1 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #2 (rev 02)
00:1d.2 USB Controller: Intel Corp. 82801DB (ICH4) USB UHCI #3 (rev 02)
00:1d.7 USB Controller: Intel Corp. 82801DB (ICH4) USB2 EHCI Controller (rev 02)
00:1e.0 PCI bridge: Intel Corp. 82801BA/CA/DB/EB/ER Hub interface to PCI Bridge (rev 82)
00:1f.0 ISA bridge: Intel Corp. 82801DB (ICH4) LPC Bridge (rev 02)
00:1f.1 IDE interface: Intel Corp. 82801DB (ICH4) Ultra ATA 100 Storage Controller (rev 02)
00:1f.3 SMBus: Intel Corp. 82801DB/DBM (ICH4) SMBus Controller (rev 02)
01:00.0 VGA compatible controller: nVidia Corporation NV17 [GeForce4 MX 440] (rev a3)
02:1c.0 PIC: Intel Corp. 82870P2 P64H2 I/OxAPIC (rev 03)
02:1d.0 PCI bridge: Intel Corp. 82870P2 P64H2 Hub PCI Bridge (rev 03)
02:1e.0 PIC: Intel Corp. 82870P2 P64H2 I/OxAPIC (rev 03)
02:1f.0 PCI bridge: Intel Corp. 82870P2 P64H2 Hub PCI Bridge (rev 03)
04:04.0 Ethernet controller: Broadcom Corporation NetXtreme BCM5702X Gigabit Ethernet (rev 02)
05:01.0 Ethernet controller: 3Com Corporation 3c905C-TX/TX-M [Tornado] (rev 78)
>
>
>>Oct 12 15:33:39 swdev14 kernel: NET: Registered protocol family 10
>>Oct 12 15:33:39 swdev14 kernel: IPv6 over IPv4 tunneling driver
>>Oct 12 15:33:40 swdev14 kernel: ------------[ cut here ]------------
>>Oct 12 15:33:40 swdev14 kernel: kernel BUG at kernel/mutex.c:185!
>
>
>>Oct 12 15:33:40 swdev14 kernel: EIP is at _rw_mutex_write_unlock+0x25/0x2f
>
>
>>Oct 12 15:33:40 swdev14 kernel: Call Trace:
>>Oct 12 15:33:40 swdev14 kernel: [<c0253738>] tcp_listen_start+0x175/0x1d1
>>Oct 12 15:33:40 swdev14 kernel: [<c029a33a>] _spin_unlock_bh+0x1a/0x34
>>Oct 12 15:33:40 swdev14 kernel: [<c0275a8c>] inet_listen+0x65/0x7a
>>Oct 12 15:33:40 swdev14 kernel: [<c02287ea>] sys_listen+0x5c/0x74
>>Oct 12 15:33:40 swdev14 kernel: [<c02294a4>] sys_socketcall+0xb1/0x239
>>Oct 12 15:33:40 swdev14 kernel: [<c015ae81>] sys_close+0x75/0x91
>
>
> this assert says that a write_unlock() was done before the mutex was
> initialized - which is a no-no. Note that the stock kernel does not do
> this checking and there's a chance that it has a true bug here which it
> ignores silently. The more likely scenario is that the kernel mutex code
> somewhere changed an assumption which broke the code. I'll try to
> reproduce this.
>
> Ingo
>
Ingo Molnar wrote:
> i've uploaded the -T9 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T9
>
OK. This one actually boots all the way into X and even shuts down
cleanly (no errors either way). Still no keyboard, which is why I had to
shut it down. :) Does this indicate that the keyboard is actually being
detected or not?
Oct 13 09:29:59 swdev14 kernel: requesting new irq thread for IRQ1...
Oct 13 09:29:59 swdev14 kernel: IRQ#1 thread started up.
Oct 13 09:29:59 swdev14 kernel: input: AT Translated Set 2 keyboard on isa0060/serio0
We're getting there. This is with ipv6 disabled, btw.
kr
>i've uploaded the -T9 VP patch:
>
>http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-T9
I may have been lucky with -T8, which made it through a complete boot and
shutdown cycle (spewing lots of messages), but -T9 appeared to start OK and
then died with the following messages displayed. [my back was turned, so I
am not sure which step in the init sequence was active, trying again...]
[<c0114b60>] mcount+0x14/0x18
[<c011d0a1>] __wake_up_common+0x51/0x80
[<c011d113>] __wake_up+0x43/0x60
[<c0313d54>] __up_wakeup+0x8/0xc
[<c0139685>] .text.lock.mutex+0x2d/0x148
[<c01c50cb>] avc_has_perm_noaudit+0xab/0x1d0
[<c01c5204>] avc_has_perm+0x14/0x80
[<c01c67c4>] inode_has_perm+0x64/0x90
[<c0114b60>] mcount+0x14/0x18
[<c01c5233>] avc_has_perm+0x43/0x80
[<c01c97f7>] file_map_prot_check+0x117/0x120
[<c0114b60>] mcount+0x14/0x18
[<c01c67c4>] inode_has_perm+0x64/0x90
[<c01c8a6e>] selinux_inode_getattr+0x5e/0x70
[<c0114b60>] mcount+0x14/0x18
[<c01398ed>] __mcount+0x1d/0x30
[<c01c96ee>] file_map_prot_check+0xe/0x120
[<c0159635>] do_mmap_pgoff+0x205/0x790
[<c0114b60>] mcount+0x14/0x18
[<c01c97f7>] file_map_prot_check+0x117/0x120
[<c0159635>] do_mmap_pgoff+0x205/0x790
[<c0107b5b>] syscall_call+0x7/0xb
Code: 00 55 89 e5 83 ec 0c 89 5d f8 89 75 fc e8 6b f7 df ff c7 04 24 01 00 00 00 89 c3 e8 e1 4d e2 ff be 00 e0 ff ff 21 e6 31 c0 86 03
Had to cycle power to get system to reboot; watching more closely this
time.
Of course, the second time it boots all the way but was spewing some (not a
lot) of messages related to the network interface I have (8139too). I tried
to login but the system hung before displaying anything interesting.
Rebooted with -T3 again to get some work done. Will send the boot messages
separately.
--Mark H Johnson
<mailto:[email protected]>
On Thu, 2004-10-07 at 06:52, Ingo Molnar wrote:
> i've released the -T3 VP patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc3-mm3-T3
>
Ingo, here's the data from an 87 million sample run (about 24 hours) of
my jackd test. Note the 1292 usec outlier. This is definitely not
latency trace overhead as it was disabled. I suspect either the
rt_garbage_collect or journal_clean_checkpoint_list code path is
responsible.
Delay (usec) # samples
------------ ---------
0 86798124
1 122
2 113
3 118
4 109
5 96
6 78
7 66
8 66
9 51
10 52
11 41
12 39
13 43
14 43
15 30
16 17
17 18
18 28
19 17
20 18
21 16
22 19
23 19
24 19
25 15
26 26
27 13
28 13
29 10
30 17
31 19
32 12
33 12
34 17
35 7
36 16
37 13
38 7
39 10
40 9
41 9
42 9
43 8
44 11
45 18
46 12
47 13
48 6
49 12
50 12
51 13
52 8
53 11
54 14
55 5
56 5
57 11
58 8
59 11
60 8
61 5
62 11
63 8
64 9
65 11
66 9
67 8
68 8
69 8
70 8
71 8
72 11
73 10
74 10
75 8
76 2
77 7
78 4
79 9
80 5
81 4
82 2
83 7
84 6
85 1
86 7
87 5
88 10
89 7
90 3
91 7
92 7
93 7
94 1
95 7
96 5
97 7
98 2
99 3
100 3
101 4
102 4
103 4
104 5
105 3
106 5
107 2
108 4
109 1
110 2
111 2
112 3
113 2
115 1
116 3
117 2
118 3
119 3
120 1
121 2
122 2
123 2
124 1
127 1
128 2
129 2
130 2
133 1
135 1
139 1
141 2
145 1
147 1
152 1
156 1
169 1
173 1
177 1
187 1
194 2
233 1
242 1
290 1
352 1
1292 1
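
A histogram like the one above can be summarized in a few lines. This is a minimal sketch, assuming the delay column is in microseconds (as the 1292 usec remark suggests); the function name and the 200 usec threshold are illustrative choices, and EXCERPT holds only a handful of rows copied from the table, not the full data set.

```python
def summarize(hist, threshold_usec):
    """Return (total samples, samples above threshold, worst delay)."""
    total = sum(hist.values())
    over = sum(n for delay, n in hist.items() if delay > threshold_usec)
    worst = max(hist)
    return total, over, worst


# A few rows from the table above: delay in usec -> number of samples.
EXCERPT = {0: 86798124, 1: 122, 2: 113, 290: 1, 352: 1, 1292: 1}

if __name__ == "__main__":
    total, over, worst = summarize(EXCERPT, 200)
    print(f"{total} samples, {over} above 200 usec, worst {worst} usec")
```

Feeding the full table into the same function would give the totals for the whole run.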
Lee
On Tue, 2004-10-12 at 05:17, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > Just to recap, these are the three problem areas that still produce
> > latencies over 500 usec on my machine.
> >
> > journal_clean_checkpoint_list
>
> you might want to send this trace to Andrew too - the primary master of
> ext3 latency-breaking.
>
OK, Andrew, here it is. This is one of the last 2 or 3 code paths that
can still produce latencies > 200 usecs on a typical machine.
--
Also, I am still seeing some long latencies in the ext3 journaling code:
preemption latency trace v1.0.7 on 2.6.9-rc3-mm3-VP-T3
-------------------------------------------------------
latency: 607 us, entries: 1087 (1087) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: kjournald/687, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: journal_commit_transaction+0x75/0x2830
=> ended at: __journal_clean_checkpoint_list+0xb2/0xf0
=======>
00000001 0.000ms (+0.003ms): journal_commit_transaction (kjournald)
Here is the loop:
00000002 0.003ms (+0.001ms): kfree (journal_commit_transaction)
00000001 0.004ms (+0.001ms): journal_refile_buffer (journal_commit_transaction)
00000003 0.006ms (+0.000ms): __journal_refile_buffer (journal_refile_buffer)
00000003 0.006ms (+0.001ms): __journal_unfile_buffer (journal_refile_buffer)
00000002 0.008ms (+0.000ms): journal_remove_journal_head (journal_refile_buffer)
00000003 0.008ms (+0.000ms): __journal_remove_journal_head (journal_remove_journal_head)
00000003 0.009ms (+0.000ms): __brelse (__journal_remove_journal_head)
00000003 0.010ms (+0.000ms): journal_free_journal_head (journal_remove_journal_head)
00000003 0.010ms (+0.001ms): kmem_cache_free (journal_free_journal_head)
00000001 0.011ms (+0.000ms): __brelse (journal_commit_transaction)
[end loop]
00000002 0.012ms (+0.000ms): kfree (journal_commit_transaction)
00000001 0.013ms (+0.000ms): journal_refile_buffer (journal_commit_transaction)
Lee
On Wed, 13 Oct 2004 08:15:18 +0200, Ingo Molnar <[email protected]> wrote:
>
> i've uploaded the -T9 VP patch:
>
There are lots of similar warnings in Reiser4:
CC fs/reiser4/plugin/node/node.o
In file included from include/linux/spinlock.h:16,
from include/linux/wait.h:25,
from include/linux/fs.h:12,
from fs/reiser4/plugin/node/../../reiser4.h:13,
from fs/reiser4/plugin/node/../../debug.h:9,
from fs/reiser4/plugin/node/node.c:47:
include/asm/mutex.h:75:5: warning: "RWSEM_DEBUG" is not defined
Also, there is an easy to fix compile error - redefinition of
inode_lock in fs/reiser4/plugin/object.c
i'm pleased to announce a significantly improved version of the
Real-Time Preemption (PREEMPT_REALTIME) feature that i have been working
towards in the past couple of weeks.
the patch (against 2.6.9-rc4-mm1) can be downloaded from:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
this is i believe the first correct conversion of the Linux kernel to a
fully preemptible (fully mutex-based) preemption model, while still
keeping all locking properties of Linux.
I also think that this feature can and should be integrated into the
upstream kernel sometime in the future. It will need improvements and
fixes and lots of testing, but i believe the basic concept is sound and
inclusion is manageable and desirable. So comments, testing and feedback
are more than welcome!
to recap the conceptual issues that needed solving: the previous patch
already converted a fair portion of spinlocks to mutexes, but in a
fully preemptible kernel model the following locking primitives are
especially problematic:
- RCU locking
- per-cpu counters and per-cpu variables
- tlb gather/release logic
- seqlocks
- atomic-kmaps
- pte mapping
note that while we tend to think about these as SMP locking primitives,
in a fully preemptible model these locking rules are necessary for
correctness, even on a single-CPU embedded box.
Unfortunately none of the existing preemption patches solve these
problems: they concentrate on UP but these locking rules must not be
ignored on UP either!
(Bill Huey's mutex patch he just released is the one notable exception i
know about, which is also a correct implementation i believe, but it
doesn't attack these locking rules [yet]. Bill's locking hierarchy is i
believe quite similar to the -T9 patch - this i believe is roughly the
best one can get when only using spinlocks as a vehicle.)
In the previous (-T9) version of the Real-Time Preemption patch, the
above locking primitives kept large portions of kernel code
non-preemptable, causing a 'spreadout' of raw spinlocks and
non-preemptible sections to various kernel subsystems.
Another, even more problematic effect was that both network drivers and
block IO drivers got 'contaminated' by raw spinlocks, triggering lots of
per-driver changes and thus limiting testability. It is basically an
opt-in model to correctness which is bad from a maintenance and
upstream acceptance point of view. The -T9 patch was a prime example of
the problems the Linux locking primitives cause in a fully preemptible
kernel model.
To solve all these fundamental problems, i improved/fixed/changed all of
these locking methods to be preemption-friendly. Most of the time it was
necessary to introduce an additional API variant because e.g.
rcu_read_lock() is anonymous (it doesn't identify the data protected), so
i introduced a variant that takes the write-lock as an argument. In the
PREEMPT_REALTIME case we can thus properly serialize on that lock.
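
The idea of a read-side variant that names the write lock can be modelled roughly as follows. This is only a sketch: `rcu_read_section` and `held_inside` are hypothetical names, not the API introduced by the patch, and the non-realtime branch is reduced to a no-op, whereas real RCU read sides have their own rules.

```python
import threading
from contextlib import contextmanager


@contextmanager
def rcu_read_section(write_lock, realtime):
    """Read-side critical section that is told which lock writers use."""
    if realtime:
        # Fully preemptible model: serialize readers on the writers' lock.
        with write_lock:
            yield
    else:
        # Other models: the argument is ignored (simplified to a no-op here).
        yield


def held_inside(realtime):
    """Report whether the write lock is held inside the read section."""
    lock = threading.Lock()
    with rcu_read_section(lock, realtime):
        return lock.locked()
```

Because the caller names the lock, the same call site works in both configurations and only the realtime build pays for the serialization.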
For per-cpu variables i introduced a new API variant that creates a
spinlock-array for the per-cpu-variable, and users must make sure the
cpu field doesn't change. Migration to another CPU can happen within the
critical section, but 'statistically' the variable is still per-CPU and
update correctness is fully preserved.
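
A rough model of a per-CPU variable guarded by an array of locks is sketched below. The class and function names are hypothetical and threads stand in for CPUs; the point is only that the slot index is sampled once and then used for both the lock and the data, so an update stays correct even if the task later runs elsewhere.

```python
import threading


class LockedPerCpuCounter:
    """One counter slot per CPU, each slot protected by its own lock."""

    def __init__(self, nr_cpus):
        self.nr_cpus = nr_cpus
        self._locks = [threading.Lock() for _ in range(nr_cpus)]
        self._values = [0] * nr_cpus

    def add(self, cpu, delta):
        # 'cpu' is fixed for the whole critical section, whatever happens
        # to the calling thread in the meantime.
        with self._locks[cpu]:
            self._values[cpu] += delta

    def total(self):
        result = 0
        for cpu in range(self.nr_cpus):
            with self._locks[cpu]:
                result += self._values[cpu]
        return result


def hammer(counter, nr_threads, iterations):
    """Update the counter from several threads that keep changing slots."""
    def worker(tid):
        for i in range(iterations):
            counter.add((tid + i) % counter.nr_cpus, 1)

    threads = [threading.Thread(target=worker, args=(t,))
               for t in range(nr_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.total()
```

Contention on any one slot stays low as long as tasks mostly use the slot of the CPU they run on, which is the 'statistically per-CPU' property.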
TLB shootdown was the source of another nasty type of critical section:
it keeps preemption disabled during much of the pagetable zapping and
also relies on a per-CPU variable to keep TLB state. The fix for
PREEMPT_REALTIME (on x86) was to implement a simpler but preemptible TLB
flushing method. This necessitated two minor API changes to the generic
TLB shootdown code - pretty straightforward ones.
Atomic kmaps were another source of non-preemptible sections:
pte_map/pte_unmap both used nontrivial functions within and ran for a
long time, creating a lock dependency and a latency problem as well. I
wrapped them via normal kmaps, which thus become preemptible. (The main
reason for atomic kmaps was non-preemptability constraints - but those
are not present in a fully preemptible model.)
seqlocks (used within the VFS and other places) were another problem:
they are now preemptible by default, the same auto-type-detection logic
applies to them as to preemptible/raw spinlocks: switching between a
preemptible and a non-preemptible type is done by changing the
prototype, the locking APIs stay the same.
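
For readers unfamiliar with seqlocks, the general technique is sketched below: writers bump a sequence counter before and after an update, and readers retry if they saw an odd or changed counter. This is a toy model with hypothetical names, not the kernel implementation; the writer side uses a sleeping lock to mirror the preemptible variant described above.

```python
import threading


class SeqLock:
    """Toy sequence lock: lock-free readers, serialized writers."""

    def __init__(self):
        self.sequence = 0
        self._writer = threading.Lock()  # stands in for the mutex

    def write(self, update):
        with self._writer:
            self.sequence += 1   # odd: update in progress
            update()
            self.sequence += 1   # even: data is stable again

    def read(self, snapshot):
        while True:
            start = self.sequence
            if start % 2:
                continue         # writer active, try again
            value = snapshot()
            if self.sequence == start:
                return value     # no writer interfered


if __name__ == "__main__":
    state = {"a": 0, "b": 0}
    lock = SeqLock()
    lock.write(lambda: state.update(a=1, b=1))
    print(lock.read(lambda: (state["a"], state["b"])))
```

The reader never blocks the writer; it only pays with a retry when an update overlaps its read.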
The improvements to locking allowed the gradual 'pulling out' of all
raw-spinlocks from various kernel subsystems. In the -U0 patch i have
almost completely finished this 'pullout', and as a result the following
kernel subsystems are now completely spinlock-free and 100% mutex-based:
- networking
- IO subsystem
- VFS and lowlevel filesystems
- memory management (except the SLAB allocator lock)
- signal code
- all of drivers/* and sound/*
note: it is important that when PREEMPT_REALTIME is disabled, the old
locking rules apply and there is no performance impact whatsoever. So
what this implements is in essence a compile-time multi-tier locking
architecture enabling 4 types of preemption models:
- stock (casual voluntary kernel preemption)
- CONFIG_PREEMPT_VOLUNTARY (lots of cond_resched() points)
- CONFIG_PREEMPT (involuntary preemption plus spinlocks)
- CONFIG_PREEMPT_REALTIME (everything is a mutex)
these models cover all the variants people are interested in: servers
with almost no latency worries, desktops with ~1msec needs and hard-RT
applications needing both microsecond-range latencies and provable
maximum latencies.
to quantitatively see the effects of these changes to the locking
landscape, here's the output of a script that disassembles the kernel
image and counts the number of actual spin lock/unlock function calls
done versus mutex lock/unlocks:
With PREEMPT_REALTIME disabled, all locks are spinlocks and old
semaphores:
spinlock API calls:         5359   (71.6%)
| old mutex API calls:      2120   (28.3%)
| new mutex API calls:         2   (0%)
all mutex API calls:        2122   (28.3%)
--------------------------------------
lock API calls:             7481   (100.0%)
with the -T9 kernel, which had the 'spread out' locking model, a
considerable portion of spinlocks were replaced by mutexes, but more
than 20% of usage was still spinlocks:
spinlock API calls:         1835   (23.1%)
| old mutex API calls:      2142   (26.9%)
| new mutex API calls:      3961   (49.8%)
all mutex API calls:        6103   (76.8%)
--------------------------------------
lock API calls:             7938   (100.0%)
here are some fresh numbers from Bill Huey's mmLinux kernel:
spinlock API calls:         2452   (30.3%)
all mutex API calls:        5614   (69.6%)
--------------------------------------
lock API calls:             8066   (100.0%)
(his mutex implementation directly falls back to up()/down() so the new
mutexes become part of the old mutexes.)
while i believe that the locking design is fundamentally incomplete in
the MontaVista kernel and thus is not directly comparable to
PREEMPT_REALTIME nor the mmLinux kernel, here are the stats from it
using a similar .config:
spinlock API calls:         1444   (26.1%)
| old mutex API calls:      2095   (37.9%)
| new mutex API calls:      1981   (35.8%)
all mutex API calls:        4076   (73.8%)
--------------------------------------
lock API calls:             5520   (100.0%)
(here it is visible that apparently a significant amount of [i believe
necessary] locking is missing from this kernel.)
finally, here are the stats from the new PREEMPT_REALTIME -U0 kernel:
spinlock API calls:          491   (6.0%)
| old mutex API calls:      2142   (26.2%)
| new mutex API calls:      5536   (67.7%)
all mutex API calls:        7678   (93.9%)
--------------------------------------
lock API calls:             8169   (100.0%)
note that almost all of the remaining spinlocks are held for a short
amount of time and have a finite maximum duration. They involve hardware
access, scheduling and interrupt handling and timers - almost all of
that code has O(1) characteristics.
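A note on reading the percentages in the tables above: they are consistent with truncation to one decimal place rather than rounding (2122 out of 7481 is about 28.37%, shown as 28.3%), which is why the spinlock and mutex shares of a table can add up to 99.9%. The counting script itself is not shown in this thread, so the following is only a sketch of that arithmetic; share_tenths() and format_share() are names invented here.

```c
#include <stdio.h>

/*
 * Share of `calls` in `total`, in tenths of a percent, truncated
 * (not rounded). Returns 0 for an empty total.
 */
static inline unsigned long share_tenths(unsigned long calls,
					 unsigned long total)
{
	if (total == 0)
		return 0;
	return calls * 1000UL / total;
}

/* Format the share as "NN.N%" into buf; returns the snprintf() result. */
static inline int format_share(char *buf, size_t len,
			       unsigned long calls, unsigned long total)
{
	unsigned long t = share_tenths(calls, total);

	return snprintf(buf, len, "%lu.%lu%%", t / 10, t % 10);
}
```

For the -U0 numbers, share_tenths(491, 8169) yields 60 and share_tenths(7678, 8169) yields 939, matching the 6.0% and 93.9% listed.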
what this means is that we are approaching hard-real-time domains ...
using what is in essence the stock Linux kernel!
note that priority inheritance is still not part of this patch, but that
effort can now be centralized to the two basic Linux semaphore types,
solving the full spectrum of priority inheritance problems!
the code is still x86-only but only for practical reasons - other
architectures will be covered in the future as well.
to create a -U0 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
Ingo
On Thu, 14 Oct 2004, Ingo Molnar wrote:
>
> i'm pleased to announce a significantly improved version of the
> Real-Time Preemption (PREEMPT_REALTIME) feature that i have been working
> towards in the past couple of weeks.
Tried this, it died.
During boot, I saw a message about scheduling while atomic for postgres, and a
stack dump. It then continued, and got all the way into x. I started to type
'ssh-'(add), and then userspace locked (ping still worked from remote).
It *was* able to save data to kern.log, which is helpful.
Also, I noticed that the swapper/0 had a high-latency critical section.
The bug that caused it to crash was in mm/highmem.c.
Attached you will find the kernel log, ksymoops output, and config. I've been
running 2.6.9-rc4 since it came out earlier this week. This is my first mm or
preempt kernel, however.
(note: I had to lowercase the version number, because make-kpkg doesn't like
uppercase)
* Adam Heath <[email protected]> wrote:
> The bug that caused it to crash was in mm/highmem.c.
could you disable HIGHMEM (or at least HIGHPTE) and try again? Some
last-minute bug slipped into that code.
Ingo
* Adam Heath <[email protected]> wrote:
> Also, I noticed that the swapper/0 had a high-latency critical
> section.
this is normal during bootup - we spend many seconds with preemption
disabled - it's fair at that stage. After it has booted up you can reset
the maximum-latency searching via:
echo 50 > /proc/sys/kernel/preempt_max_latency
i have that line in my rc.local.
Ingo
On Thu, 14 Oct 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > The bug that caused it to crash was in mm/highmem.c.
>
> could you disable HIGHMEM (or at least HIGHPTE) and try again? Some
> last-minute bug slipped into that code.
Well, it's a little better, but it still died. Just took longer.
However, this time, my kern.log got corrupted. I saw 2 scheduling while
atomic errors in dmesg(before it locked up), but only one in kern.log, and a
bunch of random data(using ext3 data=writeback). Symptoms this time around
were laggy keyboard handling, zombie processes(this may have been caused by
the scheduling while atomic problem), and ctrl-c not working.
I'll try again tomorrow, and hopefully get more data.
Ingo Molnar wrote:
> i'm pleased to announce a significantly improved version of the
> Real-Time Preemption (PREEMPT_REALTIME) feature that i have been working
> towards in the past couple of weeks.
>
> the patch (against 2.6.9-rc4-mm1) can be downloaded from:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
>
I have this built and running on my SMP system now and thus far it looks
very promising. Numbers to follow.
kr
Ingo Molnar wrote:
> i'm pleased to announce a significantly improved version of the
> Real-Time Preemption (PREEMPT_REALTIME) feature that i have been working
> towards in the past couple of weeks.
>
> the patch (against 2.6.9-rc4-mm1) can be downloaded from:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
>
Don't try to debug spinlocks as in CONFIG_DEBUG_SPINLOCK and
CONFIG_PREEMPT_REALTIME at the same time. The two are currently
incompatible.
kr
On Thu, 14 Oct 2004 02:24:33 +0200
Ingo Molnar <[email protected]> wrote:
>
> i'm pleased to announce a significantly improved version of the
> Real-Time Preemption (PREEMPT_REALTIME) feature that i have been working
> towards in the past couple of weeks.
Cool :)
Say, does it still apply that one should not use unthreaded IRQ handlers for
all IRQ's when using PREEMPT_REALTIME (Except maybe for the keyboard)?
flo
* Florian Schmidt <[email protected]> wrote:
> Cool :)
>
> Say, does it still apply that one should not use unthreaded IRQ
> handlers for all IRQ's when using PREEMPT_REALTIME (Except maybe for
> the keyboard)?
yes - and this kernel simply does not allow the un-threading of
interrupt handlers anymore, so you cannot accidentally misconfigure it.
(Not even the keyboard interrupt is an exception, it would have
lock-ripple-effects elsewhere.)
so the preferred (and only) interface to mark interrupts 'high prio' is
via process priorities. Starting from the -U1 kernel it will be possible
to do this:
chrt -f 60 -p `ps -C 'IRQ:1' -o pid=`
chrt -f 60 -p `ps -C 'IRQ:8' -o pid=`
this sets the keyboard and the RT-timer interrupt to FIFO:60.
In -U0 this is not possible because 'ps -C' does not handle kernel
threads with a space in their name. So there you'd need some wacky thing
like:
chrt -f 60 -p `ps ax -o pid= -o comm= | grep "IRQ 1$" | cut -dI -f1`
chrt -f 60 -p `ps ax -o pid= -o comm= | grep "IRQ 8$" | cut -dI -f1`
(someone should fix procps - or does it intentionally break with
whitespace command-strings?)
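For anyone who would rather set an IRQ thread's priority from a program than from a shell one-liner, the same effect can be had with the standard sched_setscheduler() call. This is a minimal sketch, not part of the patch; set_fifo_priority() is a name made up here, and the pid of the IRQ thread still has to be looked up separately (e.g. with pidof).

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Put `pid` into SCHED_FIFO at priority `prio`, roughly what
 * "chrt -f -p <prio> <pid>" asks the kernel to do.
 * Returns 0 on success or a negative errno value.
 */
static inline int set_fifo_priority(pid_t pid, int prio)
{
	struct sched_param sp = { .sched_priority = prio };

	/* Reject priorities outside the SCHED_FIFO range up front. */
	if (prio < sched_get_priority_min(SCHED_FIFO) ||
	    prio > sched_get_priority_max(SCHED_FIFO))
		return -EINVAL;

	if (sched_setscheduler(pid, SCHED_FIFO, &sp) != 0)
		return -errno;

	return 0;
}
```

Calling set_fifo_priority(irq_thread_pid, 60) needs sufficient privileges (typically root); without them the call is expected to fail with -EPERM.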
Ingo
On Thu, 14 Oct 2004 11:19:53 +0200
Ingo Molnar <[email protected]> wrote:
> In -U0 this is not possible because 'ps -C' does not handle kernel
> threads with a space in their name. So there you'd need some wacky thing
> like:
>
> chrt -f 60 -p `ps ax -o pid= -o comm= | grep "IRQ 1$" | cut -dI -f1`
> chrt -f 60 -p `ps ax -o pid= -o comm= | grep "IRQ 8$" | cut -dI -f1`
>
> (someone should fix procps - or does it intentionally break with
> whitespace command-strings?)
Hi,
thanks for the infos.
btw: i use:
chrt -f -p 99 `pidof "IRQ 5"`
for example (chrt commandline parsing is kinda braindead). It seems to work:
~$ ps -cmL `pidof "IRQ 5"`
PID LWP CLS PRI TTY STAT TIME COMMAND
110 - - - ? - 0:00 [IRQ 5]
- 110 FF 139 - S< 0:00 -
flo
* Florian Schmidt <[email protected]> wrote:
> btw: i use:
>
> chrt -f -p 99 `pidof "IRQ 5"`
ah, indeed, pidof is more robust. Ok, i changed the format from IRQ:5
back to "IRQ 5" in my tree to not break scripts.
Ingo
CC'ed jackit-devel mailing list, cause this might be interesting for them,
too.
Ah, btw: U0 booted fine here.. Seems to run allright, too (for everything
non jackd). Only thing is:
When starting jackd i get a floating point exception. Dunno where that comes from:
~$ jackd -d alsa -p 512
jackd 0.99.0
Copyright 2001-2003 Paul Davis and others.
jackd comes with ABSOLUTELY NO WARRANTY
This is free software, and you are welcome to redistribute it
under certain conditions; see the file COPYING for details
loading driver ..
creating alsa driver ... hw:0|hw:0|512|2|48000|0|0|nomon|swmeter|-|32bit
control device hw:0
configuring for 48000Hz, period = 512 frames, buffer = 2 periods
Couldn't open hw:0 for 32bit samples trying 24bit instead
Couldn't open hw:0 for 24bit samples trying 16bit instead
Couldn't open hw:0 for 32bit samples trying 24bit instead
Couldn't open hw:0 for 24bit samples trying 16bit instead
Floating point exception
running jackd in gdb locks up the gdb process i think [i'm not too
experienced in debugging stuff].
here's partial strace and ltrace logs (only the end). I have no idea if this is a jack bug
exposed by your kernel patches or a bug in your kernel patches exposed by
jackd :) But it seems to be a mutex/futex issue...
strace:
....
sched_get_priority_max(SCHED_FIFO) = 99
sched_get_priority_max(SCHED_FIFO) = 99
sched_get_priority_min(SCHED_FIFO) = 1
mmap2(NULL, 8388608, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6d42000
mprotect(0xb6d42000, 4096, PROT_NONE) = 0
clone(child_stack=0xb7541b48, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, parent_tidptr=0xb7541bf8, {entry_number:6, base_addr:0xb7541bb0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0xb7541bf8) = 1546
sched_setscheduler(1546, SCHED_OTHER, { 0 }) = 0
sched_setscheduler(1546, SCHED_FIFO, { 20 }) = 0
futex(0xb7541d94, FUTEX_WAKE, 1) = 1
ioctl(7, 0x4140, 0x1) = 0
ioctl(7, 0x4142, 0x1) = 0
sched_get_priority_max(SCHED_FIFO) = 99
sched_get_priority_min(SCHED_FIFO) = 1
mmap2(NULL, 8388608, PROT_READ|PROT_WRITE|PROT_EXEC, MAP_PRIVATE|MAP_ANONYMOUS, -1, 0) = 0xb6542000
mprotect(0xb6542000, 4096, PROT_NONE) = 0
clone(child_stack=0xb6d41b48, flags=CLONE_VM|CLONE_FS|CLONE_FILES|CLONE_SIGHAND|CLONE_THREAD|CLONE_SYSVSEM|CLONE_SETTLS|CLONE_PARENT_SETTID|CLONE_CHILD_CLEARTID|CLONE_DETACHED, parent_tidptr=0xb6d41bf8, {entry_number:6, base_addr:0xb6d41bb0, limit:1048575, seg_32bit:1, contents:0, read_exec_only:0, limit_in_pages:1, seg_not_present:0, useable:1}, child_tidptr=0xb6d41bf8) = 1547
sched_setscheduler(1547, SCHED_OTHER, { 0 }) = 0
sched_setscheduler(1547, SCHED_FIFO, { 10 }) = 0
futex(0xb6d41d94, FUTEX_WAKE, 1) = 1
+++ killed by SIGFPE +++
ltrace:
....
jack_client_alloc_internal(0x8057800, 0x8063a78, 0xbfffe310, 0x80538b1, 0xbffff758) = 0x8064d20
pthread_mutex_unlock(0x8063abc, 0x8063a78, 0xbfffe310, 0x80538b1, 0xbffff758) = 0
creating alsa driver ... hw:0|hw:0|512|2|48000|0|0|nomon|swmeter|-|32bit
control device hw:0
configuring for 48000Hz, period = 512 frames, buffer = 2 periods
Couldn't open hw:0 for 32bit samples trying 24bit instead
Couldn't open hw:0 for 24bit samples trying 16bit instead
Couldn't open hw:0 for 32bit samples trying 24bit instead
Couldn't open hw:0 for 24bit samples trying 16bit instead
free(0x8058658) = <void>
snprintf("/jck-[32 bit float mono audio]", 64, "/jck-[%s]", "32 bit float mono audio") = 30
jack_shmalloc(0xbfffd140, 262144, 0x8063b80, 0xb7fdc21c, 0) = 0
jack_attach_shm(0x8063b80, 262144, 0x8063b80, 0xb7fdc21c, 0) = 0
pthread_mutex_lock(0x8063b00, 0x8063b80, 0x8063b00, 262144, 2048) = 0
malloc(1024) = 0x806e830
malloc(8) = 0x806dd80
malloc(8) = 0x8058658
malloc(8 <unfinished ...>
+++ killed by SIGTRAP +++
Florian Schmidt wrote:
>
> CC'ed jackit-devel mailing list, cause this might be interesting for them,
> too.
>
> Ah, btw: U0 booted fine here.. Seems to run allright, too (for everything
> non jackd). Only thing is:
>
> When starting jackd i get a floating point exception. Dunno where that
> comes from:
>
> ~$ jackd -d alsa -p 512
> jackd 0.99.0
> Copyright 2001-2003 Paul Davis and others.
> jackd comes with ABSOLUTELY NO WARRANTY
> This is free software, and you are welcome to redistribute it
> under certain conditions; see the file COPYING for details
>
> loading driver ..
> creating alsa driver ... hw:0|hw:0|512|2|48000|0|0|nomon|swmeter|-|32bit
> control device hw:0
> configuring for 48000Hz, period = 512 frames, buffer = 2 periods
> Couldn't open hw:0 for 32bit samples trying 24bit instead
> Couldn't open hw:0 for 24bit samples trying 16bit instead
> Couldn't open hw:0 for 32bit samples trying 24bit instead
> Couldn't open hw:0 for 24bit samples trying 16bit instead
> Floating point exception
>
This does not happen on my laptop.
Testing also 2.6.9-rc4-mm1-U0, but a slightly custom jack 0.99.5 (cvs)
patched with "my" max_delayed_usecs suite.
And jackd it's pumping while I'm writing this lines: jackd -R -d alsa,
against bundled crapsound (ali5451).
My laptop is a P4 2.53Ghz, running on Mandrake 10.1c (gcc 3.4.1, glibc
2.3.3 NPTL).
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
On Thu, 14 Oct 2004 11:22:30 +0100 (WEST)
"Rui Nuno Capela" <[email protected]> wrote:
> > Floating point exception
> >
>
> This does not happen on my laptop.
>
> Testing also 2.6.9-rc4-mm1-U0, but a slightly custom jack 0.99.5 (cvs)
> patched with "my" max_delayed_usecs suite.
>
> And jackd it's pumping while I'm writing this lines: jackd -R -d alsa,
> against bundled crapsound (ali5451).
>
> My laptop is a P4 2.53Ghz, running on Mandrake 10.1c (gcc 3.4.1, glibc
> 2.3.3 NPTL).
Hi,
hmm, it could, of course, be again debian's infamous glibc which bites me in
the ass (as it did with ignoring pthread attributes (which still isn't fixed
afaics)). Which direction should i go with investigating this further? I
will build cvs jackd for a start.
flo
P.S.: attached is my .config
Florian Schmidt wrote:
> On Thu, 14 Oct 2004 11:22:30 +0100 (WEST)
> "Rui Nuno Capela" <[email protected]> wrote:
>
>
>>>Floating point exception
>>>
>>
>>This does not happen on my laptop.
>>
>>Testing also 2.6.9-rc4-mm1-U0, but a slightly custom jack 0.99.5 (cvs)
>>patched with "my" max_delayed_usecs suite.
>>
>>And jackd it's pumping while I'm writing this lines: jackd -R -d alsa,
>>against bundled crapsound (ali5451).
>>
>>My laptop is a P4 2.53Ghz, running on Mandrake 10.1c (gcc 3.4.1, glibc
>>2.3.3 NPTL).
>
>
> Hi,
>
> hmm, it could, of course, be again debian's infamous glibc which bites me in
> the ass (as it did with ignoring pthread attributes (which still isn't fixed
> afaics)). Which direction should i go with investigating this further? I
> will build cvs jackd for a start.
>
> flo
>
> P.S.: attached is my .config
Or maybe this:
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=m
For me, anything that needs to use setcap/getcap fails if I don't
compile in security capabilities ie. CONFIG_SECURITY_CAPABILITIES=y.
Don't know if this is your problem or not.
kr
On Thu, 14 Oct 2004 05:54:46 -0500
"K.R. Foley" <[email protected]> wrote:
> > P.S.: attached is my .config
>
> Or maybe this:
>
> #
> # Security options
> #
> # CONFIG_KEYS is not set
> CONFIG_SECURITY=y
> # CONFIG_SECURITY_NETWORK is not set
> CONFIG_SECURITY_CAPABILITIES=m
>
> For me, anything that needs to use setcap/getcap fails if I don't
> compile in security capabilities ie. CONFIG_SECURITY_CAPABILITIES=y.
> Don't know if this is your problem or not.
nah, this worked very well for all other versions of the VP patches. right
now i'm building U0 w/o PREEMPT_REALTIME, to see if i still get the
exception.
flo
> nah, this worked very well for all other versions of the VP patches. right
> now i'm building U0 w/o PREEMPT_REALTIME, to see if i still get the
> exception.
>
jackd seems to work fine w/o PREEMPT_REALTIME (otherwise identical config).
flo
On Thu, 14 Oct 2004 13:42:50 +0200
Florian Schmidt <[email protected]> wrote:
> > nah, this worked very well for all other versions of the VP patches. right
> > now i'm building U0 w/o PREEMPT_REALTIME, to see if i still get the
> > exception.
> >
>
> jackd seems to work fine w/o PREEMPT_REALTIME (otherwise identical config).
err, sorry, shouldn't have cc'ed to jackit-devel as it's subscribers only.
sorry for that.. Ah, there's nothing better than making a complete ass out
of oneself :)
flo
i have released the -U1 PREEMPT_REALTIME patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
this is a strict bugfixes-only release. With -U1 i cannot reproduce any
of the bugs on my testsystems anymore, but take care nevertheless, this
is still experimental code.
Changes since -U0:
- bugfix: fixed the highmem related crash reported by Adam Heath and i
think this could also fix the crash reported by Mark H Johnson.
- bugfix: fixed a number of networking related soft-lockups, caused by
deadlock scenarios in the ipv4, netfilter and net-xmit locking
code. This could fix the lockup reported by Lorenzo Allegrucci.
- bugfix: enable interrupts in the int3 handler - gdb will otherwise
trigger a kernel debug message.
- cleanup: reworked the RCU API wrappers, we now have the following
variants:
rcu_read_[un]lock_spin(&spinlock)
rcu_read_[un]lock_bh_spin(&spinlock)
rcu_read_[un]lock_sem(&semaphore)
this change was necessary for the network locking fixes.
- debugging helper: SysRq-T will now print the stacktrace of currently
running tasks too. (They might be a bit unreliable occasionally but
very useful to debug deadlocks.)
- configurability fix: disabled the /proc/kernel/softirq_preemption and
hardirq_preemption runtime flags (and the softirq-preempt= and
hardirq-preempt= boot flags) if PREEMPT_REALTIME is enabled - in the
fully preemptible model these must always be on.
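A sketch of how the rcu_read_[un]lock_spin() variants from the changelog above might map onto the two build types. Only the two macro names come from the changelog; the sketch_* helpers and the SKETCH_PREEMPT_REALTIME switch are hypothetical stand-ins, and the real wrappers in the patch may well look different. The assumption illustrated is that the lock argument is only used for serialization in the fully preemptible build.

```c
/* Hypothetical counters standing in for real lock state. */
struct sketch_lock { int depth; };
static int sketch_rcu_depth;

static inline void sketch_lock_acquire(struct sketch_lock *l) { l->depth++; }
static inline void sketch_lock_release(struct sketch_lock *l) { l->depth--; }
static inline void sketch_rcu_enter(void) { sketch_rcu_depth++; }
static inline void sketch_rcu_exit(void)  { sketch_rcu_depth--; }

#ifdef SKETCH_PREEMPT_REALTIME
/* Fully preemptible build: readers serialize on the writers' lock. */
# define rcu_read_lock_spin(lock)	sketch_lock_acquire(lock)
# define rcu_read_unlock_spin(lock)	sketch_lock_release(lock)
#else
/* Other builds: the lock is ignored, a plain read-side section is used. */
# define rcu_read_lock_spin(lock)	((void)(lock), sketch_rcu_enter())
# define rcu_read_unlock_spin(lock)	((void)(lock), sketch_rcu_exit())
#endif
```

The point of passing the lock is that an anonymous rcu_read_lock() gives the fully preemptible build nothing to serialize on, while the other build types can simply drop the argument.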
there are no known bugs at this moment, so please re-report any issues
you might still encounter.
to create a -U1 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
Ingo
> i have released the -U1 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
Not sure if I can bring this up to multi user yet. Some initial testing
in single user mode indicates problems when I turn on networking. See
the attached messages from /var/log/messages to see the kinds of problems
I am having. The key ones appear after doing
./S10network start
as part of single stepping the init sequence. I stopped at this point
to make sure I had a good record of the messages.
I also managed to get the machine stuck with
/sbin/reboot
not sure why.
And yes, the .config is basically the same as in all previous tests.
(no changes since my first tests with -T4...)
A side question - if
CONFIG_PREEMPT_REALTIME=y
you say that IRQ's must be threaded, is this going to be "permanent" and
if so - why?
I would prefer to not use threaded IRQ's if possible due to lower CPU
overhead [see previous email describing results...] and some problems
I see with setting priorities on those IRQ's (relative to real time tasks).
--Mark
* [email protected] <[email protected]> wrote:
> Not sure if I can bring this up to multi user yet. Some initial testing
> in single user mode indicates problems when I turn on networking. See
> the attached messages from /var/log/messages to see the kinds of problems
> I am having. The key ones appear after doing
> ./S10network start
> as part of single stepping the init sequence. I stopped at this point
> to make sure I had a good record of the messages.
could you try to disable SELINUX? It seems it's not fully safe yet.
> A side question - if
> CONFIG_PREEMPT_REALTIME=y
> you say that IRQ's must be threaded, is this going to be "permanent" and
> if so - why?
in a fully preemptible model all execution must be 'sequential', because
irq threads themselves can schedule too and could be preempted too. The
only way to make 'direct' interrupts possible again would be to disable
interrupts in _all_ non-preemptible sections, which would be quite some
work.
Another reason for the 'linearization' of as much execution as possible
is that such direct interrupts couldnt be preempted (or else you could
reenter them) which is impossible because all locks are mutexes.
a third reason is that nesting 'blocks' any underlying context. So if
task A is interrupted by irq X and schedules away (lets assume this is
safe) - nobody could unwind 'task A' - irq X blocks it until it finishes
execution. With linearized contexts 'task A' could reschedule on
another CPU - or could get its priority raised with time if an RT
deadline is approaching, etc. It's much more flexible to have everything
flattened out.
this comes at a performance cost - but basically if you implement all
the properties one would expect from such an approach you'd end up with
a completely different irq scheduler - there's no point in that. Best is
to 'merge' all contexts, hardirqs and softirqs into the normal task
concept.
> I would prefer to not use threaded IRQ's if possible due to lower CPU
> overhead [see previous email describing results...] and some problems
> I see with setting priorities on those IRQ's (relative to real time
> tasks).
the overhead we can try to optimize later on. What problems do you see
with setting priorities on those IRQs?
Ingo
On Wed, 2004-10-13 at 17:24, Ingo Molnar wrote:
> To solve all these fundamental problems, i improved/fixed/changed all of
> these locking methods to be preemption-friendly. Most of the time it was
> necessary to introduce an additional API variant because e.g.
> rcu_read_lock() is anonymous (it doesnt identify the data protected), so
> i introduced a variant that takes the write-lock as an argument. In the
> PREEMPT_REALTIME case we can thus properly serialize on that lock.
When I was reviewing this it seemed like it would be possible to keep
RCU anonymous by moving the callback processing out of the tasklet . The
reason it was moved into a tasklet was to reduce latency. But if you
serialize it like you have, aren't you removing all the benefits of the
RCU type lock in those section that are converted to the new API ?
> For per-cpu variables i introduced a new API variant that creates a
> spinlock-array for the per-cpu-variable, and users must make sure the
> cpu field doesnt change. Migration to another CPU can happen within the
> critical section, but 'statistically' the variable is still per-CPU and
> update correctness is fully preserved.
Why not have a per cpu mutex instead of a per variable per cpu mutex?
I'm not sure what the trade off are, except size.
Daniel
>> I would prefer to not use threaded IRQ's if possible due to lower CPU
>> overhead [see previous email describing results...] and some problems
>> I see with setting priorities on those IRQ's (relative to real time
>> tasks).
>
>the overhead we can try to optimize later on. What problems do you see
>with setting priorities on those IRQs?
Perhaps I am old fashioned, but in building a real time system, I consider
hardware interrupt processing as something that is always at a higher
priority than real time tasks. In general that is not a problem because
hardware interrupt processing should do just enough to keep the hardware
happy and nothing more. I have enough spare CPU cycles within each frame
to account for [could be a large number of] interrupts that follow that
approach. Unthreaded IRQ's preserves that relationship.
However, with the threaded IRQ's, a real time program (e.g., latencytest)
can request a priority higher than IRQ processing - causing problems
interfacing with devices. At a minimum, the default priority of IRQ's
should be some real time value so that nice -20 jobs won't bother them
either. A possibility that comes to mind is to schedule IRQ's at a range
higher than available to all real time application tasks. I'll mention
another possibility below as well.
In the systems I have to deal with, I do not have a clear criteria
to set priorities of interrupts relative to each other. For example, I
have a real time simulation system using the following devices:
- occasional disk access to simulate disk I/O
- real time network traffic
- real time delivery of interrupts from a PCI timer card and APIC timers
- real time interrupts from a shared memory interface
The priorities of real time tasks are basically assigned based on the
rate of execution. 80 Hz tasks run at a higher priority than 60 Hz, 60 Hz >
40 Hz, and so on. A number of tasks can access each device.
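The rate-based scheme described here can be sketched in a few lines. This is an illustration only, under the assumption that application tasks get SCHED_FIFO priorities handed out downwards from a ceiling; struct rt_task, assign_rate_priorities() and the top_prio parameter are names invented for the example.

```c
#include <stdlib.h>

struct rt_task {
	const char *name;
	int rate_hz;	/* execution rate of the task */
	int prio;	/* SCHED_FIFO priority, filled in below */
};

/* qsort() comparator: higher rate sorts first. */
static int by_rate_desc(const void *a, const void *b)
{
	const struct rt_task *x = a, *y = b;

	return (y->rate_hz > x->rate_hz) - (y->rate_hz < x->rate_hz);
}

/*
 * Sort tasks by rate and hand out priorities downwards from `top_prio`.
 * Tasks with equal rates share a priority. Returns the lowest priority
 * handed out, or -1 if the band [1, top_prio] is too small.
 */
static inline int assign_rate_priorities(struct rt_task *t, size_t n,
					 int top_prio)
{
	int prio = top_prio;
	size_t i;

	qsort(t, n, sizeof(*t), by_rate_desc);
	for (i = 0; i < n; i++) {
		if (i > 0 && t[i].rate_hz != t[i - 1].rate_hz)
			prio--;
		if (prio < 1)
			return -1;
		t[i].prio = prio;
	}
	return prio;
}
```

Picking top_prio below the priorities of the IRQ threads would keep all interrupt handling above the application tasks, which is the relationship asked for earlier in this mail.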
As noted above, I can live with a system where I can guarantee that all
the IRQ processing has higher priority than all the real time tasks.
It would be "better" if the priority of the hardware interrupts somehow
inherited the priority (absolute or relative to other IRQ's) of the task
making the request. So in that way, a 40 Hz task making a network transfer
would somehow boost the priority of the network interface until that
transfer was complete. It would also be good if the queue of pending
transfers was reordered by RT priority, but I don't see that as an easy
thing to implement currently in Linux (but I can ask... :-) ).
Needless to say, if you implemented priority inheritance, when the 40 Hz
task is not doing network transfers, I would just as soon prefer that
other network operations (say from a 2 Hz tasks) does not get a priority
boost above a 20 Hz task accessing another device.
--Mark H Johnson
<mailto:[email protected]>
* Ingo Molnar <[email protected]> wrote:
> > as part of single stepping the init sequence. I stopped at this point
> > to make sure I had a good record of the messages.
>
> could you try to disable SELINUX? It seems it's not fully safe yet.
there wasnt all that much missing for SELINUX + PREEMPT_REALTIME
support. Could you try the patch below - does it fix your box?
Ingo
--- linux/net/ipv4/af_inet.c.orig
+++ linux/net/ipv4/af_inet.c
@@ -242,7 +242,7 @@ static int inet_create(struct socket *so
/* Look for the requested type/protocol pair. */
answer = NULL;
- rcu_read_lock();
+ rcu_read_lock_spin(&inetsw_lock);
list_for_each_rcu(p, &inetsw[sock->type]) {
answer = list_entry(p, struct inet_protosw, list);
@@ -276,7 +276,7 @@ static int inet_create(struct socket *so
answer_prot = answer->prot;
answer_no_check = answer->no_check;
answer_flags = answer->flags;
- rcu_read_unlock();
+ rcu_read_unlock_spin(&inetsw_lock);
BUG_TRAP(answer_prot->slab != NULL);
@@ -345,7 +345,7 @@ static int inet_create(struct socket *so
out:
return err;
out_rcu_unlock:
- rcu_read_unlock();
+ rcu_read_unlock_spin(&inetsw_lock);
goto out;
}
* Daniel Walker <[email protected]> wrote:
> When I was reviewing this it seemed like it would be possible to keep
> RCU anonymous by moving the callback processing out of the tasklet .
> The reason it was moved into a tasklet was to reduce latency. But if
> you serialize it like you have, aren't you removing all the benefits
> of the RCU type lock in those section that are converted to the new
> API ?
only if compiling for PREEMPT_REALTIME. Given the overhead of
PREEMPT_REALTIME i'm not sure RCU matters that much. But the nicest
would be Dipankar's preemptible-RCU patch.
> > For per-cpu variables i introduced a new API variant that creates a
> > spinlock-array for the per-cpu-variable, and users must make sure the
> > cpu field doesnt change. Migration to another CPU can happen within the
> > critical section, but 'statistically' the variable is still per-CPU and
> > update correctness is fully preserved.
>
> Why not have a per cpu mutex instead of a per variable per cpu mutex?
> I'm not sure what the trade off are, except size.
well, nesting would be one issue. What if such a section gets preempted
on this CPU and another task tries to use the same mutex?
Per-var-per-cpu mutexes seemed like the most orthogonal extension to the
existing concept. Keeping the original Linux locking semantics intact
seems like the primary mission, at least until the full scope is mapped.
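A user-space sketch of the per-variable, per-CPU lock array idea, using pthread mutexes in place of kernel locks. It is only meant to show the shape of the data structure; struct locked_percpu_long, NR_SLOTS and the helper names are made up for this example and are not the API introduced by the patch.

```c
#include <pthread.h>
#include <stddef.h>

#define NR_SLOTS 4	/* stand-in for the number of CPUs */

/* One lock per CPU slot, private to this one variable. */
struct locked_percpu_long {
	pthread_mutex_t lock[NR_SLOTS];
	long value[NR_SLOTS];
};

static inline int locked_percpu_init(struct locked_percpu_long *v)
{
	int i, err;

	for (i = 0; i < NR_SLOTS; i++) {
		v->value[i] = 0;
		err = pthread_mutex_init(&v->lock[i], NULL);
		if (err)
			return err;
	}
	return 0;
}

/*
 * Add `delta` to the slot chosen by the caller. `slot` is sampled once
 * and must stay the same for the whole critical section, even if the
 * task migrates to another CPU in the meantime.
 */
static inline void locked_percpu_add(struct locked_percpu_long *v,
				     int slot, long delta)
{
	pthread_mutex_lock(&v->lock[slot]);
	v->value[slot] += delta;
	pthread_mutex_unlock(&v->lock[slot]);
}

/* Sum over all slots, taking each slot's lock in turn. */
static inline long locked_percpu_sum(struct locked_percpu_long *v)
{
	long sum = 0;
	int i;

	for (i = 0; i < NR_SLOTS; i++) {
		pthread_mutex_lock(&v->lock[i]);
		sum += v->value[i];
		pthread_mutex_unlock(&v->lock[i]);
	}
	return sum;
}
```

Because every variable carries its own lock array, a task preempted inside a section on one variable does not block tasks working on a different variable on the same CPU; with a single mutex per CPU shared by all variables, they would contend.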
Ingo
This was during NFS startup in init.
using smp_processor_id() in preemptible [00000001] code:
rpc.rquotad/2158
caller is ipt_do_table+0x7b/0x3a0
[<c011aa15>] smp_processor_id+0x95/0xa0
[<c038cbfb>] ipt_do_table+0x7b/0x3a0
[<c038aa8b>] ip_ct_refresh_acct+0xb/0x80
[<c038f1d4>] ipt_local_hook+0x74/0xc0
[<c034d73a>] nf_iterate+0x5a/0xa0
[<c035af00>] dst_output+0x0/0x40
[<c034da3c>] nf_hook_slow+0x5c/0x100
[<c035af00>] dst_output+0x0/0x40
[<c035aaf4>] ip_push_pending_frames+0x414/0x480
[<c035af00>] dst_output+0x0/0x40
[<c0377c88>] udp_push_pending_frames+0x148/0x260
[<c0378178>] udp_sendmsg+0x378/0x6e0
[<c0134c73>] __mcount+0x13/0x20
[<c037f7bc>] inet_sendmsg+0x3c/0x60
[<c03397d8>] sock_sendmsg+0xb8/0xe0
[<c0134c73>] __mcount+0x13/0x20
[<c0134c73>] __mcount+0x13/0x20
[<c0113d30>] mcount+0x14/0x18
[<c020172a>] __copy_from_user_ll+0xa/0x40
[<c0133d00>] autoremove_wake_function+0x0/0x60
[<c03391ef>] move_addr_to_kernel+0x2f/0x60
[<c033ab36>] sys_sendto+0xd6/0x100
[<c033d144>] sock_common_setsockopt+0x24/0x40
[<c0134c73>] __mcount+0x13/0x20
[<c020172a>] __copy_from_user_ll+0xa/0x40
[<c0201803>] copy_from_user+0x43/0x80
[<c0113d30>] mcount+0x14/0x18
[<c020172a>] __copy_from_user_ll+0xa/0x40
[<c033b297>] sys_socketcall+0xf7/0x180
[<c01176a0>] do_page_fault+0x0/0x62a
[<c0105357>] syscall_call+0x7/0xb
On Thu, Oct 14, 2004 at 02:13:15PM -0500, [email protected] wrote:
...
> be some real time value so that nice -20 jobs won't bother them either.
> A possibility that comes to mind is to schedule IRQ's at a range higher
> than available to all real time application tasks. I'll mention another
> possibility below as well.
The interrupt priority range probably needs to be increased to accommodate the
increased design demand of RT applications.
> In the systems I have to deal with, I do not have a clear criteria
> to set priorities of interrupts relative to each other. For example, I
> have a real time simulation system using the following devices:
> - occasional disk access to simulate disk I/O
> - real time network traffic
> - real time delivery of interrupts from a PCI timer card and APIC timers
> - real time interrupts from a shared memory interface
> The priorities of real time tasks are basically assigned based on the
> rate of execution. 80 Hz tasks run at a higher priority than 60 Hz, 60 Hz >
> 40 Hz, and so on. A number of tasks can access each device.
Crank it higher, to 120hz, and see what kind of jitter you're getting. Hit
something with high memory load, large mmap images, swap and friends.
> It would be "better" if the priority of the hardware interrupts somehow
> inherited the priority (absolute or relative to other IRQ's) of the task
> making the request. So in that way, a 40 Hz task making a network transfer
> would somehow boost the priority of the network interface until that
> transfer was complete. It would also be good if the queue of pending
> transfers was reordered by RT priority, but I don't see that as an easy
> thing to implement currently in Linux (but I can ask... :-) ).
That's an RT app slippery slope and it should be handled by some kind of
in-kernel or kernel-locking-aware facilities. The reason why Linux is
ideal for RTOS usage is directly related to all of the SMP work that's
been done over the years. Contention, and therefore the need for priority
inheritance, is evil. If you need that kind of functionality, then you
would do well to consider the scheduling indeterminacy of the lock chain
being acquired, and it should have little or no overlap with things like
irq-threads. The system should be decoupled (queues, etc...) if possible
and you shouldn't abuse priority inheritance. The use of priority
inheritance should be considered a kind of lock contention overload and
the algorithms it bounds should be optimized. In your case, the network
stack might need to be broken up to provide the kind of granularity
and control needed to attach on a socket per process/thread basis, just
like Jeffrey Hsu's lockless network stack effort in DragonFly BSD.
Long priority inheritance chains are an app-level indeterminacy nightmare
and indicate either an improperly written application or a nasty SMP
contention issue. That's how I see it.
> Needless to say, if you implemented priority inheritance, when the 40 Hz
> task is not doing network transfers, I would just as soon prefer that
> other network operations (say from a 2 Hz tasks) does not get a priority
> boost above a 20 Hz task accessing another device.
bill
* [email protected] <[email protected]> wrote:
> >the overhead we can try to optimize later on. What problems do you see
> >with setting priorities on those IRQs?
>
> Perhaps I am old fashioned, but in building a real time system, I
> consider hardware interrupt processing as something that is always at
> a higher priority than real time tasks. [...]
this is what i believe you'll ultimately get under PREEMPT_REALTIME:
instant execution of the hardware interrupt thread! Just give it a
higher RT priority than any of the existing tasks in the system:
chrt -f -p 99 `pidof "IRQ 9"`
it is only a couple of microseconds to switch over from the current task
to the IRQ handling thread.
the only difference to a 'direct' interrupt is that it is you who
determines the policy and the priority of interrupt handling.
with direct interrupts there's no choice - a hardware interrupt has the
highest priority. In fact there's not even any way to prioritize
hardware interrupts relative to each other.
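For reference, the chrt command above boils down to a sched_setscheduler() call on the IRQ thread's PID. A minimal userspace sketch follows, assuming the PID has already been looked up; the helper name `set_fifo_priority()` and its up-front range check are illustrative additions, not part of the patch.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Give the thread identified by pid a SCHED_FIFO priority.
 * pid 0 means the calling thread.
 * Returns 0 on success, -1 with errno set on failure.
 */
int set_fifo_priority(pid_t pid, int prio)
{
    struct sched_param sp = { .sched_priority = prio };
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    if (lo < 0 || hi < 0)
        return -1;          /* errno was set by the failing call */
    if (prio < lo || prio > hi) {
        errno = EINVAL;     /* reject before asking the kernel */
        return -1;
    }
    /* Switching to SCHED_FIFO normally needs root (CAP_SYS_NICE). */
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

Calling `set_fifo_priority(irq_pid, 99)` would be the rough equivalent of `chrt -f -p 99 <pid>`, where `irq_pid` stands for whatever PID the "IRQ 9" thread has on the system in question.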
> [...] In general that is not a problem because hardware interrupt
> processing should do just enough to keep the hardware happy and
> nothing more. I have enough spare CPU cycles within each frame to
> account for [could be a large number of] interrupts that follow that
> approach. Unthreaded IRQ's preserves that relationship.
>
> However, with the threaded IRQ's, a real time program (e.g.,
> latencytest) can request a priority higher than IRQ processing -
> causing problems interfacing with devices. At a minimum, the default
> priority of IRQ's should be some real time value so that nice -20 jobs
> won't bother them either. A possibility that comes to mind is to
> schedule IRQ's at a range higher than available to all real time
> application tasks. I'll mention another possibility below as well.
we could increase the RT priority range perhaps, and only allow IRQ
threads to venture into that range. But, this is really pushing a piece
of policy into the kernel. RT tasks interfering with interrupt threads
is an application level problem: priorities have to be properly set up
between RT applications anyway.
> In the systems I have to deal with, I do not have a clear criteria
> to set priorities of interrupts relative to each other. For example, I
> have a real time simulation system using the following devices:
> - occasional disk access to simulate disk I/O
> - real time network traffic
> - real time delivery of interrupts from a PCI timer card and APIC timers
> - real time interrupts from a shared memory interface
> The priorities of real time tasks are basically assigned based on the
> rate of execution. 80 Hz tasks run at a higher priority than 60 Hz, 60 Hz >
> 40 Hz, and so on. A number of tasks can access each device.
if you don't know the relative priority and don't want to allow (non-RT)
userspace to starve IRQ processing then you can make all of them
SCHED_FIFO priority 99.
> As noted above, I can live with a system where I can guarantee that
> all the IRQ processing has higher priority than all the real time
> tasks.
what might make sense is to extend SELinux to allow partitioning of the
priority space. Allow 'normal' applications only SCHED_FIFO range 1-90,
and have 91-99 for IRQ threads, or something like that. I don't think
this priority scheme should be part of the kernel proper - it would be
an inflexible feature. But ... i have no strong feelings in either
direction.
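The partitioning idea can be pictured as a simple policy check. The sketch below uses the 1-90 / 91-99 split from the text; the function name, the enum constants, and the choice to let IRQ threads use the whole range are assumptions made for illustration, and no such hook exists in the patch.

```c
#include <stdbool.h>

/* Assumed split, using the numbers from the mail. */
enum {
    RT_PRIO_MIN  = 1,
    APP_PRIO_MAX = 90,   /* top of the range for normal applications */
    IRQ_PRIO_MAX = 99    /* 91-99 is reserved for IRQ threads */
};

/*
 * Policy check: may a task request SCHED_FIFO priority prio?
 * Applications are confined to 1-90. IRQ threads may use the whole
 * 1-99 range, including the reserved band (an assumption; the mail
 * only says 91-99 is "for IRQ threads").
 */
bool rt_prio_allowed(bool is_irq_thread, int prio)
{
    int max = is_irq_thread ? IRQ_PRIO_MAX : APP_PRIO_MAX;

    return prio >= RT_PRIO_MIN && prio <= max;
}
```

Keeping the check in a policy layer (SELinux or similar) rather than in the scheduler is what leaves the split adjustable per installation.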
> It would be "better" if the priority of the hardware interrupts
> somehow inherited the priority (absolute or relative to other IRQ's)
> of the task making the request. So in that way, a 40 Hz task making a
> network transfer would somehow boost the priority of the network
> interface until that transfer was complete. It would also be good if
> the queue of pending transfers was reordered by RT priority, but I
> don't see that as an easy thing to implement currently in Linux (but I
> can ask... :-) ).
unfortunately there's no 1:1 relationship between 'work' and
'completion' activities so no good mapping from tasks to interrupts. Think
about a SCHED_OTHER and a SCHED_FIFO task dirtying the same page and it
getting flushed out to disk by pdflush. Whose priority should the disk
interrupt inherit, if anything?
> Needless to say, if you implemented priority inheritance, when the 40
> Hz task is not doing network transfers, I would just as soon prefer
> that other network operations (say from a 2 Hz tasks) does not get a
> priority boost above a 20 Hz task accessing another device.
in reality it seems that most of the contention wrt. networks is on the
queueing level, not on the CPU use level. So the solution should rather
be on the 'jump the queue and get xmit-ed right now' level - i.e. the
use of priority-aware TCP/IP QoS features. They do not really need
priority inheritance for the hardware interrupt. (especially considering
that most network processing happens in softirq context, which is even
more anonymous than a hardirq handler.)
Ingo
I'm not sure about this one ..
------------[ cut here ]------------
kernel BUG at fs/buffer.c:1360!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c01619c1>] Not tainted VLI
EFLAGS: 00010002 (2.6.9-rc4-mm1-VP-U1)
EIP is at __find_get_block+0xe1/0x100
eax: 00000001 ebx: cfd30c14 ecx: cffdc600 edx: 00000000
esi: 0005709b edi: 00000000 ebp: cfd30b70 esp: cfd30b54
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process kjournald (pid: 786, threadinfo=cfd30000 task=cfd26040)
Stack: 00000002 00000000 0005709b 00000000 cfd30c14 0005709b 00000000 cfd30b94
       c01619fe cffeab80 0005709b 00000000 00001000 cfd30c14 00000002 cfd30c44
       cfd30bac c0161a99 cffeab80 0005709b 00000000 00001000 cfd30bd4 c019aff0
Call Trace:
[<c01619fe>] __getblk+0x1e/0x60
[<c0161a99>] __bread+0x19/0x40
[<c019aff0>] ext3_get_branch+0x70/0x100
[<c019b61a>] ext3_get_block_handle+0x7a/0x2e0
[<c026b1ee>] as_choose_req+0xe/0x1e0
[<c026bc5f>] as_update_arq+0x1f/0x60
[<c019b8c3>] ext3_get_block+0x43/0x80
[<c0163735>] generic_block_bmap+0x35/0x40
[<c0134c73>] __mcount+0x13/0x20
[<c019c26d>] ext3_bmap+0xd/0xa0
[<c01797c5>] bmap+0x45/0x60
[<c019c2dc>] ext3_bmap+0x7c/0xa0
[<c019b880>] ext3_get_block+0x0/0x80
[<c01797c5>] bmap+0x45/0x60
[<c01af8c2>] journal_bmap+0x42/0xa0
[<c0134c73>] __mcount+0x13/0x20
[<c0134249>] _mutex_unlock+0x9/0x60
[<c01af827>] journal_next_log_block+0x47/0xa0
[<c0113d30>] mcount+0x14/0x18
[<c01af832>] journal_next_log_block+0x52/0xa0
[<c01af939>] journal_get_descriptor_buffer+0x19/0xc0
[<c01ac4ec>] journal_commit_transaction+0xf6c/0x13e0
[<c01aedee>] kjournald+0xce/0x260
[<c013555c>] sub_preempt_count+0x7c/0xa0
[<c0133d00>] autoremove_wake_function+0x0/0x60
[<c03b2a33>] _spin_unlock_irq+0x13/0x40
[<c0133d00>] autoremove_wake_function+0x0/0x60
[<c0119459>] schedule_tail+0x19/0x60
[<c01aece0>] commit_timeout+0x0/0x20
[<c01aed20>] kjournald+0x0/0x260
[<c0103339>] kernel_thread_helper+0x5/0xc
Code: 45 14 39 43 10 75 a0 85 ff 74 17 8b 45 e4 89 f9 8d 14 b8 8b 42 fc 89 02 83 ea 04
* Daniel Walker <[email protected]> wrote:
> This was during NFS startup in init.
>
> using smp_processor_id() in preemptible [00000001] code:
> rpc.rquotad/2158
> caller is ipt_do_table+0x7b/0x3a0
> [<c011aa15>] smp_processor_id+0x95/0xa0
> [<c038cbfb>] ipt_do_table+0x7b/0x3a0
ugh, this is a nasty one - if you look at the TABLE_OFFSET trickery in
ipt_do_table it's basically an open-coded per-CPU variable in essence.
(probably predating percpu.h so it's fair.) Could you try the quick hack
below? (it compiles but is otherwise untested)
The proper solution would be to change the code to use per-cpu variables
(and get that patch accepted upstream) and then trivially convert it to
get_cpu_var_locked().
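To make the "open-coded per-CPU variable" remark concrete, here is a userspace model of the pattern: one allocation holds a cache-aligned copy of the data per CPU, and the CPU number selects the copy by offset. All names (`percpu_blob`, `copy_offset()`, the 64-byte line size) are invented for the sketch and are not the ip_tables definitions.

```c
#include <stddef.h>
#include <stdlib.h>

#define CACHE_BYTES 64   /* assumed cache line size */
#define CACHE_ALIGN(x) (((x) + CACHE_BYTES - 1) & ~((size_t)CACHE_BYTES - 1))

/* One buffer holding ncpus copies of a blob, laid out back to back. */
struct percpu_blob {
    size_t size;      /* size of one copy */
    unsigned ncpus;   /* number of copies */
    char *entries;    /* start of copy 0 */
};

/* Offset of a CPU's copy: aligned copy size times the CPU number. */
static size_t copy_offset(const struct percpu_blob *b, unsigned cpu)
{
    return CACHE_ALIGN(b->size) * cpu;
}

/* Allocate zeroed storage for all copies; 0 on success, -1 on failure. */
int percpu_blob_init(struct percpu_blob *b, size_t size, unsigned ncpus)
{
    b->size = size;
    b->ncpus = ncpus;
    b->entries = calloc(ncpus, CACHE_ALIGN(size));
    return b->entries ? 0 : -1;
}

/*
 * Return the copy belonging to cpu, or NULL if cpu is out of range.
 * The hazard: if the caller can be preempted and migrated after it
 * read the CPU number, two tasks can end up working on the same copy.
 */
void *percpu_blob_get(const struct percpu_blob *b, unsigned cpu)
{
    return cpu < b->ncpus ? b->entries + copy_offset(b, cpu) : NULL;
}
```

In the model, holding an exclusive lock around the whole access (as the quick hack does with the write lock) makes it irrelevant which copy is picked, at the cost of serializing all users.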
Ingo
--- linux/net/ipv4/netfilter/ip_tables.c.orig
+++ linux/net/ipv4/netfilter/ip_tables.c
@@ -287,10 +287,14 @@ ipt_do_table(struct sk_buff **pskb,
* match it. */
offset = ntohs(ip->frag_off) & IP_OFFSET;
+#ifdef CONFIG_PREEMPT_REALTIME
+ write_lock_bh(&table->lock);
+#else
read_lock_bh(&table->lock);
+#endif
IP_NF_ASSERT(table->valid_hooks & (1 << hook));
table_base = (void *)table->private->entries
- + TABLE_OFFSET(table->private, smp_processor_id());
+ + TABLE_OFFSET(table->private, _smp_processor_id());
e = get_entry(table_base, table->private->hook_entry[hook]);
#ifdef CONFIG_NETFILTER_DEBUG
@@ -397,7 +401,11 @@ ipt_do_table(struct sk_buff **pskb,
#ifdef CONFIG_NETFILTER_DEBUG
((struct ipt_entry *)table_base)->comefrom = 0xdead57ac;
#endif
+#ifdef CONFIG_PREEMPT_REALTIME
+ write_unlock_bh(&table->lock);
+#else
read_unlock_bh(&table->lock);
+#endif
#ifdef DEBUG_ALLOW_ALL
return NF_ACCEPT;
In article <[email protected]>,
Ingo Molnar <[email protected]> wrote:
>In -U0 this is not possible because 'ps -C' does not handle kernel
>threads with a space in their name. So there you'd need some wacky thing
>like:
>
> chrt -f 60 -p `ps ax -o pid= -o comm= | grep "IRQ 1$" | cut -dI -f1`
> chrt -f 60 -p `ps ax -o pid= -o comm= | grep "IRQ 8$" | cut -dI -f1`
>
>(someone should fix procps - or does it intentionally break with
>whitespace command-strings?)
Why not use `pgrep -x 'IRQ 1'`? It's part of procps (at least
the version Debian, even woody, is using), some kind of standard
(Solaris has it too), and works.
Mike.
--
"In times of universal deceit, telling the truth becomes
a revolutionary act." -- George Orwell.
On Thu, Oct 14, 2004 at 09:28:04PM +0200, Ingo Molnar wrote:
>
> * Daniel Walker <[email protected]> wrote:
>
> > When I was reviewing this it seemed like it would be possible to keep
> > RCU anonymous by moving the callback processing out of the tasklet .
> > The reason it was moved into a tasklet was to reduce latency. But if
> > you serialize it like you have, aren't you removing all the benefits
> > of the RCU type lock in those section that are converted to the new
> > API ?
>
> only if compiling for PREEMPT_REALTIME. Given the overhead of
> PREEMPT_REALTIME i'm not sure RCU matters that much. But the nicest
> would be Dipankar's preemptible-RCU patch.
>
I am swamped this week and racing against time to get some other
pending things done in time. I will look at the issue of RCU with
PREEMPT_REALTIME next week and try to help out.
Thanks
Dipankar
* Daniel Walker <[email protected]> wrote:
> I'm not sure about this one ..
>
> ------------[ cut here ]------------
> kernel BUG at fs/buffer.c:1360!
> EIP is at __find_get_block+0xe1/0x100
> Call Trace:
> [<c01619fe>] __getblk+0x1e/0x60
> [<c0161a99>] __bread+0x19/0x40
> [<c019aff0>] ext3_get_branch+0x70/0x100
> [<c019b61a>] ext3_get_block_handle+0x7a/0x2e0
> [<c026b1ee>] as_choose_req+0xe/0x1e0
> [<c026bc5f>] as_update_arq+0x1f/0x60
> [<c019b8c3>] ext3_get_block+0x43/0x80
> [<c0163735>] generic_block_bmap+0x35/0x40
> [<c0134c73>] __mcount+0x13/0x20
> [<c019c26d>] ext3_bmap+0xd/0xa0
> [<c01797c5>] bmap+0x45/0x60
> [<c019c2dc>] ext3_bmap+0x7c/0xa0
> [<c019b880>] ext3_get_block+0x0/0x80
> [<c01797c5>] bmap+0x45/0x60
> [<c01af8c2>] journal_bmap+0x42/0xa0
> [<c0134c73>] __mcount+0x13/0x20
> [<c0134249>] _mutex_unlock+0x9/0x60
> [<c01af827>] journal_next_log_block+0x47/0xa0
> [<c0113d30>] mcount+0x14/0x18
> [<c01af832>] journal_next_log_block+0x52/0xa0
> [<c01af939>] journal_get_descriptor_buffer+0x19/0xc0
> [<c01ac4ec>] journal_commit_transaction+0xf6c/0x13e0
> [<c01aedee>] kjournald+0xce/0x260
> [<c013555c>] sub_preempt_count+0x7c/0xa0
> [<c0133d00>] autoremove_wake_function+0x0/0x60
> [<c03b2a33>] _spin_unlock_irq+0x13/0x40
> [<c0133d00>] autoremove_wake_function+0x0/0x60
> [<c0119459>] schedule_tail+0x19/0x60
> [<c01aece0>] commit_timeout+0x0/0x20
> [<c01aed20>] kjournald+0x0/0x260
> [<c0103339>] kernel_thread_helper+0x5/0xc
this is a weird one. This is the first message, right? I've reviewed
bh_lru_lock/unlock and cannot spot anything that could be wrong there.
Ingo
On Thu, 2004-10-14 at 16:16, Lorenzo Allegrucci wrote:
> BTW, I'm getting a lot of "scheduling while atomic" messages
> running LTP's runalltests.sh -x 200.
> Attached is the kern.log and the latency trace.
Looks like that latency trace is mostly printk overhead from the
scheduling while atomic errors. In general, if you are still getting
lots of printks in your logs due to bugs, the latency traces are not
very useful.
Lee
On Thu, Oct 14, 2004 at 01:26:33PM -0700, Bill Huey wrote:
> On Thu, Oct 14, 2004 at 12:06:22PM -0500, [email protected] wrote:
> > I also managed to get the machine stuck with
> > /sbin/reboot
> > not sure why.
> Mount the file system read/write and start slamming it with heavy disk
> activity. If it locks up, then it might just as well be a problem with
> the journaling code and the softirq system backing it. I ran into this
> in my project and it was the softirq related IO code all of the way down
> to the SCSI driver.
Heavy "sync" activity killed my machine. Try stuff that puts loads on
that system. :)
bill
This fixed it..
Daniel
On Thu, 2004-10-14 at 12:57, Ingo Molnar wrote:
> * Daniel Walker <[email protected]> wrote:
>
> > This was during NFS startup in init.
> >
> > using smp_processor_id() in preemptible [00000001] code:
> > rpc.rquotad/2158
> > caller is ipt_do_table+0x7b/0x3a0
> > [<c011aa15>] smp_processor_id+0x95/0xa0
> > [<c038cbfb>] ipt_do_table+0x7b/0x3a0
>
> ugh, this is a nasty one - if you look at the TABLE_OFFSET trickery in
> ipt_do_table it's basically an open-coded per-CPU variable in essence.
> (probably predating percpu.h so it's fair.) Could you try the quick hack
> below? (it compiles but is otherwise untested)
>
> The proper solution would be to change the code to use per-cpu variables
> (and get that patch accepted upstream) and then trivially convert it to
> get_cpu_var_locked().
>
> Ingo
>
> --- linux/net/ipv4/netfilter/ip_tables.c.orig
> +++ linux/net/ipv4/netfilter/ip_tables.c
> @@ -287,10 +287,14 @@ ipt_do_table(struct sk_buff **pskb,
> * match it. */
> offset = ntohs(ip->frag_off) & IP_OFFSET;
>
> +#ifdef CONFIG_PREEMPT_REALTIME
> + write_lock_bh(&table->lock);
> +#else
> read_lock_bh(&table->lock);
> +#endif
> IP_NF_ASSERT(table->valid_hooks & (1 << hook));
> table_base = (void *)table->private->entries
> - + TABLE_OFFSET(table->private, smp_processor_id());
> + + TABLE_OFFSET(table->private, _smp_processor_id());
> e = get_entry(table_base, table->private->hook_entry[hook]);
>
> #ifdef CONFIG_NETFILTER_DEBUG
> @@ -397,7 +401,11 @@ ipt_do_table(struct sk_buff **pskb,
> #ifdef CONFIG_NETFILTER_DEBUG
> ((struct ipt_entry *)table_base)->comefrom = 0xdead57ac;
> #endif
> +#ifdef CONFIG_PREEMPT_REALTIME
> + write_unlock_bh(&table->lock);
> +#else
> read_unlock_bh(&table->lock);
> +#endif
>
> #ifdef DEBUG_ALLOW_ALL
> return NF_ACCEPT;
On Thu, Oct 14, 2004 at 11:52:52AM -0700, Daniel Walker wrote:
> When I was reviewing this it seemed like it would be possible to keep
> RCU anonymous by moving the callback processing out of the tasklet . The
> reason it was moved into a tasklet was to reduce latency. But if you
> serialize it like you have, aren't you removing all the benefits of the
> RCU type lock in those section that are converted to the new API ?
What Ingo is doing now is most likely a temporary fix for dealing with
this issue. Simple backing with a normal mutex should be sufficient for
protecting that access. RCU is still an open problem.
> Why not have a per cpu mutex instead of a per variable per cpu mutex?
> I'm not sure what the trade off are, except size.
It's a read-mostly read/write lock. N number of real processors can
do N number of read locks. That structure needs to be emulated somehow
and a per CPU mutex is probably the correct method of getting it.
It's just a matter of how. I did suggest something in my project
announcement.
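One way to picture a read-mostly lock built from per-CPU locks: a reader takes only the lock of its own CPU, while a writer takes all of them. The sketch below is a userspace model with C11 atomic flags standing in for mutexes; the names and the fixed slot count are assumptions, and it is not the scheme from the project announcement.

```c
#include <stdatomic.h>
#include <stdbool.h>

#define NSLOTS 4   /* assumed number of CPUs */

/* One lock per CPU; spin flags stand in for real mutexes here. */
struct big_reader_lock {
    atomic_flag slot[NSLOTS];
};

void brl_init(struct big_reader_lock *l)
{
    for (int i = 0; i < NSLOTS; i++)
        atomic_flag_clear(&l->slot[i]);
}

/* Reader: take only the lock of the CPU it runs on (spins if busy). */
void brl_read_lock(struct big_reader_lock *l, int cpu)
{
    while (atomic_flag_test_and_set_explicit(&l->slot[cpu],
                                             memory_order_acquire))
        ;
}

/* Non-blocking reader attempt; true if the slot was acquired. */
bool brl_read_trylock(struct big_reader_lock *l, int cpu)
{
    return !atomic_flag_test_and_set_explicit(&l->slot[cpu],
                                              memory_order_acquire);
}

void brl_read_unlock(struct big_reader_lock *l, int cpu)
{
    atomic_flag_clear_explicit(&l->slot[cpu], memory_order_release);
}

/* Writer: take every slot, in ascending order so that two writers
 * cannot deadlock against each other. */
void brl_write_lock(struct big_reader_lock *l)
{
    for (int i = 0; i < NSLOTS; i++)
        brl_read_lock(l, i);
}

void brl_write_unlock(struct big_reader_lock *l)
{
    for (int i = NSLOTS - 1; i >= 0; i--)
        brl_read_unlock(l, i);
}
```

Readers on different CPUs never touch the same slot, so N processors can hold N read locks at once; the price is a writer that has to visit every slot, and a reader that must release the slot it took even if it has since moved to another CPU.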
I don't know if it's crack smoking or not. :)
bill
gzipped latency_trace this time, sorry.
On Thursday 14 October 2004 16:31, Ingo Molnar wrote:
>
> i have released the -U1 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
>
> this is a strict bugfixes-only release. With -U1 i cannot reproduce any
> of the bugs on my testsystems anymore, but take care nevertheless, this
> is still experimental code.
>
> Changes since -U0:
>
> - bugfix: fixed the highmem related crash reported by Adam Heath and i
> think this could also fix the crash reported by Mark H Johnson.
>
> - bugfix: fixed a number of networking related soft-lockups, caused by
> a deadlock scenarios in the ipv4, netfilter and net-xmit locking
> code. This could fix the lockup reported by Lorenzo Allegrucci.
Yes, -U1 seems to have fixed it for me.
BTW, I'm getting a lot of "scheduling while atomic" messages
running LTP's runalltests.sh -x 200.
Attached is the kern.log and the latency trace.
--
I route therefore you are
On Thu, Oct 14, 2004 at 12:06:22PM -0500, [email protected] wrote:
> Not sure if I can bring this up to multi user yet. Some initial testing
> in single user mode indicates problems when I turn on networking. See
> the attached messages from /var/log/messages to see the kinds of problems
> I am having. The key ones appear after doing
> ./S10network start
> as part of single stepping the init sequence. I stopped at this point
> to make sure I had a good record of the messages.
...
> I also managed to get the machine stuck with
> /sbin/reboot
> not sure why.
These are two separate problems, I'd guess.
Mount the file system read/write and start slamming it with heavy disk
activity. If it locks up, then it might just as well be a problem with
the journaling code and the softirq system backing it. I ran into this
in my project and it was the softirq related IO code all of the way down
to the SCSI driver.
It was difficult to get debug messages during a deadlock and I don't
know what kind of mileage you'll get by doing this.
bill
Oct 14 15:35:52 swdev14 kernel: using smp_processor_id() in preemptible [00000001] code: thunderbird-bin/3933
Oct 14 15:35:52 swdev14 kernel: caller is ipt_do_table+0x79/0x335 [ip_tables]
Oct 14 15:35:52 swdev14 kernel: [<c011878e>] smp_processor_id+0xa8/0xb9
Oct 14 15:35:52 swdev14 kernel: [<e08550a8>] ipt_do_table+0x79/0x335 [ip_tables]
Oct 14 15:35:52 swdev14 kernel: [<e08550a8>] ipt_do_table+0x79/0x335 [ip_tables]
Oct 14 15:35:52 swdev14 kernel: [<e09ba0b5>] ipt_local_out_hook+0x76/0x79 [iptable_filter]
Oct 14 15:35:52 swdev14 kernel: [<c02394ea>] nf_iterate+0x70/0xa1
Oct 14 15:35:52 swdev14 kernel: [<c024ef4f>] dst_output+0x0/0x2f
Oct 14 15:35:52 swdev14 kernel: [<c023982d>] nf_hook_slow+0x79/0x126
Oct 14 15:35:52 swdev14 kernel: [<c024ef4f>] dst_output+0x0/0x2f
Oct 14 15:35:52 swdev14 kernel: [<c024cfe2>] ip_queue_xmit+0x495/0x59e
Oct 14 15:35:52 swdev14 kernel: [<c024ef4f>] dst_output+0x0/0x2f
Oct 14 15:35:52 swdev14 kernel: [<c01325d5>] __mcount+0x1d/0x21
Oct 14 15:35:52 swdev14 kernel: [<c0298716>] _spin_unlock_irq+0xb/0x35
Oct 14 15:35:52 swdev14 kernel: [<c01171e7>] finish_task_switch+0x3c/0x85
Oct 14 15:35:52 swdev14 kernel: [<c0111d1c>] mcount+0x14/0x18
Oct 14 15:35:52 swdev14 kernel: [<c01325d5>] __mcount+0x1d/0x21
Oct 14 15:35:52 swdev14 kernel: [<c0263c48>] tcp_v4_send_check+0xe/0xe2
Oct 14 15:35:52 swdev14 kernel: [<c025d95b>] tcp_transmit_skb+0x435/0x85b
Oct 14 15:35:52 swdev14 kernel: [<c0111d1c>] mcount+0x14/0x18
Oct 14 15:35:52 swdev14 kernel: [<c0263c48>] tcp_v4_send_check+0xe/0xe2
Oct 14 15:35:52 swdev14 kernel: [<c025da07>] tcp_transmit_skb+0x4e1/0x85b
Oct 14 15:35:52 swdev14 kernel: [<c01af5ee>] memcpy+0x12/0x3c
Oct 14 15:35:52 swdev14 kernel: [<c025e830>] tcp_write_xmit+0x14c/0x2c6
Oct 14 15:35:52 swdev14 kernel: [<c0252608>] tcp_sendmsg+0x50d/0x10a7
Oct 14 15:35:52 swdev14 kernel: [<c0111d1c>] mcount+0x14/0x18
Oct 14 15:35:52 swdev14 kernel: [<c02525dc>] tcp_sendmsg+0x4e1/0x10a7
Oct 14 15:35:52 swdev14 kernel: [<c0224f0e>] sock_sendmsg+0xfa/0xfc
Oct 14 15:35:52 swdev14 kernel: [<c0274665>] inet_sendmsg+0x50/0x5b
Oct 14 15:35:52 swdev14 kernel: [<c0224f0e>] sock_sendmsg+0xfa/0xfc
Oct 14 15:35:52 swdev14 kernel: [<c01325d5>] __mcount+0x1d/0x21
Oct 14 15:35:52 swdev14 kernel: [<c01af10e>] find_next_bit+0x16/0x92
Oct 14 15:35:52 swdev14 kernel: [<c0117b37>] find_busiest_group+0xd4/0x2e0
Oct 14 15:35:52 swdev14 kernel: [<c0131d20>] _mutex_unlock+0xe/0x5e
Oct 14 15:35:52 swdev14 kernel: [<c0111d1c>] mcount+0x14/0x18
Oct 14 15:35:52 swdev14 kernel: [<c0131d20>] _mutex_unlock+0xe/0x5e
Oct 14 15:35:52 swdev14 kernel: [<c0131cba>] _mutex_lock+0x29/0x3f
Oct 14 15:35:52 swdev14 kernel: [<c013186b>] autoremove_wake_function+0x0/0x57
Oct 14 15:35:52 swdev14 kernel: [<c0224c69>] sockfd_lookup+0x1f/0x74
Oct 14 15:35:52 swdev14 kernel: [<c0111d1c>] mcount+0x14/0x18
Oct 14 15:35:52 swdev14 kernel: [<c0226493>] sys_sendto+0xed/0x10c
Oct 14 15:35:52 swdev14 kernel: [<c0173221>] inode_times_differ+0x9/0x4a
Oct 14 15:35:52 swdev14 kernel: [<c017332a>] update_atime+0xc8/0xcd
Oct 14 15:35:52 swdev14 kernel: [<c01325d5>] __mcount+0x1d/0x21
Oct 14 15:35:52 swdev14 kernel: [<c02264bd>] sys_send+0xb/0x3f
Oct 14 15:35:52 swdev14 kernel: [<c0226d5e>] sys_socketcall+0x12e/0x239
Oct 14 15:35:52 swdev14 kernel: [<c0111d1c>] mcount+0x14/0x18
Oct 14 15:35:52 swdev14 kernel: [<c02264ed>] sys_send+0x3b/0x3f
Oct 14 15:35:52 swdev14 kernel: [<c0226d5e>] sys_socketcall+0x12e/0x239
Oct 14 15:35:52 swdev14 kernel: [<c0158cdd>] sys_read+0x78/0x7a
Oct 14 15:35:52 swdev14 kernel: [<c0106161>] sysenter_past_esp+0x52/0x71
Ingo Molnar wrote:
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
I notice:
+ local_irq_save(flags);
+ ____trace(&__get_cpu_var(trace), eip, parent_eip);
+ local_irq_restore(flags);
Why not use the lockless logging available in relayfs? You'll avoid the
interrupt disabling altogether since, umm ... it's lockless.
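The general idea behind lockless logging can be sketched as reserving a slot with a single atomic increment instead of bracketing the write with interrupt disabling. This is a generic illustration only: it is not the relayfs API, and the buffer layout and names (`trace_buf`, `trace_record()`) are invented for the example.

```c
#include <stdatomic.h>
#include <stdint.h>

#define TRACE_SLOTS 1024u   /* must be a power of two */

struct trace_entry {
    uintptr_t eip;
    uintptr_t parent_eip;
};

struct trace_buf {
    atomic_uint next;                    /* count of slots handed out */
    struct trace_entry e[TRACE_SLOTS];   /* ring of entries */
};

/*
 * Record one entry. The atomic increment gives every caller its own
 * slot, even a caller that interrupted another one in mid-write, so
 * no lock and no interrupt disabling is needed for the reservation.
 * Returns the index of the slot that was written.
 */
unsigned trace_record(struct trace_buf *b, uintptr_t eip,
                      uintptr_t parent_eip)
{
    unsigned idx = atomic_fetch_add_explicit(&b->next, 1,
                                             memory_order_relaxed)
                   & (TRACE_SLOTS - 1);

    b->e[idx].eip = eip;
    b->e[idx].parent_eip = parent_eip;
    return idx;
}
```

The sketch leaves out what a real implementation has to solve: a reader can observe a reserved but not yet written slot, and old entries are silently overwritten once the ring wraps.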
Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [email protected] || 1-866-677-4546
On Thu, 14 Oct 2004 16:31:31 +0200, Ingo Molnar <[email protected]> wrote:
>
> i have released the -U1 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
>
"scheduling while atomic" messages in Reiser4 mentioned at -U0 thread
also appear in this version, but less often.
On Thu, 14 Oct 2004, Adam Heath wrote:
> On Thu, 14 Oct 2004, Ingo Molnar wrote:
>
> >
> > i have released the -U1 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
> >
> > Changes since -U0:
> >
> > - bugfix: fixed the highmem related crash reported by Adam Heath and i
> > think this could also fix the crash reported by Mark H Johnson.
>
> I've reenabled highmem(4g).
>
> Seems to be working fine. Has been running 11 minutes, without problems.
>
> ps: Something that irks me. During bootup, I get the high-latency traces for
> swapper/0. These fill up the dmesg ring buffer, so the early messages get
> dropped. Is there anything that can be done to fix that?
Got my first message.
scheduling while atomic: kswapd0/0x04000001/10
caller is cond_resched+0x53/0x70
[<c027ad31>] schedule+0x531/0x570
[<c027b2a3>] cond_resched+0x53/0x70
[<c012c604>] _mutex_lock+0x14/0x40
[<c0149521>] page_lock_anon_vma+0x31/0x60
[<c0149725>] page_referenced_anon+0x15/0x80
[<c01498ba>] page_referenced+0x7a/0x80
[<c0141635>] refill_inactive_zone+0x435/0x4b0
[<c01408a3>] shrink_slab+0x143/0x160
[<c0141728>] shrink_zone+0x78/0xc0
[<c0141b7a>] balance_pgdat+0x23a/0x2f0
[<c0141ced>] kswapd+0xbd/0xf0
[<c012c140>] autoremove_wake_function+0x0/0x50
[<c01056d2>] ret_from_fork+0x6/0x14
[<c012c140>] autoremove_wake_function+0x0/0x50
[<c0141c30>] kswapd+0x0/0xf0
[<c0103a2d>] kernel_thread_helper+0x5/0x18
Config is as before, with highmem enabled being the only difference.
>there wasnt all that much missing for SELINUX + PREEMPT_REALTIME
>support. Could you try the patch below - does it fix your box?
...
Alas no, it actually seemed to make things worse. After
/etc/rc3.d/S10network start
I got a few dumps (too fast to see) and then the following BUG.
[top of screen]
Modules linked in: iptable_filter ip_tables 8139too mii dm_mod uhci_hcd
ext3 jbd
CPU: 1
EIP: 0060:[<c0316366>] Not tainted VLI
EFLAGS: 00000002 (2.6.9-rc4-mm1-VP-U1a) [only change is your patch...]
eax: 00000002 ebx: c1405820 ecx: 0104cf60 edx: 00000001
esi: c166a000 edi: 00000002 ebp: c166bf04 esp: c166bef8
ds: 007b es: 007b ss: 0068 preempt: 00010003
Process ksoftirqd/1 (pid: 5, threadinfo=c166a000 task=c1658000)
Stack: 00000001 c1405820 c1435820 c166bf18 c011bc30 c1436200 c1405820 c1436200
       c166bf48 c011c766 c1435820 c1405820 c166bf38 00000002 c1658000 c166bf48
       00000001 c1436200 00000001 0104cf60 c166bfa4 c0315433 00000001 c1435820
Call Trace:
[<c011bc30>] double_lock_balance+0x40/0x50
[<c011c766>] load_balance_newidle+0x66/0xc0
[<c0315433>] schedule+0x733/0x830
[<c0114b30>] mcount+0x14/0x18
[<c01280b4>] ksoftirq+0xd4/0xf0
[<c01382b0>] kthread+0x0/0xc0
[<c0105b19>] kernel_thread_helper+0x5/0xc
Code: bf 00 00 00 00 55 89 e5 83 ec 0c 89 5d f8 89 75 fc e8 cb e7 df ff c7 04 24
01 00 00 00 89 c3 e8 d1 3e e2 ff be 00 e0 ff ff 21 e6 <31> c0 86 03 84 c0 7e 0a
8b 5d f8 8b 7f fc 89 ec 5d c3 c7
Rebooting to see if I was just "unlucky"...
Checking the log file after reboot, it appears I do have a trace to send
you [next message...]. Trying again.
Different crash but at basically the same step. Getting tired of typing
these in from the other screen...
EIP is at sub_preempt_count+0x5f/0xa0
...
preempt: 00010003
Call Trace:
[<c0316384>] _spin_lock+0x44/0x70
[<c011bc30>] double_lock_balance+0x40/0x50
[<c011c766>] load_balance_newidle+0x66/0xc0
[<c0315433>] schedule+0x733/0x830
[<c0114b30>] mcount+0x14/0x18
[<c01280b4>] ksoftirq+0xd4/0xf0
[<c013836b>] kthread+0xbb/0xc0
[<c0127fe0>] ksoftirq+0x0/0xf0
[<c01382b0>] kthread+0x0/0xc0
[<c0105b19>] kernel_thread_helper+0x5/0xc
... console shuts up ...
Try a third time with max_cpus=1
OK. Made it past S10network start, with just a couple messages about a
sleeping function called from invalid context; looks like a new cause
and will send you that in the next message too.
Did a couple other commands (less, ls) without problem. Tried
./S13portmap start
and the machine locked up (no response to Ctrl-C). Alt-SysRq-T did
display something. Alt-SysRq-S did an Emergency Sync (but also dumped
out...)
[top of screen]
in atomic():1 [00000001], irqs_disabled():0
[<c011f26a>] __might_sleep+0xca/0xe0
[<c0138de9>] _mutex_lock+0x29/0x70
[<c0138e86>] _mutex_lock_irqsave+0x16/0x20
[<c014c932>] pdflush_operation+0x32/0xd0
[<c01691ad>] emergency_sync+0x1d/0x30
[<c01690e0>] do_sync+0x0/0x90
[<c0217456>] __handle_sysrq+0x76/0xf0
[<c0210b1d>] kbd_event+0xad/0x110
[<c028da8b>] input_event+0xfb/0x3f0
[<c0114b30>] mcount+0x14/0x18
[<c0291923>] atkbd_report_key+0x43/0xa0
[<c0291ba6>] atkbd_interrupt+0x226/0x590
[<c0225f54>] serio_interrupt+0x54/0xa3
[<c0226681>] i8042_interrupt+0xc1/0x1a0
[<c01440b6>] handle_IRQ_event+0x46/0x80
[<c01448c0>] do_hardirq+0x70/0xf0
[<c0144a41>] do_irqd+0x101/0x1d0
[<c013836b>] kthread+0xbb/0xc0
[<c0144940>] do_irqd+0x0/0x1d0
[<c01382b0>] kthread+0x0/0xc0
[<c0105b19>] kernel_thread_helper+0x5/0xc
Emergency Sync complete
So there appears to be a problem in Alt-SysRq handling as well.
Alt-SysRq-P doesn't show anything, not sure why.
Alt-SysRq-M appears to work OK.
Alt-SysRq-B works too :-).
Will bring up -T3 soon and send the messages on disk in a separate
message.
--Mark H Johnson
<mailto:[email protected]>
* Adam Heath <[email protected]> wrote:
> > Seems to be working fine. Has been running 11 minutes, without problems.
> >
> > ps: Something that irks me. During bootup, I get the high-latency traces for
> > swapper/0. These fill up the dmesg ring buffer, so the early messages get
> > dropped. Is there anything that can be done to fix that?
>
> Got my first message.
>
> scheduling while atomic: kswapd0/0x04000001/10
> caller is cond_resched+0x53/0x70
> [<c027ad31>] schedule+0x531/0x570
> [<c027b2a3>] cond_resched+0x53/0x70
> [<c012c604>] _mutex_lock+0x14/0x40
> [<c0149521>] page_lock_anon_vma+0x31/0x60
i'm working on this one currently, it's a bit tricky.
Ingo
On Thu, 14 Oct 2004 02:24:33 +0200, Ingo Molnar <[email protected]> wrote:
>
> i'm pleased to announce a significantly improved version of the
> Real-Time Preemption (PREEMPT_REALTIME) feature that i have been working
> towards in the past couple of weeks.
>
> the patch (against 2.6.9-rc4-mm1) can be downloaded from:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
>
I'm getting some scheduling_while_atomic messages concerning Reiser4.
They're non-fatal, please look into them.
scheduling while atomic: rc/0x04000001/908
caller is cond_resched+0x4c/0x69
[<c0105666>] dump_stack+0x1e/0x20
[<c035e40c>] schedule+0x94/0x3f9
[<c035ec0d>] cond_resched+0x4c/0x69
[<c012bda7>] _rw_mutex_read_lock+0x1f/0x33
[<c019e7dd>] cbk_cache_scan_slots+0x5c/0x276
[<c019ea22>] cbk_cache_search+0x2b/0x5e
[<c019d84d>] coord_by_handle+0x12/0x29
[<c019d831>] object_lookup+0xce/0xd8
[<c01c8dad>] find_entry+0x123/0x2b7
[<c01c788d>] lookup_name_hashed+0xc3/0x143
[<c01c7997>] lookup_hashed+0x3e/0xc0
[<c01a6bdd>] reiser4_lookup+0x83/0x10e
[<c015e02c>] real_lookup+0x74/0xf4
[<c015e2cb>] do_lookup+0x5e/0x9d
[<c015ec7c>] link_path_walk+0x972/0xde6
[<c015f49e>] path_lookup+0x19a/0x1a6
[<c015f63a>] __user_walk+0x31/0x53
[<c015a16d>] vfs_stat+0x1e/0x53
[<c015a890>] sys_stat64+0x19/0x32
[<c010509d>] sysenter_past_esp+0x52/0x71
scheduling while atomic: sh/0x04000001/4346
caller is cond_resched+0x4c/0x69
[<c0105666>] dump_stack+0x1e/0x20
[<c035e40c>] schedule+0x94/0x3f9
[<c035ec0d>] cond_resched+0x4c/0x69
[<c012bda7>] _rw_mutex_read_lock+0x1f/0x33
[<c019e7dd>] cbk_cache_scan_slots+0x5c/0x276
[<c019ea22>] cbk_cache_search+0x2b/0x5e
[<c019d84d>] coord_by_handle+0x12/0x29
[<c019d831>] object_lookup+0xce/0xd8
[<c01cfc83>] find_file_item+0x127/0x1bc
[<c01d1b2e>] read_unix_file+0x2e4/0x48d
[<c01a74e8>] reiser4_read+0x90/0xaf
[<c0150abb>] vfs_read+0xe0/0x126
[<c015b39d>] kernel_read+0x4a/0x57
[<c017aaeb>] load_elf_binary+0x303/0xba1
[<c015bfe2>] search_binary_handler+0xd6/0x1f1
[<c015c2fb>] do_execve+0x1fe/0x2af
[<c0103c1c>] sys_execve+0x3f/0x91
[<c010509d>] sysenter_past_esp+0x52/0x71
scheduling while atomic: gcc/0x04000001/5587
caller is cond_resched+0x4c/0x69
[<c0105666>] dump_stack+0x1e/0x20
[<c035e40c>] schedule+0x94/0x3f9
[<c035ec0d>] cond_resched+0x4c/0x69
[<c012bda7>] _rw_mutex_read_lock+0x1f/0x33
[<c019e7dd>] cbk_cache_scan_slots+0x5c/0x276
[<c019ea22>] cbk_cache_search+0x2b/0x5e
[<c019d84d>] coord_by_handle+0x12/0x29
[<c019d831>] object_lookup+0xce/0xd8
[<c01c8dad>] find_entry+0x123/0x2b7
[<c01c8acb>] rem_entry_hashed+0x75/0x1d5
[<c01c97cb>] unlink_common+0xd4/0x1ca
[<c01a6fe0>] unlink_file+0x48/0x75
[<c01a703a>] reiser4_unlink+0x2d/0x37
[<c0160d12>] vfs_unlink+0x1d1/0x225
[<c0160e24>] sys_unlink+0xbe/0x145
[<c010509d>] sysenter_past_esp+0x52/0x71
scheduling while atomic: xinit/0x04000001/5628
caller is cond_resched+0x4c/0x69
[<c0105666>] dump_stack+0x1e/0x20
[<c035e40c>] schedule+0x94/0x3f9
[<c035ec0d>] cond_resched+0x4c/0x69
[<c012bda7>] _rw_mutex_read_lock+0x1f/0x33
[<c019e7dd>] cbk_cache_scan_slots+0x5c/0x276
[<c019ea22>] cbk_cache_search+0x2b/0x5e
[<c019d84d>] coord_by_handle+0x12/0x29
[<c019d831>] object_lookup+0xce/0xd8
[<c01cfc83>] find_file_item+0x127/0x1bc
[<c01d1550>] readpage_unix_file+0xb1/0x35b
[<c01a7961>] reiser4_readpage+0x4d/0x7d
[<c0134e1e>] page_cache_read+0x72/0xea
[<c013505d>] filemap_nopage+0x1c7/0x391
[<c01d21bf>] unix_file_filemap_nopage+0x59/0x84
[<c01429c9>] do_no_page+0xb5/0x32f
[<c0142db6>] handle_mm_fault+0x91/0x164
[<c01133db>] do_page_fault+0x20c/0x654
[<c0105299>] error_code+0x2d/0x38
On Thu, 14 Oct 2004, Ingo Molnar wrote:
>
> * [email protected] <[email protected]> wrote:
>
> > >the overhead we can try to optimize later on. What problems do you see
> > >with setting priorities on those IRQs?
> >
> > Perhaps I am old fashioned, but in building a real time system, I
> > consider hardware interrupt processing as something that is always at
> > a higher priority than real time tasks. [...]
Let us say you have a server taking in requests over the network. Then you
want to run the Ethernet device at very high priority - you can just as
well run it in the interrupt directly. But let us say you are making an
embedded device that handles some hardware in real time but has a
web interface to configure it. Then you don't want the traffic on the
network to take CPU from your real-time thread (if you don't have DMA it
can take a lot of CPU just to read the packets out of the controller!).
I do have real-life experience with exactly this problem, and the solution
was to move the interrupt handler into a low-priority thread.
As I said in comments on lwn.net: make these things parameters for the
real-time guys to choose per driver for their specific system. There is no
good setting usable for everybody. On normal systems let them stay in
interrupt context and use normal spinlocks for most things. That _performs_
much better but gives higher latencies. Just make it possible for
real-time system developers to configure their system at compile time, along
with choosing drivers, file systems etc.
Esben
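A minimal sketch of what such a per-driver setting could look like, assuming a simple static table. Every name here (`enum irq_ctx`, `struct irq_policy`, `policy_table`, `irq_policy_for()`, the driver names and the priority value) is hypothetical and is not taken from the patch; it only illustrates the "choose per driver" idea from the message above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical policy: where a driver's interrupt handler runs. */
enum irq_ctx { IRQ_CTX_HARDIRQ, IRQ_CTX_THREAD };

struct irq_policy {
	const char *driver;	/* driver name, illustrative only */
	enum irq_ctx ctx;
	int rt_prio;		/* thread priority; unused for IRQ_CTX_HARDIRQ */
};

/*
 * Example table for the embedded device described above: the control
 * hardware stays in hard-IRQ context, the network card is pushed into
 * a low-priority thread so its traffic cannot starve the real-time work.
 */
static const struct irq_policy policy_table[] = {
	{ "motor-ctl", IRQ_CTX_HARDIRQ, 0 },
	{ "eth0",      IRQ_CTX_THREAD,  1 },
};

/* Look up a driver; unknown drivers keep the default hard-IRQ context. */
static struct irq_policy irq_policy_for(const char *driver)
{
	struct irq_policy def = { driver, IRQ_CTX_HARDIRQ, 0 };
	size_t i;

	assert(driver != NULL);
	for (i = 0; i < sizeof(policy_table) / sizeof(policy_table[0]); i++)
		if (strcmp(policy_table[i].driver, driver) == 0)
			return policy_table[i];
	return def;
}
```

A table fixed at build time matches the request that the choice be made at compile time, together with the driver selection.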
* Karim Yaghmour <[email protected]> wrote:
>
> Ingo Molnar wrote:
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U0
>
> I notice:
> + local_irq_save(flags);
> + ____trace(&__get_cpu_var(trace), eip, parent_eip);
> + local_irq_restore(flags);
>
> Why not use the lockless logging available in relayfs, you'll avoid
> the interrupt disabling altogether since, umm ... it's lockless.
i just added something ad-hoc. I wanted it to be accurate across
interrupt entries. I have not looked at the relayfs locking but how does
it solve that? Also, cli/sti makes it obviously SMP-safe and is pretty
cheap on all x86 CPUs. (Also, i didn't want to use preempt_disable/enable
because the tracer interacts with that code quite heavily.)
Ingo
i have released the -U2 PREEMPT_REALTIME patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
this too is a bugfixes-only release, and it is still experimental code.
Changes since -U1:
- bugfix: fix page_lock_anon_vma() crash reported by Adam Heath and
Lorenzo Allegrucci.
- bugfix: fix selinux atomic-schedule warning messages, reported by
Mark H Johnson.
- bugfix: ip_tables atomic-schedule fix, fixes the messages reported by
Daniel Walker and K.R. Foley.
- bugfix: fix warnings/deadlocks in inet_create(), reported by Mark H
Johnson.
- bugfix: fixed a crash-in-shmfs-during-heavy-swapout bug
- bugfix: enable preemption while doing mmdrop() in the scheduler - it
may schedule.
- debugging feature: when PREEMPT_TIMING is enabled then the code also
keeps a trace/stack of preemption enabler EIPs. (if LATENCY_TRACE is
enabled as well then the parent EIP is recorded as well.) Whenever a
stack trace due to atomicity violations is printed, the preemption
stack is printed as well. This makes it much easier to identify the
place that did the illegal preemption-disabling.
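The preemption-stack idea in the last item can be sketched in userspace as follows. This is a simplified model under stated assumptions, not the patch's code: `struct task`, `trace_preempt_disable()`, `trace_preempt_enable()` and `dump_preempt_trace()` are hypothetical names, and the addresses are passed in explicitly instead of being read from the call stack.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_PREEMPT_TRACE 16

/* Hypothetical stand-in for the per-task state; not the real task_struct. */
struct task {
	unsigned int preempt_count;
	unsigned long preempt_trace_eip[MAX_PREEMPT_TRACE];
	unsigned long preempt_trace_parent_eip[MAX_PREEMPT_TRACE];
};

/* Record who disabled preemption at the current nesting level. */
static void trace_preempt_disable(struct task *t, unsigned long eip,
				  unsigned long parent_eip)
{
	unsigned int idx = t->preempt_count;

	if (idx < MAX_PREEMPT_TRACE) {
		t->preempt_trace_eip[idx] = eip;
		t->preempt_trace_parent_eip[idx] = parent_eip;
	}
	t->preempt_count++;
}

/* Leaving a preempt-disabled section pops one nesting level. */
static void trace_preempt_enable(struct task *t)
{
	assert(t->preempt_count > 0);
	t->preempt_count--;
}

/*
 * Print one line per nesting level that is still held and return how
 * many were printed. This is what would accompany an atomicity warning.
 */
static unsigned int dump_preempt_trace(const struct task *t)
{
	unsigned int i, n = t->preempt_count;

	if (n > MAX_PREEMPT_TRACE)
		n = MAX_PREEMPT_TRACE;
	for (i = 0; i < n; i++)
		printf("level %u: disabled at 0x%08lx (parent 0x%08lx)\n", i,
		       t->preempt_trace_eip[i], t->preempt_trace_parent_eip[i]);
	return n;
}
```

Because each nesting level has its own slot, the entries still present when a warning fires point at the disable calls that were never paired with an enable.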
to create a -U2 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
Ingo
Ingo Molnar wrote:
> i just added something ad-hoc.
Yes, I understood as much. I'm suggesting it because a lot of
people who need such ad-hoc functionality could easily be
using relayfs.
> I wanted it to be accurate across
> interrupt entries. I have not looked at the relayfs locking but how does
> it solve that?
cmpxchg (basically: try reserve; if fail retry; else write),
with per-cpu buffers.
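The "try reserve; if fail retry; else write" scheme can be sketched with C11 atomics as below. This is an illustration only and assumes a flat byte buffer with a single head offset; `struct log_buf`, `log_reserve()` and `log_write()` are hypothetical names, and relayfs's actual data structures are not shown here.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>
#include <string.h>

#define BUF_SIZE 4096

/* Hypothetical per-CPU log buffer, for illustration only. */
struct log_buf {
	_Atomic size_t head;		/* offset of the next free byte */
	unsigned char data[BUF_SIZE];
};

/*
 * Reserve len bytes. Returns the start offset, or (size_t)-1 if full.
 * The loop retries whenever another writer (for example an interrupt
 * handler on the same CPU) moved head between the load and the cmpxchg.
 */
static size_t log_reserve(struct log_buf *b, size_t len)
{
	size_t old = atomic_load(&b->head);

	do {
		if (len > BUF_SIZE - old)
			return (size_t)-1;
		/* on failure, old is reloaded with the current head */
	} while (!atomic_compare_exchange_weak(&b->head, &old, old + len));

	return old;
}

/* Reserve, then write into the range that now belongs to this caller. */
static int log_write(struct log_buf *b, const void *rec, size_t len)
{
	size_t off;

	assert(b != NULL && rec != NULL);
	off = log_reserve(b, len);
	if (off == (size_t)-1)
		return -1;
	memcpy(b->data + off, rec, len);
	return 0;
}
```

No lock is taken and interrupts stay enabled; the price is the retry loop and the window between the reserve and the write.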
> Also, cli/sti makes it obviously SMP-safe and is pretty
> cheap on all x86 CPUs. (Also, i didnt want to use preempt_disable/enable
> because the tracer interacts with that code quite heavily.)
No preempt_disable/enable found in the lockless logging in relayfs.
Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [email protected] || 1-866-677-4546
On Thu, 14 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U1 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
>
> Changes since -U0:
>
> - bugfix: fixed the highmem related crash reported by Adam Heath and i
> think this could also fix the crash reported by Mark H Johnson.
I've reenabled highmem(4g).
Seems to be working fine. Has been running 11 minutes, without problems.
ps: Something that irks me. During bootup, I get the high-latency traces for
swapper/0. These fill up the dmesg ring buffer, so the early messages get
dropped. Is there anything that can be done to fix that?
On Fri, 15 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U2 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
kernel/latency.c: In function `add_preempt_count':
kernel/latency.c:390: error: structure has no member named `preempt_trace_eip'
kernel/latency.c:394: error: structure has no member named `preempt_trace_parent_eip'
Karim> cmpxchg (basically: try reserve; if fail retry; else
Karim> write), with per-cpu buffers.
Not sure if I really understand the context where Ingo would use this,
but this lockless scheme doesn't seem to be safe for realtime; the
retry can potentially happen an arbitrary number of times.
- Roland
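One way to make the worst case predictable is to cap the number of attempts and drop the event when the cap is reached. The sketch below assumes that losing a trace record is acceptable; `reserve_bounded()` and `MAX_TRIES` are hypothetical and are not something the thread proposes.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

#define BUF_SIZE 4096
#define MAX_TRIES 4	/* hypothetical bound, chosen for illustration */

/*
 * Try to reserve len bytes at *head with at most MAX_TRIES cmpxchg
 * attempts. Returns the start offset, or (size_t)-1 if the buffer is
 * full or the retry budget ran out; the caller then drops the event.
 */
static size_t reserve_bounded(_Atomic size_t *head, size_t len)
{
	size_t old;
	int tries;

	assert(head != NULL);
	old = atomic_load(head);
	for (tries = 0; tries < MAX_TRIES; tries++) {
		if (len > BUF_SIZE - old)
			return (size_t)-1;
		/* strong variant: fails only if *head really changed */
		if (atomic_compare_exchange_strong(head, &old, old + len))
			return old;
	}
	return (size_t)-1;
}
```

The loop now runs a fixed maximum number of times, so the cost per call is bounded, at the expense of possibly missing entries under contention.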
On Thu, 14 Oct 2004, Adam Heath wrote:
> On Fri, 15 Oct 2004, Ingo Molnar wrote:
>
> >
> > i have released the -U2 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
>
> kernel/latency.c: In function `add_preempt_count':
> kernel/latency.c:390: error: structure has no member named `preempt_trace_eip'
> kernel/latency.c:394: error: structure has no member named `preempt_trace_parent_eip'
Here's a patch:
--- kernel/latency.c.orig 2004-10-14 19:36:26.000000000 -0500
+++ kernel/latency.c 2004-10-14 19:33:30.000000000 -0500
@@ -387,11 +387,9 @@
if (val <= 10) {
unsigned int idx = preempt_count() & PREEMPT_MASK;
if (idx < MAX_PREEMPT_TRACE) {
- current->preempt_trace_eip[idx] = eip;
#ifdef CONFIG_LATENCY_TRACE
+ current->preempt_trace_eip[idx] = eip;
current->preempt_trace_parent_eip[idx] = parent_eip;
-#else
- current->preempt_trace_parent_eip[idx] = 0;
#endif
}
}
--
On Thu, 14 Oct 2004, Adam Heath wrote:
> On Thu, 14 Oct 2004, Adam Heath wrote:
>
> > On Fri, 15 Oct 2004, Ingo Molnar wrote:
> >
> > >
> > > i have released the -U2 PREEMPT_REALTIME patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
> >
> > kernel/latency.c: In function `add_preempt_count':
> > kernel/latency.c:390: error: structure has no member named `preempt_trace_eip'
> > kernel/latency.c:394: error: structure has no member named `preempt_trace_parent_eip'
>
> Here's a patch:
>
> --- kernel/latency.c.orig 2004-10-14 19:36:26.000000000 -0500
> +++ kernel/latency.c 2004-10-14 19:33:30.000000000 -0500
> @@ -387,11 +387,9 @@
> if (val <= 10) {
> unsigned int idx = preempt_count() & PREEMPT_MASK;
> if (idx < MAX_PREEMPT_TRACE) {
> - current->preempt_trace_eip[idx] = eip;
> #ifdef CONFIG_LATENCY_TRACE
> + current->preempt_trace_eip[idx] = eip;
> current->preempt_trace_parent_eip[idx] = parent_eip;
> -#else
> - current->preempt_trace_parent_eip[idx] = 0;
> #endif
> }
> }
> --
How do you set that config option? I only see it in .c and .h files.
Roland Dreier wrote:
> Not sure if I really understand the context where Ingo would use this,
> but this lockless scheme doesn't seem to be safe for realtime; the
> retry can potentially happen an arbitrary number of times.
In theory. In practice it doesn't often happen twice and very rarely
more than that.
Karim
--
Author, Speaker, Developer, Consultant
Pushing Embedded and Real-Time Linux Systems Beyond the Limits
http://www.opersys.com || [email protected] || 1-866-677-4546
Theoretically a problem, in practice not, i.e., good enough for soft/normal
real-time, not hard real-time; probably wouldn't want my heart monitor on
it, but then I wouldn't be using Linux for that either :-)
Robert Wisniewski
The K42 MP OS Project
Advanced Operating Systems
Scalable Parallel Systems
IBM T.J. Watson Research Center
914-945-3181
http://www.research.ibm.com/K42/
[email protected]
Roland Dreier writes:
> Karim> cmpxchg (basically: try reserve; if fail retry; else
> Karim> write), with per-cpu buffers.
>
> Not sure if I really understand the context where Ingo would use this,
> but this lockless scheme doesn't seem to be safe for realtime; the
> retry can potentially happen an arbitrary number of times.
>
> - Roland
On Fri, Oct 15, 2004 at 01:42:02AM +0200, Ingo Molnar wrote:
>
> i have released the -U2 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
mm/shmem.c: In function `shmem_dir_map':
mm/shmem.c:103: warning: implicit declaration of function `kmap_atomic_rt'
mm/shmem.c:103: error: `KM_USER0' undeclared (first use in this function)
mm/shmem.c:103: error: (Each undeclared identifier is reported only once
mm/shmem.c:103: error: for each function it appears in.)
mm/shmem.c: In function `shmem_dir_unmap':
mm/shmem.c:108: warning: implicit declaration of function `kunmap_atomic_rt'
mm/shmem.c:108: error: `KM_USER0' undeclared (first use in this function)
mm/shmem.c: In function `shmem_swp_map':
mm/shmem.c:113: error: `KM_USER1' undeclared (first use in this function)
mm/shmem.c: In function `shmem_swp_balance_unmap':
mm/shmem.c:125: error: `KM_USER1' undeclared (first use in this function)
mm/shmem.c: In function `shmem_swp_unmap':
mm/shmem.c:130: error: `KM_USER1' undeclared (first use in this function)
mm/shmem.c: In function `shmem_swp_set':
mm/shmem.c:333: warning: implicit declaration of function `kmap_atomic_to_page_rt'
mm/shmem.c:333: error: invalid type argument of `->'
mm/shmem.c: In function `shmem_file_write':
mm/shmem.c:1362: error: `KM_USER0' undeclared (first use in this function)
mm/shmem.c:1362: warning: assignment makes pointer from integer without a cast
mm/shmem.c: In function `shmem_symlink':
mm/shmem.c:1719: error: `KM_USER0' undeclared (first use in this function)
mm/shmem.c:1719: warning: assignment makes pointer from integer without a cast
make[1]: *** [mm/shmem.o] Error 1
make: *** [mm] Error 2
root@nietzsche> /home/bhuey/linux-2.6.8% 17# make tags
....
I've got kgdb targeted next and I'm trying to figure out how to write a
rw/semaphore with priority inheritance.
bill
On Thu, 2004-10-14 at 22:00, Robert Wisniewski wrote:
> Theoretically a problem, in practice not, i.e., good enough for soft/normal
> real-time, not hard real-time; probably wouldn't want my heart monitor on
> it, but then I wouldn't be using Linux for that either :-)
Also, the issue here is how we do debug logging. You would presumably
not use this at all in production.
Lee
On Fri, 15 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U2 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
scheduling while atomic: XFree86/0x04000002/1129
caller is cond_resched+0x53/0x70
[<c027acd7>] schedule+0x517/0x550
[<c012ce6a>] check_preempt_timing+0x1a/0x130
[<c0107b98>] do_IRQ+0x58/0x80
[<c027b243>] cond_resched+0x53/0x70
[<c012c684>] _mutex_lock+0x14/0x40
[<c012c6d5>] _mutex_lock_irqsave+0x5/0x10
[<c01b28bf>] avc_has_perm_noaudit+0x10f/0x180
[<c012ce6a>] check_preempt_timing+0x1a/0x130
[<c01b296a>] avc_has_perm+0x3a/0x78
[<c014de87>] shmem_truncate+0x1d7/0x400
[<c01b8060>] ipc_has_perm+0x70/0x90
[<c027b209>] cond_resched+0x19/0x70
[<c01a9c72>] ipcperms+0x72/0xa0
[<c01adaf7>] do_shmat+0xc7/0x300
[<c010b8b6>] sys_ipc+0x1c6/0x280
[<c01c8040>] copy_to_user+0x40/0x60
[<c011d69c>] sys_gettimeofday+0x2c/0x70
[<c01057fb>] syscall_call+0x7/0xb
scheduling while atomic: liquidwar/0x04000002/1553
caller is cond_resched+0x53/0x70
[<c027acd7>] schedule+0x517/0x550
[<c012ce6a>] check_preempt_timing+0x1a/0x130
[<c027b243>] cond_resched+0x53/0x70
[<c012c684>] _mutex_lock+0x14/0x40
[<c012c6d5>] _mutex_lock_irqsave+0x5/0x10
[<c01b27da>] avc_has_perm_noaudit+0x2a/0x180
[<c01b2992>] avc_has_perm+0x62/0x78
[<c01b296a>] avc_has_perm+0x3a/0x78
[<c012c690>] _mutex_lock+0x20/0x40
[<c01b8060>] ipc_has_perm+0x70/0x90
[<c01b8060>] ipc_has_perm+0x70/0x90
[<c0107b98>] do_IRQ+0x58/0x80
[<c01adb13>] do_shmat+0xe3/0x300
[<c010b8b6>] sys_ipc+0x1c6/0x280
[<c0147a4f>] sys_munmap+0x3f/0x60
[<c01057fb>] syscall_call+0x7/0xb
Bill Huey (hui) wrote:
> On Fri, Oct 15, 2004 at 01:42:02AM +0200, Ingo Molnar wrote:
>
>>i have released the -U2 PREEMPT_REALTIME patch:
>>
>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
>
>
> mm/shmem.c: In function `shmem_dir_map':
> mm/shmem.c:103: warning: implicit declaration of function `kmap_atomic_rt'
> mm/shmem.c:103: error: `KM_USER0' undeclared (first use in this function)
> mm/shmem.c:103: error: (Each undeclared identifier is reported only once
> mm/shmem.c:103: error: for each function it appears in.)
> mm/shmem.c: In function `shmem_dir_unmap':
> mm/shmem.c:108: warning: implicit declaration of function `kunmap_atomic_rt'
> mm/shmem.c:108: error: `KM_USER0' undeclared (first use in this function)
> mm/shmem.c: In function `shmem_swp_map':
> mm/shmem.c:113: error: `KM_USER1' undeclared (first use in this function)
> mm/shmem.c: In function `shmem_swp_balance_unmap':
> mm/shmem.c:125: error: `KM_USER1' undeclared (first use in this function)
> mm/shmem.c: In function `shmem_swp_unmap':
> mm/shmem.c:130: error: `KM_USER1' undeclared (first use in this function)
> mm/shmem.c: In function `shmem_swp_set':
> mm/shmem.c:333: warning: implicit declaration of function `kmap_atomic_to_page_rt'
> mm/shmem.c:333: error: invalid type argument of `->'
> mm/shmem.c: In function `shmem_file_write':
> mm/shmem.c:1362: error: `KM_USER0' undeclared (first use in this function)
> mm/shmem.c:1362: warning: assignment makes pointer from integer without a cast
> mm/shmem.c: In function `shmem_symlink':
> mm/shmem.c:1719: error: `KM_USER0' undeclared (first use in this function)
> mm/shmem.c:1719: warning: assignment makes pointer from integer without a cast
> make[1]: *** [mm/shmem.o] Error 1
> make: *** [mm] Error 2
> root@nietzsche> /home/bhuey/linux-2.6.8% 17# make tags
>
> ....
>
> I've got kgdb targetted next and I'm trying to figure out how to write a
> rw/semaphore with priority inheritance.
>
> bill
>
>
What platform are you getting this on?
kr
On Thu, Oct 14, 2004 at 09:40:10PM -0500, K.R. Foley wrote:
> >mm/shmem.c:1362: warning: assignment makes pointer from integer without a cast
> >mm/shmem.c: In function `shmem_symlink':
> >mm/shmem.c:1719: error: `KM_USER0' undeclared (first use in this function)
> >mm/shmem.c:1719: warning: assignment makes pointer from integer without a cast
> >make[1]: *** [mm/shmem.o] Error 1
> >make: *** [mm] Error 2
> >root@nietzsche> /home/bhuey/linux-2.6.8% 17# make tags
>
> What platform are you getting this on?
x86
I ran into other build problems BTW too.
bill
Bill Huey (hui) wrote:
> On Thu, Oct 14, 2004 at 09:40:10PM -0500, K.R. Foley wrote:
>
>>>mm/shmem.c:1362: warning: assignment makes pointer from integer without a cast
>>>mm/shmem.c: In function `shmem_symlink':
>>>mm/shmem.c:1719: error: `KM_USER0' undeclared (first use in this function)
>>>mm/shmem.c:1719: warning: assignment makes pointer from integer without a cast
>>>make[1]: *** [mm/shmem.o] Error 1
>>>make: *** [mm] Error 2
>>>root@nietzsche> /home/bhuey/linux-2.6.8% 17# make tags
>>
>>What platform are you getting this on?
>
>
> x86
Not sure how you could be missing these with any configuration. Ah. Did
you miss Linus' rc4 patch by any chance?
Just finished booting my slower UP box. My SMP box has been up for about
1 hr 45 mins.
kr
>
> I ran into other build problems BTW too.
>
> bill
>
>
On Thu, Oct 14, 2004 at 10:19:59PM -0500, K.R. Foley wrote:
> Not sure how you could be missing these with any configuration. Ah. Did
> you miss Linus' rc4 patch by any chance?
> Just finished booting my slower UP box. My SMP box has been up for about
> 1 hr 45 mins.
31 20:34 bzip2 -dc ../linux-2.6.8.tar.bz2 | tar xf -
37 20:36 cd linux-2.6.8/
38 20:36 bzip2 -dc ../../patch-2.6.9-rc4.bz2 | patch -p1
39 20:37 bzip2 -dc ../../2.6.9-rc4-mm1.bz2 | patch -p1
41 20:37 cat ../../voluntary-preempt-2.6.9-rc4-mm1-U2 | patch -p1
That's what I did.
bill
On Thu, Oct 14, 2004 at 08:47:29PM -0700, Bill Huey wrote:
> 31 20:34 bzip2 -dc ../linux-2.6.8.tar.bz2 | tar xf -
> 37 20:36 cd linux-2.6.8/
> 38 20:36 bzip2 -dc ../../patch-2.6.9-rc4.bz2 | patch -p1
> 39 20:37 bzip2 -dc ../../2.6.9-rc4-mm1.bz2 | patch -p1
> 41 20:37 cat ../../voluntary-preempt-2.6.9-rc4-mm1-U2 | patch -p1
>
> That's what I did.
.config attached.
bill
* Karim Yaghmour <[email protected]> wrote:
> >i just added something ad-hoc.
>
> Yes, I understood as much. I'm suggesting it because a lot of people
> who need such ad-hoc functionality could easily be using relayfs.
the latency tracer is pretty specialized for a number of reasons, i'm
not sure there's a good match between the two. If relayfs were in the
mainline kernel i'd consider reusing parts of it.
> >I wanted it to be accurate across
> >interrupt entries. I have not looked at the relayfs locking but how does
> >it solve that?
>
> cmpxchg (basically: try reserve; if fail retry; else write), with
> per-cpu buffers.
this still does not solve all problems related to irq entries: if the
IRQ interrupts the tracing code after a 'successful reserve' but before
the 'else write' point, and the trace is printed/saved from an
interrupt, then there will be an incomplete entry in the trace.
also, there is the problem of timestamp atomicity: if an IRQ interrupts
the tracing code and the trace timestamp is taken in the 'else' branch
then a time-reversal situation can occur: the entry will have a
timestamp _larger_ than the IRQ trace-entries. With cli/sti all tracing
entries occur atomically: either fully or not at all.
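The time-reversal case can be reproduced deterministically in a small single-threaded model. Everything here is a hypothetical stand-in (`struct entry`, `reserve_slot()`, `fill_slot()`, the fake clock); the "IRQ" is simply code run between the task's reserve and its write.

```c
#include <assert.h>
#include <stddef.h>

#define NSLOTS 8

/* Hypothetical trace entry and buffer, for illustration only. */
struct entry {
	unsigned long ts;
	const char *what;
};

static struct entry trace[NSLOTS];
static size_t next_slot;
static unsigned long fake_clock;	/* stands in for the cycle counter */

static unsigned long read_clock(void)
{
	return ++fake_clock;
}

static size_t reserve_slot(void)
{
	assert(next_slot < NSLOTS);
	return next_slot++;
}

/* Fill a reserved slot; the timestamp is taken here, after the reserve. */
static void fill_slot(size_t slot, const char *what)
{
	trace[slot].ts = read_clock();
	trace[slot].what = what;
}

/*
 * Simulate the race: the task reserves a slot, an IRQ arrives before the
 * task writes it and logs its own entry, and only then does the task take
 * its timestamp. Returns 1 if the task's earlier slot ends up with the
 * larger timestamp.
 */
static int simulate_time_reversal(void)
{
	size_t task_slot = reserve_slot();	/* task: successful reserve */
	size_t irq_slot = reserve_slot();	/* IRQ interrupts here ...   */

	fill_slot(irq_slot, "irq");		/* ... and logs its entry    */
	/* a trace dumped at this point would see task_slot still empty */
	fill_slot(task_slot, "task");		/* task resumes, stamps late */

	return trace[task_slot].ts > trace[irq_slot].ts;
}
```

In this model slot 0 belongs to the task and slot 1 to the IRQ, yet slot 0 carries the later timestamp, which is the ordering problem described above.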
> >Also, cli/sti makes it obviously SMP-safe and is pretty
> >cheap on all x86 CPUs. (Also, i didnt want to use preempt_disable/enable
> >because the tracer interacts with that code quite heavily.)
>
> No preempt_disable/enable found in the lockless logging in relayfs.
it would have to do that on PREEMPT_REALTIME. The irq flag solves the
races, the predictability problem and the preemption problem nicely.
Ingo
* Bill Huey <[email protected]> wrote:
> On Fri, Oct 15, 2004 at 01:42:02AM +0200, Ingo Molnar wrote:
> >
> > i have released the -U2 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
>
> mm/shmem.c: In function `shmem_dir_map':
> mm/shmem.c:103: warning: implicit declaration of function `kmap_atomic_rt'
> mm/shmem.c:103: error: `KM_USER0' undeclared (first use in this function)
as a workaround enable HIGHMEM and PREEMPT_TIMING+LATENCY_TRACE.
(i fixed this in my tree, will be in -U3.)
Ingo
* Adam Heath <[email protected]> wrote:
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
> >
> > kernel/latency.c: In function `add_preempt_count':
> > kernel/latency.c:390: error: structure has no member named `preempt_trace_eip'
> > kernel/latency.c:394: error: structure has no member named `preempt_trace_parent_eip'
>
> Here's a patch:
please try the patch below instead - it will keep the tracer working
even with !LATENCY_TRACE.
Ingo
--- linux.old/include/linux/sched.h
+++ linux.new/include/linux/sched.h
@@ -706,7 +706,7 @@ struct task_struct {
#define MAX_PREEMPT_TRACE 16
-#ifdef CONFIG_LATENCY_TRACE
+#ifdef CONFIG_PREEMPT_TIMING
unsigned long preempt_trace_eip[MAX_PREEMPT_TRACE];
unsigned long preempt_trace_parent_eip[MAX_PREEMPT_TRACE];
#endif
--- linux.old/include/linux/highmem.h
+++ linux.new/include/linux/highmem.h
@@ -33,6 +33,11 @@ static inline void *kmap(struct page *pa
#define kmap_atomic_pfn(pfn, idx) page_address(pfn_to_page(pfn))
#define kmap_atomic_to_page(ptr) virt_to_page(ptr)
+#define kmap_atomic_rt kmap_atomic
+#define kmap_atomic_pfn_rt kmap_atomic_pfn
+#define kunmap_atomic_rt kunmap_atomic
+#define kmap_atomic_to_page_rt(kvaddr) kmap_atomic_to_page(kvaddr)
+
#endif /* CONFIG_HIGHMEM */
/* when CONFIG_HIGHMEM is not set these will be plain clear/copy_page */
On Fri, Oct 15, 2004 at 09:08:39AM +0200, Ingo Molnar wrote:
> as a workaround enable HIGHMEM and PREEMPT_TIMING+LATENCY_TRACE.
Build problem:
bill
i have released the -U3 PREEMPT_REALTIME patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
this is a buildfixes-only release, and it is still experimental code.
Changes since -U2:
- build fix: fixes the latency.c compilation error reported by Adam
Heath.
- build fix: fixes !HIGHMEM compilation, patch from Andrew Rodland
to create a -U3 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
Ingo
On Thu, 14 Oct 2004 16:31:31 +0200
Ingo Molnar <[email protected]> wrote:
>
> i have released the -U1 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
Ok,
with the help of Paul Davis i think i have found what's causing the jackd
FP exception. It seems to be a bug in the kernel when PREEMPT_REALTIME is
enabled:
~$ cat /proc/cpuinfo|grep cpu
cpu family : 6
cpu MHz : 0.001
cpuid level : 1
MHz == 0.001? Hrmm. No wonder jackd was freaking out in its timing code.
The real CPU speed is 1.2 GHz.
flo
P.S.: Will retry with U3, to see if this persists.
dmesg output:
Linux version 2.6.9-rc4-mm1-VP-U1-RT ([email protected]) (gcc version 3.3.5 (Debian 1:3.3.5-1)) #3 Thu Oct 14 22:40:26 CEST 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000030000000 (usable)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
768MB LOWMEM available.
On node 0 totalpages: 196608
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 192512 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
Built 1 zonelists
Initializing CPU#0
Kernel command line: BOOT_IMAGE=2.6.9-U0 ro root=1601
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 1195.144 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 776332k/786432k available (1673k kernel code, 9644k reserved, 482k data, 340k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 2367.48 BogoMIPS (lpj=1183744)
Security Scaffold v1.0.0 initialized
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: After vendor identify, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 0183f9ff c1c7f9ff 00000000 00000020
CPU: AMD Athlon(tm) Processor stepping 02
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
ksoftirqd started up.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdb01, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
Uncovering SIS18 that hid as a SIS503 (compatible=1)
Enabling SiS 96x SMBus.
PCI: Using IRQ router SIS [1039/0018] at 0000:00:02.0
PCI: IRQ 0 for device 0000:00:02.1 doesn't match PIRQ mask - try pci=usepirqmask
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
Initializing Cryptographic API
Real Time Clock Driver v1.12
Non-volatile memory driver v1.2
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SIS5513: IDE controller at PCI slot 0000:00:02.5
SIS5513: chipset revision 208
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS735 ATA 100 (2nd gen) controller
ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
hda: IC35L060AVER07-0, ATA DISK drive
elevator: using anticipatory as default io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: ST340823A, ATA DISK drive
hdd: TDK CDRW121032, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
hda: 120103200 sectors (61492 MB) w/1916KiB Cache, CHS=65535/16/63, UDMA(100)
hda: cache flushes not supported
hda: hda1 hda2 hda3
hdc: max request size: 128KiB
hdc: Host Protected Area detected.
current capacity is 78165360 sectors (40020 MB)
native capacity is 78165361 sectors (40020 MB)
hdc: Host Protected Area disabled.
hdc: 78165361 sectors (40020 MB) w/1024KiB Cache, CHS=65535/16/63, UDMA(33)
hdc: cache flushes not supported
hdc: hdc1 hdc2
hdd: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache, DMA
Uniform CD-ROM driver Revision: 3.20
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImExPS/2 Logitech Explorer Mouse on isa0060/serio1
input: PC Speaker
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 64Kbytes
TCP: Hash tables configured (established 65536 bind 37449)
NET: Registered protocol family 1
NET: Registered protocol family 17
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 340k freed
kjournald starting. Commit interval 5 seconds
Adding 289160k swap on /dev/hda3. Priority:-1 extents:1
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on hdc1, internal journal
PCI: Found IRQ 5 for device 0000:00:0f.0
sis900.c: v1.08.07 11/02/2003
PCI: Found IRQ 10 for device 0000:00:03.0
eth0: Realtek RTL8201 PHY transceiver found at address 1.
eth0: Using transceiver found at address 1 as default
eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 10, 00:d0:09:e9:c1:0f.
CSLIP: code copyright 1989 Regents of the University of California
PPP generic driver version 2.4.2
PPP Deflate Compression module registered
PPP BSD Compression module registered
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on hdc2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
complete /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 4
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 0.001
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips : 2367.48
* Florian Schmidt <[email protected]> wrote:
> > i have released the -U1 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U1
>
> Ok,
>
> with the help of Paul Davis i think i have found what's causing the jackd
> FP exception. It seems to be a bug in the kernel when PREEMPT_REALTIME is
> enabled:
>
> ~$ cat /proc/cpuinfo|grep cpu
> cpu family : 6
> cpu MHz : 0.001
> cpuid level : 1
>
> Mhz == 0.001? Hrmm. No wonder jackd was freaking out in its timing code..
> The real cpu speed is 1.2ghz.
>
> flo
>
> P.S.: Will retry with U3, to see if this persists.
ah ... good eyes. Seems to be working fine here:
saturn:~> cat /proc/cpuinfo | grep -i mhz
cpu MHz : 2051.126
saturn:~> uname -a
Linux saturn 2.6.9-rc4-mm1-VP-U4 #288 SMP Fri Oct 15 12:31:38 CEST 2004
but it could easily be happening on some CPUs only. Let me know if that
problem persists. Fortunately i think it will be at most a detection
problem, not some FPU breakage that i initially suspected.
it could be the following thing: if you got an smp_processor_id()
warning _in the CPU detection code_ in earlier PREEMPT_REALTIME kernels
then the kernel could easily see that the CPU is extremely slow, because
it didn't manage to do much work (due to the long printout...). So i'd
say if this happens again it's most likely a debug printout in the
'calibrating delay loop' phase.
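To see how a reading of 0.001 could arise, the sketch below assumes the "cpu MHz" line is printed from an integer kHz value split into whole MHz and a three-digit remainder, and that the kHz value comes from counting cycles over a fixed interval. Both helpers (`format_cpu_mhz()`, `khz_from_cycles()`) are hypothetical and only model that assumption.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Format a CPU frequency given in kHz the way the "cpu MHz" line looks:
 * whole MHz, a dot, then the remaining kHz as three digits.
 */
static int format_cpu_mhz(char *buf, size_t len, unsigned long cpu_khz)
{
	assert(buf != NULL && len > 0);
	return snprintf(buf, len, "%lu.%03lu", cpu_khz / 1000, cpu_khz % 1000);
}

/*
 * Hypothetical calibration: cycles counted over interval_ms milliseconds.
 * Cycles per millisecond is the same number as kHz.
 */
static unsigned long khz_from_cycles(unsigned long long cycles,
				     unsigned long interval_ms)
{
	assert(interval_ms > 0);
	return (unsigned long)(cycles / interval_ms);
}
```

Under these assumptions 59757200 cycles in 50 ms gives 1195144 kHz, formatted as 1195.144, while a kHz value of 1 is formatted as 0.001. A reading of 0.001 would therefore mean the stored kHz value was 1 rather than a value in the millions.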
Ingo
Ingo Molnar wrote:
> * Bill Huey <[email protected]> wrote:
>
>
>>On Fri, Oct 15, 2004 at 01:42:02AM +0200, Ingo Molnar wrote:
>>
>>>i have released the -U2 PREEMPT_REALTIME patch:
>>>
>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
>>
>>mm/shmem.c: In function `shmem_dir_map':
>>mm/shmem.c:103: warning: implicit declaration of function `kmap_atomic_rt'
>>mm/shmem.c:103: error: `KM_USER0' undeclared (first use in this function)
>
>
> as a workaround enable HIGHMEM and PREEMPT_TIMING+LATENCY_TRACE.
>
> (i fixed this in my tree, will be in -U3.)
>
> Ingo
>
Sorry Bill. Is there a brown paper bag you can get that is not for
writing the bug but misdiagnosing it? :)
kr
On Friday 15 October 2004 12:26, Ingo Molnar wrote:
> i have released the -U3 PREEMPT_REALTIME patch:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
hi ingo!
can you change your version string to lower case letters to avoid "problems"
with make-kpkg?
if not, no problem...
thanks,
dominik
* K.R. Foley <[email protected]> wrote:
> >>>i have released the -U2 PREEMPT_REALTIME patch:
> >>>
> >>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U2
> >>
> >>mm/shmem.c: In function `shmem_dir_map':
> >>mm/shmem.c:103: warning: implicit declaration of function `kmap_atomic_rt'
> >>mm/shmem.c:103: error: `KM_USER0' undeclared (first use in this function)
> >
> >
> >as a workaround enable HIGHMEM and PREEMPT_TIMING+LATENCY_TRACE.
> >
> >(i fixed this in my tree, will be in -U3.)
> >
> > Ingo
> >
> Sorry Bill. Is there a brown paper bag you can get that is not for
> writing the bug but misdiagnosing it? :)
no you thief! Are you trying to steal my preciousss? :-)
Ingo
On Fri, 15 Oct 2004 13:44:05 +0200
Ingo Molnar <[email protected]> wrote:
> > cpu MHz : 0.001
> ah ... good eyes. Seems to be working fine here:
>
> saturn:~> cat /proc/cpuinfo | grep -i mhz
> cpu MHz : 2051.126
> saturn:~> uname -a
> Linux saturn 2.6.9-rc4-mm1-VP-U4 #288 SMP Fri Oct 15 12:31:38 CEST 2004
>
> but it could easily be happening on some CPUs only. Let me know if that
> problem persists.
Same problem with U3.
~$ uname -a
Linux mango.fruits.de 2.6.9-rc4-mm1-VP-U3-RT #1 Fri Oct 15 13:45:00 CEST 2004 i686 GNU/Linux
~$ cat /proc/cpuinfo |grep MHz
cpu MHz : 0.001
> Fortunately i think it will be at most a detection
> problem, not some FPU breakage that i initially suspected.
>
> it could be the following thing: if you got an smp_processor_id()
> warning _in the CPU detection code_ in earlier PREEMPT_REALTIME kernels
> then the kernel could easily see that the CPU is extremely slow, because
> it didn't manage to do much work (due to the long printout...). So i'd
> say if this happens again it's most likely a debug printout in the
> 'calibrating delay loop' phase.
I see. btw: i built this one with CONFIG_PREEMPT_TIMING and
CONFIG_LATENCY_TRACE and, naturally, this also throws the timing code of the
critical section timing off:
Linux version 2.6.9-rc4-mm1-VP-U3-RT ([email protected]) (gcc version 3.3.5 (Debian 1:3.3.5-1)) #1 Fri Oct 15 13:45:00 CEST 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 0000000030000000 (usable)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffee0000 - 00000000fff00000 (reserved)
BIOS-e820: 00000000fffc0000 - 0000000100000000 (reserved)
768MB LOWMEM available.
On node 0 totalpages: 196608
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 192512 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
Built 1 zonelists
Initializing CPU#0
Kernel command line: BOOT_IMAGE=2.6.9-U3-RT ro root=1601
PID hash table entries: 4096 (order: 12, 65536 bytes)
(swapper/0): new 436746 us maximum-latency critical section.
=> started at: <start_kernel+0x39/0x1c0>
=> ended at: <cond_resched+0x23/0x80>
[<c012ea8c>] touch_preempt_timing+0x3c/0x40
[<c012e9b0>] check_preempt_timing+0x160/0x200
[<c02a3e23>] cond_resched+0x23/0x80
[<c012ea8c>] touch_preempt_timing+0x3c/0x40
[<c02a3e23>] cond_resched+0x23/0x80
[<c02a3e23>] cond_resched+0x23/0x80
[<c012d899>] _mutex_lock+0x19/0x40
[<c011dad0>] tasklet_hi_action+0x0/0x70
[<c010b51a>] get_cmos_time+0x1a/0x1e0
[<c03228e3>] start_kernel+0xc3/0x1c0
[<c0112240>] mcount+0x14/0x18
[<c0326ba0>] time_init+0x10/0x70
[<c011dad0>] tasklet_hi_action+0x0/0x70
[<c03228e3>] start_kernel+0xc3/0x1c0
[<c03225a0>] unknown_bootoption+0x0/0x160
preempt count: 1
entry 1: start_kernel+0x39/0x1c0 / (0xc010019f)
Detected 1195.144 MHz processor.
Using tsc for high-res timesource
(swapper/0): new 597854 us maximum-latency critical section.
=> started at: <cond_resched+0x23/0x80>
=> ended at: <cond_resched+0x23/0x80>
[<c012ea8c>] touch_preempt_timing+0x3c/0x40
[<c012e9b0>] check_preempt_timing+0x160/0x200
[<c02a3e23>] cond_resched+0x23/0x80
[<c012ea8c>] touch_preempt_timing+0x3c/0x40
[<c02a3e23>] cond_resched+0x23/0x80
[<c02a3e23>] cond_resched+0x23/0x80
[<c012d899>] _mutex_lock+0x19/0x40
[<c012d916>] _mutex_lock_irqsave+0x16/0x20
[<c01f0347>] tty_register_ldisc+0x37/0xb0
[<c0333367>] console_init+0x27/0x50
[<c03228e8>] start_kernel+0xc8/0x1c0
[<c03225a0>] unknown_bootoption+0x0/0x160
preempt count: 1
entry 1: start_kernel+0x39/0x1c0 / (0xc010019f)
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 776156k/786432k available (1685k kernel code, 9820k reserved, 485k data, 344k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 2351.10 BogoMIPS (lpj=1175552)
Security Scaffold v1.0.0 initialized
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: After vendor identify, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After all inits, caps: 0183f9ff c1c7f9ff 00000000 00000020
CPU: AMD Athlon(tm) Processor stepping 02
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
ksoftirqd started up.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfdb01, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
PCI: Probing PCI hardware
PCI: Probing PCI hardware (bus 00)
Uncovering SIS18 that hid as a SIS503 (compatible=1)
Enabling SiS 96x SMBus.
PCI: Using IRQ router SIS [1039/0018] at 0000:00:02.0
PCI: IRQ 0 for device 0000:00:02.1 doesn't match PIRQ mask - try pci=usepirqmask
apm: BIOS version 1.2 Flags 0x03 (Driver version 1.16ac)
Initializing Cryptographic API
Real Time Clock Driver v1.12
Non-volatile memory driver v1.2
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
RAMDISK driver initialized: 16 RAM disks of 4096K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
SIS5513: IDE controller at PCI slot 0000:00:02.5
SIS5513: chipset revision 208
SIS5513: not 100% native mode: will probe irqs later
SIS5513: SiS735 ATA 100 (2nd gen) controller
ide0: BM-DMA at 0xff00-0xff07, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xff08-0xff0f, BIOS settings: hdc:DMA, hdd:DMA
Probing IDE interface ide0...
hda: IC35L060AVER07-0, ATA DISK drive
elevator: using anticipatory as default io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: ST340823A, ATA DISK drive
hdd: TDK CDRW121032, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
hda: 120103200 sectors (61492 MB) w/1916KiB Cache, CHS=65535/16/63, UDMA(100)
hda: cache flushes not supported
hda: hda1 hda2 hda3
hdc: max request size: 128KiB
hdc: Host Protected Area detected.
current capacity is 78165360 sectors (40020 MB)
native capacity is 78165361 sectors (40020 MB)
hdc: Host Protected Area disabled.
hdc: 78165361 sectors (40020 MB) w/1024KiB Cache, CHS=65535/16/63, UDMA(33)
hdc: cache flushes not supported
hdc: hdc1 hdc2
hdd: ATAPI 32X CD-ROM CD-R/RW drive, 2048kB Cache, DMA
Uniform CD-ROM driver Revision: 3.20
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImExPS/2 Logitech Explorer Mouse on isa0060/serio1
input: PC Speaker
NET: Registered protocol family 2
IP: routing cache hash table of 2048 buckets, 64Kbytes
TCP: Hash tables configured (established 65536 bind 37449)
NET: Registered protocol family 1
NET: Registered protocol family 17
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 344k freed
kjournald starting. Commit interval 5 seconds
Adding 289160k swap on /dev/hda3. Priority:-1 extents:1
EXT3-fs warning: maximal mount count reached, running e2fsck is recommended
EXT3 FS on hdc1, internal journal
PCI: Found IRQ 5 for device 0000:00:0f.0
sis900.c: v1.08.07 11/02/2003
PCI: Found IRQ 10 for device 0000:00:03.0
eth0: Realtek RTL8201 PHY transceiver found at address 1.
eth0: Using transceiver found at address 1 as default
eth0: SiS 900 PCI Fast Ethernet at 0xdc00, IRQ 10, 00:d0:09:e9:c1:0f.
CSLIP: code copyright 1989 Regents of the University of California
PPP generic driver version 2.4.2
PPP Deflate Compression module registered
PPP BSD Compression module registered
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda1, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3 FS on hdc2, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
(S40networking/210): new 698390 us maximum-latency critical section.
=> started at: <kernel_fpu_begin+0x21/0x60>
=> ended at: <_mmx_memcpy+0x131/0x180>
[<c012ec41>] sub_preempt_count+0x71/0x90
[<c012e9b0>] check_preempt_timing+0x160/0x200
[<c01e48d1>] _mmx_memcpy+0x131/0x180
[<c010c44e>] kernel_fpu_begin+0xe/0x60
[<c012ec41>] sub_preempt_count+0x71/0x90
[<c01e48d1>] _mmx_memcpy+0x131/0x180
[<c01e48d1>] _mmx_memcpy+0x131/0x180
[<c01edbe5>] vgacon_scroll+0x245/0x260
[<c01fe33a>] scrup+0xda/0xf0
[<c0112240>] mcount+0x14/0x18
[<c01ffe82>] lf+0x72/0x80
[<c0201b20>] do_con_trol+0xa90/0xc30
[<c01fef3b>] hide_softcursor+0xb/0x70
[<c0201f25>] do_con_write+0x265/0x720
[<c0202a0b>] con_write+0x3b/0x50
[<c0202a65>] con_put_char+0x45/0x50
[<c01f4b15>] opost+0xa5/0x1d0
[<c0112240>] mcount+0x14/0x18
[<c01f7083>] write_chan+0x1b3/0x220
[<c0114ff0>] default_wake_function+0x0/0x20
[<c0112240>] mcount+0x14/0x18
[<c0114ff0>] default_wake_function+0x0/0x20
[<c0114f8f>] lock_kernel+0x2f/0x50
[<c01f169f>] tty_write+0x12f/0x1e0
[<c01f6ed0>] write_chan+0x0/0x220
[<c01f1750>] redirected_tty_write+0x0/0xb0
[<c015584a>] vfs_write+0xca/0x140
[<c0112240>] mcount+0x14/0x18
[<c0155990>] sys_write+0x50/0x80
[<c010603b>] syscall_call+0x7/0xb
preempt count: 1
entry 1: kernel_fpu_begin+0x21/0x60 / (_mmx_memcpy+0x36/0x180)
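To compare reports like the ones in this boot log, the "maximum-latency critical section" lines can be extracted mechanically. This is a rough sketch that assumes the exact message wording shown above; the regular expression and function name are illustrative only.

```python
import re

# Matches e.g. "(swapper/0): new 436746 us maximum-latency critical section."
REPORT = re.compile(
    r"\((?P<comm>[^()]+)/(?P<pid>\d+)\): "
    r"new (?P<usecs>\d+) us maximum-latency critical section\."
)


def max_latency_reports(log_text):
    """Return (task name, pid, latency in microseconds) for each report line."""
    return [
        (m.group("comm"), int(m.group("pid")), int(m.group("usecs")))
        for m in REPORT.finditer(log_text)
    ]


# The three report headers from the log above.
SAMPLE = """\
(swapper/0): new 436746 us maximum-latency critical section.
(swapper/0): new 597854 us maximum-latency critical section.
(S40networking/210): new 698390 us maximum-latency critical section.
"""

for comm, pid, usecs in max_latency_reports(SAMPLE):
    print(f"{comm} (pid {pid}): {usecs / 1000.0:.3f} ms")
```

Keep in mind that on this machine the reported microsecond values are themselves suspect, since the timing code depends on the misdetected CPU speed.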
Lee Revell writes:
> On Thu, 2004-10-14 at 22:00, Robert Wisniewski wrote:
> > Theoretically a problem, in practice not, i.e., good enough for soft/normal
> > real-time, not hard real-time; probably wouldn't want my heart monitor on
> > it, but then I wouldn't be using Linux for that either :-)
>
> Also, the issue here is how we do debug logging. You would presumably
> not use this at all in production.
>
> Lee
Yes, actually you would. If the tracing subsystem is designed correctly you
leave it in for production systems and enable it when you need to find a
problem. The reason is that many times you cannot reproduce, in your own
environment, a problem someone is seeing in production. In addition to the
LTT/Relayfs and K42 tracing work, lots of tracing work/papers suggest
leaving it in all the time. Most commercial operating systems have made
the investment to correctly design tracing facilities so they are
available. LTT in combination with relayfs could fulfill that role for
Linux.
Robert Wisniewski
The K42 MP OS Project
Advanced Operating Systems
Scalable Parallel Systems
IBM T.J. Watson Research Center
914-945-3181
http://www.research.ibm.com/K42/
[email protected]
Ingo Molnar writes:
>
> * Karim Yaghmour <[email protected]> wrote:
>
> > >i just added something ad-hoc.
> >
> > Yes, I understood as much. I'm suggesting it because a lot of people
> > who need such ad-hoc functionality could easily be using relayfs.
>
> the latency tracer is pretty specialized for a number of reasons, i'm
> not sure there's a good match between the two. If relayfs were in the
> mainline kernel i'd consider reusing parts of it.
:-) it's nice we all have a sense of humor.
>
> > >I wanted it to be accurate across
> > >interrupt entries. I have not looked at the relayfs locking but how does
> > >it solve that?
> >
> > cmpxchg (basically: try reserve; if fail retry; else write), with
> > per-cpu buffers.
>
> this still does not solve all problems related to irq entries: if the
> IRQ interrupts the tracing code after a 'successful reserve' but before
> the 'else write' point, and the trace is printed/saved from an
> interrupt, then there will be an incomplete entry in the trace.
That is incorrect. The system behavior needed to generate an incomplete
entry is far more complicated and unlikely than what you describe.
>
> also, there is the problem of timestamp atomicity: if an IRQ interrupts
> the tracing code and the trace timestamp is taken in the 'else' branch
> then a time-reversal situation can occur: the entry will have a
> timestamp _larger_ than the IRQ trace-entries. With cli/sti all tracing
> entries occur atomically: either fully or not at all.
>
> > >Also, cli/sti makes it obviously SMP-safe and is pretty
> > >cheap on all x86 CPUs. (Also, i didnt want to use preempt_disable/enable
> > >because the tracer interacts with that code quite heavily.)
> >
> > No preempt_disable/enable found in the lockless logging in relayfs.
>
> it would have to do that on PREEMPT_REALTIME. The irq flag solves both
> the races, the predictability problem and the preemption problem nicely.
>
> Ingo
If you do not care about performance then you're probably right, this is
fine. If you are concerned about the time it takes to go through the
sequence of code, then probably not.
Robert Wisniewski
* Dominik Karall <[email protected]> wrote:
> On Friday 15 October 2004 12:26, Ingo Molnar wrote:
> > i have released the -U3 PREEMPT_REALTIME patch:
> >
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
> hi ingo!
> can you change your version string to lower case letters to avoid "problems"
> with make-kpkg?
> if not, no problem...
sure, no problem. Next one will be -rt-u4.
Ingo
* Robert Wisniewski <[email protected]> wrote:
> > > cmpxchg (basically: try reserve; if fail retry; else write), with
> > > per-cpu buffers.
> >
> > this still does not solve all problems related to irq entries: if the
> > IRQ interrupts the tracing code after a 'successful reserve' but before
> > the 'else write' point, and the trace is printed/saved from an
> > interrupt, then there will be an incomplete entry in the trace.
>
> That is incorrect. The system behavior needed to generate an
> incomplete entry is far more complicated and unlikely than what you
> describe.
ah, but i'm talking about actual first-hand experience, not supposition.
It happens quite easily with latency traces (which are saved/printed
from IRQ entries) and it can be very annoying to analyze. My first
tracers tried to do things without the IRQ flag, so i've seen both
methods.
and let's not forget this other issue:
> > also, there is the problem of timestamp atomicity: if an IRQ interrupts
> > the tracing code and the trace timestamp is taken in the 'else' branch
> > then a time-reversal situation can occur: the entry will have a
> > timestamp _larger_ than the IRQ trace-entries. With cli/sti all tracing
> > entries occur atomically: either fully or not at all.
> > > >Also, cli/sti makes it obviously SMP-safe and is pretty
^^^^^^^^^^
> > > >cheap on all x86 CPUs. (Also, i didnt want to use preempt_disable/enable
^^^^^^^^^^^^^^^^^^^^^^
> > > >because the tracer interacts with that code quite heavily.)
> > >
> > > No preempt_disable/enable found in the lockless logging in relayfs.
> >
> > it would have to do that on PREEMPT_REALTIME. The irq flag solves both
> > the races, the predictability problem and the preemption problem nicely.
> >
> > Ingo
>
> If you do not care about performance then you're probably right, this
> is fine. If you are concerned about the time it takes to go through
> the sequence of code, then probably not.
see the portion i highlighted above. CPU makers are busy making cli/sti
as fast as possible. To make sure i tested it on a typical x86 box:
# ./cli-latency
CLI+STI latency: 8 cycles
Since the trace entry can be filled in a constant amount of time there's
no reason not to make use of that extra silicon that makes fast cli/sti
possible! How many trace entries can you generate per second via
relayfs, on a typical PC? Have you ever measured this?
in fact on a modern CPU cli/sti is very likely faster than a cmpxchgl
for the following reason: the cmpxchgl generates a read dependency on
the cacheline which must be fetched in. A single cachemiss can cost
_a lot_ in comparison, 200 cycles easily. While in the cli/sti case we
stream out to a new cacheline in a linear fashion which is nicely
optimized by write-allocate cache policies in modern CPUs. No
cachemisses on the trace buffer! Just simple streaming out of data.
i challenge you to change your code to use cli/sti and compare it with
the cmpxchgl variant doing some heavy tracing on a modern PC. Please
just _do_ it and come back with numbers. I have done my own
measurements.
Ingo
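The two hazards argued about above (an incomplete entry seen from IRQ context, and a timestamp reversal) can be illustrated with a toy, single-threaded model. This is only a sketch: the class and function names are hypothetical, the "interrupt" is an ordinary function call placed between the reserve and write steps, and nothing here is taken from relayfs or from the latency tracer.

```python
class TraceBuffer:
    """Toy trace buffer: a slot is reserved first and filled in later."""

    def __init__(self):
        self.slots = []
        self.clock = 0

    def now(self):
        # Monotonic counter standing in for a timestamp source.
        self.clock += 1
        return self.clock

    def reserve(self):
        self.slots.append(None)  # None marks a reserved but unwritten slot
        return len(self.slots) - 1

    def write(self, index, event):
        # The timestamp is taken at write time, as in the 'else write' step.
        self.slots[index] = (self.now(), event)


def trace(buf, event, irq=None, irqs_disabled=False):
    """Log one entry; `irq` stands in for an interrupt arriving mid-entry."""
    index = buf.reserve()
    if irq is not None and not irqs_disabled:
        irq(buf)  # interrupt lands between reserve and write
    buf.write(index, event)
    if irq is not None and irqs_disabled:
        irq(buf)  # interrupt is held off until the entry is complete


def make_irq(snapshots):
    """Build an 'interrupt handler' that logs its own entry and saves the trace."""
    def irq(buf):
        trace(buf, "do_IRQ")
        snapshots.append(list(buf.slots))
    return irq


for disabled in (False, True):
    saved = []
    buf = TraceBuffer()
    trace(buf, "put_page", irq=make_irq(saved), irqs_disabled=disabled)
    print("irqs_disabled =", disabled)
    print("  trace as saved from the IRQ:", saved[0])
    print("  final buffer:               ", buf.slots)
```

In the first run the saved trace contains an unwritten slot and the outer entry ends up with a later timestamp than the IRQ entry that follows it in the buffer; in the second run neither happens. The model says nothing about how often either case occurs in practice, which is the part the thread disagrees on.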
[email protected] wrote:
>
> In the systems I have to deal with, I do not have clear criteria
> to set priorities of interrupts relative to each other. For example, I
> have a real time simulation system using the following devices:
> - occasional disk access to simulate disk I/O
> - real time network traffic
> - real time delivery of interrupts from a PCI timer card and APIC timers
> - real time interrupts from a shared memory interface
> The priorities of real time tasks are basically assigned based on the
> rate of execution. 80 Hz tasks run at a higher priority than 60 Hz, 60 Hz >
> 40 Hz, and so on. A number of tasks can access each device.
>
What if drivers could indicate how much "jitter" (essentially, latency)
their interrupts can tolerate? Higher jitter tolerance would SORT OF
translate into lower priority, although the scheduler would make sure the
IRQ was started before its tolerance ran out (i.e. the priority approaches
infinity as its tolerance period approaches the end). The jitter
tolerance would be measured in microseconds, I guess.
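One way to read this proposal is as a least-slack-first rule: the urgency of a pending IRQ grows without bound as its remaining tolerance shrinks to zero. The sketch below assumes that reading; the function names, device names and tolerance values are all made up for illustration.

```python
import math


def urgency(tolerance_us, waited_us):
    """Urgency of a pending IRQ; infinite once its jitter tolerance is used up."""
    remaining = tolerance_us - waited_us
    if remaining <= 0:
        return math.inf
    return 1.0 / remaining


def pick_next(pending, now_us):
    """Pick the IRQ to service next.

    `pending` is a list of (name, tolerance_us, raised_at_us) tuples.
    """
    return max(pending, key=lambda irq: urgency(irq[1], now_us - irq[2]))[0]


# Hypothetical devices and tolerances, all raised at time 0.
pending = [("disk", 5000, 0), ("net", 500, 0), ("timer", 50, 0)]
print("at t=0:", pick_next(pending, 0))

# A tolerant device that has waited a long time overtakes a fresh, stricter one.
pending = [("disk", 5000, 0), ("net", 500, 4900)]
print("at t=4950:", pick_next(pending, 4950))
```

With static tolerances and simultaneous arrival this degenerates to "lowest tolerance first", which matches the "higher jitter tolerance means lower priority" intuition; the waiting time only matters once requests have been pending for different lengths of time.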
Ingo Molnar writes:
>
> * Robert Wisniewski <[email protected]> wrote:
>
> > > > cmpxchg (basically: try reserve; if fail retry; else write), with
> > > > per-cpu buffers.
> > >
> > > this still does not solve all problems related to irq entries: if the
> > > > IRQ interrupts the tracing code after a 'successful reserve' but before
> > > the 'else write' point, and the trace is printed/saved from an
> > > interrupt, then there will be an incomplete entry in the trace.
> >
> > That is incorrect. The system behavior needed to generate an
> > incomplete entry is far more complicated and unlikely than what you
> > describe.
>
> ah, but i'm talking about actual first-hand experience, not supposition.
> It happens quite easily with latency traces (which are saved/printed
> from IRQ entries) and it can be very annoying to analyze. My first
> tracers tried to do things without the IRQ flag, so i've seen both
> methods.
This means that other code you've written has this happen, it doesn't mean
the fundamental model is broken. Also, if what you claim is true and there
really is this contention, then it both means that 1) there are many many
other higher priority tasks in the system than the one you are trying to
trace, and 2) it's questionable whether you want to use locks.
> and let's not forget this other issue:
>
> > > also, there is the problem of timestamp atomicity: if an IRQ interrupts
> > > the tracing code and the trace timestamp is taken in the 'else' branch
> > > then a time-reversal situation can occur: the entry will have a
> > > timestamp _larger_ than the IRQ trace-entries. With cli/sti all tracing
> > > entries occur atomically: either fully or not at all.
>
>
> > > > >Also, cli/sti makes it obviously SMP-safe and is pretty
> ^^^^^^^^^^
> > > > >cheap on all x86 CPUs. (Also, i didnt want to use preempt_disable/enable
> ^^^^^^^^^^^^^^^^^^^^^^
> > > > >because the tracer interacts with that code quite heavily.)
> > > >
> > > > No preempt_disable/enable found in the lockless logging in relayfs.
> > >
> > > it would have to do that on PREEMPT_REALTIME. The irq flag solves both
> > > the races, the predictability problem and the preemption problem nicely.
> > >
> > > Ingo
> >
> > If you do not care about performance then you're probably right, this
> > is fine. If you are concerned about the time it takes to go through
> > the sequence of code, then probably not.
>
> see the portion i highlighted above. CPU makers are busy making cli/sti
> as fast as possible. To make sure i tested it on a typical x86 box:
>
> # ./cli-latency
> CLI+STI latency: 8 cycles
>
> Since the trace entry can be filled in a constant amount of time there's
> no reason not to make use of that extra silicon that makes fast cli/sti
> possible! How many trace entries can you generate per second via
> relayfs, on a typical PC? Have you ever measured this?
>
> in fact on a modern CPU cli/sti is very likely faster than a cmpxchgl
> for the following reason: the cmpxchgl generates a read dependency on
> the cacheline which must be fetched in. A single cachemiss can cost
> _a lot_ in comparison, 200 cycles easily. While in the cli/sti case we
> stream out to a new cacheline in a linear fashion which is nicely
> optimized by write-allocate cache policies in modern CPUs. No
> cachemisses on the trace buffer! Just simple streaming out of data.
>
> i challenge you to change your code to use cli/sti and compare it with
> the cmpxchgl variant doing some heavy tracing on a modern PC. Please
> just _do_ it and come back with numbers. I have done my own
> measurements.
>
> Ingo
This is an interesting analysis. Do you have a paper you can point to, or
can you post the numbers and the setup of the experiment you ran?
Sounds interesting.
Robert Wisniewski
* Robert Wisniewski <[email protected]> wrote:
> Ingo Molnar writes:
> >
> > * Robert Wisniewski <[email protected]> wrote:
> >
> > > > > cmpxchg (basically: try reserve; if fail retry; else write), with
> > > > > per-cpu buffers.
> > > >
> > > > this still does not solve all problems related to irq entries: if the
> > > > > IRQ interrupts the tracing code after a 'successful reserve' but before
> > > > the 'else write' point, and the trace is printed/saved from an
> > > > interrupt, then there will be an incomplete entry in the trace.
> > >
> > > That is incorrect. The system behavior needed to generate an
> > > incomplete entry is far more complicated and unlikely than what you
> > > describe.
> >
> > ah, but i'm talking about actual first-hand experience, not supposition.
> > It happens quite easily with latency traces (which are saved/printed
> > from IRQ entries) and it can be very annoying to analyze. My first
> > tracers tried to do things without the IRQ flag, so i've seen both
> > methods.
>
> This means that other code you've written has this happen, it doesn't mean
> the fundamental model is broken. Also, if what you claim is true and there
> really is this contention, then it both means that 1) there are many many
> other higher priority tasks in the system than the one you are trying to
> trace, and 2) it's questionable whether you want to use locks.
_interrupts_. The latency tracer does traces like:
00000002 0.022ms (+0.000ms): mark_page_accessed (zap_pte_range)
00000002 0.022ms (+0.000ms): page_remove_rmap (zap_pte_range)
00000002 0.022ms (+0.000ms): free_page_and_swap_cache (zap_pte_range)
00000002 0.022ms (+0.001ms): put_page (zap_pte_range)
00010002 0.023ms (+0.000ms): do_IRQ (zap_pte_range)
00010002 0.023ms (+0.000ms): do_IRQ (<00000000>)
00010003 0.024ms (+0.004ms): mask_and_ack_8259A (do_IRQ)
00010003 0.029ms (+0.000ms): redirect_hardirq (do_IRQ)
00010000 0.029ms (+0.000ms): handle_IRQ_event (do_IRQ)
and i just pointed out why i didnt use relayfs.
Ingo
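For readers unfamiliar with the trace format quoted above, here is a rough parser for such lines. It assumes the first column is the preempt count in the mainline 2.6 layout (preempt depth in bits 0-7, softirq count in bits 8-15, hardirq count from bit 16 up); the patch may encode it differently, so treat the decoding as an assumption.

```python
import re

# Matches e.g. "00010003 0.024ms (+0.004ms): mask_and_ack_8259A (do_IRQ)"
LINE = re.compile(
    r"^(?P<count>[0-9a-fA-F]{8})\s+(?P<time>[0-9.]+)ms\s+"
    r"\(\+(?P<delta>[0-9.]+)ms\):\s+(?P<func>\S+)\s+\((?P<parent>.+)\)$"
)


def decode_count(value):
    """Split a preempt-count word into its fields (assumed mainline 2.6 layout)."""
    return {
        "hardirq": (value >> 16) & 0xFFF,
        "softirq": (value >> 8) & 0xFF,
        "preempt": value & 0xFF,
    }


def parse_line(line):
    """Parse one latency-trace line; return None if it does not match."""
    m = LINE.match(line.strip())
    if m is None:
        return None
    entry = decode_count(int(m.group("count"), 16))
    entry["time_ms"] = float(m.group("time"))
    entry["delta_ms"] = float(m.group("delta"))
    entry["func"] = m.group("func")
    entry["parent"] = m.group("parent")
    return entry


SAMPLE = [
    "00000002 0.022ms (+0.001ms): put_page (zap_pte_range)",
    "00010002 0.023ms (+0.000ms): do_IRQ (zap_pte_range)",
    "00010003 0.024ms (+0.004ms): mask_and_ack_8259A (do_IRQ)",
]

for raw in SAMPLE:
    e = parse_line(raw)
    context = "hardirq" if e["hardirq"] else "task"
    print(f'{e["time_ms"]:.3f}ms {context:8s} {e["func"]} <- {e["parent"]}')
```

Under that assumption the jump from 00000002 to 00010002 marks the point where the interrupt arrives inside zap_pte_range.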
On Fri, 2004-10-15 at 07:59, Dominik Karall wrote:
> On Friday 15 October 2004 12:26, Ingo Molnar wrote:
> > i have released the -U3 PREEMPT_REALTIME patch:
> >
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
> hi ingo!
> can you change your version string to lower case letters to avoid "problems"
> with make-kpkg?
> if not, no problem...
Please file a Debian bug report. make-kpkg should be able to handle
this.
Lee
>i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
Built without any problems, booted as single user and had no errors until I
started running init scripts.
First attempt - When I got to S10network start, had a couple
messages displayed about sleeping in invalid contexts (scrolled by too
fast).
It ended with a kernel bug, looked similar to what I reported before. The
new data displayed included:
preempt count: 4
entry 1: schedule+0x46/0x810 / (ksoftirqd+0xd4/0xf0)
entry 2: _spin_lock_irqsave+0x1f/0x80 / (schedule+0x108/0x810)
entry 3: _spin_lock+0x67/0x70 / (load_balance_newidle+0x66/0xc0)
entry 4: _spin_lock+0x67/0x70 / (die_nmi+0x1b/0xa0)
System locked up, hardware reset to reboot. After reboot, noticed
I did capture some data, will send that separately.
Second attempt - basically the same symptom. The kernel BUG output was
incomplete with a "preempt count: 4" but only one entry displayed
(same as entry 1 above).
System locked up, hardware reset to reboot. Will boot with max_cpus=1
to see if I get farther.
Third attempt - had two outputs of tracing data while starting the
network interface. Tried a couple ping commands with numeric destinations
and the machine had no response (displayed) on the first one, Ctrl-C to
abort. On the second try, the system responded the same until I hit Ctrl-C,
no response. Alt-SysRq-T did display task data, ping was in D mode, and
call trace was pretty deep. Thought it was somewhat odd that it (and
several tasks above it) had a preempt count:1 with
entry 1: _spin_lock+0x1f/0x70 / (__handle_sysrq+0x1f/0xf0)
not sure if that is the expected behavior or not.
Alt-SysRq-S had the same traceback as I reported before. Rebooted with
Alt-SysRq-B.
As mentioned before, will send the system log data in a separate
message.
--Mark H Johnson
<mailto:[email protected]>
On Fri, 15 Oct 2004, Lee Revell wrote:
> On Fri, 2004-10-15 at 07:59, Dominik Karall wrote:
> > On Friday 15 October 2004 12:26, Ingo Molnar wrote:
> > > i have released the -U3 PREEMPT_REALTIME patch:
> > >
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
> >
> > hi ingo!
> > can you change your version string to lower case letters to avoid "problems"
> > with make-kpkg?
> > if not, no problem...
>
> Please file a Debian bug report. make-kpkg should be able to handle
> this.
Yes and no. Package names in Debian must be lowercase. Of course, make-kpkg
could always do a lowercase on the string. shrug.
ps: Speaking with my dpkg hat on.
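The rule being referred to is the Debian Policy requirement that package names use only lower case letters, digits, '+', '-' and '.', are at least two characters long, and start with an alphanumeric character. A small sketch of the "do a lowercase on the string" idea follows; the function names are hypothetical and the "kernel-image-" prefix is an assumption about how the package name is built, not something stated in this thread.

```python
import re

# Debian Policy package-name rule: lower case letters, digits, '+', '-', '.';
# at least two characters; must start with an alphanumeric character.
PACKAGE_NAME = re.compile(r"[a-z0-9][a-z0-9+.-]+")


def is_valid_package_name(name):
    """True if `name` satisfies the Debian package-name rule."""
    return PACKAGE_NAME.fullmatch(name) is not None


def debian_safe(version):
    """Lowercase a kernel version string so it can appear in a package name."""
    return version.lower()


version = "2.6.9-rc4-mm1-VP-U3-RT"   # version string from the uname output above
prefix = "kernel-image-"             # assumed package-name prefix

print(prefix + version, "->", is_valid_package_name(prefix + version))
print(prefix + debian_safe(version), "->",
      is_valid_package_name(prefix + debian_safe(version)))
```

Lowercasing on the packaging side would change the package name without touching the version the kernel itself reports, which is one reason to prefer fixing the string at the source.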
Ingo Molnar wrote:
> i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
Overall, for me, this just seems to be getting better and better with
each iteration. I have posted a few traces that I have captured during
some of my testing on my SMP system at home. Only the last four are from
U3 (9-12). The others are from previous versions and in some cases
probably aren't relevant any more. No. 9 which is the worst I've seen
with U3 very well may have happened before the system was up completely.
Traces:
http://www.cybsft.com/testresults/2.6.9-rc4-mm1-VP/
kr
Ingo Molnar wrote:
> i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
I have gotten a couple of interesting traces on my dual 2.6 GHz Xeon
workstation here at the office. These were both generated running tests
on (oddly enough) my own trace buffer that I am working on for a client
here. The test basically consists of 100 threads putting data into the
trace buffer concurrently and then one reader thread draining it and
populating a multi-dimensional array to make sure all of the data is
accounted for and not corrupted. All threads are running at a normal
priority since the test is for correctness not performance. The traces
are here:
http://www.cybsft.com/testresults/26workstation/2.6.9-rc4-mm1-VP/
kr
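The shape of the correctness test described above (many writers, one reader that drains the buffer and ticks off every record in a two-dimensional array) can be sketched as follows. The real trace buffer is not shown in this thread, so Python's queue.Queue stands in for it here, and all names and sizes are illustrative.

```python
import queue
import threading


def run_test(writers=100, records_per_writer=50):
    """Return True if every (writer, sequence) record is drained exactly once."""
    buf = queue.Queue()  # stand-in for the trace buffer under test
    seen = [[0] * records_per_writer for _ in range(writers)]

    def writer(tid):
        for seq in range(records_per_writer):
            buf.put((tid, seq))

    def reader():
        # Drain exactly as many records as the writers will produce.
        for _ in range(writers * records_per_writer):
            tid, seq = buf.get()
            seen[tid][seq] += 1

    drain = threading.Thread(target=reader)
    threads = [threading.Thread(target=writer, args=(t,)) for t in range(writers)]
    drain.start()
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    drain.join()

    # Any cell that is not exactly 1 means a lost or duplicated record.
    return all(count == 1 for row in seen for count in row)


print("all records accounted for:", run_test())
```

As in the original test, this checks correctness only; it makes no attempt to measure throughput or latency.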
Ingo Molnar wrote:
>i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
>this is a buildfixes-only release, and it is still experimental code.
>
>Changes since -U2:
>
> - build fix: fixes the latency.c compilation error reported by Adam
> Heath.
>
> - build fix: fixes !HIGHMEM compilation, patch from Andrew Rodland
>
>to create a -U3 tree from scratch the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
> + http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
> Ingo
Hey,
I get lockups a few seconds after I issue the dhcpcd command for my
wireless PCMCIA network card (Cisco).
These lockups go away when I disable PREEMPT_REALTIME. Are there any
logs or information you want?
On Fri, Oct 15, 2004 at 12:26:33PM +0200, Ingo Molnar wrote:
>
> i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
Atomic violations:
bill
On Fri, Oct 15, 2004 at 04:16:09PM -0700, Bill Huey wrote:
> Atomic violations:
More atomic violations:
bill
On Fri, 2004-10-15 at 06:26, Ingo Molnar wrote:
> i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
Does not compile. .config attached:
HOSTCC scripts/bin2c
CC arch/i386/kernel/asm-offsets.s
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:104: error: conflicting types for `spinlock_t'
include/asm/mutex.h:92: error: previous declaration of `spinlock_t'
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:105:1: warning: "SPIN_LOCK_UNLOCKED" redefined
In file included from include/linux/spinlock.h:16,
from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/asm/mutex.h:86:1: warning: this is the location of the previous definition
In file included from include/linux/capability.h:45,
from include/linux/sched.h:7,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/spinlock.h:538:1: warning: "spin_lock_init" redefined
include/linux/spinlock.h:107:1: warning: this is the location of the previous definition
include/linux/spinlock.h:548:1: warning: "spin_is_locked" redefined
include/linux/spinlock.h:143:1: warning: this is the location of the previous definition
include/linux/spinlock.h:558:1: warning: "spin_unlock_wait" redefined
include/linux/spinlock.h:173:1: warning: this is the location of the previous definition
In file included from include/linux/time.h:7,
from include/linux/timex.h:58,
from include/linux/sched.h:11,
from arch/i386/kernel/asm-offsets.c:7:
include/linux/seqlock.h: In function `__write_seqlock':
include/linux/seqlock.h:74: error: structure has no member named `magic'
include/linux/seqlock.h:74: error: structure has no member named `lock'
include/linux/seqlock.h:74: error: structure has no member named `babble'
include/linux/seqlock.h:74: error: structure has no member named `babble'
include/linux/seqlock.h:74: error: structure has no member named `module'
include/linux/seqlock.h:74: error: structure has no member named `owner'
include/linux/seqlock.h:74: error: structure has no member named `oline'
include/linux/seqlock.h:74: error: structure has no member named `lock'
include/linux/seqlock.h:74: error: structure has no member named `owner'
include/linux/seqlock.h:74: error: structure has no member named `oline'
include/linux/seqlock.h: In function `__write_sequnlock':
include/linux/seqlock.h:83: error: structure has no member named `magic'
include/linux/seqlock.h:83: error: structure has no member named `lock'
include/linux/seqlock.h:83: error: structure has no member named `babble'
include/linux/seqlock.h:83: error: structure has no member named `babble'
include/linux/seqlock.h:83: error: structure has no member named `module'
include/linux/seqlock.h:83: error: structure has no member named `lock'
include/linux/seqlock.h: In function `__write_tryseqlock':
include/linux/seqlock.h:88: error: structure has no member named `magic'
include/linux/seqlock.h:88: error: structure has no member named `lock'
include/linux/seqlock.h:88: error: structure has no member named `babble'
include/linux/seqlock.h:88: error: structure has no member named `babble'
include/linux/seqlock.h:88: error: structure has no member named `module'
include/linux/seqlock.h:88: error: structure has no member named `owner'
include/linux/seqlock.h:88: error: structure has no member named `oline'
include/linux/seqlock.h:88: error: structure has no member named `lock'
include/linux/seqlock.h:88: error: structure has no member named `owner'
include/linux/seqlock.h:88: error: structure has no member named `oline'
include/linux/seqlock.h: In function `__write_seqlock_raw':
etc
Lee
On Fri, 2004-10-15 at 21:00, Lee Revell wrote:
> On Fri, 2004-10-15 at 06:26, Ingo Molnar wrote:
> > i have released the -U3 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
> Does not compile. .config attached:
It builds fine if CONFIG_SMP is set. Am I really the only person
running this on UP?
Lee
On Fri, 15 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
scheduling while atomic: postmaster/0x04000002/3175
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdc7>] _mutex_lock+0x17/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01ab716>] semctl_main+0xa6/0x410
[<c01abcad>] sys_semctl+0xad/0xb0
[<c010bafd>] sys_ipc+0xad/0x250
[<c0105bff>] syscall_call+0x7/0xb
scheduling while atomic: postmaster/0x04000002/5260
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdc7>] _mutex_lock+0x17/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01a9832>] ipcperms+0x82/0xb0
[<c01ab6fe>] semctl_main+0x8e/0x410
[<c01abcad>] sys_semctl+0xad/0xb0
[<c010bafd>] sys_ipc+0xad/0x250
[<c0105bff>] syscall_call+0x7/0xb
scheduling while atomic: liquidwar/0x04000002/5505
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdc7>] _mutex_lock+0x17/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01acfe6>] sys_shmctl+0x196/0x690
[<c010bc9a>] sys_ipc+0x24a/0x250
[<c0105bff>] syscall_call+0x7/0xb
scheduling while atomic: XFree86/0x04000002/1127
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdd3>] _mutex_lock+0x23/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b228e>] avc_has_perm_noaudit+0x10e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01ad30d>] sys_shmctl+0x4bd/0x690
[<c010bc9a>] sys_ipc+0x24a/0x250
[<c0105bff>] syscall_call+0x7/0xb
scheduling while atomic: XFree86/0x04000002/1127
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdc7>] _mutex_lock+0x17/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01a9832>] ipcperms+0x82/0xb0
[<c01ad59f>] do_shmat+0xbf/0x2e0
[<c010bbef>] sys_ipc+0x19f/0x250
[<c0105bff>] syscall_call+0x7/0xb
scheduling while atomic: liquidwar/0x04000002/5505
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdc7>] _mutex_lock+0x17/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01ad5ba>] do_shmat+0xda/0x2e0
[<c010bbef>] sys_ipc+0x19f/0x250
[<c0105bff>] syscall_call+0x7/0xb
scheduling while atomic: liquidwar/0x04000002/5505
caller is cond_resched+0x53/0x70
[<c01069f7>] dump_stack+0x17/0x20
[<c027b457>] schedule+0x517/0x550
[<c027b9c3>] cond_resched+0x53/0x70
[<c012cdc7>] _mutex_lock+0x17/0x40
[<c012ce18>] _mutex_lock_irqsave+0x8/0x10
[<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
[<c01b2335>] avc_has_perm+0x35/0x68
[<c01b79ca>] ipc_has_perm+0x6a/0x80
[<c01a9832>] ipcperms+0x82/0xb0
[<c01ad59f>] do_shmat+0xbf/0x2e0
[<c010bbef>] sys_ipc+0x19f/0x250
[<c0105bff>] syscall_call+0x7/0xb
On Fri, 2004-10-15 at 06:26, Ingo Molnar wrote:
> i have released the -U3 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
OK, I got this to build by enabling CONFIG_SMP, but it died during the
boot process. It may have been network related, as it hung while
running ntpdate. I hit ctrl-C and the boot process continued, but as
soon as gdm started I got a blank screen and could not get to X or the
console.
Oct 16 02:01:22 krustophenia kernel: eth0: link up, 100Mbps, full-duplex, lpa 0x45E1
Oct 16 02:01:22 krustophenia kernel: NET: Registered protocol family 17
This was the last thing in dmesg. I did not see any errors at all
during the boot process.
I suspect the network driver, via-rhine.
Lee
* Lee Revell <[email protected]> wrote:
> > > i have released the -U3 PREEMPT_REALTIME patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
> >
> > Does not compile. .config attached:
>
> It builds fine if CONFIG_SMP is set. Am I really the only person
> running this on UP?
i regularly test it on UP. Do you have SPINLOCK_DEBUG enabled perhaps?
That doesn't work right now. You can enable DEBUG_SPINLOCK_SLEEP and
DEBUG_PREEMPT.
Ingo
* Bill Huey <[email protected]> wrote:
> On Fri, Oct 15, 2004 at 12:26:33PM +0200, Ingo Molnar wrote:
> >
> > i have released the -U3 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
> Atomic violations:
ok, the rtc clock needs to go back to raw for the time being - the patch
below should fix it.
Ingo
--- linux.old/arch/i386/kernel/time.c
+++ linux.new/arch/i386/kernel/time.c
@@ -81,7 +81,7 @@ unsigned long cpu_khz; /* Detected as we
extern unsigned long wall_jiffies;
-DECLARE_SPINLOCK(rtc_lock);
+DECLARE_RAW_SPINLOCK(rtc_lock);
#include <asm/i8253.h>
--- linux.old/include/linux/mc146818rtc.h
+++ linux.new/include/linux/mc146818rtc.h
@@ -16,7 +16,7 @@
#include <linux/spinlock.h> /* spinlock_t */
#include <asm/mc146818rtc.h> /* register access macros */
-extern spinlock_t rtc_lock; /* serialize CMOS RAM access */
+extern raw_spinlock_t rtc_lock; /* serialize CMOS RAM access */
/**********************************************************************
* register summary
* Adam Heath <[email protected]> wrote:
> On Fri, 15 Oct 2004, Ingo Molnar wrote:
>
> >
> > i have released the -U3 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>
> scheduling while atomic: postmaster/0x04000002/3175
> caller is cond_resched+0x53/0x70
> [<c01069f7>] dump_stack+0x17/0x20
> [<c027b457>] schedule+0x517/0x550
> [<c027b9c3>] cond_resched+0x53/0x70
> [<c012cdc7>] _mutex_lock+0x17/0x40
> [<c012ce18>] _mutex_lock_irqsave+0x8/0x10
> [<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
> [<c01b2335>] avc_has_perm+0x35/0x68
> [<c01b79ca>] ipc_has_perm+0x6a/0x80
> [<c01ab716>] semctl_main+0xa6/0x410
> [<c01abcad>] sys_semctl+0xad/0xb0
> [<c010bafd>] sys_ipc+0xad/0x250
> [<c0105bff>] syscall_call+0x7/0xb
thanks - that's the IPC code that is not converted over from RCU yet.
a suggestion for future testing: please enable PREEMPT_TIMING for the
next kernels you build, it will print such entries at the end of
stacktraces:
preempt count: 2
entry 1: cpu_idle+0x38/0x90 / (start_kernel+0x1ac/0x1f0)
entry 2: _spin_lock+0x22/0x80 / (timer_interrupt+0x1b/0x130)
while in this particular IPC case it's immediately visible that it's the
IPC RCU use that is the root of the problem, the preemption trace
printout can be very helpful in other cases to quickly identify where
the non-preemptible section was started. E.g. the networking code sometimes
has very deep nesting and non-obvious locking. Thanks,
Ingo
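As an illustration of the preemption trace described above, here is a
minimal user-space sketch in C. It is not the code from the patch: the
names sketch_preempt_disable(), sketch_preempt_enable(),
sketch_print_preempt_trace() and MAX_PREEMPT_TRACE are hypothetical. It
assumes that each nested preempt-disable simply records its call site,
so that the printout can show who opened each level of the atomic
section.

```c
#include <assert.h>
#include <stdio.h>

#define MAX_PREEMPT_TRACE 8	/* hypothetical limit on recorded nesting depth */

/* One record per nesting level: who disabled preemption, and its caller. */
struct trace_entry {
	const char *caller;
	const char *parent;
};

static struct trace_entry preempt_trace[MAX_PREEMPT_TRACE];
static int preempt_count;

/* Stand-in for preempt_disable(): remember the call site, bump the count. */
static void sketch_preempt_disable(const char *caller, const char *parent)
{
	if (preempt_count < MAX_PREEMPT_TRACE) {
		preempt_trace[preempt_count].caller = caller;
		preempt_trace[preempt_count].parent = parent;
	}
	preempt_count++;
}

/* Stand-in for preempt_enable(): drop one nesting level. */
static void sketch_preempt_enable(void)
{
	assert(preempt_count > 0);
	preempt_count--;
}

/* Print one line per level that is still open, outermost first. */
static void sketch_print_preempt_trace(void)
{
	int i;

	printf("preempt count: %d\n", preempt_count);
	for (i = 0; i < preempt_count && i < MAX_PREEMPT_TRACE; i++)
		printf("entry %d: %s / (%s)\n", i + 1,
		       preempt_trace[i].caller, preempt_trace[i].parent);
}
```

Calling sketch_preempt_disable() once per call site from the example and
then sketch_print_preempt_trace() prints lines in the same shape as the
printout quoted above.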
* Ingo Molnar <[email protected]> wrote:
> a suggestion for future testing: please enable PREEMPT_TIMING for the
> next kernels you build, it will print such entries at the end of
> stacktraces:
>
> preempt count: 2
> entry 1: cpu_idle+0x38/0x90 / (start_kernel+0x1ac/0x1f0)
> entry 2: _spin_lock+0x22/0x80 / (timer_interrupt+0x1b/0x130)
correction: in -U3 you'll also need to enable LATENCY_TRACE for this to
work. I've fixed this in my tree and from -U4 onwards the preemption
trace will be maintained and printed if DEBUG_PREEMPT is enabled.
Ingo
On Sat, 2004-10-16 at 02:42, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > > > i have released the -U3 PREEMPT_REALTIME patch:
> > > >
> > > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
> > >
> > > Does not compile. .config attached:
> >
> > It builds fine if CONFIG_SMP is set. Am I really the only person
> > running this on UP?
>
> i regularly test it on UP. Do you have SPINLOCK_DEBUG enabled perhaps?
> That doesn't work right now. You can enable DEBUG_SPINLOCK_SLEEP and
> DEBUG_PREEMPT.
Sorry, I did have that enabled. This caused a build failure with a UP
build and a boot failure with CONFIG_SMP.
Lee
Lee Revell writes:
>> On Fri, 2004-10-15 at 06:26, Ingo Molnar wrote:
>> > i have released the -U3 PREEMPT_REALTIME patch:
>> >
>> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>>
>> Does not compile. .config attached:
>
> It builds fine if CONFIG_SMP is set. Am I really the only person
> running this on UP?
>
I run both, on different machines.
I'm actually running 2.6.9-rc4-mm1-U3 at this very moment, on my laptop
(P4 2.53GHz/UP, Mdk 10.1c) and also on my desktop machine (P4
2.80GHz/SMP/HT, SuSE 9.1).
However, on the desktop (SMP/HT) I could only make it boot/init
successfully with CONFIG_PREEMPT_REALTIME off. On my laptop (UP) it is
running pretty well on full RT.
Both kernel configs are attached.
config-2.6.9-rc4-mm1-U3.0smp.gz: is from my SMP/HT desktop (non-RT-preempted)
config-2.6.9-rc4-mm1-RT-U3.0.gz: is from my laptop (RT-preempted).
These seem stable, even though they are taken from the menu-du-jour :)
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
* Lee Revell <[email protected]> wrote:
> > i regularly test it on UP. Do you have SPINLOCK_DEBUG enabled perhaps?
> > That doesn't work right now. You can enable DEBUG_SPINLOCK_SLEEP and
> > DEBUG_PREEMPT.
>
> Sorry, I did have that enabled. This caused a build failure with a UP
> build and a boot failure with CONFIG_SMP.
not your fault at all - i cleaned this up in my tree so that only valid
combinations can be selected, these fixes will show up in -U4.
it seems that SMP + PREEMPT_TIMING is not stable though, somehow the
latency printk's cause a crash sooner or later. I'm still debugging this
problem. Without PREEMPT_TIMING the SMP kernel is stable.
Ingo
Ingo Molnar wrote:
>
> it seems that SMP + PREEMPT_TIMING is not stable though, somehow the
> latency printk's cause a crash sooner or later. I'm still debugging this
> problem. Without PREEMPT_TIMING the SMP kernel is stable.
>
How true!
My first successful SMP/HT PREEMPT_REALTIME has been achieved, by just
turning off PREEMPT_TIMING. So you won't get any latency trace dumps from
here ;)
Actual .config.gz attached.
But now I have my two prime machines on full-RT-throttle, yeepee! :)
Thankful for the hint!
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Ingo Molnar wrote:
> >
> > it seems that SMP + PREEMPT_TIMING is not stable though, somehow the
> > latency printk's cause a crash sooner or later. I'm still debugging this
> > problem. Without PREEMPT_TIMING the SMP kernel is stable.
> >
>
> How true!
>
> My first successful SMP/HT PREEMPT_REALTIME has been achieved, by just
> turning off PREEMPT_TIMING. So you won't get any latency trace dumps
> from here ;)
meanwhile i've mostly debugged the problem: it's an illegal recursion
into the timer code caused by PREEMPT_TIMING printks from within the
timer code calling do_poke_blanked_console() which in turn calls
del_timer() ...
this bug has been in the PREEMPT_TIMING code for almost forever, it's
just that under PREEMPT_REALTIME all the other latency sources are gone,
only the few places that still use raw spinlocks are remaining, amongst
them the timer code ...
> Actual .config.gz attached.
>
> But now I have my two prime machines on full-RT-throttle, yeepee! :)
great! :) I strongly suspect this bug is the one Lee and Mark are seeing
too.
Ingo
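The recursion described above can be modelled in a few lines of
user-space C. This is only a sketch under assumptions: every sketch_*
name is hypothetical, the "lock" is a plain flag that counts re-entry
attempts instead of deadlocking, and the deferral guard is one possible
way to avoid the recursion, not necessarily what the patch does.

```c
#include <assert.h>
#include <stdbool.h>

static bool timer_lock_held;	/* models a non-recursive raw spinlock */
static int recursion_detected;	/* re-entry attempts (a real lock would hang) */
static int deferred_pokes;	/* console pokes postponed until unlock */

static bool sketch_timer_lock(void)
{
	if (timer_lock_held) {
		recursion_detected++;
		return false;
	}
	timer_lock_held = true;
	return true;
}

static void sketch_timer_unlock(void)
{
	assert(timer_lock_held);
	timer_lock_held = false;
}

/* Models del_timer(): it needs the timer lock. */
static bool sketch_del_timer(void)
{
	if (!sketch_timer_lock())
		return false;
	sketch_timer_unlock();
	return true;
}

/* Models printk() -> do_poke_blanked_console() -> del_timer(). */
static void sketch_printk(bool guard)
{
	if (guard && timer_lock_held) {
		deferred_pokes++;	/* do the poke later, outside the lock */
		return;
	}
	sketch_del_timer();
}

/* Models timer code that prints a latency message while holding the lock. */
static void sketch_run_timers(bool guard)
{
	sketch_timer_lock();
	sketch_printk(guard);
	sketch_timer_unlock();
	while (deferred_pokes > 0) {
		deferred_pokes--;
		sketch_del_timer();
	}
}
```

With guard set to false the model records one re-entry into the timer
lock; with guard set to true the poke is postponed until the lock has
been dropped and no re-entry happens.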
Ingo,
>
> Rui Nuno Capela wrote:
>
>> Ingo Molnar wrote:
>> >
>> > it seems that SMP + PREEMPT_TIMING is not stable though, somehow the
>> > latency printk's cause a crash sooner or later. I'm still debugging
>> > this problem. Without PREEMPT_TIMING the SMP kernel is stable.
>> >
>>
>> How true!
>>
>> My first successful SMP/HT PREEMPT_REALTIME has been achieved, by just
>> turning off PREEMPT_TIMING. So you won't get any latency trace dumps
>> from here ;)
OOPS. I think I've made a terrible mistake. It seems that
SMP+PREEMPT_REALTIME is NOT solved after all in my P4/HT box, even with
PREEMPT_TIMING not set.
As one may check from the .config.gz I sent just minutes ago,
PREEMPT_REALTIME wasn't being set, and the RT label was bogus.
So sorry to mislead you all. I should have known, it was too good to be
true :(
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> OOPS. I think I've made a terrible mistake. It seems that
> SMP+PREEMPT_REALTIME is NOT solved after all in my P4/HT box, even
> with PREEMPT_TIMING not set.
no problem, there are other types of bugs too reported by others that do
not seem to be related to the PREEMPT_TIMING problem.
Ingo
Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
>
>>>i regularly test it on UP. Do you have SPINLOCK_DEBUG enabled perhaps?
>>>That doesn't work right now. You can enable DEBUG_SPINLOCK_SLEEP and
>>>DEBUG_PREEMPT.
>>
>>Sorry, I did have that enabled. This caused a build failure with a UP
>>build and a boot failure with CONFIG_SMP.
>
>
> not your fault at all - i cleaned this up in my tree so that only valid
> combinations can be selected, these fixes will show up in -U4.
>
> it seems that SMP + PREEMPT_TIMING is not stable though, somehow the
> latency printk's cause a crash sooner or later. I'm still debugging this
> problem. Without PREEMPT_TIMING the SMP kernel is stable.
>
> Ingo
>
On my SMP system here at home I have not seen this instability. It's
been rock solid since yesterday morning and I already posted the worst
latencies that have been generated. My SMP system at work was up and
doing fine until I shut it down when I left last night, and I posted the
high latencies from that yesterday as well. All in all it doesn't look
too bad to me.
kr
Home SMP system:
[root@porky latencies]# uptime
07:23:55 up 19:42, 3 users, load average: 67.22, 83.69, 57.23
Current time: Sat Oct 16 06:37:08 CDT 2004
Exiting test run..
Displaying report...
Total test time: 18h46m6s
Tests passed:
TTCP ran 1024 times in 8h32m50s, failed on 0 attempts.
FS ran 64 times in 18h46m3s, failed on 0 attempts.
CRASHME ran 256 times in 2h31m42s, failed on 0 attempts.
FIFOS_MMAP ran 256 times in 11h23m41s, failed on 0 attempts.
P3-FPU ran 256 times in 6h41m44s, failed on 0 attempts.
SAVE-STATE ran 1 times in 1m2s, failed on 0 attempts.
**** Test run completed successfully ****
Rui Nuno Capela wrote:
> Lee Revell writes:
>
>>>On Fri, 2004-10-15 at 06:26, Ingo Molnar wrote:
>>>
>>>>i have released the -U3 PREEMPT_REALTIME patch:
>>>>
>>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
>>>
>>>Does not compile. .config attached:
>>
>>It builds fine if CONFIG_SMP is set. Am I really the only person
>>running this on UP?
>>
>
>
> I run both, on different machines.
>
> I'm actually running 2.6.9-rc4-mm1-U3 at this very moment, on my laptop
> (P4 2.53GHz/UP, Mdk 10.1c) and also on my desktop machine (P4
> 2.80GHz/SMP/HT, SuSE 9.1).
>
> However, on the desktop (SMP/HT) I could only make it boot/init
> successfully with CONFIG_PREEMPT_REALTIME off. On my laptop (UP) it is
> running pretty well on full RT.
I'm curious what you get when you try to boot the SMP system with
REALTIME on? My SMP/HT system at the office works fine with this.
Although there is one difference that jumps out at me. I have disabled
ACPI. I don't have the config handy so I can't do a complete comparison,
just going from memory.
kr
* K.R. Foley <[email protected]> wrote:
> >>It builds fine if CONFIG_SMP is set. Am I really the only person
> >>running this on UP?
> >
> >I run both, on different machines.
> >
> >I'm actually running 2.6.9-rc4-mm1-U3 at this very moment, on my laptop
> >(P4 2.53GHz/UP, Mdk 10.1c) and also on my desktop machine (P4
> >2.80GHz/SMP/HT, SuSE 9.1).
> >
> >However, on the desktop (SMP/HT) I could only make it boot/init
> >successfully with CONFIG_PREEMPT_REALTIME off. On my laptop (UP) it is
> >running pretty well on full RT.
>
> I'm curious what you get when you try to boot the SMP system with
> REALTIME on? My SMP/HT system at the office works fine with this.
> Although there is one difference that jumps out at me. I have disabled
> ACPI. I don't have the config handy so I can't do a complete
> comparison, just going from memory.
one group of complaints seems to be related to SELINUX=y: it has hooks
all across the kernel deep within the locking hierarchy - and then
itself it does pretty complex stuff too. IPC is certainly broken due to
this, but some networking problems seem to be related too.
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>>>It builds fine if CONFIG_SMP is set. Am I really the only person
>>>>running this on UP?
>>>
>>>I run both, on different machines.
>>>
>>>I'm actually running 2.6.9-rc4-mm1-U3 at this very moment, on my laptop
>>>(P4 2.53GHz/UP, Mdk 10.1c) and also on my desktop machine (P4
>>>2.80GHz/SMP/HT, SuSE 9.1).
>>>
>>>However, on the desktop (SMP/HT) I could only make it boot/init
>>>successfully with CONFIG_PREEMPT_REALTIME off. On my laptop (UP) it is
>>>running pretty well on full RT.
>>
>>I'm curious what you get when you try to boot the SMP system with
>>REALTIME on? My SMP/HT system at the office works fine with this.
>>Although there is one difference that jumps out at me. I have disabled
>>ACPI. I don't have the config handy so I can't do a complete
>>comparison, just going from memory.
>
>
> one group of complaints seems to be related to SELINUX=y: it has hooks
> all across the kernel deep within the locking hierarchy - and then
> itself it does pretty complex stuff too. IPC is certainly broken due to
> this, but some networking problems seem to be related too.
>
> Ingo
>
Well therein lies a big difference. I have disabled this on all the
systems that I am testing on.
kr
K.R. Foley wrote:
>Rui Nuno Capela:
>>
>> I run both, on different machines.
>>
>> I'm actually running 2.6.9-rc4-mm1-U3 at this very moment, on my laptop
>> (P4 2.53GHz/UP, Mdk 10.1c) and also on my desktop machine (P4
>> 2.80GHz/SMP/HT, SuSE 9.1).
>>
>> However, on the desktop (SMP/HT) I could only make it boot/init
>> successfully with CONFIG_PREEMPT_REALTIME off. On my laptop (UP) it is
>> running pretty well on full RT.
>
> I'm curious what you get when you try to boot the SMP system with
> REALTIME on? My SMP/HT system at the office works fine with this.
> Although there is one difference that jumps out at me. I have disabled
> ACPI. I don't have the config handy so I can't do a complete comparison,
> just going from memory.
>
Hmm. The way I see it, if I say acpi=off on kernel boot, I lose HT, and
end up with an SMP-enabled kernel running on only one CPU. To keep ACPI disabled
but rely on it to show up those hyperthreaded virtual CPUs on boot, one
should say acpi=ht, I guess.
Is that what you're asking?
--
rncbc aka Rui Nuno Capela
[email protected]
Rui Nuno Capela wrote:
> K.R. Foley wrote:
>
>>Rui Nuno Capela:
>>
>>>I run both, on different machines.
>>>
>>>I'm actually running 2.6.9-rc4-mm1-U3 at this very moment, on my laptop
>>>(P4 2.53GHz/UP, Mdk 10.1c) and also on my desktop machine (P4
>>>2.80GHz/SMP/HT, SuSE 9.1).
>>>
>>>However, on the desktop (SMP/HT) I could only make it boot/init
>>>successfully with CONFIG_PREEMPT_REALTIME off. On my laptop (UP) it is
>>>running pretty well on full RT.
>>
>>I'm curious what you get when you try to boot the SMP system with
>>REALTIME on? My SMP/HT system at the office works fine with this.
>>Although there is one difference that jumps out at me. I have disabled
>>ACPI. I don't have the config handy so I can't do a complete comparison,
>>just going from memory.
>>
>
>
> Hmm. The way I see it, if I say acpi=off on kernel boot, I lose HT, and
> end up with an SMP-enabled kernel running on only one CPU. To keep ACPI disabled
> but rely on it to show up those hyperthreaded virtual CPUs on boot, one
> should say acpi=ht, I guess.
>
> Is that what you're asking?
Actually what I was asking was what messages, etc. you get before the
system fails to boot. I think Ingo already pointed out why several
people, maybe yourself included, are having problems with this patch.
As for acpi, I just disabled the power management stuff prior to
building the kernel. Of course my ht still works.
kr
On Friday 15 October 2004 12:26, Ingo Molnar wrote:
> i have released the -U3 PREEMPT_REALTIME patch:
>
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
i'm not sure if this bug depends on the preemption patch, or if it is a general
one in the -mm1 tree.
kernel BUG at fs/fat/cache.c:150!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: ipx p8022 psnap llc nvidia snd_mixer_oss sd_mod tda9887
tuner saa7134 video_buf v4l2_common v4l1_compat ir_common videodev 8139too
sis900 crc32 ehci_hcd usb_storage scsi_mod ohci_hcd usbcore snd_intel8x0
snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc ohci1394
ieee1394 i2c_sis96x i2c_core
CPU: 0
EIP: 0060:[<c01a0b53>] Tainted: P VLI
EFLAGS: 00210202 (2.6.9-rc4-mm1-vp-u3)
EIP is at fat_cache_add+0x135/0x151
eax: 00000001 ebx: cf5712b8 ecx: c656f780 edx: cf571201
esi: ce414c50 edi: c656f768 ebp: c656f7b8 esp: ce414c1c
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process mplayer (pid: 9843, threadinfo=ce414000 task=c8128a20)
Stack: c656f7b8 ce414c88 cf785800 00000001 c656f768 c01a11bb ce414c88 ce414c8c
c6d06300 00000000 c8128a20 0003ffff c656f7b8 00000000 00000000 00000001
000db50e c656f7b8 c656f768 cf785800 cee1d800 c01a12c1 ce414c8c 00200246
Call Trace:
[<c01a11bb>] fat_get_cluster+0x11f/0x1de
[<c01a12c1>] fat_bmap_cluster+0x47/0x98
[<c01a13f1>] fat_bmap+0xdf/0x101
[<c01a36e8>] fat_get_block+0x30/0x198
[<c0152866>] alloc_buffer_head+0x33/0x52
[<c01501ea>] create_buffers+0x51/0x86
[<c015141c>] block_read_full_page+0x1bd/0x2cc
[<c01a36b8>] fat_get_block+0x0/0x198
[<c0132f0c>] add_to_page_cache+0x58/0x6e
[<c013a0e4>] read_pages+0xbe/0x15e
[<c013a49c>] do_page_cache_readahead+0x120/0x19b
[<c013a606>] page_cache_readahead+0xef/0x1f8
[<c01336ab>] do_generic_mapping_read+0x133/0x54b
[<c013c911>] lru_cache_add+0xd/0x4f
[<c0133d53>] __generic_file_aio_read+0x1a3/0x218
[<c0133ac3>] file_read_actor+0x0/0xed
[<c0133ec5>] generic_file_read+0x9c/0xba
[<c012b84c>] _mutex_lock+0x20/0x36
[<c01244c8>] do_sigaction+0x1dc/0x1f7
[<c012b468>] autoremove_wake_function+0x0/0x43
[<c012b84c>] _mutex_lock+0x20/0x36
[<c0167974>] dnotify_parent+0x35/0x95
[<c014e0a9>] vfs_read+0xcd/0x126
[<c014e36f>] sys_read+0x41/0x6a
[<c0103f7b>] syscall_call+0x7/0xb
Code: f2 89 e8 e8 9f fe ff ff 85 c0 89 c3 74 2a 83 6f 20 01 8b 04 24 39 00 75
24 8b 14 24 a1 74 0b 3c c0 e8 9c b1 f9 ff e9 40 ff ff ff <0f> 0b 96 00 e5 b7
2d c0 e9 2f ff ff ff 8b 1c 24 eb 88 0f 0b 48
<5>Attached scsi generic sg0 at scsi1, channel 0, id 0, lun 0, type 0
best regards,
dominik
* Dominik Karall <[email protected]> wrote:
> i'm not sure if this bug depends on the preemption patch, or if it is a
> general one in the -mm1 tree.
>
> kernel BUG at fs/fat/cache.c:150!
> EIP is at fat_cache_add+0x135/0x151
> Process mplayer (pid: 9843, threadinfo=ce414000 task=c8128a20)
> Call Trace:
> [<c01a11bb>] fat_get_cluster+0x11f/0x1de
> [<c01a12c1>] fat_bmap_cluster+0x47/0x98
> [<c01a13f1>] fat_bmap+0xdf/0x101
> [<c01a36e8>] fat_get_block+0x30/0x198
> [<c0152866>] alloc_buffer_head+0x33/0x52
> [<c01501ea>] create_buffers+0x51/0x86
> [<c015141c>] block_read_full_page+0x1bd/0x2cc
> [<c01a36b8>] fat_get_block+0x0/0x198
> [<c0132f0c>] add_to_page_cache+0x58/0x6e
> [<c013a0e4>] read_pages+0xbe/0x15e
> [<c013a49c>] do_page_cache_readahead+0x120/0x19b
> [<c013a606>] page_cache_readahead+0xef/0x1f8
> [<c01336ab>] do_generic_mapping_read+0x133/0x54b
> [<c013c911>] lru_cache_add+0xd/0x4f
> [<c0133d53>] __generic_file_aio_read+0x1a3/0x218
> [<c0133ac3>] file_read_actor+0x0/0xed
> [<c0133ec5>] generic_file_read+0x9c/0xba
> [<c012b84c>] _mutex_lock+0x20/0x36
> [<c01244c8>] do_sigaction+0x1dc/0x1f7
> [<c012b468>] autoremove_wake_function+0x0/0x43
> [<c012b84c>] _mutex_lock+0x20/0x36
> [<c0167974>] dnotify_parent+0x35/0x95
> [<c014e0a9>] vfs_read+0xcd/0x126
> [<c014e36f>] sys_read+0x41/0x6a
> [<c0103f7b>] syscall_call+0x7/0xb
indeed this does not seem to be related to the preemption patch. How
hard is it to reproduce this problem? If it's easy then please try with
vanilla 2.6.9-rc4-mm1 (and if it breaks too, with 2.6.9-rc4 as well), to
narrow down where the breakage got introduced.
Ingo
i have released the -U4 PREEMPT_REALTIME patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
this is a fixes-only release, and it is still experimental code.
Changes since -U3:
- crash fix: fix the rtc_lock related crash reported by Bill Huey, the
rtc_lock is now a raw spinlock again.
- crash fix: avoid recursion into timer code when PREEMPT_TIMING is
enabled.
- crash/printout fix: revert some of selinux's locks back to raw
spinlocks. This could fix the problems reported by Mark H. Johnson,
Adam Heath.
- build fix: fix the compilation problems reported by Lee Revell
- debug feature: implemented 'print backtrace on all CPUs' on SMP
systems, SysRq+L will trigger it.
- build cleanup: restructure the debug config options. This should
resolve the build problems and incompatible debug-options
problems reported.
- cleanup: move definitions around, turn on generic rwlocks instead of
the x86-specific version.
i think all bugs that were reported with logs are fixed in -U4. Please
re-report any issue that might remain.
to create a -U4 tree from scratch the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
Ingo
Ingo Molnar wrote:
> i have released the -U4 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
>
> this is a fixes-only release, and it is still experimental code.
>
This fixes a compile problem on UP with CONFIG_LATENCY_TRACE enabled.
--- linux-2.6.9-rc4-mm1/kernel/latency.c.orig 2004-10-16 12:02:39.539487008 -0500
+++ linux-2.6.9-rc4-mm1/kernel/latency.c 2004-10-16 12:03:44.536344303 -0500
@@ -602,7 +602,8 @@
/*
* On UP, NMI tracing is quite simple:
*/
-void notrace nmi_trace(unsigned long eip, unsigned long parent_eip)
+void notrace nmi_trace(unsigned long eip, unsigned long parent_eip,
+ unsigned long flags)
{
__trace(eip, parent_eip);
}
On Sat, 16 Oct 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > On Fri, 15 Oct 2004, Ingo Molnar wrote:
> >
> > >
> > > i have released the -U3 PREEMPT_REALTIME patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U3
> >
> > scheduling while atomic: postmaster/0x04000002/3175
> > caller is cond_resched+0x53/0x70
> > [<c01069f7>] dump_stack+0x17/0x20
> > [<c027b457>] schedule+0x517/0x550
> > [<c027b9c3>] cond_resched+0x53/0x70
> > [<c012cdc7>] _mutex_lock+0x17/0x40
> > [<c012ce18>] _mutex_lock_irqsave+0x8/0x10
> > [<c01b21ae>] avc_has_perm_noaudit+0x2e/0x180
> > [<c01b2335>] avc_has_perm+0x35/0x68
> > [<c01b79ca>] ipc_has_perm+0x6a/0x80
> > [<c01ab716>] semctl_main+0xa6/0x410
> > [<c01abcad>] sys_semctl+0xad/0xb0
> > [<c010bafd>] sys_ipc+0xad/0x250
> > [<c0105bff>] syscall_call+0x7/0xb
>
> thanks - that's the IPC code that is not converted over from RCU yet.
>
> a suggestion for future testing: please enable PREEMPT_TIMING for the
> next kernels you build, it will print such entries at the end of
> stacktraces:
adam@gradall:~/kernel/gradall/linux-2.6.9-rc4-mm1-U3$ grep PREEMPT /boot/config-2.6.9-rc4-mm1-vp-u3
CONFIG_PREEMPT_TIMING=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_REALTIME=y
CONFIG_DEBUG_PREEMPT=y
adam@gradall:~/kernel/gradall/linux-2.6.9-rc4-mm1-U3$ grep LATENCY /boot/config-2.6.9-rc4-mm1-vp-u3
# CONFIG_LATENCY_TRACE is not set
So, it must not be working.
I'm recompiling now to enable LATENCY_TRACE, however.
> preempt count: 2
> entry 1: cpu_idle+0x38/0x90 / (start_kernel+0x1ac/0x1f0)
> entry 2: _spin_lock+0x22/0x80 / (timer_interrupt+0x1b/0x130)
There were no preempt count lines anywhere.
On Sat, 16 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U4 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
>
> this is a fixes-only release, and it is still experimental code.
You forgot to lowercase RT and U in the EXTRAVERSION.
* Adam Heath <[email protected]> wrote:
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
> >
> > this is a fixes-only release, and it is still experimental code.
>
> You forgot to lowercase RT and U in the EXTRAVERSION.
i changed my mind because lowercase it looks pretty ugly in uname,
appended to the already lowercase -mm string. Why does Debian need to
have it in lowercase anyway? It doesnt seem to make much sense.
Ingo
* K.R. Foley <[email protected]> wrote:
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
> >
> >this is a fixes-only release, and it is still experimental code.
> >
>
> This fixes a compile problem on UP with CONFIG_LATENCY_TRACE enabled.
thanks - i've fixed the U4 patch too so new downloads will have the fix.
Ingo
On Sat, 2004-10-16 at 21:24 +0200, Ingo Molnar wrote:
> i changed my mind because lowercase it looks pretty ugly in uname,
> appended to the already lowercase -mm string. Why does Debian need to
> have it in lowercase anyway? It doesnt seem to make much sense.
It becomes part of the package version, and the package versions are
lowercase. But I agree, it doesn't make much sense.
-RT does look better.
Robert Love
* Adam Heath <[email protected]> wrote:
> > i have released the -U4 PREEMPT_REALTIME patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
> >
> > this is a fixes-only release, and it is still experimental code.
>
> Few stack dumps now.
these are normal, they are the PREEMPT_TIMING traces which get printed
every time the kernel measures a new latency maximum. The stack dumps
are done to make it easier to identify which place took that long of a
delay and why. (if LATENCY_TRACE is enabled too then the last latency
and its trace can also be found in /proc/latency_trace.)
after bootup it makes sense to reset the maximum:
echo 10 > /proc/sys/kernel/preempt_max_latency
because during bootup there are a number of latencies that are one-time
only.
Ingo
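The "new maximum" behaviour described above can be pictured with a small userspace sketch. This is only an illustrative model, not the kernel's code: struct latency_tracker, latency_report() and latency_reset() are hypothetical names, and the strict "greater than" comparison is an assumption that merely matches the increasing 1/2/4/493/526 us sequence seen in the logs in this thread.

```c
#include <stdio.h>

/* Hypothetical userspace model of the PREEMPT_TIMING maximum; the names
 * below are illustrative and do not exist in the kernel patch. */
struct latency_tracker {
    unsigned long max_us;   /* largest latency seen so far, in usecs */
};

/* Report a measured critical-section length. Returns 1 (and prints a line
 * shaped like the kernel's message) only when a new maximum is set, which
 * is the point where the real kernel would also dump a stack trace. */
static int latency_report(struct latency_tracker *t, unsigned long delta_us)
{
    if (delta_us <= t->max_us)
        return 0;           /* not a new maximum: stay silent */
    t->max_us = delta_us;
    printf("new %lu us maximum-latency critical section.\n", delta_us);
    return 1;
}

/* Models "echo N > /proc/sys/kernel/preempt_max_latency": overwrite the
 * stored maximum so one-time bootup latencies no longer mask later ones. */
static void latency_reset(struct latency_tracker *t, unsigned long new_max_us)
{
    t->max_us = new_max_us;
}
```

With such a model, a 493 us bootup latency would hide every later 200 us event until latency_reset(&t, 10) is called, which is why resetting after bootup produces a fresh burst of traces.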
On Sat, 16 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U4 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
>
> this is a fixes-only release, and it is still experimental code.
Few stack dumps now.
[cut early messages]
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 364k freed
(swapper/1/CPU#0): new 1 us maximum-latency critical section.
=> started at timestamp 27541945: <call_console_drivers+0x76/0x140>
=> ended at timestamp 27541946: <__find_get_block+0x56/0xf0>
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01323c7>] check_preempt_timing+0x1b7/0x260
[<c015e7a6>] __find_get_block+0x56/0xf0
[<c0132650>] sub_preempt_count+0x60/0x90
[<c015e7a6>] __find_get_block+0x56/0xf0
[<c015e7a6>] __find_get_block+0x56/0xf0
[<c0113304>] mcount+0x14/0x18
[<c015e86d>] __getblk+0x2d/0x70
[<c01992a2>] ext3_getblk+0x92/0x280
[<c019d5ee>] ext3_find_entry+0xe/0x3d0
[<c019dbff>] ext3_lookup+0x3f/0xc0
[<c0113304>] mcount+0x14/0x18
[<c019d6f6>] ext3_find_entry+0x116/0x3d0
[<c0131b7d>] __mcount+0x1d/0x20
[<c013131e>] _mutex_unlock+0xe/0x60
[<c0131b7d>] __mcount+0x1d/0x20
[<c019dbd4>] ext3_lookup+0x14/0xc0
[<c0169ec0>] real_lookup+0xf0/0x120
[<c0113304>] mcount+0x14/0x18
[<c019dbff>] ext3_lookup+0x3f/0xc0
[<c0169ec0>] real_lookup+0xf0/0x120
[<c016a159>] do_lookup+0x89/0xa0
[<c016a27d>] link_path_walk+0x10d/0xde0
[<c016b238>] path_lookup+0xa8/0x1c0
[<c015ab61>] filp_open+0x41/0x70
[<c016bae9>] open_namei+0x89/0x700
[<c0131b7d>] __mcount+0x1d/0x20
[<c015afdb>] sys_open+0x4b/0xb0
[<c0113304>] mcount+0x14/0x18
[<c015ab61>] filp_open+0x41/0x70
[<c013131e>] _mutex_unlock+0xe/0x60
[<c01dccb4>] find_next_zero_bit+0x14/0xa8
[<c015ae70>] get_unused_fd+0x90/0xf0
[<c015afdb>] sys_open+0x4b/0xb0
[<c01002e0>] init+0x0/0x120
[<c010037c>] init+0x9c/0x120
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __find_get_block+0x32/0xf0 / (__getblk+0x2d/0x70)
.. entry 2: print_traces+0x17/0x43 / (dump_stack+0x23/0x30)
=> dump-end timestamp 27543467
(kjournald/19/CPU#0): new 2 us maximum-latency critical section.
=> started at timestamp 27543602: <call_console_drivers+0x76/0x140>
=> ended at timestamp 27543604: <preempt_schedule+0x61/0x80>
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01323c7>] check_preempt_timing+0x1b7/0x260
[<c02a3bf1>] preempt_schedule+0x61/0x80
[<c0132650>] sub_preempt_count+0x60/0x90
[<c02a3bf1>] preempt_schedule+0x61/0x80
[<c02a3bf1>] preempt_schedule+0x61/0x80
[<c01aefa3>] kjournald+0x73/0x240
[<c0117f31>] finish_task_switch+0x51/0xa0
[<c0117f31>] finish_task_switch+0x51/0xa0
[<c0117faf>] schedule_tail+0x2f/0x80
[<c0105faa>] ret_from_fork+0x6/0x14
[<c01aef30>] kjournald+0x0/0x240
[<c01aef00>] commit_timeout+0x0/0x20
[<c01aef30>] kjournald+0x0/0x240
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 04000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x43 / (dump_stack+0x23/0x30)
=> dump-end timestamp 27544227
kjournald starting. Commit interval 5 seconds
(kjournald/19/CPU#0): new 4 us maximum-latency critical section.
=> started at timestamp 27544266: <vprintk+0x2b/0x190>
=> ended at timestamp 27544270: <vprintk+0x102/0x190>
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01323c7>] check_preempt_timing+0x1b7/0x260
[<c011c672>] vprintk+0x102/0x190
[<c0132650>] sub_preempt_count+0x60/0x90
[<c011c672>] vprintk+0x102/0x190
[<c011c672>] vprintk+0x102/0x190
[<c011c56d>] printk+0x1d/0x20
[<c01aefc1>] kjournald+0x91/0x240
[<c0117f31>] finish_task_switch+0x51/0xa0
[<c0117f31>] finish_task_switch+0x51/0xa0
[<c0117faf>] schedule_tail+0x2f/0x80
[<c0105faa>] ret_from_fork+0x6/0x14
[<c01aef30>] kjournald+0x0/0x240
[<c01aef00>] commit_timeout+0x0/0x20
[<c01aef30>] kjournald+0x0/0x240
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: vprintk+0x2b/0x190 / (printk+0x1d/0x20)
.. entry 2: print_traces+0x17/0x43 / (dump_stack+0x23/0x30)
=> dump-end timestamp 27544470
(kjournald/19/CPU#0): new 493 us maximum-latency critical section.
=> started at timestamp 27544482: <kernel_fpu_begin+0x21/0x60>
=> ended at timestamp 27544975: <_mmx_memcpy+0x131/0x180>
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01323c7>] check_preempt_timing+0x1b7/0x260
[<c01dd311>] _mmx_memcpy+0x131/0x180
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01dd311>] _mmx_memcpy+0x131/0x180
[<c01dd311>] _mmx_memcpy+0x131/0x180
[<c01e3a25>] vgacon_scroll+0x245/0x260
[<c01f43ba>] scrup+0xda/0xf0
[<c0113304>] mcount+0x14/0x18
[<c01f5f02>] lf+0x72/0x80
[<c01f86cf>] vt_console_print+0x13f/0x320
[<c011c2b1>] __call_console_drivers+0x61/0x70
[<c011c3da>] call_console_drivers+0x9a/0x140
[<c011c7f1>] release_console_sem+0x71/0x110
[<c011c68f>] vprintk+0x11f/0x190
[<c011c56d>] printk+0x1d/0x20
[<c01aefc1>] kjournald+0x91/0x240
[<c0117f31>] finish_task_switch+0x51/0xa0
[<c0117f31>] finish_task_switch+0x51/0xa0
[<c0117faf>] schedule_tail+0x2f/0x80
[<c0105faa>] ret_from_fork+0x6/0x14
[<c01aef30>] kjournald+0x0/0x240
[<c01aef00>] commit_timeout+0x0/0x20
[<c01aef30>] kjournald+0x0/0x240
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: kernel_fpu_begin+0x21/0x60 / (_mmx_memcpy+0x36/0x180)
.. entry 2: print_traces+0x17/0x43 / (dump_stack+0x23/0x30)
=> dump-end timestamp 27545263
Adding 516056k swap on /dev/hda1. Priority:-1 extents:1
EXT3 FS on hda5, internal journal
SiI680: IDE controller at PCI slot 0000:00:0d.0
SiI680: chipset revision 2
SiI680: BASE CLOCK == 133
SiI680: 100% native mode on irq 10
ide2: MMIO-DMA , BIOS settings: hde:pio, hdf:pio
ide3: MMIO-DMA , BIOS settings: hdg:pio, hdh:pio
Probing IDE interface ide2...
hde: Maxtor 6Y080P0, ATA DISK drive
ide2 at 0xf8840f80-0xf8840f87,0xf8840f8a on irq 10
hde: max request size: 64KiB
hde: 160086528 sectors (81964 MB) w/7936KiB Cache, CHS=65535/16/63, UDMA(133)
hde: cache flushes supported
hde: hde1
Probing IDE interface ide3...
sis900.c: v1.08.07 11/02/2003
eth0: Realtek RTL8201 PHY transceiver found at address 1.
eth0: Using transceiver found at address 1 as default
eth0: SiS 900 PCI Fast Ethernet at 0xc000, IRQ 5, 00:0a:e6:ab:93:c9.
3c59x: Donald Becker and others. http://www.scyld.com/network/vortex.html
0000:00:0f.0: 3Com PCI 3c905B Cyclone 100baseTx at 0xbc00. Vers LK1.1.19
intel8x0_measure_ac97_clock: measured 49102 usecs
intel8x0: clocking to 48000
SCSI subsystem initialized
device-mapper: 4.1.0-ioctl (2003-12-10) initialised: [email protected]
md: md driver 0.90.0 MAX_MD_DEVS=256, MD_SB_DISKS=27
md: raidstart(pid 154) used deprecated START_ARRAY ioctl. This will not be supported beyond 2.6
md: autorun ...
md: considering hdb1 ...
md: adding hdb1 ...
md: adding hdc1 ...
md: adding hdd1 ...
md: created md0
md: bind<hdd1>
md: bind<hdc1>
md: bind<hdb1>
md: running: <hdb1><hdc1><hdd1>
md: kicking non-fresh hdd1 from array!
md: unbind<hdd1>
md: export_rdev(hdd1)
raid5: automatically using best checksumming function: pIII_sse
pIII_sse : 1224.000 MB/sec
raid5: using function: pIII_sse (1224.000 MB/sec)
md: raid5 personality registered as nr 4
raid5: device hdb1 operational as raid disk 2
raid5: device hdc1 operational as raid disk 0
raid5: allocated 3160kB for md0
raid5: raid level 5 set md0 active with 2 out of 3 devices, algorithm 2
RAID5 conf printout:
--- rd:3 wd:2 fd:1
disk 0, o:1, dev:hdc1
disk 2, o:1, dev:hdb1
md: ... autorun DONE.
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
device-mapper: error adding target to table
device-mapper: : dm-linear: Device lookup failed
(evms_activate/312/CPU#0): new 526 us maximum-latency critical section.
=> started at timestamp 38955962: <kernel_fpu_begin+0x21/0x60>
=> ended at timestamp 38956488: <_mmx_memcpy+0x131/0x180>
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01323c7>] check_preempt_timing+0x1b7/0x260
[<c01dd311>] _mmx_memcpy+0x131/0x180
[<c0132650>] sub_preempt_count+0x60/0x90
[<c01dd311>] _mmx_memcpy+0x131/0x180
[<c01dd311>] _mmx_memcpy+0x131/0x180
[<c01e3a25>] vgacon_scroll+0x245/0x260
[<c01f43ba>] scrup+0xda/0xf0
[<c0113304>] mcount+0x14/0x18
[<c01f5f02>] lf+0x72/0x80
[<c01f86cf>] vt_console_print+0x13f/0x320
[<c011c2b1>] __call_console_drivers+0x61/0x70
[<c011c3da>] call_console_drivers+0x9a/0x140
[<c011c7f1>] release_console_sem+0x71/0x110
[<c011c68f>] vprintk+0x11f/0x190
[<c011c56d>] printk+0x1d/0x20
[<f8909683>] dm_table_add_target+0xb3/0x1b0 [dm_mod]
[<f890c022>] populate_table+0x82/0xe0 [dm_mod]
[<f890c0dd>] table_load+0x5d/0x120 [dm_mod]
[<f890c8dc>] ctl_ioctl+0xdc/0x140 [dm_mod]
[<c0118541>] lock_kernel+0x41/0x60
[<f890c080>] table_load+0x0/0x120 [dm_mod]
[<c016f6d5>] sys_ioctl+0xd5/0x240
[<c01060d3>] syscall_call+0x7/0xb
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: kernel_fpu_begin+0x21/0x60 / (_mmx_memcpy+0x36/0x180)
.. entry 2: print_traces+0x17/0x43 / (dump_stack+0x23/0x30)
=> dump-end timestamp 38957038
[cut trailing messages]
On Sat, 16 Oct 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
> > >
> > > this is a fixes-only release, and it is still experimental code.
> >
> > You forgot to lowercase RT and U in the EXTRAVERSION.
>
> i changed my mind because lowercase it looks pretty ugly in uname,
> appended to the already lowercase -mm string. Why does Debian need to
> have it in lowercase anyway? It doesnt seem to make much sense.
It's only a minor annoyance right now. I have to edit Makefile before
starting make-kpkg, and before backing out the previous patch. I'll file a
bug on kernel-package at some point, but it probably won't be fixed anytime
soon (Debian is preparing to release soon (HAHA!)).
On Sat, 16 Oct 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > > i have released the -U4 PREEMPT_REALTIME patch:
> > >
> > > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
> > >
> > > this is a fixes-only release, and it is still experimental code.
> >
> > Few stack dumps now.
>
> these are normal, they are the PREEMPT_TIMING traces which get printed
> every time the kernel measures a new latency maximum. The stack dumps
> are done to make it easier to identify which place took that long of a
> delay and why. (if LATENCY_TRACE is enabled too then the last latency
> and its trace can also be found in /proc/latency_trace.)
>
> after bootup it makes sense to reset the maximum:
>
> echo 10 > /proc/sys/kernel/preempt_max_latency
>
> because during bootup there are a number of latencies that are one-time
> only.
So, I did that, and immediately started getting more stack dumps. Are these
things that are interesting, or only informational?
* Adam Heath <[email protected]> wrote:
> > after bootup it makes sense to reset the maximum:
> >
> > echo 10 > /proc/sys/kernel/preempt_max_latency
> >
> > because during bootup there are a number of latencies that are one-time
> > only.
>
> So, I did that, and immediately started getting more stack dumps. Are
> these things that are interesting, or only informational?
they are informational, if you are evaluating latencies. Feel free to
post larger latencies. Right now the threshold for "large" depends - on
a fast box i'd say latencies above 200 usecs are worth reporting - but
any trace can be interesting. Latencies above 1000 usecs are definitely
worth seeing.
stability has the highest priority at the moment, and the other type of
non-latency stackdumps (scheduling while atomic, smp_processor_id()
warnings or outright kernel oopses) should always be reported if
possible.
Ingo
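For anyone sifting through a long dmesg for traces worth posting, the thresholds mentioned above can be applied mechanically. The following is a rough sketch only: parse_latency_us() and latency_class() are hypothetical helpers, the message format is taken from the log lines quoted in this thread, and the 200/1000 usec cut-offs are the "fast box" rule of thumb from the mail above rather than anything enforced by the kernel.

```c
#include <stdio.h>
#include <string.h>

/* Rule-of-thumb buckets from the mail above (assumed for a fast box). */
enum latency_class {
    LAT_INFORMATIONAL,      /* 200 usecs or less */
    LAT_WORTH_REPORTING,    /* above 200 usecs */
    LAT_DEFINITELY_REPORT   /* above 1000 usecs */
};

/* Extract N from a line such as
 *   "(kjournald/19/CPU#0): new 493 us maximum-latency critical section."
 * Returns the latency in usecs, or -1 if the line is not such a message. */
static long parse_latency_us(const char *line)
{
    const char *p = strstr(line, "): new ");
    long us = 0;
    int consumed = 0;

    if (p == NULL)
        return -1;
    /* %n is only stored if the literal text after the number matched too. */
    if (sscanf(p, "): new %ld us maximum-latency%n", &us, &consumed) != 1)
        return -1;
    if (consumed == 0 || us < 0)
        return -1;
    return us;
}

/* Map a latency in usecs onto the buckets above. */
static enum latency_class latency_class(long us)
{
    if (us > 1000)
        return LAT_DEFINITELY_REPORT;
    if (us > 200)
        return LAT_WORTH_REPORTING;
    return LAT_INFORMATIONAL;
}
```

Fed the boot log earlier in this thread, the 493 us and 526 us vgacon_scroll entries would land in the "worth reporting" bucket, while the 1 to 4 us entries would not.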
On Saturday 16 October 2004 17:24, Ingo Molnar wrote:
> * Dominik Karall <[email protected]> wrote:
> > i'm not sure if this bug depends on preemption patch, or if it is a
> > general one in -mm1 tree.
> >
> > kernel BUG at fs/fat/cache.c:150!
> >
> > EIP is at fat_cache_add+0x135/0x151
> >
> > Process mplayer (pid: 9843, threadinfo=ce414000 task=c8128a20)
> >
> > Call Trace:
> > [<c01a11bb>] fat_get_cluster+0x11f/0x1de
> > [<c01a12c1>] fat_bmap_cluster+0x47/0x98
> > [<c01a13f1>] fat_bmap+0xdf/0x101
> > [<c01a36e8>] fat_get_block+0x30/0x198
> > [<c0152866>] alloc_buffer_head+0x33/0x52
> > [<c01501ea>] create_buffers+0x51/0x86
> > [<c015141c>] block_read_full_page+0x1bd/0x2cc
> > [<c01a36b8>] fat_get_block+0x0/0x198
> > [<c0132f0c>] add_to_page_cache+0x58/0x6e
> > [<c013a0e4>] read_pages+0xbe/0x15e
> > [<c013a49c>] do_page_cache_readahead+0x120/0x19b
> > [<c013a606>] page_cache_readahead+0xef/0x1f8
> > [<c01336ab>] do_generic_mapping_read+0x133/0x54b
> > [<c013c911>] lru_cache_add+0xd/0x4f
> > [<c0133d53>] __generic_file_aio_read+0x1a3/0x218
> > [<c0133ac3>] file_read_actor+0x0/0xed
> > [<c0133ec5>] generic_file_read+0x9c/0xba
> > [<c012b84c>] _mutex_lock+0x20/0x36
> > [<c01244c8>] do_sigaction+0x1dc/0x1f7
> > [<c012b468>] autoremove_wake_function+0x0/0x43
> > [<c012b84c>] _mutex_lock+0x20/0x36
> > [<c0167974>] dnotify_parent+0x35/0x95
> > [<c014e0a9>] vfs_read+0xcd/0x126
> > [<c014e36f>] sys_read+0x41/0x6a
> > [<c0103f7b>] syscall_call+0x7/0xb
>
> indeed this does not seem to be related to the preemption patch. How
> hard is it to reproduce this problem? If it's easy then please try with
> vanilla 2.6.9-rc4-mm1 (and if it breaks too, with 2.6.9-rc4 as well), to
> narrow down where the breakage got introduced.
>
> Ingo
sorry, i tried to reproduce this bug, but can't. i don't even know _when_ this
bug occurred, as i just wanted to take a look at the dmesg output after
loading the sg module. but it does not depend on sg, as i unloaded it and
tried loading it again.
best regards,
dominik
On Sat, 2004-10-16 at 16:30, Dominik Karall wrote:
> sorry, i tried to reproduce this bug, but can't. i even don't know _when_ this
> bug occurred, as i just wanted to take a look in the dmesg output after
> loading sg module. but it does not depend on sg, as i unloaded it and tried
> again to load.
>
The trace looks like mplayer reading from a FAT filesystem. Can you
reproduce the problem if you do whatever you were doing with mplayer
again?
Lee
Ingo,
In reading your -U3 patch the test below (#156)
wasn't clear to me. It would seem in the case of
softirq_preemption, __do_softirq() should be called
to kick ksoftirqd, otherwise ___do_softirq() would
be called to exec softirqs in the immediate context.
kernel/softirq.c:
153 asmlinkage void _do_softirq(void)
154 {
155 local_irq_disable();
156 if (!softirq_preemption)
157 __do_softirq();
158 else
159 ___do_softirq();
160 local_irq_enable();
161 }
-john
--
[email protected]
* john cooper <[email protected]> wrote:
> Ingo,
> In reading your -U3 patch the test below (#156)
> wasn't clear to me. It would seem in the case of
> softirq_preemption, __do_softirq() should be called
> to kick ksoftirqd, otherwise ___do_softirq() would
> be called to exec softirqs in the immediate context.
the dependencies here are a bit complex due to the
various compile-time and runtime flags, and various
architecture call-ins to softirq.c.
> kernel/softirq.c:
>
> 153 asmlinkage void _do_softirq(void)
> 154 {
> 155 local_irq_disable();
> 156 if (!softirq_preemption)
> 157 __do_softirq();
> 158 else
> 159 ___do_softirq();
> 160 local_irq_enable();
> 161 }
___do_softirq() is the 'lowest level' softirq function, it
directly executes the handlers.
__do_softirq() disables bhs and calls ___do_softirq() - this
is the 'direct' softirq execution model, this function is
called by hardirq contexts and by softirqd. [btw., irqd calls
this function too which is a bit pointless.] In the indirect
execution model (SOFTIRQ_PREEMPT) this function does no softirq
execution, it only wakes up softirqd.
_do_softirq() is what is called by softirqd - dependent on the
execution model this function will either execute ___do_softirq()
[no additional locking or bh disabling] in the threaded case,
while in the direct case it will execute __do_softirq().
so the logic seems to be correct to me. (except for the minor
detail of irqd calling __do_softirq() which doesnt make much
sense but which is harmless otherwise.)
with DEBUG_PREEMPT it is relatively safe to call ___do_softirq()
from softirqd (without doing the extra bh disabling), because
the two main rules of softirqs are still preserved:
1) softirq execution doesnt reenter itself
2) per-CPU assumptions safely detected
Ingo
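The three-level layering described above can be modelled in plain userspace C. This is a toy sketch under stated assumptions, not the code from kernel/softirq.c: the function and flag names follow the mail, but the counters (pending, handlers_run, softirqd_wakeups, bh_disable_depth, ran_with_bh_disabled) are invented for illustration, and interrupt disabling is left out entirely.

```c
/* Toy model of the softirq call layering; all counters are hypothetical. */
static int softirq_preemption;    /* runtime flag: 1 = threaded (indirect) model */
static int pending;               /* number of softirqs waiting to run */
static int handlers_run;          /* handlers executed so far */
static int softirqd_wakeups;      /* times softirqd was woken instead of running */
static int bh_disable_depth;      /* stands in for bh-disable nesting */
static int ran_with_bh_disabled;  /* handlers that ran with bhs disabled */

/* Lowest level: directly executes the handlers. */
static void ___do_softirq(void)
{
    while (pending > 0) {
        pending--;
        handlers_run++;
        if (bh_disable_depth > 0)
            ran_with_bh_disabled++;
    }
}

/* Direct model: disable bhs, then run the handlers.
 * Indirect model: execute nothing, only wake up softirqd. */
static void __do_softirq(void)
{
    if (softirq_preemption) {
        softirqd_wakeups++;
        return;
    }
    bh_disable_depth++;
    ___do_softirq();
    bh_disable_depth--;
}

/* What softirqd itself calls: mirrors the test at line 156 quoted above. */
static void _do_softirq(void)
{
    if (!softirq_preemption)
        __do_softirq();     /* direct case: go through the bh-disabling path */
    else
        ___do_softirq();    /* threaded case: no extra bh disabling */
}
```

In this model, calling __do_softirq() with softirq_preemption set only bumps softirqd_wakeups, and the handlers run later when the thread calls _do_softirq(), which is the reason the branch at line 156 looks inverted at first glance.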
Hi,
I have compile error when I use the make O= option: usr/initramfs_list
doesn't exist. This doesn't occur in pure 2.6.8.1 or 2.6.9-rc4 but does
occur in 2.6.9-rc4-mm1.
Esben
Here is a fix (the build seems not to be broken with or without O=)
--- linux-2.6.9-rc4-mm1-RT-U4/usr/Makefile.orig 2004-10-16 19:39:46.000000000 +0200
+++ linux-2.6.9-rc4-mm1-RT-U4/usr/Makefile 2004-10-16 23:04:13.661382082 +0200
@@ -35,7 +35,10 @@
echo 'scripts/gen_initramfs_list.sh $(CONFIG_INITRAMFS_SOURCE) > $@'; \
else \
echo 'echo Using shipped $@'; \
- fi)
+ if [ $(KBUILD_SRC)!="" ]; then \
+ cp -f $(KBUILD_SRC)/usr/initramfs_list ./usr/initramfs_list; \
+ fi; \
+ fi)
$(INITRAMFS_LIST): FORCE
On Sat, 16 Oct 2004, Ingo Molnar wrote:
>
> * john cooper <[email protected]> wrote:
>
> > Ingo,
> > In reading your -U3 patch the test below (#156)
> > wasn't clear to me. It would seem in the case of
> > softirq_preemption, __do_softirq() should be called
> > to kick ksoftirqd, otherwise ___do_softirq() would
> > be called to exec softirqs in the immediate context.
>
> the dependencies here are a bit complex due to the
> various compile-time and runtime flags, and various
> architecture call-ins to softirq.c.
>
> > kernel/softirq.c:
> >
> > 153 asmlinkage void _do_softirq(void)
> > 154 {
> > 155 local_irq_disable();
> > 156 if (!softirq_preemption)
> > 157 __do_softirq();
> > 158 else
> > 159 ___do_softirq();
> > 160 local_irq_enable();
> > 161 }
>
> ___do_softirq() is the 'lowest level' softirq function, it
> directly executes the handlers.
>
> __do_softirq() disables bhs and calls ___do_softirq() - this
> is the 'direct' softirq execution model, this function is
> called by hardirq contexts and by softirqd. [btw., irqd calls
> this function too which is a bit pointless.] In the indirect
> execution model (SOFTIRQ_PREEMPT) this function does no softirq
> execution, it only wakes up softirqd.
>
> _do_softirq() is what is called by softirqd - dependent on the
> execution model this function will either execute ___do_softirq()
> [no additional locking or bh disabling] in the threaded case,
> while in the direct case it will execute __do_softirq().
>
> so the logic seems to be correct to me. (except for the minor
> detail of irqd calling __do_softirq() which doesnt make much
> sense but which is harmless otherwise.)
>
> with DEBUG_PREEMPT it is relatively safe to call ___do_softirq()
> from softirqd (without doing the extra bh disabling), because
> the two main rules of softirqs are still preserved:
>
> 1) softirq execution doesnt reenter itself
>
> 2) per-CPU assumptions safely detected
>
> Ingo
On Saturday 16 October 2004 22:31, Lee Revell wrote:
> On Sat, 2004-10-16 at 16:30, Dominik Karall wrote:
> > sorry, i tried to reproduce this bug, but can't. i even don't know _when_
> > this bug occurred, as i just wanted to take a look in the dmesg output
> > after loading sg module. but it does not depend on sg, as i unloaded it
> > and tried again to load.
>
> The trace looks like mplayer reading from a FAT filesystem. Can you
> reproduce the problem if you do whatever you were doing with mplayer
> again?
>
> Lee
i could reproduce it now, but only once. it appeared when i started an avi
movie from my fat32 partition. mplayer stopped at buffering 2% and did not
play the movie. i tried to start mplayer again to reproduce it, but the bug
did not appear again; mplayer only stopped at 2% buffering and did nothing
more. it seems like the file can't be read cleanly from the fat32
partition now, as it does not work with xine and others either.
here is the bug i get now:
------------[ cut here ]------------
kernel BUG at fs/fat/cache.c:150!
invalid operand: 0000 [#2]
PREEMPT
Modules linked in: sg sr_mod ipx p8022 psnap llc nvidia snd_mixer_oss sd_mod
tda9887 tuner saa7134 video_buf v4l2_common v4l1_compat ir_common videodev
8139too sis900 crc32 ehci_hcd usb_storage scsi_mod ohci_hcd usbcore
snd_intel8x0 snd_ac97_codec snd_pcm snd_timer snd soundcore snd_page_alloc
ohci1394 ieee1394 i2c_sis96x i2c_core
CPU: 0
EIP: 0060:[<c01a0b53>] Tainted: P VLI
EFLAGS: 00210202 (2.6.9-rc4-mm1-vp-u3)
EIP is at fat_cache_add+0x135/0x151
eax: 00000001 ebx: cf57136c ecx: cf57127c edx: cf571301
esi: c1e53c50 edi: cd0d63c8 ebp: cd0d6418 esp: c1e53c1c
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process mplayer (pid: 29320, threadinfo=c1e53000 task=c6abca20)
Stack: cd0d6418 c1e53c88 cf785800 00000001 cd0d63c8 c01a11bb c1e53c88 c1e53c8c
00000000 00000000 00000000 0003ffff cd0d6418 00000000 00000000 00000001
00093710 cd0d6418 cd0d63c8 cf785800 cee1d800 c01a12c1 c1e53c8c 00000000
Call Trace:
[<c01a11bb>] fat_get_cluster+0x11f/0x1de
[<c01a12c1>] fat_bmap_cluster+0x47/0x98
[<c01a13f1>] fat_bmap+0xdf/0x101
[<c01a36e8>] fat_get_block+0x30/0x198
[<c0152866>] alloc_buffer_head+0x33/0x52
[<c01501ea>] create_buffers+0x51/0x86
[<c015141c>] block_read_full_page+0x1bd/0x2cc
[<c01a36b8>] fat_get_block+0x0/0x198
[<c0132f0c>] add_to_page_cache+0x58/0x6e
[<c013a0e4>] read_pages+0xbe/0x15e
[<c013a49c>] do_page_cache_readahead+0x120/0x19b
[<c013a606>] page_cache_readahead+0xef/0x1f8
[<c01336ab>] do_generic_mapping_read+0x133/0x54b
[<c013c911>] lru_cache_add+0xd/0x4f
[<c0133d53>] __generic_file_aio_read+0x1a3/0x218
[<c0133ac3>] file_read_actor+0x0/0xed
[<c0133ec5>] generic_file_read+0x9c/0xba
[<c012b84c>] _mutex_lock+0x20/0x36
[<c01244c8>] do_sigaction+0x1dc/0x1f7
[<c012b468>] autoremove_wake_function+0x0/0x43
[<c012b84c>] _mutex_lock+0x20/0x36
[<c0167974>] dnotify_parent+0x35/0x95
[<c014e0a9>] vfs_read+0xcd/0x126
[<c014e36f>] sys_read+0x41/0x6a
[<c0103f7b>] syscall_call+0x7/0xb
Code: f2 89 e8 e8 9f fe ff ff 85 c0 89 c3 74 2a 83 6f 20 01 8b 04 24 39 00 75
24 8b 14 24 a1 74 0b 3c c0 e8 9c b1 f9 ff e9 40 ff ff ff <0f> 0b 96 00 e5 b7
2d c0 e9 2f ff ff ff 8b 1c 24 eb 88 0f 0b 48
best regards,
dominik
On Sat, Oct 16, 2004 at 11:15:33PM +0200, Esben Nielsen wrote:
> Hi,
> I have compile error when I use the make O= option: usr/initramfs_list
> doesn't exist. This doesn't occur in pure 2.6.8.1 or 2.6.9-rc4 but does
> occur in 2.6.9-rc4-mm1.
>
> Esben
>
> Here is a fix (the build seems not to be broken with or without O=)
>
> --- linux-2.6.9-rc4-mm1-RT-U4/usr/Makefile.orig 2004-10-16 19:39:46.000000000 +0200
> +++ linux-2.6.9-rc4-mm1-RT-U4/usr/Makefile 2004-10-16 23:04:13.661382082 +0200
> @@ -35,7 +35,10 @@
> echo 'scripts/gen_initramfs_list.sh $(CONFIG_INITRAMFS_SOURCE) > $@'; \
> else \
> echo 'echo Using shipped $@'; \
> - fi)
> + if [ $(KBUILD_SRC)!="" ]; then \
> + cp -f $(KBUILD_SRC)/usr/initramfs_list ./usr/initramfs_list; \
> + fi; \
> + fi)
The above error is from -mm and not part of Ingo's patch.
The better fix is to prefix $(CONFIG_INITRAMFS_SOURCE) with $(srctree)/
Sam
* Dominik Karall <[email protected]> wrote:
> > The trace looks like mplayer reading from a FAT filesystem. Can you
> > reproduce the problem if you do whatever you were doing with mplayer
> > again?
>
> i could reproduce it now, but only once. it appeared when i started an
> avi movie from my fat32 partition. mplayer stopped at buffering 2% and
> does not play the movie. i tried to start mplayer again and reproduce
> it, but the bug does not appear again. mplayer only stopped at 2%
> buffering and does nothing more. it seems like the file couldn't be
> read clearly now from the fat32 partition, as it does not work with
> xine and others too. here is the bug i get now:
did you retry after rebooting the box? Such bugs can easily depend on IO
patterns (and code sequences) that only happen the first time the file
is accessed in such a way (inode init, delays due to IO, etc.). So what
would be nice to try (if you havent tried it already) is to:
- reboot the box into this same kernel and retry - do you get the oops?
- if you can reproduce the oops this way in a more or less reliable way
then please try the same with 2.6.9-rc4-mm1 too.
it _looks_ like a bug not related to the RT patch but a connection
cannot be ruled out: the mutex based kernel changes locking for fatfs
too, and could trigger hidden or hard-to-trigger bugs in an easier way.
In the locking sense the RT kernel can be considered equivalent to an
SMP box with an infinite number of CPUs, even on a uniprocessor. It
tests SMP locking in a way nothing else does.
Ingo
Hi,
Ingo Molnar wrote:
>
> it seems that SMP + PREEMPT_TIMING is not stable though, somehow the
> latency printk's cause a crash sooner or later. I'm still debugging this
> problem. Without PREEMPT_TIMING the SMP kernel is stable.
>
Here I go again, while my SMP/HT box is still dying on the beach of U4 :)
Finally this time, I've got the serial console logging set up and
running, and now I'm ready to capture and dump all that I can regarding
this SMP+PREEMPT_REALTIME issue I'm experiencing.
The main symptom, if you remember, is that the boot/init sequence
stalls sooner or later, at a non-deterministic point. And it's an actual
issue of PREEMPT_REALTIME being set. Without that single config option set,
the kernel boots and runs just fine (which is actually the one running while
I'm writing this very post).
Attached you may find some of those boot/init sessions, which were simply
captured from the serial console output (ttyS0,115200) via
minicom at the other end of a null modem cable.
The point where the boot/init sequence stalls is marked with a
"<<<---STALL--->>>" line mark. Then it follows the output of SysRq-S
(Sync), SysRq-T (Trace), another SysRq-S (Sync), and finally a SysRq-B
(reBoot).
To me, it looks like there's plenty of dirt to dig :) Hope it helps.
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> The point where the boot/init sequence stalls is marked with a
> "<<<---STALL--->>>" line mark. Then it follows the output of SysRq-S
> (Sync), SysRq-T (Trace), another SysRq-S (Sync), and finally a SysRq-B
> (reBoot).
thanks. Could you send me the full successful bootlog of -U4 with
PREEMPT_REALTIME disabled (but otherwise the same .config)?
this looks suspicious:
eth0: -- ERROR --
Class: internal Software error
Nr: 0x1ae
Msg: General: Driver release date not initialized
eth0: -- ERROR --
Class: internal Software error
Nr: 0x1ae
Msg: General: Driver release date not initialized
eth0: 3Com Gigabit LOM (3C940)
eth0: network connection down
PrefPort:A RlmtMode:Check Link State
is this normal? Could the stall simply be a bootup stall due to no
network available?
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> > eth0: 3Com Gigabit LOM (3C940)
> > eth0: network connection down
> > PrefPort:A RlmtMode:Check Link State
> >
> > is this normal? Could the stall simply be a bootup stall due to no
> > network available?
> >
>
> Yes, I think it's normal. The fact is that on the non-RT kernel, the eth0
> device comes up immediately after, as you can see on minicom.cap.{6,7,8}
> capture files.
ok, then please try to do a sysrq-T. The bootup is soft-hung for some
reason, lets see what tasks are around.
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> OK. Here goes some more food-for-thought :)
>
> - minitor.cap-2.6.9-rc4-mm1-RT-U4.1smp.tar.gz has another three
> capture sessions, minicom.cap.{3,4,5}, which stalls on boot/init
> (CONFIG_PREEMPT_REALTIME=y). Take a special look on minicom.cap.5,
> where the session has been force-truncated, due to an never ending
> trace dump, and where no SysRq could do the rescue. This is one of
> another symptoms, but with less occurrences I've noticed.
i think you are getting stack overflows. Could you disable
CONFIG_4KSTACKS and see whether that helps?
Ingo
On Sat, 16 Oct 2004 17:33:44 +0200
Ingo Molnar <[email protected]> wrote:
>
> i have released the -U4 PREEMPT_REALTIME patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
>
> this is a fixes-only release, and it is still experimental code.
Hi,
the cpu mhz issue seems to be fixed:
~$ uname -a
Linux mango.fruits.de 2.6.9-rc4-mm1-RT-U4-RT #1 Sun Oct 17 17:48:48 CEST 2004 i686 GNU/Linux
~$ cat /proc/cpuinfo |grep MHz
cpu MHz : 1195.145
mango:/usr/src/linux-2.6.9-rc4-mm1-VP-U4# grep REALTIME .config
CONFIG_PREEMPT_REALTIME=y
Might of course be coincidence. Will report as soon as i see the 0.001 MHz
value pop up again.
I saw one of these in /var/log/syslog:
Oct 17 18:53:52 mango kernel: PCI: Found IRQ 3 for device 0000:00:0f.0
Oct 17 18:53:52 mango kernel: Debug: sleeping function called from invalid context modprobe(109) at kernel/mutex.c:25
Oct 17 18:53:52 mango kernel: in_atomic():0 [00000000], irqs_disabled():1
Oct 17 18:53:52 mango kernel: [print_context_stack+78/112] dump_stack+0x1e/0x20
Oct 17 18:53:52 mango kernel: [sinbin_release_fn+536/864] __might_sleep+0xb8/0xd0
Oct 17 18:53:52 mango kernel: [__create_workqueue+35/320] _mutex_lock+0x23/0x60
Oct 17 18:53:52 mango kernel: [__create_workqueue+145/320] _mutex_lock_irqsave+0x11/0x20
Oct 17 18:53:52 mango kernel: [pg0+809943061/1069765632] get_time_pit+0x15/0x50 [gameport]
Oct 17 18:53:52 mango kernel: [pg0+809943199/1069765632] gameport_measure_speed+0x4f/0x100 [gameport]
Oct 17 18:53:52 mango kernel: [pg0+809943550/1069765632] gameport_register_port+0x2e/0x40 [gameport]
Oct 17 18:53:52 mango kernel: [pg0+809931156/1069765632] snd_card_cs46xx_probe+0x194/0x250 [snd_cs46xx]
Oct 17 18:53:52 mango kernel: [crypto_unregister_alg+43/192] pci_device_probe_static+0x4b/0x60
Oct 17 18:53:52 mango kernel: [crypto_unregister_alg+104/192] __pci_device_probe+0x28/0x30
Oct 17 18:53:52 mango kernel: [crypto_unregister_alg+156/192] pci_device_probe+0x2c/0x50
Oct 17 18:53:52 mango kernel: [take_over_console+605/1248] bus_match+0x3d/0x70
Oct 17 18:53:52 mango kernel: [take_over_console+906/1248] driver_attach+0x5a/0x90
Oct 17 18:53:52 mango kernel: [do_blank_screen+608/688] bus_add_driver+0xa0/0xd0
Oct 17 18:53:52 mango kernel: [con_set_cmap+1/64] driver_register+0x31/0x40
Oct 17 18:53:52 mango kernel: [scatterwalk_pagedone+18/96] pci_register_driver+0x42/0x50
Oct 17 18:53:52 mango kernel: [pg0+809931362/1069765632] alsa_card_cs46xx_init+0x12/0x30 [snd_cs46xx]
Oct 17 18:53:52 mango kernel: [__kfifo_put+65/208] sys_init_module+0x1f1/0x2d0
Oct 17 18:53:52 mango kernel: [need_resched+36/37] syscall_call+0x7/0xb
Oct 17 18:53:52 mango kernel: preempt count: 00000001
Oct 17 18:53:52 mango kernel: . 1-level deep critical section nesting:
Oct 17 18:53:52 mango kernel: .. entry 1: print_traces+0x12/0x40 / (dump_stack+0x1e/0x20)
Oct 17 18:53:52 mango kernel:
This seems to be related to modprobe'ing snd-cs46xx
flo
p.s.: attached is the .config of this kernel.
* Florian Schmidt <[email protected]> wrote:
> the cpu mhz issue seems to be fixed:
>
> ~$ uname -a
> Linux mango.fruits.de 2.6.9-rc4-mm1-RT-U4-RT #1 Sun Oct 17 17:48:48 CEST 2004 i686 GNU/Linux
> ~$ cat /proc/cpuinfo |grep MHz
> cpu MHz : 1195.145
> mango:/usr/src/linux-2.6.9-rc4-mm1-VP-U4# grep REALTIME .config
> CONFIG_PREEMPT_REALTIME=y
>
> Might of course be coincidence. Will report as soon as i see the 0.001 Mhz
> pop up again.
ok.
> I saw one of these in /var/log/syslog:
>
> Oct 17 18:53:52 mango kernel: PCI: Found IRQ 3 for device 0000:00:0f.0
> Oct 17 18:53:52 mango kernel: Debug: sleeping function called from invalid context modprobe(109) at kernel/mutex.c:25
> Oct 17 18:53:52 mango kernel: in_atomic():0 [00000000], irqs_disabled():1
> Oct 17 18:53:52 mango kernel: [print_context_stack+78/112] dump_stack+0x1e/0x20
> Oct 17 18:53:52 mango kernel: [sinbin_release_fn+536/864] __might_sleep+0xb8/0xd0
> Oct 17 18:53:52 mango kernel: [__create_workqueue+35/320] _mutex_lock+0x23/0x60
> Oct 17 18:53:52 mango kernel: [__create_workqueue+145/320] _mutex_lock_irqsave+0x11/0x20
> Oct 17 18:53:52 mango kernel: [pg0+809943061/1069765632] get_time_pit+0x15/0x50 [gameport]
ok, does the patch below fix those messages? (gameport.c used its own,
private, incompatible prototype for i8253_lock which breaks raw spinlock
handling.)
Ingo
--- linux/drivers/input/gameport/gameport.c.orig
+++ linux/drivers/input/gameport/gameport.c
@@ -37,12 +37,13 @@ static LIST_HEAD(gameport_dev_list);
#ifdef __i386__
+#include <asm/i8253.h>
+
#define DELTA(x,y) ((y)-(x)+((y)<(x)?1193182/HZ:0))
#define GET_TIME(x) do { x = get_time_pit(); } while (0)
static unsigned int get_time_pit(void)
{
- extern spinlock_t i8253_lock;
unsigned long flags;
unsigned int count;
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> > eth0: 3Com Gigabit LOM (3C940)
>> > eth0: network connection down
>> > PrefPort:A RlmtMode:Check Link State
>> >
>> > is this normal? Could the stall simply be a bootup stall due to no
>> > network available?
>> >
>>
>> Yes, I think it's normal. The fact is that on the non-RT kernel, the
>> eth0 device comes up immediately after, as you can see on
>> minicom.cap.{6,7,8} capture files.
>
> ok, then please try to do a sysrq-T. The bootup is soft-hung for some
> reason, lets see what tasks are around.
>
Hey, all the capture files I've sent, minicom.cap{0,1,2,3,4,5}, include
the SysRq-T output, taken right after the hang. Am I missing something?
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> > ok, then please try to do a sysrq-T. The bootup is soft-hung for some
> > reason, lets see what tasks are around.
> >
>
> Hey, all the captured files I've sent, minicom.cap{0,1,2,3,4,5}, includes
> the SysRq-T output, taken right after the hang. Am I missing something?
ah, ok, my fault. I started with minicom.cap.0 and stopped here:
SysRq : Emergency Sync
because this itself caused some regression too. It seems init is blocked
on the dcache_lock mutex, but lets first see whether 8K stacks fix your
box - stack overflows can cause nasty, mostly random bugs that look like
real bugs.
Ingo
On Sun, 17 Oct 2004 18:55:09 +0200
Ingo Molnar <[email protected]> wrote:
> ok, does the patch below fix those messages? (gameport.c used its own,
> private, incompatible prototype for i8253_lock which breaks raw spinlock
> handling.)
>
it seems to fix it. i don't see messages like the ones reported anymore.
snd-cs46xx might have some other issues though: Upon rmmod snd-cs46xx i see:
Oct 17 19:43:04 mango kernel: Sound Fusion CS46xx 0000:00:0f.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
but i should probably report that to alsa-devel instead, right?
flo
* Florian Schmidt <[email protected]> wrote:
> > ok, does the patch below fix those messages? (gameport.c used its own,
> > private, incompatible prototype for i8253_lock which breaks raw spinlock
> > handling.)
> >
>
> it seems to fix it. i don't see any more messages like the reported
> anymore.
good!
> snd-cs46xx might have some other issues though: Upon rmmod snd-cs46xx
> i see:
>
> Oct 17 19:43:04 mango kernel: Sound Fusion CS46xx 0000:00:0f.0: Device
> was removed without properly calling pci_disable_device(). This may
> need fixing.
>
> but i should probably report that to alsa-devel instead right?
yeah, i think so.
Ingo
On Sunday 17 October 2004 17:32, OGAWA Hirofumi wrote:
> Dominik Karall <[email protected]> writes:
> > i could reproduce it now, but only once. it appeared when i started an
> > avi movie from my fat32 partition. mplayer stopped at buffering 2% and
> > does not play the movie. i tried to start mplayer again and reproduce it,
> > but the bug does not appear again. mplayer only stopped at 2% buffering
> > and does nothing more. it seems like the file couldn't be read clearly
> > now from the fat32 partition, as it does not work with xine and others
> > too.
> > here is the bug i get now:
> >
> > ------------[ cut here ]------------
> > kernel BUG at fs/fat/cache.c:150!
>
> Probably this BUG_ON() was wrong. Does this bug occur only by the
> specific file?
>
> If so, please do "filefrag -v filename" against that file.
>
> Then, can you try the attached patch? This patch removes the BUG_ON(),
> and instead adds printk() for debugging. When the bug occurs, it prints
> the current cache.
>
> Thanks.
yes, the bug only occurs on a specific file.
as the bug is present in -mm1 (without vp) too, i applied your patch to that
one. here is the output:
fat_cache_check: id 0, contig 6415, fclus 38231, dclus 1010103
contig 6416, fclus 38231, dclus 1010103
contig 0, fclus 32, dclus 603964
contig 1, fclus 30, dclus 603960
contig 7, fclus 22, dclus 603950
contig 4, fclus 17, dclus 603943
contig 1, fclus 15, dclus 603940
contig 6, fclus 8, dclus 603931
contig 0, fclus 7, dclus 603929
and the movie starts to play in mplayer without problems. tell me if you need
more debugging!
best regards,
dominik
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> OK. Here goes some more food-for-thought :)
>> - minicom.cap-2.6.9-rc4-mm1-RT-U4.1smp.tar.gz has another three capture
>> sessions, minicom.cap.{3,4,5}, which stall on boot/init
>> (CONFIG_PREEMPT_REALTIME=y). Take a special look at minicom.cap.5,
>> where the session has been force-truncated, due to a never-ending
>> trace dump, and where no SysRq could come to the rescue. This is
>> another symptom I've noticed, though it occurs less often.
>
> i think you are getting stack overflows. Could you disable
> CONFIG_4KSTACKS and see whether that helps?
>
OK. Here goes another bunch, now with CONFIG_4KSTACKS not set. It
doesn't seem to solve anything, except perhaps adding more entropy.
BTW, weren't stack overflows supposed to be pinpointed when one has
CONFIG_DEBUG_STACKOVERFLOW=y ???
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> BTW, weren't stack overflows supposed to be pinpointed when one has
> CONFIG_DEBUG_STACKOVERFLOW=y ???
there were some signs of it:
minicom.cap.5:do_IRQ: stack overflow: 504
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela <[email protected]> wrote:
>
>> BTW, weren't stack overflows supposed to be pinpointed when one has
>> CONFIG_DEBUG_STACKOVERFLOW=y ???
>
> there were some signs of it:
>
> minicom.cap.5:do_IRQ: stack overflow: 504
>
Yeah. And that was the one that went off into a never-ending trace...
--
rncbc aka Rui Nuno Capela
[email protected]
i have released the -U5 Real-Time Preemption patch:
http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
this is a release intended to increase stability, but since it also
includes new debug features and related cleanups it might introduce new
regressions. Be careful.
there are two big changes:
- debug feature: automatic semaphore/rwsem deadlock detection, based on
the code from Igor Manyilov and Bill Huey.
this is a very nice feature that should help in debugging the remaining
deadlocks. The deadlock detection feature has already helped me fix a
bug that was causing hangs in the VFS, so it's really useful.
- generic semaphore implementation
the generic semaphore implementation (which uses rwsems) makes it
possible to use the deadlock detection mechanism for all the mutex types
we currently have: semaphores, rw-semaphores, spinlock-mutexes and
rwlock-mutexes. Another benefit is that PREEMPT_REALTIME becomes much
more portable this way. (although it's still x86-only at the moment.)
other changes since -U4:
- crash fix: fixed a possible "unbound recursion upon IRQ entry" bug.
introduced preempt_schedule_irq() which now schedules without
enabling interrupts again, preventing new IRQs from hitting this
task again and triggering preemption. This might fix the
'infinite stackdumps' problem Rui Nuno Capela was seeing.
- deadlock fix: is_subdir()'s PREEMPT_REALTIME locking was buggy. This
could perhaps fix the other problem reported by Rui Nuno Capela.
- i8253_lock fixes: apm, hd.c, gameport.c and analog.c were all
improperly importing the variable while overriding the prototype.
This fixes the bug reported by Florian Schmidt.
- possible crash fix: one particular lock in selinux has to be
mutex-based, because while held it calls other mutex-using code.
- two more selinux locks converted to raw spinlocks, because they were
called from within raw-critical sections.
- debug feature: enforce interrupts-enabled upon schedule().
(Note that this does not break sleep_on() because sleep_on() does not
disable interrupts in the PREEMPT_REALTIME mode. It might break with
!PREEMPT_REALTIME though.)
- locking cleanup: converted the IPC code from raw spinlocks & RCU to
spinlock-mutexes.
- code cleanup: cleaned up the generic rwsem code.
- debug feature: implemented /proc/sys/kernel/trace_verbose runtime
flag (default:0), which enables a much more verbose printout in
/proc/latency_trace. This trace format can be useful in e.g.
debugging timestamp weirdnesses.
- irqs-off fix: there was one codepath where irqd would call schedule()
with interrupts disabled.
- debug feature: the NMI entries in the latency trace now also include
the last-observed-EFLAGS value. Can be useful in figuring out what a
certain CPU is doing and why.
- cleanup: fixed preemption-off ordering: often the spinlock (and
scheduler) code would re-enable preemption and interrupts in the
wrong order, opening up a small window for an interrupt handler to
fit in and increase the latency of that almost-finished critical
section.
- cleanup: consolidated various bug-printouts. It should now be easy to
find whether anything bad happens even amongst lots of preempt-timing
printouts: 'dmesg | grep BUG'.
to create a -U5 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
Ingo
>i have released the -U4 PREEMPT_REALTIME patch:
>
>http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U4
I still cannot get the network up and running with this. The BUG messages
are gone but the system stops processing init scripts somewhere in the
network related scripts. One time, it stopped in portmapper, the next it
got past that and stopped in the imapd script.
What I see on the screen is the normal boot, a few latency trace outputs
indicating increasing latency (up to about 2.3 msec) and then I make it to
single user mode. You can see those traces in /var/log/messages; I will
send that separately.
By doing the scripts in
/etc/rc3.d/
by hand or by using
telinit 3
I get normal displays until the system stops running the scripts. It just
stops at that point. No response to Ctrl-C (if done by hand). Alt-SysRq
keyboard commands do work (displayed on the screen) but the output does
not make it to /var/log/messages. The output from the second run is
particularly disappointing, it appears to be truncated.
I will rebuild with -U5 since I noticed it is available, but if you
have some suggestions on a way to capture more helpful data, I would
be glad to do it.
--Mark
* [email protected] <[email protected]> wrote:
> I will rebuild with -U5 since I noticed it is available, but if you
> have some suggestions on a way to capture more helpful data, I would
> be glad to do it.
-U5 has CONFIG_RWSEM_DEADLOCK_DETECT, which could help with your network
hangs.
Ingo
On 9:50:08 am 10/18/04 Ingo Molnar <[email protected]> wrote:
>
> i have released the -U5 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
I'm getting this on a make bzImage:
CC fs/xfs/xfs_inode_item.o
fs/xfs/xfs_inode_item.c: In function `xfs_inode_item_pushbuf':
fs/xfs/xfs_inode_item.c:803: error: structure has no member named `count'
fs/xfs/xfs_inode_item.c:825: error: structure has no member named `count'
make[2]: *** [fs/xfs/xfs_inode_item.o] Error 1
I'm currently running 2.6.9-rc4-mm1-U1 without error. I backed out the U1
patch, applied the new one, did a make oldconfig, make clean, make bzImage,
make modules. My config is attached.
\__ Jason Munro
\__ [email protected]
\__ http://hastymail.sourceforge.net/
>i have released the -U5 Real-Time Preemption patch:
>
>http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
I am getting build problems - specifically with:
CC [M] drivers/char/ipmi/ipmi_watchdog.o
CC [M] fs/jfs/jfs_dmap.o
drivers/char/ipmi/ipmi_watchdog.c:389: warning: type defaults to `int' in declaration of `DECLARE_MUTEX_LOCKED'
drivers/char/ipmi/ipmi_watchdog.c:389: warning: parameter names (without types) in function declaration
drivers/char/ipmi/ipmi_watchdog.c: In function `heartbeat_free_smi':
drivers/char/ipmi/ipmi_watchdog.c:393: error: `heartbeat_wait_lock' undeclared (first use in this function)
drivers/char/ipmi/ipmi_watchdog.c:393: error: (Each undeclared identifier is reported only once
drivers/char/ipmi/ipmi_watchdog.c:393: error: for each function it appears in.)
drivers/char/ipmi/ipmi_watchdog.c: In function `heartbeat_free_recv':
drivers/char/ipmi/ipmi_watchdog.c:398: error: `heartbeat_wait_lock' undeclared (first use in this function)
drivers/char/ipmi/ipmi_watchdog.c: In function `ipmi_heartbeat':
drivers/char/ipmi/ipmi_watchdog.c:476: error: `heartbeat_wait_lock' undeclared (first use in this function)
drivers/char/ipmi/ipmi_watchdog.c: At top level:
drivers/char/ipmi/ipmi_watchdog.c:389: warning: `DECLARE_MUTEX_LOCKED' declared `static' but never defined
If I read the patch correctly, this should be recoded as
DECLARE_MUTEX
instead, but a quick grep of the source code indicates we have about
20 more places where DECLARE_MUTEX_LOCKED is still used. Should I do
a global replace on that or is something else needed?
I also had a compile failure in XFS. The messages are:
CC [M] fs/xfs/quota/xfs_dquot_item.o
CC [M] fs/xfs/quota/xfs_trans_dquot.o
fs/xfs/quota/xfs_dquot_item.c: In function `xfs_qm_dquot_logitem_pushbuf':
fs/xfs/quota/xfs_dquot_item.c:266: error: structure has no member named `count'
fs/xfs/quota/xfs_dquot_item.c:279: error: structure has no member named `count'
This refers to a macro defined at
fs/xfs/linux-2.6/sema.h:51:#define valusema(sp) (atomic_read(&(sp)->count))
Not quite sure if this is an error due to type changes or yet another
name collision.
Please advise how to proceed.
--Mark
* [email protected] <[email protected]> wrote:
> >i have released the -U5 Real-Time Preemption patch:
> >
> >http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>
> I am getting build problems - specifically with:
> CC [M] drivers/char/ipmi/ipmi_watchdog.o
> If I read the patch correctly, this should be recoded as
> DECLARE_MUTEX
> instead, but a quick grep of the source code indicates we have about
> 20 more places where DECLARE_MUTEX_LOCKED is still used. Should I do
> a global replace on that or is something else needed?
it's not normally used, and it's much simpler to rewrite those places
than to implement initialization. (which would be quite hairy)
> I also had a compile failure in XFS. The messages are:
> CC [M] fs/xfs/quota/xfs_dquot_item.o
> CC [M] fs/xfs/quota/xfs_trans_dquot.o
ok, i've re-uploaded a new version of -U5 that has this and the
ipmi_watchdog compilation problems fixed.
please check whether it works, XFS does not seem to make use of count>1
semaphores but one never knows ...
Ingo
On Mon, 18 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U5 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>
> this is a release intended to increase stability, but since it also
> includes new debug features and related cleanups it might introduce new
> regressions. Be careful.
>
> [snip]
>
> - debug feature: implemented /proc/sys/kernel/trace_verbose runtime
> flag (default:0), which enables a much more verbose printout in
> /proc/latency_trace. This trace format can be useful in e.g.
> debugging timestamp weirdnesses.
>
With all these proc values, what do you recommend they should be set to?
Ingo Molnar wrote:
> * [email protected] <[email protected]> wrote:
>
>
>>>i have released the -U5 Real-Time Preemption patch:
>>>
>>>
>>
>>http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>>
>>I am getting build problems - specifically with:
>
>
>> CC [M] drivers/char/ipmi/ipmi_watchdog.o
>
>
>>If I read the patch correctly, this should be recoded as
>> DECLARE_MUTEX
>>instead, but a quick grep of the source code indicates we have about
>>20 more places where DECLARE_MUTEX_LOCKED is still used. Should I do
>>a global replace on that or is something else needed?
>
>
> it's not normally used, and it's much simpler to rewrite those places
> than to implement initialization. (which would be quite hairy)
>
>
>>I also had a compile failure in XFS. The messages are:
>> CC [M] fs/xfs/quota/xfs_dquot_item.o
>> CC [M] fs/xfs/quota/xfs_trans_dquot.o
>
>
> ok, i've re-uploaded a new version of -U5 that has this and the
> ipmi_watchdog compilation problems fixed.
>
> please check whether it works, XFS does not seem to make use of count>1
> semaphores but one never knows ...
>
> Ingo
>
>
Well you just beat me with that one. :) And here is another for aha152x.
--- linux-2.6.9-rc4-mm1/drivers/scsi/aha152x.c.orig 2004-10-18 12:05:02.891049751 -0500
+++ linux-2.6.9-rc4-mm1/drivers/scsi/aha152x.c 2004-10-18 12:05:24.360353020 -0500
@@ -1160,7 +1160,7 @@
static int aha152x_device_reset(Scsi_Cmnd * SCpnt)
{
struct Scsi_Host *shpnt = SCpnt->device->host;
- DECLARE_MUTEX_LOCKED(sem);
+ DECLARE_MUTEX(sem);
struct timer_list timer;
int ret, issued, disconnected;
unsigned long flags;
* Adam Heath <[email protected]> wrote:
> > - debug feature: implemented /proc/sys/kernel/trace_verbose runtime
> > flag (default:0), which enables a much more verbose printout in
> > /proc/latency_trace. This trace format can be useful in e.g.
> > debugging timestamp weirdnesses.
>
> With all these proc values, what do you recommend they should be set
> to?
just the default values - same for the .config options. Once the feature
gets more stable the latency measurements can begin again - for them the
/proc values are important.
Ingo
On Mon, 18 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U5 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>
EXT3 FS on hda5, internal journal
(mount/71/CPU#0): new 493 us maximum-latency critical section.
=> started at timestamp 29143938: <kernel_fpu_begin+0x21/0x60>
=> ended at timestamp 29144431: <_mmx_memcpy+0x131/0x180>
[<c0132480>] sub_preempt_count+0x60/0x90
[<c01321ff>] check_preempt_timing+0x1af/0x250
[<c01dcf51>] _mmx_memcpy+0x131/0x180
[<c0132480>] sub_preempt_count+0x60/0x90
[<c01dcf51>] _mmx_memcpy+0x131/0x180
[<c01dcf51>] _mmx_memcpy+0x131/0x180
[<c01e3625>] vgacon_scroll+0x245/0x260
[<c01f3bba>] scrup+0xda/0xf0
[<c0113104>] mcount+0x14/0x18
[<c01f5702>] lf+0x72/0x80
[<c01f7e9f>] vt_console_print+0x13f/0x320
[<c011c231>] __call_console_drivers+0x61/0x70
[<c011c35a>] call_console_drivers+0x9a/0x140
[<c011c721>] release_console_sem+0x71/0x100
[<c011c5f6>] vprintk+0x116/0x180
[<c011c4dd>] printk+0x1d/0x20
[<c01a0957>] ext3_setup_super+0x127/0x1b0
[<c01db07f>] up_write+0x4f/0x80
[<c01a2552>] ext3_remount+0x132/0x190
[<c01dae41>] down_write+0x71/0xa0
[<c0162903>] do_remount_sb+0xb3/0xe0
[<c0179712>] do_remount+0x72/0xc0
[<c017a073>] do_mount+0x1a3/0x1b0
[<c01dae41>] down_write+0x71/0xa0
[<c017a3ec>] sys_mount+0x9c/0x100
[<c0106013>] syscall_call+0x7/0xb
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: kernel_fpu_begin+0x21/0x60 / (_mmx_memcpy+0x36/0x180)
.. entry 2: print_traces+0x1d/0x70 / (dump_stack+0x23/0x30)
=> dump-end timestamp 29144924
The kernel is just getting ready to start init at this point (mounting root),
so I don't know if you are really interested in this high latency trace, but
I'm sending it anyway.
However, after I reset the threshold to 50 (and got a few small traces), I got
this whopper.
(XFree86/1129/CPU#0): new 4692 us maximum-latency critical section.
=> started at timestamp 358506933: <call_console_drivers+0x76/0x140>
=> ended at timestamp 358511625: <finish_task_switch+0x43/0xa0>
[<c0132480>] sub_preempt_count+0x60/0x90
[<c01321ff>] check_preempt_timing+0x1af/0x250
[<c0117ca3>] finish_task_switch+0x43/0xa0
[<c0132480>] sub_preempt_count+0x60/0x90
[<c0117ca3>] finish_task_switch+0x43/0xa0
[<c0117ca3>] finish_task_switch+0x43/0xa0
[<c02a2859>] __sched_text_start+0x2d9/0x570
[<c02a3323>] schedule_timeout+0x63/0xc0
[<c02a316e>] cond_resched+0xe/0x90
[<c01253a0>] process_timeout+0x0/0x20
[<c02a3185>] cond_resched+0x25/0x90
[<c016f592>] do_select+0x172/0x280
[<c016f280>] __pollwait+0x0/0xb0
[<c016f984>] sys_select+0x294/0x570
[<c0106013>] syscall_call+0x7/0xb
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x3b/0x570 / (schedule_timeout+0x63/0xc0)
.. entry 2: print_traces+0x1d/0x70 / (dump_stack+0x23/0x30)
=> dump-end timestamp 358512077
ps: I've never mentioned the hardware I am running. Athlon XP 2000, 1G ram,
460G(usable) software raid5(3*250g ide)(plus boot 120G), LVM, extra
SiliconImage UDMA133 controller(mobo can only do 100).
I'm not certain what kind of latencies to expect with this setup. I'm
tending to ignore <100us, at least for now.
* Adam Heath <[email protected]> wrote:
> => dump-end timestamp 29144924
>
> The kernel is just getting ready to start init at this point (mounting
> root), so I don't know if you are really interested in this high
> latency trace, but I'm sending it anyway.
lets skip these for the time being, large runtime ones are the first
ones to be squashed.
> However, after I reset the threshold to 50(and got a few small traces), I got
> this whopper.
>
> (XFree86/1129/CPU#0): new 4692 us maximum-latency critical section.
> => started at timestamp 358506933: <call_console_drivers+0x76/0x140>
> => ended at timestamp 358511625: <finish_task_switch+0x43/0xa0>
> [<c0132480>] sub_preempt_count+0x60/0x90
interesting - this could be a printk (trace) done in a critical section
though. What does /proc/latency_trace show: is it full of console code
functions?
one of the best ways to avoid the console-printk-ing overhead is to do a
'dmesg -n 1' and reset the maximum back to 50. (i prefer to use the
preempt_max_latency option, not the preempt_thresh option.)
> ps: I've never mentioned the hardware I am running. Athlon XP 2000, 1G ram,
> 460G(usable) software raid5(3*250g ide)(plus boot 120G), LVM, extra
> SiliconImage UDMA133 controller(mobo can only do 100).
>
> I'm not certain what kind of latencies to expect with this setup. I'm
> tending to ignore <100us, at least for now.
this setup shouldn't produce above-100 usec latencies with -U5 and
PREEMPT_REALTIME.
Ingo
Ingo Molnar wrote:
> i have released the -U5 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>
Ingo,
*** Warning: "__you_cannot_kmalloc_that_much"
[drivers/scsi/aacraid/aacraid.ko] undefined!
This just appeared in U5. I was trying to track this one down just
because I saw it, even though I don't need aacraid. I am having a hell
of a time tracking down what changed that would cause this, but I figure
you will know exactly what changed that would cause it. :)
kr
* K.R. Foley <[email protected]> wrote:
> Ingo Molnar wrote:
> >i have released the -U5 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
> >
>
> Ingo,
>
> *** Warning: "__you_cannot_kmalloc_that_much"
> [drivers/scsi/aacraid/aacraid.ko] undefined!
>
> This just appeared in U5. I was trying to track this one down just
> because I saw it, even though I don't need aacraid. I am having a hell
> of a time tracking down what changed that would cause this, but I
> figure you will know exactly what changed that would cause it. :)
i suspect this is due to the size increase of semaphores if
CONFIG_RWSEM_DEADLOCK_DETECT is enabled. Try lowering
CONFIG_RWSEM_MAX_OWNERS from the default 64 to 32, does that help?
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Ingo Molnar wrote:
>>
>>>i have released the -U5 Real-Time Preemption patch:
>>>
>>> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>>>
>>
>>Ingo,
>>
>>*** Warning: "__you_cannot_kmalloc_that_much"
>>[drivers/scsi/aacraid/aacraid.ko] undefined!
>>
>>This just appeared in U5. I was trying to track this one down just
>>because I saw it, even though I don't need aacraid. I am having a hell
>>of a time tracking down what changed that would cause this, but I
>>figure you will know exactly what changed that would cause it. :)
>
>
> i suspect this is due to the size increase of semaphores if
> CONFIG_RWSEM_DEADLOCK_DETECT is enabled. Try lowering
> CONFIG_RWSEM_MAX_OWNERS from the default 64 to 32, does that help?
>
> Ingo
>
Yes. That does help.
kr
* K.R. Foley <[email protected]> wrote:
> Well you just beat me with that one. :) And here is another for
> aha152x.
> - DECLARE_MUTEX_LOCKED(sem);
> + DECLARE_MUTEX(sem);
almost - the full patch is the one below. (DECLARE_MUTEX() initializes
the mutex as unlocked, so there's a difference.)
Ingo
--- linux/drivers/scsi/aha152x.c.orig
+++ linux/drivers/scsi/aha152x.c
@@ -1160,11 +1160,12 @@ static void timer_expired(unsigned long
static int aha152x_device_reset(Scsi_Cmnd * SCpnt)
{
struct Scsi_Host *shpnt = SCpnt->device->host;
- DECLARE_MUTEX_LOCKED(sem);
+ DECLARE_MUTEX(sem);
struct timer_list timer;
int ret, issued, disconnected;
unsigned long flags;
+ init_MUTEX_LOCKED(&sem);
#if defined(AHA152X_DEBUG)
if(HOSTDATA(shpnt)->debug & debug_eh) {
printk(INFO_LEAD "aha152x_device_reset(%p)", CMDINFO(SCpnt), SCpnt);
* Bill Huey <[email protected]> wrote:
> On Mon, Oct 18, 2004 at 04:50:08PM +0200, Ingo Molnar wrote:
> > i have released the -U5 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>
> CC arch/i386/kernel/traps.o
> arch/i386/kernel/traps.c: In function `do_debug':
> arch/i386/kernel/traps.c:786: error: `sysenter_past_esp' undeclared (first use in this function)
> arch/i386/kernel/traps.c:786: error: (Each undeclared identifier is reported only once
> arch/i386/kernel/traps.c:786: error: for each function it appears in.)
> make[1]: *** [arch/i386/kernel/traps.o] Error 1
> make: *** [arch/i386/kernel] Error 2
i guess this might be an -mm1 breakage if CONFIG_KGDB enabled - does it
happen with vanilla -mm1 too?
Ingo
On Mon, Oct 18, 2004 at 09:36:03PM +0200, Ingo Molnar wrote:
> * Bill Huey <[email protected]> wrote:
> >
> > CC arch/i386/kernel/traps.o
> > arch/i386/kernel/traps.c: In function `do_debug':
> > arch/i386/kernel/traps.c:786: error: `sysenter_past_esp' undeclared (first use in this function)
> > arch/i386/kernel/traps.c:786: error: (Each undeclared identifier is reported only once
> > arch/i386/kernel/traps.c:786: error: for each function it appears in.)
> > make[1]: *** [arch/i386/kernel/traps.o] Error 1
> > make: *** [arch/i386/kernel] Error 2
>
> i guess this might be an -mm1 breakage if CONFIG_KGDB enabled - does it
> happen with vanilla -mm1 too?
yep, should I wait for -mm2 ?
bill
On Mon, Oct 18, 2004 at 04:50:08PM +0200, Ingo Molnar wrote:
> i have released the -U5 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
CC arch/i386/kernel/traps.o
arch/i386/kernel/traps.c: In function `do_debug':
arch/i386/kernel/traps.c:786: error: `sysenter_past_esp' undeclared (first use in this function)
arch/i386/kernel/traps.c:786: error: (Each undeclared identifier is reported only once
arch/i386/kernel/traps.c:786: error: for each function it appears in.)
make[1]: *** [arch/i386/kernel/traps.o] Error 1
make: *** [arch/i386/kernel] Error 2
bill
On Mon, Oct 18, 2004 at 09:46:32PM +0200, Ingo Molnar wrote:
> since 2.6.9's release is imminent there will unlikely be an -rc4-mm2,
> 2.6.9-mm1 should be the next one.
The new kgdb logic is a bit foreign to me. I'll have to look at it a
bit more, but this kgdb build problem is critical for a certain
part of the kernel community that needs it. I've commented out that
section and am rebuilding it now.
To get kgdb to work (referencing my tree), I've just demoted all of the
spinlocks in arch/i386/{kernel,lib}/kgdb*.c files. It's pretty
straightforward, nothing tricky at all.
bill
* Bill Huey <[email protected]> wrote:
> On Mon, Oct 18, 2004 at 09:36:03PM +0200, Ingo Molnar wrote:
> > * Bill Huey <[email protected]> wrote:
> > >
> > > CC arch/i386/kernel/traps.o
> > > arch/i386/kernel/traps.c: In function `do_debug':
> > > arch/i386/kernel/traps.c:786: error: `sysenter_past_esp' undeclared (first use in this function)
> > > arch/i386/kernel/traps.c:786: error: (Each undeclared identifier is reported only once
> > > arch/i386/kernel/traps.c:786: error: for each function it appears in.)
> > > make[1]: *** [arch/i386/kernel/traps.o] Error 1
> > > make: *** [arch/i386/kernel] Error 2
> >
> > i guess this might be an -mm1 breakage if CONFIG_KGDB enabled - does it
> > happen with vanilla -mm1 too?
>
> yep, should I wait for -mm2 ?
since 2.6.9's release is imminent there is unlikely to be an -rc4-mm2,
2.6.9-mm1 should be the next one.
Ingo
On Mon, Oct 18, 2004 at 12:32:51PM -0700, Bill Huey wrote:
> On Mon, Oct 18, 2004 at 04:50:08PM +0200, Ingo Molnar wrote:
> > i have released the -U5 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/voluntary-preempt/voluntary-preempt-2.6.9-rc4-mm1-U5
>
> CC arch/i386/kernel/traps.o
> arch/i386/kernel/traps.c: In function `do_debug':
> arch/i386/kernel/traps.c:786: error: `sysenter_past_esp' undeclared (first use in this function)
> arch/i386/kernel/traps.c:786: error: (Each undeclared identifier is reported only once
> arch/i386/kernel/traps.c:786: error: for each function it appears in.)
> make[1]: *** [arch/i386/kernel/traps.o] Error 1
> make: *** [arch/i386/kernel] Error 2
bah, let me handle this, CONFIG_KGDB was turned on and I'm working on
this now...
bill
On Mon, 18 Oct 2004, Ingo Molnar wrote:
> > However, after I reset the threshold to 50(and got a few small traces), I got
> > this whopper.
> >
> > (XFree86/1129/CPU#0): new 4692 us maximum-latency critical section.
> > => started at timestamp 358506933: <call_console_drivers+0x76/0x140>
> > => ended at timestamp 358511625: <finish_task_switch+0x43/0xa0>
> > [<c0132480>] sub_preempt_count+0x60/0x90
>
> interesting - this could be a printk (trace) done in a critical section
> though. What does /proc/latency_trace tell, is it full of console code
> functions?
Too late, it's gone. It'd be nice if there was some way to have history on
that file.
* Adam Heath <[email protected]> wrote:
> On Mon, 18 Oct 2004, Ingo Molnar wrote:
>
> > > However, after I reset the threshold to 50(and got a few small traces), I got
> > > this whopper.
> > >
> > > (XFree86/1129/CPU#0): new 4692 us maximum-latency critical section.
> > > => started at timestamp 358506933: <call_console_drivers+0x76/0x140>
> > > => ended at timestamp 358511625: <finish_task_switch+0x43/0xa0>
> > > [<c0132480>] sub_preempt_count+0x60/0x90
> >
> > interesting - this could be a printk (trace) done in a critical section
> > though. What does /proc/latency_trace tell, is it full of console code
> > functions?
>
> Too late, it's gone. It'd be nice if there was some way to have
> history on that file.
well - if it's gone it's always replaced by a larger latency (if you use
the preempt_max_latency method), which in most cases is more interesting
than the one you wanted to save.
Ingo
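The "replaced only by a larger latency" behaviour described above can be modelled in a few lines. This is a minimal user-space sketch, not the patch's code: the struct, the function name and the strictly-greater comparison are assumptions made for illustration. The numbers in the usage note are the ones quoted in this thread.

```c
/* Hypothetical model of a "keep only the worst case" latency recorder. */
struct latency_record {
    unsigned long max_us;    /* current maximum (or the threshold after a reset) */
    unsigned long start_ts;  /* timestamp where the recorded section started */
    unsigned long end_ts;    /* timestamp where the recorded section ended */
};

/*
 * Record a critical section only if it is longer than anything seen so far.
 * Returns 1 if the saved trace was replaced, 0 if the old one was kept.
 * Assumption: timestamps are in microseconds and do not wrap.
 */
static int record_latency(struct latency_record *r,
                          unsigned long start_ts, unsigned long end_ts)
{
    unsigned long delta = end_ts - start_ts;

    if (delta <= r->max_us)
        return 0;            /* shorter or equal: previous trace survives */

    r->max_us = delta;       /* new maximum overwrites the old trace */
    r->start_ts = start_ts;
    r->end_ts = end_ts;
    return 1;
}
```

With the threshold reset to 50, the 4692 us section (358506933 to 358511625) replaces the record, and a later 83 us section does not. Keeping a history, as Adam asks for, would need a list or ring of such records instead of a single slot.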
On Mon, 18 Oct 2004, Ingo Molnar wrote:
> > Too late, it's gone. It'd be nice if there was some way to have
> > history on that file.
>
> well - if it's gone it's always replaced by a larger latency (if you use
> the preempt_max_latency method), which in most cases is more interesting
> than the one you wanted to save.
I reset the minimum to 50, and it's only gotten up to 83.
On Mon, 18 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U5 Real-Time Preemption patch:
From the first line of the attached oops file:
kernel BUG at lib/rwsem-generic.c:130!
Then, a BUG, sleeping while atomic.
On Mon, 2004-10-18 at 16:50, Ingo Molnar wrote:
> i have released the -U5 Real-Time Preemption patch:
>
starting nfsd triggers the deadlock detection, but it's not really a
deadlock.
The first nfsd thread acquires nlmsvc_sema. The other 7 nfsd threads
wait on it. The first thread creates lockd and tries to get lockd_start,
which is initialized locked and will be released when the lockd thread
starts. The deadlock detector complains about a deadlock on nlmsvc_sema
inside of down(&lockd_start)?
Converting lockd_start to a waitqueue solves the problem and is saner
than the locked mutex.
tglx
--- 2.6.9-rc4-mm1-RT-U5/fs/lockd/svc.c.orig 2004-10-19 10:02:17.000000000 +0200
+++ 2.6.9-rc4-mm1-RT-U5/fs/lockd/svc.c 2004-10-19 09:54:22.000000000 +0200
@@ -46,7 +46,7 @@
int nlmsvc_grace_period;
unsigned long nlmsvc_timeout;
-static DECLARE_MUTEX(lockd_start);
+static DECLARE_WAIT_QUEUE_HEAD(lockd_start);
static DECLARE_WAIT_QUEUE_HEAD(lockd_exit);
/*
@@ -109,7 +109,7 @@
* Let our maker know we're running.
*/
nlmsvc_pid = current->pid;
- up(&lockd_start);
+ wake_up(&lockd_start);
daemonize("lockd");
@@ -230,6 +230,7 @@
printk(KERN_WARNING
"lockd_up: no pid, %d users??\n", nlmsvc_users);
+
error = -ENOMEM;
serv = svc_create(&nlmsvc_program, LOCKD_BUFSIZE);
if (!serv) {
@@ -258,7 +259,19 @@
"lockd_up: create thread failed, error=%d\n",
error);
goto destroy_and_out;
}
- down(&lockd_start);
+ /*
+ * Wait for the lockd process to start, but since we're holding
+ * the lockd semaphore, we can't wait around forever ...
+ */
+ clear_thread_flag(TIF_SIGPENDING);
+ interruptible_sleep_on_timeout(&lockd_start, HZ);
+ if (!nlmsvc_pid) {
+ printk(KERN_WARNING
+ "lockd_down: lockd failed to start\n");
+ }
+ spin_lock_irq(&current->sighand->siglock);
+ recalc_sigpending();
+ spin_unlock_irq(&current->sighand->siglock);
/*
* Note: svc_serv structures have an initial use count of 1,
@@ -423,7 +436,6 @@
static int __init init_nlm(void)
{
- init_MUTEX_LOCKED(&lockd_start);
nlm_sysctl_table = register_sysctl_table(nlm_sysctl_root, 0);
return nlm_sysctl_table ? 0 : -ENOMEM;
}
Installing knfsd (copyright (C) 1996 [email protected]).
mountdBUG: semaphore deadlock detected!
.. task nfsd/1246 is holding c89f4560.
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
Call Trace:
[<c0117fc7>] copy_process+0x5f7/0xc30
[<c01c4e62>] __delay+0x12/0x20
[<c01fe460>] serial8250_console_write+0x170/0x280
[<c011930e>] __call_console_drivers+0x5e/0x60
[<c011982a>] release_console_sem+0xda/0xe0
[<c01196a7>] vprintk+0x107/0x160
[<c01196a7>] vprintk+0x107/0x160
[<c012a3ae>] __kernel_text_address+0x2e/0x40
[<c0106bf5>] show_trace+0x35/0x90
[<c0106bf5>] show_trace+0x35/0x90
[<c0106cd0>] show_stack+0x80/0xa0
[<c01c2976>] __rwsem_deadlock+0x136/0x180
[<c01c28aa>] __rwsem_deadlock+0x6a/0x180
[<c02c3c33>] __down_write+0x93/0x170
[<c01c3033>] down_write+0x43/0x80
[<c89e661d>] lockd_up+0x9d/0x120 [lockd]
[<c89e62e0>] lockd+0x0/0x2a0 [lockd]
[<c883d2e1>] nfsd+0x91/0x380 [nfsd]
[<c0105e12>] ret_from_fork+0x6/0x14
[<c883d250>] nfsd+0x0/0x380 [nfsd]
[<c0103ffd>] kernel_thread_helper+0x5/0x18
...#0 task nfsd/1246 is holding c89f4560.
BUG: semaphore deadlock: nfsd/1252 is blocked on c89f4560, deadlocking
nfsd/1246
c733df60 00000046 c7adad10 c03d6060 00000000 00000000 00000000 00000000
00000000 00000000 c7a306f0 00005b6e 39133ac7 0000001a c7adae6c
c7adad10
c7adad10 c7584400 00000000 c02c3cad c89f4560 c6853f6c c6819f6c
c7adad10
Call Trace:
[<c02c3cad>] __down_write+0x10d/0x170
[<c01c3033>] down_write+0x43/0x80
[<c89e6591>] lockd_up+0x11/0x120 [lockd]
[<c883d250>] nfsd+0x0/0x380 [nfsd]
[<c883d2e1>] nfsd+0x91/0x380 [nfsd]
[<c0105e12>] ret_from_fork+0x6/0x14
[<c883d250>] nfsd+0x0/0x380 [nfsd]
[<c0103ffd>] kernel_thread_helper+0x5/0x18
------------[ cut here ]------------
On Mon, 2004-10-18 at 16:50, Ingo Molnar wrote:
> i have released the -U5 Real-Time Preemption patch:
All sleep_on variants trigger the irqs_disabled() check in schedule().
tglx
* Thomas Gleixner <[email protected]> wrote:
> On Mon, 2004-10-18 at 16:50, Ingo Molnar wrote:
> > i have released the -U5 Real-Time Preemption patch:
>
> All sleep_on variants trigger the irqs_disabled() check in schedule().
> tglx
ah, forgot that the waitqueue lock is a raw lock. Is there _any_
scenario where sleep_on() is actually correct kernel code?
Ingo
On Tue, 2004-10-19 at 11:04, Ingo Molnar wrote:
> > All sleep_on variants trigger the irqs_disabled() check in schedule().
> > tglx
>
> ah, forgot that the waitqueue lock is a raw lock. Is there _any_
> scenario where sleep_on() is actually correct kernel code?
Hmm, the sleep_on() variants are used quite a lot over the kernel. What's
wrong with them and to what should they be converted ?
tglx
* Thomas Gleixner <[email protected]> wrote:
> On Tue, 2004-10-19 at 11:04, Ingo Molnar wrote:
> > > All sleep_on variants trigger the irqs_disabled() check in schedule().
> > > tglx
> >
> > ah, forgot that the waitqueue lock is a raw lock. Is there _any_
> > scenario where sleep_on() is actually correct kernel code?
>
> Hmm, the sleep_on() variants are used quite a lot over the kernel.
> Whats wrong with them and to what should they be converted ?
they are racy on SMP. sleep_on() does:
    current->state = TASK_INTERRUPTIBLE;
    schedule();
and it is almost always a bug to go to sleep via sleep_on() _after_
checking for the condition, because the following could happen:
    CPU1                                    CPU2
    if (condition)
        goto done;
                                            wake_up(&waitqueue);
    current->state = TASK_INTERRUPTIBLE;
    schedule();
The proper interface is wait_event() (and variants).
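The interleaving above can be replayed in a small single-threaded model. This is only a sketch under assumptions: the `sim` struct, the `_SIM` state names and both helper functions are hypothetical, and the "other CPU" is simulated by calling the waker at the point where the race window sits. It shows why marking the task as sleeping before checking the condition (the order wait_event() uses) closes the window.

```c
#include <stdbool.h>

/* Hypothetical model of one sleeper and one waker. */
enum task_state { TASK_RUNNING_SIM, TASK_SLEEPING_SIM };

struct sim {
    bool condition;         /* what the sleeper is waiting for */
    enum task_state state;  /* sleeper's scheduler state */
};

/* Waker side: publish the condition, then make the sleeper runnable. */
static void sim_wake_up(struct sim *s)
{
    s->condition = true;
    s->state = TASK_RUNNING_SIM;   /* no effect if it is not asleep yet */
}

/*
 * sleep_on() order: check first, mark sleeping afterwards.
 * Returns true if the wakeup was lost, i.e. the task is asleep even
 * though the condition is already true.
 */
static bool sleep_on_style(struct sim *s, bool wake_in_window)
{
    if (s->condition)
        return false;                  /* "goto done" */
    if (wake_in_window)
        sim_wake_up(s);                /* CPU2 runs in the race window */
    s->state = TASK_SLEEPING_SIM;      /* current->state = TASK_INTERRUPTIBLE */
    return s->state == TASK_SLEEPING_SIM && s->condition;
}

/* wait_event() order: mark sleeping first, then check the condition. */
static bool wait_event_style(struct sim *s, bool wake_in_window)
{
    s->state = TASK_SLEEPING_SIM;      /* prepare to sleep */
    if (wake_in_window)
        sim_wake_up(s);                /* CPU2 runs in the same window */
    if (s->condition)
        s->state = TASK_RUNNING_SIM;   /* condition seen: do not sleep */
    return s->state == TASK_SLEEPING_SIM && s->condition;
}
```

In this model sleep_on_style() reports a lost wakeup when the waker runs inside the window, while wait_event_style() never does, whether the waker runs inside the window or not at all.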
your patch probably only works due to timing - the wakeup always happens
after sleep_on() has been called.
this particular NFS case is probably only correct due to userspace
behavior. The code is apparently relying on the wake_up() never
happening _before_ we do the sleep_on().
so, could you try the init_MUTEX_LOCKED() fix plus the patch below -
does that turn off the deadlock assert? (Plus also uncomment the
RWSEM_BUG() around line 130.)
Ingo
--- linux/lib/rwsem-generic.c.orig
+++ linux/lib/rwsem-generic.c
@@ -750,6 +750,15 @@ void fastcall sema_init(struct semaphore
case 0:
init_rwsem(&sem->lock);
down(sem);
+#ifdef CONFIG_RWSEM_DEADLOCK_DETECT
+ {
+ unsigned long flags;
+
+ rwsem_lock_irqsave(&rwsem_lock, flags);
+ rwsem_owner_del(&sem->lock);
+ rwsem_unlock_irqrestore(&rwsem_lock, flags);
+ }
+#endif
break;
default:
RWSEM_BUG();
* Ingo Molnar <[email protected]> wrote:
> so, could you try the init_MUTEX_LOCKED() fix plus the patch below -
> does that turn off the deadlock assert? (Plus also uncomment the
> RWSEM_BUG() around line 130.)
but i agree with you that this semaphore (ab-)use in the NFS code should
be fixed. Best would be to use a completion, but unfortunately there's
no wait_for_completion_timeout() API right now.
Ingo
On Tue, 2004-10-19 at 11:34, Ingo Molnar wrote:
> > Hmm, the sleep_on() variants are used quite a lot over the kernel.
> > Whats wrong with them and to what should they be converted ?
>
> they are racy on SMP. It does:
> The proper interface is wait_event() (and variants).
Sorry for being stupid. I remembered the wait_event stuff immediately
after hitting send (:
> your patch probably only works due to timing - the wakeup always happens
> after sleep_on() has been called.
>
> this particular NFS case is probably only correct due to userspace
> behavior. The code is apparently relying on the wake_up() never
> happening _before_ we do the sleep_on().
Correct fix appended. I think it's more sane than the locked mutex, as
we actually come back if lockd is not started for any reason.
> so, could you try the init_MUTEX_LOCKED() fix plus the patch below -
> does that turn off the deadlock assert? (Plus also uncomment the
> RWSEM_BUG() around line 130.)
Yep, that fixes the problem
tglx
--- 2.6.9-rc4-mm1-RT-U5/fs/lockd/svc.c.orig 2004-10-19 10:02:17.000000000 +0200
+++ 2.6.9-rc4-mm1-RT-U5/fs/lockd/svc.c 2004-10-19 11:34:12.000000000 +0200
@@ -46,7 +46,7 @@
int nlmsvc_grace_period;
unsigned long nlmsvc_timeout;
-static DECLARE_MUTEX(lockd_start);
+static DECLARE_WAIT_QUEUE_HEAD(lockd_start);
static DECLARE_WAIT_QUEUE_HEAD(lockd_exit);
/*
@@ -109,7 +109,7 @@
* Let our maker know we're running.
*/
nlmsvc_pid = current->pid;
- up(&lockd_start);
+ wake_up(&lockd_start);
daemonize("lockd");
@@ -230,6 +230,7 @@
printk(KERN_WARNING
"lockd_up: no pid, %d users??\n", nlmsvc_users);
+
error = -ENOMEM;
serv = svc_create(&nlmsvc_program, LOCKD_BUFSIZE);
if (!serv) {
@@ -258,8 +259,15 @@
"lockd_up: create thread failed, error=%d\n",
error);
goto destroy_and_out;
}
- down(&lockd_start);
-
+ /*
+ * Wait for the lockd process to start, but since we're holding
+ * the lockd semaphore, we can't wait around forever ...
+ */
+ if (wait_event_interruptible_timeout(lockd_start,
+ nlmsvc_pid != 0, HZ)) {
+ printk(KERN_WARNING
+ "lockd_down: lockd failed to start\n");
+ }
/*
* Note: svc_serv structures have an initial use count of 1,
* so we exit through here on both success and failure.
@@ -298,16 +306,12 @@
* Wait for the lockd process to exit, but since we're holding
* the lockd semaphore, we can't wait around forever ...
*/
- clear_thread_flag(TIF_SIGPENDING);
- interruptible_sleep_on_timeout(&lockd_exit, HZ);
- if (nlmsvc_pid) {
+ if (wait_event_interruptible_timeout(lockd_exit,
+ nlmsvc_pid == 0, HZ)) {
printk(KERN_WARNING
"lockd_down: lockd failed to exit, clearing pid\n");
nlmsvc_pid = 0;
}
- spin_lock_irq(&current->sighand->siglock);
- recalc_sigpending();
- spin_unlock_irq(&current->sighand->siglock);
out:
up(&nlmsvc_sema);
}
@@ -423,7 +427,6 @@
static int __init init_nlm(void)
{
- init_MUTEX_LOCKED(&lockd_start);
nlm_sysctl_table = register_sysctl_table(nlm_sysctl_root, 0);
return nlm_sysctl_table ? 0 : -ENOMEM;
}
BUG: semaphore deadlock detected!
.. task ksoftirqd/0/2 is holding c04019c4.
0000012c 00000001 c14f4000 00000000 c14f5f9c c011e713 c03a36b8 c011e7d9
c011eb18 0000000a c03a36b8 c14f4000 c14f4000 00000000 c14f5fa4 c011e7f1
c14f5fbc c011eb18 00000001 fffffff6 c14f4000 c14e3f70 c14f5fec c012d333
Call Trace:
[<c011e713>] ___do_softirq+0x83/0xc8
[<c011e7d9>] _do_softirq+0x8/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c011e7f1>] _do_softirq+0x20/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c012d333>] kthread+0xaa/0xaf
[<c011ea85>] ksoftirqd+0x0/0xd1
[<c012d289>] kthread+0x0/0xaf
[<c01032f1>] kernel_thread_helper+0x5/0xb
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x95/0x97 / (dev_queue_xmit_nit+0x41/0x127)
.. entry 2: __down_write+0x43/0x190 / (down_write+0x52/0x97)
.. entry 3: print_traces+0x1d/0x56 / (show_stack+0x85/0x9b)
...#0 task ksoftirqd/0/2 is holding c04019c4.
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:416!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: ipt_REJECT ipt_pkttype ipt_LOG ipt_state ipt_multiport ipt_conntrack iptable_mangle ip_nat_irc ip_nat_tftp ip_nat_ftp iptable_nat ip_conntrack_irc ip_conntrack_tftp ip_conntrack_ftp ip_conntrack iptable_filter ip_tables af_packet 8139too 8139cp mii crc32 e1000 snd_intel8x0 snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd soundcore snd_page_alloc ehci_hcd usbhid uhci_hcd usbcore intel_mch_agp intel_agp evdev nls_iso8859_1 nls_cp437 vfat fat dm_mod floppy ide_cd cdrom psmouse unix
CPU: 0
EIP: 0060:[<c0289aca>] Not tainted VLI
EFLAGS: 00010002 (2.6.9-rc4-mm1-RT-U5)
EIP is at __down_write+0xad/0x190
eax: 00000001 ebx: c14e0d20 ecx: 00000000 edx: c14f4000
esi: c04019c4 edi: dff1b080 ebp: c14f5d64 esp: c14f5d48
ds: 007b es: 007b ss: 0068 preempt: 00000003
Process ksoftirqd/0 (pid: 2, threadinfo=c14f4000 task=c14e0d20)
Stack: c04019c4 c04019c8 c04019c8 c14e0d20 00000002 c14f4000 c04019c4 c14f5d7c
c01b33c8 c04019c4 c14e0d20 dec0f000 df59a000 c14f5da0 c02364c5 c04019c0
00000202 c14f5da0 df59a000 dec0f000 df59a000 df59a358 c14f5dc4 c02420cd
Call Trace:
[<c01b33c8>] down_write+0x52/0x97
[<c02364c5>] dev_queue_xmit_nit+0x41/0x127
[<c02420cd>] qdisc_restart+0x182/0x1ac
[<c0236964>] dev_queue_xmit+0x194/0x230
[<c023c3e3>] neigh_resolve_output+0xf1/0x1e5
[<c023bf8d>] neigh_update+0x2c3/0x368
[<c0272642>] arp_process+0x1f8/0x5ac
[<c01132d0>] mcount+0x14/0x18
[<c0272a0a>] arp_rcv+0x14/0x155
[<c0236e14>] netif_receive_skb+0x10f/0x1ab
[<c0241c7e>] eth_type_trans+0xe/0xc8
[<e097582f>] rtl8139_rx+0x185/0x326 [8139too]
[<e0975bbb>] rtl8139_poll+0x5d/0xfe [8139too]
[<c023707b>] net_rx_action+0x7c/0x143
[<c011e713>] ___do_softirq+0x83/0xc8
[<c011e7d9>] _do_softirq+0x8/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c011e7f1>] _do_softirq+0x20/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c012d333>] kthread+0xaa/0xaf
[<c011ea85>] ksoftirqd+0x0/0xd1
[<c012d289>] kthread+0x0/0xaf
[<c01032f1>] kernel_thread_helper+0x5/0xb
preempt count: 00000004
. 4-level deep critical section nesting:
.. entry 1: down_write+0x95/0x97 / (dev_queue_xmit_nit+0x41/0x127)
.. entry 2: __down_write+0x43/0x190 / (down_write+0x52/0x97)
.. entry 3: die+0x3f/0x196 / (do_invalid_op+0x106/0x108)
.. entry 4: print_traces+0x1d/0x56 / (show_stack+0x85/0x9b)
Code: 10 89 45 ec 89 34 24 e8 47 90 f2 ff 89 34 24 e8 04 91 f2 ff 85 c0 74 1b a1 58 36 2d c0 85 c0 74 12 c7 05 58 36 2d c0 00 00 00 00 <0f> 0b a0 01 2f 32 2a c0 89 b3 70 06 00 00 9c 58 c1 e8 09 83 f0
<3>BUG: sleeping function called from invalid context ksoftirqd/0(2) at lib/rwsem-generic.c:593
in_atomic():1 [00000002], irqs_disabled():0
[<c0116e6c>] __might_sleep+0xc4/0xd6
[<c01b3275>] down_read+0x27/0x96
[<c011a353>] profile_task_exit+0x1b/0x43
[<c01132d0>] mcount+0x14/0x18
[<c011bfbf>] do_exit+0x1f/0x4a3
[<c0105e9a>] do_invalid_op+0x0/0x108
[<c0105af8>] do_divide_error+0x0/0x132
[<c0114a1c>] fixup_exception+0x1c/0x38
[<c0105fa0>] do_invalid_op+0x106/0x108
[<c01132d0>] mcount+0x14/0x18
[<c0289aca>] __down_write+0xad/0x190
[<c0119e3b>] release_console_sem+0xa7/0xfe
[<c0119ce2>] vprintk+0x116/0x179
[<c0105309>] error_code+0x2d/0x38
[<c0289aca>] __down_write+0xad/0x190
[<c01b33c8>] down_write+0x52/0x97
[<c02364c5>] dev_queue_xmit_nit+0x41/0x127
[<c02420cd>] qdisc_restart+0x182/0x1ac
[<c0236964>] dev_queue_xmit+0x194/0x230
[<c023c3e3>] neigh_resolve_output+0xf1/0x1e5
[<c023bf8d>] neigh_update+0x2c3/0x368
[<c0272642>] arp_process+0x1f8/0x5ac
[<c01132d0>] mcount+0x14/0x18
[<c0272a0a>] arp_rcv+0x14/0x155
[<c0236e14>] netif_receive_skb+0x10f/0x1ab
[<c0241c7e>] eth_type_trans+0xe/0xc8
[<e097582f>] rtl8139_rx+0x185/0x326 [8139too]
[<e0975bbb>] rtl8139_poll+0x5d/0xfe [8139too]
[<c023707b>] net_rx_action+0x7c/0x143
[<c011e713>] ___do_softirq+0x83/0xc8
[<c011e7d9>] _do_softirq+0x8/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c011e7f1>] _do_softirq+0x20/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c012d333>] kthread+0xaa/0xaf
[<c011ea85>] ksoftirqd+0x0/0xd1
[<c012d289>] kthread+0x0/0xaf
[<c01032f1>] kernel_thread_helper+0x5/0xb
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x95/0x97 / (dev_queue_xmit_nit+0x41/0x127)
.. entry 2: __down_write+0x43/0x190 / (down_write+0x52/0x97)
.. entry 3: print_traces+0x1d/0x56 / (dump_stack+0x23/0x27)
BUG: scheduling while atomic: ksoftirqd/0/0x04000002/2
caller is cond_resched+0x62/0x82
[<c0288df2>] __sched_text_start+0x53e/0x571
[<c02894c4>] cond_resched+0x62/0x82
[<c01132d0>] mcount+0x14/0x18
[<c02894c4>] cond_resched+0x62/0x82
[<c01b327a>] down_read+0x2c/0x96
[<c011a353>] profile_task_exit+0x1b/0x43
[<c01132d0>] mcount+0x14/0x18
[<c011bfbf>] do_exit+0x1f/0x4a3
[<c0105e9a>] do_invalid_op+0x0/0x108
[<c0105af8>] do_divide_error+0x0/0x132
[<c0114a1c>] fixup_exception+0x1c/0x38
[<c0105fa0>] do_invalid_op+0x106/0x108
[<c01132d0>] mcount+0x14/0x18
[<c0289aca>] __down_write+0xad/0x190
(ksoftirqd/0/2/CPU#0): new 856 us maximum-latency critical section.
=> started at timestamp 144481672: <__call_console_drivers+0x19/0x65>
=> ended at timestamp 144482528: <call_console_drivers+0x9e/0x13f>
[<c012ec10>] touch_preempt_timing+0x3d/0x41
[<c012eb37>] check_preempt_timing+0x1b2/0x24e
[<c0119a54>] call_console_drivers+0x9e/0x13f
[<c012ec10>] touch_preempt_timing+0x3d/0x41
[<c0119a54>] call_console_drivers+0x9e/0x13f
[<c0119a54>] call_console_drivers+0x9e/0x13f
[<c0288edb>] preempt_schedule+0x11/0x7a
[<c0119e04>] release_console_sem+0x70/0xfe
[<c0119ce2>] vprintk+0x116/0x179
[<c0289aca>] __down_write+0xad/0x190
[<c0119bc8>] printk+0x1d/0x21
[<c01055f8>] show_trace+0x78/0xb2
[<c0289aca>] __down_write+0xad/0x190
[<c01056f0>] dump_stack+0x23/0x27
[<c0288df2>] __sched_text_start+0x53e/0x571
[<c02894c4>] cond_resched+0x62/0x82
[<c01132d0>] mcount+0x14/0x18
[<c02894c4>] cond_resched+0x62/0x82
[<c01b327a>] down_read+0x2c/0x96
[<c011a353>] profile_task_exit+0x1b/0x43
[<c01132d0>] mcount+0x14/0x18
[<c011bfbf>] do_exit+0x1f/0x4a3
[<c0105e9a>] do_invalid_op+0x0/0x108
[<c0105af8>] do_divide_error+0x0/0x132
[<c0114a1c>] fixup_exception+0x1c/0x38
[<c0105fa0>] do_invalid_op+0x106/0x108
[<c01132d0>] mcount+0x14/0x18
[<c0289aca>] __down_write+0xad/0x190
[<c0119e3b>] release_console_sem+0xa7/0xfe
[<c0119ce2>] vprintk+0x116/0x179
[<c0105309>] error_code+0x2d/0x38
[<c0289aca>] __down_write+0xad/0x190
[<c01b33c8>] down_write+0x52/0x97
[<c02364c5>] dev_queue_xmit_nit+0x41/0x127
[<c02420cd>] qdisc_restart+0x182/0x1ac
[<c0236964>] dev_queue_xmit+0x194/0x230
[<c023c3e3>] neigh_resolve_output+0xf1/0x1e5
[<c023bf8d>] neigh_update+0x2c3/0x368
[<c0272642>] arp_process+0x1f8/0x5ac
[<c01132d0>] mcount+0x14/0x18
[<c0272a0a>] arp_rcv+0x14/0x155
[<c0236e14>] netif_receive_skb+0x10f/0x1ab
[<c0241c7e>] eth_type_trans+0xe/0xc8
[<e097582f>] rtl8139_rx+0x185/0x326 [8139too]
[<e0975bbb>] rtl8139_poll+0x5d/0xfe [8139too]
[<c023707b>] net_rx_action+0x7c/0x143
[<c011e713>] ___do_softirq+0x83/0xc8
[<c011e7d9>] _do_softirq+0x8/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c011e7f1>] _do_softirq+0x20/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c012d333>] kthread+0xaa/0xaf
[<c011ea85>] ksoftirqd+0x0/0xd1
[<c012d289>] kthread+0x0/0xaf
[<c01032f1>] kernel_thread_helper+0x5/0xb
preempt count: 04000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x95/0x97 / (dev_queue_xmit_nit+0x41/0x127)
.. entry 2: __down_write+0x43/0x190 / (down_write+0x52/0x97)
.. entry 3: print_traces+0x1d/0x56 / (dump_stack+0x23/0x27)
=> dump-end timestamp 144483248
[<c0119e3b>] release_console_sem+0xa7/0xfe
[<c0119ce2>] vprintk+0x116/0x179
[<c0105309>] error_code+0x2d/0x38
[<c0289aca>] __down_write+0xad/0x190
[<c01b33c8>] down_write+0x52/0x97
[<c02364c5>] dev_queue_xmit_nit+0x41/0x127
[<c02420cd>] qdisc_restart+0x182/0x1ac
[<c0236964>] dev_queue_xmit+0x194/0x230
[<c023c3e3>] neigh_resolve_output+0xf1/0x1e5
[<c023bf8d>] neigh_update+0x2c3/0x368
[<c0272642>] arp_process+0x1f8/0x5ac
[<c01132d0>] mcount+0x14/0x18
[<c0272a0a>] arp_rcv+0x14/0x155
[<c0236e14>] netif_receive_skb+0x10f/0x1ab
[<c0241c7e>] eth_type_trans+0xe/0xc8
[<e097582f>] rtl8139_rx+0x185/0x326 [8139too]
[<e0975bbb>] rtl8139_poll+0x5d/0xfe [8139too]
[<c023707b>] net_rx_action+0x7c/0x143
[<c011e713>] ___do_softirq+0x83/0xc8
[<c011e7d9>] _do_softirq+0x8/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c011e7f1>] _do_softirq+0x20/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c012d333>] kthread+0xaa/0xaf
[<c011ea85>] ksoftirqd+0x0/0xd1
[<c012d289>] kthread+0x0/0xaf
[<c01032f1>] kernel_thread_helper+0x5/0xb
preempt count: 04000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x95/0x97 / (dev_queue_xmit_nit+0x41/0x127)
.. entry 2: __down_write+0x43/0x190 / (down_write+0x52/0x97)
.. entry 3: print_traces+0x1d/0x56 / (dump_stack+0x23/0x27)
note: ksoftirqd/0[2] exited with preempt_count 2
BUG: scheduling while atomic: ksoftirqd/0/0x00000002/2
caller is do_exit+0x2a7/0x4a3
[<c0288df2>] __sched_text_start+0x53e/0x571
[<c011c247>] do_exit+0x2a7/0x4a3
[<c01132d0>] mcount+0x14/0x18
[<c011c247>] do_exit+0x2a7/0x4a3
[<c0105e9a>] do_invalid_op+0x0/0x108
[<c0105af8>] do_divide_error+0x0/0x132
[<c0114a1c>] fixup_exception+0x1c/0x38
[<c0105fa0>] do_invalid_op+0x106/0x108
[<c01132d0>] mcount+0x14/0x18
[<c0289aca>] __down_write+0xad/0x190
[<c0119e3b>] release_console_sem+0xa7/0xfe
[<c0119ce2>] vprintk+0x116/0x179
[<c0105309>] error_code+0x2d/0x38
[<c0289aca>] __down_write+0xad/0x190
[<c01b33c8>] down_write+0x52/0x97
[<c02364c5>] dev_queue_xmit_nit+0x41/0x127
[<c02420cd>] qdisc_restart+0x182/0x1ac
[<c0236964>] dev_queue_xmit+0x194/0x230
[<c023c3e3>] neigh_resolve_output+0xf1/0x1e5
[<c023bf8d>] neigh_update+0x2c3/0x368
[<c0272642>] arp_process+0x1f8/0x5ac
[<c01132d0>] mcount+0x14/0x18
[<c0272a0a>] arp_rcv+0x14/0x155
[<c0236e14>] netif_receive_skb+0x10f/0x1ab
[<c0241c7e>] eth_type_trans+0xe/0xc8
[<e097582f>] rtl8139_rx+0x185/0x326 [8139too]
[<e0975bbb>] rtl8139_poll+0x5d/0xfe [8139too]
[<c023707b>] net_rx_action+0x7c/0x143
[<c011e713>] ___do_softirq+0x83/0xc8
[<c011e7d9>] _do_softirq+0x8/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c011e7f1>] _do_softirq+0x20/0x22
[<c011eb18>] ksoftirqd+0x93/0xd1
[<c012d333>] kthread+0xaa/0xaf
[<c011ea85>] ksoftirqd+0x0/0xd1
[<c012d289>] kthread+0x0/0xaf
[<c01032f1>] kernel_thread_helper+0x5/0xb
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x95/0x97 / (dev_queue_xmit_nit+0x41/0x127)
.. entry 2: __down_write+0x43/0x190 / (down_write+0x52/0x97)
.. entry 3: print_traces+0x1d/0x56 / (dump_stack+0x23/0x27)
* Thomas Gleixner <[email protected]> wrote:
> + * Wait for the lockd process to start, but since we're holding
> + * the lockd semaphore, we can't wait around forever ...
> + */
> + if (wait_event_interruptible_timeout(lockd_start,
> + nlmsvc_pid != 0, HZ)) {
> + printk(KERN_WARNING
> + "lockd_down: lockd failed to start\n");
yeah, this is much cleaner. I'd suggest to remove the init_sem() hack
from lib/rwsem-generic.c, it seems it is a nice facility to find
semaphore abuses.
Ingo
On Tue, 2004-10-19 at 13:07, Ingo Molnar wrote:
> * Thomas Gleixner <[email protected]> wrote:
>
> > + * Wait for the lockd process to start, but since we're holding
> > + * the lockd semaphore, we can't wait around forever ...
> > + */
> > + if (wait_event_interruptible_timeout(lockd_start,
> > + nlmsvc_pid != 0, HZ)) {
> > + printk(KERN_WARNING
> > + "lockd_down: lockd failed to start\n");
>
> yeah, this is much cleaner.
Cleaner, but not perfect. The return value is > 0 if the timeout is not
reached, so the check above warns on success. Grmbl.
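For reference, wait_event_interruptible_timeout() returns 0 when the timeout elapsed with the condition still false, a positive value (the remaining jiffies) when the condition became true, and a negative value (-ERESTARTSYS) when a signal interrupted the wait. A minimal user-space sketch of a check built on that follows; the enum and both helpers are hypothetical names, and treating an interrupted wait as a failure is an assumption, not something the patch decides.

```c
/* Hypothetical names for the three outcomes of the wait. */
enum wait_outcome { WAIT_TIMED_OUT, WAIT_CONDITION_MET, WAIT_INTERRUPTED };

/* Map a wait_event_interruptible_timeout()-style return value to an outcome. */
static enum wait_outcome classify_wait(long ret)
{
    if (ret == 0)
        return WAIT_TIMED_OUT;       /* timeout elapsed, condition false */
    if (ret > 0)
        return WAIT_CONDITION_MET;   /* remaining jiffies: condition true */
    return WAIT_INTERRUPTED;         /* negative errno: signal arrived */
}

/* Warn only when lockd did not come up; assumption: a signal counts too. */
static int should_warn(long ret)
{
    return classify_wait(ret) != WAIT_CONDITION_MET;
}
```

In the patch this would roughly correspond to testing the return value with `<= 0` instead of testing it for non-zero.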
> I'd suggest to remove the init_sem() hack
> from lib/rwsem-generic.c, it seems it is a nice facility to find
> semaphore abuses.
True. Will do so.
tglx
i have released the -U6 Real-Time Preemption patch:
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
this is a fixes-only release.
found and fixed the 'big bug' that was probably the one causing
stability problems for a number of people. There was a small window for
a task double-free race to occur, causing all sorts of crashes later on.
This bug could trigger on UP and SMP systems alike, on SMP being a bit
more frequent.
Also, a common networking deadlock was found and fixed as well, using
the deadlock detector.
Changes since -U5:
- crash bug: fix task double free caused by irq-preemption of
do_exit(). This got introduced in -U5 as part of a simplification of
the zombie-reaping rewrite that the -U series did. That rewrite had an
unrobustness which got triggered by -U5 in a subtle way, opening up a
small window at the end of do_exit() for an interrupt-triggered
preemption to cause a double-free. This could fix some of the crashes
reported by Rui Nuno Capela, Mark H Johnson.
- deadlock bug: fix networking deadlock reported by Matthew L Foster.
Restructured the way the RT-RCU locking of ptype_lock is done - it's
cleaner and more obvious now (besides being correct). This could also
fix the deadlock reported by Michal Schmidt.
- deadlock bug: fix NFS startup breakage related to semaphore abuse,
patch from Thomas Gleixner.
- build bug: fix aha152x.c, based on patch from K.R. Foley.
- build bug: fix compilation error in qla2xxx. (reported by Fernando
Pablo Lopez-Lezcano and Mark H Johnson)
- build bug: fix !PREEMPT_REALTIME compilation error. (reported by
Matthew L Foster)
- build bug: fix ipmi-watchdog compilation error. (reported by Mark H
Johnson)
- tracer fix: if an assert happens within the tracer then we'd get into
infinite recursion. The fix was to correctly nest tracing on/off
points.
- debug enhancement: added a few more asserts to catch underflowing
atomic counters. (this made the task double-free trigger earlier.)
- debug enhancement: extended CONFIG_DEBUG_STACKOVERFLOW to be
mcount()-driven as well. This helps in catching stack overflows much
more reliably than the do_IRQ() based method.
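The "correctly nest tracing on/off points" fix from the list above can be pictured with a depth counter. This is a guess at the general technique, not the code in -U6: the counter, the function names and the single global (rather than per-CPU) state are all assumptions made for illustration.

```c
#include <stdbool.h>

/* Hypothetical counter: 0 means tracing is on, > 0 means it is switched off. */
static int trace_off_depth;
static int events_recorded;

static void trace_off(void)
{
    trace_off_depth++;               /* off points may nest */
}

static void trace_on(void)
{
    if (trace_off_depth > 0)
        trace_off_depth--;           /* only the outermost "on" re-enables */
}

static bool tracing_enabled(void)
{
    return trace_off_depth == 0;
}

/* A trace hook that cannot recurse into itself, e.g. via an assert. */
static void trace_event(void)
{
    if (!tracing_enabled())
        return;
    trace_off();                     /* anything called from here is untraced */
    events_recorded++;
    trace_on();
}
```

With a plain on/off flag, an inner "on" would re-enable tracing while an outer section still expects it to be off; counting the nesting depth avoids that.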
to create a -U6 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
Ingo
Ingo Molnar <[email protected]> writes:
> i have released the -U5 Real-Time Preemption patch:
[...]
> the generic semaphore implementation (which uses rwsems) makes it
> possible to use the deadlock detection mechanism for all the mutex types
> we currently have: semaphores, rw-semaphores, spinlock-mutexes and
> rwlock-mutexes. Another benefit is that PREEMPT_REALTIME becomes much
> more portable this way. (although it's still x86-only at the moment.)
Speaking of portability, is anyone yet working on ports to other
platforms? I'm particularly interested in ARM.
If anyone has started on ARM, I'd be glad to help out.
Kevin
http://hilman.org/kevin/
* Kevin Hilman <[email protected]> wrote:
> > the generic semaphore implementation (which uses rwsems) makes it
> > possible to use the deadlock detection mechanism for all the mutex types
> > we currently have: semaphores, rw-semaphores, spinlock-mutexes and
> > rwlock-mutexes. Another benefit is that PREEMPT_REALTIME becomes much
> > more portable this way. (although it's still x86-only at the moment.)
>
> Speaking of portability, is anyone yet working on ports to other
> platforms? I'm particularily interested in ARM.
i'll do x64 a couple of days after stability has been reached. I don't
know of any ARM efforts though.
a good starting point would be to enhance the generic-hardirqs framework
for ARM. Generic-hardirqs is a portion of the PREEMPT_REALTIME patch
that i've split out and submitted upstream, and which is expected to be
merged into 2.6.10. The generic irq subsystem makes the irq-threading
feature really simple and maintainable. So for PREEMPT_REALTIME to work
on ARM the first step is to enable generic-hardirqs on ARM. (which is
far from simple though.)
Ingo
* Ingo Molnar <[email protected]> wrote:
> i have released the -U6 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
i've re-released the patch because shortly after releasing it i found a
false-positive in the deadlock-detector that was triggering in oowriter.
The latest patch is thus:
$ md5sum realtime-preempt-2.6.9-rc4-mm1-U6
9fd546bdd2d45ff1a8d5a88160135170 realtime-preempt-2.6.9-rc4-mm1-U6
if you've got the earlier one and have CONFIG_RWSEM_DEADLOCK_DETECT
enabled then please download the new patch.
Ingo
Ingo Molnar wrote:
>
>> i have released the -U6 Real-Time Preemption patch:
>>
>> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
>
I'm experiencing terrible kernel panics at a very early bootstrap stage
while testing the latest U5 and U6 patch(es) on my laptop (P4/UP) --
(Ingo: this is the very same trouble I reported while pre-testing
U6).
Sorry that I can't show any trace dumps; only a hard-screenshot (with
digital camera?) would be possible but rather incomplete. The serial
console hack is not an option--these "modern" laptops don't come with
serial ports anymore, and netconsole is a no-op at such an early point of
the boot process. Or so I believe.
OK. After some incremental configurations, I've isolated that those
oops(es) only occur if PREEMPT_TIMING and/or LATENCY_TRACE are set (Y). My
first suspect was the new RWSEM_DEADLOCK_DETECT, but that wasn't the
case.
So something has broken on that non-preemptible critical section timing
stuff since U4.
Hasn't anybody else stumbled on this?
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> OK. After some incremental configurations, I've isolated that those
> oops(es) only occur if PREEMPT_TIMING and/or LATENCY_TRACE are set
> (Y). My first suspect was the new RWSEM_DEADLOCK_DETECT, but that
> wasn't the case.
>
> So something has broken on that non-preemptible critical section
> timing stuff since U4.
>
> Hasn't anybody else stumbled on this?
i'm using it myself and havent seen the problem yet. Could you send me
the latest .configs, the working and the broken one too? I'll try to
reproduce it (or maybe someone else with a serial console sees it too).
Ingo
On Tue, 2004-10-19 at 17:23, Rui Nuno Capela wrote:
> Ingo Molnar wrote:
> >
> >> i have released the -U6 Real-Time Preemption patch:
> >>
> >> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
> >
>
> I'm experiencing terrible kernel panics at a very early bootstrap stage
> while testing the latest U5 and U6 patch(es) on my laptop (P4/UP) --
> (Ingo: this is the very same trouble I reported while pre-testing
> U6).
>
> Sorry that I can't show any trace dumps; only a hard-screenshot (with
> digital camera?) would be possible but rather incomplete. The serial
> console hack is not an option--these "modern" laptops don't come with
> serial ports anymore, and netconsole is a no-op at such an early point of
> the boot process. Or so I believe.
>
> OK. After some incremental configurations, I've isolated that those
> oops(es) only occur if PREEMPT_TIMING and/or LATENCY_TRACE are set (Y). My
> first suspect was the new RWSEM_DEADLOCK_DETECT, but that wasn't the
> case.
Same here on a Dell P4/UP.
tglx
On Tue, 2004-10-19 at 16:46, Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
> i've re-released the patch because shortly after releasing it i found a
> false-positive in the deadlock-detector that was triggering in oowriter.
Hit and converted another one. There are more, but they need more
modifications as they don't have a condition to wait for and therefore
must be converted to use the completion API, which must be extended to
provide completion_timeout() first.
tglx
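For illustration only, the kind of conversion described above can be
sketched in userspace. This is a minimal sketch that assumes POSIX
threads as a stand-in for the kernel primitives; toy_completion and its
helpers are hypothetical names, not the kernel's completion API, and the
timed wait is the piece the mail says still has to be added there.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Hypothetical userspace model of a completion: a counter of pending
 * "done" events protected by a mutex, plus a condition variable. */
struct toy_completion {
	pthread_mutex_t lock;
	pthread_cond_t cond;
	unsigned int done;
};

static void toy_init_completion(struct toy_completion *c)
{
	assert(c != NULL);
	pthread_mutex_init(&c->lock, NULL);
	pthread_cond_init(&c->cond, NULL);
	c->done = 0;
}

/* Signal one waiter that the event has happened. */
static void toy_complete(struct toy_completion *c)
{
	pthread_mutex_lock(&c->lock);
	c->done++;
	pthread_cond_signal(&c->cond);
	pthread_mutex_unlock(&c->lock);
}

/* Wait for the event for at most timeout_ms milliseconds.
 * Returns 1 if an event was consumed, 0 if the wait timed out. */
static int toy_wait_for_completion_timeout(struct toy_completion *c,
					   long timeout_ms)
{
	struct timespec deadline;
	int ret;

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME time */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += timeout_ms / 1000;
	deadline.tv_nsec += (timeout_ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&c->lock);
	while (!c->done) {
		if (pthread_cond_timedwait(&c->cond, &c->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	/* re-check: the event may have arrived right at the deadline */
	ret = c->done > 0;
	if (ret)
		c->done--;
	pthread_mutex_unlock(&c->lock);
	return ret;
}
```

Unlike a semaphore used as a signal, the waiter never "owns" anything
here, so there is nothing for an owner-tracking deadlock detector to
misread.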
* Thomas Gleixner <[email protected]> wrote:
> > Sorry that I can't show any trace dumps; only a hard-screenshot (with
> > digital camera?) would be possible but rather incomplete. The serial
> > console hack is not an option--these "modern" laptops don't come with
> > serial ports anymore, and netconsole is a no-op at such an early point of
> > the boot process. Or so I believe.
> >
> > OK. After some incremental configurations, I've isolated that those
> > oops(es) only occur if PREEMPT_TIMING and/or LATENCY_TRACE are set (Y). My
> > first suspect was the new RWSEM_DEADLOCK_DETECT, but that wasn't the
> > case.
>
> Same here on a Dell P4/UP.
any chance for serial logging on that box?
and does this bootup crash go away if you unset PREEMPT_TIMING or
LATENCY_TRACE, as suggested by Rui?
Ingo
Ingo Molnar wrote:
> * Rui Nuno Capela <[email protected]> wrote:
>
>
>>OK. After some incremental configurations, I've isolated that those
>>oops(es) only occur if PREEMPT_TIMING and/or LATENCY_TRACE are set
>>(Y). My first suspect was the new RWSEM_DEADLOCK_DETECT, but that
>>wasn't the case.
>>
>>So something has broken on that non-preemptible critical section
>>timing stuff since U4.
>>
>>Hasn't anybody else stumbled on this?
>
>
> i'm using it myself and havent seen the problem yet. Could you send me
> the latest .configs, the working and the broken one too? I'll try to
> reproduce it (or maybe someone else with a serial console sees it too).
>
> Ingo
>
I am seeing something similar here with both U5 and U6 on both of my SMP
systems (actually I haven't gotten to try U6 on my SMP system at home
yet.) On my SMP system here (dual Xeon) I get a handful of traces during
the boot and then the last thing I see is a trace that has something to
do with parport, but the key MIGHT be that it always seems to crap out
when I get traces for 3-level deep critical section nesting. I will try
to capture the trace the old-fashioned way when I get a chance shortly.
I do have U5 booted on my UP system at home though.
kr
* Ingo Molnar <[email protected]> wrote:
> > Hasn't anybody else stumbled on this?
>
> i'm using it myself and havent seen the problem yet. Could you send me
> the latest .configs, the working and the broken one too? I'll try to
> reproduce it (or maybe someone else with a serial console sees it too).
i found older .config's from you and i tried your desktop one and it
didnt crash. But when i tried your laptop's U3 .config then i got the
bootup crash immediately. Debugging it ...
Ingo
* Ingo Molnar <[email protected]> wrote:
> > i found older .config's from you and i tried your desktop one and it
> > didnt crash. But when i tried your laptop's U3 .config then i got the
> > bootup crash immediately. Debugging it ...
>
> one difference in your config is that you have 4K stacks enabled.
> Could you disable them? Especially with rwsem-detection and tracing
> enabled the stack footprint can get pretty large ...
indeed, this is what triggers with your .config:
testing NMI watchdog ... OK.
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
mcount: stack overflow: 1008
[<c012cd9f>] ___trace+0x105/0x117
[<c012cdd7>] __mcount+0x1d/0x1f
[<c013e015>] cache_grow+0xe/0x1ab
[<c013e3cd>] cache_alloc_refill+0x21b/0x253
enabling 8K stacks ought to help this one. I've made the limit a bit too
conservative - there's still 1000 bytes left and the assert hits. Here's
the full trace, the large footprint seems to be in zlib initialization:
mcount: stack overflow: 1008
[<c012cd9f>] ___trace+0x105/0x117
[<c012cdd7>] __mcount+0x1d/0x1f
[<c013e015>] cache_grow+0xe/0x1ab
[<c013e3cd>] cache_alloc_refill+0x21b/0x253
[<c010efec>] mcount+0x14/0x18
[<c013e015>] cache_grow+0xe/0x1ab
[<c013e3cd>] cache_alloc_refill+0x21b/0x253
[<c013e730>] __kmalloc+0x82/0x9f
[<c03607c8>] malloc+0x1e/0x20
[<c01008c9>] huft_build+0x309/0x5e8
[<c0101bec>] inflate+0x4c/0xb0
[<c010efec>] mcount+0x14/0x18
[<c0101279>] inflate_fixed+0xcb/0x1a4
[<c0101bec>] inflate+0x4c/0xb0
[<c010efec>] mcount+0x14/0x18
[<c0101eae>] gunzip+0x1d4/0x396
[<c036130e>] unpack_to_rootfs+0x162/0x225
[<c010efec>] mcount+0x14/0x18
[<c0100434>] init+0x0/0x124
[<c03613fe>] populate_rootfs+0x2d/0x3f
[<c010046b>] init+0x37/0x124
[<c0102365>] kernel_thread_helper+0x5/0xb
Ingo
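A rough sketch of the arithmetic behind such a check, for illustration
only. It assumes the stack area is a power-of-two size, aligned to that
size, and grows downwards; the helper names, the sample stack pointer
and the 1024-byte limit are hypothetical and not taken from the patch.

```c
#include <assert.h>
#include <stdint.h>

/* Bytes between the stack pointer and the bottom of its stack area.
 * 'reserved' is whatever lives at the bottom of the area and must not
 * be overwritten (pass 0 to ignore it). */
static unsigned long stack_bytes_left(uintptr_t sp, unsigned long thread_size,
				      unsigned long reserved)
{
	unsigned long offset;

	/* the masking trick only works for power-of-two sizes */
	assert(thread_size != 0 && (thread_size & (thread_size - 1)) == 0);

	offset = (unsigned long)(sp & (uintptr_t)(thread_size - 1));
	return offset > reserved ? offset - reserved : 0;
}

/* Nonzero if fewer than 'limit' bytes remain, i.e. the check fires. */
static int stack_check_trips(uintptr_t sp, unsigned long thread_size,
			     unsigned long reserved, unsigned long limit)
{
	return stack_bytes_left(sp, thread_size, reserved) < limit;
}
```

With a made-up stack pointer of 0xc15a13f0, a 4096-byte stack leaves
1008 bytes and a 1024-byte limit trips; the same call depth in an
8192-byte stack leaves 5104 bytes and passes.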
* Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > > Hasn't anybody else stumbled on this?
> >
> > i'm using it myself and havent seen the problem yet. Could you send me
> > the latest .configs, the working and the broken one too? I'll try to
> > reproduce it (or maybe someone else with a serial console sees it too).
>
> i found older .config's from you and i tried your desktop one and it
> didnt crash. But when i tried your laptop's U3 .config then i got the
> bootup crash immediately. Debugging it ...
one difference in your config is that you have 4K stacks enabled. Could
you disable them? Especially with rwsem-detection and tracing enabled
the stack footprint can get pretty large ...
Ingo
On Tue, 2004-10-19 at 17:57, Ingo Molnar wrote:
> any chance for serial logging on that box?
No
> and does this bootup crash go away if you unset PREEMPT_TIMING or
> LATENCY_TRACE, as suggested by Rui?
It comes into init now, but dies when loading the AGP driver. Have to
look into this.
tglx
On Tue, 19 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U6 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
>
> this is a fixes-only release.
>
> found and fixed the 'big bug' that was probably the one causing
> stability problems for a number of people. There was a small window for
> a task double-free race to occur, causing all sorts of crashes later on.
> This bug could trigger on UP and SMP systems alike, on SMP being a bit
> more frequent.
I am still having the same bug (repeatable by running liquidwar) as I reported
with -U5 (see my earlier email).
* Ingo Molnar <[email protected]> wrote:
> enabling 8K stacks ought to help this one. I've made the limit a bit
> too conservative - there's still 1000 bytes left and the assert hits.
> Here's the full trace, the large footprint seems to be in zlib
> initialization:
>
> mcount: stack overflow: 1008
i've added a stackframe-size field to the end of every stack trace
entry:
mcount: stack overflow: 1008
[<c012cdaf>] ___trace+0x105/0x117 (12)
[<c012cde7>] __mcount+0x1d/0x1f (32)
[<c013e025>] cache_grow+0xe/0x1ab (4)
[<c013e3dd>] cache_alloc_refill+0x21b/0x253 (4)
[<c010effc>] mcount+0x14/0x18 (8)
[<c013e025>] cache_grow+0xe/0x1ab (20)
[<c013e3dd>] cache_alloc_refill+0x21b/0x253 (52)
[<c013e740>] __kmalloc+0x82/0x9f (48)
[<c03607c8>] malloc+0x1e/0x20 (28)
[<c01008c9>] huft_build+0x309/0x5e8 (16)
[<c0101bec>] inflate+0x4c/0xb0 (1444)
[<c010effc>] mcount+0x14/0x18 (8)
[<c0101279>] inflate_fixed+0xcb/0x1a4 (20)
[<c0101bec>] inflate+0x4c/0xb0 (1212)
[<c010effc>] mcount+0x14/0x18 (12)
[<c0101eae>] gunzip+0x1d4/0x396 (20)
[<c036130e>] unpack_to_rootfs+0x162/0x225 (28)
[<c010effc>] mcount+0x14/0x18 (8)
[<c0100434>] init+0x0/0x124 (4)
[<c03613fe>] populate_rootfs+0x2d/0x3f (16)
[<c010046b>] init+0x37/0x124 (20)
[<c0102365>] kernel_thread_helper+0x5/0xb (20)
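To put the per-entry sizes in proportion, a small sketch that just adds
them up. The array is copied from the trace above; the helper names and
the 1000-byte cut-off for "large" are arbitrary choices made for this
illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Per-entry stack frame sizes from the trace, top to bottom. */
static const unsigned frame_bytes[] = {
	12, 32, 4, 4, 8, 20, 52, 48, 28, 16, 1444,
	8, 20, 1212, 12, 20, 28, 8, 4, 16, 20, 20
};

#define FRAME_COUNT (sizeof(frame_bytes) / sizeof(frame_bytes[0]))

/* Total stack used by all listed frames. */
static unsigned sum_frames(const unsigned *frames, size_t n)
{
	unsigned total = 0;
	size_t i;

	assert(frames != NULL);
	for (i = 0; i < n; i++)
		total += frames[i];
	return total;
}

/* Stack used by frames of at least 'threshold' bytes. */
static unsigned sum_large_frames(const unsigned *frames, size_t n,
				 unsigned threshold)
{
	unsigned total = 0;
	size_t i;

	assert(frames != NULL);
	for (i = 0; i < n; i++)
		if (frames[i] >= threshold)
			total += frames[i];
	return total;
}
```

The 22 entries add up to 3036 bytes, and the two entries above 1000
bytes account for 2656 of them, so almost all of a 4K stack is consumed
by the decompression path.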
as suspected, zlib's huft_build() is fun:
lib/inflate.c:
#define N_MAX 288 /* maximum number of codes in any set */
STATIC int huft_build(
...
{
unsigned v[N_MAX]; /* values in order of bit length */
a whopping 1152 bytes for this local variable alone! The patch below
fixes this, but there are other overflows as well, later in the bootup.
Ingo
--- linux/lib/inflate.c.orig
+++ linux/lib/inflate.c
@@ -300,7 +300,7 @@ STATIC int huft_build(
register struct huft *q; /* points to current table */
struct huft r; /* table entry for structure assignment */
struct huft *u[BMAX]; /* table stack */
- unsigned v[N_MAX]; /* values in order of bit length */
+ unsigned *v; /* values in order of bit length */
register int w; /* bits before this table == (l * h) */
unsigned x[BMAX+1]; /* bit offsets, then code stack */
unsigned *xp; /* pointer into x */
@@ -309,6 +309,10 @@ STATIC int huft_build(
DEBG("huft1 ");
+ /* allocate new table */
+ v = (unsigned *)malloc(sizeof(unsigned)*N_MAX);
+ if (!v)
+ return 3; /* not enough memory */
/* Generate counts for each bit length */
memzero(c, sizeof(c));
p = b; i = n;
@@ -322,6 +326,7 @@ DEBG("huft1 ");
{
*t = (struct huft *)NULL;
*m = 0;
+ free(v);
return 0;
}
@@ -347,10 +352,14 @@ DEBG("huft3 ");
/* Adjust last length count to fill out codes, if needed */
for (y = 1 << j; j < i; j++, y <<= 1)
- if ((y -= c[j]) < 0)
+ if ((y -= c[j]) < 0) {
+ free(v);
return 2; /* bad input: more codes than bits */
- if ((y -= c[i]) < 0)
+ }
+ if ((y -= c[i]) < 0) {
+ free(v);
return 2;
+ }
c[i] += y;
DEBG("huft4 ");
@@ -422,6 +431,7 @@ DEBG1("3 ");
{
if (h)
huft_free(u[0]);
+ free(v);
return 3; /* not enough memory */
}
DEBG1("4 ");
@@ -485,6 +495,7 @@ DEBG("h6f ");
DEBG("huft7 ");
+ free(v);
/* Return true (1) if we were given an incomplete table */
return y != 0 && g != 1;
}
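The same stack-to-heap move can be sketched outside the kernel. This is
a minimal sketch, not the code from the patch: build_values() is a
hypothetical stand-in for huft_build(), it uses the C library's
malloc()/free(), and it funnels all exits through a single label instead
of adding a free() before each return as the patch does.

```c
#include <assert.h>
#include <stdlib.h>

#define N_MAX 288	/* maximum number of codes in any set */

/* Collect the indices of all nonzero code lengths.
 * Returns 0 on success, 2 on bad input, 3 if out of memory
 * (the same return-code convention the patch keeps). */
static int build_values(const unsigned *lengths, unsigned n,
			unsigned *count_out)
{
	unsigned *v;	/* was: unsigned v[N_MAX] on the stack */
	unsigned i, used = 0;
	int ret = 0;

	assert(lengths != NULL && count_out != NULL);
	if (n > N_MAX)
		return 2;

	v = malloc(sizeof(*v) * N_MAX);
	if (!v)
		return 3;	/* not enough memory */

	for (i = 0; i < n; i++) {
		if (lengths[i] > 16) {
			ret = 2;	/* bad input */
			goto out;
		}
		if (lengths[i] != 0)
			v[used++] = i;
	}
	*count_out = used;
out:
	free(v);	/* single place where the buffer is released */
	return ret;
}
```

With a 4-byte unsigned, sizeof(unsigned) * N_MAX is the 1152 bytes
mentioned above; after the change the function's frame holds only a
pointer.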
* Adam Heath <[email protected]> wrote:
> I am still having the same bug (repeatable by running liquidwar) as I
> reported with -U5 (see my earlier email).
ok, this seems to be some questionable code in OSS. It really has no
business up()-ing the inode semaphore - nobody down()-ed it before! This
could be either a bad workaround for a bug/hang someone saw, or an old
VFS assumption that doesnt hold anymore. In any case, could you try the
patch below, does it fix liquidwar?
Ingo
--- linux/sound/core/oss/pcm_oss.c.orig
+++ linux/sound/core/oss/pcm_oss.c
@@ -2120,9 +2120,7 @@ static ssize_t snd_pcm_oss_write(struct
substream = pcm_oss_file->streams[SNDRV_PCM_STREAM_PLAYBACK];
if (substream == NULL)
return -ENXIO;
- up(&file->f_dentry->d_inode->i_sem);
result = snd_pcm_oss_write1(substream, buf, count);
- down(&file->f_dentry->d_inode->i_sem);
#ifdef OSS_DEBUG
printk("pcm_oss: write %li bytes (wrote %li bytes)\n", (long)count, (long)result);
#endif
* [email protected] <[email protected]> wrote:
> Booted to single user and was able to get some network operations
> going with this version (w/ previously mentioned update). However, at
> the step where I start CUPS, I got a number of traces on the display
> referring to parport_pc related function calls [but I don't use a
> parallel printer...]. It ended with:
thanks for the logs - there are some semaphore assumptions in
ieee1284.c, it should use completions & wait_for_completion_timeout()
too. The workaround is to disable CONFIG_PARPORT_1284. (or
CONFIG_PARPORT altogether.)
Ingo
On Tue, 19 Oct 2004 18:56:46 +0200
Ingo Molnar <[email protected]> wrote:
> do you get the same pauses if you do 'dmesg -n 1'? Also, are you using
> preempt_thresh or the maximum-searching variant? preempt_thresh can
> generate _tons_ of messages with a low threshold, freezing the system in
> essence for long periods of time.
afaik i use the maximum search:
mango:~# cat /proc/sys/kernel/preempt_thresh
0
mango:~# cat /proc/sys/kernel/preempt_max_latency
1841
I'll compile a kernel w/o preempt-timing and tracing. But i will get to it
tomorrow at the earliest.
> but this trace is weird:
[snip]
> this doesnt seem like normal behavior. It seems two tasks are
> ping-pong-ing a semaphore but are unable to make any progress. The whole
> thing is non-preemptible because this semaphore was taken while in a
> PREEMPT_ACTIVE section.
>
> (i'd say this is the BKL semaphore - it is quite special in that
> regard.)
btw: afaics these long critical-section reports do not correlate with the
pauses X makes (not sure though). I have another long one. This is from
syslog, as i didn't have tracing enabled at that time:
(IRQ 3/117/CPU#0): new 1385 us maximum-latency critical section.
=> started at timestamp 2293619762: <call_console_drivers+0x76/0x140>
=> ended at timestamp 2293621147: <finish_task_switch+0x43/0xb0>
[<c012f630>] sub_preempt_count+0x60/0x90
[<c012f31e>] check_preempt_timing+0x15e/0x270
[<c0114ae3>] finish_task_switch+0x43/0xb0
[<c012f630>] sub_preempt_count+0x60/0x90
[<c0114ae3>] finish_task_switch+0x43/0xb0
[<c0114ae3>] finish_task_switch+0x43/0xb0
[<c02a7917>] __schedule+0x2d7/0x5d0
[<c0136faa>] do_irqd+0x5a/0x80
[<c0112440>] mcount+0x14/0x18
[<c0136faa>] do_irqd+0x5a/0x80
[<c012d82a>] kthread+0xaa/0xb0
[<c0136f50>] do_irqd+0x0/0x80
[<c012d780>] kthread+0x0/0xb0
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __schedule+0x3b/0x5d0 / (do_irqd+0x5a/0x80)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 2293621634
Here's a similar (to the one from my last mail) shorter one:
preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U6
-------------------------------------------------------
latency: 471 us, entries: 2562 (2562) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: xterm/3155, uid:1000 nice:0 policy:0 rt_prio:0
-----------------
=> started at: cond_resched+0x23/0x80 <c02a82b3>
=> ended at: finish_task_switch+0x43/0xb0 <c0114ae3>
=======>
04000001 0.000ms (+0.000ms): __rwsem_deadlock (down_write)
04000000 0.000ms (+0.000ms): schedule (down_write)
04000000 0.000ms (+0.000ms): __schedule (down_write)
04000001 0.000ms (+0.000ms): sched_clock (__schedule)
04000000 0.000ms (+0.000ms): schedule (down_write)
04000000 0.000ms (+0.000ms): __schedule (down_write)
04000001 0.000ms (+0.000ms): sched_clock (__schedule)
04000000 0.001ms (+0.000ms): schedule (down_write)
04000000 0.001ms (+0.000ms): __schedule (down_write)
04000001 0.001ms (+0.000ms): sched_clock (__schedule)
04000000 0.001ms (+0.000ms): schedule (down_write)
04000000 0.001ms (+0.000ms): __schedule (down_write)
04000001 0.001ms (+0.000ms): sched_clock (__schedule)
04000000 0.002ms (+0.000ms): schedule (down_write)
04000000 0.002ms (+0.000ms): __schedule (down_write)
04000001 0.002ms (+0.000ms): sched_clock (__schedule)
04000000 0.002ms (+0.000ms): schedule (down_write)
04000000 0.002ms (+0.000ms): __schedule (down_write)
04000001 0.002ms (+0.000ms): sched_clock (__schedule)
04000000 0.002ms (+0.000ms): schedule (down_write)
04000000 0.002ms (+0.000ms): __schedule (down_write)
04000001 0.003ms (+0.000ms): sched_clock (__schedule)
04000000 0.003ms (+0.000ms): schedule (down_write)
04000000 0.003ms (+0.000ms): __schedule (down_write)
04000001 0.003ms (+0.000ms): sched_clock (__schedule)
04000000 0.003ms (+0.000ms): schedule (down_write)
[many many more]
04000000 0.445ms (+0.000ms): __schedule (down_write)
04000001 0.445ms (+0.000ms): sched_clock (__schedule)
04000000 0.446ms (+0.000ms): schedule (down_write)
04000000 0.446ms (+0.000ms): __schedule (down_write)
04000001 0.446ms (+0.000ms): sched_clock (__schedule)
04000000 0.446ms (+0.000ms): schedule (down_write)
04000000 0.446ms (+0.000ms): __schedule (down_write)
04000001 0.447ms (+0.000ms): sched_clock (__schedule)
04000000 0.447ms (+0.000ms): schedule (down_write)
04000000 0.447ms (+0.000ms): __schedule (down_write)
04000001 0.447ms (+0.000ms): sched_clock (__schedule)
04000000 0.447ms (+0.000ms): schedule (down_write)
04000000 0.447ms (+0.000ms): __schedule (down_write)
04000001 0.448ms (+0.000ms): sched_clock (__schedule)
04000000 0.448ms (+0.000ms): schedule (down_write)
04000000 0.448ms (+0.000ms): __schedule (down_write)
04000001 0.448ms (+0.000ms): sched_clock (__schedule)
04000000 0.448ms (+0.000ms): schedule (down_write)
04000000 0.449ms (+0.000ms): __schedule (down_write)
04000001 0.449ms (+0.000ms): sched_clock (__schedule)
04000000 0.449ms (+0.000ms): schedule (down_write)
04000000 0.449ms (+0.000ms): __schedule (down_write)
04000001 0.449ms (+0.000ms): sched_clock (__schedule)
04000000 0.449ms (+0.000ms): schedule (down_write)
04000000 0.450ms (+0.000ms): __schedule (down_write)
04000001 0.450ms (+0.000ms): sched_clock (__schedule)
04000000 0.450ms (+0.000ms): schedule (down_write)
04000000 0.450ms (+0.000ms): __schedule (down_write)
04000001 0.450ms (+0.000ms): sched_clock (__schedule)
04000000 0.450ms (+0.000ms): schedule (down_write)
04000000 0.451ms (+0.000ms): __schedule (down_write)
04000001 0.451ms (+0.000ms): sched_clock (__schedule)
04000000 0.451ms (+0.000ms): schedule (down_write)
04000000 0.451ms (+0.000ms): __schedule (down_write)
04000001 0.451ms (+0.000ms): sched_clock (__schedule)
04000000 0.452ms (+0.000ms): schedule (down_write)
04000000 0.452ms (+0.000ms): __schedule (down_write)
04000001 0.452ms (+0.000ms): sched_clock (__schedule)
04000000 0.452ms (+0.000ms): schedule (down_write)
04000000 0.452ms (+0.000ms): __schedule (down_write)
04000001 0.452ms (+0.000ms): sched_clock (__schedule)
04000000 0.453ms (+0.000ms): schedule (down_write)
04000000 0.453ms (+0.000ms): __schedule (down_write)
04000001 0.453ms (+0.000ms): sched_clock (__schedule)
04000000 0.453ms (+0.000ms): schedule (down_write)
04000000 0.453ms (+0.000ms): __schedule (down_write)
04000001 0.453ms (+0.000ms): sched_clock (__schedule)
04000000 0.454ms (+0.000ms): schedule (down_write)
04000000 0.454ms (+0.000ms): __schedule (down_write)
04000001 0.454ms (+0.000ms): sched_clock (__schedule)
04000000 0.454ms (+0.000ms): schedule (down_write)
04000000 0.454ms (+0.001ms): __schedule (down_write)
04010000 0.455ms (+0.000ms): do_IRQ (add_preempt_count)
04010000 0.455ms (+0.000ms): do_IRQ (<00000000>)
00010001 0.456ms (+0.001ms): mask_and_ack_8259A (__do_IRQ)
00010001 0.458ms (+0.000ms): redirect_hardirq (__do_IRQ)
00010000 0.458ms (+0.000ms): handle_IRQ_event (__do_IRQ)
00010000 0.458ms (+0.000ms): timer_interrupt (handle_IRQ_event)
00010001 0.458ms (+0.006ms): mark_offset_tsc (timer_interrupt)
00010001 0.464ms (+0.000ms): do_timer (timer_interrupt)
00010001 0.465ms (+0.000ms): update_wall_time (do_timer)
00010001 0.465ms (+0.000ms): update_wall_time_one_tick (update_wall_time)
00010001 0.465ms (+0.000ms): update_process_times (timer_interrupt)
00010001 0.465ms (+0.000ms): update_one_process (update_process_times)
00010001 0.465ms (+0.000ms): run_local_timers (update_process_times)
00010001 0.466ms (+0.000ms): raise_softirq (update_process_times)
00010001 0.466ms (+0.000ms): scheduler_tick (update_process_times)
00010001 0.466ms (+0.000ms): sched_clock (scheduler_tick)
00010002 0.466ms (+0.000ms): task_timeslice (scheduler_tick)
00010002 0.467ms (+0.000ms): dequeue_task (scheduler_tick)
00010002 0.467ms (+0.000ms): effective_prio (scheduler_tick)
00010002 0.467ms (+0.000ms): enqueue_task (scheduler_tick)
00010001 0.468ms (+0.000ms): note_interrupt (__do_IRQ)
00010001 0.468ms (+0.000ms): end_8259A_irq (__do_IRQ)
00010001 0.468ms (+0.000ms): enable_8259A_irq (__do_IRQ)
04010000 0.469ms (+0.000ms): irq_exit (do_IRQ)
04000001 0.469ms (+0.000ms): sched_clock (__schedule)
00000002 0.470ms (+0.000ms): __switch_to (__schedule)
00000002 0.470ms (+0.000ms): finish_task_switch (__schedule)
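The first column of these traces is the preempt count. A small decoder
makes the values easier to read. The field layout used here (low byte
preempt depth, next byte softirq depth, bit 16 upwards hardirq depth,
0x04000000 as the PREEMPT_ACTIVE flag) is an assumption inferred from
the values in the trace, not taken from the patch, and the TOY_ names
are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed field layout, see the note above. */
#define TOY_PREEMPT_MASK	0x000000ffu
#define TOY_SOFTIRQ_MASK	0x0000ff00u
#define TOY_HARDIRQ_MASK	0x00ff0000u
#define TOY_PREEMPT_ACTIVE	0x04000000u

struct toy_preempt_state {
	unsigned preempt_depth;	/* nested preempt-disable sections */
	unsigned softirq_depth;	/* nested softirq contexts */
	unsigned hardirq_depth;	/* nested hardirq contexts */
	int preempt_active;	/* PREEMPT_ACTIVE flag set? */
};

/* Split a raw preempt count into its fields. */
static struct toy_preempt_state toy_decode(uint32_t count)
{
	struct toy_preempt_state s;

	/* the fields must not overlap for the decoding to make sense */
	assert((TOY_PREEMPT_MASK & TOY_SOFTIRQ_MASK) == 0);
	assert((TOY_SOFTIRQ_MASK & TOY_HARDIRQ_MASK) == 0);

	s.preempt_depth = count & TOY_PREEMPT_MASK;
	s.softirq_depth = (count & TOY_SOFTIRQ_MASK) >> 8;
	s.hardirq_depth = (count & TOY_HARDIRQ_MASK) >> 16;
	s.preempt_active = (count & TOY_PREEMPT_ACTIVE) != 0;
	return s;
}
```

Under these assumptions 04000001 reads as "PREEMPT_ACTIVE set, one
preempt-disable level", and 00010002 as "in a hardirq, two
preempt-disable levels".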
one more:
preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U6
-------------------------------------------------------
latency: 2363 us, entries: 4000 (15429) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: xterm/3155, uid:1000 nice:0 policy:0 rt_prio:0
-----------------
=> started at: cond_resched+0x23/0x80 <c02a82b3>
=> ended at: finish_task_switch+0x43/0xb0 <c0114ae3>
=======>
04000001 0.000ms (+0.000ms): __rwsem_deadlock (down_write)
04000000 0.000ms (+0.000ms): schedule (down_write)
04000000 0.000ms (+0.000ms): __schedule (down_write)
04000001 0.000ms (+0.000ms): sched_clock (__schedule)
04000000 0.000ms (+0.000ms): schedule (down_write)
04000000 0.000ms (+0.000ms): __schedule (down_write)
[more]
04000000 0.347ms (+0.000ms): schedule (down_write)
04000000 0.348ms (+0.000ms): __schedule (down_write)
04000001 0.348ms (+0.000ms): sched_clock (__schedule)
04000000 0.348ms (+0.000ms): schedule (down_write)
04000000 0.348ms (+0.000ms): __schedule (down_write)
04000001 0.348ms (+0.000ms): sched_clock (__schedule)
04000000 0.349ms (+0.000ms): schedule (down_write)
04000000 0.349ms (+0.000ms): __schedule (down_write)
04000001 0.349ms (+0.000ms): sched_clock (__schedule)
04000000 0.349ms (+0.000ms): schedule (down_write)
04000000 0.349ms (+0.000ms): __schedule (down_write)
04000001 0.349ms (+0.000ms): sched_clock (__schedule)
04000000 0.350ms (+0.000ms): schedule (down_write)
04000000 0.350ms (+0.000ms): __schedule (down_write)
04000001 0.350ms (+0.000ms): sched_clock (__schedule)
04000000 0.350ms (+0.000ms): schedule (down_write)
04000000 0.350ms (+0.000ms): __schedule (down_write)
04000001 0.351ms (+0.000ms): sched_clock (__schedule)
04000000 0.351ms (+0.000ms): schedule (down_write)
04000000 0.351ms (+0.000ms): __schedule (down_write)
04000001 0.351ms (+0.000ms): sched_clock (__schedule)
04000000 0.352ms (+0.000ms): schedule (down_write)
04000000 0.352ms (+0.000ms): __schedule (down_write)
04010001 0.352ms (+0.000ms): do_IRQ (add_preempt_count)
04010001 0.353ms (+0.000ms): do_IRQ (<00000000>)
00010001 0.353ms (+0.001ms): mask_and_ack_8259A (__do_IRQ)
00010001 0.355ms (+0.000ms): redirect_hardirq (__do_IRQ)
00010000 0.355ms (+0.000ms): handle_IRQ_event (__do_IRQ)
00010000 0.355ms (+0.000ms): timer_interrupt (handle_IRQ_event)
00010001 0.356ms (+0.006ms): mark_offset_tsc (timer_interrupt)
00010001 0.362ms (+0.000ms): do_timer (timer_interrupt)
00010001 0.363ms (+0.000ms): update_wall_time (do_timer)
00010001 0.363ms (+0.000ms): update_wall_time_one_tick (update_wall_time)
00010001 0.363ms (+0.000ms): update_process_times (timer_interrupt)
00010001 0.363ms (+0.000ms): update_one_process (update_process_times)
00010001 0.363ms (+0.000ms): run_local_timers (update_process_times)
00010001 0.363ms (+0.000ms): raise_softirq (update_process_times)
00010001 0.364ms (+0.000ms): scheduler_tick (update_process_times)
00010001 0.364ms (+0.000ms): sched_clock (scheduler_tick)
00010002 0.364ms (+0.000ms): task_timeslice (scheduler_tick)
00010001 0.365ms (+0.000ms): note_interrupt (__do_IRQ)
00010001 0.365ms (+0.000ms): end_8259A_irq (__do_IRQ)
00010001 0.365ms (+0.000ms): enable_8259A_irq (__do_IRQ)
04010001 0.366ms (+0.000ms): irq_exit (do_IRQ)
04000001 0.367ms (+0.000ms): sched_clock (__schedule)
04000000 0.367ms (+0.000ms): schedule (down_write)
04000000 0.367ms (+0.000ms): __schedule (down_write)
04000001 0.367ms (+0.000ms): sched_clock (__schedule)
04000000 0.368ms (+0.000ms): schedule (down_write)
04000000 0.368ms (+0.000ms): __schedule (down_write)
04000001 0.368ms (+0.000ms): sched_clock (__schedule)
04000000 0.368ms (+0.000ms): schedule (down_write)
04000000 0.368ms (+0.000ms): __schedule (down_write)
04000001 0.369ms (+0.000ms): sched_clock (__schedule)
04000000 0.369ms (+0.000ms): schedule (down_write)
04000000 0.369ms (+0.000ms): __schedule (down_write)
04000001 0.369ms (+0.000ms): sched_clock (__schedule)
04000000 0.370ms (+0.000ms): schedule (down_write)
04000000 0.370ms (+0.000ms): __schedule (down_write)
04000001 0.370ms (+0.000ms): sched_clock (__schedule)
04000000 0.370ms (+0.000ms): schedule (down_write)
[more]
04000000 0.797ms (+0.000ms): schedule (down_write)
04000000 0.797ms (+0.000ms): __schedule (down_write)
04000001 0.797ms (+0.000ms): sched_clock (__schedule)
04000000 0.797ms (+0.000ms): schedule (down_write)
04000000 0.798ms (+0.000ms): __schedule (down_write)
04000001 0.798ms (+0.000ms): sched_clock (__schedule)
04000000 0.798ms (+0.000ms): schedule (down_write)
04000000 0.798ms (+0.000ms): __schedule (down_write)
04000001 0.798ms (+0.000ms): sched_clock (__schedule)
04000000 0.799ms (+0.000ms): schedule (down_write)
04000000 0.799ms (+0.000ms): __schedule (down_write)
04000001 0.799ms (+0.000ms): sched_clock (__schedule)
04000000 0.799ms (+0.000ms): schedule (down_write)
04000000 0.799ms (+0.000ms): __schedule (down_write)
04000001 0.800ms (+0.000ms): sched_clock (__schedule)
04000000 0.800ms (+0.000ms): schedule (down_write)
04000000 0.800ms (+0.000ms): __schedule (down_write)
04000001 0.800ms (+877840.005ms): sched_clock (__schedule)
flo
>i have released the -U6 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
Booted to single user and was able to get some network operations going
with this version (w/ previously mentioned update). However, at the step
where I start CUPS, I got a number of traces on the display referring to
parport_pc related function calls [but I don't use a parallel printer...].
It ended with:
NMI Watchdog detected LOCKUP on CPU1, eip c0139b22, registers:
[not sure you want all the details, I'll put a few key items in
and can try to reproduce on request]
Modules linked in: parport_pc lp parport autofs4 sunrpc 8139too mii dm_mod
uhci_hcd ext3 jbd
CPU: 1
... EIP is at sub_preempt_count+0x82/0xa0
... Process ksoftirqd/1 (pid:5 ...)
Pid: 1825, comm: modprobe
... CPU: 0
... EIP is at flush_tlb_others+0x90/0xf0
(stack trace shows sys_init_module, parport_pc_init, ...
parport_announce_port,
mcount, parport_pc_probe_port, parport_announce_port, __mcount,
parport_daisy_init, ...
parport_wait_event, down_write_interruptible, error_code, show_stack,
etc.)
preempt count: 00010006
...
console shuts up...
Alt-Sysrq keys are recognized (displays command) but don't display any
data.
Will send whatever made it into the system log shortly.
--Mark H Johnson
<mailto:[email protected]>
* Thomas Gleixner <[email protected]> wrote:
> The IPV6 code triggers the irqs_disabled() check in schedule. dmesg
> output attached.
ok, does the patch below, on top of -U7, fix them?
Ingo
--- linux.old/include/net/protocol.h
+++ linux.new/include/net/protocol.h
@@ -83,6 +83,7 @@ extern spinlock_t inet_proto_lock;
#if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
extern struct inet6_protocol *inet6_protos[MAX_INET_PROTOS];
+extern rwlock_t inet6_proto_lock;
#endif
extern int inet_add_protocol(struct net_protocol *prot, unsigned char num);
--- linux.old/net/ipv6/af_inet6.c
+++ linux.new/net/ipv6/af_inet6.c
@@ -94,7 +94,7 @@ atomic_t inet6_sock_nr;
* build a new socket.
*/
static struct list_head inetsw6[SOCK_MAX];
-static spinlock_t inetsw6_lock = SPIN_LOCK_UNLOCKED;
+static rwlock_t inetsw6_lock = RW_LOCK_UNLOCKED;
static void inet6_sock_destruct(struct sock *sk)
{
@@ -127,7 +127,7 @@ static int inet6_create(struct socket *s
/* Look for the requested type/protocol pair. */
answer = NULL;
- rcu_read_lock();
+ rcu_read_lock_read(&inetsw6_lock);
list_for_each_rcu(p, &inetsw6[sock->type]) {
answer = list_entry(p, struct inet_protosw, list);
@@ -162,7 +162,7 @@ static int inet6_create(struct socket *s
answer_prot = answer->prot;
answer_no_check = answer->no_check;
answer_flags = answer->flags;
- rcu_read_unlock();
+ rcu_read_unlock_read(&inetsw6_lock);
BUG_TRAP(answer_prot->slab != NULL);
@@ -242,7 +242,7 @@ static int inet6_create(struct socket *s
out:
return rc;
out_rcu_unlock:
- rcu_read_unlock();
+ rcu_read_unlock_read(&inetsw6_lock);
goto out;
}
@@ -542,7 +542,7 @@ inet6_register_protosw(struct inet_proto
int protocol = p->protocol;
struct list_head *last_perm;
- spin_lock_bh(&inetsw6_lock);
+ write_lock_bh(&inetsw6_lock);
if (p->type >= SOCK_MAX)
goto out_illegal;
@@ -573,7 +573,7 @@ inet6_register_protosw(struct inet_proto
*/
list_add_rcu(&p->list, last_perm);
out:
- spin_unlock_bh(&inetsw6_lock);
+ write_unlock_bh(&inetsw6_lock);
return;
out_permanent:
@@ -596,9 +596,9 @@ inet6_unregister_protosw(struct inet_pro
"Attempt to unregister permanent protocol %d.\n",
p->protocol);
} else {
- spin_lock_bh(&inetsw6_lock);
+ write_lock_bh(&inetsw6_lock);
list_del_rcu(&p->list);
- spin_unlock_bh(&inetsw6_lock);
+ write_unlock_bh(&inetsw6_lock);
synchronize_net();
}
--- linux.old/net/ipv6/icmp.c
+++ linux.new/net/ipv6/icmp.c
@@ -537,11 +537,11 @@ static void icmpv6_notify(struct sk_buff
hash = nexthdr & (MAX_INET_PROTOS - 1);
- rcu_read_lock();
+ rcu_read_lock_read(&inet6_proto_lock);
ipprot = rcu_dereference(inet6_protos[hash]);
if (ipprot && ipprot->err_handler)
ipprot->err_handler(skb, NULL, type, code, inner_offset, info);
- rcu_read_unlock();
+ rcu_read_unlock_read(&inet6_proto_lock);
read_lock(&raw_v6_lock);
if ((sk = sk_head(&raw_v6_htable[hash])) != NULL) {
--- linux.old/net/ipv6/ip6_input.c
+++ linux.new/net/ipv6/ip6_input.c
@@ -156,7 +156,7 @@ static inline int ip6_input_finish(struc
skb->h.raw += (skb->h.raw[1]+1)<<3;
}
- rcu_read_lock();
+ rcu_read_lock_read(&raw_v6_lock);
resubmit:
if (!pskb_pull(skb, skb->h.raw - skb->data))
goto discard;
@@ -205,12 +205,12 @@ resubmit:
kfree_skb(skb);
}
}
- rcu_read_unlock();
+ rcu_read_unlock_read(&raw_v6_lock);
return 0;
discard:
IP6_INC_STATS_BH(IPSTATS_MIB_INDISCARDS);
- rcu_read_unlock();
+ rcu_read_unlock_read(&raw_v6_lock);
kfree_skb(skb);
return 0;
}
--- linux.old/net/ipv6/ndisc.c
+++ linux.new/net/ipv6/ndisc.c
@@ -289,17 +289,17 @@ static int ndisc_constructor(struct neig
struct neigh_parms *parms;
int is_multicast = ipv6_addr_is_multicast(addr);
- rcu_read_lock();
+ rcu_read_lock_read(&addrconf_lock);
in6_dev = in6_dev_get(dev);
if (in6_dev == NULL) {
- rcu_read_unlock();
+ rcu_read_unlock_read(&addrconf_lock);
return -EINVAL;
}
parms = in6_dev->nd_parms;
__neigh_parms_put(neigh->parms);
neigh->parms = neigh_parms_clone(parms);
- rcu_read_unlock();
+ rcu_read_unlock_read(&addrconf_lock);
neigh->type = is_multicast ? RTN_MULTICAST : RTN_UNICAST;
if (dev->hard_header == NULL) {
--- linux.old/net/ipv6/protocol.c
+++ linux.new/net/ipv6/protocol.c
@@ -40,14 +40,14 @@
#include <net/protocol.h>
struct inet6_protocol *inet6_protos[MAX_INET_PROTOS];
-static spinlock_t inet6_proto_lock = SPIN_LOCK_UNLOCKED;
+rwlock_t inet6_proto_lock = RW_LOCK_UNLOCKED;
int inet6_add_protocol(struct inet6_protocol *prot, unsigned char protocol)
{
int ret, hash = protocol & (MAX_INET_PROTOS - 1);
- spin_lock_bh(&inet6_proto_lock);
+ write_lock_bh(&inet6_proto_lock);
if (inet6_protos[hash]) {
ret = -1;
@@ -56,7 +56,7 @@ int inet6_add_protocol(struct inet6_prot
ret = 0;
}
- spin_unlock_bh(&inet6_proto_lock);
+ write_unlock_bh(&inet6_proto_lock);
return ret;
}
@@ -69,7 +69,7 @@ int inet6_del_protocol(struct inet6_prot
{
int ret, hash = protocol & (MAX_INET_PROTOS - 1);
- spin_lock_bh(&inet6_proto_lock);
+ write_lock_bh(&inet6_proto_lock);
if (inet6_protos[hash] != prot) {
ret = -1;
@@ -78,7 +78,7 @@ int inet6_del_protocol(struct inet6_prot
ret = 0;
}
- spin_unlock_bh(&inet6_proto_lock);
+ write_unlock_bh(&inet6_proto_lock);
synchronize_net();
On Tue, 19 Oct 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > I am still having the same bug(repeatable by running liquidwar) as I
> > reported with -U5(see my earlier email).
>
> ok, this seems to be some questionable code in OSS. It really has no
> business up()-ing the inode semaphore - nobody down()-ed it before! This
> could be either a bad workaround for a bug/hang someone saw, or an old
> VFS assumption that doesnt hold anymore. In any case, could you try the
> patch below, does it fix liquidwar?
Yup, the below fixes it. However, this problem *only* started occurring in
-U5. I've been running liquidwar on all versions (it's my current
game-to-play-when-I-feel-stupid program).
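A toy model may make the unbalanced up() concrete. The sketch below is plain
user-space C, not kernel code: toy_sem, toy_up() and toy_down_trylock() are
made-up names that only mimic the counting behaviour of up()/down(). Assuming
the semaphore is used as a mutex (count 1 means free), an up() with no earlier
down() pushes the count to 2, so two callers can get in until the later down()
runs.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy counting semaphore; a count of 1 means "free" when used as a mutex. */
struct toy_sem {
    int count;
};

/* Release: always increments, like up(). */
static void toy_up(struct toy_sem *s)
{
    s->count++;
}

/* Non-blocking acquire: succeeds only while count > 0. */
static bool toy_down_trylock(struct toy_sem *s)
{
    if (s->count <= 0)
        return false;
    s->count--;
    return true;
}

/*
 * Replays the removed sequence on a semaphore nobody held: up() first,
 * then the window in which the write would run. Returns how many
 * callers managed to acquire the semaphore during that window.
 */
static int toy_holders_in_window(void)
{
    struct toy_sem i_sem = { .count = 1 };  /* free, nobody down()-ed it */
    int holders = 0;

    toy_up(&i_sem);                         /* unbalanced up() */
    assert(i_sem.count == 2);

    if (toy_down_trylock(&i_sem))           /* first task gets in */
        holders++;
    if (toy_down_trylock(&i_sem))           /* second task gets in as well */
        holders++;

    return holders;
}
```

With an owner-tracking mutex the same up() is caught immediately, since the
caller is not the holder.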
>
> Ingo
>
> --- linux/sound/core/oss/pcm_oss.c.orig
> +++ linux/sound/core/oss/pcm_oss.c
> @@ -2120,9 +2120,7 @@ static ssize_t snd_pcm_oss_write(struct
> substream = pcm_oss_file->streams[SNDRV_PCM_STREAM_PLAYBACK];
> if (substream == NULL)
> return -ENXIO;
> - up(&file->f_dentry->d_inode->i_sem);
> result = snd_pcm_oss_write1(substream, buf, count);
> - down(&file->f_dentry->d_inode->i_sem);
> #ifdef OSS_DEBUG
> printk("pcm_oss: write %li bytes (wrote %li bytes)\n", (long)count, (long)result);
> #endif
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
On Tue, 19 Oct 2004, Ingo Molnar wrote:
> i have released the -U6 Real-Time Preemption patch:
(xterm/1219/CPU#0): new 39188 us maximum-latency critical section.
=> started at timestamp 71898423: <call_console_drivers+0x76/0x140>
=> ended at timestamp 71937611: <finish_task_switch+0x43/0xb0>
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c01324de>] check_preempt_timing+0x15e/0x270
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c02a5717>] __sched_text_start+0x2d7/0x5d0
[<c02a684f>] down_write+0x12f/0x1e0
[<c0113104>] mcount+0x14/0x18
[<c02a684f>] down_write+0x12f/0x1e0
[<c014dd45>] remove_vm_struct+0x45/0xb0
[<c014fde2>] exit_mmap+0xf2/0x120
[<c0119b96>] mmput+0x46/0xf0
[<c0166d8f>] exec_mmap+0xaf/0x140
[<c0166ffd>] flush_old_exec+0xfd/0x7f0
[<c015b917>] vfs_read+0xe7/0x140
[<c0113104>] mcount+0x14/0x18
[<c0185d7f>] load_elf_binary+0x30f/0xbd0
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c01db739>] up_read+0xf9/0x140
[<c01323d8>] check_preempt_timing+0x58/0x270
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c0131c6d>] __mcount+0x1d/0x20
[<c0185a70>] load_elf_binary+0x0/0xbd0
[<c0167aba>] search_binary_handler+0x19a/0x2e0
[<c0167dac>] do_execve+0x1ac/0x260
[<c0104b07>] sys_execve+0x47/0xd0
[<c0106013>] syscall_call+0x7/0xb
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x3b/0x5d0 / (down_write+0x12f/0x1e0)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 71938197
On Tue, 2004-10-19 at 20:00, Ingo Molnar wrote:
> i have released the -U7 Real-Time Preemption patch:
Another simple fix.
tglx
On Tue, Oct 19, 2004 at 08:00:59PM +0200, Ingo Molnar wrote:
>
> i have released the -U7 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
You should seriously think about using a kind of bitkeeper or CVS style
system so that multiple folks can dump stuff into it rapidly. This
project is large enough that it needs some kind of facility like that.
bill
On Tue, 2004-10-19 at 20:26, Ingo Molnar wrote:
> * Thomas Gleixner <[email protected]> wrote:
>
> > The IPV6 code triggers the irqs_disabled() check in schedule. dmesg
> > output attached.
>
> ok, does the patch below, ontop of -U7, fix them?
Yes, it does. Thanks.
tglx
> --- linux.old/include/net/protocol.h
> +++ linux.new/include/net/protocol.h
> @@ -83,6 +83,7 @@ extern spinlock_t inet_proto_lock;
>
> #if defined(CONFIG_IPV6) || defined (CONFIG_IPV6_MODULE)
> extern struct inet6_protocol *inet6_protos[MAX_INET_PROTOS];
> +extern rwlock_t inet6_proto_lock;
> #endif
>
> extern int inet_add_protocol(struct net_protocol *prot, unsigned char num);
> --- linux.old/net/ipv6/af_inet6.c
> +++ linux.new/net/ipv6/af_inet6.c
> @@ -94,7 +94,7 @@ atomic_t inet6_sock_nr;
> * build a new socket.
> */
> static struct list_head inetsw6[SOCK_MAX];
> -static spinlock_t inetsw6_lock = SPIN_LOCK_UNLOCKED;
> +static rwlock_t inetsw6_lock = RW_LOCK_UNLOCKED;
>
> static void inet6_sock_destruct(struct sock *sk)
> {
> @@ -127,7 +127,7 @@ static int inet6_create(struct socket *s
>
> /* Look for the requested type/protocol pair. */
> answer = NULL;
> - rcu_read_lock();
> + rcu_read_lock_read(&inetsw6_lock);
> list_for_each_rcu(p, &inetsw6[sock->type]) {
> answer = list_entry(p, struct inet_protosw, list);
>
> @@ -162,7 +162,7 @@ static int inet6_create(struct socket *s
> answer_prot = answer->prot;
> answer_no_check = answer->no_check;
> answer_flags = answer->flags;
> - rcu_read_unlock();
> + rcu_read_unlock_read(&inetsw6_lock);
>
> BUG_TRAP(answer_prot->slab != NULL);
>
> @@ -242,7 +242,7 @@ static int inet6_create(struct socket *s
> out:
> return rc;
> out_rcu_unlock:
> - rcu_read_unlock();
> + rcu_read_unlock_read(&inetsw6_lock);
> goto out;
> }
>
> @@ -542,7 +542,7 @@ inet6_register_protosw(struct inet_proto
> int protocol = p->protocol;
> struct list_head *last_perm;
>
> - spin_lock_bh(&inetsw6_lock);
> + write_lock_bh(&inetsw6_lock);
>
> if (p->type >= SOCK_MAX)
> goto out_illegal;
> @@ -573,7 +573,7 @@ inet6_register_protosw(struct inet_proto
> */
> list_add_rcu(&p->list, last_perm);
> out:
> - spin_unlock_bh(&inetsw6_lock);
> + write_unlock_bh(&inetsw6_lock);
> return;
>
> out_permanent:
> @@ -596,9 +596,9 @@ inet6_unregister_protosw(struct inet_pro
> "Attempt to unregister permanent protocol %d.\n",
> p->protocol);
> } else {
> - spin_lock_bh(&inetsw6_lock);
> + write_lock_bh(&inetsw6_lock);
> list_del_rcu(&p->list);
> - spin_unlock_bh(&inetsw6_lock);
> + write_unlock_bh(&inetsw6_lock);
>
> synchronize_net();
> }
> --- linux.old/net/ipv6/icmp.c
> +++ linux.new/net/ipv6/icmp.c
> @@ -537,11 +537,11 @@ static void icmpv6_notify(struct sk_buff
>
> hash = nexthdr & (MAX_INET_PROTOS - 1);
>
> - rcu_read_lock();
> + rcu_read_lock_read(&inet6_proto_lock);
> ipprot = rcu_dereference(inet6_protos[hash]);
> if (ipprot && ipprot->err_handler)
> ipprot->err_handler(skb, NULL, type, code, inner_offset, info);
> - rcu_read_unlock();
> + rcu_read_unlock_read(&inet6_proto_lock);
>
> read_lock(&raw_v6_lock);
> if ((sk = sk_head(&raw_v6_htable[hash])) != NULL) {
> --- linux.old/net/ipv6/ip6_input.c
> +++ linux.new/net/ipv6/ip6_input.c
> @@ -156,7 +156,7 @@ static inline int ip6_input_finish(struc
> skb->h.raw += (skb->h.raw[1]+1)<<3;
> }
>
> - rcu_read_lock();
> + rcu_read_lock_read(&raw_v6_lock);
> resubmit:
> if (!pskb_pull(skb, skb->h.raw - skb->data))
> goto discard;
> @@ -205,12 +205,12 @@ resubmit:
> kfree_skb(skb);
> }
> }
> - rcu_read_unlock();
> + rcu_read_unlock_read(&raw_v6_lock);
> return 0;
>
> discard:
> IP6_INC_STATS_BH(IPSTATS_MIB_INDISCARDS);
> - rcu_read_unlock();
> + rcu_read_unlock_read(&raw_v6_lock);
> kfree_skb(skb);
> return 0;
> }
> --- linux.old/net/ipv6/ndisc.c
> +++ linux.new/net/ipv6/ndisc.c
> @@ -289,17 +289,17 @@ static int ndisc_constructor(struct neig
> struct neigh_parms *parms;
> int is_multicast = ipv6_addr_is_multicast(addr);
>
> - rcu_read_lock();
> + rcu_read_lock_read(&addrconf_lock);
> in6_dev = in6_dev_get(dev);
> if (in6_dev == NULL) {
> - rcu_read_unlock();
> + rcu_read_unlock_read(&addrconf_lock);
> return -EINVAL;
> }
>
> parms = in6_dev->nd_parms;
> __neigh_parms_put(neigh->parms);
> neigh->parms = neigh_parms_clone(parms);
> - rcu_read_unlock();
> + rcu_read_unlock_read(&addrconf_lock);
>
> neigh->type = is_multicast ? RTN_MULTICAST : RTN_UNICAST;
> if (dev->hard_header == NULL) {
> --- linux.old/net/ipv6/protocol.c
> +++ linux.new/net/ipv6/protocol.c
> @@ -40,14 +40,14 @@
> #include <net/protocol.h>
>
> struct inet6_protocol *inet6_protos[MAX_INET_PROTOS];
> -static spinlock_t inet6_proto_lock = SPIN_LOCK_UNLOCKED;
> +rwlock_t inet6_proto_lock = RW_LOCK_UNLOCKED;
>
>
> int inet6_add_protocol(struct inet6_protocol *prot, unsigned char protocol)
> {
> int ret, hash = protocol & (MAX_INET_PROTOS - 1);
>
> - spin_lock_bh(&inet6_proto_lock);
> + write_lock_bh(&inet6_proto_lock);
>
> if (inet6_protos[hash]) {
> ret = -1;
> @@ -56,7 +56,7 @@ int inet6_add_protocol(struct inet6_prot
> ret = 0;
> }
>
> - spin_unlock_bh(&inet6_proto_lock);
> + write_unlock_bh(&inet6_proto_lock);
>
> return ret;
> }
> @@ -69,7 +69,7 @@ int inet6_del_protocol(struct inet6_prot
> {
> int ret, hash = protocol & (MAX_INET_PROTOS - 1);
>
> - spin_lock_bh(&inet6_proto_lock);
> + write_lock_bh(&inet6_proto_lock);
>
> if (inet6_protos[hash] != prot) {
> ret = -1;
> @@ -78,7 +78,7 @@ int inet6_del_protocol(struct inet6_prot
> ret = 0;
> }
>
> - spin_unlock_bh(&inet6_proto_lock);
> + write_unlock_bh(&inet6_proto_lock);
>
> synchronize_net();
>
Ingo Molnar wrote:
>
> i have released the -U7 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
>
With U7 I'm now able to boot and go all the way into X/KDE with a
SMP+PREEMPT_REALTIME configured kernel (config.gz attached) and reboot
into it, again and again. Things are getting better, really, but ...
Overall behavior is still unreliable. Many things stop functioning once
and for good (until reboot). The shutdown sequence is also prone to busy
hanging.
As an aside, my greatest complaint is that jackd -R doesn't work at all:
JACK: unable to mlock() port buffers: Cannot allocate memory
jack_create_thread: error -1 switching current thread to rt for
inheritance: Unknown error 4294967295
cannot start watchdog thread
cannot load driver module alsa
Another example, and one of the very few that were verbose in dmesg: while
accessing the floppy disk (subfs mounted), I got:
kernel BUG at fs/buffer.c:2702!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: nls_iso8859_15 nls_cp860 vfat fat nls_base snd_seq usbhid
ehci_hcd intel_mch_agp agpgart uhci_hcd mga snd_usb_usx2y snd_usb_lib
snd_rawmidi snd_seq_device snd_hwdep snd_intel8x0 snd_ac97_codec snd_pcm
snd_timer snd soundcore snd_page_alloc evdev sk98lin w83781d i2c_sensor
i2c_isa i2c_i801 i2c_core wacom usbcore subfs dm_mod
CPU: 0
EIP: 0060:[<c015a9c2>] Not tainted VLI
EFLAGS: 00010246 (2.6.9-rc4-mm1-RT-U7.0smp)
EIP is at submit_bh+0x119/0x126
eax: 00000004 ebx: f16d683c ecx: 00000000 edx: f12e2000
esi: f16d683c edi: 00000000 ebp: f12e3d90 esp: f12e3d78
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process submountd (pid: 11393, threadinfo=f12e2000 task=f6116950)
Stack: 00000000 00000200 00000000 f16d683c fffffffb f64b0800 f12e3da4 c01587e3
       00000000 f16d683c f4675000 f12e3db8 c0158ab1 f16d683c 00000000 00000200
       f12e3e34 f8c7e85b f673d500 00000000 00000200 f4675148 f12e3de8 c02ca60c
Call Trace:
[<c0105235>] show_stack+0xaf/0xb7 (28)
[<c01053c5>] show_registers+0x168/0x1dd (56)
[<c01055ce>] die+0x107/0x18f (64)
[<c0105aee>] do_invalid_op+0x109/0x10b (188)
[<c0104e7d>] error_code+0x2d/0x38 (92)
[<c01587e3>] __bread_slow+0x60/0xb2 (20)
[<c0158ab1>] __bread+0x33/0x39 (20)
[<f8c7e85b>] fat_fill_super+0xe6/0x76c [fat] (124)
[<f8c43d70>] vfat_fill_super+0x30/0x67 [vfat] (32)
[<c015d220>] get_sb_bdev+0xf1/0x147 (72)
[<f8c43dd5>] vfat_get_sb+0x2e/0x31 [vfat] (28)
[<c015d492>] do_kern_mount+0xa5/0x15a (44)
[<c0172f0d>] do_new_mount+0x71/0xab (52)
[<c01735e3>] do_mount+0x184/0x1ae (116)
[<c0173a21>] sys_mount+0x97/0xd6 (48)
[<c01043b9>] sysenter_past_esp+0x52/0x71 (-8124)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x17/0x70 / (die+0x3f/0x18f)
.. entry 2: print_traces+0x16/0x4c / (show_stack+0xaf/0xb7)
Code: 24 e8 28 08 00 00 8b 45 f0 83 c4 0c 5b 5e 5f 5d c3 0f 0b 8d 0a 86 ea
2d c0 e9 14 ff ff ff 0f 0b 8f 0a 86 ea 2d c0 e9 1c ff ff ff <0f> 0b 8e 0a
86 ea 2d c0 e9 04 ff ff ff 55 89 e5 57 56 31 f6 53
<3>subfs: submountd execution failure. Error 11
Think it's too early to party, eh? :)
--
rncbc aka Rui Nuno Capela
[email protected]
> Ingo Molnar wrote:
>>
>> i have released the -U7 Real-Time Preemption patch:
>>
>> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
>>
>
>
> As an aside, my greatest complaint is that jackd -R doesn't work at all:
>
> JACK: unable to mlock() port buffers: Cannot allocate memory
> jack_create_thread: error -1 switching current thread to rt for
> inheritance: Unknown error 4294967295
> cannot start watchdog thread
> cannot load driver module alsa
>
Forget this. The reason is that realtime-lsm module wasn't being loaded.
Sorry.
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
>
> i have released the -U7 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
>
Not critical, but I'm "consistently" getting stuck while running mkinitrd,
both on SMP (SuSE9.1, reiserfs) and UP (Mdk10.1c, ext3). This was
happening with U6 too.
Anyone care to confirm?
Some more incidents on 2.6.9-rc4-mm1-RT-U7.0smp:
kernel BUG at lib/rwsem-generic.c:130!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: loop nls_iso8859_15 nls_cp860 vfat fat nls_base snd_seq
usbhid ehci_hcd uhci_hcd intel_mch_agp agpgart mga snd_usb_usx2y snd_usb_lib
snd_rawmidi snd_seq_device snd_hwdep snd_intel8x0 snd_ac97_codec snd_pcm
snd_timer snd soundcore snd_page_alloc evdev sk98lin w83781d i2c_sensor
i2c_isa i2c_i801 i2c_core wacom usbcore subfs dm_mod
CPU: 0
EIP: 0060:[<c01e3e5d>] Not tainted VLI
EFLAGS: 00010002 (2.6.9-rc4-mm1-RT-U7.0smp)
EIP is at rwsem_owner_del+0x68/0x8c
eax: f41c0a50 ebx: cb736000 ecx: 00000001 edx: 00000001
esi: 00000001 edi: cdf30218 ebp: cb737fa0 esp: cb737f94
ds: 007b es: 007b ss: 0068 preempt: 00000003
Process loop0 (pid: 23481, threadinfo=cb736000 task=f71ba7d0)
Stack: cdf3021c cdf30000 cdf30218 cb737fc8 c01e42d0 cdf30218 f71ba7d0 c1806cd4
       c1806820 00000282 f8c839aa cdf30000 cdf30448 cb737fec f8c83a0f f71ba7d0
       ffffffec 00000000 cdf30218 f8c839aa 00000000 00000000 00000000 c01025c9
Call Trace:
[<c0105235>] show_stack+0xaf/0xb7 (28)
[<c01053c5>] show_registers+0x168/0x1dd (56)
[<c01055ce>] die+0x107/0x18f (64)
[<c0105aee>] do_invalid_op+0x109/0x10b (188)
[<c0104e7d>] error_code+0x2d/0x38 (80)
[<c01e42d0>] up_write+0x57/0x1a1 (40)
[<f8c83a0f>] loop_thread+0x65/0x11d [loop] (36)
[<c01025c9>] kernel_thread_helper+0x5/0xb (881623060)
preempt count: 00000004
. 4-level deep critical section nesting:
.. entry 1: _spin_lock+0x17/0x6d / (up_write+0x110/0x1a1)
.. entry 2: _spin_lock+0x17/0x6d / (up_write+0x4f/0x1a1)
.. entry 3: _spin_lock_irqsave+0x17/0x70 / (die+0x3f/0x18f)
.. entry 4: print_traces+0x16/0x4c / (show_stack+0xaf/0xb7)
Code: 00 e0 ff ff 21 e3 8b 44 8f 18 3b 03 74 2a 83 c1 01 89 d6 39 d1 7c ef
8b 15
60 fa 31 c0 85 d2 74 12 c7 05 60 fa 31 c0 00 00 00 00 <0f> 0b 82 00 0b 96
2e c0
5b 5e 5f 5d c3 8d 56 ff 39 d1 74 08 8b
<6>note: loop0[23481] exited with preempt_count 2
BUG: sleeping function called from invalid context loop0(23481) at
lib/rwsem-generic.c:398
in_atomic():1 [00000002], irqs_disabled():0
[<c010525b>] dump_stack+0x1e/0x20 (20)
[<c0118408>] __might_sleep+0xb7/0xca (36)
[<c02ca11c>] down_write+0x1f/0x184 (44)
[<c011cfbc>] exit_notify+0x26/0x97f (60)
[<c011db93>] do_exit+0x27e/0x492 (40)
[<c0105656>] do_divide_error+0x0/0x12c (64)
[<c0105aee>] do_invalid_op+0x109/0x10b (188)
[<c0104e7d>] error_code+0x2d/0x38 (80)
[<c01e42d0>] up_write+0x57/0x1a1 (40)
[<f8c83a0f>] loop_thread+0x65/0x11d [loop] (36)
[<c01025c9>] kernel_thread_helper+0x5/0xb (881623060)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: _spin_lock+0x17/0x6d / (up_write+0x110/0x1a1)
.. entry 2: _spin_lock+0x17/0x6d / (up_write+0x4f/0x1a1)
.. entry 3: print_traces+0x16/0x4c / (dump_stack+0x1e/0x20)
BUG: scheduling while atomic: loop0/0x00000002/23481
caller is do_exit+0x287/0x492
[<c010525b>] dump_stack+0x1e/0x20 (20)
[<c02c90d6>] __schedule+0x836/0xc81 (116)
[<c011db9c>] do_exit+0x287/0x492 (40)
[<c0105656>] do_divide_error+0x0/0x12c (64)
[<c0105aee>] do_invalid_op+0x109/0x10b (188)
[<c0104e7d>] error_code+0x2d/0x38 (80)
[<c01e42d0>] up_write+0x57/0x1a1 (40)
[<f8c83a0f>] loop_thread+0x65/0x11d [loop] (36)
[<c01025c9>] kernel_thread_helper+0x5/0xb (881623060)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: _spin_lock+0x17/0x6d / (up_write+0x110/0x1a1)
.. entry 2: _spin_lock+0x17/0x6d / (up_write+0x4f/0x1a1)
.. entry 3: print_traces+0x16/0x4c / (dump_stack+0x1e/0x20)
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
On Tue, 2004-10-19 at 11:00, Ingo Molnar wrote:
> i have released the -U7 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
>
> this too is a fixes-only release.
This is the first Ux I have tried to boot. On install I get this:
WARNING:
/lib/modules/2.6.8-1.520.1rU7.ll.rhfc2.ccrma/kernel/drivers/scsi/aacraid/aacraid.ko needs unknown symbol __you_cannot_kmalloc_that_much
WARNING:
/lib/modules/2.6.8-1.520.1rU7.ll.rhfc2.ccrma/kernel/drivers/message/i2o/i2o_block.ko needs unknown symbol i2o_msg_in_to_virt
WARNING:
/lib/modules/2.6.8-1.520.1rU7.ll.rhfc2.ccrma/kernel/drivers/message/i2o/i2o_core.ko needs unknown symbol i2o_msg_in_to_virt
WARNING:
/lib/modules/2.6.8-1.520.1rU7.ll.rhfc2.ccrma/kernel/drivers/message/i2o/i2o_core.ko needs unknown symbol i2o_msg_out_to_virt
WARNING:
/lib/modules/2.6.8-1.520.1rU7.ll.rhfc2.ccrma/kernel/drivers/message/i2o/i2o_scsi.ko needs unknown symbol i2o_msg_in_to_virt
No luck booting, I get a kernel panic, the last lines printed to the
screen are like this (transcribed by hand, there may be errors):
printk+0x17/0x20 (20)
print_preempt_trace+0x51/0xa0 (12)
print_traces+0x1e/0x40 (24)
show_stack+0x70/0x90 (8)
error_code+0x2d/0x38 (20)
down_write_interruptible+0xba/0x278 (52)
scsi_error_handler+0x98/0x1a0 [scsi_mod] (44)
scsi_error_handler+0x0/0x1a0 [scsi_mod] (284)
kernel_thread_helper+0x5/0x18 (8)
preempt count: 04000008
. 8 level deep critical section nesting:
.. entry 1: down_write_interruptible+0x273/0x278 / (0x0)
.. entry 2: down_write_interruptible+0x5a/0x278 / (0x0)
.. entry 3: __schedule+0x34/0x650 / (0x0)
.. entry 4: __schedule+0xcb/0x650 / (0x0)
.. entry 5: __schedule+0x34/0x650 / (0x0)
.. entry 6: __schedule+0xcb/0x650 / (0x0)
.. entry 7: die+0x36/0x180 / (0x0)
.. entry 8: print_traces+0xd/0x40 / (0x0)
<0> Kernel panic - not syncing: Fatal exception in interrupt
I hope this can be useful...
-- Fernando
On Wed, 2004-10-20 at 01:38, Fernando Pablo Lopez-Lezcano wrote:
> On Tue, 2004-10-19 at 11:00, Ingo Molnar wrote:
> > i have released the -U7 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
> >
> > this too is a fixes-only release.
>
That's in scsi_error_handler() where a mutex is initialized locked and
then acquired again. This triggers the deadlock/correctness check.
tglx
On Tue, 2004-10-19 at 21:04, Thomas Gleixner wrote:
> On Tue, 2004-10-19 at 20:00, Ingo Molnar wrote:
> > i have released the -U7 Real-Time Preemption patch:
>
No-brainer typo removal. I'm feeling stupid.
tglx
On Tue, 2004-10-19 at 21:04, Thomas Gleixner wrote:
> On Tue, 2004-10-19 at 20:00, Ingo Molnar wrote:
> > i have released the -U7 Real-Time Preemption patch:
>
> Another simple fix.
Another one using wait_for_completion_timeout(). No problems so far.
tglx
On Wed, 2004-10-20 at 01:39, Thomas Gleixner wrote:
> That's in scsi_error_handler() where a mutex is initialized locked and
> then acquired again. This triggers the deadlock/correctness check.
Some more fixes
- scsi_error_handler() fix
- device subsystem device_remove locking fix
tglx
On Tue, 2004-10-19 at 14:00, Ingo Molnar wrote:
> i have released the -U7 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
OK, this one boots but Gnome does not start. It hangs at "Session
Manager". The system does not hang, but I never get to my desktop.
Nothing useful in the logs.
While this was going on I switched to a text console and noticed that if
I enabled Caps Lock at just the right moment then _all_ text output
(LOGIN, PASSWORD, etc) would be in caps. Toggling it a few times seemed
to get rid of the problem.
Any particular debug options I should try?
Lee
* Thomas Gleixner <[email protected]> wrote:
> On Wed, 2004-10-20 at 01:39, Thomas Gleixner wrote:
> > That's in scsi_error_handler() where a mutex is initialized locked and
> > then acquired again. This triggers the deadlock/correctness check.
>
> Some more fixes
>
> - scsi_error_handler() fix
>
> - device subsystem device_remove locking fix
thanks, i've applied these. The block/loop.c assert reported by Rui
seems to be a similar problem too.
Ingo
i have released the -U8 Real-Time Preemption patch:
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
this too is a fixes-only release. It includes the many semaphore-abuse
and sleep_on() fixes/improvements from Thomas Gleixner, and it also
includes a couple of semaphore related fixes.
I believe the semaphore fixes should resolve a number of the deadlocks
reported for -U7.
In particular it seems the only sane and reliable way to convert RCU
locking was to allow the following semantics for rwsems: allow reads to
nest, and allow self-read-recursion of a self-write-held semaphore. My
current implementation for this allows semaphore unfairness, but that
can be fixed later on. Most importantly, the RCU to RT-locking
conversions are much more automatic now and map nicely to what the code
is doing upstream. Most of the time they involve a conversion of a
spinlock or semaphore into a rwlock or rwsem. The old code maps to new
code almost automatically, the only manual work needed was to associate
the rcu_read_lock() with the writers-lock that it excludes against,
which is a pretty clear (but not automatic, and hence not automatable)
decision. This way i could convert some more networking code, and
simplify the older changes and hopefully get rid of some deadlocks. The
locking API is still not in its final form, but it's getting closer.
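The rwsem semantics described above (reads nest, and the write holder may
also take the read side) can be illustrated with a small user-space toy. This
is only a sketch and not the code in the patch: toy_rwsem, the trylock-style
helpers and the fixed task-id table are all hypothetical, and refusing a
read-to-write upgrade is an assumption made here, not something stated above.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_OWNER  (-1)
#define TOY_MAX_TASKS 8

/* Toy rwsem state: at most one writer, any number of nested readers. */
struct toy_rwsem {
    int writer;                     /* task id of the write holder, or TOY_NO_OWNER */
    int read_depth[TOY_MAX_TASKS];  /* per-task read nesting depth */
};

static void toy_rwsem_init(struct toy_rwsem *sem)
{
    sem->writer = TOY_NO_OWNER;
    for (int i = 0; i < TOY_MAX_TASKS; i++)
        sem->read_depth[i] = 0;
}

/* Read side: allowed if nobody writes, or if the caller is the writer. */
static bool toy_read_trylock(struct toy_rwsem *sem, int task)
{
    assert(task >= 0 && task < TOY_MAX_TASKS);
    if (sem->writer != TOY_NO_OWNER && sem->writer != task)
        return false;
    sem->read_depth[task]++;        /* reads nest */
    return true;
}

static void toy_read_unlock(struct toy_rwsem *sem, int task)
{
    assert(task >= 0 && task < TOY_MAX_TASKS);
    assert(sem->read_depth[task] > 0);
    sem->read_depth[task]--;
}

/* Write side: needs no writer and no readers at all (no upgrade; assumption). */
static bool toy_write_trylock(struct toy_rwsem *sem, int task)
{
    assert(task >= 0 && task < TOY_MAX_TASKS);
    if (sem->writer != TOY_NO_OWNER)
        return false;
    for (int i = 0; i < TOY_MAX_TASKS; i++)
        if (sem->read_depth[i] > 0)
            return false;
    sem->writer = task;
    return true;
}

static void toy_write_unlock(struct toy_rwsem *sem, int task)
{
    assert(sem->writer == task);
    sem->writer = TOY_NO_OWNER;
}
```

In this model a converted rcu_read_lock() corresponds to toy_read_trylock() on
the semaphore its writers take, which is why nesting and self-recursion have
to be permitted for the mapping to stay mechanical.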
Changes since -U7:
- deadlock fix: sysfs/driver-base semaphore fixes from Thomas Gleixner
- deadlock fix: scsi semaphore fixes from Thomas Gleixner
- NFS sleep_on() fixes from Thomas Gleixner
- rawmidi.c sleep_on() fix from Thomas Gleixner
- [ i've added more wait_for_completion_*() primitives, to ease
conversion of other semaphore-(ab-)using code. ]
- make rwsems self-recursive
- RCU lock conversion: convert rtnl_sem RCU use.
- netfilter deadlock fix - clean up RCU locking.
to create a -U8 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
Ingo
On Wed, 2004-10-20 at 09:40, Ingo Molnar wrote:
> > - scsi_error_handler() fix
> > - device subsystem device_remove locking fix
>
> thanks, i've applied these. The block/loop.c assert reported by Rui
> seems to be a similar problem too.
Yep, it's all the same scheme. Most of the offending code uses
MUTEX_LOCKED in an init function and then plays the down()-here,
up()-from-a-different-context game, which triggers the deadlock/owner
verification. Not hard to fix, but in some places it takes a while until
you see the intention of the driver hacker.
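A rough user-space illustration of why this pattern trips the check, and why a
completion does not. Everything here is a toy with made-up names (toy_mutex,
toy_completion and their helpers); it is not the kernel API, and the context
ids are just integers chosen for the example.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_OWNER (-1)

/* Toy mutex that remembers which context locked it. */
struct toy_mutex {
    int owner;                  /* context id of the holder, or TOY_NO_OWNER */
};

/* Fails if already held; a second down() by the holder ends up here too. */
static bool toy_mutex_trylock(struct toy_mutex *m, int ctx)
{
    assert(ctx != TOY_NO_OWNER);
    if (m->owner != TOY_NO_OWNER)
        return false;
    m->owner = ctx;
    return true;
}

/* Owner verification: only the locking context may unlock. */
static bool toy_mutex_unlock(struct toy_mutex *m, int ctx)
{
    if (m->owner != ctx)
        return false;           /* up() from a different context */
    m->owner = TOY_NO_OWNER;
    return true;
}

/* Toy completion: a plain event counter with no notion of ownership. */
struct toy_completion {
    unsigned int done;
};

/* Signal one event; any context may do this. */
static void toy_complete(struct toy_completion *c)
{
    c->done++;
}

/* Non-blocking wait: consumes one event if one is pending. */
static bool toy_try_wait(struct toy_completion *c)
{
    if (c->done == 0)
        return false;
    c->done--;
    return true;
}
```

The semaphore in those drivers is really being used as an event, so the
counter-only model fits it, while the owner-tracking model rejects both the
second acquire and the foreign release.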
The most surprising one was in drivers/base. I did not expect that new
2.5/6 code uses those tricks too.
Fixes for aic7xxx and sym53c8xx_2 attached.
tglx
> Changes since -U7:
>
- fix block-loopback assert reported by Mark H Johnson, Matthew L
Foster and Rui Nuno Capela. (usually triggers during 'make install'
of a kernel compile.)
Ingo
Ingo Molnar wrote:
>
>> Changes since -U7:
>>
>
> - fix block-loopback assert reported by Mark H Johnson, Matthew L
> Foster and Rui Nuno Capela. (usually triggers during 'make install'
> of a kernel compile.)
>
Is this fix already on U8 ? I don't seem to get out of mkinitrd (which is
triggered by kernel make install).
OTOH, still on my laptop (P4/UP) I'm getting this very often:
RTNL: assertion failed at net/ipv4/devinet.c (1049)
[<c0104ee4>] dump_stack+0x1e/0x20 (20)
[<c02afa2b>] inet_dump_ifaddr+0x135/0x13a (52)
[<c027a533>] rtnetlink_dump_all+0x92/0xaa (40)
[<c028117f>] netlink_dump+0x6c/0x211 (56)
[<c0280f97>] netlink_recvmsg+0x209/0x21b (92)
[<c0268a40>] sock_recvmsg+0xcc/0xf0 (248)
[<c026a4cc>] sys_recvmsg+0x110/0x1fb (284)
[<c026a628>] sys_socketcall+0x71/0x234 (68)
[<c01040a9>] sysenter_past_esp+0x52/0x71 (-8124)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x16/0x4a / (dump_stack+0x1e/0x20)
And I also found this once:
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:598!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: realtime commoncap snd_seq_oss snd_seq_midi_event
snd_seq snd_pcm_oss snd_mixer_oss snd_usb_usx2y snd_usb_lib snd_rawmidi
snd_seq_device snd_hwdep snd_ali5451 snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd soundcore prism2_cs p80211 ds yenta_socket pcmcia_core
natsemi crc32 loop subfs evdev ohci_hcd usbcore thermal processor fan
button battery ac
CPU: 0
EIP: 0060:[<c01b7e30>] Not tainted VLI
EFLAGS: 00010202 (2.6.9-rc4-mm1-RT-U8.0)
EIP is at up_write+0x1d4/0x202
eax: d4edc000 ebx: e003f967 ecx: d4eb8d40 edx: dee04020
esi: de9b4214 edi: de9b443c ebp: d4eddfcc esp: d4eddfac
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process loop0 (pid: 6672, threadinfo=d4edc000 task=d4eb8d40)
Stack: c0113b01 00000001 c0384d90 c0384d60 00000282 e003f967 de9b4000
de9b443c
d4eddfec e003f9c8 d4eb8d40 ffffffec 00000000 e003f967 00000000
00000000
00000000 c0102305 de9b4000 00000000 00000000
Call Trace:
[<c0104eb0>] show_stack+0x80/0x96 (28)
[<c010504b>] show_registers+0x165/0x1de (56)
[<c010525d>] die+0xf6/0x191 (64)
[<c0105797>] do_invalid_op+0x10b/0x10d (188)
[<c0104b0d>] error_code+0x2d/0x38 (100)
[<e003f9c8>] loop_thread+0x61/0x11b [loop] (32)
[<c0102305>] kernel_thread_helper+0x5/0xb (722608148)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: die+0x3a/0x191 / (do_invalid_op+0x10b/0x10d)
.. entry 2: print_traces+0x16/0x4a / (show_stack+0x80/0x96)
Code: e8 af f9 ff ff 89 f8 e8 f1 af f5 ff e9 35 ff ff ff 0f 0b a5 00 43 e4
2c c0 e9 da fe ff ff 0f 0b a4 00 43 e4 2c c0 e9 c4 fe ff ff <0f> 0b 56 02
cf 70 2d c0 e9 3c fe ff ff e8 d7 56 10 00 e9 22 ff
(config.gz is attached)
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
> i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>
I'm getting these BUGs when I use netconsole with Real-Time Preemption
(but netconsole works):
kjournald starting. Commit interval 5 seconds
BUG: sleeping function called from invalid context kjournald(775) at
kernel/mutex.c:25
in_atomic():0 [00000000], irqs_disabled():1
[<c0105fbe>] dump_stack+0x1e/0x20 (20)
[<c01194d8>] __might_sleep+0xb8/0xd0 (36)
[<c0130bc0>] _mutex_lock+0x20/0x40 (20)
[<c02675f7>] netpoll_send_skb+0x37/0xc0 (28)
[<c0231081>] write_msg+0x41/0x60 (36)
[<c011c208>] __call_console_drivers+0x58/0x60 (32)
[<c011c326>] call_console_drivers+0x96/0x140 (40)
[<c011c6e1>] release_console_sem+0x71/0x100 (36)
[<c011c5b6>] vprintk+0x116/0x180 (36)
[<c011c498>] printk+0x18/0x20 (16)
[<c01a31af>] kjournald+0x8f/0x250 (140)
[<c01032d1>] kernel_thread_helper+0x5/0x14 (141484052)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x18/0x50 / (dump_stack+0x1e/0x20)
EXT3 FS on hda8, <3>BUG: sleeping function called from invalid context
mount(786) at kernel/mutex.c:25
in_atomic():0 [00000000], irqs_disabled():1
[<c0105fbe>] dump_stack+0x1e/0x20 (20)
[<c01194d8>] __might_sleep+0xb8/0xd0 (36)
[<c0130bc0>] _mutex_lock+0x20/0x40 (20)
[<c02675f7>] netpoll_send_skb+0x37/0xc0 (28)
[<c0231081>] write_msg+0x41/0x60 (36)
[<c011c208>] __call_console_drivers+0x58/0x60 (32)
[<c011c302>] call_console_drivers+0x72/0x140 (40)
[<c011c6e1>] release_console_sem+0x71/0x100 (36)
[<c011c5b6>] vprintk+0x116/0x180 (36)
[<c011c498>] printk+0x18/0x20 (16)
[<c01989f2>] ext3_setup_super+0xd2/0x1c0 (80)
[<c019a5ed>] ext3_remount+0x12d/0x190 (48)
[<c015d9b0>] do_remount_sb+0xa0/0xf0 (32)
[<c0173c6d>] do_remount+0x6d/0xc0 (36)
[<c01745fb>] do_mount+0x19b/0x1b0 (116)
[<c01749e7>] sys_mount+0x97/0xe0 (48)
[<c010518f>] syscall_call+0x7/0xb (-8124)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x18/0x50 / (dump_stack+0x1e/0x20)
internal journal
I have hacked the sk98lin driver to support netpoll (I sent the patch to
netdev), so maybe I did something wrong and these BUGs are my own fault.
Does anybody else use netconsole with Real-Time Preemption?
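For reading the two reports above: the "sleeping function called from invalid
context" message fires when a function that may sleep is entered while atomic
or with interrupts off. A minimal user-space model of that decision, with
hypothetical names (toy_ctx, toy_sleep_is_invalid) and with "preempt count is
non-zero" standing in for in_atomic() as a simplifying assumption:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy snapshot of the two values printed in the reports. */
struct toy_ctx {
    unsigned int preempt_count; /* non-zero stands in for in_atomic() */
    bool irqs_disabled;
};

/* True if sleeping in this context would be flagged as a bug. */
static bool toy_sleep_is_invalid(const struct toy_ctx *ctx)
{
    assert(ctx != NULL);
    return ctx->preempt_count != 0 || ctx->irqs_disabled;
}
```

Both netconsole traces show in_atomic():0 with irqs_disabled():1, so in this
model it is the disabled interrupts alone that make the _mutex_lock() in
netpoll_send_skb() invalid.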
Michal
diff -Nurp linux-2.6.9/drivers/net/sk98lin/skge.c linux-2.6.9-mich/drivers/net/sk98lin/skge.c
--- linux-2.6.9/drivers/net/sk98lin/skge.c 2004-10-18 23:53:22.000000000 +0200
+++ linux-2.6.9-mich/drivers/net/sk98lin/skge.c 2004-10-20 01:09:07.566181320 +0200
@@ -1126,6 +1126,21 @@ SK_U32 IntSrc; /* interrupts source re
return SkIsrRetHandled;
} /* SkGeIsrOnePort */
+#ifdef CONFIG_NET_POLL_CONTROLLER
+/**
+ * SkGePollController - polling receive, for netconsole
+ * @dev: network device
+ *
+ * Polling receive - used by netconsole and other diagnostic tools
+ * to allow network i/o with interrupts disabled.
+ */
+static void SkGePollController(struct net_device *dev)
+{
+ disable_irq(dev->irq);
+ SkGeIsr(dev->irq, dev, NULL);
+ enable_irq(dev->irq);
+}
+#endif
/****************************************************************************
*
@@ -4960,6 +4975,9 @@ static int __devinit skge_probe_one(stru
dev->set_mac_address = &SkGeSetMacAddr;
dev->do_ioctl = &SkGeIoctl;
dev->change_mtu = &SkGeChangeMtu;
+#ifdef CONFIG_NET_POLL_CONTROLLER
+ dev->poll_controller = &SkGePollController;
+#endif
dev->flags &= ~IFF_RUNNING;
SET_NETDEV_DEV(dev, &pdev->dev);
Ingo Molnar wrote:
>
>Rui Nuno Capela wrote:
>> >
>> > - fix block-loopback assert reported by Mark H Johnson, Matthew L
>> > Foster and Rui Nuno Capela. (usually triggers during 'make install'
>> > of a kernel compile.)
>> >
>>
>> Is this fix already on U8 ? I don't seem to get out of mkinitrd (which
>> is triggered by kernel make install).
>
> please re-download -U8, i've updated it a couple of minutes after
> uploading it, but apparently not fast enough :-| Sorry!
>
OK. No problem.... and yes, mkinitrd (make install) works again.
>> OTOH, still on my laptop (P4/UP) I'm getting this very often:
>>
>> RTNL: assertion failed at net/ipv4/devinet.c (1049)
>
> yeah - this too was an oversight i fixed in the latest upload.
I don't think so. I still see plenty of those here.
Is there an even more recent U8? I think you should consider adding some dot
numbering to each of the uploads... ;)
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Ingo Molnar wrote:
> >
> >> Changes since -U7:
> >>
> >
> > - fix block-loopback assert reported by Mark H Johnson, Matthew L
> > Foster and Rui Nuno Capela. (usually triggers during 'make install'
> > of a kernel compile.)
> >
>
> Is this fix already on U8 ? I don't seem to get out of mkinitrd (which
> is triggered by kernel make install).
please re-download -U8, i've updated it a couple of minutes after
uploading it, but apparently not fast enough :-| Sorry!
> OTOH, still on my laptop (P4/UP) I'm getting this very often:
>
> RTNL: assertion failed at net/ipv4/devinet.c (1049)
yeah - this too was an oversight i fixed in the latest upload.
> ------------[ cut here ]------------
> kernel BUG at lib/rwsem-generic.c:598!
> [<c0104b0d>] error_code+0x2d/0x38 (100)
> [<e003f9c8>] loop_thread+0x61/0x11b [loop] (32)
> [<c0102305>] kernel_thread_helper+0x5/0xb (722608148)
yes, this is the loopback fix. Please retry with the latest patch.
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> > please re-download -U8, i've updated it a couple of minutes after
> > uploading it, but apparently not fast enough :-| Sorry!
> >
>
> OK. No problem.... and yes, mkinitrd (make install) works again.
good.
> >> RTNL: assertion failed at net/ipv4/devinet.c (1049)
> >
> > yeah - this too was an oversight i fixed in the latest upload.
>
> I don't think so. I still see plenty of those here.
>
> Is there an even more recent U8? I think you should consider adding some
> dot numbering to each of the uploads... ;)
indeed this most likely means there's a newer update :-| Please
double-check that the one you have is:
$ md5sum realtime-preempt-2.6.9-rc4-mm1-U8
b59ae00ca0f45f545519348113af5c4f realtime-preempt-2.6.9-rc4-mm1-U8
Ingo
* Michal Schmidt <[email protected]> wrote:
> Ingo Molnar wrote:
> >* Michal Schmidt <[email protected]> wrote:
> >>I'm getting these BUGs when I use netconsole with Real-Time Preemption
> >>(but netconsole works):
> >
> >
> >you are getting them because interrupts get disabled somewhere in the
> >path. Do your changes perhaps introduce a local_irq_save() or
> >local_irq_disable()?
> >
>
> I'm attaching my sk98lin patch. It uses disable_irq(). It's inspired
> by 8139too.
disable_irq() should work fine though. (it doesn't disable local
interrupts, it only disables that particular irq line.) So something
else disabled interrupts - ah, netconsole.c itself. Does the patch below
fix things up for you?
Ingo
--- linux/drivers/net/netconsole.c.orig
+++ linux/drivers/net/netconsole.c
@@ -73,7 +73,9 @@ static void write_msg(struct console *co
if (!np.dev)
return;
+#ifndef CONFIG_PREEMPT_REALTIME
local_irq_save(flags);
+#endif
for(left = len; left; ) {
frag = min(left, MAX_PRINT_CHUNK);
@@ -82,7 +84,9 @@ static void write_msg(struct console *co
left -= frag;
}
+#ifndef CONFIG_PREEMPT_REALTIME
local_irq_restore(flags);
+#endif
}
static struct console netconsole = {
* Michal Schmidt <[email protected]> wrote:
> I'm getting these BUGs when I use netconsole with Real-Time Preemption
> (but netconsole works):
you are getting them because interrupts get disabled somewhere in the
path. Do your changes perhaps introduce a local_irq_save() or
local_irq_disable()?
(in PREEMPT_REALTIME spin_lock_irq*() does not disable interrupts for
mutex-based spinlocks, so the only way to get irqs disabled is
explicitly.)
Ingo
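The difference between masking one irq line and disabling local interrupts, as described above, can be sketched with a small user-space model. This is only an illustration under stated assumptions: struct cpu_model and the model_* functions are hypothetical names, not kernel API, and the nesting that the real disable_irq()/enable_irq() pair supports is left out.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one CPU's interrupt state (not kernel code). */
struct cpu_model {
    bool     local_irqs_enabled; /* cleared by local_irq_save()/local_irq_disable() */
    uint32_t masked_lines;       /* bit n set: irq line n is masked, as after disable_irq(n) */
};

/* Model of disable_irq(): masks a single line, leaves local irqs alone. */
void model_disable_irq(struct cpu_model *cpu, unsigned int line)
{
    cpu->masked_lines |= UINT32_C(1) << line;
}

/* Model of enable_irq(): unmasks that single line again. */
void model_enable_irq(struct cpu_model *cpu, unsigned int line)
{
    cpu->masked_lines &= ~(UINT32_C(1) << line);
}

/* Model of local_irq_save(): returns the old state, then disables local irqs. */
bool model_local_irq_save(struct cpu_model *cpu)
{
    bool flags = cpu->local_irqs_enabled;

    cpu->local_irqs_enabled = false;
    return flags;
}

/* Model of local_irq_restore(): puts back the state saved earlier. */
void model_local_irq_restore(struct cpu_model *cpu, bool flags)
{
    cpu->local_irqs_enabled = flags;
}

/* What an irqs_disabled()-style check looks at: only the local flag. */
bool model_irqs_disabled(const struct cpu_model *cpu)
{
    return !cpu->local_irqs_enabled;
}

/* A line is delivered only if local irqs are on and the line is not masked. */
bool model_irq_deliverable(const struct cpu_model *cpu, unsigned int line)
{
    return cpu->local_irqs_enabled &&
           !(cpu->masked_lines & (UINT32_C(1) << line));
}
```

In this model, masking line 5 leaves model_irqs_disabled() false and only stops that one line, while model_local_irq_save() makes it true for every line until the saved state is restored.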
On Wed, 20 Oct 2004 11:45:08 +0200
Ingo Molnar <[email protected]> wrote:
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
Hi,
i just wanted to let you know that with U8 i still experience the "pauses" i
reported on U6, too. I would guess that it's some scheduler thing as jackd
running SCHED_FIFO and all its clients (at least the audio threads running
SCHED_FIFO) are not affected by the pauses (i don't see any xruns from jackd
and audio processing happily goes along without audible dropouts).
Also it seems that /proc/sys/kernel/trace_enabled == 1 is not the only thing
being able to trigger the pauses. With U6 i also experienced them with
trace_enabled == 0. I have to add though that it took quite a while for them
to kick in (hours) after setting trace_enabled to 0. So my conclusion is
that trace_enabled == 1 just increases the probability of such pauses by
several orders of magnitude (with 1 i get about one of these pauses every 2-10 minutes,
with 0 it took several hours for the first pause to occur and then they
stayed less frequent than with 1).
Ah and i forgot: dmesg -n 1 does not help..
flo
Ingo Molnar wrote:
>
>> >> RTNL: assertion failed at net/ipv4/devinet.c (1049)
>> >
>> > yeah - this too was an oversight i fixed in the latest upload.
>>
>> I don't think so. I still see plenty of those here.
>>
>> Is there an even more recent U8? I think you should consider adding some
>> dot numbering to each of the uploads... ;)
>
> indeed this most likely means there's a newer update :-| Please
> double-check that the one you have is:
>
> $ md5sum realtime-preempt-2.6.9-rc4-mm1-U8
> b59ae00ca0f45f545519348113af5c4f realtime-preempt-2.6.9-rc4-mm1-U8
>
That was it. Thanks.
Now some bad news:
I'm getting the dump below, this time while plugging in a flash memory stick,
but right after that the system starts to behave pretty badly and becomes
increasingly unresponsive. A hard reboot is almost the end of the (short)
story :(
(e.g. running jackd also hoses the complete system in no reproducible
amount of time--sometimes short, other times long, like a random
time-bomb).
ohci_hcd 0000:00:0f.0: wakeup
usb 2-1: new full speed USB device using address 2
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:598!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: usb_storage vfat fat udf isofs nls_base realtime
commoncap snd_seq_oss snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss
snd_usb_usx2y snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd_ali5451
snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore prism2_cs
p80211 ds yenta_socket pcmcia_core natsemi crc32 loop subfs evdev ohci_hcd
usbcore thermal processor fan button battery ac
CPU: 0
EIP: 0060:[<c01b7e30>] Not tainted VLI
EFLAGS: 00010206 (2.6.9-rc4-mm1-RT-U8.3)
EIP is at up_write+0x1d4/0x202
eax: d2b2a000 ebx: 00000292 ecx: d2afe980 edx: d2ad4f40
esi: d7b83b24 edi: dcb21000 ebp: d2b2bd6c esp: d2b2bd4c
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process usb-stor (pid: 6699, threadinfo=d2b2a000 task=d2ad48b0)
Stack: d2ad48b0 d2b2bd78 c02bea7f 00000001 d2ad48b0 00000292 d2afe980
dcb21000
d2b2bd84 e01ca139 d2afe980 d2b2bd84 00000292 dcb21138 d2b2bdac
c022ed18
d2afe980 c022ef1c c0231679 00000000 d2afe9d4 d2afe980 d2aa3800
dcb21000
Call Trace:
[<c0104eb0>] show_stack+0x80/0x96 (28)
[<c010504b>] show_registers+0x165/0x1de (56)
[<c010525d>] die+0xf6/0x191 (64)
[<c0105797>] do_invalid_op+0x10b/0x10d (188)
[<c0104b0d>] error_code+0x2d/0x38 (100)
[<e01ca139>] queuecommand+0x70/0x7c [usb_storage] (24)
[<c022ed18>] scsi_dispatch_cmd+0x168/0x218 (40)
[<c02342ed>] scsi_request_fn+0x1ee/0x42b (52)
[<c0205612>] blk_insert_request+0xcd/0xfb (44)
[<c0232f4f>] scsi_insert_special_req+0x3b/0x3f (28)
[<c0233181>] scsi_wait_req+0x61/0x94 (60)
[<c023529c>] scsi_probe_lun+0x8e/0x240 (68)
[<c023588f>] scsi_probe_and_add_lun+0xb0/0x1be (48)
[<c0236015>] scsi_scan_target+0xa4/0x123 (60)
[<c0236121>] scsi_scan_channel+0x8d/0xa4 (48)
[<c02361b1>] scsi_scan_host_selected+0x79/0xd4 (44)
[<c023623d>] scsi_scan_host+0x31/0x33 (28)
[<e01cccbd>] usb_stor_scan_thread+0x144/0x155 [usb_storage] (96)
[<c0102305>] kernel_thread_helper+0x5/0xb (760037396)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: die+0x3a/0x191 / (do_invalid_op+0x10b/0x10d)
.. entry 2: print_traces+0x16/0x4a / (show_stack+0x80/0x96)
Code: e8 af f9 ff ff 89 f8 e8 f1 af f5 ff e9 35 ff ff ff 0f 0b a5 00 e3 e8
2c c0 e9 da fe ff ff 0f 0b a4 00 e3 e8 2c c0 e9 c4 fe ff ff <0f> 0b 56 02
6f 75 2d c0 e9 3c fe ff ff e8 7f 5b 10 00 e9 22 ff
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
* Thomas Gleixner <[email protected]> wrote:
> Yep, it's all the same scheme. Most of the offending code uses
> MUTEX_LOCKED in an init function and plays the down, and up from a
> different context game, which triggers the deadlock/owner verify. Not
> hard to fix, but at some places it takes a bit, until you see the
> intention of the driver hacker.
the NFS ones seemed to be the least clear ones. I'm glad you converted
those already :-)
> The most surprising one was in driver/base. I did not expect that new
> 2.5/6 code uses those tricks too.
it is not strictly a bug, but that technique was discouraged for years -
completions are cleaner and faster for that purpose anyway. (they were
designed for what in the semaphore case is the slowpath.)
> Fixes for aic7xxx and sym53c8xx_2 attached.
Applied. The sym53c8xx_2 looks good. aic7xxx is good too except for a
minor cleanup issue: i've changed all _sem symbols to be _done symbols.
It's not a semaphore anymore, let's avoid the namespace-rotting effect.
I've put these into -U8 so anyone hitting aic7xxx or sym53c8xx_2 should
re-download the -U8 patch. (others who have already downloaded it should
not bother.)
Ingo
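As an illustration of why a completion fits the "wait in one context, signal from another" pattern better than a semaphore that starts out locked, here is a minimal user-space sketch built on pthreads. It is not the kernel implementation: struct completion_model, struct probe_job and all function names are hypothetical, and this version wakes every waiter, which is closer in spirit to complete_all() than to complete().

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical user-space stand-in for a kernel completion. */
struct completion_model {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

void completion_model_init(struct completion_model *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Signalling side: may run in a different context than the waiter. */
void completion_model_complete(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_broadcast(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Waiting side: blocks until some other context has signalled. */
void completion_model_wait(struct completion_model *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

/* Example job: a worker fills in a result and then signals completion. */
struct probe_job {
    struct completion_model done;
    int                     result;
};

/* Has the signature pthread_create() expects for a thread function. */
void *probe_worker(void *arg)
{
    struct probe_job *job = arg;

    job->result = 1;                       /* the "work" */
    completion_model_complete(&job->done); /* tell the waiter it is finished */
    return NULL;
}
```

Nothing is "owned" by the waiter here, so there is no owner for a debugging check to verify. In a threaded program probe_worker would be started with pthread_create() and the starter would then call completion_model_wait().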
On Wednesday 20 October 2004 11:45, Ingo Molnar wrote:
>
> i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:598!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: irtty_sir sir_dev irda crc_ccitt usbcore lp ipv6 dm_mod it87 i2c_isa i2c
CPU: 0
EIP: 0060:[<c01dfa5b>] Not tainted VLI
EFLAGS: 00010287 (2.6.9-rc4-mm1-RT-U8)
EIP is at up_write+0x1eb/0x200
eax: de848000 ebx: df3a255c ecx: 0000001f edx: c14e3260
esi: df3a24c8 edi: de848000 ebp: de849f7c esp: de849f5c
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process kIrDAd (pid: 1295, threadinfo=de848000 task=df3d6110)
Stack: e09119bb de849f74 c0111590 df3a2400 0000001f df3a255c e0915000 de848000
de849f94 e09119bb df3a2400 00000282 de849fc4 00000000 de849fec e0911ae2
e09129c2 de849fb8 00000000 df3d6110 c01148d0 00000000 00000000 de84ff08
Call Trace:
[<e09119bb>] run_irda_queue+0x5b/0xd0 [sir_dev] (4)
[<c0111590>] mcount+0x14/0x18 (8)
[<e09119bb>] run_irda_queue+0x5b/0xd0 [sir_dev] (28)
[<e0911ae2>] irda_thread+0xb2/0xf0 [sir_dev] (24)
[<c01148d0>] default_wake_function+0x0/0x20 (20)
[<c010612a>] ret_from_fork+0x6/0x14 (16)
[<c01148d0>] default_wake_function+0x0/0x20 (16)
[<e0911a30>] irda_thread+0x0/0xf0 [sir_dev] (24)
[<c0104319>] kernel_thread_helper+0x5/0xc (12)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: die+0x3f/0x1a0 / (do_invalid_op+0x106/0x110)
.. entry 2: print_traces+0x1d/0x80 / (show_stack+0x83/0xa0)
Code: ff e9 44 ff ff ff 0f 0b a5 00 13 ce 2e c0 eb b6 0f 0b a4 00 13 ce 2e c0 eb a5 c7 04 2
(events/0/3/CPU#0): new 785 us maximum-latency critical section.
=> started at timestamp 193385865: <kernel_fpu_begin+0x21/0x60>
=> ended at timestamp 193386650: <_mmx_memcpy+0x131/0x180>
[<c012f070>] sub_preempt_count+0x60/0x90 (4)
[<c012ed5e>] check_preempt_timing+0x15e/0x270 (8)
[<c01e1a11>] _mmx_memcpy+0x131/0x180 (8)
[<c012f070>] sub_preempt_count+0x60/0x90 (64)
[<c01e1a11>] _mmx_memcpy+0x131/0x180 (8)
[<c01e1a11>] _mmx_memcpy+0x131/0x180 (16)
[<c01ee1d0>] vgacon_save_screen+0x80/0x90 (28)
[<c021ba89>] redraw_screen+0x199/0x270 (28)
[<c012e4ed>] __mcount+0x1d/0x20 (12)
[<c0216d61>] complete_change_console+0x11/0x100 (4)
[<c021ec86>] console_callback+0xe6/0xf0 (4)
[<c0216d89>] complete_change_console+0x39/0x100 (28)
[<c021ec86>] console_callback+0xe6/0xf0 (28)
[<c0128e63>] worker_thread+0x1d3/0x2a0 (20)
[<c021eba0>] console_callback+0x0/0xf0 (28)
[<c01148d0>] default_wake_function+0x0/0x20 (28)
[<c01148d0>] default_wake_function+0x0/0x20 (32)
[<c0111590>] mcount+0x14/0x18 (12)
[<c012d26a>] kthread+0xaa/0xb0 (28)
[<c0128c90>] worker_thread+0x0/0x2a0 (20)
[<c012d1c0>] kthread+0x0/0xb0 (12)
[<c0104319>] kernel_thread_helper+0x5/0xc (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: kernel_fpu_begin+0x21/0x60 / (_mmx_memcpy+0x36/0x180)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 193387119
* Lorenzo Allegrucci <[email protected]> wrote:
> Process kIrDAd (pid: 1295, threadinfo=de848000 task=df3d6110)
> Call Trace:
> [<e09119bb>] run_irda_queue+0x5b/0xd0 [sir_dev] (4)
> [<c0111590>] mcount+0x14/0x18 (8)
> [<e09119bb>] run_irda_queue+0x5b/0xd0 [sir_dev] (28)
> [<e0911ae2>] irda_thread+0xb2/0xf0 [sir_dev] (24)
ok - IRDA too needs fixing. Disable CONFIG_IRDA for the time being.
Ingo
On Wed, 20 Oct 2004 14:55:00 +0200
Ingo Molnar <[email protected]> wrote:
> i don't think it's caused by trace_enabled - the trace you sent last time
> clearly showed erratic behavior. There's one piece of code i suspect in
> particular - could you try the patch below on top of -U8? (i have
> compile- and boot- tested it)
mango:/usr/src/linux-2.6.9-rc4-mm1-U8# patch -p1 </home/tapas/foo.patch
patching file kernel/sched.c
Hunk #5 succeeded at 3843 with fuzz 1.
building anyways, reporting later..
flo
* Florian Schmidt <[email protected]> wrote:
> > i don't think it's caused by trace_enabled - the trace you sent last time
> > clearly showed erratic behavior. There's one piece of code i suspect in
> > particular - could you try the patch below on top of -U8? (i have
> > compile- and boot- tested it)
>
> mango:/usr/src/linux-2.6.9-rc4-mm1-U8# patch -p1 </home/tapas/foo.patch
> patching file kernel/sched.c
> Hunk #5 succeeded at 3843 with fuzz 1.
>
> building anyways, reporting later..
never worry about a fuzz when applying patches - as long as you don't get
a reject it should be ok.
Ingo
* Florian Schmidt <[email protected]> wrote:
> i just wanted to let you know that with U8 i still experience the
> "pauses" i reported on U6, too. I would guess that it's some scheduler
> thing as jackd running SCHED_FIFO and all its clients (at least the
> audio threads running SCHED_FIFO) are not affected by the pauses (i
> don't see any xruns from jackd and audio processing happily goes along
> without audible dropouts).
ok.
> Also it seems that /proc/sys/kernel/trace_enabled == 1 is not the only
> thing being able to trigger the pauses. With U6 i also experienced
> them with trace_enabled == 0. I have to add though that it took quite
> a while for them to kick in (hours) after setting trace_enabled to 0.
> So my conclusion is that trace_enabled == 1 just increases the
> probability of such pauses by several orders of magnitude (with 1 i get about
> one of these pauses per 2-10 minutes, with 0 it took several hours for
> the first pause to occur and then they stayed less frequent than with
> 1).
i don't think it's caused by trace_enabled - the trace you sent last time
clearly showed erratic behavior. There's one piece of code i suspect in
particular - could you try the patch below on top of -U8? (i have
compile- and boot- tested it)
Ingo
--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -2764,6 +2764,8 @@ need_resched:
else
deactivate_task(prev, rq);
}
+ if (preempt_count() & PREEMPT_ACTIVE)
+ sub_preempt_count(PREEMPT_ACTIVE);
if (unlikely(prev->flags & PF_DEAD)) {
BUG_ON(prev->state != TASK_RUNNING);
prev->state = __TASK_DEAD;
@@ -2940,6 +2942,7 @@ asmlinkage void __sched preempt_schedule
return;
need_resched:
+ local_irq_disable();
add_preempt_count(PREEMPT_ACTIVE);
/*
* We keep the big kernel semaphore locked, but we
@@ -2950,11 +2953,10 @@ need_resched:
saved_lock_depth = task->lock_depth;
task->lock_depth = -1;
#endif
- schedule();
+ __schedule();
#ifdef CONFIG_PREEMPT_BKL
task->lock_depth = saved_lock_depth;
#endif
- sub_preempt_count(PREEMPT_ACTIVE);
/* we could miss a preemption opportunity between schedule and now */
barrier();
@@ -3002,7 +3004,6 @@ need_resched:
#ifdef CONFIG_PREEMPT_BKL
task->lock_depth = saved_lock_depth;
#endif
- sub_preempt_count(PREEMPT_ACTIVE);
/* we could miss a preemption opportunity between schedule and now */
barrier();
@@ -3842,9 +3843,9 @@ static inline void __cond_resched(void)
if (preempt_count() & PREEMPT_ACTIVE)
return;
do {
+ local_irq_disable();
add_preempt_count(PREEMPT_ACTIVE);
- schedule();
- sub_preempt_count(PREEMPT_ACTIVE);
+ __schedule();
} while (need_resched());
}
On Wed, 20 Oct 2004 15:25:07 +0200
Florian Schmidt <[email protected]> wrote:
> On Wed, 20 Oct 2004 14:55:00 +0200
> Ingo Molnar <[email protected]> wrote:
>
> > i don't think it's caused by trace_enabled - the trace you sent last time
> > clearly showed erratic behavior. There's one piece of code i suspect in
> > particular - could you try the patch below on top of -U8? (i have
> > compile- and boot- tested it)
>
> mango:/usr/src/linux-2.6.9-rc4-mm1-U8# patch -p1 </home/tapas/foo.patch
> patching file kernel/sched.c
> Hunk #5 succeeded at 3843 with fuzz 1.
>
> building anyways, reporting later..
Hi,
it seems that the pauses went away with that patch. The system is showing a
different weird behaviour now. On the last bootup the machine slowly died away:
first my email program froze upon checking for mail, then starting top
would just hang the respective xterm. ps still ran and produced output [i
didn't capture it though, doh], other stuff would hang, too. upon
ctrl-alt-bkspc to kill the x server, it all locked up.. i have no serial
console or other machine to test if it was still up in any way.
And on this bootup the pauses are still gone, but as soon as i echo'ed 1
into trace_enabled the mouse started to become very skippy (update freq at
about 3hz). Keyboard is fine though.. putting trace_enabled back to 0
doesn't fix it. I suppose it's just a matter of time until the next lockup.
We'll see though..
Syslog only sees critical section timing reports, no BUG's afaics.
flo
* Florian Schmidt <[email protected]> wrote:
> Hi,
>
> it seems that the pauses went away with that patch. [...]
great! I've uploaded -U8.1 with this fix included:
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8.1
> And on this bootup the pauses are still gone, but as soon as i echo'ed
> 1 into trace_enabled the mouse started to become very skippy (update
> freq at about 3hz). Keyboard is fine though.. putting trace_enabled
> back to 0 doesn't fix it. I suppose it's just a matter of time until
> the next lockup. We'll see though..
>
> Syslog only sees critical section timing reports, no BUG's afaics.
note that the keyboard and USB interrupts are SCHED_OTHER by default, so
they could be delayed quite long depending on the workload. To avoid
that i'd suggest to:
chrt --fifo --pid 30 `pidof 'IRQ 1'`
chrt --fifo --pid 30 `pidof 'IRQ 12'`
(do this for every IRQ you have for input devices.) This puts them below
jackd's priority (which is FIFO 50 iirc) but above all SCHED_OTHER
tasks. The soundcard IRQ i guess you have chrt-ed already?
or did you have them on SCHED_FIFO already?
Ingo
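The same priority change that chrt makes from the shell can be sketched in C with sched_setscheduler(). This is a minimal sketch for Linux under stated assumptions: set_fifo_priority and fifo_priority_valid are hypothetical helper names, the caller has to look up the pid of the IRQ thread itself, and switching a task to SCHED_FIFO normally requires root (or CAP_SYS_NICE).

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <sys/types.h>

/*
 * Put the task with the given pid into SCHED_FIFO at priority prio.
 * Returns 0 on success and -1 on error with errno set, exactly as
 * sched_setscheduler() does.
 */
int set_fifo_priority(pid_t pid, int prio)
{
    struct sched_param sp = { .sched_priority = prio };

    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}

/* Check that prio lies inside the range the system allows for SCHED_FIFO. */
int fifo_priority_valid(int prio)
{
    return prio >= sched_get_priority_min(SCHED_FIFO) &&
           prio <= sched_get_priority_max(SCHED_FIFO);
}
```

A call such as set_fifo_priority(pid, 30) would correspond to the chrt lines above. The value 30 only matters relative to the other SCHED_FIFO tasks, for example below a jackd running at 50 and above everything that is SCHED_OTHER.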
On Wed, 20 Oct 2004 16:18:22 +0200
Ingo Molnar <[email protected]> wrote:
> note that the keyboard and USB interrupts are SCHED_OTHER by default, so
> they could be delayed quite long depending on the workload. To avoid
> that i'd suggest to:
>
> chrt --fifo --pid 30 `pidof 'IRQ 1'`
> chrt --fifo --pid 30 `pidof 'IRQ 12'`
>
> (do this for every IRQ you have for input devices.) This puts them below
> jackd's priority (which is FIFO 50 iirc) but above all SCHED_OTHER
> tasks. The soundcard IRQ i guess you have chrt-ed already?
>
> or did you have them on SCHED_FIFO already?
setting them to SCHED_FIFO even with a prio of 99 won't help. will try
rebooting to see if it's reproducible
flo
On Wed, 2004-10-20 at 10:18, Ingo Molnar wrote:
> note that the keyboard and USB interrupts are SCHED_OTHER by default, so
> they could be delayed quite long depending on the workload.
Why is this the default behavior? It seems like you would want all IRQ
threads to be SCHED_FIFO by default. Otherwise it seems like the
scheduler could decide to run a normal userspace process (like, say, X)
while an IRQ thread is runnable.
Is it really a good idea for IRQ threads to be subject to the whims of
the scheduler?
Also, on modern machines, this would effectively make all IRQ threads
SCHED_OTHER because the USB port shares an interrupt with everything:
  0:   36676353   XT-PIC  timer           0/76353
  1:       8759   XT-PIC  i8042           5/8759
  2:          0   XT-PIC  cascade         0/0
  8:          4   XT-PIC  rtc             0/4
 10:          0   XT-PIC  uhci_hcd        0/0
 11:     210713   XT-PIC  uhci_hcd, eth0  0/10713
 12:          0   XT-PIC  uhci_hcd        0/0
 15:      79277   XT-PIC  ide1            0/79276
NMI:          0
ERR:          0
Lee
On Tue, 19 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U6 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
Got these high-latency values during the night on U6 (haven't booted U8 yet).
(IRQ 5/431/CPU#0): 612 us critical section violates 100 us threshold.
=> started at timestamp 4167601478: <call_console_drivers+0x76/0x140>
=> ended at timestamp 4167602090: <finish_task_switch+0x43/0xb0>
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c01324de>] check_preempt_timing+0x15e/0x270
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c02a5717>] __sched_text_start+0x2d7/0x5d0
[<c0113104>] mcount+0x14/0x18
[<c013bdfa>] do_irqd+0x5a/0x80
[<c01309ea>] kthread+0xaa/0xb0
[<c013bda0>] do_irqd+0x0/0x80
[<c0130940>] kthread+0x0/0xb0
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x3b/0x5d0 / (do_irqd+0x5a/0x80)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 4167602447
(IRQ 5/431/CPU#0): 34875 us critical section violates 100 us threshold.
=> started at timestamp 4167608224: <call_console_drivers+0x76/0x140>
=> ended at timestamp 4167643099: <finish_task_switch+0x43/0xb0>
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c01324de>] check_preempt_timing+0x15e/0x270
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c02a5717>] __sched_text_start+0x2d7/0x5d0
[<c0113104>] mcount+0x14/0x18
[<c013bdfa>] do_irqd+0x5a/0x80
[<c01309ea>] kthread+0xaa/0xb0
[<c013bda0>] do_irqd+0x0/0x80
[<c0130940>] kthread+0x0/0xb0
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x3b/0x5d0 / (do_irqd+0x5a/0x80)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 4167643459
(IRQ 1/18/CPU#0): 30560 us critical section violates 100 us threshold.
=> started at timestamp 4167647182: <call_console_drivers+0x76/0x140>
=> ended at timestamp 4167677742: <finish_task_switch+0x43/0xb0>
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c01324de>] check_preempt_timing+0x15e/0x270
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c02a5717>] __sched_text_start+0x2d7/0x5d0
[<c0113104>] mcount+0x14/0x18
[<c013bdfa>] do_irqd+0x5a/0x80
[<c01309ea>] kthread+0xaa/0xb0
[<c013bda0>] do_irqd+0x0/0x80
[<c0130940>] kthread+0x0/0xb0
[<c0104099>] kernel_thread_helper+0x5/0xc
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x3b/0x5d0 / (do_irqd+0x5a/0x80)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 4167678099
(bash/10595/CPU#0): 33546 us critical section violates 100 us threshold.
=> started at timestamp 4167681248: <call_console_drivers+0x76/0x140>
=> ended at timestamp 4167714794: <finish_task_switch+0x43/0xb0>
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c01324de>] check_preempt_timing+0x15e/0x270
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c01327f0>] sub_preempt_count+0x60/0x90
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c0117ca3>] finish_task_switch+0x43/0xb0
[<c02a5717>] __sched_text_start+0x2d7/0x5d0
[<c02a684f>] down_write+0x12f/0x1e0
[<c0113104>] mcount+0x14/0x18
[<c02a684f>] down_write+0x12f/0x1e0
[<c01182bb>] lock_kernel+0x2b/0x40
[<c016f122>] sys_ioctl+0x52/0x230
[<c0106013>] syscall_call+0x7/0xb
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x3b/0x5d0 / (down_write+0x12f/0x1e0)
.. entry 2: print_traces+0x1d/0x80 / (dump_stack+0x23/0x30)
=> dump-end timestamp 4167715168
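The reported lengths follow directly from the two timestamps in each report, assuming the timestamps are in microseconds (which matches the numbers above, e.g. 4167602090 - 4167601478 = 612). A small sketch of that arithmetic follows; the struct and function names are hypothetical, the 100 us threshold is taken from the reports, and treating "violates" as strictly greater than the threshold is an assumption.

```c
#include <stdbool.h>
#include <stdint.h>

/* One report: the "started at" and "ended at" timestamps, assumed to be in us. */
struct critical_section {
    uint64_t start_ts;
    uint64_t end_ts;
};

/* Length of the critical section in microseconds. */
uint64_t section_length_us(const struct critical_section *s)
{
    return s->end_ts - s->start_ts;
}

/* Assumption: a section violates the threshold when it is strictly longer. */
bool violates_threshold(const struct critical_section *s, uint64_t threshold_us)
{
    return section_length_us(s) > threshold_us;
}
```

Feeding in the second report (4167608224 to 4167643099) gives 34875 us, which is the figure printed in its header line.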
Linux version 2.6.9-rc4-mm1-RT-U8 ([email protected]) (gcc version 3.2.2 20030222 (Red Hat Linux 3.2.2-5)) #11 SMP Wed Oct 20 18:49:11 MSD 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e8000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001ff30000 (usable)
BIOS-e820: 000000001ff30000 - 000000001ff40000 (ACPI data)
BIOS-e820: 000000001ff40000 - 000000001fff0000 (ACPI NVS)
BIOS-e820: 000000001fff0000 - 0000000020000000 (reserved)
BIOS-e820: 00000000ffb80000 - 0000000100000000 (reserved)
511MB LOWMEM available.
found SMP MP-table at 000ff780
DMI 2.3 present.
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:3 APIC version 20
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 15:3 APIC version 20
ACPI: IOAPIC (id[0x02] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 2, version 32, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: INT_SRC_OVR (bus 0 bus_irq 9 global_irq 9 high level)
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
Initializing CPU#0
Kernel command line: root=/dev/hda2 console=ttyS0,57600
PID hash table entries: 2048 (order: 11, 32768 bytes)
Detected 2798.919 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 512132k/523456k available (2689k kernel code, 10764k reserved, 1111k data, 192k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
monitor/mwait feature present.
using mwait in idle threads.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU0: Intel P4/Xeon Extended MCE MSRs (12) available
CPU0: Thermal monitoring enabled
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 03
per-CPU timeslice cutoff: 2926.33 usecs.
task migration cache decay timeout: 3 msecs.
Booting processor 1/1 eip 3000
Initializing CPU#1
monitor/mwait feature present.
CPU: Trace cache: 12K uops, L1 D cache: 16K
CPU: L2 cache: 1024K
CPU: Physical Processor ID: 0
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel P4/Xeon Extended MCE MSRs (12) available
CPU1: Thermal monitoring enabled
CPU1: Intel(R) Pentium(R) 4 CPU 2.80GHz stepping 03
Total of 2 processors activated (11108.35 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
checking TSC synchronization across 2 CPUs: passed.
ksoftirqd started up.
Brought up 2 CPUs
ksoftirqd started up.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=2
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
PCI: Ignoring BAR0-3 of IDE controller 0000:00:1f.1
PCI: Transparent bridge - 0000:00:1e.0
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 6 7 *10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 6 7 10 11 12 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 6 7 10 11 12 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 6 7 10 *11 12 14 15)
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 6 7 10 *11 12 14 15)
Linux Plug and Play Support v0.97 (c) Adam Belay
SCSI subsystem initialized
usbcore: registered new driver usbfs
usbcore: registered new driver hub
PCI: Using ACPI for IRQ routing
** PCI interrupts are no longer routed automatically. If this
** causes a device to stop working, it is probably because the
** driver failed to call pci_enable_device(). As a temporary
** workaround, the "pci=routeirq" argument restores the old
** behavior. If this argument makes the device work again,
** please email the output of "lspci" to [email protected]
** so I can fix the driver.
Machine check exception polling timer started.
audit: initializing netlink socket (disabled)
audit(1098284016.319:0): initialized
Installing knfsd (copyright (C) 1996 [email protected]).
ACPI: Power Button (FF) [PWRF]
ACPI: Processor [CPU1] (supports C1)
ACPI: Processor [CPU2] (supports C1)
lp: driver loaded but no devices found
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected an Intel 865 Chipset.
agpgart: Maximum main memory to use for agp memory: 439M
agpgart: AGP aperture is 64M @ 0xf8000000
ACPI: PS/2 Keyboard Controller [PS2K] at I/O 0x60, 0x64, irq 1
ACPI: PS/2 Mouse Controller [PS2M] at irq 12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
parport0: PC-style at 0x378 [PCSPP(,...)]
lp0: using parport0 (polling).
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
ACPI: Floppy Controller [FDC] at I/O 0x3f0-0x3f5, 0x3f7 irq 6 dma channel 2
elevator: using anticipatory as default io scheduler
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ICH5: IDE controller at PCI slot 0000:00:1f.1
PCI: Enabling device 0000:00:1f.1 (0005 -> 0007)
ACPI: PCI interrupt 0000:00:1f.1[A] -> GSI 18 (level, low) -> IRQ 18
ICH5: chipset revision 2
ICH5: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, hdd:pio
hda: SAMSUNG SP0411N, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
hda: max request size: 1024KiB
hda: 78242976 sectors (40060 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(100)
hda: hda1 hda2 hda3 hda4 < hda5 >
ACPI: PCI interrupt 0000:00:1d.7[D] -> GSI 23 (level, low) -> IRQ 23
ehci_hcd 0000:00:1d.7: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB2 EHCI Controller
ehci_hcd 0000:00:1d.7: irq 23, pci mem 0xfebffc00
ehci_hcd 0000:00:1d.7: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:1d.7: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 8 ports detected
USB Universal Host Controller Interface driver v2.2
ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1d.0: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #1
uhci_hcd 0000:00:1d.0: irq 16, io base 0xef00
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.1[B] -> GSI 19 (level, low) -> IRQ 19
uhci_hcd 0000:00:1d.1: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #2
uhci_hcd 0000:00:1d.1: irq 19, io base 0xef20
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.2[C] -> GSI 18 (level, low) -> IRQ 18
uhci_hcd 0000:00:1d.2: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #3
uhci_hcd 0000:00:1d.2: irq 18, io base 0xef40
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 4
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:1d.3[A] -> GSI 16 (level, low) -> IRQ 16
uhci_hcd 0000:00:1d.3: Intel Corp. 82801EB/ER (ICH5/ICH5R) USB UHCI #4
uhci_hcd 0000:00:1d.3: irq 16, io base 0xef80
uhci_hcd 0000:00:1d.3: new USB bus registered, assigned bus number 5
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 2 ports detected
usbcore: registered new driver usblp
drivers/usb/class/usblp.c: v0.13: USB Printer Device Class driver
Initializing USB Mass Storage driver...
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
usbcore: registered new driver usbhid
drivers/usb/input/hid-core.c: v2.0:USB HID core driver
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImPS/2 Generic Wheel Mouse on isa0060/serio1
Advanced Linux Sound Architecture Driver Version 1.0.6 (Sun Aug 15 07:17:53 2004 UTC).
ACPI: PCI interrupt 0000:00:1f.5[B] -> GSI 17 (level, low) -> IRQ 17
AC'97 0 analog subsections not ready
intel8x0_measure_ac97_clock: measured 49987 usecs
intel8x0: clocking to 48000
ALSA device list:
#0: Intel ICH5 with AD1985 at 0xfebff800, irq 17
oprofile: using NMI interrupt.
NET: Registered protocol family 2
IP: routing cache hash table of 128 buckets, 21Kbytes
TCP: Hash tables configured (established 1024 bind 1560)
ip_conntrack version 2.1 (4089 buckets, 32712 max) - 308 bytes per conntrack
ip_tables: (C) 2000-2002 Netfilter core team
ipt_recent v0.3.1: Stephen Frost <[email protected]>. http://snowman.net/projects/ipt_recent/
arp_tables: (C) 2002 David S. Miller
NET: Registered protocol family 1
NET: Registered protocol family 17
Starting balanced_irq
ACPI: (supports S0 S1 S3 S4 S5)
ACPI wakeup devices:
P0P4 MC97 USB1 USB2 USB3 USB4 EUSB PS2K PS2M ILAN
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 192k freed
[...skip...]
Warning: dev (pts0) tty->count(16) != #fd's(8) in tty_open
Warning: dev (pts0) tty->count(16) != #fd's(11) in tty_open
Warning: dev (pts0) tty->count(17) != #fd's(13) in tty_open
Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(19) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(19) != #fd's(16) in release_dev
Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(19) != #fd's(17) in tty_open
Warning: dev (pts0) tty->count(19) != #fd's(17) in release_dev
Warning: dev (pts0) tty->count(18) != #fd's(17) in tty_open
Warning: dev (pts0) tty->count(18) != #fd's(17) in tty_open
Warning: dev (pts0) tty->count(19) != #fd's(17) in release_dev
Warning: dev (pts0) tty->count(17) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(16) != #fd's(19) in release_dev
Warning: dev (pts0) tty->count(15) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(14) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(14) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(13) != #fd's(18) in tty_open
Warning: dev (pts0) tty->count(14) != #fd's(19) in release_dev
Warning: dev (pts0) tty->count(13) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(13) != #fd's(17) in release_dev
Warning: dev (pts0) tty->count(12) != #fd's(18) in tty_open
Warning: dev (pts0) tty->count(12) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(12) != #fd's(18) in release_dev
Warning: dev (pts0) tty->count(28) != #fd's(16) in tty_open
Warning: dev (pts0) tty->count(528) != #fd's(13) in tty_open
Warning: dev (pts0) tty->count(528) != #fd's(13) in release_dev
Warning: dev (pts0) tty->count(527) != #fd's(12) in tty_open
Warning: dev (pts0) tty->count(528) != #fd's(12) in release_dev
Warning: dev (pts0) tty->count(538) != #fd's(28) in release_dev
Warning: dev (pts0) tty->count(537) != #fd's(528) in tty_open
Warning: dev (pts0) tty->count(537) != #fd's(527) in release_dev
Warning: dev (pts0) tty->count(536) != #fd's(527) in tty_open
Warning: dev (pts0) tty->count(536) != #fd's(527) in tty_open
Warning: dev (pts0) tty->count(536) != #fd's(527) in release_dev
Warning: dev (pts0) tty->count(535) != #fd's(527) in tty_open
Warning: dev (pts0) tty->count(535) != #fd's(530) in tty_open
[...skip...]
Warning: dev (pts0) tty->count(11) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(10) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(9) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(8) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(7) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(6) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(5) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(4) != #fd's(71) in release_dev
Warning: dev (pts0) tty->count(3) != #fd's(71) in release_dev
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0143645>] buffered_rmqueue+0x89/0x1f4 (28)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (48)
[<c039e226>] cond_resched+0x20/0x7f (32)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f019>] _spin_unlock+0x12/0x2d (16)
[<c039f019>] _spin_unlock+0x12/0x2d (16)
[<c0147f5e>] cache_grow+0x147/0x16f (8)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (28)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
eggcups: page allocation failure. order:1, mode:0x20
[<c01439d8>] __alloc_pages+0x228/0x3ff (8)
[<c0143bce>] __get_free_pages+0x1f/0x3b (68)
[<c014719c>] kmem_getpages+0x25/0xe0 (8)
[<c0147ee1>] cache_grow+0xca/0x16f (24)
[<c0148067>] cache_alloc_refill+0xe1/0x230 (52)
[<c01484a4>] __kmalloc+0x76/0x98 (48)
[<c0325521>] pskb_expand_head+0x51/0x123 (24)
[<c032a8ae>] skb_checksum_help+0x103/0x114 (40)
[<c037a8d7>] ip_nat_fn+0x203/0x215 (36)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c03352bb>] nf_iterate+0x71/0xa5 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (20)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (16)
[<c03355b9>] nf_hook_slow+0x77/0x125 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (28)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c034131a>] ip_finish_output+0x24c/0x251 (4)
[<c0343c16>] ip_finish_output2+0x0/0x1fa (24)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0343c01>] dst_output+0x14/0x29 (4)
[<c033561e>] nf_hook_slow+0xdc/0x125 (8)
[<c0343bed>] dst_output+0x0/0x29 (28)
[<c0341a9e>] ip_queue_xmit+0x51d/0x61c (32)
[<c0343bed>] dst_output+0x0/0x29 (24)
[<c0324ee0>] kfree_skbmem+0x24/0x2c (36)
[<c0324f81>] __kfree_skb+0x99/0xf9 (16)
[<c034d71c>] tcp_ack_saw_tstamp+0x22/0x57 (8)
[<c034df0b>] tcp_clean_rtx_queue+0x40b/0x433 (16)
[<c0324610>] sk_reset_timer+0x1f/0x2f (16)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (52)
[<c035306c>] tcp_transmit_skb+0x521/0x8da (52)
[<c0353f06>] tcp_write_xmit+0x16a/0x2c7 (72)
[<c039e226>] cond_resched+0x20/0x7f (16)
[<c03473fa>] tcp_sendmsg+0x541/0x1260 (40)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (20)
[<c039f0e1>] _spin_unlock_irq+0x12/0x2e (48)
[<c036a404>] inet_sendmsg+0x4d/0x59 (28)
[<c0320fce>] sock_sendmsg+0xe5/0x100 (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (72)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (28)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c0133550>] autoremove_wake_function+0x0/0x57 (52)
[<c0320d33>] sockfd_lookup+0x1c/0x77 (32)
[<c0322424>] sys_sendto+0xe3/0xfe (20)
[<c03221c3>] sys_connect+0x91/0xb1 (68)
[<c0324ae2>] sock_common_setsockopt+0x36/0x3a (96)
[<c0322476>] sys_send+0x37/0x3b (40)
[<c0322ce9>] sys_socketcall+0x139/0x256 (28)
[<c0105259>] sysenter_past_esp+0x52/0x71 (64)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x50 / (0x0)
Warning: trace overflow for c0414d40 (32), increase RWSEM_MAX_OWNERS
... disabling semaphore tracing and deadlock detection.
BUG: sleeping function called from invalid context bash(24154) at drivers/block/ll_rw_blk.c:1283
in_atomic():1 [00000001], irqs_disabled():1
[<c011c4c7>] __might_sleep+0xbb/0xc9 (8)
[<c026d24f>] generic_unplug_device+0x1f/0x50 (36)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling with irqs disabled: bash/0x04000001/24154
caller is cond_resched+0x5e/0x7f
[<c039d99f>] schedule+0x6c/0xb9 (8)
[<c039e264>] cond_resched+0x5e/0x7f (8)
[<c039e264>] cond_resched+0x5e/0x7f (24)
[<c026d254>] generic_unplug_device+0x24/0x50 (16)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 04000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x04000001/24154
caller is cond_resched+0x5e/0x7f
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e264>] cond_resched+0x5e/0x7f (8)
[<c011f16e>] vprintk+0x11e/0x15b (12)
[<c0106116>] dump_stack+0x1c/0x20 (56)
[<c039d99f>] schedule+0x6c/0xb9 (16)
[<c039e264>] cond_resched+0x5e/0x7f (8)
[<c039e264>] cond_resched+0x5e/0x7f (24)
[<c026d254>] generic_unplug_device+0x24/0x50 (16)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 04000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039e2c8>] io_schedule+0x26/0x30 (116)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: sleeping function called from invalid context bash(24154) at lib/rwsem-generic.c:319
in_atomic():1 [00000001], irqs_disabled():0
[<c011c4c7>] __might_sleep+0xbb/0xc9 (8)
[<c039e69d>] down_read+0x2e/0x18a (36)
[<c015587d>] swap_unplug_io_fn+0x19/0xb5 (40)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x04000001/24154
caller is cond_resched+0x5e/0x7f
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e264>] cond_resched+0x5e/0x7f (8)
[<c011f16e>] vprintk+0x11e/0x15b (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c0106116>] dump_stack+0x1c/0x20 (40)
[<c039e264>] cond_resched+0x5e/0x7f (36)
[<c039e6a2>] down_read+0x33/0x18a (16)
[<c015587d>] swap_unplug_io_fn+0x19/0xb5 (40)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 04000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039e2c8>] io_schedule+0x26/0x30 (116)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039e2c8>] io_schedule+0x26/0x30 (116)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: sleeping function called from invalid context bash(24154) at lib/rwsem-generic.c:319
in_atomic():1 [00000001], irqs_disabled():0
[<c011c4c7>] __might_sleep+0xbb/0xc9 (8)
[<c039e69d>] down_read+0x2e/0x18a (36)
[<c015587d>] swap_unplug_io_fn+0x19/0xb5 (40)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x04000001/24154
caller is cond_resched+0x5e/0x7f
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e264>] cond_resched+0x5e/0x7f (8)
[<c011f16e>] vprintk+0x11e/0x15b (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c0106116>] dump_stack+0x1c/0x20 (40)
[<c039e264>] cond_resched+0x5e/0x7f (36)
[<c039e6a2>] down_read+0x33/0x18a (16)
[<c015587d>] swap_unplug_io_fn+0x19/0xb5 (40)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 04000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039e2c8>] io_schedule+0x26/0x30 (116)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039f019>] _spin_unlock+0x12/0x2d (28)
[<c039f019>] _spin_unlock+0x12/0x2d (52)
[<c039e2c8>] io_schedule+0x26/0x30 (36)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: sleeping function called from invalid context bash(24154) at lib/rwsem-generic.c:319
in_atomic():1 [00000001], irqs_disabled():0
[<c011c4c7>] __might_sleep+0xbb/0xc9 (8)
[<c039f019>] _spin_unlock+0x12/0x2d (24)
[<c039e69d>] down_read+0x2e/0x18a (12)
[<c015587d>] swap_unplug_io_fn+0x19/0xb5 (40)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039f019>] _spin_unlock+0x12/0x2d (28)
[<c039f019>] _spin_unlock+0x12/0x2d (52)
[<c039e2c8>] io_schedule+0x26/0x30 (36)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039e2c8>] io_schedule+0x26/0x30 (116)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is down_write+0x14d/0x17c
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e946>] down_write+0x14d/0x17c (8)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c039e946>] down_write+0x14d/0x17c (52)
[<c026d262>] generic_unplug_device+0x32/0x50 (40)
[<c026d29b>] blk_backing_dev_unplug+0x1b/0x1d (16)
[<c01558bc>] swap_unplug_io_fn+0x58/0xb5 (8)
[<c0162583>] block_sync_page+0x4f/0x58 (36)
[<c013e5c4>] sync_page+0x50/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is io_schedule+0x26/0x30
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e2c8>] io_schedule+0x26/0x30 (8)
[<c039f019>] _spin_unlock+0x12/0x2d (28)
[<c039f019>] _spin_unlock+0x12/0x2d (52)
[<c039e2c8>] io_schedule+0x26/0x30 (36)
[<c013e5b8>] sync_page+0x44/0x59 (12)
[<c039e5c0>] __wait_on_bit_lock+0x53/0x61 (8)
[<c013e574>] sync_page+0x0/0x59 (8)
[<c013ed4b>] __lock_page+0xa3/0xab (20)
[<c01335a7>] wake_bit_function+0x0/0x55 (24)
[<c01335a7>] wake_bit_function+0x0/0x55 (32)
[<c014e9ce>] do_swap_page+0x219/0x2da (20)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: sleeping function called from invalid context bash(24154) at kernel/mutex.c:25
in_atomic():1 [00000001], irqs_disabled():0
[<c011c4c7>] __might_sleep+0xbb/0xc9 (8)
[<c0133970>] _mutex_lock+0x1f/0x3b (36)
[<c014e847>] do_swap_page+0x92/0x2da (16)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x04000001/24154
caller is cond_resched+0x5e/0x7f
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c039e264>] cond_resched+0x5e/0x7f (8)
[<c011f16e>] vprintk+0x11e/0x15b (24)
[<c0133f74>] check_preempt_timing+0x70/0x1aa (16)
[<c0106116>] dump_stack+0x1c/0x20 (40)
[<c039e264>] cond_resched+0x5e/0x7f (36)
[<c0133975>] _mutex_lock+0x24/0x3b (16)
[<c014e847>] do_swap_page+0x92/0x2da (16)
[<c014f175>] handle_mm_fault+0xdf/0x18e (52)
[<c0117355>] do_page_fault+0x1be/0x595 (48)
[<c039f019>] _spin_unlock+0x12/0x2d (60)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c0117197>] do_page_fault+0x0/0x595 (12)
[<c0105d3d>] error_code+0x2d/0x38 (8)
[<c010521f>] sysenter_past_esp+0x18/0x71 (52)
preempt count: 04000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
BUG: scheduling while atomic: bash/0x00000001/24154
caller is do_exit+0x2ee/0x3b1
[<c039d8e3>] __sched_text_start+0xbef/0xc3f (8)
[<c01219a6>] do_exit+0x2ee/0x3b1 (8)
[<c01210a2>] exit_notify+0x280/0x896 (48)
[<c01219a6>] do_exit+0x2ee/0x3b1 (68)
[<c0121b82>] do_group_exit+0x91/0xb5 (36)
[<c012acfd>] get_signal_to_deliver+0x1bf/0x396 (24)
[<c01050ea>] do_signal+0xe3/0x11b (48)
[<c039f019>] _spin_unlock+0x12/0x2d (40)
[<c015d40e>] vfs_read+0xbf/0x12a (88)
[<c015d6df>] sys_read+0x51/0x80 (44)
[<c010517a>] do_notify_resume+0x58/0x5a (28)
[<c01052f7>] work_notifysig+0x13/0x18 (12)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
INIT: Switching to runlevel: 6
INIT: Sending processes the TERM signal
Stopping atd: [ OK ]
Stopping keytable: [ OK ]
Stopping cups: [ OK ]
Shutting down xfs: [ OK ]
Shutting down console mouse services: [ OK ]
Stopping sshd:[ OK ]
Shutting down sendmail: [ OK ]
Shutting down sm-client: [ OK ]
Stopping xinetd: [ OK ]
Stopping crond: [ OK ]
Saving random seed: [ OK ]
Stopping NFS statd:
On Wed, 20 Oct 2004, Alexander Batyrshin wrote:
>
>
> Ingo Molnar wrote:
> > i have released the -U8 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
> >
>
> 2.
> if execute
> ``for i in `seq 1 9999`; do nohup bash >/dev/null 2>&1 & done'',
> then you'll get something like:
> [...skip...]
> Warning: dev (pts0) tty->count(16) != #fd's(8) in tty_open
> Warning: dev (pts0) tty->count(16) != #fd's(11) in tty_open
> Warning: dev (pts0) tty->count(17) != #fd's(13) in tty_open
> Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(19) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(19) != #fd's(16) in release_dev
> Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(18) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(19) != #fd's(17) in tty_open
> Warning: dev (pts0) tty->count(19) != #fd's(17) in release_dev
> Warning: dev (pts0) tty->count(18) != #fd's(17) in tty_open
> Warning: dev (pts0) tty->count(18) != #fd's(17) in tty_open
> Warning: dev (pts0) tty->count(19) != #fd's(17) in release_dev
> Warning: dev (pts0) tty->count(17) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(16) != #fd's(19) in release_dev
> Warning: dev (pts0) tty->count(15) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(14) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(14) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(13) != #fd's(18) in tty_open
> Warning: dev (pts0) tty->count(14) != #fd's(19) in release_dev
> Warning: dev (pts0) tty->count(13) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(13) != #fd's(17) in release_dev
> Warning: dev (pts0) tty->count(12) != #fd's(18) in tty_open
> Warning: dev (pts0) tty->count(12) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(12) != #fd's(18) in release_dev
> Warning: dev (pts0) tty->count(28) != #fd's(16) in tty_open
> Warning: dev (pts0) tty->count(528) != #fd's(13) in tty_open
> Warning: dev (pts0) tty->count(528) != #fd's(13) in release_dev
> Warning: dev (pts0) tty->count(527) != #fd's(12) in tty_open
> Warning: dev (pts0) tty->count(528) != #fd's(12) in release_dev
> Warning: dev (pts0) tty->count(538) != #fd's(28) in release_dev
> Warning: dev (pts0) tty->count(537) != #fd's(528) in tty_open
> Warning: dev (pts0) tty->count(537) != #fd's(527) in release_dev
> Warning: dev (pts0) tty->count(536) != #fd's(527) in tty_open
> Warning: dev (pts0) tty->count(536) != #fd's(527) in tty_open
> Warning: dev (pts0) tty->count(536) != #fd's(527) in release_dev
> Warning: dev (pts0) tty->count(535) != #fd's(527) in tty_open
> Warning: dev (pts0) tty->count(535) != #fd's(530) in tty_open
> [...skip...]
> Warning: dev (pts0) tty->count(11) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(10) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(9) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(8) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(7) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(6) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(5) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(4) != #fd's(71) in release_dev
> Warning: dev (pts0) tty->count(3) != #fd's(71) in release_dev
> eggcups: page allocation failure. order:1, mode:0x20
> [<c01439d8>] __alloc_pages+0x228/0x3ff (8)
> [<c0143bce>] __get_free_pages+0x1f/0x3b (68)
> [<c014719c>] kmem_getpages+0x25/0xe0 (8)
> [...skip...]
I got something like this too, just now. Not doing anything special (tty7
holds xdm).
Warning: dev (pts7) tty->count(4) != #fd's(3) in tty_open
Warning: dev (pts7) tty->count(4) != #fd's(3) in release_dev
Hi,
I finally got around to getting my laptop up and running with this. It works -
nearly, that is - X even started up!
When I start the network I get the trace below, and my network runs
awfully slowly. It looks like the interrupt isn't working correctly...
Regards,
Esben
Oct 20 21:45:53 localhost kernel: e100: Intel(R) PRO/100 Network Driver, 3.1.4-k2-NAPI
Oct 20 21:45:53 localhost kernel: e100: Copyright(c) 1999-2004 Intel Corporation
Oct 20 21:45:53 localhost kernel: ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 11 (level, low) -> IRQ 11
Oct 20 21:45:53 localhost kernel: e100: eth0: e100_probe: addr 0x41280000, irq 11, MAC addr 00:D0:59:2D:91:84
Oct 20 21:45:53 localhost kernel: ip/1988: BUG in enable_irq at /misc/frodo_opt3/simlo/Linux-RT/linux-2.6.9-rc4-mm1-rt-u8.1/kernel/irq/manage.c:111
Oct 20 21:45:53 localhost kernel: [<c013614e>] enable_irq+0xee/0x100 (12)
Oct 20 21:45:53 localhost kernel: [<d089741e>] e100_up+0x10e/0x200 [e100] (48)
Oct 20 21:45:54 localhost kernel: [<d08987c0>] e100_open+0x30/0x80 [e100] (48)
Oct 20 21:45:54 localhost kernel: [<c010fe00>] mcount+0x14/0x18 (12)
Oct 20 21:45:54 localhost kernel: [<c02ea108>] dev_open+0x88/0xa0 (20)
Oct 20 21:45:54 localhost kernel: [<c02eb82d>] dev_change_flags+0x5d/0x140 (28)
Oct 20 21:45:54 localhost kernel: [<c02e976e>] __dev_get_by_name+0xe/0xd0 (8)
Oct 20 21:45:54 localhost kernel: [<c03293b7>] devinet_ioctl+0x277/0x6e0 (28)
Oct 20 21:45:54 localhost kernel: [<c032b834>] inet_ioctl+0x64/0xb0 (108)
Oct 20 21:45:54 localhost kernel: [<c02e0ae8>] sock_ioctl+0xc8/0x250 (28)
Oct 20 21:45:54 localhost kernel: [<c016a319>] sys_ioctl+0xc9/0x230 (32)
Oct 20 21:45:54 localhost kernel: [<c01046fd>] sysenter_past_esp+0x52/0x71 (44)
Oct 20 21:45:54 localhost kernel: preempt count: 00000002
Oct 20 21:45:54 localhost kernel: . 2-level deep critical section nesting:
Oct 20 21:45:54 localhost kernel: .. entry 1: enable_irq+0x2e/0x100 / (e100_up+0x10e/0x200 [e100])
Oct 20 21:45:54 localhost kernel: .. entry 2: print_traces+0x1d/0x60 / (dump_stack+0x23/0x30)
Oct 20 21:45:54 localhost kernel:
Oct 20 21:45:54 localhost kernel: e100: eth0: e100_watchdog: link up, 10Mbps, ha
On Wed, 20 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>
> this too is a fixes-only release. It includes the many semaphore-abuse
> and sleep_on() fixes/improvements from Thomas Gleixner, and it also
> includes a couple of semaphore related fixes.
>
> I believe the semaphore fixes should resolve a number of the deadlocks
> reported for -U7.
>
> In particular it seems the only sane and reliable way to convert RCU
> locking was to allow the following semantics for rwsems: allow reads to
> nest, and allow self-read-recursion of a self-write-held semaphore. My
> current implementation for this allows semaphore unfairness, but that
> can be fixed later on. Most importantly, the RCU to RT-locking
> conversions are much more automatic now and map nicely to what the code
> is doing upstream. Most of the time they involve a conversion of a
> spinlock or semaphore into a rwlock or rwsem. The old code maps to new
> code almost automatically, the only manual work needed was to associate
> the rcu_read_lock() with the writers-lock that it excludes against,
> which is a pretty clear (but not automatic, and hence not automatable)
> decision. This way i could convert some more networking code, and
> simplify the older changes and hopefully get rid of some deadlocks. The
> locking API is still not in its final form, but it's getting closer.
>
> Changes since -U7:
>
> - deadlock fix: sysfs/driver-base semaphore fixes from Thomas Gleixner
>
> - deadlock fix: scsi semaphore fixes from Thomas Gleixner
>
> - NFS sleep_on() fixes from Thomas Gleixner
>
> - rawmidi.c sleep_on() fix from Thomas Gleixner
>
> - [ i've added more wait_for_completion_*() primitives, to ease
> conversion of other semaphore-(ab-)using code. ]
>
> - make rwsems self-recursive
>
> - RCU lock conversion: convert rtnl_sem RCU use.
>
> - netfilter deadlock fix - clean up RCU locking.
>
> to create a -U8 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>
> Ingo
Ingo Molnar wrote:
> disable_irq() should work fine though. (it doesnt disable local
> interrupts, it only disables that particular irq line.) So something
> else disabled interrupts - ah, netconsole.c itself. Does the patch below
> fix things up for you?
>
> Ingo
> [patch snipped]
That patch was not enough. The BUGs were still showing up the same as
before.
I tried to debug it myself. I've found an interesting thing in
kernel/printk.c:release_console_sem(). There is the following sequence:
spin_lock_irqsave(&logbuf_lock, flags);
/* ... some code ... */
spin_unlock(&logbuf_lock);
call_console_drivers(...);
local_irq_restore(flags);
I know very little about locking, but I didn't like this two-phased
unlock. So I replaced it with a single spin_unlock_irqrestore. Patch
attached.
I'm almost certain that there is a reason for the two-phased unlocking
and that this patch will break something, but so far it works for me.
netconsole now works without complaining.
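To make the difference between the two orderings concrete, here is a minimal user-space sketch. It is only a toy model: the irq flag and lock are plain booleans, every name in it (irqs_enabled, logbuf_locked, two_phase_release(), single_phase_release(), ...) is made up for illustration, and it assumes the patched version calls the console drivers after the combined unlock. It records what state the console drivers observe in each case.

```c
#include <stdbool.h>

/* Toy stand-ins for kernel state; none of this is real kernel API. */
bool irqs_enabled = true;
bool logbuf_locked = false;

/* What the (modelled) console drivers observed when they were called. */
struct console_view {
	bool irqs_enabled;
	bool lock_held;
};
struct console_view seen;

void model_call_console_drivers(void)
{
	seen.irqs_enabled = irqs_enabled;
	seen.lock_held = logbuf_locked;
}

/* Model of spin_lock_irqsave(): save irq state, disable irqs, take lock. */
bool model_lock_irqsave(void)
{
	bool flags = irqs_enabled;

	irqs_enabled = false;
	logbuf_locked = true;
	return flags;
}

/* Original ordering: drop the lock, call drivers, then restore irqs. */
struct console_view two_phase_release(void)
{
	bool flags = model_lock_irqsave();

	logbuf_locked = false;         /* spin_unlock(&logbuf_lock) */
	model_call_console_drivers();  /* runs with irqs still off */
	irqs_enabled = flags;          /* local_irq_restore(flags) */
	return seen;
}

/* Assumed patched ordering: unlock and restore together, then call drivers. */
struct console_view single_phase_release(void)
{
	bool flags = model_lock_irqsave();

	logbuf_locked = false;         /* spin_unlock_irqrestore(): unlock ... */
	irqs_enabled = flags;          /* ... and restore irqs in one step */
	model_call_console_drivers();  /* runs with irqs back on */
	return seen;
}
```

In this model the drivers never see the lock held in either variant; the only difference is whether interrupts are still off when they run, which is the condition the netconsole path was complaining about.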
Michal
On Wed, 2004-10-20 at 10:49, Alexander Batyrshin wrote:
> Ingo Molnar wrote:
> > i have released the -U8 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
> >
>
> used i386/defconfig
>
> 1. at boot
> [...skip...]
> hda: 78242976 sectors (40060 MB) w/2048KiB Cache, CHS=16383/255/63, UDMA(100)
> hda: hda1 hda2 hda3 hda4 < hda5 >
> BUG: semaphore recursion deadlock detected!
> .. current task khpsbpkt/723 is already holding c04610c0.
> 00001f23 00001f2f c04e8de9 00000086 00000009 00000000 c011b552 00000020
> 00000400 c03b3efa dfdc9f70 dfdc9f50 0000000c dfdc78f0 c011b430 c03b3efa
> dfdc9f70 c01052a5 c03b3efa c03b3efa 00000000 dfdc78f0 dfdc78f0 c01fd364
> Call Trace:
> [<c012caa4>] __kernel_text_address+0x2e/0x37 (24)
> [<c01051c9>] show_trace+0x4e/0xcc (12)
> [<c01052c9>] show_stack+0x82/0x97 (36)
> [<c01fd364>] __rwsem_deadlock+0xd9/0x135 (24)
> [<c039e2a0>] down_write_interruptible+0xe6/0x202 (48)
> [<c029dd80>] hpsbpkt_thread+0x2b/0x86 (48)
> [<c029dd55>] hpsbpkt_thread+0x0/0x86 (12)
> [<c01024d1>] kernel_thread_helper+0x5/0xb (4)
> preempt count: 00000003
> . 3-level deep critical section nesting:
> .. entry 1: _spin_lock_irqsave+0x19/0x74 / (0x0)
> .. entry 2: _spin_lock+0x19/0x6d / (0x0)
> .. entry 3: print_traces+0x17/0x50 / (0x0)
> [...skip...]
This looks related to IEEE1394. It has a deadlock in it. Try turning it
off.
Daniel
On Wed, 2004-10-20 at 02:45, Ingo Molnar wrote:
> i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>
> this too is a fixes-only release.
A bit late, but a short report. I managed to boot into U8.1 (Athlon64 UP
machine), but only if I turn off kudzu (the hardware discovery program
in FC2). If I leave it on, I get tons of kernel error or warning
messages on the console. They scroll too fast for me to read, and once that
starts the machine is pretty much stuck - the messages keep scrolling
but that's about it. I have not managed to capture the output to see
what they say...
If I boot without kudzu I can use sound, but I have not yet done any
latency testing.
-- Fernando
* Michal Schmidt <[email protected]> wrote:
> That patch was not enough. The BUGs were still showing up the same as
> before. I tried to debug it myself. I've found an interesting thing in
> kernel/printk.c:release_console_sem(). There is the following
> sequence:
>
> spin_lock_irqsave(&logbuf_lock, flags);
> /* ... some code ... */
> spin_unlock(&logbuf_lock);
> call_console_drivers(...);
> local_irq_restore(flags);
>
> I know very little about locking, but I didn't like this two-phased
> unlock. So I replaced it with a single spin_unlock_irqrestore. Patch
> attached. I'm almost certain that there is a reason for the two-phased
> unlocking and that this patch will break something, but so far it
> works for me. netconsole now works without complaining.
ah, indeed. Note that this is still not enough - please try to add a
local_irq_enable() to netconsole.c's console-write function - does that
fix it equally well for you?
the reason is that if we crash within an irqs-off section then
netconsole will still be called with interrupts disabled and will
trigger the assert.
Ingo
* Ingo Molnar <[email protected]> wrote:
> ah, indeed. Note that this is still not enough - please try to add a
> local_irq_enable() to netconsole.c's console-write function - does
> that fix it equally well for you?
>
> the reason is that if we crash within an irqs-off section then
> netconsole will still be called with interrupts disabled and will
> trigger the assert.
i've added your patch to my tree, plus the extra local_irq_enable(),
this should also fix fbcon - so no changes needed to netconsole.c. All
of these problems will go away if/when the console code goes away from
raw spinlocks.
Ingo
* Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > ah, indeed. Note that this is still not enough - please try to add a
> > local_irq_enable() to netconsole.c's console-write function - does
> > that fix it equally well for you?
> >
> > the reason is that if we crash within an irqs-off section then
> > netconsole will still be called with interrupts disabled and will
> > trigger the assert.
>
> i've added your patch to my tree, plus the extra local_irq_enable(),
> this should also fix fbcon - so no changes needed to netconsole.c. All
> of these problems will go away if/when the console code goes away from
> raw spinlocks.
actually ... i think i'll add the local_irq_enable() to netconsole.c and
fbcon, that way the VGA and serial consoles can still keep interrupts
disabled. That could make the difference between a debuggable and
undebuggable crash ...
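A rough user-space sketch of that per-console split, purely as a toy model: the irq flag is a boolean, and struct model_console, needs_irqs and the function names are invented for illustration, not taken from the console code. Only a console marked as needing interrupts re-enables them in its write function; the others run in whatever state the caller left.

```c
#include <stdbool.h>

bool irqs_enabled = true;   /* toy model of the local irq flag */

struct model_console {
	const char *name;
	bool needs_irqs;     /* e.g. a network- or framebuffer-backed console */
	int writes;          /* number of messages delivered */
	bool ran_with_irqs;  /* irq state seen during the last write */
};

/* Write one message; only consoles that need irqs re-enable them. */
void model_console_write(struct model_console *con, const char *msg)
{
	(void)msg;
	if (con->needs_irqs)
		irqs_enabled = true;   /* stands in for local_irq_enable() */
	con->ran_with_irqs = irqs_enabled;
	con->writes++;
}

/* Simulate printing from inside an irqs-off section (e.g. a crash). */
bool write_from_irqs_off(struct model_console *con, const char *msg)
{
	irqs_enabled = false;
	model_console_write(con, msg);
	return con->ran_with_irqs;
}
```

With this split, a serial-style console prints the crash message without ever turning interrupts back on, while a netconsole-style console gets the interrupts it needs.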
Ingo
On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
As I already pointed out, this is a problem due to up(sema) in
queuecommand. That's one of the semaphore abuse points, which needs to
be fixed.
The problem is that semaphores are held by Process A and released by
Process B, which makes Ingo's checks trigger.
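As an illustration of the kind of check that fires here, a minimal sketch of an owner-tracking lock in plain user-space C. This is not the code from the -RT patch; struct checked_mutex and its functions are hypothetical, tasks are just non-negative integer ids, and nothing actually sleeps. The point is only that once the lock remembers its owner, a release from any other task becomes detectable.

```c
#include <stdbool.h>

#define NO_OWNER (-1)

struct checked_mutex {
	int owner;   /* id of the holding task (>= 0), or NO_OWNER */
};

void checked_mutex_init(struct checked_mutex *m)
{
	m->owner = NO_OWNER;
}

/* Returns true if the lock was free and is now held by `task`. */
bool checked_mutex_trylock(struct checked_mutex *m, int task)
{
	if (m->owner != NO_OWNER)
		return false;
	m->owner = task;
	return true;
}

/*
 * Returns false if `task` is not the current owner. A debugging kernel
 * would report a BUG at this point instead of returning.
 */
bool checked_mutex_unlock(struct checked_mutex *m, int task)
{
	if (m->owner != task)
		return false;
	m->owner = NO_OWNER;
	return true;
}
```

A plain counting semaphore has no owner field, so the same A-locks, B-unlocks sequence goes through there without complaint.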
tglx
* Rui Nuno Capela <[email protected]> wrote:
> One of the signs that there's real trouble in here can be seen on
> the following complete dmesg output (which was even a miracle to be
> captured at all). This shows the complete bootstrap and init sequences
> and at the end one fatal crash while plugging a USB flash memory
> stick (usb-storage). This has been already reported earlier yesterday,
> but I just want to make it here, as the evidence-at-hand.
>
> After this precise occurrence, the system becomes very flaky,
> unreliable and often ends up freezing to death.
for the sake of testing could you disable CONFIG_USB and see whether the
instability is truly directly related to the USB crash, as you suspect?
Such a kernel crash can often destabilize other parts of the kernel.
Ingo
Ingo Molnar wrote:
>
> i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>
Hi,
I'm posting this time to report the current status of my two
boxes with the realtime-preempt-2.6.9-rc4-mm1-U8 patch. It
seems that the positions have now been somewhat reversed. The respective
config.gz files are attached.
a) Desktop P4 [email protected] SMP/HT (SuSE 9.1 Pro)
config-2.6.9-rc4-mm1-RT-U8.0smp.gz
This is apparently, but surprisingly, OK: everything seems to work
flawlessly, apart from some quirks on the onboard NIC (sk98lin) that only
show up initially and stabilize later on. Indeed, U8 is the first
SMP+PREEMPT_REALTIME incarnation that runs at all and is fairly
workable on this machine. This is a relief.
b) Laptop P4 2.533Ghz UP (Mandrake 10.1c)
config-2.6.9-rc4-mm1-RT-U8.1.gz
This box was known to work without major issues until U4. With U8 it's
a real pain. Once-trivial operations now turn out to be fatal. Running jackd
-R, which had been a flagship before, now freezes the whole system in
no time. (I'll take some netconsole capture sessions later.)
One of the signs that there's real trouble here can be seen in the
following complete dmesg output (it was a miracle that it could be
captured at all). It shows the complete bootstrap and init sequences
and, at the end, one fatal crash while plugging in a USB flash memory stick
(usb-storage). This was already reported yesterday, but I
just want to post it here as the evidence at hand.
After this precise occurrence, the system becomes very flaky and unreliable,
and often ends up freezing to death.
Hope this helps (me)
Bye now.
--
Linux version 2.6.9-rc4-mm1-RT-U8.1 (root@lambda) (gcc version 3.4.1
(Mandrakelinux (Alpha 3.4.1-3mdk)) #1 Thu Oct 21 09:08:18 WEST 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009f800 (usable)
BIOS-e820: 000000000009f800 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000dc000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001f770000 (usable)
BIOS-e820: 000000001f770000 - 000000001f77f000 (ACPI data)
BIOS-e820: 000000001f77f000 - 000000001f780000 (ACPI NVS)
BIOS-e820: 000000001f780000 - 000000001f800000 (reserved)
BIOS-e820: 000000002f780000 - 000000002f800000 (reserved)
BIOS-e820: 00000000fff80000 - 0000000100000000 (reserved)
503MB LOWMEM available.
On node 0 totalpages: 128880
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 124784 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v000 PTLTD ) @ 0x000f6c70
ACPI: RSDT (v001 PTLTD RSDT 0x06040000 LTP 0x00000000) @ 0x1f7783fd
ACPI: FADT (v001 ATI Salmon 0x06040000 ATI 0x000f4240) @ 0x1f77ef64
ACPI: BOOT (v001 PTLTD $SBFTBL$ 0x06040000 LTP 0x00000001) @ 0x1f77efd8
ACPI: DSDT (v001 ATI MS2_1535 0x06040000 MSFT 0x0100000e) @ 0x00000000
Built 1 zonelists
No local APIC present or hardware disabled
Initializing CPU#0
Kernel command line: BOOT_IMAGE=linux-RT-U8.1 ro root=305 devfs=nomount acpi=on
PID hash table entries: 2048 (order: 11, 32768 bytes)
Detected 2525.698 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 65536 (order: 6, 262144 bytes)
Inode-cache hash table entries: 32768 (order: 5, 131072 bytes)
Memory: 507728k/515520k available (1791k kernel code, 7300k reserved, 591k data, 152k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 4931.58 BogoMIPS (lpj=2465792)
Security Scaffold v1.0.0 initialized
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: bfebf9ff 00000000 00000000 00000000
CPU: After vendor identify, caps: bfebf9ff 00000000 00000000 00000000
CPU: Trace cache: 12K uops, L1 D cache: 8K
CPU: L2 cache: 512K
CPU: After all inits, caps: bfebf9ff 00000000 00000000 00000080
CPU: Intel(R) Pentium(R) 4 CPU 2.53GHz stepping 07
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
ACPI: IRQ9 SCI: Edge set to Level Trigger.
ksoftirqd started up.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xfd88b, last bus=2
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using PIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0.AGPB._PRT]
ACPI: PCI Interrupt Link [LNK0] (IRQs 7 10) *0, disabled.
ACPI: PCI Interrupt Link [LNK1] (IRQs 7 *10)
ACPI: PCI Interrupt Link [LNK2] (IRQs 7 10) *0, disabled.
ACPI: PCI Interrupt Link [LNK3] (IRQs 7 10) *0, disabled.
ACPI: PCI Interrupt Link [LNK4] (IRQs 7 *10)
ACPI: PCI Interrupt Link [LNK5] (IRQs 7 *11)
ACPI: PCI Interrupt Link [LNK6] (IRQs 7 10) *0, disabled.
ACPI: PCI Interrupt Link [LNK7] (IRQs *5)
ACPI: PCI Interrupt Link [LNK8] (IRQs 7 *10)
ACPI: Embedded Controller [EC0] (gpe 24)
SCSI subsystem initialized
PCI: Using ACPI for IRQ routing
** PCI interrupts are no longer routed automatically. If this
** causes a device to stop working, it is probably because the
** driver failed to call pci_enable_device(). As a temporary
** workaround, the "pci=routeirq" argument restores the old
** behavior. If this argument makes the device work again,
** please email the output of "lspci" to [email protected]
** so I can fix the driver.
Simple Boot Flag at 0x37 set to 0x1
devfs: 2004-01-31 Richard Gooch ([email protected])
devfs: boot_options: 0x0
Activating ISA DMA hang workarounds.
Real Time Clock Driver v1.12
ACPI: PS/2 Keyboard Controller [KBC0] at I/O 0x60, 0x64, irq 1
ACPI: PS/2 Mouse Controller [MSE0] at irq 12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
PCI: Enabling device 0000:00:08.0 (0000 -> 0003)
ACPI: PCI Interrupt Link [LNK6] enabled at IRQ 10
PCI: setting IRQ 10 as level-triggered
ACPI: PCI interrupt 0000:00:08.0[A] -> GSI 10 (level, low) -> IRQ 10
ttyS0 at I/O 0x1428 (irq = 10) is a 8250
ttyS2 at I/O 0x1440 (irq = 10) is a 8250
ttyS3 at I/O 0x1450 (irq = 10) is a 8250
ttyS4 at I/O 0x1460 (irq = 10) is a 8250
ttyS5 at I/O 0x1470 (irq = 10) is a 8250
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
ACPI: Floppy Controller [FDC] at I/O 0x3f0-0x3f5, 0x3f7 irq 6 dma channel 2
elevator: using anticipatory as default io scheduler
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
ALI15X3: IDE controller at PCI slot 0000:00:10.0
ACPI: PCI interrupt 0000:00:10.0[A]: no GSI
ALI15X3: chipset revision 196
ALI15X3: not 100% native mode: will probe irqs later
ide0: BM-DMA at 0x2000-0x2007, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0x2008-0x200f, BIOS settings: hdc:pio, hdd:pio
Probing IDE interface ide0...
hda: IC25N040ATCS04-0, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdc: HL-DT-STCD-RW/DVD DRIVE GCC-4240N, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 128KiB
hda: 78140160 sectors (40007 MB) w/1768KiB Cache, CHS=65535/16/63, UDMA(100)
hda: cache flushes not supported
/dev/ide/host0/bus0/target0/lun0: p1 p2 < p5 p6 p7 >
hdc: ATAPI 24X DVD-ROM CD-R/RW drive, 2048kB Cache, DMA
Uniform CD-ROM driver Revision: 3.20
mice: PS/2 mouse device common for all mice
input: AT Translated Set 2 keyboard on isa0060/serio0
Synaptics Touchpad, model: 1
Firmware: 5.8
Sensor: 35
new absolute packet format
Touchpad has extended capability bits
-> multifinger detection
-> palm detection
input: SynPS/2 Synaptics TouchPad on isa0060/serio1
NET: Registered protocol family 2
IP: routing cache hash table of 128 buckets, 20Kbytes
TCP: Hash tables configured (established 1024 bind 1638)
NET: Registered protocol family 1
NET: Registered protocol family 17
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 152k freed
kjournald starting. Commit interval 5 seconds
ACPI: AC Adapter [ACAD] (on-line)
ACPI: Battery Slot [BAT1] (battery present)
ACPI: Power Button (FF) [PWRF]
ACPI: Lid Switch [LID]
ACPI: Processor [CPU0] (supports C1 C2)
ACPI: Thermal Zone [THRM] (52 C)
usbcore: registered new driver usbfs
usbcore: registered new driver hub
ohci_hcd: 2004 Feb 02 USB 1.1 'Open' Host Controller (OHCI) Driver (PCI)
ACPI: PCI Interrupt Link [LNK8] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:02.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:02.0: OHCI Host Controller
ohci_hcd 0000:00:02.0: irq 10, pci mem 0xd4000000
ohci_hcd 0000:00:02.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 3 ports detected
ACPI: PCI Interrupt Link [LNK4] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:0f.0[A] -> GSI 10 (level, low) -> IRQ 10
ohci_hcd 0000:00:0f.0: OHCI Host Controller
ohci_hcd 0000:00:0f.0: irq 10, pci mem 0xd4009000
ohci_hcd 0000:00:0f.0: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 3 ports detected
usb usb2: string descriptor 0 read error: -113
usb usb2: string descriptor 0 read error: -113
usb usb2: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb2: string descriptor 0 read error: -113
usb usb2: string descriptor 0 read error: -113
usb usb2: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
usb usb1: string descriptor 0 read error: -113
EXT3 FS on hda5, internal journal
Adding 506008k swap on /dev/hda6. Priority:-1 extents:1
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda7, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
subfs 0.9
loop: loaded (max 8 devices)
natsemi dp8381x driver, version 1.07+LK1.0.17, Sep 27, 2002
originally by Donald Becker <[email protected]>
http://www.scyld.com/network/natsemi.html
2.4.x kernel port by Jeff Garzik, Tjeerd Mulder
ACPI: PCI Interrupt Link [LNK1] enabled at IRQ 10
ACPI: PCI interrupt 0000:00:12.0[A] -> GSI 10 (level, low) -> IRQ 10
natsemi eth0: NatSemi DP8381[56] at 0xd400a000 (0000:00:12.0), 00:0b:cd:85:0f:54, IRQ 10, port TP.
Linux Kernel Card Services
options: [pci] [cardbus] [pm]
PCI: Enabling device 0000:00:0a.0 (0005 -> 0007)
ACPI: PCI Interrupt Link [LNK5] enabled at IRQ 11
PCI: setting IRQ 11 as level-triggered
ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 11 (level, low) -> IRQ 11
Yenta: CardBus bridge found at 0000:00:0a.0 [103c:0850]
Yenta: ISA IRQ mask 0x00b8, PCI irq 11
Socket status: 30000007
cs: IO port probe 0x0c00-0x0cff: clean.
cs: IO port probe 0x0100-0x04ff: excluding 0x408-0x40f 0x480-0x48f 0x4d0-0x4d7
cs: IO port probe 0x0a00-0x0aff: clean.
prism2_cs: Ignoring new-style parameters in presence of obsolete ones
prism2cs_init: prism2_cs.o: 0.2.1-pre22 Loaded
prism2cs_init: dev_info is: prism2_cs
PCI: Enabling device 0000:00:06.0 (0005 -> 0007)
ACPI: PCI Interrupt Link [LNK7] enabled at IRQ 5
PCI: setting IRQ 5 as level-triggered
ACPI: PCI interrupt 0000:00:06.0[A] -> GSI 5 (level, low) -> IRQ 5
usbcore: registered new driver snd-usb-usx2y
Realtime LSM initialized (group 81, mlock=1)
mtrr: no more MTRRs available
ohci_hcd 0000:00:0f.0: wakeup
usb 2-1: new full speed USB device using address 2
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:598!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: usb_storage realtime commoncap snd_seq_oss
snd_seq_midi_event snd_seq snd_pcm_oss snd_mixer_oss snd_usb_usx2y
snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd_ali5451
snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore prism2_cs
p80211 ds yenta_socket pcmcia_core natsemi crc32 loop subfs evdev ohci_hcd
usbcore thermal processor fan button battery ac
CPU: 0
EIP: 0060:[<c01b7e24>] Not tainted VLI
EFLAGS: 00010202 (2.6.9-rc4-mm1-RT-U8.1)
EIP is at up_write+0x1d4/0x202
eax: d4dce000 ebx: 00000292 ecx: d4e18980 edx: d52da030
esi: d4e69e24 edi: d5803400 ebp: d4dcfd6c esp: d4dcfd4c
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process usb-stor (pid: 6630, threadinfo=d4dce000 task=d4dcd8f0)
Stack: d4dcd8f0 d4dcfd78 c02bea7a 00000001 d4dcd8f0 00000292 d4e18980 d5803400
d4dcfd84 e018e139 d4e18980 d4dcfd84 00000292 d58034b8 d4dcfdac c022ed0c
d4e18980 c022ef10 c023166d 00000000 d4e189d4 d4e18980 dcf21000 d5803400
Call Trace:
[<c0104eb0>] show_stack+0x80/0x96 (28)
[<c010504b>] show_registers+0x165/0x1de (56)
[<c010525d>] die+0xf6/0x191 (64)
[<c0105797>] do_invalid_op+0x10b/0x10d (188)
[<c0104b0d>] error_code+0x2d/0x38 (100)
[<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
[<c022ed0c>] scsi_dispatch_cmd+0x168/0x218 (40)
[<c02342e1>] scsi_request_fn+0x1ee/0x42b (52)
[<c0205606>] blk_insert_request+0xcd/0xfb (44)
[<c0232f43>] scsi_insert_special_req+0x3b/0x3f (28)
[<c0233175>] scsi_wait_req+0x61/0x94 (60)
[<c0235290>] scsi_probe_lun+0x8e/0x240 (68)
[<c0235883>] scsi_probe_and_add_lun+0xb0/0x1be (48)
[<c0236009>] scsi_scan_target+0xa4/0x123 (60)
[<c0236115>] scsi_scan_channel+0x8d/0xa4 (48)
[<c02361a5>] scsi_scan_host_selected+0x79/0xd4 (44)
[<c0236231>] scsi_scan_host+0x31/0x33 (28)
[<e0190cbd>] usb_stor_scan_thread+0x144/0x155 [usb_storage] (96)
[<c0102305>] kernel_thread_helper+0x5/0xb (723714068)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: die+0x3a/0x191 / (do_invalid_op+0x10b/0x10d)
.. entry 2: print_traces+0x16/0x4a / (show_stack+0x80/0x96)
Code: e8 af f9 ff ff 89 f8 e8 fd af f5 ff e9 35 ff ff ff 0f 0b a5 00 e3 e8
2c c0 e9 da fe ff ff 0f 0b a4 00 e3 e8 2c c0 e9 c4 fe ff ff <0f> 0b 56 02
6f 75 2d c0 e9 3c fe ff ff e8 a8 5b 10 00 e9 22 ff
--
rncbc aka Rui Nuno Capela
[email protected]
On Thu, Oct 21, 2004 at 11:16:30AM +0200, Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
>
> As I already pointed out, this is a problem due to up(sema) in
> queuecommand. That's one of the semaphore abuse points, which needs to
> be fixed.
>
> The problem is that semaphores are held by Process A and released by
> Process B, which makes Ingo's checks trigger
Which is perfectly valid for a semaphore.
On Thu, Oct 21, 2004 at 11:44:38AM +0200, Ingo Molnar wrote:
>
> * Christoph Hellwig <[email protected]> wrote:
>
> > > The problem is that semaphores are held by Process A and released by
> > > Process B, which makes Ingo's checks trigger
> >
> > Which is perfectly valid for a semaphore.
>
> yes, it is valid and perfectly fine code, but i'm trying to separate out
> the simple 'mutex' functionality (99% of the semaphore users are just
> that) and implement a 'counted semaphore' separately. This removes a
> number of implementational constraints from mutexes.
So leave the good old struct semaphore alone and introduce a mutex_t..
On Thu, Oct 21 2004, Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
>
> As I already pointed out, this is a problem due to up(sema) in
> queuecommand. That's one of the semaphore abuse points, which needs to
> be fixed.
>
> The problem is that semaphores are held by Process A and released by
> Process B, which makes Ingo's checks trigger
That's utter crap, it's perfectly valid use.
--
Jens Axboe
On Thu, 2004-10-21 at 11:35, Christoph Hellwig wrote:
> On Thu, Oct 21, 2004 at 11:16:30AM +0200, Thomas Gleixner wrote:
> > On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
> >
> > As I already pointed out, this is a problem due to up(sema) in
> > queuecommand. That's one of the semaphore abuse points, which needs to
> > be fixed.
> >
> > The problem is that semaphores are held by Process A and released by
> > Process B, which makes Ingo's checks trigger
>
> Which is perfectly valid for a semaphore.
>
In fact this is used where wait_for_completion() is the correct thing to
do. It's not waiting for a resource. It's waiting for completion of a
command.
tglx
* Christoph Hellwig <[email protected]> wrote:
> > yes, it is valid and perfectly fine code, but i'm trying to separate out
> > the simple 'mutex' functionality (99% of the semaphore users are just
> > that) and implement a 'counted semaphore' separately. This removes a
> > number of implementational constraints from mutexes.
>
> So leave the good old struct semaphore alone and introduce a mutex_t..
with nearly 1000 'struct semaphore' references in the kernel and 980 of
them being simple mutex use this is rather impractical. So i instead
went for safely detecting the 20 non-mutex uses and converting those
places. (Btw., 90% of those 20 cases can be detected safely at
compile-time (and link-time) by removing DECLARE_MUTEX_LOCKED and making
sema_init() a macro that only allows constant values of 0 and 1 and
produces a link error for other cases.)
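A sketch of the constant-only idea, with one deliberate substitution: instead of the link-error trick described above, it uses C11 _Static_assert, so a bad initial count fails at compile time rather than at link time. struct model_sem and model_sema_init() are made-up names for illustration and are not the kernel's sema_init().

```c
struct model_sem {
	int count;
};

/*
 * Accept only the compile-time constants 0 and 1 as the initial count.
 * Anything else (including a non-constant expression) fails to compile,
 * which flags the call site as a counted-semaphore user that needs
 * manual conversion.
 */
#define model_sema_init(sem, val)                                        \
	do {                                                             \
		_Static_assert((val) == 0 || (val) == 1,                 \
			       "initial count must be the constant 0 or 1"); \
		(sem)->count = (val);                                    \
	} while (0)
```

A call such as model_sema_init(&s, 5), or one passing a runtime variable, is rejected by the compiler, so the non-mutex users identify themselves during the build.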
this work is still incomplete so i'm not arguing for upstream inclusion.
(But while we did this a couple of places did turn out to use semaphores
for completion which is inefficient - we converted those to completions
and are contributing those changes to mainline. But this issue is
totally orthogonal to the issue of counted semaphores.)
Ingo
On Thu, 2004-10-21 at 11:53, Jens Axboe wrote:
> On Thu, Oct 21 2004, Thomas Gleixner wrote:
> > On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
> >
> > As I already pointed out, this is a problem due to up(sema) in
> > queuecommand. That's one of the semaphore abuse points, which needs to
> > be fixed.
> >
> > The problem is that semaphores are held by Process A and released by
> > Process B, which makes Ingo's checks trigger
>
> That's utter crap, it's perfectly valid use.
It's not!
From the code:
init_MUTEX_LOCKED(&(us->sema));
This is used to wait for command completion, and that is what we have the
completion API for. It was used this way because the ancestor of completion
(sleep_on) was racy!
tglx
On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
Can you try that one ?
diff -urN 2.6.9-rc4-mm1-RT-U8/drivers/usb/storage/scsiglue.c 2.6.9-rc4-mm1-RT-U8.1/drivers/usb/storage/scsiglue.c
--- 2.6.9-rc4-mm1-RT-U8/drivers/usb/storage/scsiglue.c 2004-10-12 09:41:44.000000000 +0200
+++ 2.6.9-rc4-mm1-RT-U8.1/drivers/usb/storage/scsiglue.c 2004-10-21 11:45:14.000000000 +0200
@@ -187,7 +187,7 @@
us->srb = srb;
/* wake up the process task */
- up(&(us->sema));
+ complete(&(us->done));
return 0;
}
diff -urN 2.6.9-rc4-mm1-RT-U8/drivers/usb/storage/usb.c 2.6.9-rc4-mm1-RT-U8.1/drivers/usb/storage/usb.c
--- 2.6.9-rc4-mm1-RT-U8/drivers/usb/storage/usb.c 2004-10-12 09:41:44.000000000 +0200
+++ 2.6.9-rc4-mm1-RT-U8.1/drivers/usb/storage/usb.c 2004-10-21 11:45:34.000000000 +0200
@@ -299,7 +299,7 @@
for(;;) {
US_DEBUGP("*** thread sleeping.\n");
- if(down_interruptible(&us->sema))
+ if(wait_for_completion_interruptible(&us->done))
break;
US_DEBUGP("*** thread awakened.\n");
@@ -941,7 +941,7 @@
}
memset(us, 0, sizeof(struct us_data));
init_MUTEX(&(us->dev_semaphore));
- init_MUTEX_LOCKED(&(us->sema));
+ init_completion(&(us->done));
init_completion(&(us->notify));
init_waitqueue_head(&us->dev_reset_wait);
init_waitqueue_head(&us->scsi_scan_wait);
diff -urN 2.6.9-rc4-mm1-RT-U8/drivers/usb/storage/usb.h 2.6.9-rc4-mm1-RT-U8.1/drivers/usb/storage/usb.h
--- 2.6.9-rc4-mm1-RT-U8/drivers/usb/storage/usb.h 2004-10-12 09:41:44.000000000 +0200
+++ 2.6.9-rc4-mm1-RT-U8.1/drivers/usb/storage/usb.h 2004-10-21 11:45:13.000000000 +0200
@@ -159,8 +159,8 @@
dma_addr_t iobuf_dma;
/* mutual exclusion and synchronization structures */
- struct semaphore sema; /* to sleep thread on */
- struct completion notify; /* thread begin/end */
+ struct completion done; /* to sleep thread on */
+ struct completion notify; /* thread begin/end */
wait_queue_head_t dev_reset_wait; /* wait during reset */
wait_queue_head_t scsi_scan_wait; /* wait before scanning */
struct completion scsi_scan_done; /* scan thread end */
On Thu, Oct 21 2004, Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 11:53, Jens Axboe wrote:
> > On Thu, Oct 21 2004, Thomas Gleixner wrote:
> > > On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > > > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
> > >
> > > As I already pointed out, this is a problem due to up(sema) in
> > > queuecommand. That's one of the semaphore abuse points, which needs to
> > > be fixed.
> > >
> > > The problem is that semaphores are hold by Process A and released by
> > > Process B, which makes Ingo's checks trigger
> >
> > That's utter crap, it's perfectly valid use.
>
> It's not!
>
> From the code:
>
> init_MUTEX_LOCKED(&(us->sema));
>
> This is used to wait for command completion and therefor we have the
> completion API. It was used this way because the ancestor of completion
> (sleep_on) was racy !
I didn't look at the USB code, I'm just saying that the pattern you
describe (process A holding it, process B releasing it) is a perfectly
valid use of a semaphore.
--
Jens Axboe
On Thu, 2004-10-21 at 12:11, Jens Axboe wrote:
> On Thu, Oct 21 2004, Thomas Gleixner wrote:
> > On Thu, 2004-10-21 at 11:53, Jens Axboe wrote:
> > > On Thu, Oct 21 2004, Thomas Gleixner wrote:
> > > > On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > > > > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
> > > >
> > > > As I already pointed out, this is a problem due to up(sema) in
> > > > queuecommand. That's one of the semaphore abuse points, which needs to
> > > > be fixed.
> > > >
> > > > The problem is that semaphores are hold by Process A and released by
> > > > Process B, which makes Ingo's checks trigger
> > >
> > > That's utter crap, it's perfectly valid use.
> >
> > It's not!
> >
> > From the code:
> >
> > init_MUTEX_LOCKED(&(us->sema));
> >
> > This is used to wait for command completion and therefor we have the
> > completion API. It was used this way because the ancestor of completion
> > (sleep_on) was racy !
>
> I didn't look at the USB code, I'm just saying that it's perfectly valid
> use of a semaphore the pattern you describe (process A holding it,
> process B releasing it).
Yeah, for a semaphore it is, but not for a mutex.
IMHO, this is not clearly separated and therefore produces a lot of
confusion.
tglx
On Monday 18 October 2004 05:50, OGAWA Hirofumi wrote:
> Dominik Karall <[email protected]> writes:
> > yes, the bug only occurs on a specific file.
> > as the bug is present in -mm1 (without vp) too, i applied your patch to
> > that one. here is the output:
> >
> > fat_cache_check: id 0, contig 6415, fclus 38231, dclus 1010103
> > contig 6416, fclus 38231, dclus 1010103
> > contig 0, fclus 32, dclus 603964
> > contig 1, fclus 30, dclus 603960
> > contig 7, fclus 22, dclus 603950
> > contig 4, fclus 17, dclus 603943
> > contig 1, fclus 15, dclus 603940
> > contig 6, fclus 8, dclus 603931
> > contig 0, fclus 7, dclus 603929
>
> Thanks. Seems good. There is no inconsistency in cache.
>
> > and the movie starts to play in mplayer without problems. tell me if
> > you need more debugging!
>
> Can you please try the patch again? This patch should tell who added
> the cache.
>
> Thanks.
sorry, but i can't reproduce the bug again. even after a reboot the file works
as normal. but i didn't change anything in the file.
if the bug appears again, i will apply the patch and let you know!
best regards,
dominik
Ingo Molnar wrote:
>
> * Rui Nuno Capela <[email protected]> wrote:
>
>> One of the signs that there's real trouble in here can be seen on
>> the following complete dmesg output (which was even a miracle to be
>> captured at all). This shows the complete bootstrap and init sequences
>> and at the end one fatal crash while plugging an USB flash memory
>> stick (usb-storage). This has been already reported earlier yesterday,
>> but I just want to make it here, as the evidence-at-hand.
>>
>> After this precise occurence, the system becomes very flaky,
>> unreliable and often ends up freezing to death.
>
> for the sake of testing could you disable CONFIG_USB and see whether the
> instability is truly directly related to the USB crash, as you suspect?
> Such a kernel crash can often destabilize other parts of the kernel.
>
Just tested with CONFIG_USB off, and can't test the usb-storage crash, of
course. However, jackd is still freezing to death. No console or syslog
output can be found. The system just dies some time after some jack client
is launched. Will try further.
I'm on the way to test Thomas Gleixner's patch...
BBL
--
rncbc aka Rui Nuno Capela
[email protected]
* Jens Axboe <[email protected]> wrote:
> I didn't look at the USB code, I'm just saying that it's perfectly
> valid use of a semaphore the pattern you describe (process A holding
> it, process B releasing it).
yes, that is perfectly true, and sorry if we gave you the wrong
impression.
the goal of these patches is to do a semaphore->completion conversion in
cases where the semaphore was used for completion purposes. It's a bit
faster and more readable but not a 'bugfix' in any way. (another set of
patches are converting sleep_on() uses to wait_event*() plus waitqueues
- those can in fact be considered bugfixes in some cases.)
typically the cases where semaphores are held by one task and released
by another task coincide with this used-for-completion scenario.
[ the different-owner assert that triggers in my PREEMPT_REALTIME tree
is for completely different reasons and has no impact on upstream at
all. (It merely means 'Ingo does some weird stuff again, pester him, not
others'.) ]
Ingo
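A minimal userspace sketch of what a completion provides, assuming POSIX threads. It only mirrors the names init_completion(), complete() and wait_for_completion() used in the patches above; the kernel's real implementation is different and is not reproduced here.

```c
#include <assert.h>
#include <pthread.h>
#include <stddef.h>

/* Userspace analogue of a completion: a counter of posted events. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	unsigned int done;
};

static void init_completion(struct completion *x)
{
	pthread_mutex_init(&x->lock, NULL);
	pthread_cond_init(&x->wait, NULL);
	x->done = 0;
}

/* Called by whoever finished the work; need not be the waiting task. */
static void complete(struct completion *x)
{
	pthread_mutex_lock(&x->lock);
	x->done++;
	pthread_cond_signal(&x->wait);
	pthread_mutex_unlock(&x->lock);
}

/* Sleeps until at least one event has been posted, then consumes it. */
static void wait_for_completion(struct completion *x)
{
	pthread_mutex_lock(&x->lock);
	while (x->done == 0)
		pthread_cond_wait(&x->wait, &x->lock);
	assert(x->done > 0);
	x->done--;
	pthread_mutex_unlock(&x->lock);
}
```

Because the event is counted, a complete() that arrives before the waiter sleeps is not lost. There is also no notion of an owner, so the "held by A, released by B" question does not arise.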
* Christoph Hellwig <[email protected]> wrote:
> > The problem is that semaphores are hold by Process A and released by
> > Process B, which makes Ingo's checks trigger
>
> Which is perfectly valid for a semaphore.
yes, it is valid and perfectly fine code, but i'm trying to separate out
the simple 'mutex' functionality (99% of the semaphore users are just
that) and implement a 'counted semaphore' separately. This removes a
number of implementational constraints from mutexes.
Ingo
On Thu, Oct 21 2004, Ingo Molnar wrote:
>
> * Jens Axboe <[email protected]> wrote:
>
> > I didn't look at the USB code, I'm just saying that it's perfectly
> > valid use of a semaphore the pattern you describe (process A holding
> > it, process B releasing it).
>
> yes, that is perfectly true, and sorry if we gave you the wrong
> impression.
>
> the goal of these patches is to do a semaphore->completion conversion in
> cases where the semaphore was used for completion purposes. It's a bit
> faster and more readable but not a 'bugfix' in any way. (another set of
> patches are converting sleep_on() uses to wait_event*() plus waitqueues
> - those can in fact be considered bugfixes in some cases.)
>
> typically the cases where semaphores are held by one task and released
> by another task happens coincide with this used-for-completion scenario.
Thanks for the explanation, I can agree with that.
--
Jens Axboe
* Thomas Gleixner <[email protected]> wrote:
> > > This is used to wait for command completion and therefor we have the
> > > completion API. It was used this way because the ancestor of completion
> > > (sleep_on) was racy !
> >
> > I didn't look at the USB code, I'm just saying that it's perfectly valid
> > use of a semaphore the pattern you describe (process A holding it,
> > process B releasing it).
>
> Yeah, for a semaphore it is, but not for a mutex.
but mutexes dont exist in upstream Linux as a separate entity. (they
exist in my tree but that's another ballgame.)
> IMHO, this is not clearly seperated and therefor produces a lot of
> confusion.
if used to complete some work then semaphores are indeed a tad unclean
and slightly slower than completions - but they are fully correct kernel
code. And there are much worse offenders of cleanliness around.
Ingo
>>
>> for the sake of testing could you disable CONFIG_USB and see whether the
>> instability is truly directly related to the USB crash, as you suspect?
>> Such a kernel crash can often destabilize other parts of the kernel.
>>
>
> Just tested with CONFIG_USB off, and can't test the usb-storage crash, of
> course. However, jackd is still freezing to death. No console, nor syslog
> output can be found. The system just dies sometime after some jack client
> is launched. Will try further.
>
> I'm on the way to test Thomas Gleixner's patch...
>
OK. Thomas' patch solves the usb-storage crash, but I added a one-liner
change to it, to make it build. Corrected patch is appended.
Thanks.
--
rncbc aka Rui Nuno Capela
[email protected]
On Thu, Oct 21 2004, Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 12:11, Jens Axboe wrote:
> > On Thu, Oct 21 2004, Thomas Gleixner wrote:
> > > On Thu, 2004-10-21 at 11:53, Jens Axboe wrote:
> > > > On Thu, Oct 21 2004, Thomas Gleixner wrote:
> > > > > On Thu, 2004-10-21 at 11:12, Rui Nuno Capela wrote:
> > > > > > [<e018e139>] queuecommand+0x70/0x7c [usb_storage] (24)
> > > > >
> > > > > As I already pointed out, this is a problem due to up(sema) in
> > > > > queuecommand. That's one of the semaphore abuse points, which needs to
> > > > > be fixed.
> > > > >
> > > > > The problem is that semaphores are hold by Process A and released by
> > > > > Process B, which makes Ingo's checks trigger
> > > >
> > > > That's utter crap, it's perfectly valid use.
> > >
> > > It's not!
> > >
> > > From the code:
> > >
> > > init_MUTEX_LOCKED(&(us->sema));
> > >
> > > This is used to wait for command completion and therefor we have the
> > > completion API. It was used this way because the ancestor of completion
> > > (sleep_on) was racy !
> >
> > I didn't look at the USB code, I'm just saying that it's perfectly valid
> > use of a semaphore the pattern you describe (process A holding it,
> > process B releasing it).
>
> Yeah, for a semaphore it is, but not for a mutex.
>
> IMHO, this is not clearly seperated and therefor produces a lot of
> confusion.
Semaphore and mutex have always been the same thing in Linux. Apparently
this isn't so in Ingo's tree, so you should make a clear distinction about
which one you are discussing.
--
Jens Axboe
On Thu, 2004-10-21 at 13:11, Jens Axboe wrote:
> >
> > IMHO, this is not clearly seperated and therefor produces a lot of
> > confusion.
>
> Semaphore and mutex has always been the same thing in Linux. Apparently
> this isn't so in Ingos tree, you should make a clear distinction on
> which you are discussing.
I agree, but this thread is discussing Ingo's tree :)
I know that semaphores and mutexes are the same, but that's something
which should be separated IMHO.
Ingo's changes reveal a couple of places where completion or wait_event
is cleaner than using a mutex. That's all I'm talking about. Sorry
if I did not express it clearly enough or even used a wrong expression.
The places which we identify were not wrong from the viewpoint of when
they were coded. They use a mutex to wait for completion, which works
with the current mutex implementation and is common use in the kernel.
Part of this discussion is the given mixup in the kernel, which is
functionally correct, but if a mutex is changed to a real mutex then it
is wrong in the semantic sense.
tglx
Ingo Molnar wrote:
>* Thomas Gleixner <[email protected]> wrote:
>
>
>>Yeah, for a semaphore it is, but not for a mutex.
>>
>
>but mutexes dont exist in upstream Linux as a separate entity. (they
>exist in my tree but that's another ballgame.)
>
Mutexes layered on existing semaphores seem convenient
at the moment. However a more native mutex mechanism
which tracks ownership would provide a basis for PI as
well as further instrumentation. This may not be an
issue at the present but I don't think it is too far
off.
-john
--
[email protected]
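A minimal sketch of the ownership tracking mentioned above, as plain bookkeeping only. It does no real blocking or atomic locking, and struct owned_mutex, its functions and the integer task ids are all hypothetical names for illustration, not anything from Ingo's tree.

```c
#include <assert.h>
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical mutex that remembers which task holds it. */
struct owned_mutex {
	int owner;	/* task id of the holder, NO_OWNER when free */
};

static void owned_mutex_init(struct owned_mutex *m)
{
	m->owner = NO_OWNER;
}

/* Non-blocking acquire: succeeds only if nobody holds the mutex. */
static bool owned_mutex_trylock(struct owned_mutex *m, int task)
{
	assert(task != NO_OWNER);
	if (m->owner != NO_OWNER)
		return false;
	m->owner = task;
	return true;
}

/*
 * Release: refused unless the caller is the recorded owner. This is
 * the kind of different-owner check that up() from another task trips.
 */
static bool owned_mutex_unlock(struct owned_mutex *m, int task)
{
	assert(task != NO_OWNER);
	if (m->owner != task)
		return false;
	m->owner = NO_OWNER;
	return true;
}
```

Once the owner is recorded, a waiter knows which task to boost, which is what priority inheritance needs as a starting point.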
On Wed, 20 Oct 2004, Ingo Molnar wrote:
>
> i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
Grr. I got this -U8 mail *before* I got the -U7 mail.
Throughout all these U# kernels, I get stalls in X (everything pauses, not
sure if remote pings stop). Nothing shows in dmesg when this occurs.
I had been running rc kernels for a while, without these problems. I had not
run any mm kernels, but I have run all the U kernels (but not U7).
Compiling/rebooting.
Rui Nuno Capela wrote:
> Ingo Molnar wrote:
>>
>> i have released the -U8 Real-Time Preemption patch:
>>
>> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>>
> [...]
>
> b) Laptop P4 2.533Ghz UP (Mandrake 10.1c)
> config-2.6.9-rc4-mm1-RT-U8.1.gz
>
> This box was known to work without major issues until U4. With U8 it's
> a real pain. Once trivial operations turns out fatal now. Running jackd
> -R, which has been a flagship before, now freezes the whole system in
> no time. (I'll take some netconsole capture sessions later)
>
OK.
Now that the usb-storage crash has been ironed out, thanks to Thomas
Gleixner's patch, I proceeded with some local experiments regarding the
jackd -R issue.
The fact is that jackd -R (realtime mode; SCHED_FIFO) is hosing the system,
and that's exposed as soon as some jack audio client application enters into
the chain.
Running jackd non-realtime (SCHED_OTHER) does not expose this problem, so
I think it's a scheduling related one.
With a default scenario, with all IRQ handlers under the SCHED_OTHER
scheduling class and default priority, running jackd -R completely freezes
the system. A hard reset or power-off is the only way out.
Then I tried tweaking the keyboard (IRQs 1,12), rtc (IRQ 8) and soundcard
(IRQ 5) scheduling policies to SCHED_FIFO and the priority to something higher
than jackd's (e.g. chrt --fifo 60).
This way, running jackd -R still hoses the system, in a somewhat less
egoistic manner, but it still seems to be the only process running on the
system, taking full control of it. The evidence I could find was that
jackd's verbose output keeps pumping, as it usually would, but all the
rest of the poor things freeze to death. This time however, magic SysRq is of
some use, barely, thanks to the i8042 IRQ scheduling promotion.
So it seems that jackd -R is not crashing, nor is anything else, for that
matter.
I really hate to say this, but this should be investigated for the RT
patch's sake, obviously because the only purpose I find for it is precisely
running jackd -R, and I can swear it was near perfection up to and including
U4 (I think it was called VP back then :).
I hope I'm not the only one.
(IIRC Florian Schmidt was experiencing something similar, like intermittent
system pauses, while also running jackd -R.)
Strangely enough, all this runs on another SMP/HT box of mine without
major issues. I guess that SMP makes the difference here.
But wait, now I remember noticing something there as well: running some jack
clients (e.g. fluidsynth) is much more expensive than usual wrt. CPU usage,
reaching very unusual levels above 40% sustained (this is actual CPU% as
reported by procps tools, not the DSP usage as reported by jack itself).
IMHO the SMP/HT effect just seems to mask the real trouble, as the
increased cost affects only one of the (virtual) CPUs.
I wonder if this does ring some bell out there. :)
Cheers,
--
rncbc aka Rui Nuno Capela
[email protected]
With a workaround patch for the boot time BUG, I was able to get ...
- to single user mode w/o any errors
- a [NOW] non fatal error getting the network up (telinit 3)
- no further errors getting the X server up (telinit 5)
- able to hear sample audio
- system stayed up all night (daemons were stable...)
This is the first time in about two weeks that I had a reasonably stable
system (last known good is -T3).
I was about to run my normal stress tests when the system locked up.
The symptom was the display stopped updating / no mouse motion. Apparently
caused while I was dragging a window with the mouse (USB). We may still
have problems in that area. No apparent response to Alt-Sysrq keys;
hardware reset was sufficient to reboot.
Will check the system logs to see what I can find.
--Mark H Johnson
<mailto:[email protected]>
i have released the -U9 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this too is a fixes-only release. It includes more driver fixes and
improvements from Thomas Gleixner.
Changes since -U8.1:
- USB semaphore->completion conversion from Thomas Gleixner
- netconsole fixes from Michal Schmidt
- fbcon fixes
- added counted semaphores, this is now used by firewire, XFS and ACPI.
This could fix the firewire breakage - but testing would be welcome.
- PREEMPT_ACTIVE irqs-enabled critical path removal.
- fixed irqs-off raw spinlock primitives on UP: they enabled irqs
before enabling preemption, creating a window for an interrupt to
slip in and increase the critical path.
- made the deadlock detector not crash the current process - it will
just hang. This produces far nicer log output while still not
endangering stability. Also, fixed a bug in the detector that happens
if the trace buffer overflows.
- made the atomic-counter-underflow detector non-fatal as well, for the
same reasons.
to create a -U9 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U9
Ingo
On Wed, 20 Oct 2004 16:53:26 +0200
Florian Schmidt <[email protected]> wrote:
> setting them to SCHED_FIFO even with a prio of 99 won't help. will try
> rebooting to see if it's reproducable
>
> flo
ok, it seems it was a coincidence that the mouse skipping started at the time
of my echo 1 > trace_enabled. This time it just started sometime after
boot. The scheduler class of the different irqs seems to have no influence
(i experimented with SCHED_FIFO and SCHED_OTHER for irq 1, 8, 12 and 3 and 10;
the latter two are my soundcards' irqs).
~$ cat /proc/interrupts
CPU0
0: 480317 XT-PIC timer 0/80317
1: 1731 XT-PIC i8042 0/1731
2: 0 XT-PIC cascade 0/0
3: 10828 XT-PIC CS46XX 0/10828
5: 390 XT-PIC eth0 0/390
8: 4 XT-PIC rtc 0/4
10: 251556 XT-PIC SiS SI7012 0/51556
12: 16605 XT-PIC i8042 0/16604
14: 1151 XT-PIC ide0 0/1151
15: 26537 XT-PIC ide1 0/26537
NMI: 0
ERR: 0
Btw: i just experienced two pauses again, so the patch didn't really fix it
:( hrmm..
I get the feeling that something nondeterministic is going on. Still no BUGs
in either the dmesg output or the syslog.
flo
* Rui Nuno Capela <[email protected]> wrote:
> The fact is jackd -R (realtime mode; SCHED_FIFO) hosing the system,
> and thats exposed as soon as some jack audio client application enters
> into the chain.
>
> Running jackd non-realtime (SCHED_OTHER) does not expose this problem,
> so I think it's a scheduling related one.
i tried to poke jackd a little bit (just using things like
jack_freewheel and jack_impulse_grabber - i dont even know what they
do), and got jackd into some sort of userspace loop:
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
2558 root 16 0 27900 1852 2152 S 97.8 0.8 2:36.38 jackd
attaching gdb to it shows:
Loaded symbols for /usr/local/lib/jack/jack_oss.so
0xffffe410 in ?? ()
(gdb) bt
#0 0xffffe410 in ?? ()
#1 0xbffff7f8 in ?? ()
#2 0x00000a67 in ?? ()
#3 0x00000000 in ?? ()
#4 0x4db8adf8 in pthread_join () from /lib/tls/libpthread.so.0
#5 0xb77d6e66 in oss_driver_stop (driver=0x8055938) at oss_driver.c:696
#6 0x0804ba03 in jack_engine_delete (engine=0x805c308) at engine.c:2466
#7 0x0804ade7 in main (argc=3, argv=0xbffffb44) at jackd.c:207
(gdb)
it's looping somewhere in the pthread code, and it does no system-calls
at all and thus no scheduling as well.
i dont know much about jackit, and it could easily be that something in
this kernel broke its interaction with pthread, but it seems to me that
this loop is in userspace and is only 'fatal' if the looping thread runs
at SCHED_FIFO priority. Could someone with more jackit experience try to
figure out what's going on here?
Ingo
* Ingo Molnar <[email protected]> wrote:
> i tried to poke jackd a little bit (just using things like
> jack_freewheel and jack_impulse_grabber - i dont even know what they
> do), and got jackd into some sort of userspace loop:
>
> PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
> 2558 root 16 0 27900 1852 2152 S 97.8 0.8 2:36.38 jackd
ah ... i should have guessed that "jack_freewheel y" puts jackd into ...
freewheeling mode. So this is by design.
i still suspect that it's some sort of userspace loop causing the jackd
problems - just that under SCHED_OTHER you dont normally notice it,
while with jackd -R it's fatal.
Ingo
>i have released the -U8 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
First attempt at booting this version, made it to single user OK but
telinit 3 stopped while starting up the network with the following messages
on console...
INIT: Switching to runlevel: 3
INIT: Sending processes the TERM signal
INIT: Sending processes the KILL signal
Applying Intel IA32 Microcode update: [ OK ]
Checking for new hardware ----------------[ cut here ]---------------
kernel BUG at include/asm/atomic.h:135!
Invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in: 8139too mii floppy sg scsi_mod microcode dm_mod uhci_hcd ext3 jbd
CPU: 0
EIP: 0060:[<c02b8426>] Not tainted VLI
EFLAGS: 00010246 (2.6.9-rc4-mm1-RT-U8)
EIP is at qdisc_destroy+0x76/0x80
eax: 00000000 ebx: dedf9000 ecx: c037ede0 edx: c037ede0
esi: dedf9000 edi: c03747a8 ebp: df3d9e68 esp: df3d9e64
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process modprobe (pid: 1594, threadinfo=df3d8000 task=db708690)
Stack: dedf9000 df3d9e84 c02b863d c037ede0 df3d9e74 df3d9e74 c03423e4 dedf9000
df3d9ea8 c02a9bdb dedf9000 c01148b0 e0836b10 dedf9420 dedf9000 e083b148
0807a804 df3d9eb8 c024192e dedf9000 dedf9000 df3d9ed8 e0836b3a dedf9000
Call Trace:
[<c02b863d>] dev_shutdown+0x3d/0xa0 (12)
[<c02a9bdb>] unregister_netdevice+0x13b/0x280 (28)
[<c01148b0>] mcount+0x14/0x18 (8)
[<e0836b10>] rtl8139_remove_one+0x0/0xa0 [8139too] (4)
[<c024192e>] unregister_netdev+0x1e/0x30 (24)
[<e0836b3a>] rtl8139_remove_one+0x2a/0xa0 [8139too] (16)
[<c01148b0>] mcount+0x14/0x18 (12)
[<c01e3a06>] pci_device_remove+0x76/0x80 (20)
[<c02300bb>] device_detach_shutdown+0xb/0x40 (12)
[<c022d207>] device_release_driver+0x67/0x70 (12)
[<c022d23b>] device_detach+0x2b/0x40 (24)
[<c022d6af>] bus_remove_driver+0x3f/0x70 (20)
[<c022dbb9>] driver_unregister+0x19/0x30 (20)
[<c01e3cac>] pci_unregister_driver+0x1c/0x30 (16)
[<e0838ae7>] rtl8139_cleanup_module+0x17/0x1b [8139too] (16)
[<c013c751>] sys_delete_module+0x121/0x150 (12)
[<c01594f4>] sys_munmap+0x54/0x70 (64)
[<c0118560>] do_page_fault+0x0/0x6d0 (16)
[<c0107b49>] sysenter_past_esp+0x52/0x71 (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x1f/0x80 / (die+0x44/0x190)
.. entry 2: print_traces_0x1d/0x60 / (show_stack+0x8f/0xb0)
Code: 00 ba c0 82 2b c0 c7 43 04 00 02 20 00 5b 5d e9 81 bf e7 ff 0f 0b a5 00 e9 4d 32 c0 eb cd 0f 0b a4 00 e9 4d 32 c0 eb b8 5b 5d c3 <0f> 0b 87 00 0d 4e 32 c0 eb 93 55 89 e5 83 ec 10 89 5d f8 89 75
Updating /etc/fstab [ OK ]
Setting network parameters: [ OK ]
Bringing up loopback interface:
(no further console output at this point)
Alt-Sysrq-L showed active tasks were swapper (CPU 1) and IRQ 1 (CPU 0) at
cpu_idle and nmi_show_all_regs respectively. Able to repeat this more than
once with the same results each time. Tasks listed by Alt-Sysrq-T included
IRQ 6, S10network, minlogd, initlog, ifup, ip, and grep (all I could see
on the scrollback).
Synced disks (Alt-Sysrq-S) and rebooted (Alt-Sysrq-B) to try again, this
time turning on syslog before telinit 3. Same results. Will send whatever
showed up on disk separately.
--Mark H Johnson
<mailto:[email protected]>
* [email protected] <[email protected]> wrote:
> I was about to run my normal stress tests when the system locked up.
>
> The symptom was the display stopped updating / no mouse motion.
> Apparently caused while I was dragging a window with the mouse (USB).
> We may still have problems in that area. No apparent response to
> Alt-Sysrq keys; hardware reset was sufficient to reboot.
the soundcard IRQ trace you got is interesting:
BUG: sleeping function called from invalid context X(2948) at kernel/mutex.c:25
in_atomic():1 [00010000], irqs_disabled():0
[<c011f06a>] __might_sleep+0xca/0xe0 (12)
[<c01387e6>] _mutex_lock+0x26/0x50 (36)
[<e0a549b6>] snd_audiopci_interrupt+0x46/0xf0 [snd_ens1371] (20)
[<c01436f6>] handle_IRQ_event+0x46/0x80 (24)
[<c0143837>] __do_IRQ+0x107/0x160 (32)
[<c010a299>] do_IRQ+0x59/0x90 (36)
[<c0108510>] common_interrupt+0x18/0x20 (20)
preempt count: 00010001
do you have PREEMPT_REALTIME enabled? The above trace is a direct
interrupt - only the timer interrupt is allowed to execute directly in
the PREEMPT_REALTIME model - things break badly if it happens for any
other interrupt (such as the soundcard IRQ).
Ingo
On Thu, 21 Oct 2004, john cooper wrote:
> Ingo Molnar wrote:
>
> >* Thomas Gleixner <[email protected]> wrote:
> >
> >
> >>Yeah, for a semaphore it is, but not for a mutex.
> >>
> >
> >but mutexes dont exist in upstream Linux as a separate entity. (they
> >exist in my tree but that's another ballgame.)
> >
> Mutexes layered on existing semaphores seems convenient
> at the moment. However a more native mutex mechanism
> which tracks ownership would provide a basis for PI as
> well as further instrumentation. This may not be an
> issue at the present but I don't think it is too far
> off.
>
> -john
>
Actually you need to have another kind of semaphore based on a new kind of
wait-queue: priority based, i.e. the task with the highest priority gets
woken up first. Then on top of that you build your mutex.
I was planning to start to look at it and try to see if I could get
anything to work, but I must admit I haven't got much further than
just getting Ingo's -U8.1 up and running.
An idea I had was to make a macro in list.h called
list_insert_sorted(list, element, condition_statement)
and use that in this kind of wait_queue...
To get a mutex with priority inheritance, add an element pointing to the
current owner and a field where you store the owner's original priority,
which it has to be set back to when it releases the mutex (I am not sure
how this will work out if someone holds several mutexes!)
Regards,
Esben
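A minimal sketch of the priority-sorted wait list suggested above, in plain userspace C rather than as a list.h macro. struct waiter and both functions are hypothetical names, and it assumes that a higher prio value means a more urgent task. The several-mutexes question is left open here as well.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical wait-queue entry; higher prio means more urgent. */
struct waiter {
	int prio;
	struct waiter *next;
};

/*
 * Insert w so that the list stays sorted by descending priority.
 * Waiters of equal priority keep FIFO order among themselves.
 */
static void waitlist_insert_sorted(struct waiter **head, struct waiter *w)
{
	struct waiter **pos = head;

	assert(w != NULL);
	while (*pos && (*pos)->prio >= w->prio)
		pos = &(*pos)->next;
	w->next = *pos;
	*pos = w;
}

/* Remove and return the highest-priority waiter, or NULL if empty. */
static struct waiter *waitlist_pop(struct waiter **head)
{
	struct waiter *w = *head;

	if (w) {
		*head = w->next;
		w->next = NULL;
	}
	return w;
}
```

Waking the head of such a list always picks the highest-priority waiter, at the cost of a linear walk on every insert.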
* Ingo Molnar <[email protected]> wrote:
> the soundcard IRQ trace you got is interesting:
>
> BUG: sleeping function called from invalid context X(2948) at kernel/mutex.c:25 Oct 21 07:53:02 localhost kernel: in_atomic():1 [00010000], irqs_disabled():0
> [<c011f06a>] __might_sleep+0xca/0xe0 (12)
> [<c01387e6>] _mutex_lock+0x26/0x50 (36)
> [<e0a549b6>] snd_audiopci_interrupt+0x46/0xf0 [snd_ens1371] (20)
> [<c01436f6>] handle_IRQ_event+0x46/0x80 (24)
> [<c0143837>] __do_IRQ+0x107/0x160 (32)
> [<c010a299>] do_IRQ+0x59/0x90 (36)
> [<c0108510>] common_interrupt+0x18/0x20 (20)
> preempt count: 00010001
>
> do you have PREEMPT_REALTIME enabled? The above trace is a direct
> interrupt - only the timer interrupt is allowed to execute directly in
> the PREEMPT_REALTIME model - things break badly if it happens for any
> other interrupt (such as the soundcard IRQ).
this also seems to be the major cause of the other problems in your log:
kernel preempting off a hardirq context and then confusing other kernel
code. It's surprising that it booted up at all! Now how did the
soundcard IRQ end up being non-threaded?
Ingo
On Thu, 2004-10-21 at 15:27, Ingo Molnar wrote:
> i have released the -U9 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
ipmi watchdog conversion to completion API.
tglx
diff --exclude='*~' -urN 2.6.9-rc4-mm1-RT-U9/drivers/char/ipmi/ipmi_watchdog.c 2.6.9-rc4-mm1-U9-E0/drivers/char/ipmi/ipmi_watchdog.c
--- 2.6.9-rc4-mm1-RT-U9/drivers/char/ipmi/ipmi_watchdog.c 2004-10-21 15:47:23.000000000 +0200
+++ 2.6.9-rc4-mm1-U9-E0/drivers/char/ipmi/ipmi_watchdog.c 2004-10-21 15:41:53.000000000 +0200
@@ -386,16 +386,16 @@
when both messages are free. */
static atomic_t heartbeat_tofree = ATOMIC_INIT(0);
static DECLARE_MUTEX(heartbeat_lock);
-static DECLARE_MUTEX(heartbeat_wait_lock);
+static DECLARE_COMPLETION(heartbeat_received);
static void heartbeat_free_smi(struct ipmi_smi_msg *msg)
{
if (atomic_dec_and_test(&heartbeat_tofree))
- up(&heartbeat_wait_lock);
+ complete(&heartbeat_received);
}
static void heartbeat_free_recv(struct ipmi_recv_msg *msg)
{
if (atomic_dec_and_test(&heartbeat_tofree))
- up(&heartbeat_wait_lock);
+ complete(&heartbeat_received);
}
static struct ipmi_smi_msg heartbeat_smi_msg =
{
@@ -473,7 +473,7 @@
}
/* Wait for the heartbeat to be sent. */
- down(&heartbeat_wait_lock);
+ wait_for_completion(&heartbeat_received);
if (heartbeat_recv_msg.msg.data[0] != 0) {
/* Got an error in the heartbeat response. It was already
On Thu, 2004-10-21 at 16:22, Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 15:27, Ingo Molnar wrote:
> > i have released the -U9 Real-Time Preemption patch, which can be
> > downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> >
>
> impi watchdog conversion to completion api.
Sorry, I copied the wrong file to the correct place.
tglx
diff --exclude='*~' -urN 2.6.9-rc4-mm1-RT-U9/drivers/char/ipmi/ipmi_watchdog.c 2.6.9-rc4-mm1-U9-E0/drivers/char/ipmi/ipmi_watchdog.c
--- 2.6.9-rc4-mm1-RT-U9/drivers/char/ipmi/ipmi_watchdog.c 2004-10-21 15:47:23.000000000 +0200
+++ 2.6.9-rc4-mm1-U9-E0/drivers/char/ipmi/ipmi_watchdog.c 2004-10-21 16:25:14.000000000 +0200
@@ -47,6 +47,7 @@
#include <linux/reboot.h>
#include <linux/wait.h>
#include <linux/poll.h>
+#include <linux/completion.h>
#ifdef CONFIG_X86_LOCAL_APIC
#include <asm/apic.h>
#endif
@@ -386,16 +387,16 @@
when both messages are free. */
static atomic_t heartbeat_tofree = ATOMIC_INIT(0);
static DECLARE_MUTEX(heartbeat_lock);
-static DECLARE_MUTEX(heartbeat_wait_lock);
+static DECLARE_COMPLETION(heartbeat_received);
static void heartbeat_free_smi(struct ipmi_smi_msg *msg)
{
if (atomic_dec_and_test(&heartbeat_tofree))
- up(&heartbeat_wait_lock);
+ complete(&heartbeat_received);
}
static void heartbeat_free_recv(struct ipmi_recv_msg *msg)
{
if (atomic_dec_and_test(&heartbeat_tofree))
- up(&heartbeat_wait_lock);
+ complete(&heartbeat_received);
}
static struct ipmi_smi_msg heartbeat_smi_msg =
{
@@ -473,7 +474,7 @@
}
/* Wait for the heartbeat to be sent. */
- down(&heartbeat_wait_lock);
+ wait_for_completion(&heartbeat_received);
if (heartbeat_recv_msg.msg.data[0] != 0) {
/* Got an error in the heartbeat response. It was already
@@ -944,7 +945,6 @@
{
int rv;
- init_MUTEX_LOCKED(&heartbeat_wait_lock);
printk(KERN_INFO PFX "driver version "
IPMI_WATCHDOG_VERSION "\n");
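A rough user-space model of what the conversion above relies on may help: a completion is essentially a counter of "done" events, and the heartbeat code signals it only when the last of the two messages has been freed. This is only a sketch. The toy_* names are made up for illustration, nothing here blocks or sleeps, and it is not the kernel's implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy stand-in for struct completion: only the "done" event count. */
struct toy_completion {
	unsigned int done;
};

/* Like complete(): record one completion event. */
void toy_complete(struct toy_completion *c)
{
	c->done++;
}

/* Non-blocking stand-in for wait_for_completion(): consume one
 * event if there is one, otherwise report that we would sleep. */
bool toy_try_wait(struct toy_completion *c)
{
	if (c->done == 0)
		return false;
	c->done--;
	return true;
}

/* Hypothetical model of the heartbeat state in the patch. */
struct toy_heartbeat {
	int tofree;                     /* messages still to be freed */
	struct toy_completion received; /* signalled by the last free */
};

/* Mirrors heartbeat_free_smi()/heartbeat_free_recv(): decrement,
 * and only the caller that reaches zero signals the waiter. */
void toy_heartbeat_free(struct toy_heartbeat *hb)
{
	assert(hb->tofree > 0);
	if (--hb->tofree == 0)
		toy_complete(&hb->received);
}
```

With tofree set to 2, the first toy_heartbeat_free() leaves the waiter blocked and the second one releases it, which is the behaviour the old up()/down() pair on heartbeat_wait_lock provided. Unlike a semaphore declared as a mutex, the completion starts out "not done", so no init_MUTEX_LOCKED() call is needed.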
>do you have PREEMPT_REALTIME enabled? The above trace is a direct
>interrupt - only the timer interrupt is allowed to execute directly in
>the PREEMPT_REALTIME model - things break badly if it happens for any
>other interrupt (such as the soundcard IRQ).
Yes I have PREEMPT_REALTIME enabled.
The thing that comes to mind is I do have a script that does
echo 0 > '/proc/irq/10/Esoniq AudioPCI/threaded'
as part of ensuring that all the preemption stuff was set right. I may
have run that script prior to getting those messages. I thought you
said before that the non-threaded IRQ stuff was disabled. Perhaps this
interface needs to be disabled as well [unless you really decide to
fix this limitation...].
I was already going into that script to add something like...
for N in 1 3 4 6 8 10 11 12 14 15 ; do
chrt -p -f 99 `pidof "IRQ $N"`
done
to make all the threaded IRQs max-priority RT FIFO tasks. I can
certainly comment out the IRQ thread disable code while I'm at it.
--Mark H Johnson
<mailto:[email protected]>
Esben Nielsen wrote:
>On Thu, 21 Oct 2004, john cooper wrote:
>
>>Mutexes layered on existing semaphores seems convenient
>>at the moment. However a more native mutex mechanism
>>which tracks ownership would provide a basis for PI as
>>well as further instrumentation. This may not be an
>>issue at the present but I don't think it is too far
>>off.
>>
>>-john
>>
>>
>
>Actually you need to have another kind of semaphore based on a new kind of
>wait-queue: Priority based. I.e. the task with the highest priority gets
>woken up first. Then on top of that you build your mutex.
>
That more/less falls out of the PI mechanism. Though it
appears conserving per-mutex data footprint and O(1)
behavior are going to be at odds.
>I was planning to start to look at it and try to see if I could get
>anything to work, but I must admit I haven't got much further than
>just getting Ingo's -U8.1 up and running.
>
I myself wonder whether Ingo is 1 or N people.
>To get a mutex with priority inheritance add an element pointing to the
>current owner and a field where you store the owner's original priority
>which it has to be set back to when it releases the mutex (I am not sure
>how this will work out if someone holds several mutexes!)
>
A task holding several mutexes shouldn't be an issue.
Though per task an ownership list needs to be maintained
in descending priority order such that the effective PI
can be resolved from all task owned mutexes.
Also a multiple ownership model is needed for the case of
shared-reader locks. This results in a list of owners
where the list element can maintain per-mutex task ownership
as well as per-task mutex ownerships.
It is tempting to re-implement the wheel here but existing
works are well on their way:
http://developer.osdl.org/dev/robustmutexes
-john
--
[email protected]
* [email protected] <[email protected]> wrote:
> >do you have PREEMPT_REALTIME enabled? The above trace is a direct
> >interrupt - only the timer interrupt is allowed to execute directly in
> >the PREEMPT_REALTIME model - things break badly if it happens for any
> >other interrupt (such as the soundcard IRQ).
> Yes I have PREEMPT_REALTIME enabled.
>
> The thing that comes to mind is I do have a script that does
> echo 0 > '/proc/irq/10/Esoniq AudioPCI/threaded'
>
> as part of ensuring that all the preemption stuff was set right. I may
> have run that script prior to getting those messages. I thought you
> said before that the non-threaded IRQ stuff was disabled. Perhaps this
> interface needs to be disabled as well [unless you really decide to
> fix this limitation...].
argh, there was a typo in that change so the 'threaded' flags weren't
really disabled. So i ended up only disabling the global hardirq_preempt
flag but not the per-handler 'threaded' flags. Ouch!
I've uploaded the -U9.1 patch that has the fix, does it work any better
than previous kernels?
Ingo
john cooper wrote:
> Esben Nielsen wrote:
>
>> On Thu, 21 Oct 2004, john cooper wrote:
>>
>>> Mutexes layered on existing semaphores seems convenient
>>> at the moment. However a more native mutex mechanism
>>> which tracks ownership would provide a basis for PI as
>>> well as further instrumentation. This may not be an
>>> issue at the present but I don't think it is too far
>>> off.
>>>
>>> -john
>>>
>>>
>>
>> Actually you need to have another kind of semaphore based on a new
>> kind of wait-queue: Priority based. I.e. the task with the highest
>> priority gets woken up first. Then on top of that you build your mutex.
>>
> That more/less falls out of the PI mechanism. Though it
> appears conserving per-mutex data footprint and O(1)
> behavior are going to be at odds.
>
>> I was planning to start to look at it and try to see if I could get
>> anything to work, but I must admit I haven't got much further than
>> just getting Ingo's -U8.1 up and running.
>>
> I myself wonder whether Ingo is 1 or N people.
>
>> To get a mutex with priority inheritance add an element pointing to the
>> current owner and a field where you store the owner's original priority
>> which it has to be set back to when it releases the mutex (I am not sure
>> how this will work out if someone holds several mutexes!)
>>
> A task holding several mutexes shouldn't be an issue.
> Though per task an ownership list needs to be maintained
> in descending priority order such that the effective PI
> can be resolved from all task owned mutexes.
That seems too complex a model, at least for a first step. One possible
trade-off that comes to mind is to track the number of resources
(mutexes) held by a process and to restore the original priority only
when the resource count reaches 0. This is one of the solutions accepted
by RTOS guys.
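The counting scheme described in the paragraph above can be sketched in a few lines. This is a toy model only: the toy_* names and the convention that a larger number means a more urgent priority are assumptions made for illustration, and nothing here is taken from an existing patch.

```c
#include <assert.h>

/* Hypothetical task model; a larger prio value is more urgent here. */
struct toy_task {
	int base_prio;  /* priority the task was originally given */
	int prio;       /* effective priority, possibly boosted */
	int locks_held; /* number of mutexes currently owned */
};

/* The task has just become the owner of one more mutex. */
void toy_lock_acquired(struct toy_task *t)
{
	t->locks_held++;
}

/* A waiter of priority waiter_prio blocks on a mutex owned by t:
 * the owner inherits the priority if it is higher than its own. */
void toy_boost(struct toy_task *t, int waiter_prio)
{
	if (waiter_prio > t->prio)
		t->prio = waiter_prio;
}

/* The task drops one mutex; the original priority is restored
 * only once it holds no mutexes at all. */
void toy_lock_released(struct toy_task *t)
{
	assert(t->locks_held > 0);
	if (--t->locks_held == 0)
		t->prio = t->base_prio;
}
```

The price of the simplicity is visible in the model: a task that was boosted through one mutex keeps the boost until it has released every mutex it holds, even if the contended one was dropped first.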
The other concern with PI is that it should most likely be prohibited
for locks that are used in a way similar to "waiting for completion" -
i.e. if PI is employed on a mutex, a process that is not the owner of
the mutex must not be allowed to release it.
>
> Also a multiple ownership model is needed for the case of
> shared-reader locks. This results in a list of owners
> where the list element can maintain per-mutex task ownership
> as well as per-task mutex ownerships.
> It is tempting to re-implement the wheel here but existing
> works are well on their way:
>
> http://developer.osdl.org/dev/robustmutexes
It is definitely non-trivial work to adapt this approach - there are a
lot of issues.
Eugeny
> -john
>
* Michal Schmidt <[email protected]> wrote:
> Ingo Molnar wrote:
> >
> > - netconsole fixes from Michal Schmidt
> >
>
> The #ifdef is not right. Patch attached.
indeed - i've added this to -U9.2 too.
Ingo
Ingo Molnar wrote:
>i have released the -U9 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this too is a fixes-only release. It includes more driver fixes and
>improvements from Thomas Gleixner.
>
>Changes since -U8.1:
>
> - USB semaphore->completion conversion from Thomas Gleixner
>
> - netconsole fixes from Michal Schmidt
>
> - fbcon fixes
>
> - added counted semaphores, this is now used by firewire, XFS and ACPI.
> This could fix the firewire breakage - but testing would be welcome.
>
> - PREEMPT_ACTIVE irqs-enabled critical path removal.
>
> - fixed irqs-off raw spinlock primitives on UP: they enabled irqs
> before enabling preemption, creating a window for an interrupt to
> slip in and increase the critical path.
>
> - made the deadlock detector not crash the current process - it will
> just hang. This produces far nicer log output while still not
> endangering stability. Also, fixed a bug in the detector that happens
> if the trace buffer overflows.
>
> - made the atomic-counter-underflow detector non-fatal as well, for the
> same reasons.
>
>to create a -U9 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
> + http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U9
>
> Ingo
>-
>To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
>the body of a message to [email protected]
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
>
>
>
Hey,
The kernel booted now with my firewire card plugged in. However, when i
try to mount my reiser4 partition i get the following error:
BUG: semaphore recursion deadlock detected!
.. current task mount/10514 is already holding ccb5bb4c.
c022d6cb cf25d8f0 ccb5bae8 00002912 ccb5bb4c 00000000 ccb5bb48 ccb5a000
ccb5bb4c cf25d8f0 00000000 ccb5bb48 c0344760 ccb5bb4c 0000019d ccb5bb50
ccb5bb50 cf25d8f0 00000002 ccabdc00 ccb5bb4c d0b26a08 ccabdd4c c0120005
Call Trace:
[<c022d6cb>] __rwsem_deadlock+0xd9/0x12d (4)
[<c0344760>] down_write+0x103/0x1a6 (48)
[<d0b26a08>] kcond_wait+0xaa/0xac [reiser4] (36)
[<c0120005>] do_fork+0x133/0x18a (8)
[<c03443a8>] out_of_line_wait_on_bit+0x91/0x99 (32)
[<c0134273>] wake_bit_function+0x0/0x55 (28)
[<c0134c36>] check_preempt_timing+0x6e/0x1a4 (16)
[<c034471a>] down_write+0xbd/0x1a6 (56)
[<c034471a>] down_write+0xbd/0x1a6 (12)
[<d0b280b0>] start_ktxnmgrd+0x98/0x9a [reiser4] (36)
[<d0b33716>] reiser4_fill_super+0x3b/0x71 [reiser4] (28)
[<c0160854>] sb_set_blocksize+0x2e/0x5d (588)
[<c01603fd>] get_sb_bdev+0xf9/0x132 (24)
[<d0b2d569>] reiser4_get_sb+0x2f/0x33 [reiser4] (68)
[<d0b336db>] reiser4_fill_super+0x0/0x71 [reiser4] (20)
[<c016061a>] do_kern_mount+0x4f/0xc0 (4)
[<c0175945>] do_new_mount+0x9c/0xe1 (36)
[<c0175feb>] do_mount+0x145/0x194 (44)
[<c034471a>] down_write+0xbd/0x1a6 (48)
[<c0175e4d>] copy_mount_options+0x63/0xbc (32)
[<c0176459>] sys_mount+0x9f/0xe0 (32)
[<c01060b1>] sysenter_past_esp+0x52/0x71 (44)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x1a1/0x1a6 / (0x0)
.. entry 2: down_write+0x6a/0x1a6 / (0x0)
.. entry 3: print_traces+0x17/0x50 / (0x0)
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:601!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: reiser4 airo_cs airo ohci_hcd ehci_hcd 8139cp mii ohci1394 ieee1394 snd_intel8x0 snd_ac97_codec usbhid uhci_hcd intel_agp agpgart snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd usbcore vfat fat
CPU: 0
EIP: 0060:[<c022dc15>] Not tainted VLI
EFLAGS: 00010202 (2.6.9-rc4-mm1-RT-U9.1)
EIP is at up_write+0x1dc/0x1e9
eax: cca0a000 ebx: ccb5bb48 ecx: 0000002f edx: cf25d8f0
esi: ccb5bb4c edi: ccabdc00 ebp: cf1ad970 esp: cca0bf84
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process ktxnmgrd (pid: 10544, threadinfo=cca0a000 task=cf1ad970)
Stack: 00000001 cf1ad970 ccabdc00 cf1ad970 ccb5bb48 ccabdc00 ccabdc00 cf1ad970
d0b26b9f ccabdc00 cdf87000 ccabdd4c d0b27de5 ccabdc00 c0105fda cf205260
d0b27d4c 00000000 cc9d5400 00000000 00000000 00000000 cca0a000 d0b27d4c
Call Trace:
[<d0b26b9f>] kcond_broadcast+0x23/0x46 [reiser4] (36)
[<d0b27de5>] ktxnmgrd+0x99/0x22c [reiser4] (16)
[<c0105fda>] ret_from_fork+0x6/0x14 (8)
[<d0b27d4c>] ktxnmgrd+0x0/0x22c [reiser4] (8)
[<d0b27d4c>] ktxnmgrd+0x0/0x22c [reiser4] (28)
[<c01042a9>] kernel_thread_helper+0x5/0xb (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: die+0x39/0x198 / (0x0)
.. entry 2: print_traces+0x17/0x50 / (0x0)
* Gunther Persoons <[email protected]> wrote:
> The kernel booted now with my firewire card plugged in. However when i
> try to mount my reiser4 partition i get following error
>
> BUG: semaphore recursion deadlock detected!
> .. current task mount/10514 is already holding ccb5bb4c.
> [<c0344760>] down_write+0x103/0x1a6 (48)
> [<d0b26a08>] kcond_wait+0xaa/0xac [reiser4] (36)
> [<d0b280b0>] start_ktxnmgrd+0x98/0x9a [reiser4] (36)
> [<d0b33716>] reiser4_fill_super+0x3b/0x71 [reiser4] (28)
> [<d0b2d569>] reiser4_get_sb+0x2f/0x33 [reiser4] (68)
> [<c016061a>] do_kern_mount+0x4f/0xc0 (4)
> [<c0175945>] do_new_mount+0x9c/0xe1 (36)
> [<c0175feb>] do_mount+0x145/0x194 (44)
> [<c0176459>] sys_mount+0x9f/0xe0 (32)
> [<c01060b1>] sysenter_past_esp+0x52/0x71 (44)
reiser4 has some pretty ugly locking abstraction called kcond, i took a
look but it doesn't seem simple to convert it. Reiserfs should really use
a normal Linux waitqueue and nothing more...
Ingo
* Thomas Gleixner <[email protected]> wrote:
> + spin_lock_irq(&x->wait.lock);
> if (!timeout)
> goto out;
> - spin_lock_irq(&x->wait.lock);
> - schedule();
> - spin_lock_irq(&x->wait.lock);
> + timeout = schedule_timeout(timeout);
> + spin_lock_irq(&x->wait.lock);
> + if (!timeout)
> + goto out;
yeah. I've added these fixes and uploaded -U9.2.
Ingo
Ingo Molnar wrote:
>
> - netconsole fixes from Michal Schmidt
>
The #ifdef is not right. Patch attached.
Michal
On Wed, 20 Oct 2004, Adam Heath wrote:
> On Wed, 20 Oct 2004, Ingo Molnar wrote:
>
> >
> > i have released the -U8 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>
> Grr. I got this -U8 mail *before* I got the -U7 mail.
>
> All throughout these U# kernels, I get stalls in X (everything pauses, not sure if
> remote pings stop). Nothing shows in dmesg when this occurs.
Got some input on this.
Heavy disk and/or network I/O seems to cause the pauses. Doing a copy over
NFS (writing to disk), or using scp, gives me 1-2 MB/s. If I boot plain rc4, I
get full network speed (10-11 MB/s with NFS, 5-8 MB/s with scp).
Eugeny S. Mints wrote:
> john cooper wrote:
>
>> A task holding several mutexes shouldn't be an issue.
>> Though per task an ownership list needs to be maintained
>> in descending priority order such that the effective PI
>> can be resolved from all task owned mutexes.
>
>
> That seems too complex a model, at least for a first step. One possible
> trade-off that comes to mind is to track the number of resources
> (mutexes) held by a process and to restore the original priority only
> when the resource count reaches 0. This is one of the solutions
> accepted by RTOS guys.
It would seem a mutex ownership list still needs to be maintained.
Doing so in unordered priority will give a small fixed insertion
time, but will require an exhaustive search in order to calculate
maximum priority. Doing so in priority order will require an
average of #mutex_owned / 2 for the insertion, and gives a fixed
time for maximum priority calculation. The latter appears to offer
a performance benefit to the degree the incoming priorities are
random.
> The other concern with PI is that it should most likely be prohibited
> for locks that are used in a way similar to "waiting for completion" -
> i.e. if PI is employed on a mutex, a process that is not the owner of
> the mutex must not be allowed to release it.
Yes, that type of usage breaks the notion of ownership.
It would be an error for a task to attempt relinquishing
a mutex which it had not acquired and was holding at the
time of unlock.
>> http://developer.osdl.org/dev/robustmutexes
>
>
> It is definitely non-trivial work to adapt this approach - there are a
> lot of issues.
I should have qualified that reference. Those folks are
addressing more than PI mutexes. Indeed their goal is
support of fast user mutexes which offer detection of mutex
owners gone astray (exited, killed, etc..). It is the kernel
component of the work to which I was referring.
-john
--
[email protected]
On Thu, Oct 21, 2004 at 12:49:30PM -0400, john cooper wrote:
> It would seem a mutex ownership list still needs to be maintained.
> Doing so in unordered priority will give a small fixed insertion
> time, but will require an exhaustive search in order to calculate
> maximum priority. Doing so in priority order will require an
> average of #mutex_owned / 2 for the insertion, and gives a fixed
> time for maximum priority calculation. The latter appears to offer
> a performance benefit to the degree the incoming priorities are
> random.
If you keep it in priority order, then you're paying the O(n) cost
every time you acquire a lock. If you keep it unordered and only
search it when you need to recalculate a task's priority after a lock
has been released (or priorities have been changed), you pay the cost
much less often. Plus, the number of locks held by any given thread
should generally be very small.
-Scott
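The trade-off discussed in this exchange (ordered list: O(n) insert on every acquire, O(1) maximum; unordered list: O(1) insert, O(n) scan only when a priority has to be recalculated) can be sketched as below. This is an illustration under assumptions: a small fixed-size array stands in for the per-task list, a larger number means a higher priority, and all toy_* names are invented here.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_HELD 8

/* Per-task record of the highest waiter priority on each held mutex. */
struct toy_held {
	int prio[TOY_MAX_HELD];
	size_t n;
};

/* Unordered: constant-time append on every lock acquisition. */
void toy_add_unordered(struct toy_held *h, int prio)
{
	assert(h->n < TOY_MAX_HELD);
	h->prio[h->n++] = prio;
}

/* Unordered: linear scan, needed only when the effective priority
 * must be recalculated (unlock, or a priority change). */
int toy_max_unordered(const struct toy_held *h, int base_prio)
{
	int max = base_prio;
	for (size_t i = 0; i < h->n; i++)
		if (h->prio[i] > max)
			max = h->prio[i];
	return max;
}

/* Ordered (descending): linear-time insertion on every acquisition. */
void toy_add_ordered(struct toy_held *h, int prio)
{
	size_t i = h->n;

	assert(h->n < TOY_MAX_HELD);
	while (i > 0 && h->prio[i - 1] < prio) {
		h->prio[i] = h->prio[i - 1]; /* shift lower entries down */
		i--;
	}
	h->prio[i] = prio;
	h->n++;
}

/* Ordered: the maximum is always the first element. */
int toy_max_ordered(const struct toy_held *h, int base_prio)
{
	return (h->n > 0 && h->prio[0] > base_prio) ? h->prio[0] : base_prio;
}
```

Both variants give the same effective priority. Which one is cheaper depends on how often locks are taken relative to how often the priority has to be recomputed, and with only a handful of held locks the difference is likely to be small either way.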
On Thu, 2004-10-21 at 15:27, Ingo Molnar wrote:
> i have released the -U9 Real-Time Preemption patch, which can be
> downloaded from:
tglx
diff --exclude='*~' -urN 2.6.9-rc4-mm1-RT-U9/kernel/sched.c 2.6.9-rc4-mm1-U9-E0/kernel/sched.c
--- 2.6.9-rc4-mm1-RT-U9/kernel/sched.c	2004-10-21 15:47:21.000000000 +0200
+++ 2.6.9-rc4-mm1-U9-E0/kernel/sched.c	2004-10-21 17:17:44.000000000 +0200
@@ -3185,9 +3185,9 @@
 			__set_current_state(TASK_UNINTERRUPTIBLE);
 			spin_unlock_irq(&x->wait.lock);
 			timeout = schedule_timeout(timeout);
+			spin_lock_irq(&x->wait.lock);
 			if (!timeout)
 				goto out;
-			spin_lock_irq(&x->wait.lock);
 		} while (!x->done);
 		__remove_wait_queue(&x->wait, &wait);
 	}
@@ -3250,8 +3250,10 @@
 			}
 			__set_current_state(TASK_INTERRUPTIBLE);
 			spin_unlock_irq(&x->wait.lock);
-			schedule();
-			spin_lock_irq(&x->wait.lock);
+			timeout = schedule_timeout(timeout);
+			spin_lock_irq(&x->wait.lock);
+			if (!timeout)
+				goto out;
 		} while (!x->done);
 		__remove_wait_queue(&x->wait, &wait);
 	}
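Why the order of the relock and the timeout check matters in the first hunk can be shown with a toy lock that counts unlocks of a lock that is not held. The sketch assumes that the code at the out: label releases x->wait.lock, which the patch implies but does not show. The toy_* names are hypothetical and the stand-in for schedule_timeout() simply pretends the whole timeout elapsed.

```c
#include <assert.h>
#include <stdbool.h>

/* Toy lock that records attempts to release it while not held. */
struct toy_lock {
	bool held;
	int bad_unlocks;
};

void toy_lock_take(struct toy_lock *l)
{
	assert(!l->held); /* the model never takes the lock twice */
	l->held = true;
}

void toy_lock_drop(struct toy_lock *l)
{
	if (!l->held)
		l->bad_unlocks++; /* unlock without holding the lock */
	l->held = false;
}

/* Stand-in for schedule_timeout(): pretend the timeout expired. */
unsigned long toy_schedule_timeout(unsigned long timeout)
{
	(void)timeout;
	return 0;
}

/* One pass through the wait loop, with either ordering. */
unsigned long toy_wait_timeout(struct toy_lock *l, bool relock_before_check,
			       unsigned long timeout)
{
	toy_lock_take(l);
	/* ... the waiter would be queued here ... */
	toy_lock_drop(l); /* sleep without the lock */
	timeout = toy_schedule_timeout(timeout);
	if (relock_before_check)
		toy_lock_take(l); /* ordering after the patch */
	if (!timeout)
		goto out;
	if (!relock_before_check)
		toy_lock_take(l); /* ordering before the patch */
	/* ... the real code loops on x->done here ... */
out:
	toy_lock_drop(l); /* assumed: out: releases the lock */
	return timeout;
}
```

With the old ordering the timeout path reaches out: without the lock and the model records one bad unlock. With the patched ordering every path to out: holds the lock.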
john cooper wrote:
> Eugeny S. Mints wrote:
>
>> john cooper wrote:
>>
>>> A task holding several mutexes shouldn't be an issue.
>>> Though per task an ownership list needs to be maintained
>>> in descending priority order such that the effective PI
>>> can be resolved from all task owned mutexes.
>>
>>
>>
>> That seems too complex a model, at least for a first step. One possible
>> trade-off that comes to mind is to track the number of resources
>> (mutexes) held by a process and to restore the original priority only
>> when the resource count reaches 0. This is one of the solutions
>> accepted by RTOS guys.
>
>
> It would seem a mutex ownership list still needs to be maintained.
> Doing so in unordered priority will give a small fixed insertion
> time, but will require an exhaustive search in order to calculate
> maximum priority. Doing so in priority order will require an
> average of #mutex_owned / 2 for the insertion, and gives a fixed
> time for maximum priority calculation. The latter appears to offer
> a performance benefit to the degree the incoming priorities are
> random.
Yes, definitely, I 100% agree with you - I have _not_ disputed the
priority list approach at all.
>> The other concern with PI is that it should most likely be prohibited
>> for locks that are used in a way similar to "waiting for completion" -
>> i.e. if PI is employed on a mutex, a process that is not the owner of
>> the mutex must not be allowed to release it.
>
>
> Yes, that type of usage breaks the notion of ownership.
> It would be an error for a task to attempt relinquishing
> a mutex which it had not acquired and was holding at the
> time of unlock.
exactly
>>> http://developer.osdl.org/dev/robustmutexes
>>
>>
>>
>> It is definitely non-trivial work to adapt this approach - there are a
>> lot of issues.
>
>
> I should have qualified that reference. Those folks are
> addressing more than PI mutexes. Indeed their goal is
> support of fast user mutexes which offer detection of mutex
> owners gone astray (exited, killed, etc..). It is the kernel
> component of the work to which I was referring.
OK, I was also talking about the kernel component, not the user-space
one. The user-space approach, with robustness, fast uncontended
acquisition, etc., looks very attractive, but both the kernel and
user-space parts have their issues - though I only took a glance at the
project about 2 months ago.
Eugeny
Ingo Molnar writes:
>
> * Gunther Persoons <[email protected]> wrote:
>
> > The kernel booted now with my firewire card plugged in. However when i
> > try to mount my reiser4 partition i get following error
> >
> > BUG: semaphore recursion deadlock detected!
> > .. current task mount/10514 is already holding ccb5bb4c.
>
> > [<c0344760>] down_write+0x103/0x1a6 (48)
> > [<d0b26a08>] kcond_wait+0xaa/0xac [reiser4] (36)
> > [<d0b280b0>] start_ktxnmgrd+0x98/0x9a [reiser4] (36)
> > [<d0b33716>] reiser4_fill_super+0x3b/0x71 [reiser4] (28)
> > [<d0b2d569>] reiser4_get_sb+0x2f/0x33 [reiser4] (68)
> > [<c016061a>] do_kern_mount+0x4f/0xc0 (4)
> > [<c0175945>] do_new_mount+0x9c/0xe1 (36)
> > [<c0175feb>] do_mount+0x145/0x194 (44)
> > [<c0176459>] sys_mount+0x9f/0xe0 (32)
> > [<c01060b1>] sysenter_past_esp+0x52/0x71 (44)
>
> reiser4 has some pretty ugly locking abstraction called kcond, i took a
It's a fairly standard condition variable.
> look but it doesn't seem simple to convert it. Reiserfs should really use
> a normal Linux waitqueue and nothing more...
Why? The condition variable is a very well known and widely used
concept. In their area of applicability (where the predicate whose
change is waited upon is protected by a single lock) condition variables
provide a clean and easily recognizable synchronization device.
The real problem in this case is the failure of "semaphore deadlock
detection" to cope with perfectly legal semaphore usage (down() by
thread T1, up() by thread T2). As one possible solution kcond can be
re-written on top of the beloved "normal Linux waitqueue".
Nikita.
>
> Ingo
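The disagreement above comes down to whether a lock has an owner. A sketch of an owner-tracking lock shows why the pattern "down() by T1, up() by T2" trips a check that assumes ownership: a second acquire by the recorded owner looks like recursion, and a release by anyone else looks like an error. This is a toy model with invented toy_* names and plain integers as task ids. It is not the detector from the -RT patch.

```c
#include <assert.h>
#include <stdbool.h>

#define TOY_NO_OWNER (-1)

/* Lock that remembers which task acquired it. */
struct toy_owned_lock {
	int owner; /* task id, or TOY_NO_OWNER when free */
};

/* What an ownership-based detector would report as recursion:
 * the task asking for the lock is already recorded as its owner. */
bool toy_would_self_deadlock(const struct toy_owned_lock *l, int task)
{
	return l->owner == task;
}

/* Returns true if the lock was free and is now owned by task;
 * false means the caller would have to block. */
bool toy_owned_acquire(struct toy_owned_lock *l, int task)
{
	assert(task != TOY_NO_OWNER);
	if (l->owner != TOY_NO_OWNER)
		return false;
	l->owner = task;
	return true;
}

/* Only the recorded owner may release; anyone else is rejected. */
bool toy_owned_release(struct toy_owned_lock *l, int task)
{
	if (l->owner != task)
		return false;
	l->owner = TOY_NO_OWNER;
	return true;
}
```

A counting semaphore or a completion has no owner field at all, so a hand-off from one thread to another is legal there. That is why signalling-style users are a poor fit for owner-tracking (and priority-inheriting) locks.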
Oct 21 12:33:22 porky kernel: BUG: atomic counter underflow at:
Oct 21 12:33:22 porky kernel: [<c0254dd8>] qdisc_destroy+0x98/0xa0 (12)
Oct 21 12:33:22 porky kernel: [<c0254fed>] dev_shutdown+0x3d/0xa0 (16)
Oct 21 12:33:22 porky kernel: [<c024773b>] unregister_netdevice+0x13b/0x280 (28)
Oct 21 12:33:22 porky netfs: Mounting other filesystems: succeeded
Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (12)
Oct 21 12:33:22 porky kernel: [<e09a6160>] tulip_remove_one+0x0/0xa0 [tulip] (4)
Oct 21 12:33:22 porky kernel: [<c02058de>] unregister_netdev+0x1e/0x30 (24)
Oct 21 12:33:22 porky kernel: [<e09a618f>] tulip_remove_one+0x2f/0xa0 [tulip] (16)
Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (8)
Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (8)
Oct 21 12:33:22 porky kernel: [<c01c4e26>] pci_device_remove+0x76/0x80 (20)
Oct 21 12:33:22 porky kernel: [<c01f573b>] device_detach_shutdown+0xb/0x40 (12)
Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (12)
Oct 21 12:33:22 porky kernel: [<c01f293b>] driver_detach+0x2b/0x40 (24)
Oct 21 12:33:22 porky kernel: [<c01f2daf>] bus_remove_driver+0x3f/0x70 (20)
Oct 21 12:33:22 porky kernel: [<c01f32b9>] driver_unregister+0x19/0x30 (20)
Oct 21 12:33:22 porky kernel: [<c01c50cc>] pci_unregister_driver+0x1c/0x30 (16)
Oct 21 12:33:22 porky kernel: [<e09a7767>] tulip_cleanup+0x17/0x1b [tulip] (16)
Oct 21 12:33:22 porky kernel: [<c0139801>] sys_delete_module+0x121/0x150 (12)
Oct 21 12:33:22 porky kernel: [<c01531a1>] sys_munmap+0x51/0x60 (64)
Oct 21 12:33:22 porky kernel: [<c0116a20>] do_page_fault+0x0/0x660 (16)
Oct 21 12:33:22 porky kernel: [<c0106719>] sysenter_past_esp+0x52/0x71 (16)
Oct 21 12:33:22 porky kernel: preempt count: 00000001
Oct 21 12:33:22 porky kernel: . 1-level deep critical section nesting:
Oct 21 12:33:22 porky kernel: .. entry 1: print_traces+0x1d/0x60 / (dump_stack+0x23/0x30)
Oct 21 12:33:22 porky kernel:
Oct 21 12:33:22 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
Scott Wood wrote:
> On Thu, Oct 21, 2004 at 12:49:30PM -0400, john cooper wrote:
>
>>It would seem a mutex ownership list still needs to be maintained.
>>Doing so in unordered priority will give a small fixed insertion
>>time, but will require an exhaustive search in order to calculate
>>maximum priority. Doing so in priority order will require an
>>average of #mutex_owned / 2 for the insertion, and gives a fixed
>>time for maximum priority calculation. The latter appears to offer
>>a performance benefit to the degree the incoming priorities are
>>random.
>
>
> If you keep it in priority order, then you're paying the O(n) cost
> every time you acquire a lock. If you keep it unordered and only
> search it when you need to recalculate a task's priority after a lock
> has been released (or priorities have been changed), you pay the cost
> much less often. Plus, the number of locks held by any given thread
> should generally be very small.
As to the number of locks held by any given thread: that's not always
small. Take a look at the lock nesting map in the comments of
mm/filemap.c.
Eugeny
Scott Wood wrote:
>On Thu, Oct 21, 2004 at 12:49:30PM -0400, john cooper wrote:
>
>>It would seem a mutex ownership list still needs to be maintained.
>>Doing so in unordered priority will give a small fixed insertion
>>time, but will require an exhaustive search in order to calculate
>>maximum priority. Doing so in priority order will require an
>>average of #mutex_owned / 2 for the insertion, and gives a fixed
>>time for maximum priority calculation. The latter appears to offer
>>a performance benefit to the degree the incoming priorities are
>>random.
>>
>
>If you keep it in priority order, then you're paying the O(n) cost
>every time you acquire a lock. If you keep it unordered and only
>search it when you need to recalculate a task's priority after a lock
>has been released (or priorities have been changed), you pay the cost
>much less often.
>
That's true for the case where the current priority is
somewhere else handy (likely) and we don't need to traverse
the list for other reasons such as allowing/disallowing
recursive acquisition of a mutex by a given task.
>Plus, the number of locks held by any given thread
>should generally be very small.
>
I agree, it is likely in the noise. The list length and
manipulation of the mutex waiter list overshadows this.
-john
--
[email protected]
Ingo Molnar wrote:
>* Gunther Persoons <[email protected]> wrote:
>
>
>
>>The kernel booted now with my firewire card plugged in. However when i
>>try to mount my reiser4 partition i get following error
>>
>>BUG: semaphore recursion deadlock detected!
>>.. current task mount/10514 is already holding ccb5bb4c.
>>
>>
>
>
>
>>[<c0344760>] down_write+0x103/0x1a6 (48)
>>[<d0b26a08>] kcond_wait+0xaa/0xac [reiser4] (36)
>>[<d0b280b0>] start_ktxnmgrd+0x98/0x9a [reiser4] (36)
>>[<d0b33716>] reiser4_fill_super+0x3b/0x71 [reiser4] (28)
>>[<d0b2d569>] reiser4_get_sb+0x2f/0x33 [reiser4] (68)
>>[<c016061a>] do_kern_mount+0x4f/0xc0 (4)
>>[<c0175945>] do_new_mount+0x9c/0xe1 (36)
>>[<c0175feb>] do_mount+0x145/0x194 (44)
>>[<c0176459>] sys_mount+0x9f/0xe0 (32)
>>[<c01060b1>] sysenter_past_esp+0x52/0x71 (44)
>>
>>
>
>reiser4 has some pretty ugly locking abstraction called kcond, i took a
>look but it doesn't seem simple to convert it. Reiserfs should really use
>a normal Linux waitqueue and nothing more...
>
> Ingo
Got a second error: while compiling a new kernel my network connection
went down.
BUG: semaphore recursion deadlock detected!
.. current task eth0/10202 is already holding cc9345ac.
cc9345ac c03433b7 cfe51260 ccb5e000 cc9345ac cd11c740 ccb5ff8c ffffe000
c034490c cc9345ac 000001d9 cc9345b0 cc9345b0 cd11c740 00000002 cc9343e0
ccb5e000 cc934644 ffffe000 d0a7a136 c5fcd280 cc934000 ccb5e000 c0105fda
Call Trace:
[<c03433b7>] __sched_text_start+0x2e3/0x5e2 (8)
[<c034490c>] down_write_interruptible+0x109/0x243 (28)
[<d0a7a136>] airo_thread+0x82/0x2a0 [airo] (44)
[<c0105fda>] ret_from_fork+0x6/0x14 (16)
[<c011d046>] default_wake_function+0x0/0x12 (12)
[<d0a7a0b4>] airo_thread+0x0/0x2a0 [airo] (24)
[<c01042a9>] kernel_thread_helper+0x5/0xb (16)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: down_write_interruptible+0x23e/0x243 / (0x0)
.. entry 2: down_write_interruptible+0x6c/0x243 / (0x0)
.. entry 3: print_traces+0x17/0x50 / (0x0)
On Thu, 21 Oct 2004, Ingo Molnar wrote:
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U9
For those using kernel-package, I just submitted a patch to allow for
uppercase versions.
http://bugs.debian.org/277680
Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 20:07, K.R. Foley wrote:
>
>>Ingo Molnar wrote:
>>
>>>i have released the -U9 Real-Time Preemption patch, which can be
>>>downloaded from:
>>>
>>> http://redhat.com/~mingo/realtime-preempt/
>>>
>>
>>Finally a patch that I can get booted on my older SMP system at home
>>again. More correctly it is U9.2. I have been having problems with these
>>hanging after U5. Haven't had a ton of time to try to track down the
>>problems and didn't want to report problems without having done enough
>>troubleshooting. Anyway, I got this while booting U9.2.
>
>
> I guess, you don't have a tulip network card in your box, as the module
> is removed.
>
> The question is, if it got registered correctly before the removal.
>
> tglx
Actually I do have the tulip card in the box and I am pulling this stuff
from the logs over that connection now. Here are the next lines from the
log that might help.
Oct 21 12:33:22 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
Oct 21 12:33:22 porky kernel: hda: ATAPI 48X CD-ROM CD-R/RW drive, 8192kB Cache, UDMA(33)
Oct 21 12:33:22 porky kernel: Uniform CD-ROM driver Revision: 3.20
Oct 21 12:33:22 porky kernel: hdc: ATAPI 48X CD-ROM drive, 120kB Cache, UDMA(33)
Oct 21 12:33:22 porky kernel: ip_tables: (C) 2000-2002 Netfilter core team
Oct 21 12:33:22 porky kernel: Linux Tulip driver version 1.1.13 (May 11, 2002)
Oct 21 12:33:22 porky kernel: PCI: Found IRQ 5 for device 0000:04:0a.0
Oct 21 12:33:22 porky kernel: PCI: Sharing IRQ 5 with 0000:04:05.1
Oct 21 12:33:22 porky kernel: tulip0: EEPROM default media type Autosense.
Oct 21 12:33:22 porky kernel: tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block.
Oct 21 12:33:22 porky kernel: tulip0: MII transceiver #3 config 3100 status 7809 advertising 01e1.
Oct 21 12:33:22 porky kernel: eth0: Digital DS21140 Tulip rev 32 at 0xe480, 00:00:C0:7F:A0:E9, IRQ 5.
On Thu, 2004-10-21 at 20:57, K.R. Foley wrote:
> > I guess, you don't have a tulip network card in your box, as the module
> > is removed.
> >
> > The question is, if it got registered correctly before the removal.
>
> Actually I do have the tulip card in the box and I am pulling this stuff
> from the logs over that connection now. Here are the next lines from the
> log that might help.
Not really. We must figure out why sys_delete_module is called.
[<e09a7767>] tulip_cleanup+0x17/0x1b [tulip] (16)
[<c0139801>] sys_delete_module+0x121/0x150 (12) <<--------
[<c01531a1>] sys_munmap+0x51/0x60 (64)
[<c0116a20>] do_page_fault+0x0/0x660 (16)
[<c0106719>] sysenter_past_esp+0x52/0x71 (16)
> Oct 21 12:33:22 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
This one is a result of the BUG()
tglx
Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 20:07, K.R. Foley wrote:
>
>>Ingo Molnar wrote:
>>
>>>i have released the -U9 Real-Time Preemption patch, which can be
>>>downloaded from:
>>>
>>> http://redhat.com/~mingo/realtime-preempt/
>>>
>>
>>Finally a patch that I can get booted on my older SMP system at home
>>again. More correctly it is U9.2. I have been having problems with these
>>hanging after U5. Haven't had a ton of time to try to track down the
>>problems and didn't want to report problems without having done enough
>>troubleshooting. Anyway, I got this while booting U9.2.
>
>
> I guess, you don't have a tulip network card in your box, as the module
> is removed.
>
> The question is, if it got registered correctly before the removal.
>
> tglx
Actually I do have the tulip card in the box and I am pulling this stuff
from the logs over that connection now. Here are the next lines from the
log that might help.
Sorry, brainfart before completing that thought. It is not unusual for
the tulip to get loaded, unloaded, loaded again when this system starts
up. Not sure why. It has never really caused me any problems so I
haven't bothered figuring it out.
kr
Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 20:57, K.R. Foley wrote:
>
>>>I guess, you don't have a tulip network card in your box, as the module
>>>is removed.
>>>
>>>The question is, if it got registered correctly before the removal.
>>
>>Actually I do have the tulip card in the box and I am pulling this stuff
>>from the logs over that connection now. Here are the next lines from the
>>log that might help.
>
>
> Not really. We must figure out, why sys_delete_module is called.
Understood. I did not mean the log would help figure it out, but that it
would explain that the card really is present. :)
My guess is that it is prematurely unloading the driver before it's
fully registered, like you said. The question is why? Also I am not in
front of the system right now so I can't unload the module to see if it
generates the error again, but I will try it when I get home.
>
> [<e09a7767>] tulip_cleanup+0x17/0x1b [tulip] (16)
> [<c0139801>] sys_delete_module+0x121/0x150 (12) <<--------
> [<c01531a1>] sys_munmap+0x51/0x60 (64)
> [<c0116a20>] do_page_fault+0x0/0x660 (16)
> [<c0106719>] sysenter_past_esp+0x52/0x71 (16)
>
>
>>Oct 21 12:33:22 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
>
>
> This one is a result of the BUG()
>
> tglx
On Thu, Oct 21, 2004 at 02:09:19PM -0400, john cooper wrote:
> Scott Wood wrote:
> >If you keep it in priority order, then you're paying the O(n) cost
> >every time you acquire a lock.
I partially take this back; depending on how it's implemented, you
can get away with only adding it to the list once contention occurs.
> That's true for the case where the current priority is
> somewhere else handy (likely) and we don't need to traverse
> the list for other reasons such as allowing/disallowing
> recursive acquisition of a mutex by a given task.
How would maintaining priority order make it faster to check for
recursive usage? You'd be looking for a specific mutex rather than
the highest priority blocker. You could also check the per-mutex
list of owners (which you'll need to implement PI on rwlocks), to
avoid needing to add to the locks-held list in non-contended cases.
On uniprocessor, one may wish to turn rwlocks into recursive non-rw
mutexes, where recursion checking would use a single owner field.
Also, keeping it in priority order would introduce yet another place
that assumes a linear priority scheme. At some point, it may be
desirable to implement other schemes, such as maintaining per-CPU
priorities to deal with inheriting from CPU-bound tasks without
introducing said tasks' priorities on other CPUs.
-Scott
On Thu, 2004-10-21 at 20:07, K.R. Foley wrote:
> Ingo Molnar wrote:
> > i have released the -U9 Real-Time Preemption patch, which can be
> > downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> >
>
> Finally a patch that I can get booted on my older SMP system at home
> again. More correctly it is U9.2. I have been having problems with these
> hanging after U5. Haven't had a ton of time to try to track down the
> problems and didn't want to report problems without having done enough
> troubleshooting. Anyway, I got this while booting U9.2.
I guess, you don't have a tulip network card in your box, as the module
is removed.
The question is, if it got registered correctly before the removal.
tglx
> Oct 21 12:33:22 porky kernel: BUG: atomic counter underflow at:
> Oct 21 12:33:22 porky kernel: [<c0254dd8>] qdisc_destroy+0x98/0xa0 (12)
> Oct 21 12:33:22 porky kernel: [<c0254fed>] dev_shutdown+0x3d/0xa0 (16)
> Oct 21 12:33:22 porky kernel: [<c024773b>] unregister_netdevice+0x13b/0x280 (28)
> Oct 21 12:33:22 porky netfs: Mounting other filesystems: succeeded
> Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (12)
> Oct 21 12:33:22 porky kernel: [<e09a6160>] tulip_remove_one+0x0/0xa0 [tulip] (4)
> Oct 21 12:33:22 porky kernel: [<c02058de>] unregister_netdev+0x1e/0x30 (24)
> Oct 21 12:33:22 porky kernel: [<e09a618f>] tulip_remove_one+0x2f/0xa0 [tulip] (16)
> Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (8)
> Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (8)
> Oct 21 12:33:22 porky kernel: [<c01c4e26>] pci_device_remove+0x76/0x80 (20)
> Oct 21 12:33:22 porky kernel: [<c01f573b>] device_detach_shutdown+0xb/0x40 (12)
> Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (12)
> Oct 21 12:33:22 porky kernel: [<c01f293b>] driver_detach+0x2b/0x40 (24)
> Oct 21 12:33:22 porky kernel: [<c01f2daf>] bus_remove_driver+0x3f/0x70 (20)
> Oct 21 12:33:22 porky kernel: [<c01f32b9>] driver_unregister+0x19/0x30 (20)
> Oct 21 12:33:22 porky kernel: [<c01c50cc>] pci_unregister_driver+0x1c/0x30 (16)
> Oct 21 12:33:22 porky kernel: [<e09a7767>] tulip_cleanup+0x17/0x1b [tulip] (16)
> Oct 21 12:33:22 porky kernel: [<c0139801>] sys_delete_module+0x121/0x150 (12)
> Oct 21 12:33:22 porky kernel: [<c01531a1>] sys_munmap+0x51/0x60 (64)
> Oct 21 12:33:22 porky kernel: [<c0116a20>] do_page_fault+0x0/0x660 (16)
> Oct 21 12:33:22 porky kernel: [<c0106719>] sysenter_past_esp+0x52/0x71 (16)
> Oct 21 12:33:22 porky kernel: preempt count: 00000001
> Oct 21 12:33:22 porky kernel: . 1-level deep critical section nesting:
> Oct 21 12:33:22 porky kernel: .. entry 1: print_traces+0x1d/0x60 / (dump_stack+0x23/0x30)
> Oct 21 12:33:22 porky kernel:
> Oct 21 12:33:22 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
On Thu, Oct 21, 2004 at 12:11:03PM +0200, Jens Axboe wrote:
> I didn't look at the USB code, I'm just saying that it's perfectly valid
> use of a semaphore the pattern you describe (process A holding it,
> process B releasing it).
A lot of things like that are perfectly "valid" in the Linux kernel but
are a bit irregular. But the preemption work is about to stress these
things in ways they were never designed for, which is why these patches
are needed. Having a clear use of various locking conventions is key to
getting this system to behave in a predictable manner. Quite simply,
Linux was never targeted to do this and the sloppiness is showing, so
it's got to be removed.
bill
Ingo Molnar wrote:
> i have released the -U9 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
I have a reproducible BUG which I first hit in -U9 and it's in -U9.2
too. It always occurs when a packet hits a REJECT rule in the INPUT
chain of iptables.
To reproduce: iptables -I INPUT 1 -p tcp --dport 666 -j REJECT
...and telnet to your TCP port 666 from the network.
BUG: semaphore recursion deadlock detected!
.. current task ksoftirqd/0/2 is already holding f890a6d4.
c0414ec4 f7c27f64 0000012c 00000001 f7c26000 00000000 f7c27f9c c0121687
c03c8118 c0121758 c0121ae4 0000000a c03c8118 f7c26000 f7c26000 00000000
f7c27fa4 c0121770 f7c27fbc c0121ae4 00000001 fffffff6 f7c26000 f7c25f70
Call Trace:
[<c0121687>] ___do_softirq+0x87/0xd0 (32)
[<c0121758>] _do_softirq+0x8/0x30 (8)
[<c0121ae4>] ksoftirqd+0x94/0xe0 (4)
[<c0121770>] _do_softirq+0x20/0x30 (28)
[<c0121ae4>] ksoftirqd+0x94/0xe0 (8)
[<c013116a>] kthread+0xaa/0xb0 (24)
[<c0121a50>] ksoftirqd+0x0/0xe0 (20)
[<c01310c0>] kthread+0x0/0xb0 (12)
[<c0103319>] kernel_thread_helper+0x5/0xc (16)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: down_write+0x1a4/0x1b0 / (ipt_do_table+0x6a/0x330 [ip_tables])
.. entry 2: down_write+0x72/0x1b0 / (ipt_do_table+0x6a/0x330 [ip_tables])
.. entry 3: print_traces+0x1d/0x90 / (show_stack+0x83/0xa0)
Michal
On Thu, Oct 21 2004, Bill Huey wrote:
> On Thu, Oct 21, 2004 at 12:11:03PM +0200, Jens Axboe wrote:
> > I didn't look at the USB code, I'm just saying that it's perfectly valid
> > use of a semaphore the pattern you describe (process A holding it,
> > process B releasing it).
>
> A lot of things like that are perfectly "valid" in the Linux kernel but
> are a bit irregular. But the preemption work is about to stress these
> things in ways they were never designed for, which is why these patches
> are needed. Having a clear use of various locking conventions is key to
> getting this system to behave in a predictable manner. Quite simply,
> Linux was never targeted to do this and the sloppiness is showing, so
> it's got to be removed.
I have to disagree, I don't think the above use is either convoluted or
sloppy in any way. Now that we have the completion structure, certain
things are surely better implemented as such. But the old use is
perfectly valid and logical, imho.
--
Jens Axboe
On Thu, Oct 21 2004, Bill Huey wrote:
> On Thu, Oct 21, 2004 at 10:14:43PM +0200, Jens Axboe wrote:
> > On Thu, Oct 21 2004, Bill Huey wrote:
> > > A lot of things like that are perfectly "valid" in the Linux kernel but
> > > are a bit irregular. But the preemption work is about to stress these
> > > things in ways they were never designed for, which is why these patches
> > > are needed. Having a clear use of various locking conventions is key to
> > > getting this system to behave in a predictable manner. Quite simply,
> > > Linux was never targeted to do this and the sloppiness is showing, so
> > > it's got to be removed.
> >
> > I have to disagree, I don't think the above use is either convoluted or
> > sloppy in any way. Now that we have the completion structure, certain
> > things are surely better implemented as such. But the old use is
> > perfectly valid and logical, imho.
>
> You use a semaphore to protect data, a completion isn't protecting data
> but preserving a certain kind of wait ordering in the code. The
> possibility of overloading the current mutex_t for PI makes for a conceptual
> mismatch when used in this case since having a kind of priority for
> completions is a bit odd. It's better to flat out use a completion
> instead, IMO.
Linux semaphores (being counted) have always been a fine fit for things
like the loop use, where you get to down it 10 times because you have 10
items pending. I know this isn't the traditional mutex and that it
doesn't protect data as such, but it was never abuse. It isn't overload.
Doing it with a traditional mutex (I'm assuming this is what mutex_t is
in Ingo's tree) would be overload and a bad idea, indeed.
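The counted use described above can be sketched in userspace with a POSIX semaphore. This is only an illustration, not the loop driver's code: sem_post()/sem_wait() stand in for the kernel's up()/down(), and the names queue_items() and handle_pending() are made up for the example.

```c
#include <pthread.h>
#include <semaphore.h>

static sem_t pending;   /* counts items queued but not yet handled */

/* Producer: post the semaphore once per queued item (analogous to up()). */
static void *queue_items(void *arg)
{
    int n = *(int *)arg;

    for (int i = 0; i < n; i++)
        sem_post(&pending);
    return NULL;
}

/* Consumer: one sem_wait() (analogous to down()) per pending item.
 * Returns how many items were handled. */
int handle_pending(int n)
{
    pthread_t producer;
    int handled = 0;

    sem_init(&pending, 0, 0);
    pthread_create(&producer, NULL, queue_items, &n);
    for (int i = 0; i < n; i++) {
        sem_wait(&pending);   /* sleeps until an item is available */
        handled++;
    }
    pthread_join(producer, NULL);
    sem_destroy(&pending);
    return handled;
}
```

Note that the thread doing the "down" never did the matching "up", so there is no single owner a lock detector could check against.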
--
Jens Axboe
On Thu, 2004-10-21 at 22:38, Bill Huey wrote:
> On Thu, Oct 21, 2004 at 10:33:50PM +0200, Jens Axboe wrote:
> > On Thu, Oct 21 2004, Bill Huey wrote:
> > > You use a semaphore to protect data, a completion isn't protecting data
> > > but preserving a certain kind of wait ordering in the code. The
> > > possibility of overloading the current mutex_t for PI makes for a conceptual
> > > mismatch when used in this case since having a kind of priority for
> > > completions is a bit odd. It's better to flat out use a completion
> > > instead, IMO.
> >
> > Linux semaphores (being counted) have always been a fine fit for things
> > like the loop use, where you get to down it 10 times because you have 10
> > items pending. I know this isn't the traditional mutex and that it
> > doesn't protect data as such, but it was never abuse. It isn't overload.
> > Doing it with a traditional mutex (I'm assuming this is what mutex_t is
> > in Ingo's tree) would be overload and a bad idea, indeed.
>
> Well, this is something that's got to be considered by the larger Linux
> community and whether these conventions are to be kept or removed. It's
> a larger issue than what can be addressed in Ingo's preemption patch, but
> with the inevitable need for something like this in the kernel (hard RT)
> it's really an unavoidable collision. IMO, it's got to go, which is a nasty
> change.
>
Hey, let's stop this here.
You are both (in)correct :)
1. It makes no sense to discuss, why X has been considered correct for
time T.
2. Counted semaphores are a valid use and should be marked explicitly as
counted semaphores.
3. Uses of mutexes and semaphores for event and completion signalling
should be converted to the appropriate interfaces.
A bunch of work, but not really hard.
tglx
On Thu, Oct 21, 2004 at 01:38:21PM -0700, Bill Huey wrote:
> On Thu, Oct 21, 2004 at 10:33:50PM +0200, Jens Axboe wrote:
> > Linux semaphores (being counted) have always been a fine fit for things
> > like the loop use, where you get to down it 10 times because you have 10
> > items pending. I know this isn't the traditional mutex and that it
> > doesn't protect data as such, but it was never abuse. It isn't overload.
> > Doing it with a traditional mutex (I'm assuming this is what mutex_t is
> > in Ingo's tree) would be overload and a bad idea, indeed.
>
> Well, this is something that's got to be considered by the larger Linux
> community and whether these conventions are to be kept or removed. It's
> a larger issue than what can be addressed in Ingo's preemption patch, but
> with the inevitable need for something like this in the kernel (hard RT)
> it's really an unavoidable collision. IMO, it's got to go, which is a nasty
> change.
But this is a non-fatal case. I'll see if I can change this logic to not
completely die when this case is detected.
bill
On Thu, Oct 21, 2004 at 10:33:50PM +0200, Jens Axboe wrote:
> On Thu, Oct 21 2004, Bill Huey wrote:
> > You use a semaphore to protect data, a completion isn't protecting data
> > but preserving a certain kind of wait ordering in the code. The
> > possibility of overloading the current mutex_t for PI makes for a conceptual
> > mismatch when used in this case since having a kind of priority for
> > completions is a bit odd. It's better to flat out use a completion
> > instead, IMO.
>
> Linux semaphores (being counted) have always been a fine fit for things
> like the loop use, where you get to down it 10 times because you have 10
> items pending. I know this isn't the traditional mutex and that it
> doesn't protect data as such, but it was never abuse. It isn't overload.
> Doing it with a traditional mutex (I'm assuming this is what mutex_t is
> in Ingo's tree) would be overload and a bad idea, indeed.
Well, this is something that's got to be considered by the larger Linux
community and whether these conventions are to be kept or removed. It's
a larger issue than what can be addressed in Ingo's preemption patch, but
with the inevitable need for something like this in the kernel (hard RT)
it's really an unavoidable collision. IMO, it's got to go, which is a nasty
change.
bill
On Thu, Oct 21, 2004 at 10:14:43PM +0200, Jens Axboe wrote:
> On Thu, Oct 21 2004, Bill Huey wrote:
> > A lot of things like that are perfectly "valid" in the Linux kernel but
> > are a bit irregular. But the preemption work is about to stress these
> > things in ways they were never designed for, which is why these patches
> > are needed. Having a clear use of various locking conventions is key to
> > getting this system to behave in a predictable manner. Quite simply,
> > Linux was never targeted to do this and the sloppiness is showing, so
> > it's got to be removed.
>
> I have to disagree, I don't think the above use is either convoluted or
> sloppy in any way. Now that we have the completion structure, certain
> things are surely better implemented as such. But the old use is
> perfectly valid and logical, imho.
You use a semaphore to protect data, a completion isn't protecting data
but preserving a certain kind of wait ordering in the code. The
possibility of overloading the current mutex_t for PI makes for a conceptual
mismatch when used in this case since having a kind of priority for
completions is a bit odd. It's better to flat out use a completion
instead, IMO.
bill
On Thu, Oct 21, 2004 at 04:18:12PM -0400, john cooper wrote:
> Scott Wood wrote:
> >How would maintaining priority order make it faster to check for
> >recursive usage?
> >
> It wouldn't. My point was an exhaustive traversal may be
> needed for other reasons with an insertion sort being
> near free.
>
> Yet considering the cost to maintain these lists in priority
> order with multiple spinlock acquisition sequences due to how
> the aggregate data structure must be traversed/ordered,
> I haven't yet convinced myself either way.
Another issue is that if you keep them in order, you have to fix the
list whenever an owner of a listed mutex changes its priority.
> >On uniprocessor, one may wish to turn rwlocks into recursive non-rw
> >mutexes, where recursion checking would use a single owner field.
> >
> It isn't obvious to me how this would address the case of a
> task holding a reader lock on mx-A then blocking on mx-B.
> Another task attempting to acquire a reader lock on mx-A would
> block rather than immediately acquiring the lock.
Yes. However, the contention case should not be optimized at the
expense of the common case, which can be faster for non-rwlock
implementations when PI is involved. On SMP, you'd be introducing a
bottleneck by taking away rwlocks, but on UP it's only an issue when
you get preempted or block in a critical section.
There could be problems if some code tries to acquire read locks
out-of-order, believing that it can't deadlock that way (if the
writers don't nest), but that's a problem anyway unless there's a
reasonable way of implementing PI without limiting the number of
concurrent readers (they have to be stored somewhere, and the
alternatives of setting a hard limit on mutexes-per-thread or doing
dynamic allocation inside the lock function are worse).
-Scott
On Thu, Oct 21, 2004 at 11:01:21PM +0200, Esben Nielsen wrote:
> You can implement the full scheduler structure in each mutex. That would
> be O(1) but would take quite a lot of memory.
Are you talking about the thread's locks-held list, or the mutex's
blocked-threads list? In the latter case, you could cut down on the
memory usage by allocating one such structure for each thread upon
creation, placing it into a pool, and associating them with a mutex
when it's contended (as you can't have more contended mutexes than
you have threads). It's still a lot heavier than a linked list,
though.
In either case, you'd need to do whatever adjustments to the
scheduler's data structures are necessary whenever a task's priority
changes (including via PI). It's worth it for at least one of the
lists, as if neither locks-held nor threads-blocked is kept in order,
priority recalculation becomes O(n^2) to find the highest priority
blocker among all held mutexes. Ordering threads-blocked seems to
make more sense, as you don't have the imbalance between the
frequency of adding to the list and of searching the list.
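A minimal sketch of keeping a mutex's threads-blocked list in priority order, to make the trade-off concrete. The struct and function names are hypothetical, and the "lower value = higher priority" convention is an assumption borrowed from the kernel's priority numbering; locking of the list itself is left out.

```c
#include <stddef.h>

/* Hypothetical waiter record; lower prio value means higher priority. */
struct waiter {
    int prio;
    struct waiter *next;
};

/* Insert w so the list stays sorted, highest priority first.
 * Worst case O(n); O(1) when w outranks everything already queued. */
void waiter_insert(struct waiter **head, struct waiter *w)
{
    struct waiter **pp = head;

    while (*pp && (*pp)->prio <= w->prio)   /* FIFO among equal priorities */
        pp = &(*pp)->next;
    w->next = *pp;
    *pp = w;
}

/* The highest-priority blocker is always the head: O(1).
 * Returns -1 when nobody is waiting. */
int top_waiter_prio(const struct waiter *head)
{
    return head ? head->prio : -1;
}
```

With the list ordered, priority recalculation for an owner only has to look at the head of each held mutex's list instead of walking every waiter.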
> On the other hand a sorted list will not take more memory than now but
> will appear to be O(n) when inserting into the list. However, on a UP
> machine no lower priority task will run while a higher priority task runs.
> I.e. you will always add to the beginning of the list. That is assuming
> of course that nobody will sleep while holding a mutex - which is bad
> behaviour.
It's bad, but inevitable if this is also used to provide PI mutexes
to userspace.
> > On uniprocessor, one may wish to turn rwlocks into recursive non-rw
> > mutexes, where recursion checking would use a single owner field.
> >
>
>
> Why not use a simple counter?
You'd need a counter as well, but you first have to check the owner
to make sure that the count reflects the calling thread's activity
rather than some other thread.
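To illustrate the owner-then-count check, here is a deliberately simplified, single-threaded model. It is not thread-safe as written (a real implementation would need atomic updates or an inner spinlock), it returns false where a real lock would block, and the names rec_lock, rec_trylock() and rec_unlock() are invented for the example.

```c
#include <stdbool.h>

#define NO_OWNER (-1)

/* Hypothetical recursive lock: a single owner field plus a nesting count. */
struct rec_lock {
    int owner;        /* id of the holding task, or NO_OWNER */
    unsigned depth;   /* how many times the owner has acquired it */
};

/* Try to acquire for task `tid`; false means contention. */
bool rec_trylock(struct rec_lock *l, int tid)
{
    if (l->owner == NO_OWNER) {
        l->owner = tid;
        l->depth = 1;
        return true;
    }
    if (l->owner == tid) {   /* owner check first: the count only describes the owner */
        l->depth++;
        return true;
    }
    return false;
}

/* Release one nesting level; false if `tid` does not hold the lock. */
bool rec_unlock(struct rec_lock *l, int tid)
{
    if (l->owner != tid || l->depth == 0)
        return false;
    if (--l->depth == 0)
        l->owner = NO_OWNER;
    return true;
}
```

A bare counter without the owner field could not tell a recursive acquisition by the holder apart from an attempt by a different task.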
> The mutex protects its own memory area.
> Anyway, I am not sure recursive acquisition is such a good idea: It will
> make the mutex more expensive to use and promote sloppy coding where you
> don't really know what mutexes are held right now.
I agree, but current code assumes rwlocks can be recursively obtained
for reading (unless that's changed recently).
> I am not sure this makes sense at all on an SMP system. For performance
> reasons I think that one should stick to ordinary semaphores and in a lot
> of places spin-locks on such a system. Even with a dedicated RTOS you have
> to design your system from the bottom up to get good real-time
> behaviour - and it depends a lot on the application of which you can only
> have one! That is very far from the world of SMP servers.
Things get weird on SMP, but it's possible to produce reasonable
real-time behavior, and there are people out there who are interested
in it.
> The ones who can use all the fancy mutexes are really embedded developers
> (like myself) who need a robust RTOS but also drivers for a lot of
> hardware, a good TCP/IP stack, firewall, good filesystems etc. which the
> commercial vendors have a hard time delivering all at once.
And some embedded devices also need a lot of CPU power, and some of
those use SMP.
> On the desktop it can lead to good performance of multimedia but as soon
> as you want to use two applications, which consider themselves multimedia
> and thus give themselves high priority, it won't work.
It can be made to work if you have CPU reservations, or some other
way of ensuring that the applications take their CPU in
appropriately sized chunks.
-Scott
Scott Wood wrote:
>On Thu, Oct 21, 2004 at 04:18:12PM -0400, john cooper wrote:
>
>>Yet considering the cost to maintain these lists in priority
>>order with multiple spinlock acquisition sequences due to how
>>the aggregate data structure must be traversed/ordered,
>>I haven't yet convinced myself either way.
>>
>
>Another issue is that if you keep them in order, you have to fix the
>list whenever an owner of a listed mutex changes its priority.
>
Yes, but my concern was having to backoff in out-of-sequence
spinlock acquisition paths. Looking at it again if the canonical
lock acquisition sequence is a task's mutex-owned list then a
mutex's task-owned list, the nondeterministic backoff (if any)
gets pushed to the case of a waiter blocking on the lock.
>>It isn't obvious to me how this would address the case of a
>>task holding a reader lock on mx-A then blocking on mx-B.
>>Another task attempting to acquire a reader lock on mx-A would
>>block rather than immediately acquiring the lock.
>>
>
>Yes. However, the contention case should not be optimized at the
>expense of the common case, which can be faster for non-rwlock
>implementations when PI is involved. On SMP, you'd be introducing a
>bottleneck by taking away rwlocks, but on UP it's only an issue when
>you get preempted or block in a critical section.
>
My concern is removing what should be available reader
concurrency for the mutex in question. I can't assess
how un/common that may be over all application scenarios.
-john
--
[email protected]
Bill Huey (hui) wrote:
>
>
> You use a semaphore to protect data, a completion isn't protecting data
> but preserving a certain kind of wait ordering in the code. The
> possibility of overloading the current mutex_t for PI makes for a conceptual
> mismatch when used in this case since having a kind of priority for
> completions is a bit odd. It's better to flat out use a completion
> instead, IMO.
>
Could you please define "completion" for me in this context?
Thanks.
On Thu, Oct 21, 2004 at 06:15:00PM -0400, john cooper wrote:
> Yes, but my concern was having to backoff in out-of-sequence
> spinlock acquisition paths.
Out-of-sequence acquisition is a bug, unless the caller uses trylocks
and handles backoff itself.
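A userspace sketch of the trylock-and-backoff pattern, using pthread mutexes as a stand-in for kernel spinlocks. The function name lock_pair_backoff() is hypothetical, and the retry loop is intentionally unbounded, which is exactly the nondeterminism being discussed.

```c
#include <pthread.h>
#include <sched.h>

/* Take `first` then `second` even though this is not the canonical order.
 * `first` is acquired normally; `second` is only tried. On failure
 * everything is released before retrying, so the thread that holds
 * `second` and wants `first` can make progress.
 * Returns the number of backoffs that were needed. */
int lock_pair_backoff(pthread_mutex_t *first, pthread_mutex_t *second)
{
    int backoffs = 0;

    for (;;) {
        pthread_mutex_lock(first);
        if (pthread_mutex_trylock(second) == 0)
            return backoffs;            /* both locks are now held */
        pthread_mutex_unlock(first);    /* back off completely */
        backoffs++;
        sched_yield();
    }
}
```

The caller releases both mutexes when done. Since the number of retries has no upper bound, a path like this is something to keep out of latency-critical code.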
-Scott
Scott Wood wrote:
>On Thu, Oct 21, 2004 at 06:15:00PM -0400, john cooper wrote:
>
>>Yes, but my concern was having to backoff in out-of-sequence
>>spinlock acquisition paths.
>>
>
>Out-of-sequence acquisition is a bug, unless the caller uses trylocks
>and handles backoff itself.
>
Understood -- we may be getting hung up on terminology here.
Rather the issue was whether the nondeterministic out-of-sequence
backoff could be pushed to a noncritical path. I believe so.
It is further likely a backoff would not be needed, as a path
acquiring a mutex's task-owned list lock during a
priority promotion scan shouldn't have reason to acquire any
task's mutex-owned list lock. The latter list would only need
to be locked at time of successful mutex acquisition/free.
-john
--
[email protected]
On Thu, Oct 21, 2004 at 10:43:41PM +0200, Thomas Gleixner wrote:
> Hey, let's stop this here.
>
> You are both (in)correct :)
>
> 1. It makes no sense to discuss, why X has been considered correct for
> time T.
>
> 2. Counted semaphores are a valid use and should be marked explicitly as
> counted semaphores.
>
> 3. Uses of mutexes and semaphores for event and completion signalling
> should be converted to the appropriate interfaces.
>
> A bunch of work, but not really hard.
What's the verdict? Leave the lock detector alone or change it?
bill
On Fri, 2004-10-22 at 00:42, Timothy Miller wrote:
> Bill Huey (hui) wrote:
> > You use a semaphore to protect data, a completion isn't protecting data
> > but preserving a certain kind of wait ordering in the code. The
> > possibility of overloading the current mutex_t for PI makes for a conceptual
> > mismatch when used in this case since having a kind of priority for
> > completions is a bit odd. It's better to flat out use a completion
> > instead, IMO.
> >
>
>
> Could you please define "completion" for me in this context?
A triggers B to exit and must wait until B has exited. It waits for
completion of exit.
A triggers B to execute a command and must wait until B has done so. It
waits for completion of the command.
Mutexes are used for that, but that's not the intended functionality of
a mutex. Of course it works as long as you do no owner checks on the
mutexes.
A {
	init_MUTEX_LOCKED(m)
	trigger B
	down(m)		<----- recursion, because A owns it already
}
The completion is designed for that and should be used IMHO. Mutexes were
used for that, because the ancestors of completion, sleep_on...(), are
racy.
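For readers who want to see the shape of it, here is a userspace sketch of the "A triggers B and waits" pattern built from a pthread mutex and condition variable. It only mimics the idea behind the kernel's init_completion()/complete()/wait_for_completion(); the names completion_signal(), completion_wait(), struct job and run_command() are made up for this example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Userspace stand-in for a completion: a flag guarded by mutex + condvar. */
struct completion {
    pthread_mutex_t lock;
    pthread_cond_t  cond;
    bool            done;
};

void completion_init(struct completion *c)
{
    pthread_mutex_init(&c->lock, NULL);
    pthread_cond_init(&c->cond, NULL);
    c->done = false;
}

/* Called by B once it has finished its work. */
void completion_signal(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    c->done = true;
    pthread_cond_signal(&c->cond);
    pthread_mutex_unlock(&c->lock);
}

/* Called by A; returns once B has signalled, even if B finished first. */
void completion_wait(struct completion *c)
{
    pthread_mutex_lock(&c->lock);
    while (!c->done)
        pthread_cond_wait(&c->cond, &c->lock);
    pthread_mutex_unlock(&c->lock);
}

struct job {
    struct completion done;
    int result;
};

static void *task_b(void *arg)
{
    struct job *j = arg;

    j->result = 42;               /* "execute the command" */
    completion_signal(&j->done);
    return NULL;
}

/* A: trigger B, then wait for completion of the command. */
int run_command(void)
{
    struct job j;
    pthread_t b;

    completion_init(&j.done);
    j.result = 0;
    pthread_create(&b, NULL, task_b, &j);
    completion_wait(&j.done);
    pthread_join(b, NULL);
    return j.result;
}
```

The done flag closes the lost-wakeup race that a bare sleep/wakeup pair has, and since the waiter never "owns" anything, an owner check has nothing to trip over.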
tglx
Esben Nielsen wrote:
>Anyway, I am not sure recursive acquisition is such a good idea: It will
>make the mutex more expensive to use and promote sloppy coding where you
>don't really know what mutexes are held right now.
>
In my experience I haven't quite found this to be the case.
Rather allowing a lock/mutex/etc.. to be recursively acquired
within a given path can greatly simplify APIs. Such as
cases where a primitive acquires a lock internally which is
also known by and suited to protect data in the caller's context.
This avoids the "who has the lock?" information which must
otherwise get stuffed into the API. Nested function calls
simply observe correct lock usage at their respective levels
and all unwinds correctly.
The one notable place where this doesn't work is where
locks must be conditionally acquired in an out-of-order sequence.
However often these cases can be pushed out of the externally
visible API. Another operational consideration is maintaining
debug information in the lock. Keeping track of where each
lock acquisition was made is at odds with a fixed sized lock
data footprint. Still recursive acquisitions scale only
with available stack space so a fair upper limit approximation
is possible when running on fixed size stacks. Note I'm
referring to single-owner spinlocks/mutexes here rather than
reader/writer locks.
>I think it is better that the sloppy coder discovers that he
>deadlocks on himself before getting other sub-systems involved :-)
>
I agree. But I don't think the above precludes doing so.
Detecting violations of unbalanced lock/unlock calls isn't
really different than for non-recursive primitives.
-john
--
[email protected]
Esben Nielsen
Work:
Cotas Computer Technology A/S
Paludan Mullersvej 82
8200 Aarhus N
Private
Moellegade 7A, 3., 4
8000 Aarhus C
Phone:
+45 86 12 73 79
Mobile:
+45 27 13 10 05
On Thu, 21 Oct 2004, Scott Wood wrote:
> On Thu, Oct 21, 2004 at 02:09:19PM -0400, john cooper wrote:
> > Scott Wood wrote:
> > >If you keep it in priority order, then you're paying the O(n) cost
> > >every time you acquire a lock.
>
> I partially take this back; depending on how it's implemented, you
> can get away with only adding it to the list once contention occurs.
>
You can implement the full scheduler structure in each mutex. That would
be O(1) but would take quite a lot of memory.
On the other hand a sorted list will not take more memory than now but
will appear to be O(n) when inserting into the list. However, on a UP
machine no lower priority task will run while a higher priority task runs.
I.e. you will always add to the beginning of the list. That is assuming
of course that nobody will sleep while holding a mutex - which is bad
behaviour. And if you try to take another mutex, while holding one already
and you have to block, the holder of the second mutex will be moved up to
your priority. I.e. it will block the CPU for any lower priority task...
I don't know what it would be on SMP systems though...
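The "moved up to your priority" step can be sketched as a walk along the blocking chain. This is a toy model under stated assumptions: struct task, struct pi_mutex and pi_block_on() are hypothetical names, a lower prio value means higher priority, there is no locking, and restoring the original priority on unlock is left out.

```c
#include <stddef.h>

struct pi_mutex;

/* Hypothetical task; lower prio value means higher priority. */
struct task {
    int prio;                      /* effective priority */
    struct pi_mutex *blocked_on;   /* mutex this task waits for, or NULL */
};

struct pi_mutex {
    struct task *owner;
};

/* Task t blocks on mutex m: lend t's priority to m's owner, and further
 * down the chain if that owner is itself blocked on another mutex.
 * Returns the number of tasks whose priority was raised. */
int pi_block_on(struct task *t, struct pi_mutex *m)
{
    int boosted = 0;

    t->blocked_on = m;
    while (m && m->owner && m->owner->prio > t->prio) {
        m->owner->prio = t->prio;
        boosted++;
        m = m->owner->blocked_on;
    }
    return boosted;
}
```

The walk stops as soon as it meets an owner that already runs at the blocker's priority or higher, so it also terminates if the chain loops back on itself.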
> > That's true for the case where the current priority is
> > somewhere else handy (likely) and we don't need to traverse
> > the list for other reasons such as allowing/disallowing
> > recursive acquisition of a mutex by a given task.
>
> How would maintaining priority order make it faster to check for
> recursive usage? You'd be looking for a specific mutex rather than
> the highest priority blocker. You could also check the per-mutex
> list of owners (which you'll need to implement PI on rwlocks), to
> avoid needing to add to the locks-held list in non-contended cases.
>
> On uniprocessor, one may wish to turn rwlocks into recursive non-rw
> mutexes, where recursion checking would use a single owner field.
>
Why not use a simple counter? The mutex protects its own memory area.
Anyway, I am not sure recursive acquisition is such a good idea: It will
make the mutex more expensive to use and promote sloppy coding where you
don't really know what mutexes are held right now. When you are building a
subsystem you can send around a flag in the argument saying whether you
have taken the lock or not. If you call into other systems, where you
can't add a parameter to each function, you should release your mutex(es)
anyway! I think it is better that the sloppy coder discovers that he
deadlocks on himself before getting other sub-systems involved :-)
> Also, keeping it in priority order would introduce yet another place
> that assumes a linear priority scheme. At some point, it may be
> desirable to implement other schemes, such as maintaining per-CPU
> priorities to deal with inheriting from CPU-bound tasks without
> introducing said tasks' priorities on other CPUs.
>
I am not sure this makes sense at all on an SMP system. For performance
reasons I think that one should stick to ordinary semaphores and in a lot
of places spin-locks on such a system. Even with a dedicated RTOS you have
to design your system from the bottom up to get good real-time
behaviour - and it depends a lot on the application of which you can only
have one! That is very far from the world of SMP servers.
The ones who can use all the fancy mutexes are really embedded developers
(like myself) who need a robust RTOS but also drivers for a lot of
hardware, a good TCP/IP stack, firewall, good filesystems etc. which the
commercial vendors have a hard time delivering all at once.
On the desktop it can lead to good performance of multimedia but as soon
as you want to use two applications, which consider themselves multimedia
and thus give themselves high priority, it won't work.
I must also say that from my perspective low latency is not the issue.
The issue is predictability: I must be able to create threads and know
they can't all of a sudden be preempted by all kinds of things. I.e. if I
give them higher priority than all the "normal" stuff and all shared
resources between my tasks and the "normal" stuff are locked with mutexes
using priority inheritance and only for a fixed amount of time, I am in the
clear. It is also important that all interrupts and spin-locks are only
held for a fixed amount of time - but as long as that time is lower than
the maximum latency and is bounded in occurrence I don't really
care. For instance a driver servicing a serial channel won't hurt being run
in interrupt context as it is really limited how much CPU it can take
overall.
> -Scott
Esben
>i have released the -U9 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
This looks like a keeper. All results are with U9.2 (though it says
U9.1 in the system logs...).
I will send a separate message with details, but here is a summary of
the testing I have done so far.
[1] Boot up and init scripts.
Still get the BUG message after telinit 3
Oct 21 13:18:42 localhost kernel: eth0: RealTek RTL8139 at 0xdc00, 00:50:bf:39:11:fc, IRQ 11
Oct 21 13:18:42 localhost kernel: BUG: atomic counter underflow at:
...
but this is no longer fatal to system start up nor do I get any
further error messages related to the network card.
telinit 5 is successful without any problems.
[2] Initial audio / real time settings
Audio test was OK.
The script I use to set up the real time environment now complains that
it cannot unthread the RTC or audio (as expected), but this doesn't appear
to cause any problems.
[3] Real time test
First time running my test (latencytest plus heavy non real time
operations), I got some nasty sounds from the speakers plus the
display / mouse locked up but unlocked when I was able to Ctrl-C
the job some time later. Rebuilt latencytest with a lower real time
priority to recover.
Second time, the test ran pretty much flawlessly. My background
job collecting latency trace data had only 4 hits in about 25-30
minutes of testing (> 200 usec). The summary data looked OK but
when I looked at the charts later the data is similar to what I
saw with -T3 with a few exceptions:
[1] X11 stress with T3 had no significant CPU delays (> 100 usec)
with more "little" delays than U9. U9 had about 0.3% of the samples
with over 200 usec delays for the CPU task.
[2] proc stress had similar results with U9 having 0.19% samples
over 100 usec delay and 0.04% samples over 200 usec delay.
[3] Both network tests were great with both kernels (T3 and U9).
U9 was slightly better with 100% samples within 100 usec of nominal
value and T3 with 0.01% and 0.04% 100 usec delay samples on network
output / input respectively.
[4] Disk write test was a little odd on U9, the first 30 seconds
had some long delays (max 630 usec) but then it settled down and
no long delays the remaining time period (total 5 minutes).
[5] Disk copy test had a 5 second burp at the start (max 380 usec)
but it settled down as well.
[6] Disk read test results were worse on U9 than T3. Both the
maximum delay (1.7 msec - over 100% overhead) and percentage of
longer delays (1.72% on U9, 0.22% on T3) were worse on U9. I
checked again, DMA mode was set to udma2 to prevent the long
latencies I saw in previous tests.
[4] Latency trace results.
Only a handful of traces during the test. Numbers below refer
to the file numbers (lt.xx).
00 - 229 usec, appears to be nmi during kmap_atomic; likely sampling
overhead.
01 - 67569 usec, this and next one appear to be that long delay where
the system gets "stuck" and finally finishes the work several clock
ticks later. Notice - this did not affect the application level
timing collection. The start was in flush_tlb_all (remove_vm_area)
ending at do_flush_tlb_all (flush_tlb_all). Several cycles of a
sequence like this were in between...
00000003 64.462ms (+0.137ms): __do_softirq (do_softirq)
00010002 64.599ms (+0.000ms): do_nmi (smp_call_function)
00010002 64.600ms (+0.000ms): do_nmi (check_preempt_timing)
00010002 64.600ms (+0.854ms): do_nmi (<00000046>)
00000002 65.454ms (+0.000ms): smp_apic_timer_interrupt (smp_call_function)
00010002 65.455ms (+0.000ms): profile_hook (profile_tick)
00010002 65.455ms (+0.000ms): _read_lock (profile_hook)
00010003 65.455ms (+0.000ms): _read_unlock (profile_tick)
00010002 65.456ms (+0.000ms): update_process_times (smp_apic_timer_interrupt)
00010002 65.456ms (+0.000ms): update_one_process (update_process_times)
00010002 65.457ms (+0.000ms): run_local_timers (update_process_times)
00010002 65.457ms (+0.000ms): raise_softirq (update_process_times)
00010002 65.457ms (+0.000ms): scheduler_tick (update_process_times)
00010002 65.458ms (+0.000ms): sched_clock (scheduler_tick)
00010002 65.458ms (+0.000ms): _spin_lock (scheduler_tick)
00010003 65.459ms (+0.000ms): _spin_unlock (scheduler_tick)
00010002 65.459ms (+0.000ms): rebalance_tick (scheduler_tick)
00010002 65.460ms (+0.000ms): irq_exit (smp_apic_timer_interrupt)
00000003 65.460ms (+0.000ms): do_softirq (irq_exit)
02 - 1856 usec, same symptom as 01 but shorter duration.
03 - 224 usec, total trace follows...
preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U9.1
-------------------------------------------------------
latency: 224 us, entries: 7 (7) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: cat/7650, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80 <c0314e0f>
=> ended at: _spin_unlock_irqrestore+0x20/0x50 <c0315200>
=======>
00000001 0.000ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000001 0.000ms (+0.173ms): avc_insert (avc_has_perm_noaudit)
00010001 0.173ms (+0.000ms): do_nmi (avc_insert)
00010001 0.174ms (+0.000ms): do_nmi (cycles_to_usecs)
00010001 0.174ms (+0.047ms): do_nmi (<00000046>)
00000001 0.222ms (+0.000ms): memcpy (avc_has_perm_noaudit)
00000001 0.223ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
That's it. Overall a very good performance for this kernel. I will
send the system log and full traces separately.
--Mark
Scott Wood wrote:
>On Thu, Oct 21, 2004 at 02:09:19PM -0400, john cooper wrote:
>
>>That's true for the case where the current priority is
>>somewhere else handy (likely) and we don't need to traverse
>>the list for other reasons such as allowing/disallowing
>>recursive acquisition of a mutex by a given task.
>>
>
>How would maintaining priority order make it faster to check for
>recursive usage?
>
It wouldn't. My point was an exhaustive traversal may be
needed for other reasons with an insertion sort being
near free.
Yet considering the cost to maintain these lists in priority
order with multiple spinlock acquisition sequences due to how
the aggregate data structure must be traversed/ordered,
I haven't yet convinced myself either way.
>On uniprocessor, one may wish to turn rwlocks into recursive non-rw
>mutexes, where recursion checking would use a single owner field.
>
It isn't obvious to me how this would address the case of a
task holding a reader lock on mx-A then blocking on mx-B.
Another task attempting to acquire a reader lock on mx-A would
block rather than immediately acquiring the lock.
-john
--
[email protected]
On Thu, Oct 21, 2004 at 07:47:29PM +0400, Eugeny S. Mints wrote:
> Seems it is too complex a model, at least for the first step. One of the
> possible trade-offs that comes to mind is to track the number of resources
> (mutexes) held by a process and to restore the original priority only when
> the resource count reaches 0. This is one of the solutions accepted by RTOS
> guys.
That complicates analysis, though, since you now have to look at all
critical sections that the shared-with-high-priority-threads critical
sections nest in. IMHO, it's important that the inherited priority
be given up as soon as the resource is released.
-Scott
On Thu, Oct 21, 2004 at 10:10:17PM +0400, Eugeny S. Mints wrote:
> Scott Wood wrote:
> >If you keep it in priority order, then you're paying the O(n) cost
> >every time you acquire a lock. If you keep it unordered and only
> >search it when you need to recalculate a task's priority after a lock
> >has been released (or priorities have been changed), you pay the cost
> >much less often. Plus, the number of locks held by any given thread
> >should generally be very small.
> As to locks held by any given thread - it's not always true - take a
> look at mm/filemap.c locks nesting map in comments.
I guess it depends on the definition of "very small" and "generally".
:-)
A nesting of 5 locks is pushing the limits of "very small", but it's
still no big deal to iterate over once in a while.
-Scott
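
A rough sketch of the unordered held-locks approach discussed in this
exchange: acquiring a lock is an O(1) append, and the O(n) scan is paid only
when a lock is released. This is a single-threaded model of the bookkeeping
only, not code from the patch. All names (pi_task, pi_lock, MAX_HELD) are
hypothetical, larger numbers are assumed to mean higher priority, and a
released lock is assumed to hand all of its waiters to the next owner:

```c
#include <assert.h>
#include <stddef.h>

#define MAX_HELD 8	/* hypothetical cap; nesting is assumed to be shallow */

struct pi_lock {
	int top_waiter_prio;	/* highest priority blocked here, -1 if none */
};

struct pi_task {
	int base_prio;			/* priority without inheritance */
	int prio;			/* effective (possibly inherited) priority */
	struct pi_lock *held[MAX_HELD];	/* unordered list of held locks */
	size_t nheld;
};

/* O(1): append to the unordered list. */
static int pi_acquire(struct pi_task *t, struct pi_lock *l)
{
	if (t->nheld == MAX_HELD)
		return -1;
	t->held[t->nheld++] = l;
	return 0;
}

/* A waiter of priority prio blocks on l, which owner currently holds. */
static void pi_block(struct pi_task *owner, struct pi_lock *l, int prio)
{
	if (prio > l->top_waiter_prio)
		l->top_waiter_prio = prio;
	if (prio > owner->prio)
		owner->prio = prio;
}

/* O(n) scan over the locks still held; run only on release. */
static void pi_recalc(struct pi_task *t)
{
	int prio = t->base_prio;

	for (size_t i = 0; i < t->nheld; i++)
		if (t->held[i]->top_waiter_prio > prio)
			prio = t->held[i]->top_waiter_prio;
	t->prio = prio;
}

/* Drop l and give up its inherited priority immediately. */
static void pi_release(struct pi_task *t, struct pi_lock *l)
{
	for (size_t i = 0; i < t->nheld; i++) {
		if (t->held[i] == l) {
			t->held[i] = t->held[--t->nheld];	/* swap-remove */
			break;
		}
	}
	l->top_waiter_prio = -1;
	pi_recalc(t);
}
```

In this model a task holding two contended locks falls back to the priority
of the remaining lock's top waiter as soon as the first lock is released,
rather than keeping the boost until its held-lock count reaches zero.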
On Tue, 2004-10-19 at 14:52, Adam Heath wrote:
> On Tue, 19 Oct 2004, Ingo Molnar wrote:
>
> >
> > * Adam Heath <[email protected]> wrote:
> >
> > > I am still having the same bug(repeatable by running liquidwar) as I
> > > reported with -U5(see my earlier email).
> >
> > ok, this seems to be some questionable code in OSS. It really has no
> > business up()-ing the inode semaphore - nobody down()-ed it before! This
> > could be either a bad workaround for a bug/hang someone saw, or an old
> > VFS assumption that doesnt hold anymore. In any case, could you try the
> > patch below, does it fix liquidwar?
>
> Yup, the below fixes it. However, this problem *only* started occurring in
> -U5. I've been running liquidwar on all versions(it's my current
> game-to-play-when-I-feel-stupid program).
The real fix is to use ALSA. :-)
Lee
Ingo Molnar wrote:
> i have released the -U6 Real-Time Preemption patch:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U6
>
> [...]
> - deadlock bug: fix networking deadlock reported by Matthew L Foster.
> Restructured the way the RT-RCU locking of ptype_lock is done - it's
> cleaner and more obvious now (besides being correct). This could also
> fix the deadlock reported by Michal Schmidt.
Yes, this fixed the deadlock for me.
I'm now going to try -U7.
Michal
i have released the -U7 Real-Time Preemption patch:
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
this too is a fixes-only release.
Changes since -U6:
- crash fix: turn off 4K stacks when using RWSEM_DEADLOCK_DETECTION,
and tune down the default max # of tasks traced per semaphore. This
increases process-stack size and reduces the footprint of lock
objects. This should fix the bootup crash reported by Rui Nuno
Capela.
- assert fix: fixed an ide-taskfile scheduling-with-irqs-off assert
that Rui's .config triggers.
- assert workaround: disabled PARPORT_1284 for now, this should fix the
assert seen by Mark H Johnson.
- NFS fix: clnt.c fix from Thomas Gleixner
- debugging helper: print stackframe-size in backtraces.
- large-stackframe fix: inflate.c fix
to create a -U7 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
Ingo
On Tue, 2004-10-19 at 18:44, Thomas Gleixner wrote:
> On Tue, 2004-10-19 at 17:57, Ingo Molnar wrote:
The laptop boots now. The culprits, which break the boot are:
pci-hotplug and firewire drivers.
agp loads correctly.
No deeper insight yet.
The IPV6 code triggers the irqs_disabled() check in schedule. dmesg
output attached.
tglx
Ingo Molnar wrote:
> * [email protected] <[email protected]> wrote:
>
>
>>Booted to single user and was able to get some network operations
>>going with this version (w/ previously mentioned update). However, at
>>the step where I start CUPS, I got a number of traces on the display
>>referring to parport_pc related function calls [but I don't use a
>>parallel printer...]. It ended with:
>
>
> thanks for the logs - there are some semaphore assumptions in
> ieee1284.c, it should use completions & wait_for_completion_timeout()
> too. The workaround is to disable CONFIG_PARPORT_1284. (or
> CONFIG_PARPORT altogether.)
>
> Ingo
>
If only he had been a little faster getting this in, I wouldn't have sat
and written the entire screen full down on paper. :) Mine wasn't making
it to the logs. At least he saved me from typing it all back in. Thanks
Mark. Disabling PARPORT fixed my problem as well.
kr
* Florian Schmidt <[email protected]> wrote:
> I don't get any oopses or panics, but i can observe a rather
> interesting behaviour. When i enable the latency traces via
>
> echo 1 > /proc/sys/kernel/trace_enabled
>
> my machine starts to make little pauses of ca 3-4 secs. X "hangs" for
> this duration and so does aplay when playing a .wav file. "hangs"
> means that the X display seems to be locked. Interestingly enough all
> keystrokes i entered during the "hang" seem to arrive fine after the
> hang has ended. aplay experiences an xrun.
do you get the same pauses if you do 'dmesg -n 1'? Also, are you using
preempt_thresh or the maximum-searching variant? preempt_thresh can
generate _tons_ of messages with a low threshold, freezing the system in
essence for long periods of time.
but this trace is weird:
> preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U6
> -------------------------------------------------------
> latency: 1841 us, entries: 4000 (12990) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
> -----------------
> | task: aplay/2160, uid:1000 nice:0 policy:0 rt_prio:0
> -----------------
> => started at: __schedule+0x3b/0x5d0 <c02a767b>
> => ended at: finish_task_switch+0x43/0xb0 <c0114ae3>
> =======>
> 00000001 0.000ms (+0.000ms): __schedule (ksoftirqd)
> 00000001 0.000ms (+0.000ms): sched_clock (__schedule)
> 00000002 0.000ms (+0.000ms): deactivate_task (__schedule)
> 00000002 0.000ms (+0.000ms): dequeue_task (deactivate_task)
> 04000002 0.000ms (+0.000ms): __switch_to (__schedule)
> 04000002 0.001ms (+0.000ms): finish_task_switch (__schedule)
> 04000000 0.001ms (+0.000ms): schedule (down_write)
> 04000000 0.001ms (+0.000ms): __schedule (down_write)
> 04000001 0.001ms (+0.000ms): sched_clock (__schedule)
> 04000000 0.001ms (+0.000ms): schedule (down_write)
> 04000000 0.001ms (+0.000ms): __schedule (down_write)
> 04000001 0.002ms (+0.000ms): sched_clock (__schedule)
> 04000000 0.002ms (+0.000ms): schedule (down_write)
this doesnt seem like normal behavior. It seems two tasks are
ping-pong-ing a semaphore but are unable to make any progress. The whole
thing is non-preemptible because this semaphore was taken while in a
PREEMPT_ACTIVE section.
(i'd say this is the BKL semaphore - it is quite special in that
regard.)
Ingo
On Thu, Oct 21 2004, Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 22:38, Bill Huey wrote:
> > On Thu, Oct 21, 2004 at 10:33:50PM +0200, Jens Axboe wrote:
> > > On Thu, Oct 21 2004, Bill Huey wrote:
> > > > You use a semaphore to protect data, a completion isn't protecting data
> > > > but preserving a certain kind of wait ordering in the code. The
> > > > possibility of overloading the current mutex_t for PI makes for a conceptual
> > > > mismatch when used in this case since having a kind of priority for
> > > > completions is a bit odd. It's better to flat out use a completion
> > > > instead, IMO.
> > >
> > > Linux semaphores (being counted) have always been a fine fit for things
> > > like the loop use, where you get to down it 10 times because you have 10
> > > items pending. I know this isn't the traditional mutex and that it
> > > doesn't protect data as such, but it was never abuse. It isn't overload.
> > > Doing it with a traditional mutex (I'm assuming this is what mutex_t is
> > > in Ingos tree) would be overload and a bad idea, indeed.
> >
> > Well, this is something that's got to be considered by the larger Linux
> > community and whether these conventions are to be kept or removed. It's
> > a larger issue than what can be addressed in Ingo's preemption patch, but
> > with inevitable need for something like this in the kernel (hard RT)
> > it's really unavoidable collision. IMO, it's got to go, which is a nasty
> > change.
> >
>
> Hey, let's stop this here.
>
> You are both (in)correct :)
>
> 1. It makes no sense to discuss, why X has been considered correct for
> time T.
Because it is correct. Debating that it's now incorrect because it
inconveniently happens to make some detection scheme harder is silly.
> 2. Counted semaphores are a valid use and should be marked explicit as
> counted semaphores.
Indeed
> 3. Using mutexes and semaphores for event and completion signalling
> should be converted to the appropriate interfaces.
Agree. Do you test all your conversions? Whole-sale conversions like
this tend to break at least some of the drivers. And that's totally
unacceptable, breaking a working solution because of something that's
not really a bug.
> A bunch of work, but not really hard.
Not if you don't test it.
--
Jens Axboe
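
To illustrate the counted-semaphore use described above (one down per pending
item), here is a small userspace sketch using POSIX semaphores as a stand-in
for the kernel's up()/down(). The work_queue type and the wq_* helpers are
hypothetical, and sem_trywait() is used instead of a blocking wait so the
example can run in a single thread:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <semaphore.h>

/* Hypothetical queue: the semaphore counts pending items, it is not a lock. */
struct work_queue {
	sem_t pending;
};

static int wq_init(struct work_queue *q)
{
	return sem_init(&q->pending, 0, 0);	/* start with nothing pending */
}

/* Producer side: one "up" per queued item. */
static int wq_submit(struct work_queue *q)
{
	return sem_post(&q->pending);
}

/*
 * Consumer side: one "down" per item. Returns 1 if an item was claimed,
 * 0 if the count was already zero.
 */
static int wq_try_take(struct work_queue *q)
{
	return sem_trywait(&q->pending) == 0;
}
```

The count has no owner, which is why the thread treats this use as distinct
from a mutex: whoever posts and whoever waits are normally different tasks.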
On Thu, Oct 21 2004, Bill Huey wrote:
> On Thu, Oct 21, 2004 at 10:33:50PM +0200, Jens Axboe wrote:
> > On Thu, Oct 21 2004, Bill Huey wrote:
> > > You use a semaphore to protect data, a completion isn't protecting data
> > > but preserving a certain kind of wait ordering in the code. The
> > > possibility of overloading the current mutex_t for PI makes for a conceptual
> > > mismatch when used in this case since having a kind of priority for
> > > completions is a bit odd. It's better to flat out use a completion
> > > instead, IMO.
> >
> > Linux semaphores (being counted) have always been a fine fit for things
> > like the loop use, where you get to down it 10 times because you have 10
> > items pending. I know this isn't the traditional mutex and that it
> > doesn't protect data as such, but it was never abuse. It isn't overload.
> > Doing it with a traditional mutex (I'm assuming this is what mutex_t is
> > in Ingos tree) would be overload and a bad idea, indeed.
>
> Well, this is something that's got to be considered by the larger Linux
> community and whether these conventions are to be kept or removed. It's
> a larger issue than what can be addressed in Ingo's preemption patch, but
> with inevitable need for something like this in the kernel (hard RT)
> it's really unavoidable collision. IMO, it's got to go, which is a nasty
> change.
It has to go, why? Because your deadlock detection breaks? Doesn't seem
a very strong reason to me at all, sorry.
--
Jens Axboe
On Tue, 2004-10-19 at 18:26, Ingo Molnar wrote:
> thanks, i've applied your patch to my tree. Find below an untested
> implementation of wait_for_completion_timeout().
Will give it a try.
Found another extremely ugly one. In scsi_error_handler a mutex is
initialized locked and then it is acquired again with
down_interruptible()
I have no fix yet. Somebody else ?
tglx
PCI: Found IRQ 10 for device 0000:00:02.0
sym0: <875> rev 0x4 at pci 0000:00:02.0 irq 10
BUG: semaphore recursion deadlock detected!
.. current task scsi_eh_0/730 is already holding cfed3ed8.
00000000 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000 c025a590 00000000 c0104115 cfed7800 00000000 00000000
Call Trace:
[<c025a590>] scsi_error_handler+0x0/0x100
[<c0104115>] kernel_thread_helper+0x5/0x10
------------[ cut here ]------------
kernel BUG at lib/rwsem-generic.c:472!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in:
CPU: 0
EIP: 0060:[<c0307bb5>] Not tainted VLI
EFLAGS: 00010046 (2.6.9-rc4-mm1-RT-U6)
EIP is at __down_write_interruptible+0xd5/0x296
eax: 00000001 ebx: 00000000 ecx: c036222c edx: 00000001
esi: cfed3ed8 edi: cfd04070 ebp: 00000001 esp: cfed3e8c
ds: 007b es: 007b ss: 0068 preempt: 00000003
Process scsi_eh_0 (pid: 730, threadinfo=cfed2000 task=cfd04070)
Stack: cfed3ed8 00000000 00000000 cfed3edc cfed3edc cfd04070 00000002
cfed2000
cfed3ed8 cfed3ed8 00000000 c01c6ae4 cfed3ed8 cfd04070 cfed7800
00000000
c025a617 c032a106 00000000 ffffffff cfed3e98 cfed3e98 00000001
cfd04070
Call Trace:
[<c01c6ae4>] down_write_interruptible+0x44/0x70
[<c025a617>] scsi_error_handler+0x87/0x100
[<c025a590>] scsi_error_handler+0x0/0x100
[<c0104115>] kernel_thread_helper+0x5/0x10
Code: 08 89 44 24 10 89 34 24 e8 29 e7 eb ff 89 34 24 e8 d1 e7 eb ff 85
c0 74 1a 8b 2d 20 b6 36 c0 85 ed 74 10 31 db 89 1d 2
<6>note: scsi_eh_0[730] exited with preempt_count 2
* Jens Axboe <[email protected]> wrote:
> It has to go, why? Because your deadlock detection breaks? Doesn't
> seem a very strong reason to me at all, sorry.
no, this is no reason at all. I'm really sorry this issue came up in
this context because now people appear to be arguing this as some sort
of policy issue, implying that it is somehow improper to use mutexes
instead of completions, which it clearly is _not_. I very much wanted to
avoid this particular type of flamewar :-)
Using mutexes for completion purposes is perfectly fine kernel code.
Full stop.
Using completions instead of mutexes in certain cases has some minor
advantages for two simple reasons: it's slightly faster and it's also
more readable.
here's an example: initially i made the scheduler's migration logic use
semaphores in that fashion and Linus made me change it to completions,
because, and i quote Linus here:
[...]
Btw, should you not use completions here?
Completions are optimized for the sleep (ie contention) case, while
semaphores are optimized for the non-contention case. It also makes
more "sense" from a conceptual angle (you're waiting for something to
complete, not asking for an exclusive thing).
[...]
and i have to say the migration code did become cleaner. To signal some
sort of event it's a more intuitive _symbol_ _name_ to use 'complete()'
and 'wait_for_completion()' than to use 'up()' and 'down()'.
[ If you truly do not agree with this contention then please just look
at one simple conversion we did and check out the previous and the new
logic, by reading the full previous code and the full resulting code. I
do believe that if anyone at that point still thinks that the
semaphore-based code is just as readable (in that context!) as the
completion-based code that then his brains are not made of neurons but
silicon :) ]
but it has never been kernel policy to not allow the use of mutexes that
way! In some cases it's somewhat cleaner to use completions (and if
something is cleaner in Linux then in most cases it's faster as well),
but it's a judgement thing just like it's a judgement thing whether to use
kmalloc() or get_free_pages(). Both are correct for the generic problem
of 'allocate some kernel RAM', but optimized for two different types of
uses.
Ingo
* Thomas Gleixner <[email protected]> wrote:
> On Tue, 2004-10-19 at 16:46, Ingo Molnar wrote:
> > * Ingo Molnar <[email protected]> wrote:
> > i've re-released the patch because shortly after releasing it i found a
> > false-positive in the deadlock-detector that was triggering in oowriter.
>
> Hit and converted another one. There are more, but they need more
> modifications as they don't have a condition to wait for and therefore
> must be converted to use the completion API, which must be extended to
> provide completion_timeout() first.
thanks, i've applied your patch to my tree. Find below an untested
implementation of wait_for_completion_timeout().
Ingo
--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -3070,6 +3148,31 @@ void fastcall __sched wait_for_completio
}
EXPORT_SYMBOL(wait_for_completion);
+unsigned long fastcall __sched
+wait_for_completion_timeout(struct completion *x, unsigned long timeout)
+{
+ might_sleep();
+ spin_lock_irq(&x->wait.lock);
+ if (!x->done) {
+ DECLARE_WAITQUEUE(wait, current);
+
+ wait.flags |= WQ_FLAG_EXCLUSIVE;
+ __add_wait_queue_tail(&x->wait, &wait);
+ do {
+ __set_current_state(TASK_UNINTERRUPTIBLE);
+ spin_unlock_irq(&x->wait.lock);
+ timeout = schedule_timeout(timeout);
+ spin_lock_irq(&x->wait.lock);
+ } while (!x->done);
+ __remove_wait_queue(&x->wait, &wait);
+ }
+ x->done--;
+ spin_unlock_irq(&x->wait.lock);
+
+ return timeout;
+}
+EXPORT_SYMBOL(wait_for_completion_timeout);
+
#define SLEEP_ON_VAR \
unsigned long flags; \
wait_queue_t wait; \
--- linux/include/linux/completion.h.orig
+++ linux/include/linux/completion.h
@@ -28,6 +28,8 @@ static inline void init_completion(struc
}
extern void FASTCALL(wait_for_completion(struct completion *));
+extern unsigned long FASTCALL(wait_for_completion_timeout(struct completion *,
+ unsigned long));
extern void FASTCALL(complete(struct completion *));
extern void FASTCALL(complete_all(struct completion *));
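
For readers who want to experiment with the idea outside the kernel, below is
a userspace analogue of a completion with a timed wait, built on a pthread
mutex and condition variable. It is only a sketch under stated assumptions and
not the kernel implementation: the _ms suffix, the millisecond argument and
the 1/0 return value are choices made for this example (the patch above
returns the remaining timeout instead). Note that the loop in the patch as
posted, which is marked untested, exits only once x->done is set; the sketch
also leaves the loop when the deadline passes:

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <time.h>

/* Userspace stand-in for struct completion: a counter plus a wait queue. */
struct completion {
	pthread_mutex_t lock;
	pthread_cond_t wait;
	unsigned int done;
};

static void init_completion(struct completion *x)
{
	pthread_mutex_init(&x->lock, NULL);
	pthread_cond_init(&x->wait, NULL);
	x->done = 0;
}

/* Signal one waiter that the event has happened. */
static void complete(struct completion *x)
{
	pthread_mutex_lock(&x->lock);
	x->done++;
	pthread_cond_signal(&x->wait);
	pthread_mutex_unlock(&x->lock);
}

/* Returns 1 if the completion was consumed, 0 if the timeout expired. */
static int wait_for_completion_timeout_ms(struct completion *x, long ms)
{
	struct timespec deadline;
	int completed = 0;

	/* pthread_cond_timedwait() takes an absolute CLOCK_REALTIME deadline. */
	clock_gettime(CLOCK_REALTIME, &deadline);
	deadline.tv_sec += ms / 1000;
	deadline.tv_nsec += (ms % 1000) * 1000000L;
	if (deadline.tv_nsec >= 1000000000L) {
		deadline.tv_sec++;
		deadline.tv_nsec -= 1000000000L;
	}

	pthread_mutex_lock(&x->lock);
	while (!x->done) {
		/* Stop waiting once the deadline has passed. */
		if (pthread_cond_timedwait(&x->wait, &x->lock,
					   &deadline) == ETIMEDOUT)
			break;
	}
	if (x->done) {
		x->done--;
		completed = 1;
	}
	pthread_mutex_unlock(&x->lock);
	return completed;
}
```

A waiter that times out leaves the done count untouched, so a later
complete() is still seen by the next waiter.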
On Fri, Oct 22 2004, Ingo Molnar wrote:
>
> * Jens Axboe <[email protected]> wrote:
>
> > It has to go, why? Because your deadlock detection breaks? Doesn't
> > seem a very strong reason to me at all, sorry.
>
> no, this is no reason at all. I'm really sorry this issue came up in
> this context because now people appear to be arguing this as some sort
> of policy issue, implying that it is somehow improper to use mutexes
> instead of completions, which it clearly is _not_. I very much wanted to
> avoid this particular type of flamewar :-)
>
> Using mutexes for completion purposes is perfectly fine kernel code.
> Full stop.
>
> Using completions instead of mutexes in certain cases has some minor
> advantages for two simple reasons: it's slightly faster and it's also
> more readable.
>
> here's an example: initially i made the scheduler's migration logic use
> semaphores in that fashion and Linus made me change it to completions,
> because, and i quote Linus here:
>
> [...]
> Btw, should you not use completions here?
>
> Completions are optimized for the sleep (ie contention) case, while
> semaphores are optimized for the non-contention case. It also makes
> more "sense" from a conceptual angle (you're waiting for something to
> complete, not asking for an exclusive thing).
> [...]
>
> and i have to say the migration code did become cleaner. To signal some
> sort of event it's a more intuitive _symbol_ _name_ to use 'complete()'
> and 'wait_for_completion()' than to use 'up()' and 'down()'.
>
> [ If you truly do not agree with this contention then please just look
> at one simple conversion we did and check out the previous and the new
> logic, by reading the full previous code and the full resulting code. I
> do believe that if anyone at that point still thinks that the
> semaphore-based code is just as readable (in that context!) as the
> completion-based code that then his brains are not made of neurons but
> silicon :) ]
I fully agree with everything in your mail so far. What annoyed me is
some people advocating their changes under the false pretense that
existing use was broken, which it isn't.
completions _do_ make cleaner code for the intended case. But your
writing above is very clear and already explains that very well.
Lets put the issue to rest and get back to more productive work!
--
Jens Axboe
On Tue, 19 Oct 2004 16:23:49 +0100 (WEST)
"Rui Nuno Capela" <[email protected]> wrote:
> I'm experiencing terrible kernel panics at a very early bootstrap stage
> while testing the U5 and U6 latest patch(es) on my laptop (P4/UP) --
> (Ingo: this is about the very same trouble I've reported while pre-testing
> U6).
[..]
> OK. After some incremental configurations, I've isolated that those
> oops(es) only occurs if PREEMPT_TIMING and/or LATENCY_TRACE are set (Y). My
> first suspect was that newest RWSEM_DEADLOCK_DETECT, but it wasn't the
> case.
>
> So something has broken on that non-preemptible critical section timing
> stuff since U4.
>
> Hasn't anybody else stumbled on this?
I don't get any oopses or panics, but i can observe a rather interesting
behaviour. When i enable the latency traces via
echo 1 > /proc/sys/kernel/trace_enabled
my machine starts to make little pauses of ca 3-4 secs. X "hangs" for this
duration and so does aplay when playing a .wav file. "hangs" means that the
X display seems to be locked. Interestingly enough all keystrokes i entered
during the "hang" seem to arrive fine after the hang has ended. aplay
experiences an xrun.
jackd OTOH is not affected (probably since it runs SCHED_FIFO). I can
happily continue noodling with my guitar through jackd and jack-rack..
But besides that it runs fine here. I get some fairly long non preemptible
critical sections reports though.
here's one (i snipped off quite a few in the middle to make the email
smaller):
preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U6
-------------------------------------------------------
latency: 1841 us, entries: 4000 (12990) | [VP:1 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: aplay/2160, uid:1000 nice:0 policy:0 rt_prio:0
-----------------
=> started at: __schedule+0x3b/0x5d0 <c02a767b>
=> ended at: finish_task_switch+0x43/0xb0 <c0114ae3>
=======>
00000001 0.000ms (+0.000ms): __schedule (ksoftirqd)
00000001 0.000ms (+0.000ms): sched_clock (__schedule)
00000002 0.000ms (+0.000ms): deactivate_task (__schedule)
00000002 0.000ms (+0.000ms): dequeue_task (deactivate_task)
04000002 0.000ms (+0.000ms): __switch_to (__schedule)
04000002 0.001ms (+0.000ms): finish_task_switch (__schedule)
04000000 0.001ms (+0.000ms): schedule (down_write)
04000000 0.001ms (+0.000ms): __schedule (down_write)
04000001 0.001ms (+0.000ms): sched_clock (__schedule)
04000000 0.001ms (+0.000ms): schedule (down_write)
04000000 0.001ms (+0.000ms): __schedule (down_write)
04000001 0.002ms (+0.000ms): sched_clock (__schedule)
04000000 0.002ms (+0.000ms): schedule (down_write)
04000000 0.002ms (+0.000ms): __schedule (down_write)
04000001 0.002ms (+0.000ms): sched_clock (__schedule)
04000000 0.002ms (+0.000ms): schedule (down_write)
04000000 0.002ms (+0.000ms): __schedule (down_write)
04000001 0.003ms (+0.000ms): sched_clock (__schedule)
04000000 0.003ms (+0.000ms): schedule (down_write)
04000000 0.003ms (+0.000ms): __schedule (down_write)
04000001 0.003ms (+0.000ms): sched_clock (__schedule)
04000000 0.003ms (+0.000ms): schedule (down_write)
04000000 0.003ms (+0.000ms): __schedule (down_write)
04000001 0.003ms (+0.000ms): sched_clock (__schedule)
04000000 0.004ms (+0.000ms): schedule (down_write)
04000000 0.004ms (+0.000ms): __schedule (down_write)
04000001 0.004ms (+0.000ms): sched_clock (__schedule)
04000000 0.004ms (+0.000ms): schedule (down_write)
04000000 0.004ms (+0.000ms): __schedule (down_write)
04000001 0.004ms (+0.000ms): sched_clock (__schedule)
04000000 0.005ms (+0.000ms): schedule (down_write)
04000000 0.005ms (+0.000ms): __schedule (down_write)
04000001 0.005ms (+0.000ms): sched_clock (__schedule)
04000000 0.005ms (+0.000ms): schedule (down_write)
04000000 0.005ms (+0.000ms): __schedule (down_write)
04000001 0.005ms (+0.000ms): sched_clock (__schedule)
04000000 0.006ms (+0.000ms): schedule (down_write)
04000000 0.006ms (+0.000ms): __schedule (down_write)
04000001 0.006ms (+0.000ms): sched_clock (__schedule)
04000000 0.006ms (+0.000ms): schedule (down_write)
04000000 0.006ms (+0.000ms): __schedule (down_write)
04000001 0.006ms (+0.000ms): sched_clock (__schedule)
04000000 0.006ms (+0.000ms): schedule (down_write)
04000000 0.007ms (+0.000ms): __schedule (down_write)
04000001 0.007ms (+0.000ms): sched_clock (__schedule)
04000000 0.007ms (+0.000ms): schedule (down_write)
04000000 0.007ms (+0.000ms): __schedule (down_write)
04000001 0.007ms (+0.000ms): sched_clock (__schedule)
04000000 0.007ms (+0.000ms): schedule (down_write)
04000000 0.007ms (+0.000ms): __schedule (down_write)
04000001 0.008ms (+0.000ms): sched_clock (__schedule)
04000000 0.008ms (+0.000ms): schedule (down_write)
04000000 0.008ms (+0.000ms): __schedule (down_write)
04000001 0.008ms (+0.000ms): sched_clock (__schedule)
04000000 0.008ms (+0.000ms): schedule (down_write)
04000000 0.008ms (+0.000ms): __schedule (down_write)
04000001 0.009ms (+0.000ms): sched_clock (__schedule)
04000000 0.009ms (+0.000ms): schedule (down_write)
04000000 0.009ms (+0.000ms): __schedule (down_write)
04000001 0.009ms (+0.000ms): sched_clock (__schedule)
04000000 0.009ms (+0.000ms): schedule (down_write)
04000000 0.009ms (+0.000ms): __schedule (down_write)
04000001 0.009ms (+0.000ms): sched_clock (__schedule)
04000000 0.010ms (+0.000ms): schedule (down_write)
04000000 0.010ms (+0.000ms): __schedule (down_write)
04000001 0.010ms (+0.000ms): sched_clock (__schedule)
04000000 0.010ms (+0.000ms): schedule (down_write)
04000000 0.010ms (+0.000ms): __schedule (down_write)
04000001 0.010ms (+0.000ms): sched_clock (__schedule)
04000000 0.011ms (+0.000ms): schedule (down_write)
04000000 0.011ms (+0.000ms): __schedule (down_write)
04000001 0.011ms (+0.000ms): sched_clock (__schedule)
04000000 0.011ms (+0.000ms): schedule (down_write)
04000000 0.011ms (+0.000ms): __schedule (down_write)
04000001 0.011ms (+0.000ms): sched_clock (__schedule)
04000000 0.012ms (+0.000ms): schedule (down_write)
04000000 0.012ms (+0.000ms): __schedule (down_write)
04000001 0.012ms (+0.000ms): sched_clock (__schedule)
04000000 0.012ms (+0.000ms): schedule (down_write)
04000000 0.012ms (+0.000ms): __schedule (down_write)
04000001 0.012ms (+0.000ms): sched_clock (__schedule)
04000000 0.012ms (+0.000ms): schedule (down_write)
04000000 0.013ms (+0.000ms): __schedule (down_write)
04000001 0.013ms (+0.000ms): sched_clock (__schedule)
04000000 0.013ms (+0.000ms): schedule (down_write)
04000000 0.013ms (+0.000ms): __schedule (down_write)
04000001 0.013ms (+0.000ms): sched_clock (__schedule)
04000000 0.013ms (+0.000ms): schedule (down_write)
04000000 0.013ms (+0.000ms): __schedule (down_write)
04000001 0.014ms (+0.000ms): sched_clock (__schedule)
04000000 0.014ms (+0.000ms): schedule (down_write)
04000000 0.014ms (+0.000ms): __schedule (down_write)
04000001 0.014ms (+0.000ms): sched_clock (__schedule)
04000000 0.014ms (+0.000ms): schedule (down_write)
04000000 0.014ms (+0.000ms): __schedule (down_write)
04000001 0.014ms (+0.000ms): sched_clock (__schedule)
04000000 0.015ms (+0.000ms): schedule (down_write)
04000000 0.015ms (+0.000ms): __schedule (down_write)
04000001 0.015ms (+0.000ms): sched_clock (__schedule)
04000000 0.015ms (+0.000ms): schedule (down_write)
04000000 0.015ms (+0.000ms): __schedule (down_write)
04000001 0.015ms (+0.000ms): sched_clock (__schedule)
04000000 0.016ms (+0.000ms): schedule (down_write)
04000000 0.016ms (+0.000ms): __schedule (down_write)
04000001 0.016ms (+0.000ms): sched_clock (__schedule)
04000000 0.016ms (+0.000ms): schedule (down_write)
04000000 0.016ms (+0.000ms): __schedule (down_write)
04000001 0.016ms (+0.000ms): sched_clock (__schedule)
04000000 0.017ms (+0.000ms): schedule (down_write)
04000000 0.017ms (+0.000ms): __schedule (down_write)
04000001 0.017ms (+0.000ms): sched_clock (__schedule)
04000000 0.017ms (+0.000ms): schedule (down_write)
04000000 0.017ms (+0.000ms): __schedule (down_write)
04000001 0.017ms (+0.000ms): sched_clock (__schedule)
04000000 0.018ms (+0.000ms): schedule (down_write)
04000000 0.018ms (+0.000ms): __schedule (down_write)
04000001 0.018ms (+0.000ms): sched_clock (__schedule)
04000000 0.018ms (+0.000ms): schedule (down_write)
04000000 0.018ms (+0.000ms): __schedule (down_write)
04000001 0.018ms (+0.000ms): sched_clock (__schedule)
04000000 0.018ms (+0.000ms): schedule (down_write)
04000000 0.019ms (+0.000ms): __schedule (down_write)
04000001 0.019ms (+0.000ms): sched_clock (__schedule)
04000000 0.019ms (+0.000ms): schedule (down_write)
04000000 0.019ms (+0.000ms): __schedule (down_write)
04000001 0.019ms (+0.000ms): sched_clock (__schedule)
04000000 0.019ms (+0.000ms): schedule (down_write)
04000000 0.019ms (+0.000ms): __schedule (down_write)
04000001 0.020ms (+0.000ms): sched_clock (__schedule)
04000000 0.020ms (+0.000ms): schedule (down_write)
04000000 0.020ms (+0.000ms): __schedule (down_write)
04000001 0.020ms (+0.000ms): sched_clock (__schedule)
04000000 0.020ms (+0.000ms): schedule (down_write)
04000000 0.020ms (+0.000ms): __schedule (down_write)
04000001 0.020ms (+0.000ms): sched_clock (__schedule)
04000000 0.021ms (+0.000ms): schedule (down_write)
04000000 0.021ms (+0.000ms): __schedule (down_write)
04000001 0.021ms (+0.000ms): sched_clock (__schedule)
04000000 0.021ms (+0.000ms): schedule (down_write)
04000000 0.021ms (+0.000ms): __schedule (down_write)
04000001 0.021ms (+0.000ms): sched_clock (__schedule)
04000000 0.022ms (+0.000ms): schedule (down_write)
04000000 0.022ms (+0.000ms): __schedule (down_write)
04000001 0.022ms (+0.000ms): sched_clock (__schedule)
04000000 0.022ms (+0.000ms): schedule (down_write)
04000000 0.022ms (+0.000ms): __schedule (down_write)
04000001 0.022ms (+0.000ms): sched_clock (__schedule)
04000000 0.023ms (+0.000ms): schedule (down_write)
04000000 0.023ms (+0.000ms): __schedule (down_write)
04000001 0.023ms (+0.000ms): sched_clock (__schedule)
04000000 0.023ms (+0.000ms): schedule (down_write)
04000000 0.023ms (+0.000ms): __schedule (down_write)
04000001 0.023ms (+0.000ms): sched_clock (__schedule)
04000000 0.023ms (+0.000ms): schedule (down_write)
04000000 0.024ms (+0.000ms): __schedule (down_write)
04000001 0.024ms (+0.000ms): sched_clock (__schedule)
04000000 0.024ms (+0.000ms): schedule (down_write)
04000000 0.024ms (+0.000ms): __schedule (down_write)
04000001 0.024ms (+0.000ms): sched_clock (__schedule)
04000000 0.024ms (+0.000ms): schedule (down_write)
04000000 0.025ms (+0.000ms): __schedule (down_write)
04000001 0.025ms (+0.000ms): sched_clock (__schedule)
04000000 0.025ms (+0.000ms): schedule (down_write)
04000000 0.025ms (+0.000ms): __schedule (down_write)
04000001 0.025ms (+0.000ms): sched_clock (__schedule)
04000000 0.025ms (+0.000ms): schedule (down_write)
04000000 0.025ms (+0.000ms): __schedule (down_write)
04000001 0.026ms (+0.000ms): sched_clock (__schedule)
04000000 0.026ms (+0.000ms): schedule (down_write)
04000000 0.026ms (+0.000ms): __schedule (down_write)
04000001 0.026ms (+0.000ms): sched_clock (__schedule)
04000000 0.026ms (+0.000ms): schedule (down_write)
04000000 0.026ms (+0.000ms): __schedule (down_write)
04000001 0.026ms (+0.000ms): sched_clock (__schedule)
04000000 0.027ms (+0.000ms): schedule (down_write)
04000000 0.027ms (+0.000ms): __schedule (down_write)
04000001 0.027ms (+0.000ms): sched_clock (__schedule)
04000000 0.027ms (+0.000ms): schedule (down_write)
04000000 0.027ms (+0.000ms): __schedule (down_write)
04000001 0.027ms (+0.000ms): sched_clock (__schedule)
04000000 0.028ms (+0.000ms): schedule (down_write)
04000000 0.028ms (+0.000ms): __schedule (down_write)
04000001 0.028ms (+0.000ms): sched_clock (__schedule)
04000000 0.028ms (+0.000ms): schedule (down_write)
04000000 0.028ms (+0.000ms): __schedule (down_write)
04000001 0.028ms (+0.000ms): sched_clock (__schedule)
04000000 0.029ms (+0.000ms): schedule (down_write)
04000000 0.029ms (+0.000ms): __schedule (down_write)
04000001 0.029ms (+0.000ms): sched_clock (__schedule)
04000000 0.029ms (+0.000ms): schedule (down_write)
04000000 0.029ms (+0.000ms): __schedule (down_write)
04000001 0.029ms (+0.000ms): sched_clock (__schedule)
04000000 0.029ms (+0.000ms): schedule (down_write)
04000000 0.030ms (+0.000ms): __schedule (down_write)
04000001 0.030ms (+0.000ms): sched_clock (__schedule)
04000000 0.030ms (+0.000ms): schedule (down_write)
04000000 0.030ms (+0.000ms): __schedule (down_write)
04000001 0.030ms (+0.000ms): sched_clock (__schedule)
04000000 0.030ms (+0.000ms): schedule (down_write)
04000000 0.030ms (+0.000ms): __schedule (down_write)
04000001 0.031ms (+0.000ms): sched_clock (__schedule)
04000000 0.031ms (+0.000ms): schedule (down_write)
04000000 0.031ms (+0.000ms): __schedule (down_write)
04000001 0.031ms (+0.000ms): sched_clock (__schedule)
04000000 0.031ms (+0.000ms): schedule (down_write)
04000000 0.031ms (+0.000ms): __schedule (down_write)
04000001 0.031ms (+0.000ms): sched_clock (__schedule)
04000000 0.032ms (+0.000ms): schedule (down_write)
04000000 0.032ms (+0.000ms): __schedule (down_write)
04000001 0.032ms (+0.000ms): sched_clock (__schedule)
04000000 0.032ms (+0.000ms): schedule (down_write)
04000000 0.032ms (+0.000ms): __schedule (down_write)
04000001 0.032ms (+0.000ms): sched_clock (__schedule)
04000000 0.033ms (+0.000ms): schedule (down_write)
04000000 0.033ms (+0.000ms): __schedule (down_write)
04000001 0.033ms (+0.000ms): sched_clock (__schedule)
04000000 0.033ms (+0.000ms): schedule (down_write)
04000000 0.033ms (+0.000ms): __schedule (down_write)
04000001 0.033ms (+0.000ms): sched_clock (__schedule)
04000000 0.034ms (+0.000ms): schedule (down_write)
04000000 0.034ms (+0.000ms): __schedule (down_write)
04000001 0.034ms (+0.000ms): sched_clock (__schedule)
04000000 0.034ms (+0.000ms): schedule (down_write)
04000000 0.034ms (+0.000ms): __schedule (down_write)
04000001 0.034ms (+0.000ms): sched_clock (__schedule)
04000000 0.035ms (+0.000ms): schedule (down_write)
04000000 0.035ms (+0.000ms): __schedule (down_write)
04000001 0.035ms (+0.000ms): sched_clock (__schedule)
04000000 0.035ms (+0.000ms): schedule (down_write)
04000000 0.035ms (+0.000ms): __schedule (down_write)
04000001 0.035ms (+0.000ms): sched_clock (__schedule)
04000000 0.035ms (+0.000ms): schedule (down_write)
04000000 0.036ms (+0.000ms): __schedule (down_write)
[...many many more of similar ones...]
04000000 0.554ms (+0.000ms): __schedule (down_write)
04000001 0.554ms (+0.000ms): sched_clock (__schedule)
04000000 0.554ms (+0.000ms): schedule (down_write)
04000000 0.554ms (+0.000ms): __schedule (down_write)
04000001 0.554ms (+0.000ms): sched_clock (__schedule)
04000000 0.555ms (+0.000ms): schedule (down_write)
04000000 0.555ms (+0.000ms): __schedule (down_write)
04000001 0.555ms (+0.000ms): sched_clock (__schedule)
04000000 0.555ms (+0.000ms): schedule (down_write)
04000000 0.555ms (+0.000ms): __schedule (down_write)
04000001 0.555ms (+0.000ms): sched_clock (__schedule)
04000000 0.555ms (+0.000ms): schedule (down_write)
04000000 0.556ms (+0.000ms): __schedule (down_write)
04000001 0.556ms (+0.000ms): sched_clock (__schedule)
04000000 0.556ms (+0.000ms): schedule (down_write)
04000000 0.556ms (+0.000ms): __schedule (down_write)
04000001 0.556ms (+0.000ms): sched_clock (__schedule)
04000000 0.556ms (+0.000ms): schedule (down_write)
04000000 0.556ms (+0.000ms): __schedule (down_write)
04000001 0.557ms (+0.000ms): sched_clock (__schedule)
04000000 0.557ms (+0.000ms): schedule (down_write)
04000000 0.557ms (+0.000ms): __schedule (down_write)
04000001 0.557ms (+0.000ms): sched_clock (__schedule)
04000000 0.557ms (+0.000ms): schedule (down_write)
04000000 0.557ms (+0.000ms): __schedule (down_write)
04000001 0.557ms (+0.000ms): sched_clock (__schedule)
04000000 0.558ms (+0.000ms): schedule (down_write)
04000000 0.558ms (+0.000ms): __schedule (down_write)
04000001 0.558ms (+0.000ms): sched_clock (__schedule)
04000000 0.558ms (+0.000ms): schedule (down_write)
04000000 0.558ms (+0.000ms): __schedule (down_write)
04000001 0.558ms (+0.000ms): sched_clock (__schedule)
04000000 0.559ms (+0.000ms): schedule (down_write)
04000000 0.559ms (+0.000ms): __schedule (down_write)
04000001 0.559ms (+0.000ms): sched_clock (__schedule)
04000000 0.559ms (+0.000ms): schedule (down_write)
04000000 0.559ms (+0.000ms): __schedule (down_write)
04000001 0.559ms (+0.000ms): sched_clock (__schedule)
04000000 0.560ms (+0.000ms): schedule (down_write)
04000000 0.560ms (+0.000ms): __schedule (down_write)
04000001 0.560ms (+0.000ms): sched_clock (__schedule)
04000000 0.560ms (+0.000ms): schedule (down_write)
04000000 0.560ms (+0.000ms): __schedule (down_write)
04000001 0.560ms (+0.000ms): sched_clock (__schedule)
04000000 0.561ms (+0.000ms): schedule (down_write)
04000000 0.561ms (+0.000ms): __schedule (down_write)
04000001 0.561ms (+0.000ms): sched_clock (__schedule)
04000000 0.561ms (+0.000ms): schedule (down_write)
04000000 0.561ms (+0.000ms): __schedule (down_write)
04000001 0.561ms (+0.000ms): sched_clock (__schedule)
04000000 0.561ms (+0.000ms): schedule (down_write)
04000000 0.562ms (+0.000ms): __schedule (down_write)
04000001 0.562ms (+0.000ms): sched_clock (__schedule)
04000000 0.562ms (+0.000ms): schedule (down_write)
04000000 0.562ms (+0.000ms): __schedule (down_write)
04000001 0.562ms (+0.000ms): sched_clock (__schedule)
04000000 0.562ms (+0.000ms): schedule (down_write)
04000000 0.562ms (+0.000ms): __schedule (down_write)
04000001 0.563ms (+0.000ms): sched_clock (__schedule)
04000000 0.563ms (+0.000ms): schedule (down_write)
04000000 0.563ms (+0.000ms): __schedule (down_write)
04000001 0.563ms (+0.000ms): sched_clock (__schedule)
04000000 0.563ms (+0.000ms): schedule (down_write)
04000000 0.563ms (+0.000ms): __schedule (down_write)
04000001 0.563ms (+0.000ms): sched_clock (__schedule)
04000000 0.564ms (+0.000ms): schedule (down_write)
04000000 0.564ms (+0.000ms): __schedule (down_write)
04000001 0.564ms (+0.000ms): sched_clock (__schedule)
04000000 0.564ms (+0.000ms): schedule (down_write)
04000000 0.564ms (+0.000ms): __schedule (down_write)
04000001 0.564ms (+0.000ms): sched_clock (__schedule)
04000000 0.565ms (+0.000ms): schedule (down_write)
04000000 0.565ms (+0.000ms): __schedule (down_write)
04000001 0.565ms (+0.000ms): sched_clock (__schedule)
04000000 0.565ms (+0.000ms): schedule (down_write)
04000000 0.565ms (+0.000ms): __schedule (down_write)
04000001 0.565ms (+0.000ms): sched_clock (__schedule)
04000000 0.566ms (+0.000ms): schedule (down_write)
04000000 0.566ms (+0.000ms): __schedule (down_write)
04000001 0.566ms (+0.000ms): sched_clock (__schedule)
04000000 0.566ms (+0.000ms): schedule (down_write)
04000000 0.566ms (+0.000ms): __schedule (down_write)
04000001 0.566ms (+0.000ms): sched_clock (__schedule)
04000000 0.567ms (+0.000ms): schedule (down_write)
04000000 0.567ms (+0.000ms): __schedule (down_write)
04000001 0.567ms (+0.000ms): sched_clock (__schedule)
04000000 0.567ms (+0.000ms): schedule (down_write)
04000000 0.567ms (+0.000ms): __schedule (down_write)
04000001 0.567ms (+0.000ms): sched_clock (__schedule)
04000000 0.567ms (+0.000ms): schedule (down_write)
04000000 0.568ms (+0.000ms): __schedule (down_write)
04000001 0.568ms (+0.000ms): sched_clock (__schedule)
04000000 0.568ms (+0.000ms): schedule (down_write)
04000000 0.568ms (+0.000ms): __schedule (down_write)
04000001 0.568ms (+0.000ms): sched_clock (__schedule)
04000000 0.568ms (+0.000ms): schedule (down_write)
04000000 0.568ms (+0.000ms): __schedule (down_write)
04000001 0.569ms (+0.000ms): sched_clock (__schedule)
04000000 0.569ms (+0.000ms): schedule (down_write)
04000000 0.569ms (+0.000ms): __schedule (down_write)
04000001 0.569ms (+0.000ms): sched_clock (__schedule)
04000000 0.569ms (+0.000ms): schedule (down_write)
04000000 0.569ms (+0.000ms): __schedule (down_write)
04000001 0.569ms (+0.000ms): sched_clock (__schedule)
04000000 0.570ms (+0.000ms): schedule (down_write)
04000000 0.570ms (+0.000ms): __schedule (down_write)
04000001 0.570ms (+0.000ms): sched_clock (__schedule)
04000000 0.570ms (+0.000ms): schedule (down_write)
04000000 0.570ms (+0.000ms): __schedule (down_write)
04000001 0.570ms (+0.000ms): sched_clock (__schedule)
04000000 0.571ms (+0.000ms): schedule (down_write)
04000000 0.571ms (+0.000ms): __schedule (down_write)
04000001 0.571ms (+0.000ms): sched_clock (__schedule)
04000000 0.571ms (+0.000ms): schedule (down_write)
04000000 0.571ms (+0.000ms): __schedule (down_write)
04000001 0.571ms (+0.000ms): sched_clock (__schedule)
04000000 0.572ms (+0.000ms): schedule (down_write)
04000000 0.572ms (+0.000ms): __schedule (down_write)
04000001 0.572ms (+0.000ms): sched_clock (__schedule)
04000000 0.572ms (+0.000ms): schedule (down_write)
04000000 0.572ms (+0.000ms): __schedule (down_write)
04000001 0.572ms (+0.000ms): sched_clock (__schedule)
04000000 0.572ms (+0.000ms): schedule (down_write)
04000000 0.573ms (+0.000ms): __schedule (down_write)
04000001 0.573ms (+0.000ms): sched_clock (__schedule)
04000000 0.573ms (+0.000ms): schedule (down_write)
04000000 0.573ms (+0.000ms): __schedule (down_write)
04000001 0.573ms (+0.000ms): sched_clock (__schedule)
04000000 0.573ms (+0.000ms): schedule (down_write)
04000000 0.574ms (+0.000ms): __schedule (down_write)
04000001 0.574ms (+0.000ms): sched_clock (__schedule)
04000000 0.574ms (+0.000ms): schedule (down_write)
04000000 0.574ms (+0.000ms): __schedule (down_write)
04000001 0.574ms (+0.000ms): sched_clock (__schedule)
04000000 0.574ms (+0.000ms): schedule (down_write)
04000000 0.574ms (+0.000ms): __schedule (down_write)
04000001 0.575ms (+0.000ms): sched_clock (__schedule)
04000000 0.575ms (+0.000ms): schedule (down_write)
04000000 0.575ms (+0.000ms): __schedule (down_write)
04000001 0.575ms (+0.000ms): sched_clock (__schedule)
04000000 0.575ms (+0.000ms): schedule (down_write)
04000000 0.575ms (+0.000ms): __schedule (down_write)
04000001 0.575ms (+0.000ms): sched_clock (__schedule)
04000000 0.576ms (+0.000ms): schedule (down_write)
04000000 0.576ms (+0.000ms): __schedule (down_write)
04000001 0.576ms (+0.000ms): sched_clock (__schedule)
04000000 0.576ms (+0.000ms): schedule (down_write)
04000000 0.576ms (+0.000ms): __schedule (down_write)
04000001 0.576ms (+0.000ms): sched_clock (__schedule)
04000000 0.577ms (+0.000ms): schedule (down_write)
04000000 0.577ms (+0.000ms): __schedule (down_write)
04000001 0.577ms (+0.000ms): sched_clock (__schedule)
04000000 0.577ms (+0.000ms): schedule (down_write)
04000000 0.577ms (+0.000ms): __schedule (down_write)
04000001 0.577ms (+0.000ms): sched_clock (__schedule)
04000000 0.578ms (+0.000ms): schedule (down_write)
04000000 0.578ms (+0.000ms): __schedule (down_write)
04000001 0.578ms (+0.000ms): sched_clock (__schedule)
04000000 0.578ms (+0.000ms): schedule (down_write)
04000000 0.578ms (+0.000ms): __schedule (down_write)
04000001 0.578ms (+0.000ms): sched_clock (__schedule)
04000000 0.578ms (+0.000ms): schedule (down_write)
04000000 0.579ms (+0.000ms): __schedule (down_write)
04000001 0.579ms (+0.000ms): sched_clock (__schedule)
04000000 0.579ms (+0.000ms): schedule (down_write)
04000000 0.579ms (+0.000ms): __schedule (down_write)
04000001 0.579ms (+0.000ms): sched_clock (__schedule)
04000000 0.579ms (+0.000ms): schedule (down_write)
04000000 0.579ms (+0.000ms): __schedule (down_write)
04000001 0.580ms (+0.000ms): sched_clock (__schedule)
04000000 0.580ms (+0.000ms): schedule (down_write)
04000000 0.580ms (+0.000ms): __schedule (down_write)
04000001 0.580ms (+0.000ms): sched_clock (__schedule)
04000000 0.580ms (+0.000ms): schedule (down_write)
04000000 0.580ms (+0.000ms): __schedule (down_write)
04000001 0.581ms (+0.000ms): sched_clock (__schedule)
04000000 0.581ms (+0.000ms): schedule (down_write)
04000000 0.581ms (+0.000ms): __schedule (down_write)
04000001 0.581ms (+0.000ms): sched_clock (__schedule)
04000000 0.581ms (+0.000ms): schedule (down_write)
04000000 0.581ms (+0.000ms): __schedule (down_write)
04000001 0.581ms (+0.000ms): sched_clock (__schedule)
04000000 0.582ms (+0.000ms): schedule (down_write)
04000000 0.582ms (+0.000ms): __schedule (down_write)
04000001 0.582ms (+0.000ms): sched_clock (__schedule)
04000000 0.582ms (+0.000ms): schedule (down_write)
04000000 0.582ms (+0.000ms): __schedule (down_write)
04000001 0.582ms (+0.000ms): sched_clock (__schedule)
04000000 0.583ms (+0.000ms): schedule (down_write)
04000000 0.583ms (+0.000ms): __schedule (down_write)
04000001 0.583ms (+0.000ms): sched_clock (__schedule)
04000000 0.583ms (+0.000ms): schedule (down_write)
04000000 0.583ms (+0.000ms): __schedule (down_write)
04000001 0.583ms (+0.000ms): sched_clock (__schedule)
04000000 0.584ms (+0.000ms): schedule (down_write)
04000000 0.584ms (+0.000ms): __schedule (down_write)
04000001 0.584ms (+0.000ms): sched_clock (__schedule)
04000000 0.584ms (+0.000ms): schedule (down_write)
04000000 0.584ms (+0.000ms): __schedule (down_write)
04000001 0.584ms (+0.000ms): sched_clock (__schedule)
04000000 0.584ms (+0.000ms): schedule (down_write)
04000000 0.585ms (+0.000ms): __schedule (down_write)
04000001 0.585ms (+0.000ms): sched_clock (__schedule)
04000000 0.585ms (+0.000ms): schedule (down_write)
04000000 0.585ms (+0.000ms): __schedule (down_write)
04000001 0.585ms (+0.000ms): sched_clock (__schedule)
04000000 0.585ms (+0.000ms): schedule (down_write)
04000000 0.585ms (+0.000ms): __schedule (down_write)
04000001 0.586ms (+0.000ms): sched_clock (__schedule)
04000000 0.586ms (+0.000ms): schedule (down_write)
04000000 0.586ms (+0.000ms): __schedule (down_write)
04000001 0.586ms (+0.000ms): sched_clock (__schedule)
04000000 0.586ms (+0.000ms): schedule (down_write)
04000000 0.586ms (+0.000ms): __schedule (down_write)
04000001 0.586ms (+0.000ms): sched_clock (__schedule)
04000000 0.587ms (+0.000ms): schedule (down_write)
04000000 0.587ms (+0.000ms): __schedule (down_write)
04000001 0.587ms (+0.000ms): sched_clock (__schedule)
04000000 0.587ms (+0.000ms): schedule (down_write)
04000000 0.587ms (+0.000ms): __schedule (down_write)
04000001 0.587ms (+0.000ms): sched_clock (__schedule)
04000000 0.588ms (+0.000ms): schedule (down_write)
04000000 0.588ms (+0.000ms): __schedule (down_write)
04000001 0.588ms (+0.000ms): sched_clock (__schedule)
04000000 0.588ms (+0.000ms): schedule (down_write)
04000000 0.588ms (+0.000ms): __schedule (down_write)
04000001 0.588ms (+0.000ms): sched_clock (__schedule)
04000000 0.589ms (+0.000ms): schedule (down_write)
04000000 0.589ms (+0.000ms): __schedule (down_write)
04000001 0.589ms (+0.000ms): sched_clock (__schedule)
04000000 0.589ms (+0.000ms): schedule (down_write)
04000000 0.589ms (+0.000ms): __schedule (down_write)
04000001 0.589ms (+0.000ms): sched_clock (__schedule)
04000000 0.590ms (+0.000ms): schedule (down_write)
04000000 0.590ms (+0.000ms): __schedule (down_write)
04000001 0.590ms (+0.000ms): sched_clock (__schedule)
04000000 0.590ms (+0.000ms): schedule (down_write)
04000000 0.590ms (+0.000ms): __schedule (down_write)
04000001 0.590ms (+0.000ms): sched_clock (__schedule)
04000000 0.590ms (+0.000ms): schedule (down_write)
04000000 0.591ms (+0.000ms): __schedule (down_write)
04000001 0.591ms (+0.000ms): sched_clock (__schedule)
04000000 0.591ms (+0.000ms): schedule (down_write)
04000000 0.591ms (+0.000ms): __schedule (down_write)
04000001 0.591ms (+0.000ms): sched_clock (__schedule)
04000000 0.591ms (+0.000ms): schedule (down_write)
04000000 0.591ms (+0.000ms): __schedule (down_write)
04000001 0.592ms (+0.000ms): sched_clock (__schedule)
04000000 0.592ms (+0.000ms): schedule (down_write)
04000000 0.592ms (+0.000ms): __schedule (down_write)
04000001 0.592ms (+0.000ms): sched_clock (__schedule)
04000000 0.592ms (+0.000ms): schedule (down_write)
04000000 0.592ms (+0.000ms): __schedule (down_write)
04000001 0.592ms (+0.000ms): sched_clock (__schedule)
04000000 0.593ms (+0.000ms): schedule (down_write)
04000000 0.593ms (+0.000ms): __schedule (down_write)
04000001 0.593ms (+0.000ms): sched_clock (__schedule)
04000000 0.593ms (+0.000ms): schedule (down_write)
04000000 0.593ms (+0.000ms): __schedule (down_write)
04000001 0.593ms (+0.000ms): sched_clock (__schedule)
04000000 0.594ms (+0.000ms): schedule (down_write)
04000000 0.594ms (+0.000ms): __schedule (down_write)
04000001 0.594ms (+0.000ms): sched_clock (__schedule)
04000000 0.594ms (+0.000ms): schedule (down_write)
04000000 0.594ms (+0.000ms): __schedule (down_write)
04000001 0.594ms (+0.000ms): sched_clock (__schedule)
04000000 0.595ms (+0.000ms): schedule (down_write)
04000000 0.595ms (+0.000ms): __schedule (down_write)
04000001 0.595ms (+0.000ms): sched_clock (__schedule)
04000000 0.595ms (+0.000ms): schedule (down_write)
04000000 0.595ms (+0.000ms): __schedule (down_write)
04000001 0.595ms (+0.000ms): sched_clock (__schedule)
04000000 0.596ms (+0.000ms): schedule (down_write)
04000000 0.596ms (+0.000ms): __schedule (down_write)
04000001 0.596ms (+0.000ms): sched_clock (__schedule)
04000000 0.596ms (+0.000ms): schedule (down_write)
04000000 0.596ms (+0.000ms): __schedule (down_write)
04000001 0.596ms (+0.000ms): sched_clock (__schedule)
04000000 0.596ms (+0.000ms): schedule (down_write)
04000000 0.597ms (+0.000ms): __schedule (down_write)
04000001 0.597ms (+0.000ms): sched_clock (__schedule)
04000000 0.597ms (+0.000ms): schedule (down_write)
04000000 0.597ms (+0.000ms): __schedule (down_write)
04000001 0.597ms (+0.000ms): sched_clock (__schedule)
04000000 0.597ms (+0.000ms): schedule (down_write)
04000000 0.597ms (+0.000ms): __schedule (down_write)
04000001 0.598ms (+0.000ms): sched_clock (__schedule)
04000000 0.598ms (+0.000ms): schedule (down_write)
04000000 0.598ms (+0.000ms): __schedule (down_write)
04000001 0.598ms (+0.000ms): sched_clock (__schedule)
04000000 0.598ms (+0.000ms): schedule (down_write)
04000000 0.598ms (+0.000ms): __schedule (down_write)
04000001 0.598ms (+0.000ms): sched_clock (__schedule)
04000000 0.599ms (+0.000ms): schedule (down_write)
04000000 0.599ms (+0.000ms): __schedule (down_write)
04000001 0.599ms (+0.000ms): sched_clock (__schedule)
04000000 0.599ms (+0.000ms): schedule (down_write)
04000000 0.599ms (+0.000ms): __schedule (down_write)
04000001 0.599ms (+0.000ms): sched_clock (__schedule)
04000000 0.600ms (+0.000ms): schedule (down_write)
04000000 0.600ms (+0.000ms): __schedule (down_write)
04000001 0.600ms (+0.000ms): sched_clock (__schedule)
04000000 0.600ms (+0.000ms): schedule (down_write)
04000000 0.600ms (+0.000ms): __schedule (down_write)
04000001 0.600ms (+0.000ms): sched_clock (__schedule)
04000000 0.601ms (+0.000ms): schedule (down_write)
04000000 0.601ms (+0.000ms): __schedule (down_write)
04000001 0.601ms (+0.000ms): sched_clock (__schedule)
04000000 0.601ms (+0.000ms): schedule (down_write)
04000000 0.601ms (+0.000ms): __schedule (down_write)
04000001 0.601ms (+0.000ms): sched_clock (__schedule)
04000000 0.601ms (+0.000ms): schedule (down_write)
04000000 0.602ms (+0.000ms): __schedule (down_write)
04000001 0.602ms (+0.000ms): sched_clock (__schedule)
04000000 0.602ms (+0.000ms): schedule (down_write)
04000000 0.602ms (+0.000ms): __schedule (down_write)
04000001 0.602ms (+0.000ms): sched_clock (__schedule)
04000000 0.602ms (+0.000ms): schedule (down_write)
04000000 0.602ms (+0.000ms): __schedule (down_write)
04000001 0.603ms (+0.000ms): sched_clock (__schedule)
04000000 0.603ms (+0.000ms): schedule (down_write)
04000000 0.603ms (+0.000ms): __schedule (down_write)
04000001 0.603ms (+0.000ms): sched_clock (__schedule)
04000000 0.603ms (+0.000ms): schedule (down_write)
04000000 0.603ms (+0.000ms): __schedule (down_write)
04000001 0.604ms (+0.000ms): sched_clock (__schedule)
04000000 0.604ms (+0.000ms): schedule (down_write)
04000000 0.604ms (+0.000ms): __schedule (down_write)
04000001 0.604ms (+0.000ms): sched_clock (__schedule)
04000000 0.604ms (+0.000ms): schedule (down_write)
04000000 0.604ms (+0.000ms): __schedule (down_write)
04000001 0.604ms (+0.000ms): sched_clock (__schedule)
04000000 0.605ms (+0.000ms): schedule (down_write)
04000000 0.605ms (+0.000ms): __schedule (down_write)
04000001 0.605ms (+0.000ms): sched_clock (__schedule)
04000000 0.605ms (+0.000ms): schedule (down_write)
04000000 0.605ms (+0.000ms): __schedule (down_write)
04000001 0.605ms (+0.000ms): sched_clock (__schedule)
04000000 0.606ms (+0.000ms): schedule (down_write)
04000000 0.606ms (+0.000ms): __schedule (down_write)
04000001 0.606ms (+0.000ms): sched_clock (__schedule)
04000000 0.606ms (+0.000ms): schedule (down_write)
04000000 0.606ms (+0.000ms): __schedule (down_write)
04000001 0.606ms (+0.000ms): sched_clock (__schedule)
04000000 0.607ms (+0.000ms): schedule (down_write)
04000000 0.607ms (+0.000ms): __schedule (down_write)
04000001 0.607ms (+0.000ms): sched_clock (__schedule)
04000000 0.607ms (+0.000ms): schedule (down_write)
04000000 0.607ms (+0.000ms): __schedule (down_write)
04000001 0.607ms (+0.000ms): sched_clock (__schedule)
04000000 0.607ms (+0.000ms): schedule (down_write)
04000000 0.608ms (+0.000ms): __schedule (down_write)
04000001 0.608ms (+0.000ms): sched_clock (__schedule)
04000000 0.608ms (+0.000ms): schedule (down_write)
04000000 0.608ms (+0.000ms): __schedule (down_write)
04000001 0.608ms (+0.000ms): sched_clock (__schedule)
04000000 0.608ms (+0.000ms): schedule (down_write)
04000000 0.608ms (+0.000ms): __schedule (down_write)
04000001 0.609ms (+0.000ms): sched_clock (__schedule)
04000000 0.609ms (+0.000ms): schedule (down_write)
04000000 0.609ms (+0.000ms): __schedule (down_write)
04000001 0.609ms (+0.000ms): sched_clock (__schedule)
04000000 0.609ms (+0.000ms): schedule (down_write)
04000000 0.609ms (+0.000ms): __schedule (down_write)
04000001 0.610ms (+0.000ms): sched_clock (__schedule)
04000000 0.610ms (+0.000ms): schedule (down_write)
04000000 0.610ms (+0.000ms): __schedule (down_write)
04000001 0.610ms (+0.000ms): sched_clock (__schedule)
04000000 0.610ms (+0.000ms): schedule (down_write)
04000000 0.610ms (+0.000ms): __schedule (down_write)
04000001 0.610ms (+0.000ms): sched_clock (__schedule)
04000000 0.611ms (+0.000ms): schedule (down_write)
04000000 0.611ms (+0.000ms): __schedule (down_write)
04000001 0.611ms (+0.000ms): sched_clock (__schedule)
04000000 0.611ms (+0.000ms): schedule (down_write)
04000000 0.611ms (+0.000ms): __schedule (down_write)
04000001 0.611ms (+0.000ms): sched_clock (__schedule)
04000000 0.612ms (+0.000ms): schedule (down_write)
04000000 0.612ms (+0.000ms): __schedule (down_write)
04000001 0.612ms (+0.000ms): sched_clock (__schedule)
04000000 0.612ms (+0.000ms): schedule (down_write)
04000000 0.612ms (+0.000ms): __schedule (down_write)
04000001 0.612ms (+0.000ms): sched_clock (__schedule)
04000000 0.613ms (+0.000ms): schedule (down_write)
04000000 0.613ms (+0.000ms): __schedule (down_write)
04000001 0.613ms (+0.000ms): sched_clock (__schedule)
04000000 0.613ms (+0.000ms): schedule (down_write)
04000000 0.613ms (+0.000ms): __schedule (down_write)
04000001 0.613ms (+0.000ms): sched_clock (__schedule)
04000000 0.613ms (+1168210.574ms): schedule (down_write)
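
since the trace above is basically one schedule / __schedule / sched_clock triple repeated over and over, a small script can collapse it into counts. this is only a rough sketch and not part of the latency tracer: the regex is my guess at the line layout as pasted here, and the names summarize_trace and TRACE_LINE are made up for illustration.

```python
import re
from collections import Counter

# Assumed layout, taken from the pasted trace lines, e.g.
# "04000001 0.001ms (+0.000ms): sched_clock (__schedule)"
TRACE_LINE = re.compile(
    r"^(?P<preempt>[0-9a-fA-F]{8})\s+(?P<time>\d+\.\d+)ms\s+"
    r"\(\+(?P<delta>\d+\.\d+)ms\):\s+(?P<func>\S+)\s+\((?P<parent>[^)]+)\)$"
)


def summarize_trace(text):
    """Count (function, parent) pairs and return the covered time span in ms."""
    counts = Counter()
    times = []
    for line in text.splitlines():
        m = TRACE_LINE.match(line.strip())
        if m is None:
            continue  # skip elision markers, shell prompts and anything else
        counts[(m.group("func"), m.group("parent"))] += 1
        times.append(float(m.group("time")))
    span = (min(times), max(times)) if times else None
    return counts, span


# A few lines copied from the trace above, including the elision marker.
sample = """\
04000001 0.001ms (+0.000ms): sched_clock (__schedule)
04000000 0.001ms (+0.000ms): schedule (down_write)
04000000 0.001ms (+0.000ms): __schedule (down_write)
[...many many more of similar ones...]
04000000 0.613ms (+1168210.574ms): schedule (down_write)
"""

counts, span = summarize_trace(sample)
for (func, parent), n in counts.most_common():
    print(f"{n:6d}  {func} ({parent})")
print("span:", span)
```

to run it on a full trace, read the saved trace file into a string and pass that to summarize_trace() instead of the sample.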
mango:~# uname -a
Linux mango.fruits.de 2.6.9-rc4-mm1-RT-U6 #1 Tue Oct 19 17:59:48 CEST 2004 i686 GNU/Linux
mango:~# cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 4
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 1195.144
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr syscall mmxext 3dnowext 3dnow
bogomips : 2359.29
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.9-rc4-mm1-RT-U6
# Tue Oct 19 17:39:11 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
# CONFIG_HOTPLUG is not set
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
# CONFIG_MODULE_FORCE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
CONFIG_MK7=y
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=6
CONFIG_RWSEM_GENERIC=y
CONFIG_GENERIC_SEMAPHORES=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_X86_USE_3DNOW=y
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_BKL=y
CONFIG_PREEMPT_VOLUNTARY=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_REALTIME=y
# CONFIG_X86_UP_APIC is not set
CONFIG_X86_TSC=y
# CONFIG_X86_MCE is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_HAVE_DEC_LOCK=y
#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
CONFIG_APM_REAL_MODE_POWER_OFF=y
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
# CONFIG_ISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
# CONFIG_DEBUG_DRIVER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
#
# Block devices
#
CONFIG_BLK_DEV_FD=m
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_SX8 is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_LBD is not set
# CONFIG_CDROM_PKTCDVD is not set
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
CONFIG_BLK_DEV_IDESCSI=m
# CONFIG_IDE_TASK_IOCTL is not set
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
# CONFIG_BLK_DEV_CMD640 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
# CONFIG_BLK_DEV_PIIX is not set
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
CONFIG_BLK_DEV_SIS5513=y
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=m
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
CONFIG_BLK_DEV_SR=m
# CONFIG_BLK_DEV_SR_VENDOR is not set
CONFIG_CHR_DEV_SG=m
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AACRAID is not set
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLOGIC_1280_1040 is not set
CONFIG_SCSI_QLA2XXX=m
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
CONFIG_NETLINK_DEV=y
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
# CONFIG_IP_MULTIPLE_TABLES is not set
# CONFIG_IP_ROUTE_MULTIPATH is not set
# CONFIG_IP_ROUTE_VERBOSE is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
# CONFIG_IP_NF_CT_ACCT is not set
# CONFIG_IP_NF_CT_PROTO_SCTP is not set
# CONFIG_IP_NF_FTP is not set
# CONFIG_IP_NF_IRC is not set
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
# CONFIG_IP_NF_QUEUE is not set
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
CONFIG_IP_NF_MATCH_ADDRTYPE=m
CONFIG_IP_NF_MATCH_REALM=m
# CONFIG_IP_NF_MATCH_SCTP is not set
# CONFIG_IP_NF_MATCH_COMMENT is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
CONFIG_IP_NF_TARGET_NETMAP=m
CONFIG_IP_NF_TARGET_SAME=m
# CONFIG_IP_NF_NAT_LOCAL is not set
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
CONFIG_IP_NF_TARGET_CLASSIFY=m
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
# CONFIG_NET_SCH_HFSC is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
# CONFIG_NET_SCH_NETEM is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_HP100 is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
CONFIG_SIS900=m
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
CONFIG_GAMEPORT=m
CONFIG_SOUND_GAMEPORT=m
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# CONFIG_GAMEPORT_EMU10K1 is not set
# CONFIG_GAMEPORT_VORTEX is not set
# CONFIG_GAMEPORT_FM801 is not set
# CONFIG_GAMEPORT_CS461x is not set
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
CONFIG_INPUT_PCSPKR=y
# CONFIG_INPUT_UINPUT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
CONFIG_NVRAM=y
CONFIG_RTC=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
# CONFIG_AGP_INTEL is not set
# CONFIG_AGP_INTEL_MCH is not set
# CONFIG_AGP_NVIDIA is not set
CONFIG_AGP_SIS=m
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
# CONFIG_DRM is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HANGCHECK_TIMER=m
#
# I2C support
#
# CONFIG_I2C is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
# CONFIG_FB is not set
CONFIG_VIDEO_SELECT=y
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_DUMMY_CONSOLE=y
#
# Sound
#
CONFIG_SOUND=m
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
# CONFIG_SND_VERBOSE_PRINTK is not set
CONFIG_SND_DEBUG=y
# CONFIG_SND_DEBUG_MEMORY is not set
# CONFIG_SND_DEBUG_DETECT is not set
#
# Generic devices
#
CONFIG_SND_DUMMY=m
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=m
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
CONFIG_SND_CS46XX=m
CONFIG_SND_CS46XX_NEW_DSP=y
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
CONFIG_SND_INTEL8X0=m
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VX222 is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
# CONFIG_USB is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
# CONFIG_EXT2_FS_POSIX_ACL is not set
# CONFIG_EXT2_FS_SECURITY is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
CONFIG_ROMFS_FS=y
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
#
# Caches
#
# CONFIG_FSCACHE is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS_XATTR=y
# CONFIG_DEVPTS_FS_SECURITY is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=m
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
# CONFIG_NFS_FS is not set
# CONFIG_NFSD is not set
# CONFIG_EXPORTFS is not set
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
CONFIG_AFS_FS=m
CONFIG_RXRPC=m
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
CONFIG_NLS_CODEPAGE_850=y
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
CONFIG_NLS_CODEPAGE_1250=y
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_DEBUG_SLAB is not set
CONFIG_DEBUG_PREEMPT=y
CONFIG_PREEMPT_TIMING=y
CONFIG_PREEMPT_TRACE=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_RWSEM_DEADLOCK_DETECT=y
CONFIG_RWSEM_MAX_OWNERS=64
# CONFIG_DEBUG_INFO is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_KPROBES is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_4KSTACKS=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_KGDB is not set
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
# CONFIG_SECURITY_NETWORK is not set
CONFIG_SECURITY_CAPABILITIES=m
# CONFIG_SECURITY_SECLVL is not set
# CONFIG_SECURITY_SELINUX is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=m
CONFIG_LIBCRC32C=m
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_GENERIC_HARDIRQS=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
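To compare a config like the one above against another tester's, it can help to pull out just the preemption-related options. The following is a rough sketch, assuming only the two line shapes visible above (`CONFIG_X=value` and `# CONFIG_X is not set`); parse_kconfig is a hypothetical helper, not a kernel tool.

```python
def parse_kconfig(text):
    """Map option name -> value; options marked 'is not set' map to None."""
    options = {}
    suffix = " is not set"
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("# CONFIG_") and line.endswith(suffix):
            # "# CONFIG_SMP is not set" -> "CONFIG_SMP"
            options[line[2:-len(suffix)]] = None
        elif line.startswith("CONFIG_") and "=" in line:
            name, value = line.split("=", 1)
            options[name] = value.strip('"')
    return options

SAMPLE = """\
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
CONFIG_PREEMPT_REALTIME=y
CONFIG_RWSEM_MAX_OWNERS=64
CONFIG_LOCALVERSION=""
"""

opts = parse_kconfig(SAMPLE)
# Every enabled option whose name starts with CONFIG_PREEMPT.
preempt = sorted(k for k, v in opts.items()
                 if k.startswith("CONFIG_PREEMPT") and v == "y")
print(preempt)
```

Section comments and blank lines are simply skipped, so the whole attached config can be fed in unchanged.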
* Jens Axboe wrote:
> I fully agree with everything in your mail so far. What annoyed me is
> some people advocating their changes under the false pretense that
> existing use was broken, which it isn't.
yeah, and i have to say that such advocacy mostly comes from the natural
desire to solve those _other_ problems that non-standard locking designs
have with Linux mutexes. But those problems are that of the other trees
alone, not upstream's :) Suggesting that those problems are in any way
upstream's problem, even if well-intentioned, can be quite offensive.
> completions _do_ make cleaner code for the intended case. But your
> writing above is very clear and already explains that very well.
>
> Let's put the issue to rest and get back to more productive work!
/me rejoices :)
Ingo
On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> It has to go, why? Because your deadlock detection breaks? Doesn't seem
> a very strong reason to me at all, sorry.
The deadlock detector is needed. Whether you understand that or not is
irrelevant to the current work that's being done. And your idiot attacks
against it don't correct these issues, nor do they gain credibility
with the audience that does find it useful.
bill
* Bill Huey wrote:
> On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> > It has to go, why? Because your deadlock detection breaks? Doesn't seem
> > a very strong reason to me at all, sorry.
>
> The deadlock detector is needed. Whether you understand that or not is
> irrelevant to the current work that's being done. And your idiot
> attacks against it don't correct these issues, nor do they gain
> credibility with the audience that does find it useful.
oh no!
/me watches the flames fan out again as a bushfire
do you expect there to be any meaningful technical discussion resulting
out of you calling Jens' (valid) comments 'idiot attacks'? Fact is,
upstream couldn't care less about PREEMPT_REALTIME and/or the deadlock
detector at this point. Upstream _never_ cared about any not yet merged
(and in this case, completely unfinished) feature.
/me apologizes to Jens
Ingo
On Fri, Oct 22 2004, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> > It has to go, why? Because your deadlock detection breaks? Doesn't seem
> > a very strong reason to me at all, sorry.
>
> The deadlock detector is needed. Whether you understand that or not is
> irrelevant to the current work that's being done. And your idiot attacks
> against it don't correct these issues, nor do they gain credibility
> with the audience that does find it useful.
*plonk*
If you can't stand criticism without resorting to feeble personal
attacks, I suggest you go elsewhere.
--
Jens Axboe
On Fri, Oct 22, 2004 at 10:59:28AM +0200, Jens Axboe wrote:
> On Fri, Oct 22 2004, Bill Huey wrote:
> > On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> > > It has to go, why? Because your deadlock detection breaks? Doesn't seem
> > > a very strong reason to me at all, sorry.
> >
> > The deadlock detector is needed. Whether you understand that or not is
> > irrelevant to the current work that's being done. And your idiot attacks
> > against it don't correct these issues, nor do they gain credibility
> > with the audience that does find it useful.
>
> *plonk*
>
> If you can't stand criticism without resorting to feeble personal
> attacks, I suggest you go elsewhere.
Then stick to the topic at hand, suggest positive changes, and cut the
crap with implied personal attacks like the above. If you hadn't pulled
the discussion to that point, I wouldn't have reacted that way. It's
completely juvenile behavior from you and you can't expect me or
anybody else to take it sitting down.
bill
On Fri, Oct 22, 2004 at 02:06:37AM -0700, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 10:59:28AM +0200, Jens Axboe wrote:
> > *plonk*
> >
> > If you can't stand criticism without resorting to feeble personal
> > attacks, I suggest you go elsewhere.
>
> Then stick to the topic at hand, suggest positive changes, and cut the
> crap with implied personal attacks like the above. If you hadn't pulled
> the discussion to that point, I wouldn't have reacted that way. It's
> completely juvenile behavior from you and you can't expect me or
> anybody else to take it sitting down.
This is also email, so misunderstandings and misinterpretations do
happen. If that's the case, then I'm sorry for misunderstanding you and
then getting upset, but next time be more specific about improving this
code and other things related to it.
bill
On Fri, Oct 22 2004, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 02:06:37AM -0700, Bill Huey wrote:
> > On Fri, Oct 22, 2004 at 10:59:28AM +0200, Jens Axboe wrote:
> > > *plonk*
> > >
> > > If you can't stand criticism without resorting to feeble personal
> > > attacks, I suggest you go elsewhere.
> >
> > Then stick to the topic at hand, suggest positive changes, and cut the
> > crap with implied personal attacks like the above. If you hadn't pulled
> > the discussion to that point, I wouldn't have reacted that way. It's
> > completely juvenile behavior from you and you can't expect me or
> > anybody else to take it sitting down.
>
> This is also email, so misunderstandings and misinterpretations do
> happen. If that's the case, then I'm sorry for misunderstanding you and
> then getting upset, but next time be more specific about improving this
> code and other things related to it.
I've been as clear as I know how on the matter of semaphore use in
Linux. I've made no comments at all on improving your deadlock
detection scheme.
--
Jens Axboe
On Fri, Oct 22 2004, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 10:59:28AM +0200, Jens Axboe wrote:
> > On Fri, Oct 22 2004, Bill Huey wrote:
> > > On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> > > > It has to go, why? Because your deadlock detection breaks? Doesn't seem
> > > > a very strong reason to me at all, sorry.
> > >
> > > The deadlock detector is needed. Whether you understand that or not is
> > > irrelevant to the current work that's being done. And your idiot attacks
> > > against it don't correct these issues, nor do they gain credibility
> > > with the audience that does find it useful.
> >
> > *plonk*
> >
> > If you can't stand criticism without resorting to feeble personal
> > attacks, I suggest you go elsewhere.
>
> Then stick to the topic at hand, suggest positive changes, and cut the
> crap with implied personal attacks like the above. If you hadn't pulled
> the discussion to that point, I wouldn't have reacted that way. It's
> completely juvenile behavior from you and you can't expect me or
> anybody else to take it sitting down.
What mails are you reading?!
Personally, I could not care less about the deadlock detection. If it's
a priority for you personally or due to corporate reasons, fine, but
don't involve me.
I have made no attacks on your deadlock detection other than to state
the obvious - that it has cases where it triggers on perfectly legit
code. If you read that as "implied personal attacks" or "juvenile
behaviour" then you need to grow thicker skin. The only personal attacks
here are the ones coming from you.
--
Jens Axboe
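To make the "triggers on perfectly legit code" point concrete, here is a toy userspace sketch. It assumes the disputed pattern is a semaphore taken in one context and released from another, and that the checker works by tracking an owner per lock; both are assumptions made for illustration. OwnerTrackingSemaphore is a hypothetical class and does not reflect the actual detector code in the patch.

```python
import threading

class OwnerTrackingSemaphore:
    """Toy binary semaphore that remembers its acquirer (hypothetical)."""

    def __init__(self):
        self._sem = threading.Semaphore(1)
        self.owner = None        # thread ident of the current holder
        self.violations = []     # (owner, releaser) pairs that differ

    def down(self):
        self._sem.acquire()
        self.owner = threading.get_ident()

    def up(self):
        me = threading.get_ident()
        if self.owner != me:
            # A mutex-style checker would complain here, even though
            # semaphore semantics allow any thread to do the release.
            self.violations.append((self.owner, me))
        self.owner = None
        self._sem.release()

sem = OwnerTrackingSemaphore()
sem.down()                       # main thread takes it: "work pending"

def worker():
    sem.up()                     # a different thread signals completion

t = threading.Thread(target=worker)
t.start()
sem.down()                       # main thread waits for that signal
t.join()
sem.up()                         # ordinary release by the holder
print(len(sem.violations))
```

The program runs to completion without deadlocking, yet the owner check records one violation for the cross-thread release. That is the kind of report an owner-based checker would produce for signalling-style semaphore use.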
On Fri, Oct 22, 2004 at 11:20:59AM +0200, Jens Axboe wrote:
> I've been as clear as I know how on the matter of semaphore use in
> Linux. I've made no comments at all on improving your deadlock
> detection scheme.
True, but "...deadlock detection breaks" is a negative comment about
the deadlock detector without a positive suggestion to change it, is
it not? If so, then suggest a change to be made and it'll get
implemented somehow.
bill
On Fri, Oct 22 2004, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 11:20:59AM +0200, Jens Axboe wrote:
> > I've been as clear as I know how on the matter of semaphore use in
> > Linux. I've made no comments at all on improving your deadlock
> > detection scheme.
>
> True, but "...deadlock detection breaks" is a negative comment about
> the deadlock detector without a positive suggestion to change it, is
> it not? If so, then suggest a change to be made and it'll get
> implemented somehow.
It's a statement about the deadlock detection which is true, it's not a
negative comment. A negative comment would be something ala "the
deadlock detection code is crap". Note, to avoid further confusion in
this thread: I have not read the deadlock detection code, nor do I
intend to. The sentence is only an example of what a negative comment
would look like, in no way does it reflect my view of the deadlock
detection code. End disclaimer.
As I said, I have no personal motivation to work on the deadlock
detection. My interest in the thread pertained only to code in the
kernel and its use of semaphores - something that we already cleared up
many mails ago.
So, please, let's just end it here. This branch of the thread has already
dragged on for way too long.
--
Jens Axboe
On Fri, 2004-10-22 at 11:06, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 10:59:28AM +0200, Jens Axboe wrote:
> > On Fri, Oct 22 2004, Bill Huey wrote:
> > > On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> > > > It has to go, why? Because your deadlock detection breaks? Doesn't seem
> > > > a very strong reason to me at all, sorry.
> > >
> > > The deadlock detector is needed. Whether you understand that or not is
> > > irrelevant to the current work that's being done. And your idiot attacks
> > > against it doesn't correct these issues nor does it gain credibility
> > > with the audience that does find it useful.
> >
> > *plonk*
> >
> > If you can't stand criticism without resorting to feeble personal
> > attacks, I suggest you go elsewhere.
>
> Then stick to the topic at hand, suggest positive changes, and cut the
> crap with implied personal attacks like the above. If you hadn't pulled
> the discussion to that point, I wouldn't have reacted that way. It's
> completely juvenile behavior from you and you can't expect me or
> anybody else to take it sitting down.
Stop that now!
You have started personal attacks.
This flame was already off. What the heck are you trying to achieve with
this?
tglx
> Rui Nuno Capela wrote:
>> Ingo Molnar wrote:
>>>
>>> i have released the -U8 Real-Time Preemption patch:
>>>
>>> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U8
>>>
> [...]
>
> The fact is that jackd -R (realtime mode; SCHED_FIFO) is hosing the system,
> and that's exposed as soon as some jack audio client application enters
> the chain.
>
> Running jackd non-realtime (SCHED_OTHER) does not expose this problem, so
> I think it's a scheduling related one.
>
> [...]
After a few trial-and-error cycles of changing kernel configuration options,
I have come to believe the obvious: this nasty jackd -R behavior seems to be
present only if PREEMPT_REALTIME is set (Y).
When PREEMPT_REALTIME is not set (N), it just runs and I can throw any
client at 'jackd -R' without hosing the whole system. However, I'm seeing
plenty of these:
BUG: scheduling while atomic: jackd/0x00000002/3968
caller is schedule_timeout+0x5a/0xa8
[<c0104ec8>] dump_stack+0x1e/0x20 (20)
[<c02cd6d2>] __schedule+0x4c4/0x69e (76)
[<c02ce45f>] schedule_timeout+0x5a/0xa8 (60)
[<c0169c75>] do_poll+0x9b/0xb9 (48)
[<c0169e02>] sys_poll+0x16f/0x225 (76)
[<c010408d>] sysenter_past_esp+0x52/0x71 (-8124)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: ipc_lock_writer+0x2f/0xae / (sys_shmctl+0x88/0x895)
.. entry 2: ipc_lock_writer+0xa3/0xae / (sys_shmctl+0x88/0x895)
.. entry 3: print_traces+0x16/0x4a / (dump_stack+0x1e/0x20)
BUG: sleeping function called from invalid context jackd(3968) at mm/slab.c:2055
in_atomic():1 [00000002], irqs_disabled():0
[<c0104ec8>] dump_stack+0x1e/0x20 (20)
[<c011505f>] __might_sleep+0xb2/0xc5 (36)
[<c013e7f8>] __kmalloc+0xa3/0xaa (28)
[<c0169d60>] sys_poll+0xcd/0x225 (76)
[<c010408d>] sysenter_past_esp+0x52/0x71 (-8124)
preempt count: 00000003
. 3-level deep critical section nesting:
.. entry 1: ipc_lock_writer+0x2f/0xae / (sys_shmctl+0x88/0x895)
.. entry 2: ipc_lock_writer+0xa3/0xae / (sys_shmctl+0x88/0x895)
.. entry 3: print_traces+0x16/0x4a / (dump_stack+0x1e/0x20)
OTOH, I do get into some trouble elsewhere, not related to jackd. For
example, the system hangs on udev, never manages to shut down cleanly, and
some system monitoring applications (e.g. gkrellm) just keep failing
completely.
But I found this on late init:
BUG: Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
c01cbb7d
*pde = 00000000
Oops: 0000 [#1]
PREEMPT
Modules linked in: realtime commoncap snd_seq_oss snd_seq_midi_event
snd_seq snd_pcm_oss snd_mixer_oss snd_usb_usx2y snd_usb_lib snd_rawmidi
snd_seq_device snd_hwdep snd_ali5451 snd_ac97_codec snd_pcm snd_timer
snd_page_alloc snd soundcore prism2_cs p80211 ds yenta_socket pcmcia_core
natsemi crc32 loop subfs evdev ohci_hcd usbcore thermal processor fan
button battery ac
CPU: 0
EIP: 0060:[<c01cbb7d>] Not tainted VLI
EFLAGS: 00210246 (2.6.9-rc4-mm1-U9.2)
EIP is at acpi_os_signal_semaphore+0x30/0x4e
eax: df75b700 ebx: 00000001 ecx: 00000000 edx: 00000010
esi: c155c400 edi: 00000000 ebp: de48dd9c esp: de48dd98
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process kdeinit (pid: 3862, threadinfo=de48c000 task=de5af400)
Stack: df750700 de48ddac c01d4b88 df74b160 00000001 de48ddc4 c01d67a2
df750700
df750700 c155c400 c155c400 de48ddd8 c01d3abe df750700 c155c400
00000000
de48ddf8 c01cd2e2 c155c400 00000000 c149c820 c155c400 c155c5ec
00000000
Call Trace:
[<c0104e94>] show_stack+0x80/0x96 (28)
[<c010502f>] show_registers+0x165/0x1de (56)
[<c0105241>] die+0xf6/0x191 (64)
[<c01123ab>] do_page_fault+0x483/0x6a4 (212)
[<c0104af1>] error_code+0x2d/0x38 (72)
[<c01d4b88>] acpi_ex_system_release_mutex+0x28/0x2a (16)
[<c01d67a2>] acpi_ex_release_mutex+0x135/0x154 (24)
[<c01d3abe>] acpi_ex_opcode_1A_0T_0R+0x2a/0x92 (20)
[<c01cd2e2>] acpi_ds_exec_end_op+0xb4/0x28e (32)
[<c01db23e>] acpi_ps_parse_loop+0x528/0x810 (40)
[<c01db57d>] acpi_ps_parse_aml+0x57/0x1c2 (32)
[<c01dbea5>] acpi_psx_execute+0x15d/0x1c4 (28)
[<c01d914d>] acpi_ns_execute_control_method+0x41/0x51 (20)
[<c01d90f2>] acpi_ns_evaluate_by_handle+0x74/0x8e (16)
[<c01d8fed>] acpi_ns_evaluate_relative+0xa9/0xc5 (32)
[<c01d88a5>] acpi_evaluate_object+0xfd/0x1ae (52)
[<c01cbedd>] acpi_evaluate_integer+0x32/0x4f (52)
[<e001a06c>] acpi_button_state_seq_show+0x27/0x64 [button] (32)
[<c0175562>] seq_read+0xd3/0x2cf (60)
[<c015461c>] vfs_read+0xc1/0x13a (44)
[<c015490b>] sys_read+0x4b/0x75 (44)
[<c010408d>] sysenter_past_esp+0x52/0x71 (-8124)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: die+0x3a/0x191 / (do_page_fault+0x483/0x6a4)
.. entry 2: print_traces+0x16/0x4a / (show_stack+0x80/0x96)
Code: 4d 08 8b 5d 0c 85 c9 0f 94 c2 85 db 0f 94 c0 09 d0 ba 01 10 00 00 a8
01 75 28 83 fb 01 66 ba 10 00 77 1f ff 01 0f 8e d2 00 00 00 <8b> 01 48 7e
10 68 29 7e 2e c0 e8 5a c3 f4 ff 5b e8 18 93 f3 ff
As it seemed an ACPI issue, I turned it off and the troubles went away.
Again, all this is happening with PREEMPT_REALTIME off.
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
On Fri, Oct 22, 2004 at 01:50:07AM -0700, Bill Huey wrote:
> On Fri, Oct 22, 2004 at 08:19:01AM +0200, Jens Axboe wrote:
> > It has to go, why? Because your deadlock detection breaks? Doesn't seem
> > a very strong reason to me at all, sorry.
>
> The deadlock detector is needed. Whether you understand that or not is
> irrelevant to the current work that's being done. And your idiot attacks
> against it doesn't correct these issues nor does it gain credibility
> with the audience that does find it useful.
*plonk*
* Nikita Danilov <[email protected]> wrote:
> > look but it doesnt seem simple to convert it. Reiserfs should really use
> > a normal Linux waitqueue and nothing more...
>
> Why? Condition variable is very well known and widely used concept. In
> the area of their applicability (where predicate whose change is
> waited upon is protected by a single lock) they provide clean and
> easily recognizable synchronization device.
sorry, but just look at the kcond code and compare the 'fastpath' with
say the fastpath of Linux semaphores or waitqueue handling.
condition variables (here i dont mean your code specifically, but the
general pthread concept) are simply trying to achieve too much via a
single object, which increases their complexity quite significantly.
Separating out a few select atomic synchronization primitives that can
be used for each appropriate purpose does the job equally well.
condition variables are fine if you 1) already know them from userspace
and 2) want to use a single locking abstraction for everything. It is
thus also a kitchen-sink primitive that is inevitably slow and complex.
I still have to see a locking problem where condvars are the
cleanest/simplest answer, and i've yet to see a locking problem where
condvars are not the slowest answer ;)
of course this too is valid kernel code so i'm not complaining at all.
It was simply too complex to be converted at first sight to
PREEMPT_REALTIME.
Ingo
Ingo Molnar writes:
>
> * Nikita Danilov <[email protected]> wrote:
>
> > > look but it doesnt seem simple to convert it. Reiserfs should really use
> > > a normal Linux waitqueue and nothing more...
> >
> > Why? Condition variable is very well known and widely used concept. In
> > the area of their applicability (where predicate whose change is
> > waited upon is protected by a single lock) they provide clean and
> > easily recognizable synchronization device.
>
> sorry, but just look at the kcond code and compare the 'fastpath' with
> say the fastpath of Linux semaphores or waitqueue handling.
Agreed completely. The kcond implementation is very inefficient, but it's
"obviously correct" at that. The idea was to optimize it later, when we
would have time for this. That hasn't happened so far.
>
> condition variables (here i dont mean your code specifically, but the
> general pthread concept) are simply trying to achieve too much via a
> single object, which increases their complexity quite significantly.
This is quite questionable. Where is this complexity? Standard condition
variable usage looks like
    spin_lock(&lock);
    while (!predicate)
        kcond_wait(&cond_var, &lock);
    /*
     * at this point predicate is true and @lock is held
     */
This can, of course, be implemented with wait-queues, but:
- condition variables are used idiomatically, and hence contain a strong
hint about what the code tries to achieve.
- their API is designed to match their (rather narrow) usage. For
example, kcond_wait() takes the lock as an argument and can (given proper
debugging support in the spin-lock implementation) check that this lock
is actually held by the calling thread.
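For readers who know the idiom only by name, here is a minimal userspace
sketch of the same wait-on-predicate pattern, written with POSIX threads
rather than kcond. It is an illustration under assumptions, not the reiserfs
code: a pthread mutex stands in for the spin-lock, and the names work_flag,
work_flag_init, work_flag_wait and work_flag_post are hypothetical.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical example type; nothing here comes from the kcond code. */
struct work_flag {
    pthread_mutex_t lock;   /* protects 'ready' */
    pthread_cond_t  cond;   /* signalled when 'ready' becomes true */
    bool            ready;  /* the predicate being waited upon */
};

static void work_flag_init(struct work_flag *w)
{
    pthread_mutex_init(&w->lock, NULL);
    pthread_cond_init(&w->cond, NULL);
    w->ready = false;
}

/* Block until the predicate is true, then consume it. */
static void work_flag_wait(struct work_flag *w)
{
    pthread_mutex_lock(&w->lock);
    while (!w->ready)                        /* re-check after every wakeup */
        pthread_cond_wait(&w->cond, &w->lock);
    assert(w->ready);                        /* predicate true, lock held */
    w->ready = false;
    pthread_mutex_unlock(&w->lock);
}

/* Make the predicate true and wake one waiter. */
static void work_flag_post(struct work_flag *w)
{
    pthread_mutex_lock(&w->lock);
    w->ready = true;
    pthread_cond_signal(&w->cond);
    pthread_mutex_unlock(&w->lock);
}
```

The wait function is handed both the condition variable and the lock, which
is the property the second bullet above refers to: the primitive knows which
lock guards the predicate.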
>
> Separating out a few select atomic synchronization primitives that can
> be used for each appropriate purpose does the job equally well.
The difference between a condition variable and "atomic synchronization
primitives" is like the difference between a spin-lock and an open-coded
Dekker algorithm: both provide you with mutual exclusion, but the former
gives one a distinct clue about what is going on.
Umm... I have a better example: every algorithm can be coded without loop
statements, with goto-s only. And (given a proper programmer) goto
produces better assembly than while(). Does this constitute an argument
in favor of throwing away all these wimpy loops that no Real Programmer
should use?
>
> condition variables are fine if you 1) already know them from userspace
> and 2) want to use a single locking abstraction for everything. It is
> thus also a kitchen-sink primitive that is inevitably slow and complex.
> I still have to see a locking problem where condvars are the
> cleanest/simplest answer, and i've yet to see a locking problem where
> condvars are not the slowest answer ;)
A kernel daemon that waits for some work to do is an example.
(And just to fight what seems to be a common misconception: condition
variables were not introduced by the POSIX committee, they are much older
than that. Like all synchronization primitives, they originate from
kernel land.)
>
> of course this too is valid kernel
* K.R. Foley <[email protected]> wrote:
> Oct 21 12:33:22 porky kernel: BUG: atomic counter underflow at:
> Oct 21 12:33:22 porky kernel: [<c0254dd8>] qdisc_destroy+0x98/0xa0 (12)
> Oct 21 12:33:22 porky kernel: [<c0254fed>] dev_shutdown+0x3d/0xa0 (16)
> Oct 21 12:33:22 porky kernel: [<c024773b>] unregister_netdevice+0x13b/0x280 (28)
> Oct 21 12:33:22 porky netfs: Mounting other filesystems: succeeded
> Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (12)
> Oct 21 12:33:22 porky kernel: [<e09a6160>] tulip_remove_one+0x0/0xa0 [tulip] (4)
> Oct 21 12:33:22 porky kernel: [<c02058de>] unregister_netdev+0x1e/0x30 (24)
> Oct 21 12:33:22 porky kernel: [<e09a618f>] tulip_remove_one+0x2f/0xa0 [tulip] (16)
> Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (8)
> Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (8)
> Oct 21 12:33:22 porky kernel: [<c01c4e26>] pci_device_remove+0x76/0x80 (20)
> Oct 21 12:33:22 porky kernel: [<c01f573b>] device_detach_shutdown+0xb/0x40 (12)
> Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (12)
> Oct 21 12:33:22 porky kernel: [<c01f293b>] driver_detach+0x2b/0x40 (24)
> Oct 21 12:33:22 porky kernel: [<c01f2daf>] bus_remove_driver+0x3f/0x70 (20)
> Oct 21 12:33:22 porky kernel: [<c01f32b9>] driver_unregister+0x19/0x30 (20)
> Oct 21 12:33:22 porky kernel: [<c01c50cc>] pci_unregister_driver+0x1c/0x30 (16)
> Oct 21 12:33:22 porky kernel: [<e09a7767>] tulip_cleanup+0x17/0x1b [tulip] (16)
> Oct 21 12:33:22 porky kernel: [<c0139801>] sys_delete_module+0x121/0x150 (12)
> Oct 21 12:33:22 porky kernel: [<c01531a1>] sys_munmap+0x51/0x60 (64)
> Oct 21 12:33:22 porky kernel: [<c0116a20>] do_page_fault+0x0/0x660 (16)
> Oct 21 12:33:22 porky kernel: [<c0106719>] sysenter_past_esp+0x52/0x71 (16)
i think this is an upstream bug that the atomic-counter debugging assert
has caught. Jeff, the assert above shows qdisc->refcnt underflowing from 0
to -1. So the qdisc_destroy() [or dev_shutdown()?] use is imbalanced.
Plus rtl8139 is doing this too, see the log below. The upstream kernel
does not notice this condition. Or is ->refcnt allowed to underflow?
Ingo
Oct 20 16:47:18 localhost kernel: BUG: atomic counter underflow at:
Oct 20 16:47:18 localhost kernel: [<c02b8d88>] qdisc_destroy+0x98/0xa0 (12)
Oct 20 16:47:18 localhost kernel: [<c02b8f9d>] dev_shutdown+0x3d/0xa0 (16)
Oct 20 16:47:18 localhost kernel: [<c02aa38b>] unregister_netdevice+0x13b/0x280 (28)
Oct 20 16:47:18 localhost kernel: [<c01148b0>] mcount+0x14/0x18 (8)
Oct 20 16:47:18 localhost kernel: [<e0836b10>] rtl8139_remove_one+0x0/0xa0 [8139too] (4)
Oct 20 16:47:18 localhost kernel: [<c0241f8e>] unregister_netdev+0x1e/0x30 (24)
Oct 20 16:47:18 localhost kernel: [<e0836b3a>] rtl8139_remove_one+0x2a/0xa0 [8139too] (16)
Oct 20 16:47:18 localhost kernel: [<c01148b0>] mcount+0x14/0x18 (12)
Oct 20 16:47:18 localhost kernel: [<c01e4016>] pci_device_remove+0x76/0x80 (20)
Oct 20 16:47:18 localhost kernel: [<c02306cb>] device_detach_shutdown+0xb/0x40 (12)
Oct 20 16:47:18 localhost kernel: [<c022d817>] device_release_driver+0x67/0x70 (12)
Oct 20 16:47:18 localhost kernel: [<c022d84b>] driver_detach+0x2b/0x40 (24)
Oct 20 16:47:18 localhost kernel: [<c022dcbf>] bus_remove_driver+0x3f/0x70 (20)
Oct 20 16:47:18 localhost kernel: [<c022e1c9>] driver_unregister+0x19/0x30 (20)
Oct 20 16:47:18 localhost kernel: [<c01e42bc>] pci_unregister_driver+0x1c/0x30 (16)
Oct 20 16:47:18 localhost kernel: [<e0838b07>] rtl8139_cleanup_module+0x17/0x1b [8139too] (16)
Oct 20 16:47:18 localhost kernel: [<c013c8d1>] sys_delete_module+0x121/0x150 (12)
Oct 20 16:47:18 localhost kernel: [<c01596a4>] sys_munmap+0x54/0x70 (64)
Oct 20 16:47:18 localhost kernel: [<c0118560>] do_page_fault+0x0/0x6d0 (16)
Oct 20 16:47:18 localhost kernel: [<c0107b49>] sysenter_past_esp+0x52/0x71 (16)
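The kind of condition reported in these traces can be modelled in userspace
with C11 atomics. This is only a sketch under stated assumptions: struct obj,
obj_init and obj_put are hypothetical names, the debugging assert in the
patch may well be implemented differently, and where the kernel prints a BUG
trace this model merely counts the event.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model object; this is not the kernel's qdisc structure. */
struct obj {
    atomic_int refcnt;
    int        underflows;  /* how many times a put hit a counter <= 0 */
};

static void obj_init(struct obj *o, int initial)
{
    assert(initial >= 0);
    atomic_init(&o->refcnt, initial);
    o->underflows = 0;
}

/*
 * Drop one reference. Returns true only when this call released the
 * last reference (1 -> 0). A put on a counter that is already 0 is an
 * imbalanced put: the counter goes 0 -> -1 and the event is recorded.
 */
static bool obj_put(struct obj *o)
{
    int old = atomic_fetch_sub(&o->refcnt, 1);

    if (old <= 0) {
        o->underflows++;    /* the kernel would print a trace here */
        return false;
    }
    return old == 1;
}
```

Without such a check the second, imbalanced put goes unnoticed, which matches
the observation that the upstream kernel does not flag the condition.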
* Nikita Danilov <[email protected]> wrote:
> > condition variables are fine if you 1) already know them from userspace
> > and 2) want to use a single locking abstraction for everything. It is
> > thus also a kitchen-sink primitive that is inevitably slow and complex.
> > I still have to see a locking problem where condvars are the
> > cleanest/simplest answer, and i've yet to see a locking problem where
> > condvars are not the slowest answer ;)
>
> A kernel daemon that waits for some work to do is an example.
what type of work - could you be a bit more specific?
Ingo
Ingo Molnar writes:
>
> * Nikita Danilov <[email protected]> wrote:
>
> > > condition variables are fine if you 1) already know them from userspace
> > > and 2) want to use a single locking abstraction for everything. It is
> > > thus also a kitchen-sink primitive that is inevitably slow and complex.
> > > I still have to see a locking problem where condvars are the
> > > cleanest/simplest answer, and i've yet to see a locking problem where
> > > condvars are not the slowest answer ;)
> >
> > A kernel daemon that waits for some work to do is an example.
>
> what type of work - could you be a bit more specific?
Take a loop in fs/cifs/cifsfs.c:cifs_oplock_thread() (I won't copy it
here to avoid you all going blind). It can be recoded as
    while (1) {
        spin_lock(&GlobalMid_Lock);
        while (list_empty(&GlobalOplock_Q)) {
            if (kcond_timedwait(&SomeCIFSCVAR, &GlobalMid_Lock, HZ) == -EINTR) {
                spin_unlock(&GlobalMid_Lock);
                complete_and_exit(&cifs_oplock_exited, 0);
            }
        }
        oplock_item = list_entry(GlobalOplock_Q.next, struct oplock_q_entry, qhead);
        /* do stuff with oplock_item ... */
        spin_unlock(&GlobalMid_Lock);
        ....
    }
The point is that this is a very idiomatic usage, easily recognizable.
>
> Ingo
Nikita.
* Nikita Danilov <[email protected]> wrote:
> > > A kernel daemon that waits for some work to do is an example.
> >
> > what type of work - could you be a bit more specific?
>
> Take a loop in fs/cifs/cifsfs.c:cifs_oplock_thread() (I won't copy it
> here to avoid you all going blind). It can be recoded as
>
>     while (1) {
>         spin_lock(&GlobalMid_Lock);
>         while (list_empty(&GlobalOplock_Q)) {
>             if (kcond_timedwait(&SomeCIFSCVAR, &GlobalMid_Lock, HZ) == -EINTR) {
>                 spin_unlock(&GlobalMid_Lock);
>                 complete_and_exit(&cifs_oplock_exited, 0);
>             }
>         }
>         oplock_item = list_entry(GlobalOplock_Q.next, struct oplock_q_entry, qhead);
>         /* do stuff with oplock_item ... */
>         spin_unlock(&GlobalMid_Lock);
>         ....
>     }
in this particular case i'd use a workqueue, which would simplify this
down to something like:
    workqueue_handler()
    {
        spin_lock(&GlobalMid_Lock);
        oplock_item = list_entry(GlobalOplock_Q.next, ...);
        /* do stuff with oplock_item */
        spin_unlock(&GlobalMid_Lock);
    }
and instead of playing games with signals to exit the worker thread, i'd
use destroy_workqueue().
Ingo
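A rough userspace model of that structure, for illustration only. It is not
the kernel workqueue API: mini_wq_create(), mini_wq_queue(), mini_wq_destroy()
and the bump() handler are hypothetical names, and pthreads stand in for
kernel threads. What it tries to show is that the handler holds only the
per-item work, while waiting and shutdown live inside the queue.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical userspace model; not the kernel workqueue implementation. */
struct work_item {
    struct work_item *next;
    void (*fn)(void *arg);
    void *arg;
};

struct mini_wq {
    pthread_mutex_t   lock;
    pthread_cond_t    more;      /* signalled on new work or shutdown */
    struct work_item *head, *tail;
    bool              stopping;
    pthread_t         worker;
};

static void *mini_wq_thread(void *p)
{
    struct mini_wq *wq = p;

    pthread_mutex_lock(&wq->lock);
    for (;;) {
        while (!wq->head && !wq->stopping)
            pthread_cond_wait(&wq->more, &wq->lock);
        if (!wq->head)               /* stopping and fully drained */
            break;
        struct work_item *it = wq->head;
        wq->head = it->next;
        if (!wq->head)
            wq->tail = NULL;
        pthread_mutex_unlock(&wq->lock);
        it->fn(it->arg);             /* run the handler without the lock */
        free(it);
        pthread_mutex_lock(&wq->lock);
    }
    pthread_mutex_unlock(&wq->lock);
    return NULL;
}

static int mini_wq_create(struct mini_wq *wq)
{
    pthread_mutex_init(&wq->lock, NULL);
    pthread_cond_init(&wq->more, NULL);
    wq->head = wq->tail = NULL;
    wq->stopping = false;
    return pthread_create(&wq->worker, NULL, mini_wq_thread, wq);
}

static int mini_wq_queue(struct mini_wq *wq, void (*fn)(void *), void *arg)
{
    struct work_item *it = malloc(sizeof(*it));

    if (!it)
        return -1;
    it->next = NULL;
    it->fn = fn;
    it->arg = arg;
    pthread_mutex_lock(&wq->lock);
    if (wq->tail)
        wq->tail->next = it;
    else
        wq->head = it;
    wq->tail = it;
    pthread_cond_signal(&wq->more);
    pthread_mutex_unlock(&wq->lock);
    return 0;
}

/* Let the worker drain pending items, then stop and join it. */
static void mini_wq_destroy(struct mini_wq *wq)
{
    pthread_mutex_lock(&wq->lock);
    wq->stopping = true;
    pthread_cond_signal(&wq->more);
    pthread_mutex_unlock(&wq->lock);
    pthread_join(wq->worker, NULL);
    assert(wq->head == NULL);
    pthread_cond_destroy(&wq->more);
    pthread_mutex_destroy(&wq->lock);
}

/* Example handler: only the per-item work, no wait loop, no exit logic. */
static void bump(void *arg)
{
    ++*(int *)arg;
}
```

In this model, shutdown is a plain function call that joins the worker; no
signal has to be delivered to make the thread leave its loop.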
* Alexander Batyrshin <[email protected]> wrote:
> used i386/defconfig
> BUG: semaphore recursion deadlock detected!
> .. current task khpsbpkt/723 is already holding c04610c0.
ok, this should be fixed in -U9.2.
> 2.
> if execute
> ``for i in `seq 1 9999`; do nohup bash >/dev/null 2>&1 & done'',
> then you'll get something like:
> [...skip...]
> Warning: dev (pts0) tty->count(16) != #fd's(8) in tty_open
> Warning: dev (pts0) tty->count(16) != #fd's(11) in tty_open
> I've tested it against linux-2.6.9-rc4-mm1 => all was ok
i have trouble reproducing this myself. Can you still trigger it under
-U9.2? If yes then could you check whether this still happens with the
same .config but with CONFIG_SMP turned off? This smells like a locking
bug/breakage in the tty layer that we dont detect. You have all the
relevant debug options turned on, correct? (DEBUG_PREEMPT and
RWSEM_DEADLOCK_DETECT)
Ingo
i have released the -U9.3 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this too is a fixes-only release.
Changes since -U9.2:
- tons more driver/mutex/completion conversion done by Thomas Gleixner
for: ppp, ipmi, parport/ieee1284, scsi, hotplug, and more.
- iptables/netfilter deadlock fix, this should fix the bug reported by
Michal Schmidt.
- .config housekeeping: disallow the turning off of PREEMPT_BKL when
PREEMPT_REALTIME is on. This solves the build error reported by
Matthew L Foster.
- print the full stacktrace of the current task in the deadlock
detector and dont use show_stack(). This explains some of the weird
partial stackdumps reported.
- some more minor updates to the case when the deadlock detector turns
itself off due to reaching the limit: until now the spinlock was left locked.
to create a -U9.3 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.8.tar.bz2
+ http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.9-rc4.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9-rc4/2.6.9-rc4-mm1/2.6.9-rc4-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U9.3
Ingo
* Alexander Batyrshin <[email protected]> wrote:
> 2.
> if execute
> ``for i in `seq 1 9999`; do nohup bash >/dev/null 2>&1 & done'',
> then you'll get something like:
> [...skip...]
> Warning: dev (pts0) tty->count(16) != #fd's(8) in tty_open
> Warning: dev (pts0) tty->count(16) != #fd's(11) in tty_open
btw., where did you run the test - over ssh or in an xterm?
Ingo
Thomas Gleixner wrote:
> On Thu, 2004-10-21 at 20:57, K.R. Foley wrote:
>
>>>I guess, you don't have a tulip network card in your box, as the module
>>>is removed.
>>>
>>>The question is, if it got registered correctly before the removal.
>>
>>Actually I do have the tulip card in the box and I am pulling this stuff
>>from the logs over that connection now. Here are the next lines from the
>>log that might help.
>
>
More info on this:
As mentioned before, I get this on boot (every boot):
Oct 21 12:33:22 porky kernel: Linux Tulip driver version 1.1.13 (May 11, 2002)
Oct 21 12:33:22 porky kernel: PCI: Found IRQ 5 for device 0000:04:0a.0
Oct 21 12:33:22 porky kernel: PCI: Sharing IRQ 5 with 0000:04:05.1
Oct 21 12:33:22 porky kernel: tulip0: EEPROM default media type Autosense.
Oct 21 12:33:22 porky kernel: tulip0: Index #0 - Media MII (#11) described by a 21140 MII PHY (1) block.
Oct 21 12:33:22 porky kernel: tulip0: MII transceiver #3 config 3100 status 7809 advertising 01e1.
Oct 21 12:33:22 porky kernel: eth0: Digital DS21140 Tulip rev 32 at 0xe480, 00:00:C0:7F:A0:E9, IRQ 5.
Oct 21 12:33:22 porky kernel: BUG: atomic counter underflow at:
Oct 21 12:33:22 porky kernel: [<c0254dd8>] qdisc_destroy+0x98/0xa0 (12)
Oct 21 12:33:22 porky kernel: [<c0254fed>] dev_shutdown+0x3d/0xa0 (16)
Oct 21 12:33:22 porky kernel: [<c024773b>] unregister_netdevice+0x13b/0x280 (28)
Oct 21 12:33:22 porky netfs: Mounting other filesystems: succeeded
Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (12)
Oct 21 12:33:22 porky kernel: [<e09a6160>] tulip_remove_one+0x0/0xa0 [tulip] (4)
Oct 21 12:33:22 porky kernel: [<c02058de>] unregister_netdev+0x1e/0x30 (24)
Oct 21 12:33:22 porky kernel: [<e09a618f>] tulip_remove_one+0x2f/0xa0 [tulip] (16)
Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (8)
Oct 21 12:33:22 porky kernel: [<c0112fb0>] mcount+0x14/0x18 (8)
Oct 21 12:33:22 porky kernel: [<c01c4e26>] pci_device_remove+0x76/0x80 (20)
Oct 21 12:33:22 porky kernel: [<c01f573b>] device_detach_shutdown+0xb/0x40 (12)
Oct 21 12:33:22 porky kernel: [<c01f2907>] device_release_driver+0x67/0x70 (12)
Oct 21 12:33:22 porky kernel: [<c01f293b>] driver_detach+0x2b/0x40 (24)
Oct 21 12:33:22 porky kernel: [<c01f2daf>] bus_remove_driver+0x3f/0x70 (20)
Oct 21 12:33:22 porky kernel: [<c01f32b9>] driver_unregister+0x19/0x30 (20)
Oct 21 12:33:22 porky kernel: [<c01c50cc>] pci_unregister_driver+0x1c/0x30 (16)
Oct 21 12:33:22 porky kernel: [<e09a7767>] tulip_cleanup+0x17/0x1b [tulip] (16)
Oct 21 12:33:22 porky kernel: [<c0139801>] sys_delete_module+0x121/0x150 (12)
Oct 21 12:33:22 porky kernel: [<c01531a1>] sys_munmap+0x51/0x60 (64)
Oct 21 12:33:22 porky kernel: [<c0116a20>] do_page_fault+0x0/0x660 (16)
Oct 21 12:33:22 porky kernel: [<c0106719>] sysenter_past_esp+0x52/0x71 (16)
Oct 21 12:33:22 porky kernel: preempt count: 00000001
Oct 21 12:33:22 porky kernel: . 1-level deep critical section nesting:
Oct 21 12:33:22 porky kernel: .. entry 1: print_traces+0x1d/0x60 / (dump_stack+0x23/0x30)
Oct 21 12:33:22 porky kernel:
Oct 21 12:33:22 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
I am not sure why the tulip driver is being loaded, unloaded, and reloaded
on every boot. Anyway, I wanted to check whether I could generate
the above bug on subsequent unloads of the module. I downed the network
and then unloaded the tulip module. I did get the message below when
unloading the module but no "BUG: atomic counter underflow" message.
Oct 22 05:43:33 porky kernel: tulip 0000:04:0a.0: Device was removed without properly calling pci_disable_device(). This may need fixing.
Oct 22 05:43:33 porky net.agent[921]: remove event not handled
kr
On Fri, 2004-10-22 at 16:12, K.R. Foley wrote:
> I am not sure why the tulip driver is being loaded,unloaded,reloaded
> every time on boot? Anyway, I wanted to check to see if I could generate
> the above bug on subsequent unloads of the module. I downed the network
> and the unloaded the tulip module. I did get the message below when
> unloading the module but no "BUG: atomic counter underflow" message.
>
> Oct 22 05:43:33 porky kernel: tulip 0000:04:0a.0: Device was removed
> without properly calling pci_disable_device(). This may need fixing.
> Oct 22 05:43:33 porky net.agent[921]: remove event not handled
Can you please verify this against vanilla 2.6.9 and 2.6.9-mm1 ?
tglx
Thomas Gleixner wrote:
> On Fri, 2004-10-22 at 16:12, K.R. Foley wrote:
>
>>I am not sure why the tulip driver is being loaded,unloaded,reloaded
>>every time on boot? Anyway, I wanted to check to see if I could generate
>>the above bug on subsequent unloads of the module. I downed the network
>>and the unloaded the tulip module. I did get the message below when
>>unloading the module but no "BUG: atomic counter underflow" message.
>>
>>Oct 22 05:43:33 porky kernel: tulip 0000:04:0a.0: Device was removed
>>without properly calling pci_disable_device(). This may need fixing.
>>Oct 22 05:43:33 porky net.agent[921]: remove event not handled
>
>
> Can you please verify this against vanilla 2.6.9 and 2.6.9-mm1 ?
>
> tglx
>
I will verify it against 2.6.9 when I get time. I did verify that the
"Device was removed without properly calling pci_disable_device(). This
may need fixing." message is generated with 2.6.9-rc3-mm3 without the
preempt patches. Also, thanks to Mark Johnson's suggestion, I verified
that the driver is being loaded twice because kudzu loads it once and
then unloads it.
kr
i have released the -U10 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this is purely a rebasing of -U9.3 to 2.6.9-mm1.
to create a -U10 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10
Ingo
On Fri, 2004-10-22 at 17:15, K.R. Foley wrote:
> Thomas Gleixner wrote:
> > On Fri, 2004-10-22 at 16:12, K.R. Foley wrote:
> >
> >>I am not sure why the tulip driver is being loaded,unloaded,reloaded
> >>every time on boot? Anyway, I wanted to check to see if I could generate
> >>the above bug on subsequent unloads of the module. I downed the network
> >>and the unloaded the tulip module. I did get the message below when
> >>unloading the module but no "BUG: atomic counter underflow" message.
> >>
> >>Oct 22 05:43:33 porky kernel: tulip 0000:04:0a.0: Device was removed
> >>without properly calling pci_disable_device(). This may need fixing.
> >>Oct 22 05:43:33 porky net.agent[921]: remove event not handled
> >
> >
> > Can you please verify this against vanilla 2.6.9 and 2.6.9-mm1 ?
> >
> > tglx
> >
>
> I will verify it against 2.6.9 when I get time. I did verify the "Device
> was removed without properly calling pci_disable_device(). This may need
> fixing." message is generated with 2.6.9-rc3-mm3 without preempt
> patches. Also thanks to Mark Johnson's suggestion I verified that the
> reason the driver is being loaded twice is because kudzu is loading it
> once then unloading it.
>
> kr
--- 2.6.9-rc4-mm1/drivers/net/tulip/tulip_core.c	2004-10-12 09:41:27.000000000 +0200
+++ 2.6.9-rc4-mm1-U9-E0/drivers/net/tulip/tulip_core.c	2004-10-22 17:54:31.000000000 +0200
@@ -1784,6 +1784,7 @@
 #endif
 	free_netdev (dev);
 	pci_release_regions (pdev);
+	pci_disable_device (pdev);
 	pci_set_drvdata (pdev, NULL);

 	/* pci_power_off (pdev, -1); */
On Friday 22 October 2004 11:50, Ingo Molnar wrote:
>i have released the -U10 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this is purely a rebasing of -U9.3 to 2.6.9-mm1.
>
>to create a -U10 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10
>
> Ingo
As sort of the ultimate dummy test, I'm building this right now. The
only oddments so far are a bunch of deprecated variable warnings,
quite a few but many are dups.
I'll repost after I've tried it.
>To unsubscribe from this list: send the line "unsubscribe
> linux-kernel" in the body of a message to [email protected]
>More majordomo info at http://vger.kernel.org/majordomo-info.html
>Please read the FAQ at http://www.tux.org/lkml/
--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.28% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
* Gene Heskett <[email protected]> wrote:
> As sort of the ultimate dummy test, I'm building this right now. The
> only oddments so far are a bunch of deprecated variable warnings,
> quite a few but many are dups.
these warnings are present in -mm1 too.
Ingo
* Thomas Gleixner <[email protected]> wrote:
> --- 2.6.9-rc4-mm1/drivers/net/tulip/tulip_core.c 2004-10-12
> pci_release_regions (pdev);
> + pci_disable_device (pdev);
> pci_set_drvdata (pdev, NULL);
i've uploaded -U10.1 with this fix included plus a fix to the tg3 and
3c59x drivers. (the drivers would disable interrupts in
hard_start_xmit). I've also added debugging code to catch future
instances of this network driver related problem.
Ingo
Ingo Molnar wrote:
>* Gene Heskett <[email protected]> wrote:
>
>
>
>>As sort of the ultimate dummy test, I'm building this right now. The
>>only oddments so far are a bunch of deprecated variable warnings,
>>quite a few but many are dups.
>>
>>
>
>these warnings are present in -mm1 too.
>
> Ingo
Hey Ingo,
Bite Me.
:-)
Jeff
Russell Miller wrote:
>Hey, Jeff,
>
>This is a very high traffic list. If you have nothing constructive to say,
>don't say it.
>--Russell
>
>
>
I'll stick to bugs from now on, Russell. Please forward the same to
Ingo ....
Jeff
Thomas Gleixner wrote:
> On Fri, 2004-10-22 at 17:15, K.R. Foley wrote:
>
>>Thomas Gleixner wrote:
>>
>>>On Fri, 2004-10-22 at 16:12, K.R. Foley wrote:
>>>
>>>
>>>>I am not sure why the tulip driver is being loaded,unloaded,reloaded
>>>>every time on boot? Anyway, I wanted to check to see if I could generate
>>>>the above bug on subsequent unloads of the module. I downed the network
>>>>and the unloaded the tulip module. I did get the message below when
>>>>unloading the module but no "BUG: atomic counter underflow" message.
>>>>
>>>>Oct 22 05:43:33 porky kernel: tulip 0000:04:0a.0: Device was removed
>>>>without properly calling pci_disable_device(). This may need fixing.
>>>>Oct 22 05:43:33 porky net.agent[921]: remove event not handled
>>>
>>>
>>>Can you please verify this against vanilla 2.6.9 and 2.6.9-mm1 ?
>>>
>>>tglx
>>>
>>
>>I will verify it against 2.6.9 when I get time. I did verify the "Device
>>was removed without properly calling pci_disable_device(). This may need
>>fixing." message is generated with 2.6.9-rc3-mm3 without preempt
>>patches. Also thanks to Mark Johnson's suggestion I verified that the
>>reason the driver is being loaded twice is because kudzu is loading it
>>once then unloading it.
>>
>>kr
>
>
> --- 2.6.9-rc4-mm1/drivers/net/tulip/tulip_core.c 2004-10-12 09:41:27.000000000 +0200
> +++ 2.6.9-rc4-mm1-U9-E0/drivers/net/tulip/tulip_core.c 2004-10-22 17:54:31.000000000 +0200
> @@ -1784,6 +1784,7 @@
> #endif
> free_netdev (dev);
> pci_release_regions (pdev);
> + pci_disable_device (pdev);
> pci_set_drvdata (pdev, NULL);
>
> /* pci_power_off (pdev, -1); */
>
>
>
This does fix this error:
porky kernel: tulip 0000:04:0a.0: Device was removed without properly
calling pci_disable_device(). This may need fixing.
thanks,
kr
i have released the -U10.2 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes-only release.
Changes since -U10:
- fixed a big bug present ever since: the BKL got dropped when a
spinlock-mutex was acquired and it scheduled away. This reduced the
locking efficiency of the BKL. A number of outstanding problems could
be affected, in particular this should fix the tty locking breakage
reported by Alexander Batyrshin and Adam Heath. UP and SMP systems
are affected too, with SMP systems having a higher chance to trigger
this condition.
- tulip.c breakage fix from Thomas Gleixner
- tg3 and 3c59x fixes.
- made the hardirq threads SCHED_FIFO by default. They get priorities
between 25 and 50, depending on the irq #. (this is pretty random but
i found no better scheme.) Made the softirq thread SCHED_FIFO by
default as well, albeit this probably will have to change. These
changes should make it easier to debug a hung system.
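One way to use the SCHED_FIFO change when debugging a hang is to run a shell or watchdog at a real-time priority above the 25..50 band given to the hardirq threads. The following is a minimal user-space sketch, not part of the patch: the helper names and the choice of "one above 50" are assumptions made for illustration, and the call needs root (or CAP_SYS_NICE) to succeed.

```c
#include <sched.h>
#include <string.h>

/* Upper end of the hardirq-thread priority band quoted in the changelog. */
#define IRQ_THREAD_PRIO_MAX 50

/*
 * Pick a SCHED_FIFO priority just above the hardirq threads, clamped to
 * the maximum the running system reports. Returns -1 on error.
 * (Hypothetical helper, for illustration only.)
 */
static int debug_prio_above_irq_threads(void)
{
    int max = sched_get_priority_max(SCHED_FIFO);
    int want = IRQ_THREAD_PRIO_MAX + 1;

    if (max < 0)
        return -1;
    return want > max ? max : want;
}

/*
 * Switch the calling process to SCHED_FIFO at that priority.
 * Returns 0 on success, -1 on failure (errno is set by the C library,
 * typically EPERM when run without sufficient privileges).
 */
static int make_self_fifo(void)
{
    struct sched_param sp;
    int prio = debug_prio_above_irq_threads();

    if (prio < 0)
        return -1;
    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = prio;
    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

A debug tool would call make_self_fifo() once at startup and fall back to normal scheduling if it returns -1. Whether a priority above the IRQ threads is really what you want depends on the problem being chased.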
to create a -U10.2 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.2
Ingo
Hi Ingo,
U9.3 (defconfig) died with trace:
------------[ cut here ]------------
kernel BUG at kernel/sched.c:784!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c011873d>] Not tainted VLI
EFLAGS: 00010002 (2.6.9-rc4-mm1-RT-U9.3)
EIP is at resched_task+0x80/0x8a
eax: 00000001 ebx: c1674000 ecx: dfb578f0 edx: 00000000
esi: c16712c0 edi: 00000000 ebp: c167bc48 esp: c167bc3c
ds: 007b es: 007b ss: 0068 preempt: 00000003
Process ksoftirqd/0 (pid: 3, threadinfo=c167a000 task=c1670660)
Stack: c1403820 c1403820 00000000 c167bc94 c0118b78 c16712c0 c1433820
00000000
00000000 00000000 00000001 00000100 c1433820 c1433820 00000001
00000000
00000000 00000001 00000082 00000001 de9d3e60 00000001 c167bcd8
c0133c7b
Call Trace:
[<c0118b78>] try_to_wake_up+0x1f3/0x26b (20)
[<c0133c7b>] autoremove_wake_function+0x2f/0x57 (76)
[<c011a766>] __wake_up_common+0x3f/0x5e (28)
[<c011a7d0>] __wake_up+0x4b/0x62 (40)
[<c034e08c>] sock_def_readable+0x8e/0x90 (52)
[<c0388dbe>] tcp_child_process+0xe6/0xec (28)
[<c0384df7>] tcp_v4_do_rcv+0x123/0x162 (36)
[<c0134079>] _mutex_lock+0x2c/0x3b (16)
[<c0385513>] tcp_v4_rcv+0x6dd/0x92c (16)
[<c03c0e30>] svc_revisit+0x27/0x154 (48)
[<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (40)
[<c0368609>] ip_local_deliver_finish+0xb6/0x1b2 (4)
[<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (12)
[<c035f607>] nf_hook_slow+0xdc/0x12e (20)
[<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (28)
[<c0368705>] ip_rcv_finish+0x0/0x2b3 (28)
[<c0367fcb>] ip_local_deliver+0x208/0x226 (4)
[<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (24)
[<c0368828>] ip_rcv_finish+0x123/0x2b3 (20)
[<c0368705>] ip_rcv_finish+0x0/0x2b3 (32)
[<c035f607>] nf_hook_slow+0xdc/0x12e (20)
[<c0368705>] ip_rcv_finish+0x0/0x2b3 (28)
[<c036846a>] ip_rcv+0x481/0x56a (32)
[<c0368705>] ip_rcv_finish+0x0/0x2b3 (24)
[<c0354e68>] netif_receive_skb+0x117/0x1dd (28)
[<c0354ff7>] process_backlog+0xc9/0x1cb (36)
[<c03551b2>] net_rx_action+0xb9/0x1ed (48)
[<c01242d9>] ___do_softirq+0xe1/0x130 (36)
[<c0124923>] ksoftirqd+0x0/0xda (40)
[<c01243fb>] _do_softirq+0x4b/0x4d (4)
[<c01249c5>] ksoftirqd+0xa2/0xda (16)
[<c013376d>] kthread+0xb7/0xbd (24)
[<c01336b6>] kthread+0x0/0xbd (28)
[<c0103375>] kernel_thread_helper+0x5/0xb (16)
preempt count: 00000004
. 4-level deep critical section nesting:
.. entry 1: _spin_lock_irqsave+0x1d/0xa5 [<00000000>] / (0x0 [<00000000>])
.. entry 2: _spin_lock+0x19/0x6d [<00000000>] / (0x0 [<00000000>])
.. entry 3: _spin_lock_irqsave+0x1d/0xa5 [<00000000>] / (0x0 [<00000000>])
.. entry 4: print_traces+0x17/0x4e [<00000000>] / (0x0 [<00000000>])
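For readers following the "preempt:" and "preempt count:" values in reports like the one above, here is a small sketch that splits such a value into its fields. The bit layout is an assumption based on the usual 2.6-era convention (low 8 bits preempt depth, next 8 bits softirq count, bits 16-27 hardirq count, bit 28 PREEMPT_ACTIVE); it is not confirmed for this patch series, and the struct and function names are made up for the example.

```c
#include <stdio.h>

/* Assumed layout; check the headers of the tree you are actually running. */
#define PC_PREEMPT_MASK   0x000000ffUL  /* bits 0-7: critical-section nesting */
#define PC_SOFTIRQ_MASK   0x0000ff00UL  /* bits 8-15: softirq nesting */
#define PC_HARDIRQ_MASK   0x0fff0000UL  /* bits 16-27: hardirq nesting */
#define PC_PREEMPT_ACTIVE 0x10000000UL  /* bit 28: preemption in progress */

struct preempt_fields {
    unsigned int depth;    /* preempt_disable()/spinlock nesting depth */
    unsigned int softirq;  /* softirq nesting count */
    unsigned int hardirq;  /* hardirq nesting count */
    int active;            /* 1 if the PREEMPT_ACTIVE bit is set */
};

/* Split a raw preempt count into the assumed fields. */
static struct preempt_fields decode_preempt_count(unsigned long pc)
{
    struct preempt_fields f;

    f.depth   = (unsigned int)(pc & PC_PREEMPT_MASK);
    f.softirq = (unsigned int)((pc & PC_SOFTIRQ_MASK) >> 8);
    f.hardirq = (unsigned int)((pc & PC_HARDIRQ_MASK) >> 16);
    f.active  = (pc & PC_PREEMPT_ACTIVE) != 0;
    return f;
}

/* Print one decoded value in a single line. */
static void print_preempt_count(unsigned long pc)
{
    struct preempt_fields f = decode_preempt_count(pc);

    printf("%08lx: depth=%u softirq=%u hardirq=%u active=%d\n",
           pc, f.depth, f.softirq, f.hardirq, f.active);
}
```

Under that assumed layout, the 00000004 in the report decodes to a plain nesting depth of 4 with no softirq or hardirq bits set, which agrees with the "4-level deep critical section nesting" line printed right after it.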
>i have released the -U9.3 Real-Time Preemption patch, ...
>
It is getting hard to keep up with the updates....
This version built OK and since I noticed it includes fixes for the
parallel port, I added that back to my configuration and built / booted
without any problems. I still see the BUG from:
Oct 22 12:27:50 dws77 kernel: 8139too Fast Ethernet driver 0.9.27
Oct 22 12:27:50 dws77 kernel: eth0: RealTek RTL8139 at 0xdc00,
00:50:bf:39:11:fc, IRQ 11
Oct 22 12:27:50 dws77 kernel: BUG: atomic counter underflow at:
Oct 22 12:27:50 dws77 kernel: [<c02b8f88>] qdisc_destroy+0x98/0xa0 (12)
I saw the messages about fixes for the other network drivers, but
don't forget this one.
Real time stress tests ran more smoothly this time with fewer
odd symptoms but a few new symptoms showed up too. I'll send the
boot log and traces separately. The following summarizes the tests
and results.
[1] X11 stress - very clean, max CPU delay was only 20 usec.
[2] proc stress - very clean, max CPU delay was only 30 usec.
[3] network output stress - the only trace that was much worse than U9.2. An odd
pattern in the graph showing a delay of roughly 400 usec every 5
seconds with a much smaller delay following. There were also a couple
bursts of delays at 90-100 seconds, and 250-260 seconds. Did not
see this pattern on any other test.
[4] network input stress - very clean, max CPU delay was only 80 usec.
[5] disk write stress - very clean, max CPU delay only 70 usec.
[6] disk copy stress - very clean, max CPU delay only 90 usec.
[7] disk read stress - first 25 seconds, had a pattern of roughly 100 usec
CPU delays with a few peaks at 500 usec. After that, was very clean, almost
99.7% of the CPU delays were under 100 usec.
During these tests (total 25-30 minutes) had seven latency traces
over 200 usec. Summary follows:
00 - find_symbol, a single trace line over 400 usec as follows
preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U9.3
-------------------------------------------------------
latency: 495 us, entries: 9 (9) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: modprobe/3643, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80 <c0314f4f>
=> ended at: _spin_unlock_irq+0x1b/0x40 <c031538b>
=======>
00000001 0.000ms (+0.000ms): _spin_lock_irqsave (resolve_symbol)
00000001 0.000ms (+0.447ms): __find_symbol (resolve_symbol)
00010001 0.448ms (+0.000ms): do_nmi (__find_symbol)
00010001 0.448ms (+0.000ms): do_nmi (add_preempt_count)
00010001 0.449ms (+0.042ms): do_nmi (<00200093>)
00000001 0.491ms (+0.000ms): use_module (resolve_symbol)
00000001 0.492ms (+0.001ms): already_uses (use_module)
00000001 0.493ms (+0.000ms): kmem_cache_alloc (use_module)
00000001 0.494ms (+0.000ms): _spin_unlock_irq (resolve_symbol)
01, 02, 03, 05, 06 - flush_tlb
latency: 1815 us, entries: 108 (108) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
latency: 8959 us, entries: 180 (180) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
latency: 175300 us, entries: 4000 (8116) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
latency: 80679 us, entries: 1545 (1545) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
latency: 76801 us, entries: 3561 (3561) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
This is that symptom I reported before where something gets "stuck"
and one or more clock ticks later, it finally gets freed up. Note that
the real time application did not see any of these delays. It may
be interesting to do another test w/ two real time tasks to see if
these are real or a sampling artifact.
04 - avc_insert, a single > 200 usec trace entry.
preemption latency trace v1.0.7 on 2.6.9-rc4-mm1-RT-U9.3
-------------------------------------------------------
latency: 216 us, entries: 4 (4) | [VP:1 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: fam/2933, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: _spin_lock_irqsave+0x1f/0x80 <c0314f4f>
=> ended at: _spin_unlock_irqrestore+0x20/0x50 <c0315340>
=======>
00000001 0.000ms (+0.000ms): _spin_lock_irqsave (avc_has_perm_noaudit)
00000001 0.000ms (+0.214ms): avc_insert (avc_has_perm_noaudit)
00000001 0.214ms (+0.000ms): memcpy (avc_has_perm_noaudit)
00000001 0.215ms (+0.000ms): _spin_unlock_irqrestore (avc_has_perm_noaudit)
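Picking out the one slow step in traces like these by eye gets tedious with the longer ones. The sketch below parses the body lines of a latency trace and reports the entry with the largest "(+N.NNNms)" gap. It assumes the line format seen in the traces above (hex first column, timestamp, delta, function, caller in parentheses); the struct and function names are invented for this example and are not part of the tracer.

```c
#include <stdio.h>
#include <string.h>

/* One parsed body line, e.g.
 *   "00000001 0.000ms (+0.447ms): __find_symbol (resolve_symbol)" */
struct trace_entry {
    unsigned int state;  /* first column, printed in hex */
    long at_us;          /* timestamp of the entry, in microseconds */
    long delta_us;       /* gap until the next entry, in microseconds */
    char func[64];       /* function that was entered */
    char caller[64];     /* text inside the trailing parentheses */
};

/* Convert milliseconds with three decimals to whole microseconds. */
static long ms_to_us(double ms)
{
    return (long)(ms * 1000.0 + 0.5);
}

/* Returns 1 if the line is a trace entry and was parsed, 0 otherwise
 * (headers, separators and quoted lines simply fail to match). */
static int parse_trace_line(const char *line, struct trace_entry *e)
{
    double at_ms, delta_ms;
    int n = sscanf(line, "%x %lfms (+%lfms): %63s (%63[^)])",
                   &e->state, &at_ms, &delta_ms, e->func, e->caller);

    if (n != 5)
        return 0;
    e->at_us = ms_to_us(at_ms);
    e->delta_us = ms_to_us(delta_ms);
    return 1;
}

/* Scan n lines and copy the entry with the largest gap into *worst.
 * Returns 1 if at least one trace entry was found, 0 otherwise. */
static int worst_entry(const char *const *lines, size_t n,
                       struct trace_entry *worst)
{
    int found = 0;
    size_t i;

    for (i = 0; i < n; i++) {
        struct trace_entry e;

        if (!parse_trace_line(lines[i], &e))
            continue;
        if (!found || e.delta_us > worst->delta_us) {
            *worst = e;
            found = 1;
        }
    }
    return found;
}
```

Fed the find_symbol trace, this would single out the +0.447ms gap after __find_symbol; fed the avc_insert trace, the +0.214ms gap after avc_insert.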
--Mark
[email protected] wrote:
>>i have released the -U9.3 Real-Time Preemption patch, ...
>>
>
> It is getting hard to keep up with the updates....
>
> This version built OK and since I noticed it includes fixes for the
> parallel port, I added that back to my configuration and built / booted
> without any problems. I still see the BUG from:
>
> Oct 22 12:27:50 dws77 kernel: 8139too Fast Ethernet driver 0.9.27
> Oct 22 12:27:50 dws77 kernel: eth0: RealTek RTL8139 at 0xdc00,
> 00:50:bf:39:11:fc, IRQ 11
> Oct 22 12:27:50 dws77 kernel: BUG: atomic counter underflow at:
> Oct 22 12:27:50 dws77 kernel: [<c02b8f88>] qdisc_destroy+0x98/0xa0 (12)
>
> I saw the messages about fixes for the other network drivers, but
> don't forget this one.
I still get this also. This is not fixed by the network driver fix, but
I don't think it was expected to be.
On Fri, 2004-10-22 at 13:49 -0500, K.R. Foley wrote:
> [email protected] wrote:
> >>i have released the -U9.3 Real-Time Preemption patch, ...
> >>
> >
> > It is getting hard to keep up with the updates....
> >
> > This version built OK and since I noticed it includes fixes for the
> > parallel port, I added that back to my configuration and built / booted
> > without any problems. I still see the BUG from:
> >
> > Oct 22 12:27:50 dws77 kernel: 8139too Fast Ethernet driver 0.9.27
> > Oct 22 12:27:50 dws77 kernel: eth0: RealTek RTL8139 at 0xdc00,
> > 00:50:bf:39:11:fc, IRQ 11
> > Oct 22 12:27:50 dws77 kernel: BUG: atomic counter underflow at:
> > Oct 22 12:27:50 dws77 kernel: [<c02b8f88>] qdisc_destroy+0x98/0xa0 (12)
> >
> > I saw the messages about fixes for the other network drivers, but
> > don't forget this one.
>
> I still get this also. This is not fixed by the network driver fix, but
> I don't think it was expected to be.
No, the fix was for the missing pci shutdown in tulip.
This one is something weird, which has to do with the kudzu-triggered
load, unload, reload of a module. Not sure what happens there. I would
need more detailed information. Maybe enabling some debug output in the
card driver would reveal what's going on.
tglx
>No, the fix was for the missing pci shutdown in tulip.
I thought the two were related (since I get the failed to shutdown
message right after that traceback). Perhaps the 8139too needs
that same shutdown fix.
--Mark H Johnson
<mailto:[email protected]>
On Fri, 2004-10-22 at 14:40 -0500, [email protected] wrote:
> >No, the fix was for the missing pci shutdown in tulip.
> I thought the two were related (since I get the failed to shutdown
> message right after that traceback). Perhaps the 8139too needs
> that same shutdown fix.
The shutdown fix is necessary, but it is not related to the other
problem. The shutdown message will also appear if you unload the
module manually. The patch below fixes only the shutdown warning. For the
other one I need more information.
tglx
---
diff -urN --exclude='*~' 2.6.9-mm1-U10/drivers/net/8139too.c 2.6.9-mm1-U10.work/drivers/net/8139too.c
--- 2.6.9-mm1-U10/drivers/net/8139too.c 2004-10-22 19:10:44.000000000 +0200
+++ 2.6.9-mm1-U10.work/drivers/net/8139too.c 2004-10-22 21:52:19.000000000 +0200
@@ -749,7 +749,7 @@
pci_release_regions (pdev);
free_netdev(dev);
-
+ pci_disable_device (pdev);
pci_set_drvdata (pdev, NULL);
}
how about U10.3? That has the BKL fix too.
Ingo
* Alexander Batyrshin <[email protected]> wrote:
> Hi Ingo,
> U9.3 (defconfig) died with trace:
>
> ------------[ cut here ]------------
> kernel BUG at kernel/sched.c:784!
> invalid operand: 0000 [#1]
> PREEMPT SMP
> Modules linked in:
> CPU: 0
> EIP: 0060:[<c011873d>] Not tainted VLI
> EFLAGS: 00010002 (2.6.9-rc4-mm1-RT-U9.3)
> EIP is at resched_task+0x80/0x8a
> eax: 00000001 ebx: c1674000 ecx: dfb578f0 edx: 00000000
> esi: c16712c0 edi: 00000000 ebp: c167bc48 esp: c167bc3c
> ds: 007b es: 007b ss: 0068 preempt: 00000003
> Process ksoftirqd/0 (pid: 3, threadinfo=c167a000 task=c1670660)
> Stack: c1403820 c1403820 00000000 c167bc94 c0118b78 c16712c0 c1433820
> 00000000
> 00000000 00000000 00000001 00000100 c1433820 c1433820 00000001
> 00000000
> 00000000 00000001 00000082 00000001 de9d3e60 00000001 c167bcd8
> c0133c7b
> Call Trace:
> [<c0118b78>] try_to_wake_up+0x1f3/0x26b (20)
> [<c0133c7b>] autoremove_wake_function+0x2f/0x57 (76)
> [<c011a766>] __wake_up_common+0x3f/0x5e (28)
> [<c011a7d0>] __wake_up+0x4b/0x62 (40)
> [<c034e08c>] sock_def_readable+0x8e/0x90 (52)
> [<c0388dbe>] tcp_child_process+0xe6/0xec (28)
> [<c0384df7>] tcp_v4_do_rcv+0x123/0x162 (36)
> [<c0134079>] _mutex_lock+0x2c/0x3b (16)
> [<c0385513>] tcp_v4_rcv+0x6dd/0x92c (16)
> [<c03c0e30>] svc_revisit+0x27/0x154 (48)
> [<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (40)
> [<c0368609>] ip_local_deliver_finish+0xb6/0x1b2 (4)
> [<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (12)
> [<c035f607>] nf_hook_slow+0xdc/0x12e (20)
> [<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (28)
> [<c0368705>] ip_rcv_finish+0x0/0x2b3 (28)
> [<c0367fcb>] ip_local_deliver+0x208/0x226 (4)
> [<c0368553>] ip_local_deliver_finish+0x0/0x1b2 (24)
> [<c0368828>] ip_rcv_finish+0x123/0x2b3 (20)
> [<c0368705>] ip_rcv_finish+0x0/0x2b3 (32)
> [<c035f607>] nf_hook_slow+0xdc/0x12e (20)
> [<c0368705>] ip_rcv_finish+0x0/0x2b3 (28)
> [<c036846a>] ip_rcv+0x481/0x56a (32)
> [<c0368705>] ip_rcv_finish+0x0/0x2b3 (24)
> [<c0354e68>] netif_receive_skb+0x117/0x1dd (28)
> [<c0354ff7>] process_backlog+0xc9/0x1cb (36)
> [<c03551b2>] net_rx_action+0xb9/0x1ed (48)
> [<c01242d9>] ___do_softirq+0xe1/0x130 (36)
> [<c0124923>] ksoftirqd+0x0/0xda (40)
> [<c01243fb>] _do_softirq+0x4b/0x4d (4)
> [<c01249c5>] ksoftirqd+0xa2/0xda (16)
> [<c013376d>] kthread+0xb7/0xbd (24)
> [<c01336b6>] kthread+0x0/0xbd (28)
> [<c0103375>] kernel_thread_helper+0x5/0xb (16)
> preempt count: 00000004
> . 4-level deep critical section nesting:
> .. entry 1: _spin_lock_irqsave+0x1d/0xa5 [<00000000>] / (0x0 [<00000000>])
> .. entry 2: _spin_lock+0x19/0x6d [<00000000>] / (0x0 [<00000000>])
> .. entry 3: _spin_lock_irqsave+0x1d/0xa5 [<00000000>] / (0x0 [<00000000>])
> .. entry 4: print_traces+0x17/0x4e [<00000000>] / (0x0 [<00000000>])
On Friday 22 October 2004 12:19, Jeff V. Merkey wrote:
>Ingo Molnar wrote:
>>* Gene Heskett <[email protected]> wrote:
>>>As sort of the ultimate dummy test, I'm building this right now.
>>> The only oddments so far are a bunch of deprecated variable
>>> warnings, quite a few but many are dups.
>>
>>these warnings are present in -mm1 too.
>>
>> Ingo
>
>Hey Ingo,
>
>Bite Me.
>
>:-)
>
>Jeff
Frankly Jeff, we'd best be afraid of food poisoning if we got that
hungry.
--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.28% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
On Friday 22 October 2004 11:50, Ingo Molnar wrote:
>i have released the -U10 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this is purely a rebasing of -U9.3 to 2.6.9-mm1.
>
>to create a -U10 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10
>
> Ingo
I teased 2 gallons of latex primer out of the cans, making it jump up
onto some outbuildings, so I'm a bit late with the report, which is
that it spit out a whole flurry of "scheduling while atomic" messages
when I tried it just now, and finally hung, requiring that I exercise the
reset button long before it got to rc.sysinit. The best I could do
might be a screen snapshot, but I haven't figured out how to shut the
flippin flash off and use the available light, or try and scribble it
all down. And that would lead to violent deaths if I was a doctor
trying to write prescriptions. :-(
As an experiment to see if I could play bleeding edge, I bled. :)
--
Cheers, Gene
On Fri, 2004-10-22 at 17:49 -0400, Gene Heskett wrote:
> Mmm, I get a 404 page not found when I click on this link.
Same here. The current version is 10.3:
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.3
I did not get an announcement; it looks like the listserv is backlogged,
or Ingo did not announce it yet.
Lee
On Friday 22 October 2004 13:56, Ingo Molnar wrote:
>i have released the -U10.2 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this is a fixes-only release.
>
>Changes since -U10:
>
> - fixed a big bug present ever since: the BKL got dropped when a
> spinlock-mutex was acquired and it scheduled away. This reduced
> the locking efficiency of the BKL. A number of outstanding problems
> could be affected, in particular this should fix the tty locking
> breakage reported by Alexander Batyrshin and Adam Heath. UP and SMP
> systems are affected too, with SMP systems having a higher chance
> to trigger this condition.
>
> - tulip.c breakage fix from Thomas Gleixner
>
> - tg3 and 3c59x fixes.
>
> - made the hardirq threads SCHED_FIFO by default. They get
> priorities between 25 and 50, depending on the irq #. (this is
> pretty random but i found no better scheme.) Made the softirq
> thread SCHED_FIFO by default as well, albeit this probably will
> have to change. These changes should make it easier to debug a hung
> system.
>
>to create a -U10.2 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.2
Mmm, I get a 404 page not found when I click on this link.
> Ingo
--
Cheers, Gene
On Fri, 2004-10-22 at 18:06 -0400, Lee Revell wrote:
> On Fri, 2004-10-22 at 17:49 -0400, Gene Heskett wrote:
> > Mmm, I get a 404 page not found when I click on this link.
>
> Same here. The current version is 10.3:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.3
OK, U10.3 is the first since T3 for me that boots flawlessly. Latency
numbers forthcoming...
Lee
Oct 22 14:11:29 swdev14 kernel: NET: Registered protocol family 4
Oct 22 14:13:11 swdev14 ntpd[2865]: synchronized to 151.131.44.70, stratum=4
Oct 22 14:13:11 swdev14 ntpd[2865]: kernel time sync disabled 0041
Oct 22 14:26:00 swdev14 ntpd[2865]: kernel time sync enabled 0001
Oct 22 14:37:14 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
Oct 22 14:37:14 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 22 14:37:14 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
Oct 22 14:37:14 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
Oct 22 14:37:14 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 14:37:14 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 14:37:14 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 14:37:14 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 14:37:14 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 14:37:14 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 14:37:14 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 14:37:14 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 14:37:14 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 14:37:14 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 14:37:14 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 14:37:14 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 14:37:14 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 14:37:14 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 14:37:14 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 14:37:14 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 14:37:14 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 14:37:14 swdev14 kernel: preempt count: 00000002
Oct 22 14:37:14 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 14:37:14 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 14:37:14 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 14:37:14 swdev14 kernel:
Oct 22 14:37:16 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
Oct 22 14:37:16 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 22 14:37:16 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
Oct 22 14:37:16 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
Oct 22 14:37:16 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 14:37:16 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 14:37:16 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 14:37:16 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 14:37:16 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 14:37:16 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 14:37:16 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 14:37:16 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 14:37:16 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 14:37:16 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 14:37:16 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 14:37:16 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 14:37:16 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 14:37:16 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 14:37:16 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 14:37:16 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 14:37:16 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 14:37:16 swdev14 kernel: preempt count: 00000002
Oct 22 14:37:16 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 14:37:16 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 14:37:16 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 14:37:16 swdev14 kernel:
Oct 22 14:37:17 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
Oct 22 14:37:17 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 22 14:37:17 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
Oct 22 14:37:17 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
Oct 22 14:37:17 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 14:37:17 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 14:37:17 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 14:37:17 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 14:37:17 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 14:37:17 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 14:37:17 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 14:37:17 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 14:37:17 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 14:37:17 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 14:37:17 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 14:37:17 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 14:37:17 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 14:37:17 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 14:37:17 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 14:37:17 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 14:37:17 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 14:37:17 swdev14 kernel: preempt count: 00000002
Oct 22 14:37:17 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 14:37:17 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 14:37:17 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 14:37:17 swdev14 kernel:
Oct 22 14:37:17 swdev14 kernel: BUG: scheduling while atomic: ksoftirqd/0/0x10000001/3
Oct 22 14:37:17 swdev14 kernel: caller is cond_resched+0x64/0x78
Oct 22 14:37:17 swdev14 kernel: [<c029dd88>] __sched_text_start+0xbdc/0xc2c (12)
Oct 22 14:37:17 swdev14 kernel: [<c029e6da>] cond_resched+0x64/0x78 (8)
Oct 22 14:37:17 swdev14 kernel: [<c01338ce>] check_preempt_timing+0x5d/0x289 (12)
Oct 22 14:37:17 swdev14 kernel: [<c0133b40>] touch_preempt_timing+0x46/0x4a (4)
Oct 22 14:37:17 swdev14 kernel: [<c029e69c>] cond_resched+0x26/0x78 (4)
Oct 22 14:37:17 swdev14 kernel: [<c029deab>] preempt_schedule+0x11/0x6f (4)
Oct 22 14:37:17 swdev14 kernel: [<c01070c3>] dump_stack+0x23/0x27 (4)
Oct 22 14:37:17 swdev14 kernel: [<c0112038>] mcount+0x14/0x18 (8)
Oct 22 14:37:17 swdev14 kernel: [<c0132aed>] _mutex_lock+0x43/0x63 (60)
Oct 22 14:37:17 swdev14 kernel: [<c029e6da>] cond_resched+0x64/0x78 (20)
Oct 22 14:37:17 swdev14 kernel: [<c0132aed>] _mutex_lock+0x43/0x63 (16)
Oct 22 14:37:17 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 14:37:17 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 14:37:17 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 14:37:17 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 14:37:17 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 14:37:17 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 14:37:17 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 14:37:17 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 14:37:17 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 14:37:17 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 14:37:17 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 14:37:18 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 14:37:18 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 14:37:18 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 14:37:18 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 14:37:18 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 14:37:18 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 14:37:18 swdev14 kernel: preempt count: 10000002
Oct 22 14:37:18 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 14:37:18 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 14:37:18 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 14:37:18 swdev14 kernel:
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_0
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_1
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_0
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_1
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_0
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_1
Oct 22 15:22:47 swdev14 modprobe: FATAL: Error running install command for sound_slot_0
Oct 22 15:22:47 swdev14 last message repeated 2 times
Oct 22 15:25:07 swdev14 su(pam_unix)[15830]: session opened for user root by aaektkf(uid=19990)
Oct 22 15:30:34 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
Oct 22 15:30:34 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 22 15:30:34 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
Oct 22 15:30:34 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
Oct 22 15:30:34 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 15:30:34 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 15:30:34 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 15:30:34 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 15:30:34 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 15:30:34 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 15:30:34 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 15:30:34 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 15:30:34 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 15:30:34 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 15:30:34 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 15:30:34 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 15:30:34 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 15:30:34 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 15:30:34 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 15:30:34 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 15:30:34 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 15:30:34 swdev14 kernel: preempt count: 00000002
Oct 22 15:30:34 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 15:30:34 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 15:30:34 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 15:30:34 swdev14 kernel:
Oct 22 15:30:35 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
Oct 22 15:30:35 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 22 15:30:35 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
Oct 22 15:30:35 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
Oct 22 15:30:35 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 15:30:35 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 15:30:35 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 15:30:35 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 15:30:35 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 15:30:35 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 15:30:35 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 15:30:35 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 15:30:35 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 15:30:35 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 15:30:35 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 15:30:35 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 15:30:35 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 15:30:35 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 15:30:35 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 15:30:35 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 15:30:35 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 15:30:35 swdev14 kernel: preempt count: 00000002
Oct 22 15:30:35 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 15:30:35 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 15:30:35 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 15:30:35 swdev14 kernel:
Oct 22 15:30:37 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
Oct 22 15:30:37 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 22 15:30:37 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
Oct 22 15:30:37 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
Oct 22 15:30:37 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
Oct 22 15:30:37 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
Oct 22 15:30:37 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
Oct 22 15:30:37 swdev14 kernel: [<c023fadc>] llc_rcv+0x19b/0x2a3 (24)
Oct 22 15:30:37 swdev14 kernel: [<c02325df>] netif_receive_skb+0x201/0x30c (32)
Oct 22 15:30:37 swdev14 kernel: [<c0130400>] remove_from_abslist+0x6/0x64 (20)
Oct 22 15:30:37 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (8)
Oct 22 15:30:37 swdev14 kernel: [<c0232780>] process_backlog+0x96/0x164 (28)
Oct 22 15:30:37 swdev14 kernel: [<c02328ef>] net_rx_action+0xa1/0x178 (40)
Oct 22 15:30:37 swdev14 kernel: [<c012299c>] ___do_softirq+0xb8/0x107 (40)
Oct 22 15:30:37 swdev14 kernel: [<c013317c>] __mcount+0x1d/0x21 (12)
Oct 22 15:30:37 swdev14 kernel: [<c0122a9b>] _do_softirq+0x20/0x22 (36)
Oct 22 15:30:37 swdev14 kernel: [<c0122f85>] ksoftirqd+0xcd/0x105 (8)
Oct 22 15:30:37 swdev14 kernel: [<c0132227>] kthread+0xbc/0xc1 (32)
Oct 22 15:30:37 swdev14 kernel: [<c0122eb8>] ksoftirqd+0x0/0x105 (20)
Oct 22 15:30:37 swdev14 kernel: [<c013216b>] kthread+0x0/0xc1 (12)
Oct 22 15:30:37 swdev14 kernel: [<c01043b1>] kernel_thread_helper+0x5/0xb (16)
Oct 22 15:30:37 swdev14 kernel: preempt count: 00000002
Oct 22 15:30:37 swdev14 kernel: . 2-level deep critical section nesting:
Oct 22 15:30:37 swdev14 kernel: .. entry 1: snap_rcv+0x1b/0xe0 [<c023fadc>] / (llc_rcv+0x19b/0x2a3 [<c023fadc>])
Oct 22 15:30:37 swdev14 kernel: .. entry 2: print_traces+0x1d/0x59 [<c01070c3>] / (dump_stack+0x23/0x27 [<c01070c3>])
Oct 22 15:30:37 swdev14 kernel:
On Fri, 2004-10-22 at 17:27, Rui Nuno Capela wrote:
> Ingo Molnar wrote:
> > i have released the -U10.2 Real-Time Preemption patch, which can be
> > downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> Regarding the jackd -R issue, I was trying to capture some debug data via
> netconsole on my laptop (P4/UP) running RT-U10.2
Just starting to test U10.3. So far no freezes but I do see problems
with Jack kicking out Hydrogen from the graph and some xruns.
I see this on boot (athlon64 system):
ACPI: PCI interrupt 0000:00:0e.0[A] -> GSI 19 (level, low) -> IRQ 19
ohci1394: fw-host0: OHCI-1394 1.0 (PCI): IRQ=[19]
MMIO=[cfffe000-cfffe7ff] Max Packet=[2048]
ohci1394 0000:00:0e.0: Device was removed without properly calling
pci_disable_device(). This may need fixing.
r8169 Gigabit Ethernet driver 1.6LK loaded
ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 16 (level, low) -> IRQ 16
divert: allocating divert_blk for eth0
eth0: Identified chip type is 'RTL8169s/8110s'.
eth0: RTL8169 at 0xf88c8f00, 00:0c:76:b3:c2:43, IRQ 16
BUG: atomic counter underflow at:
[<c02b99a1>] qdisc_destroy+0x91/0xa0 (8)
[<c02b9b6f>] dev_shutdown+0x2f/0x90 (12)
[<c030dfc7>] cond_resched+0x17/0x70 (8)
[<c02acea5>] unregister_netdevice+0x125/0x250 (16)
[<c030e6b2>] down_write+0xa2/0x1e0 (12)
[<c01cd87a>] __up_write+0x11a/0x200 (4)
[<c02476cf>] unregister_netdev+0xf/0x20 (16)
[<f8b0163d>] rtl8169_remove_one+0x1d/0x50 [r8169] (8)
[<c01d6c08>] pci_device_remove+0x68/0x70 (16)
[<c0235cb6>] device_release_driver+0x56/0x60 (20)
[<c0235cd8>] driver_detach+0x18/0x30 (12)
[<c02360d9>] bus_remove_driver+0x29/0x50 (12)
[<c02364eb>] driver_unregister+0xb/0x20 (8)
[<c01d6e3b>] pci_unregister_driver+0xb/0x20 (8)
[<c0137795>] sys_delete_module+0x105/0x130 (8)
[<c0106019>] sysenter_past_esp+0x52/0x71 (80)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0xd/0x40 [<00000000>] / (0x0 [<00000000>])
divert: freeing divert_blk for eth0
ip_tables: (C) 2000-2002 Netfilter core team
ip_conntrack version 2.1 (8192 buckets, 65536 max) - 300 bytes per
conntrack
r8169 Gigabit Ethernet driver 1.6LK loaded
ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 16 (level, low) -> IRQ 16
divert: allocating divert_blk for eth0
eth0: Identified chip type is 'RTL8169s/8110s'.
eth0: RTL8169 at 0xf8af4f00, 00:0c:76:b3:c2:43, IRQ 16
IRQ#16 thread RT prio: 49.
r8169: eth0: link up
-- Fernando
On Tue, 2004-10-19 at 12:57 -0700, Bill Huey wrote:
> On Tue, Oct 19, 2004 at 08:00:59PM +0200, Ingo Molnar wrote:
> >
> > i have released the -U7 Real-Time Preemption patch:
> >
> > http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-rc4-mm1-U7
>
> You should seriously think about using a kind of bitkeeper or CVS style
> system so that multiple folks can dump stuff into it rapidly. This
> project is large enough that it needs some kind of facility like that.
We also might want to consider a separate mailing list, if the volume of
mail gets unwieldy, then just post release announcements to LKML. The
volume is high enough and some of the terminology different enough that
we already had one inadvertent bout of flaming over some unfortunate
miscommunication. As more people get involved this could get out of
hand. The cc: lists are getting pretty long, and a lot of the testing
requires posting huge traces. Think about how many copies of each mail
vger has to deliver...
Anyway, I am fine with continuing on LKML, it would really depend on how
Ingo and especially the people for whom this is all noise feel about it.
Lee
On Fri, 2004-10-22 at 19:56 +0200, Ingo Molnar wrote:
> i have released the -U10.2 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
The current version is -U10.3
Find below a patch against -U10.3. It contains a couple of smaller fixes
including the network driver bootup BUG.
<Disclaimer>
I'm trying to start a serious discussion on that. It's not my intention
to start a flamewar.
</Disclaimer>
The network driver load/unload problem, which was reported by a couple
of testers, is related to Ingo's check inside of atomic_dec_and_test().
if (!atomic_read(v)) {
printk("BUG: atomic counter underflow at:\n");
dump_stack();
}
The unmodified atomic_dec_and_test() returns 0 in all cases except
when the counter is 0 after the decrement. This implementation
automatically covers the case where atomic_dec_and_test() is called
with counter == 0.
The network code uses atomic_dec_and_test() in qdisc_destroy() to check
whether the caller owns the last reference to the qdisc. This also covers the
case when no reference is held when it is called. I fixed this for
now with a check for 0 before calling qdisc_destroy().
That's a fix, but not an answer to the following question.
The question is whether the atomic implementation is intended to allow
counter values less than 0 or not. Reviewing the usage it seems not to
be the case. The qdisc_destroy code is the only place I recognized so
far, which makes use of that implementation detail.
From my point of view, allowing such usage covers up a hidden problem.
Assume count = 0, call atomic_dec_and_test() and you have count = -1.
At different places the check is if (!atomic_read(v)){...}
If count = -1, it signals that data or whatever is available.
I don't think there is code which is actually failing on that, but as a
design rule it might turn out that Ingo's check and the implied rule
"count must be >= 0" is appropriate.
tglx
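[Editorial sketch, not part of the original mail.] The counter behaviour
described above can be modeled in userspace. This is only a rough model
using C11 atomics: model_atomic_t, model_dec_and_test() and
model_dec_and_test_checked() are hypothetical names, not kernel APIs, and
the underflow check is merely modeled on the snippet quoted in the mail.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

/* Hypothetical userspace stand-in for the kernel's atomic_t. */
typedef struct {
    atomic_int counter;
} model_atomic_t;

/* Unmodified semantics: true only when the value after the decrement is 0. */
static bool model_dec_and_test(model_atomic_t *v)
{
    return atomic_fetch_sub(&v->counter, 1) - 1 == 0;
}

/*
 * Variant with a check modeled on the one quoted above: complain if the
 * counter is already 0 before decrementing. The check and the decrement
 * are two separate steps here, which is good enough for a model.
 */
static bool model_dec_and_test_checked(model_atomic_t *v, int *underflows)
{
    if (atomic_load(&v->counter) == 0) {
        fprintf(stderr, "BUG: atomic counter underflow\n");
        (*underflows)++;
    }
    return model_dec_and_test(v);
}

/* Drop a single reference twice, as a caller without a reference would. */
static void demo_double_put(void)
{
    model_atomic_t refcnt;
    int underflows = 0;

    atomic_init(&refcnt.counter, 1);

    /* First put: 1 -> 0, the caller owned the last reference. */
    assert(model_dec_and_test_checked(&refcnt, &underflows));
    assert(underflows == 0);

    /* Second put: 0 -> -1, reported as "not the last reference". */
    assert(!model_dec_and_test_checked(&refcnt, &underflows));
    assert(underflows == 1);

    /* A later if (!atomic_read(v)) test now sees a nonzero value. */
    assert(atomic_load(&refcnt.counter) == -1);
}
```

Calling demo_double_put() walks through the case discussed in the mail:
the second decrement is silently accepted by the unmodified semantics and
leaves the counter at -1, which is the situation the added check reports.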
---
diff --exclude='*~' -urN 2.6.9-mm1-U10.3/drivers/net/8139too.c 2.6.9-mm1-U10.work/drivers/net/8139too.c
--- 2.6.9-mm1-U10.3/drivers/net/8139too.c 2004-10-23 01:59:14.000000000 +0200
+++ 2.6.9-mm1-U10.work/drivers/net/8139too.c 2004-10-22 23:18:32.000000000 +0200
@@ -749,7 +749,7 @@
pci_release_regions (pdev);
free_netdev(dev);
-
+ pci_disable_device (pdev);
pci_set_drvdata (pdev, NULL);
}
diff --exclude='*~' -urN 2.6.9-mm1-U10.3/drivers/pci/hotplug/cpci_hotplug_core.c 2.6.9-mm1-U10.work/drivers/pci/hotplug/cpci_hotplug_core.c
--- 2.6.9-mm1-U10.3/drivers/pci/hotplug/cpci_hotplug_core.c 2004-10-23 01:59:15.000000000 +0200
+++ 2.6.9-mm1-U10.work/drivers/pci/hotplug/cpci_hotplug_core.c 2004-10-22 19:06:47.000000000 +0200
@@ -598,7 +598,7 @@
msleep(100);
}
dbg("poll thread signals exit");
- up(&thread_exit);
+ complete(&thread_exit);
return 0;
}
diff --exclude='*~' -urN 2.6.9-mm1-U10.3/net/sched/sch_generic.c 2.6.9-mm1-U10.work/net/sched/sch_generic.c
--- 2.6.9-mm1-U10.3/net/sched/sch_generic.c 2004-10-23 01:59:12.000000000 +0200
+++ 2.6.9-mm1-U10.work/net/sched/sch_generic.c 2004-10-23 02:45:58.000000000 +0200
@@ -584,7 +584,9 @@
qdisc = dev->qdisc_sleeping;
dev->qdisc = &noop_qdisc;
dev->qdisc_sleeping = &noop_qdisc;
- qdisc_destroy(qdisc);
+
+ if (atomic_read(&qdisc->refcnt))
+ qdisc_destroy(qdisc);
#if defined(CONFIG_NET_SCH_INGRESS) || defined(CONFIG_NET_SCH_INGRESS_MODULE)
if ((qdisc = dev->qdisc_ingress) != NULL) {
dev->qdisc_ingress = NULL;
diff --exclude='*~' -urN 2.6.9-mm1-U10.3/drivers/pcmcia/yenta_socket.c 2.6.9-mm1-U10.work/drivers/pcmcia/yenta_socket.c
--- 2.6.9-mm1-U10.3/drivers/pcmcia/yenta_socket.c 2004-10-22 16:52:26.000000000 +0200
+++ 2.6.9-mm1-U10.work/drivers/pcmcia/yenta_socket.c 2004-10-22 21:26:15.000000000 +0200
@@ -653,6 +653,7 @@
yenta_free_resources(sock);
pci_release_regions(dev);
+ pci_disable_device(dev);
pci_set_drvdata(dev, NULL);
}
Ingo Molnar wrote:
>
> i have released the -U10.2 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
Regarding the jackd -R issue, I was trying to capture some debug data via
netconsole on my laptop (P4/UP) running RT-U10.2, and when the system
freezes as reported before, I was able to kick the SysRq+T. But, instead
of a task trace list, I get the following:
SysRq : <3>BUG: sleeping function called from invalid context IRQ 1(776)
at kernel/mutex.c:37
in_atomic():1 [00000001], irqs_disabled():1
[<c0104ee4>] dump_stack+0x1e/0x20 (20)
[<c0114a23>] __might_sleep+0xb2/0xc7 (36)
[<c012c0f2>] _mutex_lock+0x39/0x5e (28)
[<c012c13a>] _mutex_lock_irqsave+0x11/0x15 (12)
[<c027f927>] refill_skbs+0x13/0x6d (20)
[<c027fa4c>] find_skb+0x5d/0x9d (24)
[<c027fb74>] netpoll_send_udp+0x3b/0x298 (48)
[<e00ef047>] write_msg+0x47/0x5c [netconsole] (36)
[<c0117804>] __call_console_drivers+0x51/0x60 (32)
[<c0117910>] call_console_drivers+0x6d/0x147 (40)
[<c0117caf>] release_console_sem+0x48/0x100 (36)
[<c0117bd5>] vprintk+0x127/0x174 (36)
[<c0117aac>] printk+0x18/0x1a (16)
[<c01f4849>] __handle_sysrq+0x38/0xed (40)
[<c01ee426>] kbd_event+0xeb/0xfa (40)
[<c025f6a8>] input_event+0x160/0x3d4 (44)
[<c02620b6>] atkbd_report_key+0x3b/0x95 (32)
[<c026236c>] atkbd_interrupt+0x25c/0x590 (60)
[<c01f6fd2>] serio_interrupt+0x4f/0xa5 (44)
[<c01f78cb>] i8042_interrupt+0xb8/0x1b8 (40)
[<c0131dbc>] handle_IRQ_event+0x48/0x79 (32)
[<c01325dd>] do_hardirq+0x86/0x123 (40)
[<c0132712>] do_irqd+0x98/0xc9 (36)
[<c012b7d7>] kthread+0x9c/0xc9 (48)
[<c0102305>] kernel_thread_helper+0x5/0xb (548454420)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sysrq_lock_table+0x12/0x14 [<c01f482b>] /
(__handle_sysrq+0x1a/0xed [<c01f482b>])
.. entry 2: print_traces+0x16/0x48 [<c0104ee4>] / (dump_stack+0x1e/0x20
[<c0104ee4>])
Other SysRq key combinations produce similar dumps.
Any suggestions?
--
rncbc aka Rui Nuno Capela
[email protected]
On Friday 22 October 2004 11:19, Jeff V. Merkey wrote:
> Hey Ingo,
>
> Bite Me.
>
> :-)
>
Hey, Jeff,
This is a very high traffic list. If you have nothing constructive to say,
don't say it. Many here are already not happy with your attitude;
you're not helping.
For my part, you're getting a shiny new procmail entry right to dave null.
I'm sick of it.
Sorry, everyone else, for adding to the noise. But someone's gotta say it.
--Russell
--
Russell Miller - [email protected] - Le Mars, IA
Duskglow Consulting - Helping companies just like you to succeed for ~ 10 yrs.
http://www.duskglow.com - 712-546-5886
* K.R. Foley <[email protected]> wrote:
> Oct 22 14:37:14 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
> Oct 22 14:37:14 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
> Oct 22 14:37:14 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
> Oct 22 14:37:14 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
> Oct 22 14:37:14 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
> Oct 22 14:37:14 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
> Oct 22 14:37:14 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
does the patch below fix these?
Ingo
--- linux/net/802/psnap.c.orig
+++ linux/net/802/psnap.c
@@ -55,7 +55,7 @@ static int snap_rcv(struct sk_buff *skb,
.type = __constant_htons(ETH_P_SNAP),
};
- rcu_read_lock();
+ rcu_read_lock_spin(&snap_lock);
proto = find_snap_client(skb->h.raw);
if (proto) {
/* Pass the frame on. */
@@ -68,7 +68,7 @@ static int snap_rcv(struct sk_buff *skb,
rc = 1;
}
- rcu_read_unlock();
+ rcu_read_unlock_spin(&snap_lock);
return rc;
}
* Rui Nuno Capela <[email protected]> wrote:
> Regarding the jackd -R issue, I was trying to capture some debug data
> via netconsole on my laptop (P4/UP) running RT-U10.2, and when the
> system freezes as reported before, I was able to kick the SysRq+T.
> But, instead of a task trace list, I get the following:
>
> SysRq : <3>BUG: sleeping function called from invalid context IRQ 1(776)
> at kernel/mutex.c:37
> in_atomic():1 [00000001], irqs_disabled():1
> [<c0104ee4>] dump_stack+0x1e/0x20 (20)
> [<c0114a23>] __might_sleep+0xb2/0xc7 (36)
> [<c012c0f2>] _mutex_lock+0x39/0x5e (28)
> preempt count: 00000002
> . 2-level deep critical section nesting:
> .. entry 1: __sysrq_lock_table+0x12/0x14 [<c01f482b>] /
> (__handle_sysrq+0x1a/0xed [<c01f482b>])
> .. entry 2: print_traces+0x16/0x48 [<c0104ee4>] / (dump_stack+0x1e/0x20
does the patch below help?
Ingo
--- linux/drivers/char/sysrq.c.orig
+++ linux/drivers/char/sysrq.c
@@ -252,7 +252,7 @@ static struct sysrq_key_op sysrq_kill_op
/* Key Operations table and lock */
-static DECLARE_RAW_SPINLOCK(sysrq_key_table_lock);
+static DECLARE_SPINLOCK(sysrq_key_table_lock);
#define SYSRQ_KEY_TABLE_LENGTH 36
static struct sysrq_key_op *sysrq_key_table[SYSRQ_KEY_TABLE_LENGTH] = {
/* 0 */ &sysrq_loglevel_op,
* Lee Revell <[email protected]> wrote:
> On Fri, 2004-10-22 at 17:49 -0400, Gene Heskett wrote:
> > Mmm, I get a 404 page not found when I click on this link.
>
> Same here. The current version is 10.3:
>
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.3
>
> I did not get an announcement, it looks like the listserv is
> backlogged, or Ingo did not announce it yet.
had to do it in the last minute and didnt have time to announce it.
-RT-U10.3 fixes a UP compilation error reported by John Gilbert.
Ingo
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> Regarding the jackd -R issue, I was trying to capture some debug data
>> via netconsole on my laptop (P4/UP) running RT-U10.2, and when the
>> system freezes as reported before, I was able to kick the SysRq+T.
>> But, instead of a task trace list, I get the following:
>>
>> SysRq : <3>BUG: sleeping function called from invalid context IRQ 1(776)
>> at kernel/mutex.c:37
>> in_atomic():1 [00000001], irqs_disabled():1
>> [<c0104ee4>] dump_stack+0x1e/0x20 (20)
>> [<c0114a23>] __might_sleep+0xb2/0xc7 (36)
>> [<c012c0f2>] _mutex_lock+0x39/0x5e (28)
>
>> preempt count: 00000002
>> . 2-level deep critical section nesting:
>> .. entry 1: __sysrq_lock_table+0x12/0x14 [<c01f482b>] /
>> (__handle_sysrq+0x1a/0xed [<c01f482b>])
>> .. entry 2: print_traces+0x16/0x48 [<c0104ee4>] / (dump_stack+0x1e/0x20
>
> does the patch below help?
>
> Ingo
>
> --- linux/drivers/char/sysrq.c.orig
> +++ linux/drivers/char/sysrq.c
> @@ -252,7 +252,7 @@ static struct sysrq_key_op sysrq_kill_op
>
>
> /* Key Operations table and lock */
> -static DECLARE_RAW_SPINLOCK(sysrq_key_table_lock);
> +static DECLARE_SPINLOCK(sysrq_key_table_lock);
> #define SYSRQ_KEY_TABLE_LENGTH 36
> static struct sysrq_key_op *sysrq_key_table[SYSRQ_KEY_TABLE_LENGTH] = {
> /* 0 */ &sysrq_loglevel_op,
>
Nope. Same result:
SysRq : <3>BUG: sleeping function called from invalid context IRQ 1(776)
at kernel/mutex.c:37
in_atomic():0 [00000000], irqs_disabled():1
[<c0104ee4>] dump_stack+0x1e/0x20 (20)
[<c0114a23>] __might_sleep+0xb2/0xc7 (36)
[<c012c0f2>] _mutex_lock+0x39/0x5e (28)
[<c012c13a>] _mutex_lock_irqsave+0x11/0x15 (12)
[<c027f913>] refill_skbs+0x13/0x6d (20)
[<c027fa38>] find_skb+0x5d/0x9d (24)
[<c027fb60>] netpoll_send_udp+0x3b/0x298 (48)
[<e0136047>] write_msg+0x47/0x5c [netconsole] (36)
[<c0117804>] __call_console_drivers+0x51/0x60 (32)
[<c0117910>] call_console_drivers+0x6d/0x147 (40)
[<c0117caf>] release_console_sem+0x48/0x100 (36)
[<c0117bd5>] vprintk+0x127/0x174 (36)
[<c0117aac>] printk+0x18/0x1a (16)
[<c01f4835>] __handle_sysrq+0x38/0xed (40)
[<c01ee426>] kbd_event+0xeb/0xfa (40)
[<c025f694>] input_event+0x160/0x3d4 (44)
[<c02620a2>] atkbd_report_key+0x3b/0x95 (32)
[<c0262358>] atkbd_interrupt+0x25c/0x590 (60)
[<c01f6fbe>] serio_interrupt+0x4f/0xa5 (44)
[<c01f78b7>] i8042_interrupt+0xb8/0x1b8 (40)
[<c0131dbc>] handle_IRQ_event+0x48/0x79 (32)
[<c01325dd>] do_hardirq+0x86/0x123 (40)
[<c0132712>] do_irqd+0x98/0xc9 (36)
[<c012b7d7>] kthread+0x9c/0xc9 (48)
[<c0102305>] kernel_thread_helper+0x5/0xb (548454420)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x16/0x48 [<c0104ee4>] / (dump_stack+0x1e/0x20
[<c0104ee4>])
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> > does the patch below help?
> Nope. Same result:
> SysRq : <3>BUG: sleeping function called from invalid context IRQ 1(776)
> at kernel/mutex.c:37
> in_atomic():0 [00000000], irqs_disabled():1
interrupts are disabled. You used a -RT-U10.2/3 kernel, and have
CONFIG_REALTIME enabled, right? Do you have this in
drivers/net/netconsole.c, line 77:
#ifdef PREEMPT_REALTIME
/*
* A bit hairy. Netconsole uses mutexes (indirectly) and
* thus must have interrupts enabled:
*/
local_irq_enable();
#endif
correct? Could you do this a few lines below:
WARN_ON_RT(irqs_disabled());
netpoll_send_udp(&np, msg, frag);
WARN_ON_RT(irqs_disabled());
to figure out who disables interrupts. Also, could you add the same two
lines to net/core/netpoll.c, line 83:
WARN_ON_RT(irqs_disabled());
np->dev->poll_controller(np->dev);
WARN_ON_RT(irqs_disabled());
and send me either the full bootlog, or the _first_ such BUG message
you'll be getting. Which network controller is this?
Ingo
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> > does the patch below help?
>
>> Nope. Same result:
>
>> SysRq : <3>BUG: sleeping function called from invalid context IRQ 1(776)
>> at kernel/mutex.c:37
>> in_atomic():0 [00000000], irqs_disabled():1
>
> interrupts are disabled. You used a -RT-U10.2/3 kernel, and have
> CONFIG_REALTIME enabled, right? Do you have this in
> drivers/net/netconsole.c, line 77:
>
> #ifdef PREEMPT_REALTIME
> /*
> * A bit hairy. Netconsole uses mutexes (indirectly) and
> * thus must have interrupts enabled:
> */
> local_irq_enable();
> #endif
>
> correct? Could you do this a few lines below:
>
> WARN_ON_RT(irqs_disabled());
> netpoll_send_udp(&np, msg, frag);
> WARN_ON_RT(irqs_disabled());
>
> to figure out who disables interrupts. Also, could you add the same two
> lines to net/core/netpoll.c, line 83:
>
> WARN_ON_RT(irqs_disabled());
> np->dev->poll_controller(np->dev);
> WARN_ON_RT(irqs_disabled());
>
> and send me either the full bootlog, or the _first_ such BUG message
> you'll be getting. Which network controller is this?
>
OK. All affirmative. NIC is natsemi.
Here it is:
SysRq : IRQ 1/776: BUG in write_msg at drivers/net/netconsole.c:87
[<c0104ee4>] dump_stack+0x1e/0x20 (20)
[<e00ef0ab>] write_msg+0xab/0xf4 [netconsole] (52)
[<c0117804>] __call_console_drivers+0x51/0x60 (32)
[<c0117910>] call_console_drivers+0x6d/0x147 (40)
[<c0117caf>] release_console_sem+0x48/0x100 (36)
[<c0117bd5>] vprintk+0x127/0x174 (36)
[<c0117aac>] printk+0x18/0x1a (16)
[<c01f4835>] __handle_sysrq+0x38/0xed (40)
[<c01ee426>] kbd_event+0xeb/0xfa (40)
[<c025f694>] input_event+0x160/0x3d4 (44)
[<c02620a2>] atkbd_report_key+0x3b/0x95 (32)
[<c0262358>] atkbd_interrupt+0x25c/0x590 (60)
[<c01f6fbe>] serio_interrupt+0x4f/0xa5 (44)
[<c01f78b7>] i8042_interrupt+0xb8/0x1b8 (40)
[<c0131dbc>] handle_IRQ_event+0x48/0x79 (32)
[<c01325dd>] do_hardirq+0x86/0x123 (40)
[<c0132712>] do_irqd+0x98/0xc9 (36)
[<c012b7d7>] kthread+0x9c/0xc9 (48)
[<c0102305>] kernel_thread_helper+0x5/0xb (548454420)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x16/0x48 [<c0104ee4>] / (dump_stack+0x1e/0x20
[<c0104ee4>])
BUG: sleeping function called from invalid context IRQ 1(776) at
kernel/mutex.c:37
in_atomic():0 [00000000], irqs_disabled():1
[<c0104ee4>] dump_stack+0x1e/0x20 (20)
[<c0114a23>] __might_sleep+0xb2/0xc7 (36)
[<c012c0f2>] _mutex_lock+0x39/0x5e (28)
[<c012c13a>] _mutex_lock_irqsave+0x11/0x15 (12)
[<c027f9bc>] refill_skbs+0x13/0x6d (20)
[<c027fae1>] find_skb+0x5d/0x9d (24)
[<c027fc09>] netpoll_send_udp+0x3b/0x298 (48)
[<e00ef050>] write_msg+0x50/0xf4 [netconsole] (52)
[<c0117804>] __call_console_drivers+0x51/0x60 (32)
[<c0117910>] call_console_drivers+0x6d/0x147 (40)
[<c0117caf>] release_console_sem+0x48/0x100 (36)
[<c0117bd5>] vprintk+0x127/0x174 (36)
[<c0117aac>] printk+0x18/0x1a (16)
[<c01f4835>] __handle_sysrq+0x38/0xed (40)
[<c01ee426>] kbd_event+0xeb/0xfa (40)
[<c025f694>] input_event+0x160/0x3d4 (44)
[<c02620a2>] atkbd_report_key+0x3b/0x95 (32)
[<c0262358>] atkbd_interrupt+0x25c/0x590 (60)
[<c01f6fbe>] serio_interrupt+0x4f/0xa5 (44)
[<c01f78b7>] i8042_interrupt+0xb8/0x1b8 (40)
[<c0131dbc>] handle_IRQ_event+0x48/0x79 (32)
[<c01325dd>] do_hardirq+0x86/0x123 (40)
[<c0132712>] do_irqd+0x98/0xc9 (36)
[<c012b7d7>] kthread+0x9c/0xc9 (48)
[<c0102305>] kernel_thread_helper+0x5/0xb (548454420)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x16/0x48 [<c0104ee4>] / (dump_stack+0x1e/0x20
[<c0104ee4>])
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> OK. All affirmative. NIC is natsemi.
>
> Here it is:
>
> SysRq : IRQ 1/776: BUG in write_msg at drivers/net/netconsole.c:87
doh! Go to line 77 and spot the bug. (yes, the PREEMPT_REALTIME needs to
become CONFIG_PREEMPT_REALTIME) With that fixed does it work for you?
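
[Editorial sketch, not part of the original mail.] The failure mode is a
general one: an #ifdef on a macro name that is never defined compiles
without any warning and simply drops the guarded code. A small userspace
illustration follows, assuming the build defines CONFIG_PREEMPT_REALTIME;
the two helper functions are hypothetical and only stand in for the
local_irq_enable() call in netconsole.c.

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-in for the option the kernel build would define (assumed set). */
#define CONFIG_PREEMPT_REALTIME 1

/* Misspelled guard: PREEMPT_REALTIME is never defined, so the body is dropped. */
static bool irqs_reenabled_with_typo(void)
{
    bool enabled = false;
#ifdef PREEMPT_REALTIME
    enabled = true;     /* stands in for local_irq_enable() */
#endif
    return enabled;
}

/* Corrected guard: tests the name that is actually defined. */
static bool irqs_reenabled_fixed(void)
{
    bool enabled = false;
#ifdef CONFIG_PREEMPT_REALTIME
    enabled = true;     /* stands in for local_irq_enable() */
#endif
    return enabled;
}

static void demo_ifdef_typo(void)
{
    assert(!irqs_reenabled_with_typo());
    assert(irqs_reenabled_fixed());
}
```

With the misspelled guard the "enable" step never happens, which matches
the irqs_disabled():1 seen in the traces above.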
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Oct 22 14:37:14 swdev14 kernel: BUG: sleeping function called from invalid context ksoftirqd/0(3) at kernel/mutex.c:37
>>Oct 22 14:37:14 swdev14 kernel: in_atomic():1 [00000001], irqs_disabled():0
>>Oct 22 14:37:14 swdev14 kernel: [<c011ac3d>] __might_sleep+0xc4/0xd6 (12)
>>Oct 22 14:37:14 swdev14 kernel: [<c0132ae8>] _mutex_lock+0x3e/0x63 (36)
>>Oct 22 14:37:14 swdev14 kernel: [<e0a8b297>] ipxitf_find_using_phys+0x1e/0x4c [ipx] (28)
>>Oct 22 14:37:14 swdev14 kernel: [<e0a8d5a6>] ipx_rcv+0xdc/0x1dd [ipx] (20)
>>Oct 22 14:37:14 swdev14 kernel: [<c024050b>] snap_rcv+0x5f/0xe0 (32)
>
>
> does the patch below fix these?
>
> Ingo
>
I will try reproducing it here (at home). Otherwise it'll have to wait
till Monday.
> --- linux/net/802/psnap.c.orig
> +++ linux/net/802/psnap.c
> @@ -55,7 +55,7 @@ static int snap_rcv(struct sk_buff *skb,
> .type = __constant_htons(ETH_P_SNAP),
> };
>
> - rcu_read_lock();
> + rcu_read_lock_spin(&snap_lock);
> proto = find_snap_client(skb->h.raw);
> if (proto) {
> /* Pass the frame on. */
> @@ -68,7 +68,7 @@ static int snap_rcv(struct sk_buff *skb,
> rc = 1;
> }
>
> - rcu_read_unlock();
> + rcu_read_unlock_spin(&snap_lock);
> return rc;
> }
>
>
* Paul E. McKenney <[email protected]> wrote:
> o In rcupdate.h, I believe that the:
>
> +# define rcu_read_unlock_nort() rcu_read_lock_nort()
>
> should instead be:
>
> +# define rcu_read_unlock_nort() rcu_read_unlock()
yeah, correct - fortunately this is a non-default path, but still a nice
fix.
> o The rcu_read_lock_spin(), rcu_read_lock_read(),
> rcu_read_lock_bh_read(), rcu_read_lock_sem(), and
> rcu_read_lock_bh_spin() APIs cannot be called recursively.
> But you probably already knew that. ;-)
>
> I don't understand why the rcu_read_lock_sem() API gets its
> own #ifdef.
actually, rcu_read_lock_read() is the variant that _can_ be called
recursively and which i used in the networking code quite extensively.
The others are only useful if the locking is 'flat' in the original
code, or if the locking is extensively rewritten. (I havent tried to
convert the IPC code back from the 'flat' locking to the original
'nested' locking, but i've done it for the networking code.)
> o Some recent RCU patches acquire the update-side lock
> under rcu_read_lock(), which I believe will deadlock here.
which codepaths do you mean? Things are looking pretty good in -U10.3 so
far.
Ingo
On Sat, 2004-10-23 at 15:14 -0400, Lee Revell wrote:
> On Sat, 2004-10-23 at 11:51 -0700, Paul E. McKenney wrote:
> > On Fri, Oct 22, 2004 at 07:56:33PM +0200, Ingo Molnar wrote:
> > >
> > > i have released the -U10.2 Real-Time Preemption patch, which can be
> > > downloaded from:
> > >
> > > http://redhat.com/~mingo/realtime-preempt/
> >
> > On realtime-preempt-2.6.9-mm1-U10.3:
> >
> > o In rcupdate.h, I believe that the:
> >
> > +# define rcu_read_unlock_nort() rcu_read_lock_nort()
> >
> > should instead be:
> >
> > +# define rcu_read_unlock_nort() rcu_read_unlock()
> >
>
> Oh no! That would explain a lot... the typical report is it works fine
> until people go to use the network :-P
Yes and No !
The wrong define is in the #else path of CONFIG_PREEMPT_REALTIME, so it
affects the kernel only when it is built with PREEMPT_REALTIME
disabled.
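
[Editorial sketch, not part of the original mail.] The effect of the
wrong define can be shown with a toy model in which the read-side
primitives just count nesting. Everything prefixed model_ is
hypothetical; the real primitives do more than bump a counter, so this
only illustrates the pairing problem.

```c
#include <assert.h>

/* Hypothetical nesting counter standing in for the read-side state. */
static int read_depth;

static void model_rcu_read_lock(void)   { read_depth++; }
static void model_rcu_read_unlock(void) { read_depth--; }

#define model_read_lock_nort()          model_rcu_read_lock()
/* The define as reported: "unlock" expands to another lock. */
#define model_read_unlock_nort_buggy()  model_read_lock_nort()
/* The define as corrected in the thread. */
#define model_read_unlock_nort_fixed()  model_rcu_read_unlock()

/* One lock/unlock pair using the buggy define. */
static int depth_after_buggy_pair(void)
{
    read_depth = 0;
    model_read_lock_nort();
    model_read_unlock_nort_buggy();
    return read_depth;
}

/* One lock/unlock pair using the corrected define. */
static int depth_after_fixed_pair(void)
{
    read_depth = 0;
    model_read_lock_nort();
    model_read_unlock_nort_fixed();
    return read_depth;
}

static void demo_nort_pairing(void)
{
    assert(depth_after_buggy_pair() == 2);  /* never returns to 0 */
    assert(depth_after_fixed_pair() == 0);  /* balanced */
}
```

In this model every buggy lock/unlock pair leaves the nesting count two
higher instead of balanced, so the read-side section is never left.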
The network problem with PREEMPT_REALTIME enabled is a subtle race,
which I have nearly tracked down. I know the scenario, but I have not
yet identified the culprit. (:
tglx
On Sat, 2004-10-23 at 11:51 -0700, Paul E. McKenney wrote:
> On Fri, Oct 22, 2004 at 07:56:33PM +0200, Ingo Molnar wrote:
> >
> > i have released the -U10.2 Real-Time Preemption patch, which can be
> > downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> On realtime-preempt-2.6.9-mm1-U10.3:
>
> o In rcupdate.h, I believe that the:
>
> +# define rcu_read_unlock_nort() rcu_read_lock_nort()
>
> should instead be:
>
> +# define rcu_read_unlock_nort() rcu_read_unlock()
>
Oh no! That would explain a lot... the typical report is it works fine
until people go to use the network :-P
Lee
On Sat, Oct 23, 2004 at 10:24:51PM +0200, Ingo Molnar wrote:
> * Paul E. McKenney <[email protected]> wrote:
> > o The rcu_read_lock_spin(), rcu_read_lock_read(),
> > rcu_read_lock_bh_read(), rcu_read_lock_sem(), and
> > rcu_read_lock_bh_spin() APIs cannot be called recursively.
> > But you probably already knew that. ;-)
> >
> > I don't understand why the rcu_read_lock_sem() API gets its
> > own #ifdef.
>
> actually, rcu_read_lock_read() is the variant that _can_ be called
> recursively and which i used in the networking code quite extensively.
> The others are only useful if the locking is 'flat' in the original
> code, or if the locking is extensively rewritten. (I haven't tried to
> convert the IPC code back from the 'flat' locking to the original
> 'nested' locking, but i've done it for the networking code.)
OK, sorry for my confusion. I still don't see why rcu_read_lock_sem()
is segregated, but it will clearly work either way.
> > o Some recent RCU patches acquire the update-side lock
> > under rcu_read_lock(), which I believe will deadlock here.
>
> which codepaths do you mean? Things are looking pretty good in -U10.3 so
> far.
The one that I am aware of has not yet hit mainline -- Kaigai Kohei's
scalability changes to Linux. See:
http://marc.theaimsgroup.com/?l=linux-kernel&m=109628285418353&w=2
The function avc_update_cache() does an rcu_read_lock(), then
invokes avc_update_node(), which acquires the update-side lock.
This is no problem under conventional RCU, where one might realize that
an update is needed during a search that is read-only in the common
case, but it would be problematic given real-time preemption.
Thanx, Paul
> Ingo Molnar
>
> Rui Nuno Capela wrote:
>
>> OK. All affirmative. NIC is natsemi.
>>
>> Here it is:
>>
>> SysRq : IRQ 1/776: BUG in write_msg at drivers/net/netconsole.c:87
>
> doh! Go to line 77 and spot the bug. (yes, the PREEMPT_REALTIME needs to
> become CONFIG_PREEMPT_REALTIME) With that fixed does it work for you?
>
OK again. I also found another place where PREEMPT_REALTIME is used in
place of CONFIG_PREEMPT_REALTIME, in drivers/ide/ide-taskfile.c, lines
287 and 308 (see appended diffs).
Anyway, back to my jackd -R issue. Things are really different now:
hitting SysRq+T just about when it all freezes, I see nothing on the
netconsole capture end, only this single line:
SysRq : Show State
and nothing more.
--
rncbc aka Rui Nuno Capela
[email protected]
On Fri, Oct 22, 2004 at 07:56:33PM +0200, Ingo Molnar wrote:
>
> i have released the -U10.2 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
On realtime-preempt-2.6.9-mm1-U10.3:
o In rcupdate.h, I believe that the:
+# define rcu_read_unlock_nort() rcu_read_lock_nort()
should instead be:
+# define rcu_read_unlock_nort() rcu_read_unlock()
It looks to me like the current definition would cause
preemption to be permanently disabled in a kernel with
CONFIG_PREEMPT but without CONFIG_PREEMPT_REALTIME,
at least if one used SysV IPC.
o The rcu_read_lock_spin(), rcu_read_lock_read(),
rcu_read_lock_bh_read(), rcu_read_lock_sem(), and
rcu_read_lock_bh_spin() APIs cannot be called recursively.
But you probably already knew that. ;-)
I don't understand why the rcu_read_lock_sem() API gets its
own #ifdef.
o Some recent RCU patches acquire the update-side lock
under rcu_read_lock(), which I believe will deadlock here.
Since the same CPU/task is acquiring the same lock twice, I don't
believe that the mutex mods help, but could easily be mistaken.
Then again, this may well be why there are all the emails on
this thread advising that SELinux be disabled.
Thanx, Paul
> this is a fixes-only release.
>
> Changes since -U10:
>
> - fixed a big bug present ever since: the BKL got dropped when a
> spinlock-mutex was acquired and it scheduled away. This reduced the
> locking efficiency of the BKL. A number of outstanding problems could
> be affected, in particular this should fix the tty locking breakage
> reported by Alexander Batyrshin and Adam Heath. UP and SMP systems
> are affected too, with SMP systems having a higher chance to trigger
> this condition.
>
> - tulip.c breakage fix from Thomas Gleixner
>
> - tg3 and 3c59x fixes.
>
> - made the hardirq threads SCHED_FIFO by default. They get priorities
> between 25 and 50, depending on the irq #. (this is pretty random but
> i found no better scheme.) Made the softirq thread SCHED_FIFO by
> default as well, albeit this probably will have to change. These
> changes should make it easier to debug a hung system.
>
> to create a -U10.2 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.2
>
> Ingo
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
>
Ingo Molnar wrote:
>i have released the -U10.2 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this is a fixes-only release.
>
>Changes since -U10:
>
> - fixed a big bug present ever since: the BKL got dropped when a
> spinlock-mutex was acquired and it scheduled away. This reduced the
> locking efficiency of the BKL. A number of outstanding problems could
> be affected, in particular this should fix the tty locking breakage
> reported by Alexander Batyrshin and Adam Heath. UP and SMP systems
> are affected too, with SMP systems having a higher chance to trigger
> this condition.
>
> - tulip.c breakage fix from Thomas Gleixner
>
> - tg3 and 3c59x fixes.
>
> - made the hardirq threads SCHED_FIFO by default. They get priorities
> between 25 and 50, depending on the irq #. (this is pretty random but
> i found no better scheme.) Made the softirq thread SCHED_FIFO by
> default as well, albeit this probably will have to change. These
> changes should make it easier to debug a hung system.
>
>to create a -U10.2 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.2
>
> Ingo
I get the following error with -U10.3 on kernel 2.6.9-mm1:
Device 'i823650' does not have a release() function, it is broken and
must be fixed.
swapper/1: BUG in device_release at drivers/base/core.c:85
[<c0238aeb>] kobject_cleanup+0x9a/0x9c (8)
[<c0238b15>] kobject_put+0x1e/0x22 (24)
[<c0238aed>] kobject_release+0x0/0xa (8)
[<c045cecf>] init_i82365+0x1f2/0x208 (4)
[<c02487b6>] pci_register_driver+0x94/0xa6 (8)
[<c044286c>] do_initcalls+0x54/0xb6 (32)
[<c010040b>] init+0x0/0x10d (16)
[<c010043f>] init+0x34/0x10d (12)
[<c01042a8>] kernel_thread_helper+0x0/0xb (12)
[<c01042ad>] kernel_thread_helper+0x5/0xb (4)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x14/0x45 [<00000000>] / (0x0 [<00000000>])
Ingo Molnar wrote:
>* Lee Revell <[email protected]> wrote:
>
>
>>On Fri, 2004-10-22 at 17:49 -0400, Gene Heskett wrote:
>>
>>>Mmm, I get a 404 page not found when I click on this link.
>>>
>>Same here. The current version is 10.3:
>>
>>http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-U10.3
>>
I'm seeing an odd build error in the -U10.3 patch to 2.6.9-mm1:
<snip>
AS arch/i386/boot/compressed/head.o
CC arch/i386/boot/compressed/misc.o
OBJCOPY arch/i386/boot/compressed/vmlinux.bin
BFD: Warning: Writing section `.bss' to huge (ie negative) file offset
0xc03ac000.
objcopy: arch/i386/boot/compressed/vmlinux.bin: File truncated
make[2]: *** [arch/i386/boot/compressed/vmlinux.bin] Error 1
make[1]: *** [arch/i386/boot/compressed/vmlinux] Error 2
make: *** [bzImage] Error 2
[root@otaku linux-2.6.9]# objdump -f vmlinux
vmlinux: file format elf32-i386
architecture: i386, flags 0x00000112:
EXEC_P, HAS_SYMS, D_PAGED
start address 0x00100000
This appears to be a result of changes in:
arch/i386/kernel/vmlinux.lds.S
apparently for support of CONFIG_KERN_PHYS_OFFSET.
This causes the kernel LMA start address to
change from 0xc0100000 to 0x100000 and objcopy to
gag. I rolled back to a 2.6.9-mm1 version of the
above linker map file and did get the kernel to
build and boot.
Anyone else seeing this? .config attached.
BTW, the -U10.2 patch seems to have disappeared
from:
http://people.redhat.com/mingo/realtime-preempt/older/
--
[email protected]
Hi,
> I'm seeing an odd build error in the -U10.3 patch to 2.6.9-mm1:
>
> <snip>
>
> AS arch/i386/boot/compressed/head.o
> CC arch/i386/boot/compressed/misc.o
> OBJCOPY arch/i386/boot/compressed/vmlinux.bin
> BFD: Warning: Writing section `.bss' to huge (ie negative) file offset
> 0xc03ac000.
> objcopy: arch/i386/boot/compressed/vmlinux.bin: File truncated
> make[2]: *** [arch/i386/boot/compressed/vmlinux.bin] Error 1
> make[1]: *** [arch/i386/boot/compressed/vmlinux] Error 2
> make: *** [bzImage] Error 2
>
> [root@otaku linux-2.6.9]# objdump -f vmlinux
>
> vmlinux: file format elf32-i386
> architecture: i386, flags 0x00000112:
> EXEC_P, HAS_SYMS, D_PAGED
> start address 0x00100000
>
> This appears to be a result of changes in:
>
> arch/i386/kernel/vmlinux.lds.S
>
> apparently for support of CONFIG_KERN_PHYS_OFFSET.
> This causes the kernel LMA start address to
> change from 0xc0100000 to 0x100000 and objcopy to
> gag. I rolled back to a 2.6.9-mm1 version of the
> above linker map file and did get the kernel to
> build and boot.
>
> Anyone else seeing this? .config attached.
Yes.
You probably need to upgrade your binutils package. The LMA start address
of the .bss section is not handled the way it should be by ld.
Another (bad) way to work around this compile problem is to force the .bss
LMA start address with the following OBJCOPYFLAGS at objcopy time.
OBJCOPYFLAGS := -O binary --change-section-lma .bss-0xc0000000 -R .note -R .comment -S
Hope this helps,
Remi
i have released the -V0 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
NOTE: this is a highly experimental release, a more experimental one
than -U10.3.
the big change in the '-V' series of the patchset is that i have
converted the last couple of non-preemptible kernel subsystems to
fully-preemptible mutex-based locking. These subsystems are:
- the SLAB allocator
- the buddy page allocator
- waitqueue handling
- soft-timer subsystem
- security/selinux
- workqueues
- the random driver
this is probably the last 'big leap forward' in terms of the scope of
the patch. (having reached the ultimate scope: it now encompasses
everything ;)
But as an inevitable result of this big leap it will likely break in a
couple of places. Unfortunately these subsystems were largely
interdependent so it's an all-or-nothing step with not much middle
ground between the locking done in -U10.3 and in -V0.
another result of these changes is that the number of critical sections
in -V0 is roughly 30% of that in -U10.3. Now we only have the scheduler
and very lowlevel IRQ-hardware locks as raw spinlocks. (plus the lone
holdout vga_lock - which i will probably make a mutex too in the near
future)
[ NOTE: there's one known bug in this release: selinux on one of my
testsystems broke, it hangs during bootup. With CONFIG_SECURITY disabled
it works fine. I'm working on the fix. So please keep CONFIG_SECURITY
disabled for the time being. ]
other changes in -V0:
- build fixes: more driver fixes from Thomas Gleixner
- crash fix: fixed a bug found by Thomas Gleixner: rwsem runtime
initialization was racy.
- deadlock fix: fixed lockup bug caused by __schedule clearing
PREEMPT_ACTIVE. The need_resched loop is now outside of __schedule().
This might solve lockups/slowdowns reported by some people.
- latency fix: made keventd SCHED_FIFO - this could fix the mouse
related delays reported by a number of people.
- latency fix: fixed SMP lock-break mechanism of mutexes.
- usability feature: hard-interrupts get decreasing SCHED_FIFO priority
starting at prio 49 and stopping at prio 25. This should give a good
default.
- debug feature: implemented SysRq-D to show the list of tasks with
locks blocked on, if RW_SEM_DEADLOCK_DETECTION is enabled.
to create a -V0 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-V0
Ingo
Ingo Molnar wrote:
> [ NOTE: there's one known bug in this release: selinux on one of my
> testsystems broke, it hangs during bootup. With CONFIG_SECURITY disabled
> it works fine. I'm working on the fix. So please keep CONFIG_SECURITY
> disabled for the time being. ]
>
Does this include all models of security or just the selinux stuff?
> other changes in -V0:
>
> - build fixes: more driver fixes from Thomas Gleixner
>
> - crash fix: fixed a bug found by Thomas Gleixner: rwsem runtime
> initialization was racy.
>
> - deadlock fix: fixed lockup bug caused by __schedule clearing
> PREEMPT_ACTIVE. The need_resched loop is now outside of __schedule().
> This might solve lockups/slowdowns reported by some people.
>
> - latency fix: made keventd SCHED_FIFO - this could fix the mouse
> related delays reported by a number of people.
>
> - latency fix: fixed SMP lock-break mechanism of mutexes.
>
> - usability feature: hard-interrupts get decreasing SCHED_FIFO priority
> starting at prio 49 and stopping at prio 25. This should give a good
> default.
>
> - debug feature: implemented SysRq-D to show the list of tasks with
> locks blocked on, if RW_SEM_DEADLOCK_DETECTION is enabled.
>
> to create a -V0 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> + http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-V0
>
> Ingo
>
* K.R. Foley <[email protected]> wrote:
> >[ NOTE: there's one known bug in this release: selinux on one of my
> >testsystems broke, it hangs during bootup. With CONFIG_SECURITY disabled
> >it works fine. I'm working on the fix. So please keep CONFIG_SECURITY
> >disabled for the time being. ]
> >
> Does this include all models of security or just the selinux stuff?
i have only tried selinux. (which is installed/enabled by default on FC3
so it's easy for me to test on an out of box distro.)
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>>[ NOTE: there's one known bug in this release: selinux on one of my
>>>testsystems broke, it hangs during bootup. With CONFIG_SECURITY disabled
>>>it works fine. I'm working on the fix. So please keep CONFIG_SECURITY
>>>disabled for the time being. ]
>>>
>>
>>Does this include all models of security or just the selinux stuff?
>
>
> i have only tried selinux. (which is installed/enabled by default on FC3
> so it's easy for me to test on an out of box distro.)
>
> Ingo
>
Well I will know when it boots, or doesn't. :) I will report back when I
know.
kr
* Ingo Molnar <[email protected]> wrote:
> > >[ NOTE: there's one known bug in this release: selinux on one of my
> > >testsystems broke, it hangs during bootup. With CONFIG_SECURITY disabled
> > >it works fine. I'm working on the fix. So please keep CONFIG_SECURITY
> > >disabled for the time being. ]
> > >
> > Does this include all models of security or just the selinux stuff?
>
> i have only tried selinux. (which is installed/enabled by default on
> FC3 so it's easy for me to test on an out of box distro.)
i think i found the bug - now selinux boots fine. I've uploaded -V0.1
with the fix included. This fix could solve a number of other complaints
as well.
Ingo
On Mon, 25 Oct 2004 14:12:10 +0200
Ingo Molnar <[email protected]> wrote:
> i think i found the bug - now selinux boots fine. I've uploaded -V0.1
> with the fix included. This fix could solve a number of other complaints
> as well.
hi, i saw these during boot (config and complete dmesg attached):
Freeing unused kernel memory: 348k freed
Adding 289160k swap on /dev/hda3. Priority:-1 extents:1
EXT3 FS on hdc1, internal journal
IRQ#8 thread RT prio: 45.
BUG: sleeping function called from invalid context modprobe(116) at kernel/mutex.c:28
in_atomic():1 [00000001], irqs_disabled():1
[<c0117182>] __might_sleep+0xc2/0xe0 (12)
[<c0134989>] resolve_symbol+0xb9/0xc0 (24)
[<c01309f8>] _mutex_lock+0x38/0x50 (12)
[<c0144995>] kmem_cache_alloc+0x45/0x100 (24)
[<c0134989>] resolve_symbol+0xb9/0xc0 (8)
[<c0133cfc>] use_module+0x4c/0x160 (4)
[<c0133d50>] use_module+0xa0/0x160 (20)
[<f08414f0>] unregister_sound_special+0x0/0x40 [soundcore] (12)
[<c0134989>] resolve_symbol+0xb9/0xc0 (16)
[<c0134fe2>] simplify_symbols+0xb2/0x120 (48)
[<c0135cc3>] load_module+0x573/0xa90 (44)
[<c013100d>] __mcount+0x1d/0x20 (48)
[<c0136236>] sys_init_module+0x56/0x260 (112)
[<c010617b>] syscall_call+0x7/0xb (28)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: resolve_symbol+0x21/0xc0 [<c01348f1>] / (simplify_symbols+0xb2/0x120 [<c0134fe2>])
.. entry 2: print_traces+0x17/0x90 [<c0131fc7>] / (dump_stack+0x23/0x30 [<c0106733>])
PCI: Found IRQ 3 for device 0000:00:0f.0
sis900.c: v1.08.07 11/02/2003
[...]
EXT3-fs: mounted filesystem with ordered data mode.
eth0: Media Link On 100mbps full-duplex
IRQ#5 thread RT prio: 44.
ip_tables: (C) 2000-2002 Netfilter core team
BUG: sleeping function called from invalid context modprobe(591) at kernel/mutex.c:28
in_atomic():1 [00000001], irqs_disabled():1
[<c0117182>] __might_sleep+0xc2/0xe0 (12)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (24)
[<c01309f8>] _mutex_lock+0x38/0x50 (12)
[<c0144995>] kmem_cache_alloc+0x45/0x100 (24)
[<c0134989>] resolve_symbol+0xb9/0xc0 (8)
[<c0133cfc>] use_module+0x4c/0x160 (4)
[<c0133d50>] use_module+0xa0/0x160 (20)
[<f0914030>] ipt_do_table+0x0/0x320 [ip_tables] (12)
[<c0134989>] resolve_symbol+0xb9/0xc0 (16)
[<c0134fe2>] simplify_symbols+0xb2/0x120 (48)
[<c0135cc3>] load_module+0x573/0xa90 (44)
[<c01ef27b>] __up_write+0x13b/0x320 (48)
[<c0136236>] sys_init_module+0x56/0x260 (112)
[<c010617b>] syscall_call+0x7/0xb (28)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: resolve_symbol+0x21/0xc0 [<c01348f1>] / (simplify_symbols+0xb2/0x120 [<c0134fe2>])
.. entry 2: print_traces+0x17/0x90 [<c0131fc7>] / (dump_stack+0x23/0x30 [<c0106733>])
ip_conntrack version 2.1 (6144 buckets, 49152 max) - 452 bytes per conntrack
Module Size Used by
ipt_addrtype 2112 1
ipt_state 1920 1
ip_conntrack 51096 1 ipt_state
iptable_filter 3136 1
ip_tables 19584 3 ipt_addrtype,ipt_state,iptable_filter
bsd_comp 5952 0
ppp_deflate 6272 0
zlib_deflate 22360 1 ppp_deflate
ppp_async 13784 1
ppp_generic 34276 7 bsd_comp,ppp_deflate,ppp_async
slhc 8320 1 ppp_generic
crc_ccitt 2112 1 ppp_async
sis900 20036 0
crc32 4352 1 sis900
snd_cs46xx 84168 1
snd_rawmidi 26592 1 snd_cs46xx
snd_seq_device 8972 1 snd_rawmidi
snd_ac97_codec 77472 1 snd_cs46xx
snd_pcm 100600 2 snd_cs46xx,snd_ac97_codec
snd_timer 27612 1 snd_pcm
snd 57668 8 snd_cs46xx,snd_rawmidi,snd_seq_device,snd_ac97_codec,snd_pcm,snd_timer
soundcore 10688 1 snd
snd_page_alloc 10244 2 snd_cs46xx,snd_pcm
gameport 4992 1 snd_cs46xx
flo
On Mon, 25 Oct 2004 14:12:10 +0200
Ingo Molnar <[email protected]> wrote:
> i think i found the bug - now selinux boots fine. I've uploaded -V0.1
> with the fix included. This fix could solve a number of other complaints
> as well.
some more:
mozilla-bin/753: BUG in futex_wait at kernel/futex.c:542
[<c0132962>] futex_wait+0x192/0x1a0 (12)
[<c01ef27b>] __up_write+0x13b/0x320 (84)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (16)
[<c02b9880>] down_write+0xd0/0x2b0 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c02b9880>] down_write+0xd0/0x2b0 (4)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (8)
[<c01ef27b>] __up_write+0x13b/0x320 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c01ef27b>] __up_write+0x13b/0x320 (4)
[<c0115f60>] default_wake_function+0x0/0x20 (72)
[<c0115f60>] default_wake_function+0x0/0x20 (32)
[<c0132c17>] do_futex+0x47/0xa0 (40)
[<c0132d60>] sys_futex+0xf0/0x100 (40)
[<c010617b>] syscall_call+0x7/0xb (68)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x90 [<c0131fc7>] / (dump_stack+0x23/0x30 [<c0106733>])
[...]
mozilla-bin/753: BUG in futex_wait at kernel/futex.c:542
[<c0132962>] futex_wait+0x192/0x1a0 (12)
[<c01ef27b>] __up_write+0x13b/0x320 (84)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (16)
[<c02b9880>] down_write+0xd0/0x2b0 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c02b9880>] down_write+0xd0/0x2b0 (4)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (8)
[<c01ef27b>] __up_write+0x13b/0x320 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c01ef27b>] __up_write+0x13b/0x320 (4)
[<c0115f60>] default_wake_function+0x0/0x20 (72)
[<c0115f60>] default_wake_function+0x0/0x20 (32)
[<c0132c17>] do_futex+0x47/0xa0 (40)
[<c0132d60>] sys_futex+0xf0/0x100 (40)
[<c010617b>] syscall_call+0x7/0xb (68)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x90 [<c0131fc7>] / (dump_stack+0x23/0x30 [<c0106733>])
mozilla-bin/753: BUG in futex_wait at kernel/futex.c:542
[<c0132962>] futex_wait+0x192/0x1a0 (12)
[<c01ef27b>] __up_write+0x13b/0x320 (84)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (16)
[<c02b9880>] down_write+0xd0/0x2b0 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c02b9880>] down_write+0xd0/0x2b0 (4)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (8)
[<c01ef27b>] __up_write+0x13b/0x320 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c01ef27b>] __up_write+0x13b/0x320 (4)
[<c0115f60>] default_wake_function+0x0/0x20 (72)
[<c0115f60>] default_wake_function+0x0/0x20 (32)
[<c0132c17>] do_futex+0x47/0xa0 (40)
[<c0132d60>] sys_futex+0xf0/0x100 (40)
[<c010617b>] syscall_call+0x7/0xb (68)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x90 [<c0131fc7>] / (dump_stack+0x23/0x30 [<c0106733>])
* Florian Schmidt <[email protected]> wrote:
> hi, i saw these during boot (config and complete dmesg attached):
>
> Freeing unused kernel memory: 348k freed
> Adding 289160k swap on /dev/hda3. Priority:-1 extents:1
> EXT3 FS on hdc1, internal journal
> IRQ#8 thread RT prio: 45.
> BUG: sleeping function called from invalid context modprobe(116) at kernel/mutex.c:28
> in_atomic():1 [00000001], irqs_disabled():1
> [<c0117182>] __might_sleep+0xc2/0xe0 (12)
> [<c0134989>] resolve_symbol+0xb9/0xc0 (24)
> [<c01309f8>] _mutex_lock+0x38/0x50 (12)
> [<c0144995>] kmem_cache_alloc+0x45/0x100 (24)
> [<c0134989>] resolve_symbol+0xb9/0xc0 (8)
does the patch below fix this?
Ingo
--- linux/kernel/module.c.orig
+++ linux/kernel/module.c
@@ -53,7 +53,7 @@
 #define INIT_OFFSET_MASK (1UL << (BITS_PER_LONG-1))
 
 /* Protects module list */
-static DECLARE_RAW_SPINLOCK(modlist_lock);
+static DECLARE_SPINLOCK(modlist_lock);
 
 /* List of modules, protected by module_mutex AND modlist_lock */
 static DECLARE_MUTEX(module_mutex);
On Mon, 25 Oct 2004 15:39:40 +0200
Florian Schmidt <[email protected]> wrote:
> On Mon, 25 Oct 2004 14:12:10 +0200
> Ingo Molnar <[email protected]> wrote:
>
> > i think i found the bug - now selinux boots fine. I've uploaded -V0.1
> > with the fix included. This fix could solve a number of other complaints
> > as well.
>
> some more:
[snip]
i forgot to mention these were from the same session as the previous one.
also i think i missed the first ones, so the reports in this mail are
probably useless(?).
flo
* Florian Schmidt <[email protected]> wrote:
> [snip]
>
> i forgot to mention these were from the same session as the previous
> one. also i think i missed the first ones, so the reports in this mail
> are probably useless(?).
i think the futex assert is a separate problem not triggered by the
module.c warnings.
Ingo
On Mon, 25 Oct 2004 15:26:05 +0200
Ingo Molnar <[email protected]> wrote:
> does the patch below fix this?
looks like it. they didn't show on first boot of the new kernel with patch
applied.
Btw: i still experience some "pauses". They are different now though. It
seems i can trigger them by reloading a page in mozilla (not always). This
BUG definitely looks related. Dunno, when exactly it happened (related to
what i did at that moment), but it's the only one in dmesg output on this
bootup. Each of the pauses is accompanied by a high cpu usage of ksoftirqd.
I cannot retrigger the BUG though.
mozilla-bin/763: BUG in futex_wait at kernel/futex.c:542
[<c0132962>] futex_wait+0x192/0x1a0 (12)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (64)
[<c01ef17b>] __up_write+0x13b/0x320 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c01ef17b>] __up_write+0x13b/0x320 (4)
[<c02b94d3>] down_read+0xd3/0x2b0 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c01eef11>] up_read+0x111/0x240 (8)
[<c0131788>] check_preempt_timing+0x58/0x290 (8)
[<c0131c55>] sub_preempt_count+0x65/0xd0 (4)
[<c01eef11>] up_read+0x111/0x240 (4)
[<c0115f60>] default_wake_function+0x0/0x20 (104)
[<c0115f60>] default_wake_function+0x0/0x20 (32)
[<c0132c17>] do_futex+0x47/0xa0 (40)
[<c0132d60>] sys_futex+0xf0/0x100 (40)
[<c010617b>] syscall_call+0x7/0xb (68)
preempt count: 00000001
. 1-level deep critical section nesting:
.. entry 1: print_traces+0x17/0x90 [<c0131fc7>] / (dump_stack+0x23/0x30 [<c0106733>])
Here's the syslog entry:
Oct 25 15:56:31 mango kernel: mozilla-bin/763: BUG in futex_wait at kernel/futex.c:542
Oct 25 15:56:31 mango kernel: [futex_requeue+162/608] futex_wait+0x192/0x1a0 (12)
Oct 25 15:56:31 mango kernel: [print_name_offset+149/160] sub_preempt_count+0x65/0xd0 (64)
Oct 25 15:56:31 mango kernel: [zlib_inflateReset+43/128] __up_write+0x13b/0x320 (8)
Oct 25 15:56:31 mango kernel: [kfifo_alloc+216/240] check_preempt_timing+0x58/0x290 (8)
Oct 25 15:56:31 mango kernel: [print_name_offset+149/160] sub_preempt_count+0x65/0xd0 (4)
Oct 25 15:56:31 mango kernel: [zlib_inflateReset+43/128] __up_write+0x13b/0x320 (4)
Oct 25 15:56:31 mango kernel: [__func__.0+318/987] down_read+0xd3/0x2b0 (8)
Oct 25 15:56:31 mango kernel: [print_name_offset+149/160] sub_preempt_count+0x65/0xd0 (4)
Oct 25 15:56:31 mango kernel: [zlib_inflate_fast+465/1024] up_read+0x111/0x240 (8)
Oct 25 15:56:31 mango kernel: [kfifo_alloc+216/240] check_preempt_timing+0x58/0x290 (8)
Oct 25 15:56:31 mango kernel: [print_name_offset+149/160] sub_preempt_count+0x65/0xd0 (4)
Oct 25 15:56:31 mango kernel: [zlib_inflate_fast+465/1024] up_read+0x111/0x240 (4)
Oct 25 15:56:31 mango kernel: [wake_up_process+32/48] default_wake_function+0x0/0x20 (104)
Oct 25 15:56:31 mango kernel: [wake_up_process+32/48] default_wake_function+0x0/0x20 (32)
Oct 25 15:56:31 mango kernel: [unqueue_me+103/256] do_futex+0x47/0xa0 (40)
Oct 25 15:56:31 mango kernel: [futex_wait+176/400] sys_futex+0xf0/0x100 (40)
Oct 25 15:56:31 mango kernel: [irq_entries_start+107/128] syscall_call+0x7/0xb (68)
Oct 25 15:56:31 mango kernel: preempt count: 00000001
Oct 25 15:56:31 mango kernel: . 1-level deep critical section nesting:
Oct 25 15:56:31 mango kernel: .. entry 1: print_traces+0x17/0x90 [update_max_trace+23/160] / (dump_stack+0x23/0x30 [show_registers+131/464])
* Florian Schmidt <[email protected]> wrote:
> > does the patch below fix this?
>
> looks like it. they didn't show on first boot of the new kernel with
> patch applied.
ok, i've added it and uploaded -V0.2 together with another fix: there
was a scheduler recursion possible via the delayed-put mechanism using
workqueues - now it's using its own separate lists and per-CPU threads.
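The thread does not show the delayed-put code itself, so the following is only a userspace sketch of the general idea described above: instead of handing the final put to a workqueue, the object is queued on a per-CPU list and a dedicated helper frees it later. All names here (model_obj, delayed_put, drain_pending, MODEL_NR_CPUS) are made up for illustration, and the locking a real kernel would need is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

#define MODEL_NR_CPUS 2	/* hypothetical CPU count for this model */

struct model_obj {
	struct model_obj *next;	/* link on the per-CPU pending list */
	int id;
};

/* One pending list per CPU. A kernel version would also need locking
 * or preemption control around these lists; omitted in this sketch. */
static struct model_obj *pending[MODEL_NR_CPUS];

/* Allocate a model object; returns NULL if malloc() fails. */
static struct model_obj *model_obj_new(int id)
{
	struct model_obj *obj = malloc(sizeof(*obj));

	if (obj) {
		obj->next = NULL;
		obj->id = id;
	}
	return obj;
}

/* Called from a context where freeing directly is unsafe:
 * only queue the object, do not free it here. */
static void delayed_put(int cpu, struct model_obj *obj)
{
	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	obj->next = pending[cpu];
	pending[cpu] = obj;
}

/* Body of the per-CPU helper: detach the whole list first, then free
 * each entry. Returns the number of objects released. */
static int drain_pending(int cpu)
{
	struct model_obj *list;
	int freed = 0;

	assert(cpu >= 0 && cpu < MODEL_NR_CPUS);
	list = pending[cpu];
	pending[cpu] = NULL;
	while (list) {
		struct model_obj *next = list->next;

		free(list);
		list = next;
		freed++;
	}
	return freed;
}
```

In this model the caller of delayed_put() never frees anything itself, so it cannot re-enter whatever machinery the free path would use; only drain_pending(), run from its own thread, does the freeing.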
> Btw: i still experience some "pauses". They are different now though.
> It seems i can trigger them by reloading a page in mozilla (not
> always). This BUG definitely looks related. Dunno, when exactly it
> happened (related to what i did at that moment), but it's the only one
> in dmesg output on this bootup. Each of the pauses is accompanied by a
> high cpu usage of ksoftirqd. I cannot retrigger the BUG though.
please try -V0.2 - maybe the delayed-put fix is somehow related. (but
only maybe...)
Ingo
* Ingo Molnar <[email protected]> wrote:
> ok, i've added it and uploaded -V0.2 together with another fix: there
> was a scheduler recursion possible via the delayed-put mechanism using
> workqueues - now it's using its own separate lists and per-CPU
> threads.
-V0.2 seems to behave quite well on my testboxes - i'm unable to
reproduce the selinux boot hang anymore.
Ingo
On Mon, 25 Oct 2004 16:10:08 +0200
Ingo Molnar <[email protected]> wrote:
> > Btw: i still experience some "pauses". They are different now though.
> > It seems i can trigger them by reloading a page in mozilla (not
> > always). This BUG definitely looks related. Dunno, when exactly it
> > happened (related to what i did at that moment), but it's the only one
> > in dmesg output on this bootup. Each of the pauses is accompanied by a
> > high cpu usage of ksoftirqd. I cannot retrigger the BUG though.
>
> please try -V0.2 - maybe the delayed-put fix is somehow related. (but
> only maybe...)
>
doesn't seem so. V0.2 doesn't fix this for me. This time i got a BUG storm
again in syslog (it kinda seems related to starting playback in xmms plus
loading pages in mozilla. will boot again to verify):
Oct 25 16:53:42 mango kernel: IRQ#3 thread RT prio: 43.
Oct 25 16:53:52 mango kernel: mozilla-bin/741: BUG in futex_wait at kernel/futex.c:542
Oct 25 16:53:52 mango kernel: [add_preempt_count+130/224] futex_wait+0x192/0x1a0 (12)
Oct 25 16:53:52 mango kernel: [zlib_inflate_blocks+1352/3088] __up_write+0x148/0x320 (100)
Oct 25 16:53:52 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (16)
Oct 25 16:53:52 mango kernel: [__func__.13+351/922] down_write+0xd5/0x250 (8)
Oct 25 16:53:52 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:52 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:52 mango kernel: [__func__.13+351/922] down_write+0xd5/0x250 (4)
Oct 25 16:53:52 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (8)
Oct 25 16:53:52 mango kernel: [zlib_inflate_blocks+1352/3088] __up_write+0x148/0x320 (8)
Oct 25 16:53:52 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:52 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [zlib_inflate_blocks+1352/3088] __up_write+0x148/0x320 (4)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (72)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (32)
Oct 25 16:53:56 mango kernel: [print_last_trace+247/256] do_futex+0x47/0xa0 (40)
Oct 25 16:53:56 mango kernel: [get_futex_key+128/448] sys_futex+0xf0/0x100 (40)
Oct 25 16:53:56 mango kernel: [syscall_fault+35/40] syscall_call+0x7/0xb (68)
Oct 25 16:53:56 mango kernel: preempt count: 00000001
Oct 25 16:53:56 mango kernel: . 1-level deep critical section nesting:
Oct 25 16:53:56 mango kernel: .. entry 1: print_traces+0x1d/0x70 [__kfifo_put+157/208] / (dump_stack+0x23/0x30 [show_registers+3/464])
Oct 25 16:53:56 mango kernel:
Oct 25 16:53:56 mango kernel: kernel/futex.c:542
Oct 25 16:53:56 mango kernel: [add_preempt_count+130/224] futex_wait+0x192/0x1a0 (12)
Oct 25 16:53:56 mango kernel: [dequeue_task+24/64] effective_prio+0x8/0x60 (80)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+136/624] recalc_task_prio+0x98/0x190 (8)
Oct 25 16:53:56 mango kernel: [task_rq_lock+98/112] enqueue_task+0x12/0x50 (24)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+487/624] activate_task+0x67/0x80 (16)
Oct 25 16:53:56 mango kernel: [__mon_yday+161/268] preempt_schedule+0x11/0x80 (12)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (16)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (8)
Oct 25 16:53:56 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (4)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (60)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (12)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (20)
Oct 25 16:53:56 mango kernel: [print_last_trace+247/256] do_futex+0x47/0xa0 (40)
Oct 25 16:53:56 mango kernel: [get_futex_key+128/448] sys_futex+0xf0/0x100 (40)
Oct 25 16:53:56 mango kernel: [syscall_fault+35/40] syscall_call+0x7/0xb (68)
Oct 25 16:53:56 mango kernel: preempt count: 00000001
Oct 25 16:53:56 mango kernel: . 1-level deep critical section nesting:
Oct 25 16:53:56 mango kernel: .. entry 1: print_traces+0x1d/0x70 [__kfifo_put+157/208] / (dump_stack+0x23/0x30 [show_registers+3/464])
Oct 25 16:53:56 mango kernel:
Oct 25 16:53:56 mango kernel: mozilla-bin/741: BUG in futex_wait at kernel/futex.c:542
Oct 25 16:53:56 mango kernel: [add_preempt_count+130/224] futex_wait+0x192/0x1a0 (12)
Oct 25 16:53:56 mango kernel: [dequeue_task+24/64] effective_prio+0x8/0x60 (80)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+136/624] recalc_task_prio+0x98/0x190 (8)
Oct 25 16:53:56 mango kernel: [task_rq_lock+98/112] enqueue_task+0x12/0x50 (24)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+487/624] activate_task+0x67/0x80 (16)
Oct 25 16:53:56 mango kernel: [__mon_yday+161/268] preempt_schedule+0x11/0x80 (12)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (16)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (8)
Oct 25 16:53:56 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (4)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (60)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (12)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (20)
Oct 25 16:53:56 mango kernel: [print_last_trace+247/256] do_futex+0x47/0xa0 (40)
Oct 25 16:53:56 mango kernel: [get_futex_key+128/448] sys_futex+0xf0/0x100 (40)
Oct 25 16:53:56 mango kernel: [syscall_fault+35/40] syscall_call+0x7/0xb (68)
Oct 25 16:53:56 mango kernel: preempt count: 00000001
Oct 25 16:53:56 mango kernel: . 1-level deep critical section nesting:
Oct 25 16:53:56 mango kernel: .. entry 1: print_traces+0x1d/0x70 [__kfifo_put+157/208] / (dump_stack+0x23/0x30 [show_registers+3/464])
Oct 25 16:53:56 mango kernel:
Oct 25 16:53:56 mango kernel: mozilla-bin/741: BUG in futex_wait at kernel/futex.c:542
Oct 25 16:53:56 mango kernel: [add_preempt_count+130/224] futex_wait+0x192/0x1a0 (12)
Oct 25 16:53:56 mango kernel: [dequeue_task+24/64] effective_prio+0x8/0x60 (80)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+136/624] recalc_task_prio+0x98/0x190 (8)
Oct 25 16:53:56 mango kernel: [task_rq_lock+98/112] enqueue_task+0x12/0x50 (24)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+487/624] activate_task+0x67/0x80 (16)
Oct 25 16:53:56 mango kernel: [__mon_yday+161/268] preempt_schedule+0x11/0x80 (12)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (16)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (8)
Oct 25 16:53:56 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (4)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (60)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (12)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (20)
Oct 25 16:53:56 mango kernel: [print_last_trace+247/256] do_futex+0x47/0xa0 (40)
Oct 25 16:53:56 mango kernel: [get_futex_key+128/448] sys_futex+0xf0/0x100 (40)
Oct 25 16:53:56 mango kernel: [syscall_fault+35/40] syscall_call+0x7/0xb (68)
Oct 25 16:53:56 mango kernel: preempt count: 00000001
Oct 25 16:53:56 mango kernel: . 1-level deep critical section nesting:
Oct 25 16:53:56 mango kernel: .. entry 1: print_traces+0x1d/0x70 [__kfifo_put+157/208] / (dump_stack+0x23/0x30 [show_registers+3/464])
Oct 25 16:53:56 mango kernel:
Oct 25 16:53:56 mango kernel: mozilla-bin/741: BUG in futex_wait at kernel/futex.c:542
Oct 25 16:53:56 mango kernel: [add_preempt_count+130/224] futex_wait+0x192/0x1a0 (12)
Oct 25 16:53:56 mango kernel: [dequeue_task+24/64] effective_prio+0x8/0x60 (80)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+136/624] recalc_task_prio+0x98/0x190 (8)
Oct 25 16:53:56 mango kernel: [task_rq_lock+98/112] enqueue_task+0x12/0x50 (24)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+487/624] activate_task+0x67/0x80 (16)
Oct 25 16:53:56 mango kernel: [__mon_yday+161/268] preempt_schedule+0x11/0x80 (12)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (16)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (8)
Oct 25 16:53:56 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (4)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (60)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (12)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (20)
Oct 25 16:53:56 mango kernel: [print_last_trace+247/256] do_futex+0x47/0xa0 (40)
Oct 25 16:53:56 mango kernel: [get_futex_key+128/448] sys_futex+0xf0/0x100 (40)
Oct 25 16:53:56 mango kernel: [syscall_fault+35/40] syscall_call+0x7/0xb (68)
Oct 25 16:53:56 mango kernel: preempt count: 00000001
Oct 25 16:53:56 mango kernel: . 1-level deep critical section nesting:
Oct 25 16:53:56 mango kernel: .. entry 1: print_traces+0x1d/0x70 [__kfifo_put+157/208] / (dump_stack+0x23/0x30 [show_registers+3/464])
Oct 25 16:53:56 mango kernel:
Oct 25 16:53:56 mango kernel: mozilla-bin/741: BUG in futex_wait at kernel/futex.c:542
Oct 25 16:53:56 mango kernel: [add_preempt_count+130/224] futex_wait+0x192/0x1a0 (12)
Oct 25 16:53:56 mango kernel: [dequeue_task+24/64] effective_prio+0x8/0x60 (80)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+136/624] recalc_task_prio+0x98/0x190 (8)
Oct 25 16:53:56 mango kernel: [task_rq_lock+98/112] enqueue_task+0x12/0x50 (24)
Oct 25 16:53:56 mango kernel: [decay_avgs_and_calculate_rates+487/624] activate_task+0x67/0x80 (16)
Oct 25 16:53:56 mango kernel: [__mon_yday+161/268] preempt_schedule+0x11/0x80 (12)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (16)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (8)
Oct 25 16:53:56 mango kernel: [kthread_create+120/208] check_preempt_timing+0x58/0x290 (8)
Oct 25 16:53:56 mango kernel: [wake_bit_function+5/96] sub_preempt_count+0x65/0xd0 (4)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (4)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (60)
Oct 25 16:53:56 mango kernel: [__func__.5+7/19] preempt_schedule_irq+0x6f/0xa0 (12)
Oct 25 16:53:56 mango kernel: [deactivate_task+0/64] default_wake_function+0x0/0x20 (20)
Oct 25 16:53:56 mango kernel: [print_last_trace+247/256] do_futex+0x47/0xa0 (40)
Oct 25 16:53:56 mango kernel: [get_futex_key+128/448] sys_futex+0xf0/0x100 (40)
Oct 25 16:53:56 mango kernel: [syscall_fault+35/40] syscall_call+0x7/0xb (68)
Oct 25 16:53:56 mango kernel: preempt count: 00000001
Oct 25 16:53:56 mango kernel: . 1-level deep critical section nesting:
Oct 25 16:53:56 mango kernel: .. entry 1: print_traces+0x1d/0x70 [__kfifo_put+157/208] / (dump_stack+0x23/0x30 [show_registers+3/464])
Oct 25 16:53:56 mango kernel:
[gazillions more]
flo
Florian Schmidt wrote:
> On Mon, 25 Oct 2004 16:10:08 +0200
> Ingo Molnar <[email protected]> wrote:
>
>
>>>Btw: i still experience some "pauses". They are different now though.
>>>It seems i can trigger them by reloading a page in mozilla (not
>>>always). This BUG definitely looks related. Dunno, when exactly it
>>>happened (related to what i did at that moment), but it's the only one
>>>in dmesg output on this bootup. Each of the pauses is accompanied by a
>>>high cpu usage of ksoftirqd. I cannot retrigger the BUG though.
>>
>>please try -V0.2 - maybe the delayed-put fix is somehow related. (but
>>only maybe...)
>>
>
>
> doesn't seem so. V0.2 doesn't fix this for me. This time i got a BUG storm
> again in syslog (it kinda seems related to starting playback in xmms plus
> loading pages in mozilla. will boot again to verify):
>
Well, I have now gotten a couple of these too (with V0.2). They all
seem to be generated by firefox or thunderbird and the traces are all
identical except for the offending process.
Oct 25 11:22:11 swdev14 kernel:
Oct 25 11:22:20 swdev14 kernel: thunderbird-bin/3946: BUG in futex_wait at kernel/futex.c:542
Oct 25 11:22:20 swdev14 kernel: [<c0136389>] futex_wait+0x192/0x19c (12)
Oct 25 11:22:20 swdev14 kernel: [<c0135646>] sub_preempt_count+0x75/0xd8 (72)
Oct 25 11:22:20 swdev14 kernel: [<c02aa9f2>] _spin_unlock+0x1a/0x34 (4)
Oct 25 11:22:20 swdev14 kernel: [<c02aa9f2>] _spin_unlock+0x1a/0x34 (84)
Oct 25 11:22:20 swdev14 kernel: [<c01120ac>] mcount+0x14/0x18 (4)
Oct 25 11:22:20 swdev14 kernel: [<c02aa9f2>] _spin_unlock+0x1a/0x34 (20)
Oct 25 11:22:20 swdev14 kernel: [<c0118e15>] default_wake_function+0x0/0x1c (60)
Oct 25 11:22:20 swdev14 kernel: [<c0118e15>] default_wake_function+0x0/0x1c (32)
Oct 25 11:22:20 swdev14 kernel: [<c0136777>] sys_futex+0xf0/0xfc (12)
Oct 25 11:22:20 swdev14 kernel: [<c01120ac>] mcount+0x14/0x18 (8)
Oct 25 11:22:20 swdev14 kernel: [<c0136637>] do_futex+0x47/0x97 (20)
Oct 25 11:22:20 swdev14 kernel: [<c0136777>] sys_futex+0xf0/0xfc (40)
Oct 25 11:22:20 swdev14 kernel: [<c010623d>] sysenter_past_esp+0x52/0x71 (68)
Oct 25 11:22:20 swdev14 kernel: preempt count: 00000001
Oct 25 11:22:20 swdev14 kernel: . 1-level deep critical section nesting:
Oct 25 11:22:20 swdev14 kernel: .. entry 1: print_traces+0x1d/0x59 [<c0135a28>] / (dump_stack+0x23/0x27 [<c01070db>])
Oct 25 11:22:20 swdev14 kernel:
Ingo Molnar wrote:
> i have released the -V0 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
Actually pertaining to V0.2. I just got my UP system booted up on V0.2
and got this in the log. I did notice that this is not new to this
release. It has been here at least since U10.3. Sorry I didn't catch it
sooner.
Oct 25 13:31:56 daffy kernel: IRQ#11 thread RT prio: 43.
Oct 25 13:31:56 daffy kernel: ip/2432: BUG in enable_irq at kernel/irq/manage.c:112
Oct 25 13:31:56 daffy kernel: [<c01396ab>] enable_irq+0xfb/0x100 (12)
Oct 25 13:31:56 daffy kernel: [<d0975614>] e100_up+0x114/0x200 [e100] (48)
Oct 25 13:31:56 daffy kernel: [<d0976a20>] e100_open+0x30/0x80 [e100] (48)
Oct 25 13:31:56 daffy kernel: [<c0113154>] mcount+0x14/0x18 (12)
Oct 25 13:31:56 daffy kernel: [<c0265d98>] dev_open+0x88/0xa0 (20)
Oct 25 13:31:56 daffy kernel: [<c02677cd>] dev_change_flags+0x5d/0x140 (28)
Oct 25 13:31:56 daffy kernel: [<c02653ee>] __dev_get_by_name+0xe/0xd0 (8)
Oct 25 13:31:56 daffy kernel: [<c02af3d7>] devinet_ioctl+0x277/0x6c0 (28)
Oct 25 13:31:56 daffy kernel: [<c02b1894>] inet_ioctl+0x64/0xb0 (108)
Oct 25 13:31:56 daffy kernel: [<c025c048>] sock_ioctl+0xc8/0x250 (28)
Oct 25 13:31:56 daffy kernel: [<c0171cf7>] sys_ioctl+0xf7/0x260 (32)
Oct 25 13:31:56 daffy kernel: [<c01064ed>] sysenter_past_esp+0x52/0x71 (48)
Oct 25 13:31:56 daffy kernel: preempt count: 00000002
Oct 25 13:31:56 daffy kernel: . 2-level deep critical section nesting:
Oct 25 13:31:56 daffy kernel: .. entry 1: enable_irq+0x33/0x100 [<c01395e3>] / (e100_up+0x114/0x200 [e100] [<d0975614>])
Oct 25 13:31:56 daffy kernel: .. entry 2: print_traces+0x1d/0x60 [<c0132ecd>] / (dump_stack+0x23/0x30 [<c0106b23>])
Oct 25 13:31:56 daffy kernel:
OK. Am now trying with -V0.2, it works better but locks up in more
mysterious ways.... The modprobe messages are gone - thanks.
Was able (once) to get the X server up and started some of my tests
but the machine locked up (no mouse movement, no response to keyboard)
and had to use the hardware reset to recover. Only significant message
in the system log was
BUG: sleeping function called from invalid context hdparm(3606) at kernel/mutex.c
in_atomic():0 [00000000], irqs_disabled():1
... will send stack traceback separately ...
when setting udma2 mode in hdparm.
Also noticed a "14 minute gap" in the log file, presumably when I was
running my real time test. I could not get control back until the first
test had run to completion (but heard the audio - so the machine was
working...). Machine locked up within the next 4 minutes.
The second try, the X server came up but the system froze when I tried
to login (according to the splash screen, was reloading my environment).
Messages in the log file were normal until the failure.
Third try, booting with selinux=0. Froze up again, this time the X
server did not make it all the way up. Last image is the blue background
with the hourglass cursor (frozen in center).
I will send what I can, please advise any further tests or data you
need for analysis.
--Mark H Johnson
<mailto:[email protected]>
>i have released the -V0 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
Actually, I picked up -V0.1 since that was available when I started
this morning and of course -V0.2 came out while I was building....
The kernel I built does not even make it to single user mode unless
I disable selinux.
First try had console messages stop after the message
INIT: version 2.85 booting
but the Alt-Sysrq keys do work. Alt-Sysrq-L shows...
Pid: 5, comm: ksoftirqd/1
EIP: 0060:[<c011d808>] CPU:1
EIP is at smp_processor_id+0x28/0xc0
(registers and stack trace...)
preempt count: 00010002
...
Pid: 266, comm: IRQ 1
EIP: 0060:[<c0115b70>] CPU:0
EIP is at nmi_show_all_regs+0xd0/0x120
(registers and stack trace...)
preempt count: 00010002
...
If I repeat this a few times, the first process changes to "hotplug".
Repeating Alt-Sysrq-L eventually gets stuck and requires
a hardware reset.
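As a reading aid for the "preempt count: 00010002" values above, here is a small decoder. It assumes the field layout documented for stock 2.6-era i386 kernels (bits 0-7 preemption depth, bits 8-15 softirq count, bits 16-27 hardirq count); the patched tree under test may lay the fields out differently, and the MODEL_* names and decode_preempt_count() are hypothetical helpers, not kernel symbols.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout (stock 2.6-era i386 hardirq.h); the real-time patch
 * discussed in this thread may differ. */
#define MODEL_PREEMPT_MASK 0x000000ffu
#define MODEL_SOFTIRQ_MASK 0x0000ff00u
#define MODEL_HARDIRQ_MASK 0x0fff0000u

struct preempt_fields {
	unsigned int preempt;	/* preempt_disable() nesting depth */
	unsigned int softirq;	/* softirq nesting count */
	unsigned int hardirq;	/* hardirq nesting count */
};

/* Split a raw preempt count, as printed in the traces, into fields. */
static struct preempt_fields decode_preempt_count(uint32_t pc)
{
	struct preempt_fields f;

	f.preempt = pc & MODEL_PREEMPT_MASK;
	f.softirq = (pc & MODEL_SOFTIRQ_MASK) >> 8;
	f.hardirq = (pc & MODEL_HARDIRQ_MASK) >> 16;
	return f;
}
```

Under that assumption, 00010002 would read as hardirq count 1 with a preemption depth of 2, and the 00000001 seen in the futex traces as a plain preemption depth of 1.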
Second try - the INIT message was incomplete, stopping at
INIT:
plus the Alt-Sysrq keys not working at all.
Third try - added
selinux=0
to the boot parameters. Made it to single-user mode w/o any errors.
Using telinit 3 got me a few BUG messages related to modprobe
and sleeping from an invalid context. From what I read in
other messages, you may have already fixed that so I will be
building -V0.2 shortly.
I also got the atomic counter underflow (qdisc_destroy) message.
Let me know if you need the system log; it appears to have
captured the messages.
--Mark H Johnson
<mailto:[email protected]>
* K.R. Foley <[email protected]> wrote:
> Actually pertaining to V0.2. I just got my UP system booted up on V0.2
> and got this in the log. I did notice that this is not new to this
> release. It has been here at least since U10.3. Sorry I didn't catch
> it sooner.
>
> Oct 25 13:31:56 daffy kernel: IRQ#11 thread RT prio: 43.
> Oct 25 13:31:56 daffy kernel: ip/2432: BUG in enable_irq at kernel/irq/manage.c:112
this is pretty harmless and has been happening in -mm for some time. The
e100 device will work fine afterwards.
Ingo
* [email protected] <[email protected]> wrote:
> BUG: sleeping function called from invalid context hdparm(3606) at kernel/mutex.c
> in_atomic():0 [00000000], irqs_disabled():1
> ... will send stack traceback separately ...
> when setting udma2 mode in hdparm.
i suspect the patch below will fix the hdparm message - but i don't think
it's related to the other problems you have reported.
Ingo
--- linux/drivers/ide/ide-iops.c.orig
+++ linux/drivers/ide/ide-iops.c
@@ -783,13 +783,11 @@ int ide_driveid_update (ide_drive_t *dri
 		printk("%s: CHECK for good STATUS\n", drive->name);
 		return 0;
 	}
-	local_irq_save(flags);
-	SELECT_MASK(drive, 0);
 	id = kmalloc(SECTOR_WORDS*4, GFP_ATOMIC);
-	if (!id) {
-		local_irq_restore(flags);
+	if (!id)
 		return 0;
-	}
+	local_irq_save(flags);
+	SELECT_MASK(drive, 0);
 	ata_input_data(drive, id, SECTOR_WORDS);
 	(void) hwif->INB(IDE_STATUS_REG);	/* clear drive IRQ */
 	local_irq_enable();
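The reordering in the patch above can be illustrated with a small userspace model. This is only a sketch under stated assumptions: it assumes the allocator may sleep in the patched kernel (which would match the reported "irqs_disabled():1" warning), and every name in it (model_irq_save, model_alloc, update_old_order, update_new_order, the 512-byte size) is hypothetical rather than taken from the IDE code.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdlib.h>

/* Hypothetical model state, not kernel symbols. */
static bool model_irqs_off;	/* stands in for irqs_disabled() */
static int model_violations;	/* counts "invalid context" hits */

static void model_irq_save(void)    { model_irqs_off = true; }
static void model_irq_restore(void) { model_irqs_off = false; }

/* Stand-in for an allocator that may sleep: it records a violation
 * whenever it is entered with interrupts disabled. */
static void *model_alloc(size_t n)
{
	if (model_irqs_off)
		model_violations++;
	return malloc(n);
}

/* Old ordering: disable interrupts first, then allocate. The failure
 * path has to remember to restore interrupts. */
static int update_old_order(void)
{
	void *id;

	model_irq_save();
	id = model_alloc(512);
	if (!id) {
		model_irq_restore();
		return 0;
	}
	/* ... device access would happen here ... */
	model_irq_restore();
	free(id);
	return 1;
}

/* Patched ordering: allocate first, then disable interrupts. The
 * failure path returns before any interrupt state was changed. */
static int update_new_order(void)
{
	void *id = model_alloc(512);

	if (!id)
		return 0;
	model_irq_save();
	/* ... device access would happen here ... */
	model_irq_restore();
	free(id);
	return 1;
}
```

In the model, update_old_order() bumps model_violations once per call while update_new_order() leaves it untouched, which is the effect the patch is aiming for. It also shows why the braces and the local_irq_restore() on the error path could be dropped.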
I see the same problem. It causes no problems. I _think_ the enable_irq()
call has to be removed. I mailed the list about it but nobody answered. I am
rather new to Linux kernel programming so I am not sure...
Esben
On Mon, 25 Oct 2004, K.R. Foley wrote:
> Ingo Molnar wrote:
> > i have released the -V0 Real-Time Preemption patch, which can be
> > downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> >
>
> Actually pertaining to V0.2. I just got my UP system booted up on V0.2
> and got this in the log. I did notice that this is not new to this
> release. It has been here at least since U10.3. Sorry I didn't catch it
> sooner.
>
> Oct 25 13:31:56 daffy kernel: IRQ#11 thread RT prio: 43.
> Oct 25 13:31:56 daffy kernel: ip/2432: BUG in enable_irq at kernel/irq/manage.c:112
> Oct 25 13:31:56 daffy kernel: [<c01396ab>] enable_irq+0xfb/0x100 (12)
> Oct 25 13:31:56 daffy kernel: [<d0975614>] e100_up+0x114/0x200 [e100] (48)
> Oct 25 13:31:56 daffy kernel: [<d0976a20>] e100_open+0x30/0x80 [e100] (48)
> Oct 25 13:31:56 daffy kernel: [<c0113154>] mcount+0x14/0x18 (12)
> Oct 25 13:31:56 daffy kernel: [<c0265d98>] dev_open+0x88/0xa0 (20)
> Oct 25 13:31:56 daffy kernel: [<c02677cd>] dev_change_flags+0x5d/0x140 (28)
> Oct 25 13:31:56 daffy kernel: [<c02653ee>] __dev_get_by_name+0xe/0xd0 (8)
> Oct 25 13:31:56 daffy kernel: [<c02af3d7>] devinet_ioctl+0x277/0x6c0 (28)
> Oct 25 13:31:56 daffy kernel: [<c02b1894>] inet_ioctl+0x64/0xb0 (108)
> Oct 25 13:31:56 daffy kernel: [<c025c048>] sock_ioctl+0xc8/0x250 (28)
> Oct 25 13:31:56 daffy kernel: [<c0171cf7>] sys_ioctl+0xf7/0x260 (32)
> Oct 25 13:31:56 daffy kernel: [<c01064ed>] sysenter_past_esp+0x52/0x71 (48)
> Oct 25 13:31:56 daffy kernel: preempt count: 00000002
> Oct 25 13:31:56 daffy kernel: . 2-level deep critical section nesting:
> Oct 25 13:31:56 daffy kernel: .. entry 1: enable_irq+0x33/0x100
> [<c01395e3>] / (e100_up+0x114/0x200 [e100] [<d0975614>])
> Oct 25 13:31:56 daffy kernel: .. entry 2: print_traces+0x1d/0x60
> [<c0132ecd>] / (dump_stack+0x23/0x30 [<c0106b23>])
> Oct 25 13:31:56 daffy kernel:
Ingo Molnar wrote:
> i have released the -V0 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
I'm testing -V0.2. I'm getting reproducible deadlocks very soon after I
start anything talking to the network (firefox, links, licq).
Example with links:
BUG: semaphore recursion deadlock detected!
.. current task links/4724 is already holding c0430840.
[<c01c7546>] __rwsem_deadlock+0x176/0x190 (12)
[<c02cb966>] down_write+0x116/0x250 (20)
[<c01c743c>] __rwsem_deadlock+0x6c/0x190 (28)
[<c02cb839>] down_read+0x39/0x50 (20)
[<c02cb8b2>] down_write+0x62/0x250 (4)
[<c02cb966>] down_write+0x116/0x250 (24)
[<c02cb839>] down_read+0x39/0x50 (48)
[<c0114310>] mcount+0x14/0x18 (12)
[<c0271c91>] dev_queue_xmit_nit+0x41/0x130 (28)
[<c01c7bf6>] up_write+0x26/0x60 (12)
[<c0281443>] qdisc_restart+0x223/0x250 (24)
[<c027221d>] dev_queue_xmit+0x1ad/0x260 (12)
[<c0114310>] mcount+0x14/0x18 (8)
[<c027222f>] dev_queue_xmit+0x1bf/0x260 (36)
[<c027856e>] neigh_resolve_output+0xfe/0x240 (32)
[<c029327e>] ip_finish_output2+0xbe/0x240 (56)
[<c027dc45>] nf_hook_slow+0xd5/0x130 (36)
[<c02931c0>] ip_finish_output2+0x0/0x240 (28)
[<c0290a8b>] ip_finish_output+0x26b/0x270 (32)
[<c02931c0>] ip_finish_output2+0x0/0x240 (24)
[<c02931aa>] dst_output+0x1a/0x30 (32)
[<c027dc45>] nf_hook_slow+0xd5/0x130 (12)
[<c0293190>] dst_output+0x0/0x30 (28)
[<c029114a>] ip_queue_xmit+0x45a/0x570 (32)
[<c0293190>] dst_output+0x0/0x30 (24)
[<c0135c75>] sub_preempt_count+0x65/0xd0 (24)
[<c01c79f8>] __up_write+0x148/0x320 (8)
[<c0135708>] check_preempt_timing+0x58/0x2e0 (8)
[<c0135c75>] sub_preempt_count+0x65/0xd0 (4)
[<c01c79f8>] __up_write+0x148/0x320 (4)
[<c0134f9d>] __mcount+0x1d/0x20 (28)
[<c01c727e>] rwsem_owner_del+0xe/0x120 (4)
[<c0134f9d>] __mcount+0x1d/0x20 (52)
[<c02a875e>] tcp_v4_send_check+0xe/0xf0 (4)
[<c02a2259>] tcp_transmit_skb+0x439/0x880 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02a879f>] tcp_v4_send_check+0x4f/0xf0 (20)
[<c02a2302>] tcp_transmit_skb+0x4e2/0x880 (32)
[<c0114310>] mcount+0x14/0x18 (28)
[<c02a4f96>] tcp_send_ack+0xa6/0xf0 (52)
[<c0297ccc>] tcp_recvmsg+0x2ec/0x750 (36)
[<c0134f9d>] __mcount+0x1d/0x20 (20)
[<c0114310>] mcount+0x14/0x18 (44)
[<c026bb59>] sock_common_recvmsg+0x59/0x70 (20)
[<c0267f78>] sock_aio_read+0xf8/0x110 (48)
[<c0114310>] mcount+0x14/0x18 (100)
[<c015e7fa>] do_sync_read+0xaa/0xe0 (20)
[<c01344a0>] autoremove_wake_function+0x0/0x60 (116)
[<c01c11b8>] dummy_file_permission+0x8/0x10 (12)
[<c015e8d6>] vfs_read+0xa6/0x140 (4)
[<c015e933>] vfs_read+0x103/0x140 (36)
[<c0114310>] mcount+0x14/0x18 (24)
[<c015ebe0>] sys_read+0x50/0x80 (20)
[<c010527b>] syscall_call+0x7/0xb (44)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: down_write+0x249/0x250 [<c02cba99>] / (down_read+0x39/0x50
[<c02cb839>])
.. entry 2: print_traces+0x1d/0x90 [<c0135fcd>] / (dump_stack+0x23/0x30
[<c01060b3>])
BUG: circular semaphore deadlock: ksoftirqd/0/2 is blocked on c0430840,
deadlocking links/4724
f7c2be00 00000046 f7c24020 c03bfd60 00000202 00001db4 f7c2a000 f7c24020
f7c24020 f7c2bdec 00000227 e443e1ce 0000001d f7c24020 f7c242b4
f7c2a000
f7c24020 f7c24020 f7c2be24 c02ca79f f7c2be24 00000086 c03e3b00
c02cb9d0
Call Trace:
[<c02ca79f>] schedule+0x2f/0xe0 (80)
[<c02cb9d0>] down_write+0x180/0x250 (16)
[<c02cb9b1>] down_write+0x161/0x250 (20)
[<c02cb839>] down_read+0x39/0x50 (48)
[<c0114310>] mcount+0x14/0x18 (12)
[<c027db97>] nf_hook_slow+0x27/0x130 (28)
[<c0134f9d>] __mcount+0x1d/0x20 (24)
[<c028d4ee>] ip_rcv+0xe/0x540 (4)
[<c027274d>] netif_receive_skb+0x12d/0x240 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c028d940>] ip_rcv+0x460/0x540 (20)
[<c028dbc0>] ip_rcv_finish+0x0/0x300 (24)
[<c0114310>] mcount+0x14/0x18 (8)
[<c027274d>] netif_receive_skb+0x12d/0x240 (28)
[<c0270008>] gnet_stats_start_copy+0x18/0x40 (20)
[<c0272a7f>] net_rx_action+0x7f/0x1a0 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02728e8>] process_backlog+0x88/0x1a0 (20)
[<c0272a7f>] net_rx_action+0x7f/0x1a0 (40)
[<c01237c7>] ___do_softirq+0x87/0xd0 (36)
[<c0123898>] _do_softirq+0x8/0x30 (8)
[<c0123c84>] ksoftirqd+0xb4/0x100 (4)
[<c01238b0>] _do_softirq+0x20/0x30 (28)
[<c0123c84>] ksoftirqd+0xb4/0x100 (8)
[<c0133eea>] kthread+0xaa/0xb0 (24)
[<c0123bd0>] ksoftirqd+0x0/0x100 (20)
[<c0133e40>] kthread+0x0/0xb0 (12)
[<c0103319>] kernel_thread_helper+0x5/0xc (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __schedule+0x4e/0x5f0 [<c02ca1ce>] / (schedule+0x2f/0xe0
[<c02ca79f>])
.. entry 2: __schedule+0xdd/0x5f0 [<c02ca25d>] / (schedule+0x2f/0xe0
[<c02ca79f>])
Michal
Ingo Molnar wrote:
>i have released the -V0 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>
I'm facing three different BUG messages:
Oct 26 01:32:46 tigre02 kernel: BUG: sleeping function called from
invalid context kpnpbiosd(229) at kernel/mutex.c:28
Oct 26 01:32:46 tigre02 kernel: in_atomic():1 [00000001], irqs_disabled():0
Oct 26 01:32:46 tigre02 kernel: [<c0121c4a>] __might_sleep+0xca/0xe0 (12)
Oct 26 01:32:46 tigre02 kernel: [<c013e268>] _mutex_lock+0x38/0x50 (36)
Oct 26 01:32:46 tigre02 kernel: [<c013e2d6>]
_mutex_lock_irqsave+0x16/0x20 (24)
Oct 26 01:32:46 tigre02 kernel: [<c02e5237>]
pnp_bios_dock_station_info+0x97/0x1f0 (12)
Oct 26 01:32:46 tigre02 kernel: [<c02e4371>] pnp_dock_thread+0x81/0x100
(48)
Oct 26 01:32:46 tigre02 kernel: [<c02e42f0>] pnp_dock_thread+0x0/0x100 (16)
Oct 26 01:32:46 tigre02 kernel: [<c01033f9>]
kernel_thread_helper+0x5/0xc (16)
Oct 26 01:32:46 tigre02 kernel: preempt count: 00000002
Oct 26 01:32:46 tigre02 kernel: . 2-level deep critical section nesting:
Oct 26 01:32:46 tigre02 kernel: .. entry 1:
pnp_bios_dock_station_info+0x4f/0x1f0 [<c02e51ef>] /
(pnp_dock_thread+0x81/0x100 [<c02e4371>])
Oct 26 01:32:46 tigre02 kernel: .. entry 2: print_traces+0x1d/0x60
[<c013fa7d>] / (dump_stack+0x23/0x30 [<c01064d3>])
Oct 26 01:32:46 tigre02 kernel:
Oct 26 01:32:45 tigre02 kernel: BUG: using smp_processor_id() in
preemptible [00000001] code: usb.agent/1479
Oct 26 01:32:45 tigre02 kernel: caller is store_stackinfo+0x5d/0xb0
Oct 26 01:32:45 tigre02 kernel: [<c011f0d7>] smp_processor_id+0xb7/0xc0
(12)
Oct 26 01:32:45 tigre02 kernel: [<c0158fcd>] store_stackinfo+0x5d/0xb0 (8)
Oct 26 01:32:45 tigre02 kernel: [<c0158fcd>] store_stackinfo+0x5d/0xb0 (24)
Oct 26 01:32:45 tigre02 kernel: [<c015ac1a>]
cache_free_debugcheck+0x1aa/0x390 (28)
Oct 26 01:32:45 tigre02 kernel: [<c018446e>] __user_walk+0x5e/0x80 (12)
Oct 26 01:32:45 tigre02 kernel: [<c015bba3>] kmem_cache_free+0x33/0xf0 (4)
Oct 26 01:32:45 tigre02 kernel: [<c0116454>] mcount+0x14/0x18 (8)
Oct 26 01:32:45 tigre02 kernel: [<c015bbb6>] kmem_cache_free+0x46/0xf0 (28)
Oct 26 01:32:45 tigre02 kernel: [<c018446e>] __user_walk+0x5e/0x80 (12)
Oct 26 01:32:45 tigre02 kernel: [<c018446e>] __user_walk+0x5e/0x80 (20)
Oct 26 01:32:45 tigre02 kernel: [<c017e323>] vfs_stat+0x23/0x60 (32)
Oct 26 01:32:45 tigre02 kernel: [<c013e8cd>] __mcount+0x1d/0x30 (56)
Oct 26 01:32:45 tigre02 kernel: [<c017eaae>] sys_stat64+0xe/0x50 (4)
Oct 26 01:32:45 tigre02 kernel: [<c0105541>]
sysenter_past_esp+0x52/0x71 (4)
Oct 26 01:32:45 tigre02 kernel: [<c0116454>] mcount+0x14/0x18 (8)
Oct 26 01:32:45 tigre02 kernel: [<c017eac0>] sys_stat64+0x20/0x50 (20)
Oct 26 01:32:45 tigre02 kernel: [<c0291659>] copy_to_user+0x69/0x80 (24)
Oct 26 01:32:45 tigre02 kernel: [<c01339eb>]
sys_rt_sigprocmask+0x9b/0xf0 (28)
Oct 26 01:32:45 tigre02 kernel: [<c0105541>]
sysenter_past_esp+0x52/0x71 (48)
Oct 26 01:32:45 tigre02 kernel: preempt count: 00000002
Oct 26 01:32:45 tigre02 kernel: . 2-level deep critical section nesting:
Oct 26 01:32:45 tigre02 kernel: .. entry 1: smp_processor_id+0x5f/0xc0
[<c011f07f>] / (store_stackinfo+0x5d/0xb0 [<c0158fcd>])
Oct 26 01:32:46 tigre02 kernel: .. entry 2: print_traces+0x1d/0x60
[<c013fa7d>] / (dump_stack+0x23/0x30 [<c01064d3>])
Oct 26 01:32:46 tigre02 kernel:
Oct 26 01:32:46 tigre02 kernel: BUG: using smp_processor_id() in
preemptible [00000001] code: rc.sysinit/1628
Oct 26 01:32:46 tigre02 kernel: caller is store_stackinfo+0x5d/0xb0
Oct 26 01:32:46 tigre02 kernel: [<c011f0d7>] smp_processor_id+0xb7/0xc0
(12)
Oct 26 01:32:46 tigre02 kernel: [<c0158fcd>] store_stackinfo+0x5d/0xb0 (8)
Oct 26 01:32:46 tigre02 kernel: [<c0158fcd>] store_stackinfo+0x5d/0xb0 (24)
Oct 26 01:32:46 tigre02 kernel: [<c015ac1a>]
cache_free_debugcheck+0x1aa/0x390 (28)
Oct 26 01:32:46 tigre02 kernel: [<c0121d4c>] free_task+0x1c/0x40 (12)
Oct 26 01:32:46 tigre02 kernel: [<c015bd2e>] kfree+0x5e/0x120 (4)
Oct 26 01:32:46 tigre02 kernel: [<c0116454>] mcount+0x14/0x18 (8)
Oct 26 01:32:46 tigre02 kernel: [<c015bd41>] kfree+0x71/0x120 (28)
Oct 26 01:32:46 tigre02 kernel: [<c0121d4c>] free_task+0x1c/0x40 (12)
Oct 26 01:32:46 tigre02 kernel: [<c0121d4c>] free_task+0x1c/0x40 (28)
Oct 26 01:32:46 tigre02 kernel: [<c0126f9c>] release_task+0xec/0x1a0 (20)
Oct 26 01:32:46 tigre02 kernel: [<c0129617>]
wait_task_zombie+0x4b7/0x5d0 (40)
Oct 26 01:32:46 tigre02 kernel: [<c013e8cd>] __mcount+0x1d/0x30 (28)
Oct 26 01:32:46 tigre02 kernel: [<c0271ed8>] dummy_task_wait+0x8/0x10 (4)
Oct 26 01:32:46 tigre02 kernel: [<c0128ed1>] eligible_child+0x71/0xf0 (4)
Oct 26 01:32:46 tigre02 kernel: [<c0116454>] mcount+0x14/0x18 (8)
Oct 26 01:32:46 tigre02 kernel: [<c0271ed8>] dummy_task_wait+0x8/0x10 (20)
Oct 26 01:32:46 tigre02 kernel: [<c012a1ce>] do_wait+0x4ae/0x590 (24)
Oct 26 01:32:46 tigre02 kernel: [<c0291501>]
__copy_to_user_ll+0x11/0x80 (40)
Oct 26 01:32:46 tigre02 kernel: [<c011f1b0>]
default_wake_function+0x0/0x20 (28)
Oct 26 01:32:46 tigre02 kernel: [<c011f1b0>]
default_wake_function+0x0/0x20 (32)
Oct 26 01:32:46 tigre02 kernel: [<c012a3bc>] sys_waitpid+0x2c/0x30 (12)
Oct 26 01:32:46 tigre02 kernel: [<c0116454>] mcount+0x14/0x18 (8)
Oct 26 01:32:46 tigre02 kernel: [<c012a388>] sys_wait4+0x48/0x50 (20)
Oct 26 01:32:46 tigre02 kernel: [<c012a3bc>] sys_waitpid+0x2c/0x30 (28)
Oct 26 01:32:46 tigre02 kernel: [<c0105541>]
sysenter_past_esp+0x52/0x71 (24)
Oct 26 01:32:46 tigre02 kernel: preempt count: 00000002
Oct 26 01:32:46 tigre02 kernel: . 2-level deep critical section nesting:
Oct 26 01:32:47 tigre02 kernel: .. entry 1: smp_processor_id+0x5f/0xc0
[<c011f07f>] / (store_stackinfo+0x5d/0xb0 [<c0158fcd>])
Oct 26 01:32:47 tigre02 kernel: .. entry 2: print_traces+0x1d/0x60
[<c013fa7d>] / (dump_stack+0x23/0x30 [<c01064d3>])
Oct 26 01:32:47 tigre02 kernel:
The kernel is stable (X started).
Remi
Lee Revell wrote:
> On Mon, 2004-10-25 at 20:40 +0100, Rui Nuno Capela wrote:
>
>>OTOH, jackd -R xruns are awfully back, even though I (re)prioritize the
>>relevant IRQ thread handlers to be always higher than jackd's.
>
>
> Actually they should be lower, except the soundcard. I was only able to
> get the xrun free behavior of T3 by setting all IRQ threads except the
> soundcard to SCHED_OTHER. Especially important was setting ksoftirqd to
> SCHED_OTHER, this actually may have been the only one necessary.
>
> The relative priorities of jackd and the soundcard irq do not matter as
> these two should never contend (aka they are never both runnable at the
> same time).
>
> Lee
>
>
Not being familiar with jack, does it use rtc?
kr
On Mon, 2004-10-25 at 20:40 +0100, Rui Nuno Capela wrote:
> OTOH, jackd -R xruns are awfully back, even though I (re)prioritize the
> relevant IRQ thread handlers to be always higher than jackd's.
Actually they should be lower, except the soundcard. I was only able to
get the xrun free behavior of T3 by setting all IRQ threads except the
soundcard to SCHED_OTHER. Especially important was setting ksoftirqd to
SCHED_OTHER, this actually may have been the only one necessary.
The relative priorities of jackd and the soundcard irq do not matter as
these two should never contend (aka they are never both runnable at the
same time).
Lee
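As an illustration of the advice above, here is a minimal sketch of moving one task to SCHED_OTHER with the standard sched_setscheduler() call. The helper names are hypothetical; finding the PIDs of the IRQ threads or ksoftirqd (for example with ps) is not shown, and changing another task's policy normally requires root.

```c
#include <sched.h>
#include <sys/types.h>

/*
 * Move one task to SCHED_OTHER. The priority must be 0 for SCHED_OTHER.
 * A pid of 0 means the calling thread.
 * Returns 0 on success, -1 on error with errno set.
 */
static int demote_to_sched_other(pid_t pid)
{
    struct sched_param param = { .sched_priority = 0 };
    return sched_setscheduler(pid, SCHED_OTHER, &param);
}

/* Returns 1 if pid runs under SCHED_OTHER, 0 if not, -1 on error. */
static int is_sched_other(pid_t pid)
{
    int policy = sched_getscheduler(pid);
    if (policy < 0)
        return -1;
    return policy == SCHED_OTHER;
}
```

A caller would invoke demote_to_sched_other() once per IRQ thread PID and leave the soundcard's thread untouched.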
Lee Revell wrote:
> On Mon, 2004-10-25 at 22:11 -0500, K.R. Foley wrote:
>
>>
>>Not being familiar with jack, does it use rtc?
>>
>
>
> No it normally uses the soundcard for timing. For testing there is a
> dummy backend that just usleep()s. This makes a pretty useful latency
> tester.
>
> Lee
>
>
Just wondered. I am writing an email right now about my results with
V0.2, one of which happens to be that amlat will kill any of my systems
running it.
kr
On Mon, 2004-10-25 at 22:11 -0500, K.R. Foley wrote:
>
> Not being familiar with jack, does it use rtc?
>
No it normally uses the soundcard for timing. For testing there is a
dummy backend that just usleep()s. This makes a pretty useful latency
tester.
Lee
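The idea of a backend that just sleeps doubling as a latency tester can be sketched roughly as follows. This is not jackd's dummy backend: it is a hypothetical stand-alone helper, and it uses nanosleep() with CLOCK_MONOTONIC in place of usleep() so the overshoot can be measured in nanoseconds.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdint.h>
#include <time.h>

/* Current time on the monotonic clock, in nanoseconds. */
static int64_t now_ns(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/*
 * Sleep for period_ns, 'rounds' times, and return the worst overshoot
 * (time actually slept minus time requested) in nanoseconds.
 */
static int64_t worst_wakeup_overshoot_ns(long period_ns, int rounds)
{
    struct timespec req = { period_ns / 1000000000L, period_ns % 1000000000L };
    int64_t worst = 0;

    for (int i = 0; i < rounds; i++) {
        int64_t start = now_ns();
        nanosleep(&req, NULL);
        int64_t late = now_ns() - start - period_ns;
        if (late > worst)
            worst = late;   /* remember the latest wakeup seen so far */
    }
    return worst;
}
```

Run under a real-time policy and under load, the worst overshoot gives a rough picture of scheduling latency.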
Ingo Molnar wrote:
>> ok, i've added it and uploaded -V0.2 together with another fix: there
>> was a scheduler recursion possible via the delayed-put mechanism using
>> workqueues - now it's using its own separate lists and per-CPU
>> threads.
>
> -V0.2 seems to behave quite well on my testboxes - i'm unable to
> reproduce the selinux boot hang anymore.
>
OK. RT-V0.2 boots on my laptop (P4/UP), sometimes ;)
I know that my early impressions are elusive and rather subjective, but I do
feel that overall behavior is getting worse for low-latency audio work with
jackd -R.
To put things straight: with RT-V0.2 I get into trouble under much less load
than before.
I noticed that something is, now and then, pushing the CPU to 99%, slowing
the system to a crawl before it eventually returns to normal. I can't figure
out who or what, because ps and top stall as well and only return results
after the crawl ends, which is no useful evidence. When I'm lucky enough to
get top (and gkrellm) to tell me something, it looks like most of the time
is spent in kernel mode (sys time) and none of the running processes is
responsible. Puzzled. It's enough to make you lose confidence in the procps
tools.
OTOH, jackd -R xruns are awfully back, even though I (re)prioritize the
relevant IRQ thread handlers to be always higher than jackd's. This just
doesn't seem like an improvement, not at all :( IMO, given the xrun rate
I'm experiencing with RT-V0.2, it feels as if I'm running vanilla 2.6.9,
with a good deal of instability added to the picture.
The jackd -R issue, which has occasionally been hosing the complete system,
is still an annoyance on RT-V0.2. On this same laptop (P4/UP), it happens
only if PREEMPT_REALTIME is set. However, I think I've narrowed down how to
reproduce it: loading more than two fluidsynth instances was the easiest way
to get the box frozen in less than one minute, at least on RT-U10.3. With
RT-V0.2 it is even easier, with just two fluidsynth instances, or even one.
Sorry for this kind of rant, but I had to de-stress myself, somehow ;)
Nevertheless, I'll keep on going with my user level trials... and keep you
informed, of course.
Cheers,
--
rncbc aka Rui Nuno Capela
[email protected]
On Mon, 2004-10-25 at 20:01, Lee Revell wrote:
> On Mon, 2004-10-25 at 20:40 +0100, Rui Nuno Capela wrote:
> > OTOH, jackd -R xruns are awfully back, even though I (re)prioritize the
> > relevant IRQ thread handlers to be always higher than jackd's.
>
> Actually they should be lower, except the soundcard. I was only able to
> get the xrun free behavior of T3 by setting all IRQ threads except the
> soundcard to SCHED_OTHER. Especially important was setting ksoftirqd to
> SCHED_OTHER, this actually may have been the only one necessary.
>
> The relative priorities of jackd and the soundcard irq do not matter as
> these two should never contend (aka they are never both runnable at the
> same time).
What happens when one is blessed with a laptop where everything is
sharing an interrupt?
$ cat /proc/interrupts
           CPU0
  0:    2372239          XT-PIC  timer  0/72239
  1:       5362          XT-PIC  i8042  0/5362
  2:          0          XT-PIC  cascade  0/0
  8:          1          XT-PIC  rtc  0/1
  9:     616176          XT-PIC  acpi, uhci_hcd, uhci_hcd, uhci_hcd, eth0, yenta, yenta, Intel 82801CA-ICH3, radeon@PCI:1:0:0  0/16176
 11:         37          XT-PIC  sonypi  0/35
 12:      28392          XT-PIC  i8042  0/28392
 14:      21078          XT-PIC  ide0  0/21078
 15:        472          XT-PIC  ide1  0/472
NMI:          0
LOC:          0
ERR:          0
MIS:          0
I'm running U10.3 and I'm consistently seeing xruns when Jack clients
start and stop, something I would not see before (I have not tried the
latest V series yet). I have tried changing the priority of IRQ9 and the
scheduler but I still see the xruns. Yesterday I tried enabling
preempt_thresh to a low value but did not see hits when the xruns
occurred. Maybe I'm missing something I need to do...
-- Fernando
On Monday 25 October 2004 22:40, Rui Nuno Capela wrote:
> Ingo Molnar wrote:
> >> ok, i've added it and uploaded -V0.2 together with another fix: there
> >> was a scheduler recursion possible via the delayed-put mechanism using
> >> workqueues - now it's using its own separate lists and per-CPU
> >> threads.
> >
> > -V0.2 seems to behave quite well on my testboxes - i'm unable to
> > reproduce the selinux boot hang anymore.
> >
>
> OK. RT-V0.2 boots on my laptop (P4/UP), sometimes ;)
>
> I know that my early impressions are elusive and rather subjective, but I do
> feel that overall behavior is getting worse for low-latency audio work with
> jackd -R.
>
> To put things straight: with RT-V0.2 I get into trouble under much less load
> than before.
>
> I noticed that something is, now and then, pushing the CPU to 99%, slowing
> the system to a crawl before it eventually returns to normal. I can't figure
> out who or what, because ps and top stall as well and only return results
> after the crawl ends, which is no useful evidence. When I'm lucky enough to
> get top (and gkrellm) to tell me something, it looks like most of the time
> is spent in kernel mode (sys time) and none of the running processes is
> responsible. Puzzled. It's enough to make you lose confidence in the procps
> tools.
<shameless plug>
Maybe this program will be useful. It is designed to give you
overall system statistics without the need to scan the entire /proc/NNN
forest. Together with nice -20, it will hopefully not stall.
Compiled with dietlibc. If you have trouble compiling it, a binary is
attached too.
The latest version is 0.9, but it seems I left it on my home box :(
</shameless plug>
--
vda
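To illustrate how system-wide numbers can be had without walking every /proc/NNN directory, here is a rough sketch that reads only the aggregate "cpu" line of /proc/stat and derives the share of time spent in kernel mode between two samples. It is not taken from nmeter; the names are hypothetical, and only the first four fields (user, nice, system, idle) are used, so any further columns on that line are ignored.

```c
#include <stdio.h>

/* Cumulative tick counters from the aggregate "cpu" line of /proc/stat. */
struct cpu_sample {
    unsigned long long user, nice, system, idle;
};

/* Parse a line such as "cpu  4705 150 1120 16250"; returns 0 on success. */
static int parse_cpu_line(const char *line, struct cpu_sample *out)
{
    int n = sscanf(line, "cpu %llu %llu %llu %llu",
                   &out->user, &out->nice, &out->system, &out->idle);
    return n == 4 ? 0 : -1;
}

/* Percentage of elapsed ticks spent in kernel mode between samples a and b. */
static double sys_percent(const struct cpu_sample *a, const struct cpu_sample *b)
{
    unsigned long long total = (b->user - a->user) + (b->nice - a->nice) +
                               (b->system - a->system) + (b->idle - a->idle);
    if (total == 0)
        return 0.0;
    return 100.0 * (double)(b->system - a->system) / (double)total;
}

/* Read the first line of /proc/stat (the aggregate one); 0 on success. */
static int read_cpu_sample(struct cpu_sample *out)
{
    char line[256];
    FILE *f = fopen("/proc/stat", "r");
    if (!f)
        return -1;
    int rc = fgets(line, sizeof line, f) ? parse_cpu_line(line, out) : -1;
    fclose(f);
    return rc;
}
```

Taking one sample, sleeping for a second, taking another and passing both to sys_percent() costs a single small file read per sample, which is why a tool built this way has a chance of staying responsive when ps and top stall.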
K.R. Foley wrote:
> Florian Schmidt wrote:
>
>> doesn't seem so. V0.2 doesn't fix this for me. This time i got a BUG
>> storm
>> again in syslog (it kinda seems related to starting playback in xmms plus
>> loading pages in mozilla. will boot again to verify):
>>
>
> Well I have now gotten a couple of these too (with V0.2). They all
> seem to be generated by firefox or thunderbird and the traces are all
> identical except for the offending process.
>
>
> Oct 25 11:22:11 swdev14 kernel:
> Oct 25 11:22:20 swdev14 kernel: thunderbird-bin/3946: BUG in futex_wait
> at kernel/futex.c:542
> Oct 25 11:22:20 swdev14 kernel: [<c0136389>] futex_wait+0x192/0x19c (12)
> Oct 25 11:22:20 swdev14 kernel: [<c0135646>]
> sub_preempt_count+0x75/0xd8 (72)
> Oct 25 11:22:20 swdev14 kernel: [<c02aa9f2>] _spin_unlock+0x1a/0x34 (4)
> Oct 25 11:22:20 swdev14 kernel: [<c02aa9f2>] _spin_unlock+0x1a/0x34 (84)
> Oct 25 11:22:20 swdev14 kernel: [<c01120ac>] mcount+0x14/0x18 (4)
> Oct 25 11:22:20 swdev14 kernel: [<c02aa9f2>] _spin_unlock+0x1a/0x34 (20)
> Oct 25 11:22:20 swdev14 kernel: [<c0118e15>]
> default_wake_function+0x0/0x1c (60)
> Oct 25 11:22:20 swdev14 kernel: [<c0118e15>]
> default_wake_function+0x0/0x1c (32)
> Oct 25 11:22:20 swdev14 kernel: [<c0136777>] sys_futex+0xf0/0xfc (12)
> Oct 25 11:22:20 swdev14 kernel: [<c01120ac>] mcount+0x14/0x18 (8)
> Oct 25 11:22:20 swdev14 kernel: [<c0136637>] do_futex+0x47/0x97 (20)
> Oct 25 11:22:20 swdev14 kernel: [<c0136777>] sys_futex+0xf0/0xfc (40)
> Oct 25 11:22:20 swdev14 kernel: [<c010623d>]
> sysenter_past_esp+0x52/0x71 (68)
> Oct 25 11:22:20 swdev14 kernel: preempt count: 00000001
> Oct 25 11:22:20 swdev14 kernel: . 1-level deep critical section nesting:
> Oct 25 11:22:20 swdev14 kernel: .. entry 1: print_traces+0x1d/0x59
> [<c0135a28>] / (dump_stack+0x23/0x27 [<c01070db>])
> Oct 25 11:22:20 swdev14 kernel:
I see a lot of similar traces too on V0.2 (also either from firefox or
thunderbird):
Oct 26 10:20:57 eran kernel: thunderbird-bin/4285: BUG in futex_wait at
kernel/futex.c:542
Oct 26 10:20:57 eran kernel: [<c01338d9>] futex_wait+0x1b9/0x1c0 (8)
Oct 26 10:20:57 eran kernel: [<c0132b84>]
check_preempt_timing+0x64/0x190 (80)
Oct 26 10:20:57 eran kernel: [<c0132b84>]
check_preempt_timing+0x64/0x190 (4)
Oct 26 10:20:57 eran kernel: [<c01191f7>] recalc_task_prio+0xa7/0x1a0 (12)
Oct 26 10:20:57 eran kernel: [<c0119785>] finish_task_switch+0x35/0xb0 (8)
Oct 26 10:20:57 eran kernel: [<c011941a>] try_to_wake_up+0x8a/0xc0 (8)
Oct 26 10:20:57 eran kernel: [<c01191f7>] recalc_task_prio+0xa7/0x1a0 (12)
Oct 26 10:20:57 eran kernel: [<c01191f7>] recalc_task_prio+0xa7/0x1a0 (12)
Oct 26 10:20:57 eran kernel: [<c01191f7>] recalc_task_prio+0xa7/0x1a0 (8)
Oct 26 10:20:57 eran kernel: [<c0119785>] finish_task_switch+0x35/0xb0 (12)
Oct 26 10:20:57 eran kernel: [<c01191f7>] recalc_task_prio+0xa7/0x1a0 (20)
Oct 26 10:20:57 eran kernel: [<c0119785>] finish_task_switch+0x35/0xb0 (12)
Oct 26 10:20:57 eran kernel: [<c0119e20>]
default_wake_function+0x0/0x10 (64)
Oct 26 10:20:57 eran kernel: [<c0119e20>]
default_wake_function+0x0/0x10 (32)
Oct 26 10:20:57 eran kernel: [<c0108609>] do_IRQ+0x39/0x60 (20)
Oct 26 10:20:57 eran kernel: [<c0133b55>] do_futex+0x35/0x90 (20)
Oct 26 10:20:57 eran kernel: [<c022cf3c>] copy_from_user+0x5c/0x90 (8)
Oct 26 10:20:57 eran kernel: [<c0133c9a>] sys_futex+0xea/0x100 (16)
Oct 26 10:20:57 eran kernel: [<c01060f9>] sysenter_past_esp+0x52/0x71 (56)
Oct 26 10:20:57 eran kernel: preempt count: 00000001
Oct 26 10:20:57 eran kernel: . 1-level deep critical section nesting:
Oct 26 10:20:57 eran kernel: .. entry 1: print_traces+0xd/0x40
[<c0132f8d>] / (0x0 [<00000000>])
I also get these errors from 'tail -f /var/log/messages':
tail: cannot read realtime clock: Unknown error 516
(it seems to happen at the same time as the above traces, though less
often).
--
Eran Mann
MRV International
Tel: 972-4-9936297
Fax: 972-4-9890430
http://www.mrv.com
Denis Vlasenko wrote:
>
> <shameless plug>
> Maybe this program will be useful. It is designed to give you
> overall system statistics without the need to scan entire /proc/NNN
> forest. Together with nice -20, it will hopefully not stall.
>
> Compiled with dietlibc. If you will have trouble compiling it,
> binary is attached too.
>
> Latest version is 0.9 but it seems I forgot it in my home box :(
</shameless plug>
Thanks for nmeter. I have changed a couple of little bits to build with
gcc-3.4 here (see diff attached).
Indeed, it says 0.7 as its version string. What's up on 0.9?
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Actually pertaining to V0.2. I just got my UP system booted up on V0.2
>>and got this in the log. I did notice that this is not new to this
>>release. It has been here at least since U10.3. Sorry I didn't catch
>>it sooner.
>>
>>Oct 25 13:31:56 daffy kernel: IRQ#11 thread RT prio: 43.
>>Oct 25 13:31:56 daffy kernel: ip/2432: BUG in enable_irq at
>>kernel/irq/manage.c:112
>
>
> this is pretty harmless and has been happening in -mm for some time. The
> e100 device will work fine afterwards.
>
> Ingo
>
Several things in regard to V0.2:
1) Interactive responsiveness seems to be noticeably sluggish at times on
all three of the systems I have tested this on.
2) My 450MHz UP system is definitely the worst by far. Scrolling through
the syslog in a telnet session produces pauses every few seconds for
about a second, that is while it's still responding. These problems seem
to be network related, but there are no indications of what the problem
is. This system also at times will just stop responding to network requests.
3) Both of the SMP systems are lacking the snappy responsiveness in X
that I have become accustomed to with previous patches, but the 2.6GHz
Xeon (w/HT) is worse than the 933MHz Xeon. Again no indications of
problems in the logs.
4) Using amlat to run the RTC at 1kHz will kill any of these systems
very quickly.
kr
* K.R. Foley <[email protected]> wrote:
> 1) Interactive responsiveness seems to be noticeably sluggish at times on
> all three of the systems I have tested this on.
yeah, something's seriously buggered in V0.2 - don't bother testing its
latencies, the bug hides all the benefits.
Ingo
On Mon, 2004-10-25 at 22:11 -0700, Fernando Pablo Lopez-Lezcano wrote:
> On Mon, 2004-10-25 at 20:01, Lee Revell wrote:
> > On Mon, 2004-10-25 at 20:40 +0100, Rui Nuno Capela wrote:
> > > OTOH, jackd -R xruns are awfully back, even though I (re)prioritize the
> > > relevant IRQ thread handlers to be always higher than jackd's.
> >
> > Actually they should be lower, except the soundcard. I was only able to
> > get the xrun free behavior of T3 by setting all IRQ threads except the
> > soundcard to SCHED_OTHER. Especially important was setting ksoftirqd to
> > SCHED_OTHER, this actually may have been the only one necessary.
> >
> > The relative priorities of jackd and the soundcard irq do not matter as
> > these two should never contend (aka they are never both runnable at the
> > same time).
>
> What happens when one is blessed with a laptop where everything is
> sharing an interrupt?
>
> $ cat /proc/interrupts
>            CPU0
>   0:    2372239          XT-PIC  timer  0/72239
>   1:       5362          XT-PIC  i8042  0/5362
>   2:          0          XT-PIC  cascade  0/0
>   8:          1          XT-PIC  rtc  0/1
>   9:     616176          XT-PIC  acpi, uhci_hcd, uhci_hcd, uhci_hcd, eth0, yenta, yenta, Intel 82801CA-ICH3, radeon@PCI:1:0:0  0/16176
>  11:         37          XT-PIC  sonypi  0/35
>  12:      28392          XT-PIC  i8042  0/28392
>  14:      21078          XT-PIC  ide0  0/21078
>  15:        472          XT-PIC  ide1  0/472
Ugh, why would _anyone_ design a laptop like that? You have 4
perfectly good interrupts that you are not using at all. Is it really
cheaper to put everything on the same irq? Does this work better under
Windows or something?
AFAIK there is nothing you can do - any other irq that fires on 9 will
mask out all the others until it completes.
I am increasingly convinced that the vast majority of laptops are
horribly broken and completely unsuitable for low latency audio work.
Lee
On Tue, 2004-10-26 at 10:25, Lee Revell wrote:
> On Mon, 2004-10-25 at 22:11 -0700, Fernando Pablo Lopez-Lezcano wrote:
> > On Mon, 2004-10-25 at 20:01, Lee Revell wrote:
> > > On Mon, 2004-10-25 at 20:40 +0100, Rui Nuno Capela wrote:
> > > > OTOH, jackd -R xruns are awfully back, even though I (re)prioritize the
> > > > relevant IRQ thread handlers to be always higher than jackd's.
> > >
> > > Actually they should be lower, except the soundcard. I was only able to
> > > get the xrun free behavior of T3 by setting all IRQ threads except the
> > > soundcard to SCHED_OTHER. Especially important was setting ksoftirqd to
> > > SCHED_OTHER, this actually may have been the only one necessary.
> > >
> > > The relative priorities of jackd and the soundcard irq do not matter as
> > > these two should never contend (aka they are never both runnable at the
> > > same time).
> >
> > What happens when one is blessed with a laptop where everything is
> > sharing an interrupt?
> >
> > $ cat /proc/interrupts
> >            CPU0
> >   0:    2372239          XT-PIC  timer  0/72239
> >   1:       5362          XT-PIC  i8042  0/5362
> >   2:          0          XT-PIC  cascade  0/0
> >   8:          1          XT-PIC  rtc  0/1
> >   9:     616176          XT-PIC  acpi, uhci_hcd, uhci_hcd, uhci_hcd, eth0, yenta, yenta, Intel 82801CA-ICH3, radeon@PCI:1:0:0  0/16176
> >  11:         37          XT-PIC  sonypi  0/35
> >  12:      28392          XT-PIC  i8042  0/28392
> >  14:      21078          XT-PIC  ide0  0/21078
> >  15:        472          XT-PIC  ide1  0/472
>
> Ugh, why would _anyone_ design a laptop like that? You have 4
> perfectly good interrupts that you are not using at all. Is it really
> cheaper to put everything on the same irq? Does this work better under
> Windows or something?
>
> AFAIK there is nothing you can do - any other irq that fires on 9 will
> mask out all the others until it completes.
Yes, except I did not see all these xruns running 2.4.26 + lowlat +
preempt (same machine). Things got better with 2.6.x up to, perhaps, S7,
although I would have to retest to make sure. Now they seem to be worse
than before.
-- Fernando
On Tue, 2004-10-26 at 10:45 -0700, Fernando Pablo Lopez-Lezcano wrote:
> > AFAIK there is nothing you can do - any other irq that fires on 9 will
> > mask out all the others until it completes.
>
> Yes, except I did not see all these xruns running 2.4.26 + lowlat +
> preempt (same machine). Things got better with 2.6.x up to, perhaps, S7,
> although I would have to retest to make sure. Now they seem to be worse
> than before.
Hmm, interesting. Anyway T3 is the last version that was stable for me,
this is the xrun-free standard that I compare the later ones to.
Lee
On Tuesday 26 October 2004 13:40, Rui Nuno Capela wrote:
> Denis Vlasenko wrote:
> >
> > <shameless plug>
> > Maybe this program will be useful. It is designed to give you
> > overall system statistics without the need to scan entire /proc/NNN
> > forest. Together with nice -20, it will hopefully not stall.
> >
> > Compiled with dietlibc. If you will have trouble compiling it,
> > binary is attached too.
> >
> > Latest version is 0.9 but it seems I forgot it in my home box :(
> </shameless plug>
>
> Thanks for nmeter. I have changed a couple of little bits to build with
> gcc-3.4 here (see diff attached).
Hmm will it compile on 3.4 with "static inline"?
> Indeed, it says 0.7 as its version string. What's up on 0.9?
Here it is.
--
vda
i have released the -V0.3 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes-only release, but still experimental.
this release should fix a number of bugs that were reported for the V0
series: the futex.c assert, the lockups and the 'slowdown problem'.
The slowdown problem was an architectural issue that surfaced sometime
around U10 and increased in prominence as the number of mutexes
increased and the number of spinlocks decreased. The futex.c assert was
related to this architectural issue as well, and most of the lockups
reported were i believe livelocks caused by the same issue. Also, the
scheduler path had an easy-to-trigger deadlock that often just silently
locked up.
some of the networking lockups might be related to this issue too, but i
think PREEMPT_REALTIME still has separate lock ordering issues within the
networking code. Please re-report any deadlock-tracer asserts that you
might encounter.
Changes since -V0.2:
- HEAP_SIZE fix from Karsten Wiese
- fix hdparm-triggered debugging message reported by Mark H Johnson
- fixed mutex related preemption to not impact the task state, just
like a normal spinlock does. This necessitated the introduction of
TASK_RUNNING_MUTEX handling and related kernel infrastructure. This
framework avoids spurious wakeups done by mutex handling by isolating
the state changes done by normal wakeups vs. the state changes caused
by the mutex code.
- added per-CPU deschedule threads. This fixes a deadlock scenario and
it is also much faster than keventd.
- fix debugging message upon console unblanking
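The TASK_RUNNING_MUTEX item above can be pictured with a toy user-space model. This is an illustration of the idea only, under the assumption that the goal is to keep the sleep state a task chose for itself apart from the blocking and waking done on its behalf by the mutex code. Every name below is hypothetical and none of it is taken from the patch.

```c
#include <stdbool.h>

enum toy_state {
    TOY_RUNNING,
    TOY_INTERRUPTIBLE,
    TOY_UNINTERRUPTIBLE,
    TOY_RUNNING_MUTEX     /* woken by the mutex code, not by a normal wakeup */
};

struct toy_task {
    enum toy_state state;   /* what the scheduler looks at */
    enum toy_state saved;   /* state the task itself had asked for */
    bool in_mutex_wait;     /* true while blocked on a contended mutex */
};

/* Task blocks on a contended mutex: remember the state it had set itself. */
static void toy_mutex_block(struct toy_task *t)
{
    t->saved = t->state;
    t->in_mutex_wait = true;
    t->state = TOY_UNINTERRUPTIBLE;
}

/* Lock owner releases the mutex and wakes the waiter. */
static void toy_mutex_wake(struct toy_task *t)
{
    t->state = TOY_RUNNING_MUTEX;
}

/* A normal wakeup; if it arrives during a mutex wait it must not be lost. */
static void toy_normal_wake(struct toy_task *t)
{
    if (t->in_mutex_wait)
        t->saved = TOY_RUNNING;
    else
        t->state = TOY_RUNNING;
}

/* Waiter got the lock: restore whatever state it is supposed to be in. */
static void toy_mutex_acquired(struct toy_task *t)
{
    t->state = t->saved;
    t->in_mutex_wait = false;
}
```

In this model a task that set itself to an interruptible sleep and then had to wait for a lock comes out of the lock still marked as sleeping, so the mutex wakeup does not look like the event it was waiting for, while a real wakeup that arrives in the meantime is still honoured.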
to create a -V0.3 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-V0.3
Ingo
* K.R. Foley <[email protected]> wrote:
> Several things in regard to V0.2:
>
> 1) Interactive responsiveness seems to be noticeably sluggish at times on
> all three of the systems I have tested this on.
> 2) My 450MHz UP system is definitely the worst by far. Scrolling through
> the syslog in a telnet session produces pauses every few seconds for
> about a second, that is while it's still responding. These problems seem
> to be network related, but there are no indications of what the problem
> is. This system also at times will just stop responding to network requests.
> 3) Both of the SMP systems are lacking the snappy responsiveness in X
> that I have become accustomed to with previous patches, but the 2.6GHz
> Xeon (w/HT) is worse than the 933MHz Xeon. Again no indications of
> problems in the logs.
> 4) Using amlat to run the RTC at 1kHz will kill any of these systems
> very quickly.
could you try this with -V0.3 too? I believe most of these problems
should be solved.
Ingo
> i have released the -V0.3 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes-only release, but still experimental.
i've uploaded -V0.3.1, it fixes a trivial procfs oversight (related to
the new TASK_RUNNING_MUTEX state) that just triggered a crash in one of
my stresstests.
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Several things in regard to V0.2:
>>
>>1) Interactive responsiveness seems to be noticeably sluggish at times on
>>all three of the systems I have tested this on.
>>2) My 450MHz UP system is definitely the worst by far. Scrolling through
>>the syslog in a telnet session produces pauses every few seconds for
>>about a second, that is while it's still responding. These problems seem
>>to be network related, but there are no indications of what the problem
>>is. This system also at times will just stop responding to network requests.
>>3) Both of the SMP systems are lacking the snappy responsiveness in X
>>that I have become accustomed to with previous patches, but the 2.6GHz
>>Xeon (w/HT) is worse than the 933MHz Xeon. Again no indications of
>>problems in the logs.
>>4) Using amlat to run the RTC at 1kHz will kill any of these systems
>>very quickly.
>
>
> could you try this with -V0.3 too? I believe most of these problems
> should be solved.
>
> Ingo
>
Sure will. It's building on two of the systems now (V0.3.1).
kr
On Wed, Oct 27, 2004 at 02:15:42AM +0200, Ingo Molnar wrote:
>
> i have released the -V0.3 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-V0.3
Bad link.
bill
Bill Huey (hui) wrote:
> On Wed, Oct 27, 2004 at 02:15:42AM +0200, Ingo Molnar wrote:
>
>>i have released the -V0.3 Real-Time Preemption patch, which can be
>>downloaded from:
>>
>> http://redhat.com/~mingo/realtime-preempt/
>
>
>> + http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-V0.3
>
>
> Bad link.
>
> bill
>
>
That's because he uploaded V0.3.1.
kr
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Several things in regard to V0.2:
>>
>>1) Interactive responsiveness seems to be noticeably sluggish at times on
>>all three of the systems I have tested this on.
>>2) My 450MHz UP system is definitely the worst by far. Scrolling through
>>the syslog in a telnet session produces pauses every few seconds for
>>about a second, that is while it's still responding. These problems seem
>>to be network related, but there are no indications of what the problem
>>is. This system also at times will just stop responding to network requests.
>>3) Both of the SMP systems are lacking the snappy responsiveness in X
>>that I have become accustomed to with previous patches, but the 2.6GHz
>>Xeon (w/HT) is worse than the 933MHz Xeon. Again no indications of
>>problems in the logs.
>>4) Using amlat to run the RTC at 1kHz will kill any of these systems
>>very quickly.
>
>
> could you try this with -V0.3 too? I believe most of these problems
> should be solved.
>
> Ingo
>
I've repeated the above on the dual 933 Xeon:
Still problems with interactive behavior. Running KDE, with top running
in xterm, scrolling through the menus I get some pauses. When the pauses
occur I see kdeinit hit the top of the list and sometimes consuming 90%
or more of a CPU and idle usage drops to 30-40%. I do see some latency
traces (not really high ones) in the log that were generated by kdeinit
but I think they were generated prior to when these pauses occurred,
most likely when logging in.
Running amlat still hard locks the system. The last time this happened I
got this in the log:
Oct 26 21:43:56 porky kernel: BUG: sleeping function called from invalid context amlat(3963) at kernel/mutex.c:28
Oct 26 21:43:56 porky kernel: in_atomic():1 [00000001], irqs_disabled():1
Oct 26 21:43:56 porky kernel: [<c011c7da>] __might_sleep+0xca/0xe0 (12)
Oct 26 21:43:56 porky kernel: [<c0137d89>] _mutex_lock+0x39/0x50 (36)
Oct 26 21:43:56 porky kernel: [<c0137df6>] _mutex_lock_irqsave+0x16/0x20 (24)
Oct 26 21:43:56 porky kernel: [<c012977d>] __mod_timer+0x4d/0x1f0 (12)
Oct 26 21:43:56 porky kernel: [<c01f6535>] rtc_do_ioctl+0x185/0x970 (44)
Oct 26 21:43:56 porky kernel: [<c013838d>] __mcount+0x1d/0x30 (136)
Oct 26 21:43:56 porky kernel: [<c01f6d2b>] rtc_ioctl+0xb/0x30 (4)
Oct 26 21:43:56 porky kernel: [<c0179367>] sys_ioctl+0xe7/0x250 (4)
Oct 26 21:43:56 porky kernel: [<c01131f8>] mcount+0x14/0x18 (8)
Oct 26 21:43:56 porky kernel: [<c01f6d2b>] rtc_ioctl+0xb/0x30 (20)
Oct 26 21:43:56 porky kernel: [<c0179367>] sys_ioctl+0xe7/0x250 (20)
Oct 26 21:43:56 porky kernel: [<c0106739>] sysenter_past_esp+0x52/0x71 (48)
Oct 26 21:43:56 porky kernel: preempt count: 00000002
Oct 26 21:43:56 porky kernel: . 2-level deep critical section nesting:
Oct 26 21:43:56 porky kernel: .. entry 1: _spin_lock_irqsave+0x22/0x80 [<c02c71c2>] / (rtc_do_ioctl+0x158/0x970 [<c01f6508>])
Oct 26 21:43:56 porky kernel: .. entry 2: print_traces+0x1d/0x60 [<c01394bd>] / (dump_stack+0x23/0x30 [<c0107613>])
Oct 26 21:43:56 porky kernel:
Oct 26 21:43:56 porky kernel: BUG: scheduling while atomic: IRQ 8/0x00000001/672
Oct 26 21:43:56 porky kernel: caller is schedule+0x30/0xe0
Oct 26 21:43:57 porky kernel: [<c02c58c1>] __schedule+0x771/0x7d0 (12)
Oct 26 21:43:57 porky kernel: [<c02c5950>] schedule+0x30/0xe0 (8)
Oct 26 21:43:57 porky kernel: [<c013838d>] __mcount+0x1d/0x30 (60)
Oct 26 21:43:57 porky kernel: [<c02c592e>] schedule+0xe/0xe0 (4)
Oct 26 21:43:57 porky kernel: [<c02c6c4d>] down_write_mutex+0x12d/0x1e0 (4)
Oct 26 21:43:57 porky kernel: [<c01131f8>] mcount+0x14/0x18 (8)
Oct 26 21:43:57 porky kernel: [<c02c5950>] schedule+0x30/0xe0 (20)
Oct 26 21:43:57 porky kernel: [<c01131f8>] mcount+0x14/0x18 (4)
Oct 26 21:43:57 porky kernel: [<c02c74ea>] _spin_unlock+0x1a/0x40 (20)
Oct 26 21:43:57 porky kernel: [<c02c6c4d>] down_write_mutex+0x12d/0x1e0 (12)
Working on booting the 450 right now.
kr
On Tue, Oct 26, 2004 at 09:04:18PM -0500, K.R. Foley wrote:
> Bill Huey (hui) wrote:
> >On Wed, Oct 27, 2004 at 02:15:42AM +0200, Ingo Molnar wrote:
> >>i have released the -V0.3 Real-Time Preemption patch, which can be
> >>downloaded from:
> >>
> >> http://redhat.com/~mingo/realtime-preempt/
> That's because he uploaded V0.3.1.
Even that top level link/directory doesn't work.
bill
Bill Huey (hui) wrote:
> On Tue, Oct 26, 2004 at 09:04:18PM -0500, K.R. Foley wrote:
>
>>Bill Huey (hui) wrote:
>>
>>>On Wed, Oct 27, 2004 at 02:15:42AM +0200, Ingo Molnar wrote:
>>>
>>>>i have released the -V0.3 Real-Time Preemption patch, which can be
>>>>downloaded from:
>>>>
>>>> http://redhat.com/~mingo/realtime-preempt/
>
>
>>That's because he uploaded V0.3.1.
>
>
> Even that top level link/directory doesn't work.
>
> bill
>
>
For me it redirects to http://people.redhat.com/mingo/realtime-preempt/
Try that directly.
kr
On Tue, Oct 26, 2004 at 10:35:01PM -0500, K.R. Foley wrote:
> Bill Huey (hui) wrote:
> For me it redirects to http://people.redhat.com/mingo/realtime-preempt/
> Try that directly.
They both work now, same for redirection.
bill
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Several things in regard to V0.2:
>>
>>1) Interactive responsiveness seems to be noticeably sluggish at times on
>>all three of the systems I have tested this on.
>>2) My 450MHz UP system is definitely the worst by far. Scrolling through
>>the syslog in a telnet session produces pauses every few seconds for
>>about a second, that is while it's still responding. These problems seem
>>to be network related, but there are no indications of what the problem
>>is. This system also at times will just stop responding to network requests.
>>3) Both of the SMP systems are lacking the snappy responsiveness in X
>>that I have become accustomed to with previous patches, but the 2.6GHz
>>Xeon (w/HT) is worse than the 933MHz Xeon. Again no indications of
>>problems in the logs.
>>4) Using amlat to run the RTC at 1kHz will kill any of these systems
>>very quickly.
>
>
> could you try this with -V0.3 too? I believe most of these problems
> should be solved.
>
> Ingo
>
OK the 450 (UP) dies before it gets out of the gate. From the tail end
of the Oops on the screen it looks to be related to the rtc also. Will
investigate further after I rest my eyes a bit.
kr
>> Denis Vlasenko wrote:
>> >
>> > <shameless plug>
>> > Maybe this program will be useful. It is designed to give you
>> > overall system statistics without the need to scan entire /proc/NNN
>> > forest. Together with nice -20, it will hopefully not stall.
>> >
>> > Compiled with dietlibc. If you will have trouble compiling it,
>> > binary is attached too.
>> >
>> > Latest version is 0.9 but it seems I forgot it in my home box :(
>> </shameless plug>
>>
>> Thanks for nmeter. I have changed a couple of little bits to build with
>> gcc-3.4 here (see diff attached).
>
> Hmm will it compile on 3.4 with "static inline"?
>
Yes, it now compiles on gcc-3.4.1 out of the box.
Thanks for this nice little utility.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Ingo Molnar <[email protected]> wrote:
> > Oct 26 21:43:56 porky kernel: BUG: sleeping function called from invalid context amlat(3963) at kernel/mutex.c:28
> > Oct 26 21:43:56 porky kernel: in_atomic():1 [00000001], irqs_disabled():1
> > Oct 26 21:43:56 porky kernel: [<c011c7da>] __might_sleep+0xca/0xe0 (12)
> > Oct 26 21:43:56 porky kernel: [<c0137d89>] _mutex_lock+0x39/0x50 (36)
> > Oct 26 21:43:56 porky kernel: [<c0137df6>] _mutex_lock_irqsave+0x16/0x20 (24)
> > Oct 26 21:43:56 porky kernel: [<c012977d>] __mod_timer+0x4d/0x1f0 (12)
> > Oct 26 21:43:56 porky kernel: [<c01f6535>] rtc_do_ioctl+0x185/0x970 (44)
>
> does the quick hack below help?
here's a more complete fix.
Ingo
--- linux/drivers/char/rtc.c.orig
+++ linux/drivers/char/rtc.c
@@ -177,7 +177,7 @@ static unsigned long rtc_max_user_freq =
/*
* rtc_task_lock nests inside rtc_lock.
*/
-static DECLARE_SPINLOCK(rtc_task_lock);
+static DECLARE_RAW_SPINLOCK(rtc_task_lock);
static rtc_task_t *rtc_callback = NULL;
#endif
@@ -238,10 +238,17 @@ irqreturn_t rtc_interrupt(int irq, void
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);
}
- if (rtc_status & RTC_TIMER_ON)
+ if (rtc_status & RTC_TIMER_ON) {
+ spin_unlock (&rtc_lock);
+ /*
+ * We do the mod_timer outside of the lock because
+ * it may reschedule under PREEMPT_REALTIME. As long
+ * as we read the flag race-free it is not a problem
+ * if two mod_timer()s race:
+ */
mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
- spin_unlock (&rtc_lock);
+ } else
+ spin_unlock (&rtc_lock);
/* Now do the rest of the actions */
spin_lock(&rtc_task_lock);
@@ -1094,17 +1101,19 @@ static void rtc_dropped_irq(unsigned lon
return;
}
- /* Just in case someone disabled the timer from behind our back... */
- if (rtc_status & RTC_TIMER_ON)
- mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
rtc_irq_data += ((rtc_freq/HZ)<<8);
rtc_irq_data &= ~0xff;
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0); /* restart */
freq = rtc_freq;
- spin_unlock_irq(&rtc_lock);
+ /* Just in case someone disabled the timer from behind our back... */
+ if (rtc_status & RTC_TIMER_ON) {
+ spin_unlock_irq(&rtc_lock);
+ mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
+ } else
+ spin_unlock_irq(&rtc_lock);
+
printk(KERN_WARNING "rtc: lost some interrupts at %ldHz.\n", freq);
* Ingo Molnar <[email protected]> wrote:
> > does the quick hack below help?
>
> here's a more complete fix.
third time lucky?
Ingo
--- linux/drivers/char/rtc.c.orig
+++ linux/drivers/char/rtc.c
@@ -177,7 +177,7 @@ static unsigned long rtc_max_user_freq =
/*
* rtc_task_lock nests inside rtc_lock.
*/
-static DECLARE_SPINLOCK(rtc_task_lock);
+static DECLARE_RAW_SPINLOCK(rtc_task_lock);
static rtc_task_t *rtc_callback = NULL;
#endif
@@ -238,10 +238,17 @@ irqreturn_t rtc_interrupt(int irq, void
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);
}
- if (rtc_status & RTC_TIMER_ON)
+ if (rtc_status & RTC_TIMER_ON) {
+ spin_unlock (&rtc_lock);
+ /*
+ * We do the mod_timer outside of the lock because
+ * it may reschedule under PREEMPT_REALTIME. As long
+ * as we read the flag race-free it is not a problem
+ * if two mod_timer()s race:
+ */
mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
- spin_unlock (&rtc_lock);
+ } else
+ spin_unlock (&rtc_lock);
/* Now do the rest of the actions */
spin_lock(&rtc_task_lock);
@@ -404,8 +411,8 @@ static int rtc_do_ioctl(unsigned int cmd
if (rtc_status & RTC_TIMER_ON) {
spin_lock_irq (&rtc_lock);
rtc_status &= ~RTC_TIMER_ON;
- del_timer(&rtc_irq_timer);
spin_unlock_irq (&rtc_lock);
+ del_timer(&rtc_irq_timer); // FIXME
}
return 0;
}
@@ -730,9 +737,10 @@ static int rtc_release(struct inode *ino
}
if (rtc_status & RTC_TIMER_ON) {
rtc_status &= ~RTC_TIMER_ON;
- del_timer(&rtc_irq_timer);
- }
- spin_unlock_irq(&rtc_lock);
+ spin_unlock_irq(&rtc_lock);
+ del_timer(&rtc_irq_timer); // FIXME
+ } else
+ spin_unlock_irq(&rtc_lock);
if (file->f_flags & FASYNC) {
rtc_fasync (-1, file, 0);
@@ -808,6 +816,7 @@ int rtc_unregister(rtc_task_t *task)
return -EIO;
#else
unsigned char tmp;
+ int rm_timer;
spin_lock_irq(&rtc_lock);
spin_lock(&rtc_task_lock);
@@ -827,13 +836,16 @@ int rtc_unregister(rtc_task_t *task)
CMOS_WRITE(tmp, RTC_CONTROL);
CMOS_READ(RTC_INTR_FLAGS);
}
+ rm_timer = 0;
if (rtc_status & RTC_TIMER_ON) {
rtc_status &= ~RTC_TIMER_ON;
- del_timer(&rtc_irq_timer);
+ rm_timer = 1;
}
rtc_status &= ~RTC_IS_OPEN;
spin_unlock(&rtc_task_lock);
spin_unlock_irq(&rtc_lock);
+ if (rm_timer)
+ del_timer(&rtc_irq_timer);
return 0;
#endif
}
@@ -1094,17 +1106,19 @@ static void rtc_dropped_irq(unsigned lon
return;
}
- /* Just in case someone disabled the timer from behind our back... */
- if (rtc_status & RTC_TIMER_ON)
- mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
rtc_irq_data += ((rtc_freq/HZ)<<8);
rtc_irq_data &= ~0xff;
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0); /* restart */
freq = rtc_freq;
- spin_unlock_irq(&rtc_lock);
+ /* Just in case someone disabled the timer from behind our back... */
+ if (rtc_status & RTC_TIMER_ON) {
+ spin_unlock_irq(&rtc_lock);
+ mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
+ } else
+ spin_unlock_irq(&rtc_lock);
+
printk(KERN_WARNING "rtc: lost some interrupts at %ldHz.\n", freq);
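The expiry expression used by every mod_timer() call in the patch above, HZ/rtc_freq + 2*HZ/100, is evaluated with C integer division, so its first term drops to zero whenever rtc_freq exceeds HZ. A minimal sketch of that arithmetic follows; the function name rtc_timer_delay and the sample values (HZ of 1000, rtc_freq of 1024 and 64) are assumptions chosen for illustration and are not stated in the thread.

```c
/*
 * Delay in jiffies, computed the same way as the expression in the
 * patch: HZ/rtc_freq + 2*HZ/100, with integer division throughout.
 * hz and rtc_freq are passed in as parameters here; in the driver
 * they are the kernel's HZ constant and the rtc_freq variable.
 */
unsigned long rtc_timer_delay(unsigned long hz, unsigned long rtc_freq)
{
    return hz / rtc_freq + 2 * hz / 100;
}
```

With the assumed HZ of 1000, an rtc_freq of 1024 gives 0 + 20 = 20 jiffies, and an rtc_freq of 64 gives 15 + 20 = 35 jiffies.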
* K.R. Foley <[email protected]> wrote:
> Running amlat still hard locks the system. The last time this happened
> I got this in the log:
>
> Oct 26 21:43:56 porky kernel: BUG: sleeping function called from invalid context amlat(3963) at kernel/mutex.c:28
> Oct 26 21:43:56 porky kernel: in_atomic():1 [00000001], irqs_disabled():1
> Oct 26 21:43:56 porky kernel: [<c011c7da>] __might_sleep+0xca/0xe0 (12)
> Oct 26 21:43:56 porky kernel: [<c0137d89>] _mutex_lock+0x39/0x50 (36)
> Oct 26 21:43:56 porky kernel: [<c0137df6>] _mutex_lock_irqsave+0x16/0x20 (24)
> Oct 26 21:43:56 porky kernel: [<c012977d>] __mod_timer+0x4d/0x1f0 (12)
> Oct 26 21:43:56 porky kernel: [<c01f6535>] rtc_do_ioctl+0x185/0x970 (44)
does the quick hack below help?
Ingo
--- linux/drivers/char/rtc.c.orig
+++ linux/drivers/char/rtc.c
@@ -238,11 +238,11 @@ irqreturn_t rtc_interrupt(int irq, void
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);
}
+ spin_unlock (&rtc_lock);
+
if (rtc_status & RTC_TIMER_ON)
mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
- spin_unlock (&rtc_lock);
-
/* Now do the rest of the actions */
spin_lock(&rtc_task_lock);
if (rtc_callback)
@@ -1094,10 +1094,6 @@ static void rtc_dropped_irq(unsigned lon
return;
}
- /* Just in case someone disabled the timer from behind our back... */
- if (rtc_status & RTC_TIMER_ON)
- mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
rtc_irq_data += ((rtc_freq/HZ)<<8);
rtc_irq_data &= ~0xff;
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0); /* restart */
@@ -1106,6 +1102,10 @@ static void rtc_dropped_irq(unsigned lon
spin_unlock_irq(&rtc_lock);
+ /* Just in case someone disabled the timer from behind our back... */
+ if (rtc_status & RTC_TIMER_ON)
+ mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
+
printk(KERN_WARNING "rtc: lost some interrupts at %ldHz.\n", freq);
/* Now we have new data */
* Ingo Molnar <[email protected]> wrote:
> third time lucky?
there was one more piece missing...
i've also uploaded -RT-V0.3.2 with this fix included.
Ingo
--- linux/drivers/char/rtc.c.orig
+++ linux/drivers/char/rtc.c
@@ -177,7 +177,7 @@ static unsigned long rtc_max_user_freq =
/*
* rtc_task_lock nests inside rtc_lock.
*/
-static DECLARE_SPINLOCK(rtc_task_lock);
+static DECLARE_RAW_SPINLOCK(rtc_task_lock);
static rtc_task_t *rtc_callback = NULL;
#endif
@@ -217,6 +217,8 @@ static inline unsigned char rtc_is_updat
irqreturn_t rtc_interrupt(int irq, void *dev_id, struct pt_regs *regs)
{
+ unsigned long flags;
+
/*
* Can be an alarm interrupt, update complete interrupt,
* or a periodic interrupt. We store the status in the
@@ -224,7 +226,7 @@ irqreturn_t rtc_interrupt(int irq, void
* the last read in the remainder of rtc_irq_data.
*/
- spin_lock (&rtc_lock);
+ spin_lock_irqsave(&rtc_lock, flags);
rtc_irq_data += 0x100;
rtc_irq_data &= ~0xff;
if (is_hpet_enabled()) {
@@ -238,16 +240,23 @@ irqreturn_t rtc_interrupt(int irq, void
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0);
}
- if (rtc_status & RTC_TIMER_ON)
+ if (rtc_status & RTC_TIMER_ON) {
+ spin_unlock_irqrestore(&rtc_lock, flags);
+ /*
+ * We do the mod_timer outside of the lock because
+ * it may reschedule under PREEMPT_REALTIME. As long
+ * as we read the flag race-free it is not a problem
+ * if two mod_timer()s race:
+ */
mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
- spin_unlock (&rtc_lock);
+ } else
+ spin_unlock_irqrestore(&rtc_lock, flags);
/* Now do the rest of the actions */
- spin_lock(&rtc_task_lock);
+ spin_lock_irqsave(&rtc_task_lock, flags);
if (rtc_callback)
rtc_callback->func(rtc_callback->private_data);
- spin_unlock(&rtc_task_lock);
+ spin_unlock_irqrestore(&rtc_task_lock, flags);
wake_up_interruptible(&rtc_wait);
kill_fasync (&rtc_async_queue, SIGIO, POLL_IN);
@@ -404,8 +413,8 @@ static int rtc_do_ioctl(unsigned int cmd
if (rtc_status & RTC_TIMER_ON) {
spin_lock_irq (&rtc_lock);
rtc_status &= ~RTC_TIMER_ON;
- del_timer(&rtc_irq_timer);
spin_unlock_irq (&rtc_lock);
+ del_timer(&rtc_irq_timer); // FIXME
}
return 0;
}
@@ -422,10 +431,9 @@ static int rtc_do_ioctl(unsigned int cmd
if (!(rtc_status & RTC_TIMER_ON)) {
spin_lock_irq (&rtc_lock);
- rtc_irq_timer.expires = jiffies + HZ/rtc_freq + 2*HZ/100;
- add_timer(&rtc_irq_timer);
rtc_status |= RTC_TIMER_ON;
spin_unlock_irq (&rtc_lock);
+ mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
}
set_rtc_irq_bit(RTC_PIE);
return 0;
@@ -730,9 +738,10 @@ static int rtc_release(struct inode *ino
}
if (rtc_status & RTC_TIMER_ON) {
rtc_status &= ~RTC_TIMER_ON;
- del_timer(&rtc_irq_timer);
- }
- spin_unlock_irq(&rtc_lock);
+ spin_unlock_irq(&rtc_lock);
+ del_timer(&rtc_irq_timer); // FIXME
+ } else
+ spin_unlock_irq(&rtc_lock);
if (file->f_flags & FASYNC) {
rtc_fasync (-1, file, 0);
@@ -808,6 +817,7 @@ int rtc_unregister(rtc_task_t *task)
return -EIO;
#else
unsigned char tmp;
+ int rm_timer;
spin_lock_irq(&rtc_lock);
spin_lock(&rtc_task_lock);
@@ -827,13 +837,16 @@ int rtc_unregister(rtc_task_t *task)
CMOS_WRITE(tmp, RTC_CONTROL);
CMOS_READ(RTC_INTR_FLAGS);
}
+ rm_timer = 0;
if (rtc_status & RTC_TIMER_ON) {
rtc_status &= ~RTC_TIMER_ON;
- del_timer(&rtc_irq_timer);
+ rm_timer = 1;
}
rtc_status &= ~RTC_IS_OPEN;
spin_unlock(&rtc_task_lock);
spin_unlock_irq(&rtc_lock);
+ if (rm_timer)
+ del_timer(&rtc_irq_timer);
return 0;
#endif
}
@@ -1094,17 +1107,19 @@ static void rtc_dropped_irq(unsigned lon
return;
}
- /* Just in case someone disabled the timer from behind our back... */
- if (rtc_status & RTC_TIMER_ON)
- mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
-
rtc_irq_data += ((rtc_freq/HZ)<<8);
rtc_irq_data &= ~0xff;
rtc_irq_data |= (CMOS_READ(RTC_INTR_FLAGS) & 0xF0); /* restart */
freq = rtc_freq;
- spin_unlock_irq(&rtc_lock);
+ /* Just in case someone disabled the timer from behind our back... */
+ if (rtc_status & RTC_TIMER_ON) {
+ spin_unlock_irq(&rtc_lock);
+ mod_timer(&rtc_irq_timer, jiffies + HZ/rtc_freq + 2*HZ/100);
+ } else
+ spin_unlock_irq(&rtc_lock);
+
printk(KERN_WARNING "rtc: lost some interrupts at %ldHz.\n", freq);
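The recurring change in this patch is to sample RTC_TIMER_ON while rtc_lock is held, drop the lock, and only then call mod_timer() or del_timer(), since those may sleep under PREEMPT_REALTIME. The same shape can be sketched in user space with a pthread mutex. This is an illustration of the pattern only, not kernel code: state_lock, timer_on, rearm_timer() and the other names are hypothetical stand-ins.

```c
#include <pthread.h>

static pthread_mutex_t state_lock = PTHREAD_MUTEX_INITIALIZER;
static int timer_on;     /* stand-in for rtc_status & RTC_TIMER_ON */
static int rearm_calls;  /* demo counter, only touched by rearm_timer() */

/* Stand-in for mod_timer(): an operation that may block and therefore
 * must not be called while state_lock is held. */
static void rearm_timer(void)
{
    rearm_calls++;
}

/* Update the flag under the lock. */
void set_timer_on(int on)
{
    pthread_mutex_lock(&state_lock);
    timer_on = on;
    pthread_mutex_unlock(&state_lock);
}

/* Sample the flag under the lock, release the lock, then act on the
 * sampled value. Returns the sampled flag. */
int handle_tick(void)
{
    int need_rearm;

    pthread_mutex_lock(&state_lock);
    need_rearm = timer_on;
    pthread_mutex_unlock(&state_lock);

    if (need_rearm)
        rearm_timer();   /* runs with the lock already dropped */
    return need_rearm;
}

/* Number of times the stand-in was invoked so far. */
int rearm_count(void)
{
    return rearm_calls;
}
```

The price of this shape is that the flag may change between the unlock and the call; the comment in the patch accepts that for racing mod_timer() calls, and the del_timer() call sites are marked FIXME.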
* K.R. Foley <[email protected]> wrote:
> I've repeated the above on the dual 933 Xeon:
>
> Still problems with interactive behavior. Running KDE, with top
> running in xterm, scrolling through the menus I get some pauses. When
> the pauses occur I see kdeinit hit the top of the list and sometimes
> consuming 90% or more of a CPU and idle usage drops to 30-40%. I do
> see some latency traces (not really high ones) in the log that were
> generated by kdeinit but I think they were generated prior to when
> these pauses occurred, most likely when logging in.
is this 90% or more CPU time system (kernel) overhead or userspace
overhead?
Ingo
* Ingo Molnar <[email protected]> wrote:
> > Still problems with interactive behavior. Running KDE, with top
> > running in xterm, scrolling through the menus I get some pauses. When
> > the pauses occur I see kdeinit hit the top of the list and sometimes
> > consuming 90% or more of a CPU and idle usage drops to 30-40%. I do
> > see some latency traces (not really high ones) in the log that were
> > generated by kdeinit but I think they were generated prior to when
> > these pauses occurred, most likely when logging in.
>
> is this 90% or more CPU time system (kernel) overhead or userspace
> overhead?
another thing to watch for is the context-switch rate in vmstat. -V0.3
patches include a hack that includes involuntary context-switches (mutex
context switches) in the context-switch stat as well. (previously those
were reported in a separate field which vmstat didnt pick up.)
So if the context-switch rate shoots up to above say 100K/sec that is a
sure sign of some mutex badness. The livelock scenarios i solved in
-V0.3 occasionally generated a more than 500K/sec context-switch rate on
a 2GHz box. Having just a couple of thousand per sec isnt by itself a
sign of anything unusual.
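For anyone who wants to watch this without vmstat, here is a rough sketch that works from the cumulative "ctxt" counter in /proc/stat, which is the figure vmstat's context-switch column is derived from. The helper names are made up for illustration, and the 100K/sec cut-off is simply the rule of thumb from the paragraph above, not a kernel constant.

```c
#include <stdio.h>
#include <string.h>

/* Parse one line of /proc/stat. Returns 0 and stores the counter if
 * the line is the "ctxt" line, -1 otherwise. */
int parse_ctxt_line(const char *line, unsigned long long *out)
{
    if (strncmp(line, "ctxt ", 5) != 0)
        return -1;
    return sscanf(line + 5, "%llu", out) == 1 ? 0 : -1;
}

/* Read the current cumulative context-switch count.
 * Returns 0 on success, -1 if /proc/stat is unreadable or has no
 * "ctxt" line. */
int read_ctxt(unsigned long long *out)
{
    char line[256];
    int rc = -1;
    FILE *f = fopen("/proc/stat", "r");

    if (!f)
        return -1;
    while (fgets(line, sizeof(line), f)) {
        if (parse_ctxt_line(line, out) == 0) {
            rc = 0;
            break;
        }
    }
    fclose(f);
    return rc;
}

/* Context switches per second between two samples taken
 * 'seconds' apart. */
double ctxt_rate(unsigned long long before, unsigned long long after,
                 double seconds)
{
    return (double)(after - before) / seconds;
}

/* Rule of thumb from the mail: above roughly 100K/sec is suspicious. */
int looks_like_mutex_badness(double rate_per_sec)
{
    return rate_per_sec > 100000.0;
}
```

A caller would take two read_ctxt() samples a second or so apart, pass them to ctxt_rate(), and check the result with looks_like_mutex_badness().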
Ingo
> i've also uploaded -RT-V0.3.2 with this fix included.
note that if you are running amlat/realfeel then you should do something
like this after starting realfeel:
chrt -f 99 -p `pidof 'IRQ 8'`
chrt -f 98 -p `pidof realfeel`
because by default IRQ 8 has a lower RT priority than realfeel.
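The same priority change can be made from a program through sched_setscheduler(), the system call that chrt uses for this. A minimal sketch follows, assuming the caller already knows the target pid (looking up the 'IRQ 8' thread by name is not covered); the helper names are hypothetical.

```c
#include <sched.h>
#include <sys/types.h>

/* Return 1 if prio is a legal SCHED_FIFO priority on this system,
 * 0 otherwise. */
int fifo_priority_valid(int prio)
{
    return prio >= sched_get_priority_min(SCHED_FIFO) &&
           prio <= sched_get_priority_max(SCHED_FIFO);
}

/* Rough equivalent of "chrt -f <prio> -p <pid>".
 * Returns 0 on success and -1 on failure: either the priority was out
 * of range (rejected here, errno untouched) or sched_setscheduler()
 * failed, typically with EPERM when not run as root. */
int set_fifo_priority(pid_t pid, int prio)
{
    struct sched_param param = { .sched_priority = prio };

    if (!fifo_priority_valid(prio))
        return -1;
    return sched_setscheduler(pid, SCHED_FIFO, &param);
}
```

Calling set_fifo_priority() with the IRQ 8 thread's pid and 99, then with realfeel's pid and 98, would mirror the two chrt commands above.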
Ingo
On Wed, 27 Oct 2004 11:06:20 +0200
Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > third time lucky?
>
> there was one more piece missing...
>
> i've also uploaded -RT-V0.3.2 with this fix included.
Hi,
V0.3.2 builds and boots fine here. It seems to run ok, too. Uptime 25
minutes and no BUG's [yay! 25 minutes already! ;)]. Well anyways,
preempt_max_thresh is at 38us after running several concurrent find's plus
jackd.
There's a problem though with jackd performance. Read on below if it is of
any interest at this point.
Soundcard IRQ is 3:
mango:~# cat /proc/interrupts
           CPU0
  0:    1331024   XT-PIC  timer    0/31024
  1:       3494   XT-PIC  i8042    0/3494
  2:          0   XT-PIC  cascade  0/0
  3:     781428   XT-PIC  CS46XX   0/81428
  5:       1811   XT-PIC  eth0     0/1811
  8:          4   XT-PIC  rtc      0/4
 12:      71236   XT-PIC  i8042    1/71236
 14:       5163   XT-PIC  ide0     0/5163
 15:      50392   XT-PIC  ide1     0/50392
NMI:          0
ERR:          0
I set the irq handler thread to prio 99:
mango:~# chrt -p `pidof "IRQ 3"`
pid 118's current scheduling policy: SCHED_FIFO
pid 118's current scheduling priority: 99
jackd runs with a period size of 128 frames (48000hz samplerate) and with
SCHED_FIFO with prios from 60 to 70 (ps output is somewhat broken):
mango:~# ps -cmL `pidof jackd`
  PID   LWP CLS PRI TTY STAT   TIME COMMAND
 1286     -   -   - ?   -      0:28 /usr/bin/jackd -R -P60 -t20000 -dalsa -
    -  1286  TS  20 -   SLsl   0:00 -
    -  1287  TS  23 -   SLsl   0:00 -
    -  1288  FF 110 -   SLsl   0:00 -
    -  1289  FF 100 -   SLsl   0:27 -
~$ chrt -p 1286
pid 1286's current scheduling policy: SCHED_OTHER
pid 1286's current scheduling priority: 0
~$ chrt -p 1287
pid 1287's current scheduling policy: SCHED_OTHER
pid 1287's current scheduling priority: 0
~$ chrt -p 1288
pid 1288's current scheduling policy: SCHED_FIFO
pid 1288's current scheduling priority: 70
~$ chrt -p 1289
pid 1289's current scheduling policy: SCHED_FIFO
pid 1289's current scheduling priority: 60
Anyways i get xruns like crazy under load (like 200 in 10 minutes). It seems
the scheduling class and high priority don't matter really as wiggling
windows around on the screen or doing a "find /" can easily provoke xruns.
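For scale, the period size and sample rate quoted above fix how long jackd has to service each period before an xrun can occur. A small sketch of that arithmetic follows; the function names are made up for illustration, and treating one period as the deadline is a simplifying assumption, since the real margin also depends on how many periods the ALSA buffer holds.

```c
/* Duration of one audio period in microseconds. */
double period_usec(unsigned int frames, unsigned int sample_rate_hz)
{
    return (double)frames * 1e6 / (double)sample_rate_hz;
}

/* 1 if a delay of latency_usec still fits inside a single period,
 * 0 otherwise. */
int fits_in_period(double latency_usec, unsigned int frames,
                   unsigned int sample_rate_hz)
{
    return latency_usec < period_usec(frames, sample_rate_hz);
}
```

128 frames at 48000 Hz is about 2667 us per period, so the 38 us preempt_max_thresh reading fits with plenty of room; whatever causes the xruns must be delaying jackd by far more than the latency tracer has recorded.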
flo
On Wed, 27 Oct 2004 12:33:29 +0200
Florian Schmidt <[email protected]> wrote:
> V0.3.2 builds and boots fine here. It seems to run ok, too. Uptime 25
> minutes and no BUG's [yay! 25 minutes already! ;)]. Well anyways,
> preempt_max_thresh is at 38us after running several concurrent find's plus
> jackd.
>
> There's a problem though with jackd performance. Read on below if it is of
> any interest at this point.
[snip]
anyways, i still see the mysterious pauses which do not show up in the
preempt_max_thresh.
ah, and btw: what does the /proc/sys/kernel/kernel_preemption tunable do
with PREEMPT_REALTIME enabled?
mango:~# cat /proc/sys/kernel/kernel_preemption
1
[all the other VP tunables are not available anymore]
mango:~# cat /proc/sys/kernel/voluntary_preemption
cat: /proc/sys/kernel/voluntary_preemption: No such file or directory
mango:~# cat /proc/sys/kernel/hardirq_preemption
cat: /proc/sys/kernel/hardirq_preemption: No such file or directory
mango:~# cat /proc/sys/kernel/softirq_preemption
cat: /proc/sys/kernel/softirq_preemption: No such file or directory
flo
* Florian Schmidt <[email protected]> wrote:
> jackd runs with a period size of 128 frames (48000hz samplerate) and with
> SCHED_FIFO with prios from 60 to 70 (ps output is somewhat broken):
>
> mango:~# ps -cmL `pidof jackd`
>   PID   LWP CLS PRI TTY STAT   TIME COMMAND
>  1286     -   -   - ?   -      0:28 /usr/bin/jackd -R -P60 -t20000 -dalsa -
>     -  1286  TS  20 -   SLsl   0:00 -
>     -  1287  TS  23 -   SLsl   0:00 -
>     -  1288  FF 110 -   SLsl   0:00 -
>     -  1289  FF 100 -   SLsl   0:27 -
>
> ~$ chrt -p 1286
> pid 1286's current scheduling policy: SCHED_OTHER
> pid 1286's current scheduling priority: 0
> ~$ chrt -p 1287
> pid 1287's current scheduling policy: SCHED_OTHER
> pid 1287's current scheduling priority: 0
just curious, are these two important to the latency path of jackd, or
are they lowprio things and are thus at SCHED_OTHER intentionally?
> ~$ chrt -p 1288
> pid 1288's current scheduling policy: SCHED_FIFO
> pid 1288's current scheduling priority: 70
> ~$ chrt -p 1289
> pid 1289's current scheduling policy: SCHED_FIFO
> pid 1289's current scheduling priority: 60
>
> Anyways i get xruns like crazy under load (like 200 in 10 minutes). It
> seems the scheduling class and high priority don't matter really as
> wiggling windows around on the screen or doing a "find /" can easily
> provoke xruns.
yeah, i'm hunting a quite similar bug: i can see 'realfeel' latencies
generated by simple window scrolling. It is most likely a logic bug
somewhere - a missing reschedule check, irqs left disabled accidentally,
or something like that. Since some other workloads dont trigger it i
dont think i broke RT scheduling by itself - it is most likely some
non-core code somewhere missing a resched. Which doesnt make it less of
a problem, but it makes it harder to find :-|
Ingo
On Wed, 27 Oct 2004 12:29:21 +0200
Ingo Molnar <[email protected]> wrote:
> > pid 1286's current scheduling policy: SCHED_OTHER
> > pid 1286's current scheduling priority: 0
> > ~$ chrt -p 1287
> > pid 1287's current scheduling policy: SCHED_OTHER
> > pid 1287's current scheduling priority: 0
>
> just curious, are these two important to the latency path of jackd, or
> are they lowprio things and are thus at SCHED_OTHER intentionally?
the latter. jackd has one RT thread for doing the grunt work and one which
acts as a watchdog. The other two are SCHED_OTHER by design.
> > Anyways i get xruns like crazy under load (like 200 in 10 minutes). It
> > seems the scheduling class and high priority don't matter really as
> > wiggling windows around on the screen or doing a "find /" can easily
> > provoke xruns.
>
> yeah, i'm hunting a quite similar bug: i can see 'realfeel' latencies
> generated by simple window scrolling. It is most likely a logic bug
> somewhere - a missing reschedule check, irqs left disabled accidentally,
> or something like that. Since some other workloads dont trigger it i
> dont think i broke RT scheduling by itself - it is most likely some
> non-core code somewhere missing a resched. Which doesnt make it less of
> a problem, but it makes it harder to find :-|
Hmm, you're right. It seems that diskload alone doesn't trigger the problem.
It rather looks like graphics output/activity is the problem.
Anyways i'm back in U3 w/o PREEMPT_REALTIME as V0.3.2 just hardlocked on me
when i pressed ctrl-d in a root shell (in an xterm) to exit. Nothing has
made it to the log. I haven't found out yet whether it's possible to use my
old zx81 as serial console, so i can't help with OOPS/BUG output.
flo
Linux version 2.6.9-mm1-RT-V0.3.2 (michich@k4-912b) (gcc version 3.3.4 (Debian 1:3.3.4-6sarge1)) #5 Wed Oct 27 12:26:13 CEST 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000e4000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000003ffb0000 (usable)
BIOS-e820: 000000003ffb0000 - 000000003ffc0000 (ACPI data)
BIOS-e820: 000000003ffc0000 - 000000003fff0000 (ACPI NVS)
BIOS-e820: 000000003fff0000 - 0000000040000000 (reserved)
BIOS-e820: 00000000ff780000 - 0000000100000000 (reserved)
Warning only 896MB will be used.
Use a HIGHMEM enabled kernel.
896MB LOWMEM available.
found SMP MP-table at 000ff780
On node 0 totalpages: 229376
DMA zone: 4096 pages, LIFO batch:1
Normal zone: 225280 pages, LIFO batch:16
HighMem zone: 0 pages, LIFO batch:1
DMI 2.3 present.
ACPI: RSDP (v002 ACPIAM ) @ 0x000fa840
ACPI: XSDT (v001 A M I OEMXSDT 0x06000417 MSFT 0x00000097) @ 0x3ffb0100
ACPI: FADT (v003 A M I OEMFACP 0x06000417 MSFT 0x00000097) @ 0x3ffb0290
ACPI: MADT (v001 A M I OEMAPIC 0x06000417 MSFT 0x00000097) @ 0x3ffb0390
ACPI: OEMB (v001 A M I OEMBIOS 0x06000417 MSFT 0x00000097) @ 0x3ffc0040
ACPI: DSDT (v001 A0036 A0036001 0x00000001 MSFT 0x0100000d) @ 0x00000000
ACPI: Local APIC address 0xfee00000
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 15:7 APIC version 16
ACPI: IOAPIC (id[0x01] address[0xfec00000] gsi_base[0])
IOAPIC[0]: apic_id 1, version 3, address 0xfec00000, GSI 0-23
ACPI: INT_SRC_OVR (bus 0 bus_irq 0 global_irq 2 dfl dfl)
ACPI: IRQ0 used by override.
ACPI: IRQ2 used by override.
ACPI: IRQ9 used by override.
Enabling APIC mode: Flat. Using 1 I/O APICs
Using ACPI (MADT) for SMP configuration information
Built 1 zonelists
Initializing CPU#0
Kernel command line: BOOT_IMAGE=Deb-32-RT ro root=308 netconsole=4444@147.229.222.29/eth0,4444@147.229.222.28/00:0C:6E:2F:30:75
netconsole: local port 4444
netconsole: local IP 147.229.222.29
netconsole: interface eth0
netconsole: remote port 4444
netconsole: remote IP 147.229.222.28
netconsole: remote ethernet address 00:0c:6e:2f:30:75
PID hash table entries: 4096 (order: 12, 65536 bytes)
Detected 2403.782 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x30
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 905628k/917504k available (1843k kernel code, 11488k reserved, 786k data, 148k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Calibrating delay loop... 4702.20 BogoMIPS (lpj=2351104)
Security Framework v1.0.0 initialized
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: After generic identify, caps: 078bfbff e1d3fbff 00000000 00000000
CPU: After vendor identify, caps: 078bfbff e1d3fbff 00000000 00000000
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 1024K (64 bytes/line)
CPU: After all inits, caps: 078bfbff e1d3fbff 00000000 00000010
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
CPU: AMD Athlon(tm) 64 FX-53 Processor stepping 0a
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
spawn_desched_task(00000000)
desched cpu_callback 3/00000000
ksoftirqd started up.
softirq RT prio: 24.
desched cpu_callback 2/00000000
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=-1
desched thread 0 started up.
NET: Registered protocol family 16
PCI: PCI BIOS revision 2.10 entry at 0xf0031, last bus=1
PCI: Using configuration type 1
mtrr: v2.0 (20020519)
ACPI: Subsystem revision 20040816
ACPI: Interpreter enabled
ACPI: Using IOAPIC for interrupt routing
ACPI: PCI Root Bridge [PCI0] (00:00)
PCI: Probing PCI hardware (bus 00)
ACPI: PCI Interrupt Routing Table [\_SB_.PCI0._PRT]
ACPI: PCI Interrupt Link [LNKA] (IRQs 3 4 5 7 10 *11 14 15)
ACPI: PCI Interrupt Link [LNKB] (IRQs 3 4 5 7 *10 11 14 15)
ACPI: PCI Interrupt Link [LNKC] (IRQs 3 4 *5 7 10 11 14 15)
ACPI: PCI Interrupt Link [LNKD] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKE] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKF] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKG] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
ACPI: PCI Interrupt Link [LNKH] (IRQs 3 4 5 7 10 11 14 15) *0, disabled.
PCI: Using ACPI for IRQ routing
** PCI interrupts are no longer routed automatically. If this
** causes a device to stop working, it is probably because the
** driver failed to call pci_enable_device(). As a temporary
** workaround, the "pci=routeirq" argument restores the old
** behavior. If this argument makes the device work again,
** please email the output of "lspci" to [email protected]
** so I can fix the driver.
Initializing Cryptographic API
Real Time Clock Driver v1.12
ACPI: PS/2 Keyboard Controller [PS2K] at I/O 0x60, 0x64, irq 1
ACPI: PS/2 Mouse Controller [PS2M] at irq 12
serio: i8042 AUX port at 0x60,0x64 irq 12
serio: i8042 KBD port at 0x60,0x64 irq 1
Serial: 8250/16550 driver $Revision: 1.90 $ 8 ports, IRQ sharing disabled
ttyS0 at I/O 0x3f8 (irq = 4) is a 16550A
ttyS1 at I/O 0x2f8 (irq = 3) is a 16550A
io scheduler noop registered
io scheduler anticipatory registered
io scheduler deadline registered
io scheduler cfq registered
ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 17
ACPI: PCI interrupt 0000:00:0a.0[A] -> GSI 17 (level, low) -> IRQ 17
eth0: Yukon Gigabit Ethernet 10/100/1000Base-T Adapter
PrefPort:A RlmtMode:Check Link State
netconsole: device eth0 not up yet, forcing it
netconsole: carrier detect appears flaky, waiting 10 seconds
IRQ#17 thread RT prio: 49.
eth0: network connection up using port A
speed: 100
autonegotiation: yes
duplex mode: full
flowctrl: none
irq moderation: disabled
scatter-gather: enabled
netconsole: network logging started
Uniform Multi-Platform E-IDE driver Revision: 7.00alpha2
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller at PCI slot 0000:00:0f.1
ACPI: PCI interrupt 0000:00:0f.1[A] -> GSI 20 (level, low) -> IRQ 20
VP_IDE: chipset revision 6
VP_IDE: not 100% native mode: will probe irqs later
VP_IDE: VIA vt8237 (rev 00) IDE UDMA133 controller on pci0000:00:0f.1
ide0: BM-DMA at 0xfc00-0xfc07, BIOS settings: hda:DMA, hdb:DMA
ide1: BM-DMA at 0xfc08-0xfc0f, BIOS settings: hdc:pio, hdd:DMA
Probing IDE interface ide0...
hda: ST3120026A, ATA DISK drive
hdb: ST320011A, ATA DISK drive
elevator: using anticipatory as default io scheduler
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
Probing IDE interface ide1...
hdd: TEAC CD-W552E, ATAPI CD/DVD-ROM drive
ide1 at 0x170-0x177,0x376 on irq 15
Probing IDE interface ide2...
ide2: Wait for ready failed before probe !
Probing IDE interface ide3...
ide3: Wait for ready failed before probe !
Probing IDE interface ide4...
ide4: Wait for ready failed before probe !
Probing IDE interface ide5...
ide5: Wait for ready failed before probe !
hda: max request size: 1024KiB
IRQ#14 thread RT prio: 48.
hda: 234441648 sectors (120034 MB) w/8192KiB Cache, CHS=16383/255/63, UDMA(100)
hda: cache flushes supported
hda: hda1 hda2 < hda5 hda6 hda7 hda8 > hda3
hdb: max request size: 128KiB
hdb: 39102336 sectors (20020 MB) w/2048KiB Cache, CHS=38792/16/63, UDMA(100)
hdb: cache flushes not supported
hdb: hdb1
mice: PS/2 mouse device common for all mice
IRQ#12 thread RT prio: 47.
IRQ#1 thread RT prio: 46.
input: AT Translated Set 2 keyboard on isa0060/serio0
input: ImExPS/2 Generic Explorer Mouse on isa0060/serio1
input: PC Speaker
NET: Registered protocol family 2
IP: routing cache hash table of 256 buckets, 40Kbytes
TCP: Hash tables configured (established 8192 bind 13107)
arp_tables: (C) 2002 David S. Miller
NET: Registered protocol family 1
ACPI: (supports S0 S1 S3 S4 S5)
ACPI wakeup devices:
PCI0 PS2K PS2M UAR2 UAR1 AC97 USB1 USB2 USB3 USB4 EHCI PWRB SLPB
EXT3-fs: INFO: recovery required on readonly filesystem.
EXT3-fs: write access will be enabled during recovery.
kjournald starting. Commit interval 5 seconds
EXT3-fs: recovery complete.
EXT3-fs: mounted filesystem with ordered data mode.
VFS: Mounted root (ext3 filesystem) readonly.
Freeing unused kernel memory: 148k freed
Adding 995988k swap on /dev/hda6. Priority:2 extents:1
EXT3 FS on hda8, internal journal
IRQ#8 thread RT prio: 45.
ip_tables: (C) 2000-2002 Netfilter core team
ip_conntrack version 2.1 (7168 buckets, 57344 max) - 456 bytes per conntrack
Capability LSM initialized
kjournald starting. Commit interval 5 seconds
EXT3 FS on hda5, internal journal
EXT3-fs: mounted filesystem with ordered data mode.
kjournald starting. Commit interval 5 seconds
EXT3-fs: mounted filesystem with ordered data mode.
Linux agpgart interface v0.100 (c) Dave Jones
agpgart: Detected AGP bridge 0
agpgart: Maximum main memory to use for agp memory: 816M
agpgart: AGP aperture is 64M @ 0xe4000000
ACPI: PCI interrupt 0000:00:09.0[A] -> GSI 16 (level, low) -> IRQ 16
SCSI subsystem initialized
libata version 1.02 loaded.
sata_via version 0.20
ACPI: PCI interrupt 0000:00:0f.0[B] -> GSI 20 (level, low) -> IRQ 20
sata_via(0000:00:0f.0): routed to hard irq line 10
ata1: SATA max UDMA/133 cmd 0xC400 ctl 0xC002 bmdma 0xB000 irq 20
ata2: SATA max UDMA/133 cmd 0xB800 ctl 0xB402 bmdma 0xB008 irq 20
ata1: no device found (phy stat 00000000)
scsi0 : sata_via
ata2: no device found (phy stat 00000000)
scsi1 : sata_via
usbcore: registered new driver usbfs
usbcore: registered new driver hub
USB Universal Host Controller Interface driver v2.2
ACPI: PCI interrupt 0000:00:10.0[A] -> GSI 21 (level, low) -> IRQ 21
PCI: Via IRQ fixup for 0000:00:10.0, from 11 to 5
uhci_hcd 0000:00:10.0: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller
uhci_hcd 0000:00:10.0: irq 21, io base 0xc800
uhci_hcd 0000:00:10.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:10.1[A] -> GSI 21 (level, low) -> IRQ 21
PCI: Via IRQ fixup for 0000:00:10.1, from 11 to 5
uhci_hcd 0000:00:10.1: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#2)
uhci_hcd 0000:00:10.1: irq 21, io base 0xd000
uhci_hcd 0000:00:10.1: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:10.2[B] -> GSI 21 (level, low) -> IRQ 21
PCI: Via IRQ fixup for 0000:00:10.2, from 10 to 5
uhci_hcd 0000:00:10.2: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#3)
uhci_hcd 0000:00:10.2: irq 21, io base 0xd400
uhci_hcd 0000:00:10.2: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:10.3[B] -> GSI 21 (level, low) -> IRQ 21
PCI: Via IRQ fixup for 0000:00:10.3, from 10 to 5
uhci_hcd 0000:00:10.3: VIA Technologies, Inc. VT82xxxxx UHCI USB 1.1 Controller (#4)
uhci_hcd 0000:00:10.3: irq 21, io base 0xd800
uhci_hcd 0000:00:10.3: new USB bus registered, assigned bus number 4
hub 4-0:1.0: USB hub found
hub 4-0:1.0: 2 ports detected
ACPI: PCI interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
ehci_hcd 0000:00:10.4: VIA Technologies, Inc. USB 2.0
ehci_hcd 0000:00:10.4: irq 21, pci mem 0xfbc00000
ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 5
ehci_hcd 0000:00:10.4: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
hub 5-0:1.0: USB hub found
hub 5-0:1.0: 8 ports detected
NET: Registered protocol family 17
IRQ#4 thread RT prio: 44.
IRQ#3 thread RT prio: 43.
ACPI: Processor [CPU1] (supports C1)
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (CM) [SLPB]
BUG: semaphore recursion deadlock detected!
.. current task firefox-bin/4761 [f21a3020] is already holding c04306c0 [w:f21a3020, d:1].
[<c01c7408>] __rwsem_deadlock+0x188/0x1a0 (12)
[<c01c72ec>] __rwsem_deadlock+0x6c/0x1a0 (52)
[<c0271c51>] dev_queue_xmit_nit+0x41/0x130 (24)
[<c02cb9ee>] down_write_mutex+0x5e/0x210 (4)
[<c02cba8e>] down_write_mutex+0xfe/0x210 (24)
[<c02cbba8>] down_read_mutex+0x8/0x30 (12)
[<c0271c51>] dev_queue_xmit_nit+0x41/0x130 (40)
[<c01c7abe>] up_write_mutex+0x2e/0x60 (12)
[<c0281403>] qdisc_restart+0x223/0x250 (24)
[<c02721dd>] dev_queue_xmit+0x1ad/0x260 (12)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02721ef>] dev_queue_xmit+0x1bf/0x260 (36)
[<c027852e>] neigh_resolve_output+0xfe/0x240 (32)
[<c029323e>] ip_finish_output2+0xbe/0x240 (56)
[<c027dc05>] nf_hook_slow+0xd5/0x130 (36)
[<c0293180>] ip_finish_output2+0x0/0x240 (28)
[<c0290a4b>] ip_finish_output+0x26b/0x270 (32)
[<c0293180>] ip_finish_output2+0x0/0x240 (24)
[<c029316a>] dst_output+0x1a/0x30 (32)
[<c027dc05>] nf_hook_slow+0xd5/0x130 (12)
[<c0293150>] dst_output+0x0/0x30 (28)
[<c029110a>] ip_queue_xmit+0x45a/0x570 (32)
[<c0293150>] dst_output+0x0/0x30 (24)
[<c0135cb5>] sub_preempt_count+0x65/0xd0 (36)
[<c01c791e>] __up_write+0x10e/0x220 (8)
[<c0135748>] check_preempt_timing+0x58/0x2e0 (8)
[<c0135cb5>] sub_preempt_count+0x65/0xd0 (4)
[<c01c791e>] __up_write+0x10e/0x220 (4)
[<c0134fdd>] __mcount+0x1d/0x20 (28)
[<c01c715e>] rwsem_owner_del+0xe/0xf0 (4)
[<c0134fdd>] __mcount+0x1d/0x20 (40)
[<c02a871e>] tcp_v4_send_check+0xe/0xf0 (4)
[<c02a2219>] tcp_transmit_skb+0x439/0x880 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02a875f>] tcp_v4_send_check+0x4f/0xf0 (20)
[<c02a22c2>] tcp_transmit_skb+0x4e2/0x880 (32)
[<c0114310>] mcount+0x14/0x18 (28)
[<c02a4f56>] tcp_send_ack+0xa6/0xf0 (52)
[<c029feb4>] __tcp_ack_snd_check+0x14/0xa0 (12)
[<c02a0965>] tcp_rcv_established+0x6e5/0x920 (24)
[<c02a9c0e>] tcp_v4_do_rcv+0x13e/0x150 (56)
[<c026b065>] __release_sock+0x55/0x80 (32)
[<c026b98e>] release_sock+0x7e/0x80 (32)
[<c0297c94>] tcp_recvmsg+0x2f4/0x750 (24)
[<c0114310>] mcount+0x14/0x18 (64)
[<c026bb19>] sock_common_recvmsg+0x59/0x70 (20)
[<c0267da1>] sock_recvmsg+0xd1/0xf0 (48)
[<c01c791e>] __up_write+0x10e/0x220 (24)
[<c0134fdd>] __mcount+0x1d/0x20 (44)
[<c01c715e>] rwsem_owner_del+0xe/0xf0 (4)
[<c015f959>] fget+0x59/0x70 (72)
[<c01c7abe>] up_write_mutex+0x2e/0x60 (28)
[<c01344e0>] autoremove_wake_function+0x0/0x60 (24)
[<c026798f>] sockfd_lookup+0x1f/0x80 (28)
[<c0114310>] mcount+0x14/0x18 (4)
[<c0269409>] sys_recvfrom+0x99/0x100 (20)
[<c01426f2>] free_pages_bulk+0x1d2/0x2e0 (48)
[<c01c7abe>] up_write_mutex+0x2e/0x60 (28)
[<c0114310>] mcount+0x14/0x18 (112)
[<c02694ab>] sys_recv+0x3b/0x40 (20)
[<c0269be2>] sys_socketcall+0x152/0x240 (32)
[<c010527b>] syscall_call+0x7/0xb (68)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: down_write_mutex+0x201/0x210 [<c02cbb91>] / (dev_queue_xmit_nit+0x41/0x130 [<c0271c51>])
.. entry 2: print_traces+0x1d/0x90 [<c013600d>] / (dump_stack+0x23/0x30 [<c01060b3>])
BUG: circular semaphore deadlock: ksoftirqd/0/2 is blocked on c04306c0, deadlocking firefox-bin/4761
f7c2be24 00000046 f7c24020 c03bfca0 00000202 00001dd8 f7c2a000 f7c24020
00000000 f7c2be10 000002c5 03c4f81b 00000029 f7c24020 f7c242b4 f7c2a000
f7c24020 00000000 f7c2be48 c02ca7af f7c2be48 00000082 c03e3a40 c02cbb1e
Call Trace:
[<c02ca7af>] schedule+0x2f/0xe0 (80)
[<c02cbb1e>] down_write_mutex+0x18e/0x210 (16)
[<c02cbadd>] down_write_mutex+0x14d/0x210 (20)
[<c02cbba8>] down_read_mutex+0x8/0x30 (12)
[<c027db57>] nf_hook_slow+0x27/0x130 (40)
[<c0134fdd>] __mcount+0x1d/0x20 (24)
[<c028d4ae>] ip_rcv+0xe/0x540 (4)
[<c027270d>] netif_receive_skb+0x12d/0x240 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c028d900>] ip_rcv+0x460/0x540 (20)
[<c028db80>] ip_rcv_finish+0x0/0x300 (24)
[<c0114310>] mcount+0x14/0x18 (8)
[<c027270d>] netif_receive_skb+0x12d/0x240 (28)
[<c0270008>] gnet_stats_copy_basic+0x18/0x80 (20)
[<c0272a3f>] net_rx_action+0x7f/0x1a0 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02728a8>] process_backlog+0x88/0x1a0 (20)
[<c0272a3f>] net_rx_action+0x7f/0x1a0 (40)
[<c0123807>] ___do_softirq+0x87/0xd0 (36)
[<c01238d8>] _do_softirq+0x8/0x30 (8)
[<c0123cc4>] ksoftirqd+0xb4/0x100 (4)
[<c01238f0>] _do_softirq+0x20/0x30 (28)
[<c0123cc4>] ksoftirqd+0xb4/0x100 (8)
[<c0133f2a>] kthread+0xaa/0xb0 (24)
[<c0123c10>] ksoftirqd+0x0/0x100 (20)
[<c0133e80>] kthread+0x0/0xb0 (12)
[<c0103319>] kernel_thread_helper+0x5/0xc (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __schedule+0x4e/0x640 [<c02ca18e>] / (schedule+0x2f/0xe0 [<c02ca7af>])
.. entry 2: __schedule+0xdd/0x640 [<c02ca21d>] / (schedule+0x2f/0xe0 [<c02ca7af>])
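For readers who want to sift traces like the two above, here is a minimal Python sketch (not part of the original report) that pulls the address, symbol and offset out of each "[<addr>] symbol+0xoff/0xlen" line and then lists the lock-related entries. The function names and the prefix list are made up for illustration. Keep in mind that stack dumps of this kind can contain stale entries, so the output is a hint about the call chain, not a verified one.

```python
import re

# Matches lines such as "[<c02cbba8>] down_read_mutex+0x8/0x30 (12)".
FRAME_RE = re.compile(
    r"\[<([0-9a-f]{8})>\]\s+(\S+)\+0x([0-9a-f]+)/0x([0-9a-f]+)"
)

# Hypothetical prefix list: symbols that look like rwsem/mutex operations.
LOCK_PREFIXES = ("down_", "up_", "__down", "__up", "__rwsem")


def parse_frames(trace_text):
    """Return (address, symbol, offset) tuples in the order they were printed."""
    frames = []
    for match in FRAME_RE.finditer(trace_text):
        address = int(match.group(1), 16)
        symbol = match.group(2)
        offset = int(match.group(3), 16)
        frames.append((address, symbol, offset))
    return frames


def lock_calls(frames, prefixes=LOCK_PREFIXES):
    """Return the symbols from `frames` that start with a lock-like prefix."""
    return [symbol for _, symbol, _ in frames if symbol.startswith(prefixes)]
```

Fed the firefox-bin trace, lock_calls() would show down_read_mutex next to dev_queue_xmit_nit's neighbours, which is where the report says the task tries to take c04306c0 again while already holding it.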
* Florian Schmidt <[email protected]> wrote:
> ah, and btw: what does the /proc/sys/kernel/kernel_preemption tunable
> do with PREEMPT_REALTIME enabled?
it's pretty pointless to offer this tunable, agreed. In theory one could
try to turn off involuntary preemption but what's the point?
> mango:~# cat /proc/sys/kernel/kernel_preemption
> 1
>
> [all the other VP tunables are not available anymore]
>
> mango:~# cat /proc/sys/kernel/voluntary_preemption
> cat: /proc/sys/kernel/voluntary_preemption: No such file or directory
> mango:~# cat /proc/sys/kernel/hardirq_preemption
> cat: /proc/sys/kernel/hardirq_preemption: No such file or directory
> mango:~# cat /proc/sys/kernel/softirq_preemption
> cat: /proc/sys/kernel/softirq_preemption: No such file or directory
right - the PREEMPT_REALTIME kernel is only correct if all asynchronous
processing is done in a process context, so i removed those tunables.
Ingo
Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
>
>
>>>Still problems with interactive behavior. Running KDE, with top
>>>running in xterm, scrolling through the menus I get some pauses. When
>>>the pauses occur I see kdeinit hit the top of the list and sometimes
>>>consuming 90% or more of a CPU and idle usage drops to 30-40%. I do
>>>see some latency traces (not really high ones) in the log that were
>>>generated by kdeinit but I think they were generated prior to when
>>>these pauses occurred, most likely when logging in.
>>
>>is this 90% or more CPU time system (kernel) overhead or userspace
>>overhead?
>
It appears that most of the time is being consumed in system (~50% vs.
~6%).
>
> another thing to watch for is the context-switch rate in vmstat. -V0.3
> patches include a hack that includes involuntary context-switches (mutex
> context switches) in the context-switch stat as well. (previously those
> were reported in a separate field which vmstat didn't pick up.)
>
> So if the context-switch rate shoots up to above say 100K/sec that is a
> sure sign of some mutex badness. The livelock scenarios i solved in
> -V0.3 occasionally generated a more than 500K/sec context-switch rate on
> a 2GHz box. Having just a couple of thousand per sec isn't by itself a
> sign of anything unusual.
As for the context-switch rate, I do see it jump up to ~10-11K/sec from
~4-5K/sec, and I see this only when I trigger the pauses.
>
> Ingo
>
kr
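The context-switch check discussed in this exchange can be scripted. The following is a minimal Python sketch, assuming the cumulative "ctxt" counter in /proc/stat is the figure behind vmstat's context-switch column on the kernel in question. The 100K/sec threshold is the rough number quoted above; the function names are invented for illustration.

```python
import time

# "above say 100K/sec" from the mail; a rule of thumb, not a hard limit.
MUTEX_BADNESS_THRESHOLD = 100_000


def read_ctxt(stat_text):
    """Return the cumulative context-switch count from /proc/stat contents."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) >= 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switch_rate(before, after, seconds):
    """Context switches per second between two cumulative readings."""
    if seconds <= 0:
        raise ValueError("interval must be positive")
    return (after - before) / seconds


def looks_like_mutex_badness(rate, threshold=MUTEX_BADNESS_THRESHOLD):
    """True if the rate is above the rule-of-thumb threshold."""
    return rate > threshold


def sample(interval=1.0, path="/proc/stat"):
    """Read the counter twice, `interval` seconds apart (Linux only)."""
    with open(path) as stat_file:
        first = read_ctxt(stat_file.read())
    time.sleep(interval)
    with open(path) as stat_file:
        second = read_ctxt(stat_file.read())
    return switch_rate(first, second, interval)
```

By this yardstick the ~10-11K/sec seen during the pauses is well below the level described as a sure sign of mutex trouble.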
Ingo Molnar wrote:
>
> i have released the -V0.3 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes-only release, but still experimental.
OK. Currently with RT-V0.3.2.
So it seems that jackd -R is no longer an issue here.
I've tested several times now, and can even start more than 7 (yes,
seven!) fluidsynth instances, with respective soundfonts loaded, all
running without problems, apart from CPU usage topping out at 80% (user)
and memory getting close to the point of swapping, which is normal
behavior, or so I believe.
Remember that having just 2 (two) fluidsynth instances was quite enough
for hosing the system in no time, on RT-V0.2.
However (oh no!:) those jackd -R xruns are still frequent, much more
frequent than on RT-U3, which is my stable RT kernel atm.
OK. I'll take this time for some questions:
What's the rationale that you guys are using for tuning the IRQ threading
policies and priorities?
What's the best approach to take, regarding the jackd, soundcard, usb,
keyboard, mouse, whatever IRQ handlers?
I've heard somewhat contradictory opinions elsewhere, but would like to
know what's on the road ahead ;)
As a side note, while I was testing the snd-usb-usx2y ALSA development
module for my Tascam US-224 USB audio/midi control interface, I've found
that the best and most stable results are achieved by raising the ohci_hcd
IRQ handler (normally IRQ 10) to a higher priority than jackd's.
I've been doing just that with e.g. chrt -p -f 60 `pidof "IRQ 10"`.
Failing to do so makes jackd (or its ALSA backend) miss some
deadline and drop out very easily. This has been my conclusion while
testing with RT-U3. Don't know yet what RT-V0.3.2 has in store. I'll do
that later.
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
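The chrt step mentioned in the message above can also be done programmatically. This is a minimal Python sketch, assuming a Linux system where Python's os.sched_setscheduler is available and the caller has the privileges to set real-time priorities. The helper names and the margin of 10 are hypothetical; finding the PID of the "IRQ 10" thread is left to pidof, as in the mail.

```python
import os

FIFO_MIN, FIFO_MAX = 1, 99  # SCHED_FIFO priority range on Linux


def priority_above(other_priority, margin=10):
    """Pick a SCHED_FIFO priority `margin` above another task's, capped at 99."""
    if not FIFO_MIN <= other_priority <= FIFO_MAX:
        raise ValueError("not a valid SCHED_FIFO priority")
    return min(other_priority + margin, FIFO_MAX)


def make_fifo(pid, priority):
    """Roughly what `chrt -p -f <priority> <pid>` does; usually needs root."""
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))
```

With jackd at, say, priority 50, make_fifo(irq_thread_pid, priority_above(50)) would put the IRQ thread at 60, the value used in the chrt example. Both numbers here are only illustrative.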
i have released the -V0.4 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes-only release, but still experimental.
this release should fix more bugs of the 'slowdown' and 'interactivity
problems' variety.
To debug the wakeup anomalies reported i've implemented a new variant of
the latency tracer, which now traces 'wakeup latencies' too - i.e. it
measures and traces maximum delays observed from the point of wakeup to
the point of the task really starting to execute. Only the highest-priority
runnable task in the system is traced at a time, but this should be more
than enough to find the high latency scheduling paths.
This new tracing mode can be enabled by compiling a LATENCY_TRACE kernel
as usual and doing:
echo 4 > /proc/sys/kernel/trace_enabled
then to start tracing, reset the current max latency value via e.g.:
echo 10 > /proc/sys/kernel/preempt_max_latency
then the kernel should signal wakeup latency events in the syslog:
(sshd/3093/CPU#0): new 18 us maximum-latency wakeup.
(sshd/3093/CPU#0): new 19 us maximum-latency wakeup.
(hackbench/3818/CPU#0): new 20 us maximum-latency wakeup.
(hackbench/3762/CPU#0): new 21 us maximum-latency wakeup.
(hackbench/3814/CPU#0): new 22 us maximum-latency wakeup.
(ksoftirqd/0/3/CPU#0): new 35 us maximum-latency wakeup.
the latency trace of the last (and highest) event can always be found in
/proc/latency_trace, as usual. Note that the trace output is a bit
different in the wakeup-tracing case.
NOTE: the tracer works on SMP too, but since on SMP tasks can switch
from one CPU to another a given trace can be less useful if the delay
happened on another CPU.
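To collect the wakeup events from a saved syslog, something like the following Python sketch could be used. It assumes the message format is exactly as in the sample lines above; the function names are made up for illustration. Task names may themselves contain slashes (ksoftirqd/0), so the pattern anchors on the trailing "/pid/CPU#n" part.

```python
import re

# Matches e.g. "(ksoftirqd/0/3/CPU#0): new 35 us maximum-latency wakeup."
WAKEUP_RE = re.compile(
    r"\((?P<comm>.+)/(?P<pid>\d+)/CPU#(?P<cpu>\d+)\): "
    r"new (?P<usec>\d+) us maximum-latency wakeup\."
)


def parse_wakeup_events(log_text):
    """Return (comm, pid, cpu, usec) tuples for every wakeup-latency line."""
    events = []
    for line in log_text.splitlines():
        match = WAKEUP_RE.search(line)
        if match:
            events.append((
                match.group("comm"),
                int(match.group("pid")),
                int(match.group("cpu")),
                int(match.group("usec")),
            ))
    return events


def worst_event(events):
    """Return the event with the largest latency, or None if there are none."""
    if not events:
        return None
    return max(events, key=lambda event: event[3])
```

Since each logged line is a new maximum, the last line and the worst event normally coincide until preempt_max_latency is reset.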
using this wakeup tracer i found and fixed a couple of 'missed
preemption check' bugs - all introduced by PREEMPT_REALTIME in the -U/-V
timeframe. So if you had latency/interactivity problems please re-check
-V0.4.
the wakeup tracer is nice in the sense that it traces actual, real latencies.
Changes since -V0.3.2:
- fixed the rtc_lock related crash reported by K.R. Foley and Robert
Crocombe.
- fixed missing preemption checks in rwsem-generic.c
- fixed missing preemption check in schedule_tail() [==new task wakeup]
- implemented wakeup-latency tracer
to create a -V0.4 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
+ http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.9/2.6.9-mm1/2.6.9-mm1.bz2
+ http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.9-mm1-V0.4
Ingo
* K.R. Foley <[email protected]> wrote:
> Running amlat [...]
btw., to get good 'realfeel' results i had to apply the attached patch.
Especially when running realfeel over the network it can easily happen
that it's delayed for some time and gets out of sync with the RTC. So
after a new maximum latency has triggered the code now waits 10 periods
for the timings to recover.
this does not hurt the latency measurements in any way - latencies that
occur after these 10 ticks (~5 msecs) are over are still fully measured
and reported.
amlat produces weird output for me, continuously increasing latency
values:
latency = 2967939 milliseconds
latency = 2967950 milliseconds
sigint
max jitter = 0 microseconds
maybe some /dev/rtc API detail changed? Or is this the normal output?
Ingo
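The "wait 10 periods after a new maximum" idea from the message above can be sketched as follows. This is not the attached realfeel patch (which is not reproduced in this archive), only a minimal Python illustration of the described behaviour. The 2048 Hz RTC rate is an assumption inferred from "10 ticks (~5 msecs)", and all names are hypothetical.

```python
SETTLE_PERIODS = 10  # periods skipped after a new maximum, per the mail


def track_max_latency(latencies, settle_periods=SETTLE_PERIODS):
    """Return (index, latency) for every new maximum that gets reported.

    After each new maximum the following `settle_periods` samples are
    ignored, giving the measurement loop time to resync with the RTC.
    """
    events = []
    current_max = 0
    skip = 0
    for index, latency in enumerate(latencies):
        if skip > 0:
            skip -= 1
            continue
        if latency > current_max:
            current_max = latency
            events.append((index, latency))
            skip = settle_periods
    return events


def settle_time_ms(periods, rtc_hz):
    """How long `periods` RTC periods last, in milliseconds."""
    return periods * 1000.0 / rtc_hz
```

Samples that arrive during the settle window are dropped, but anything after it is compared against the current maximum as before, which matches the claim that later latencies are still fully measured and reported.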
* Michal Schmidt <[email protected]> wrote:
> OK, re-reporting a network deadlock. It happens a few seconds after
> starting Firefox. This is with -V0.3.2:
i've uploaded -V0.4.1 with a fix that could fix this networking
deadlock. Does it work any better?
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> OK. Currently with RT-V0.3.2.
>
> So it seems that jackd -R is no longer an issue here.
great.
> However (oh no!:) those jackd -R xruns are still frequent, much more
> frequent than on RT-U3, which is my stable RT kernel atm.
-V0.4.1 could help with this problem. There were a number of places
where the PREEMPT_REALTIME kernel missed reschedules, so it could easily
happen that jackd would sit in the runqueue waiting to be executed: the
kernel got out of a critical section quickly, but then 'forgot' to
reschedule for many milliseconds!
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>Running amlat [...]
>
>
> btw., to get good 'realfeel' results i had to apply the attached patch.
> Especially when running realfeel over the network it can easily happen
> that it's delayed for some time and gets out of sync with the RTC. So
> after a new maximum latency has triggered the code now waits 10 periods
> for the timings to recover.
>
> this does not hurt the latency measurements in any way - latencies that
> occur after these 10 ticks (~5 msecs) are over are still fully measured
> and reported.
>
> amlat produces weird output for me, continuously increasing latency
> values:
>
> latency = 2967939 milliseconds
> latency = 2967950 milliseconds
> sigint
> max jitter = 0 microseconds
>
> maybe some /dev/rtc API detail changed? Or is this the normal output?
>
> Ingo
>
Well to produce useful results, amlat requires Andrew's rtc-debug patch
that modifies the rtc driver as well as traps so that timings are kept
when the isr gets run and when the rtc device is read to track
scheduling latencies. Also if this patch were applied the value being
read by amlat from the rtc device would be the last interrupt time
instead of the interrupt info that rtc normally produces. So without the
patch the latency values being spit out are bogus, but it's good enough
to exercise the rtc.
I use the rtc-debug and amlat to generate histograms of latencies which
is what I was trying to do when I found the rtc problem the first time.
I believe that rtc-debug/amlat is much more accurate for generating
histograms of latencies than realfeel is because the instrumentation is
in the kernel rather than a userspace program.
kr
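As an illustration of the histogram approach described above, here is a minimal Python sketch that converts TSC cycle deltas into microseconds and counts them in one-microsecond buckets. The bucket count and the cycles-per-MHz conversion mirror the HISTSIZE and CPU_MHZ definitions in the rtc-debug patch posted later in this thread; how the patch actually fills its histogram is not shown there, so the clamping into the last bucket is an assumption.

```python
HISTSIZE = 10000  # one bucket per microsecond, same size as in the patch


def cycles_to_usecs(cycles, cpu_mhz):
    """Convert a TSC cycle count to whole microseconds (cpu_mhz = cycles/us)."""
    return cycles // cpu_mhz


def record(histogram, latency_usecs):
    """Count one latency sample; overlong values land in the last bucket."""
    bucket = min(latency_usecs, len(histogram) - 1)
    histogram[bucket] += 1


def build_histogram(cycle_deltas, cpu_mhz, size=HISTSIZE):
    """Build a latency histogram from interrupt-to-read cycle deltas."""
    histogram = [0] * size
    for delta in cycle_deltas:
        record(histogram, cycles_to_usecs(delta, cpu_mhz))
    return histogram
```

On a 2403 MHz CPU, for example, a delta of 240300 cycles falls into the 100 us bucket.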
* K.R. Foley <[email protected]> wrote:
> I use the rtc-debug and amlat to generate histograms of latencies
> which is what I was trying to do when I found the rtc problem the
> first time. I believe that rtc-debug/amlat is much more accurate for
> generating histograms of latencies than realfeel is because the
> instrumentation is in the kernel rather than a userspace program.
ah, ok - nice. So rtc-debug+amlat is the only known-reliable way to
produce latency histograms?
Btw., rtc-debug's latency results could now be cross-validated with
-V0.4's wakeup tracer (and vice versa), because the two are totally
independent mechanisms.
Ingo
Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
>
>>I use the rtc-debug and amlat to generate histograms of latencies
>>which is what I was trying to do when I found the rtc problem the
>>first time. I believe that rtc-debug/amlat is much more accurate for
>>generating histograms of latencies than realfeel is because the
>>instrumentation is in the kernel rather than a userspace program.
>
>
> ah, ok - nice. So rtc-debug+amlat is the only known-reliable way to
> produce latency histograms?
Don't know that for sure, but it is the most reliable way that I am
aware of.
>
> Btw., rtc-debug's latency results could now be cross-validated with
> -V0.4's wakeup tracer (and vice versa), because the two are totally
> independent mechanisms.
Agreed. :)
>
> Ingo
>
kr
* Lee Revell <[email protected]> wrote:
> > ah, ok - nice. So rtc-debug+amlat is the only known-reliable way to
> > produce latency histograms?
> >
>
> Yes, I think it is the most reliable way because the measurement is
> done in the kernel. At least, this is what AM's notes say. There are
> any number of ways to generate these with userspace programs (jackd,
> realfeel, etc).
>
> Here is a more up to date version of the rtc-debug patch:
>
> http://lkml.org/lkml/2004/9/9/307
>
> There is still a bit of 2.4 cruft in there but it works well. Maybe
> this could be included in future patches.
the most natural point of inclusion would be Andrew's -mm tree i think
:-)
Ingo
On Wed, 2004-10-27 at 17:05 +0200, Ingo Molnar wrote:
> * K.R. Foley <[email protected]> wrote:
>
> > I use the rtc-debug and amlat to generate histograms of latencies
> > which is what I was trying to do when I found the rtc problem the
> > first time. I believe that rtc-debug/amlat is much more accurate for
> > generating histograms of latencies than realfeel is because the
> > instrumentation is in the kernel rather than a userspace program.
>
> ah, ok - nice. So rtc-debug+amlat is the only known-reliable way to
> produce latency histograms?
>
Yes, I think it is the most reliable way because the measurement is done
in the kernel. At least, this is what AM's notes say. There are any
number of ways to generate these with userspace programs (jackd,
realfeel, etc).
Here is a more up to date version of the rtc-debug patch:
http://lkml.org/lkml/2004/9/9/307
There is still a bit of 2.4 cruft in there but it works well. Maybe
this could be included in future patches.
Lee
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> OK. Currently with RT-V0.3.2.
>>
>> So it seems that jackd -R is no longer an issue here.
>
> great.
>
>> However (oh no!:) those jackd -R xruns are still frequent, much more
>> frequent than on RT-U3, which is my stable RT kernel atm.
>
> -V0.4.1 could help with this problem. There were a number of places
> where the PREEMPT_REALTIME kernel missed reschedules, so it could easily
> happen that jackd would sit in the runqueue waiting to be executed: the
> kernel got out of a critical section quickly, but then 'forgot' to
> reschedule for many milliseconds!
>
On RT-V0.4.1, xruns seem slightly reduced, but still too many for my taste.
Running jackd -R with 6 fluidsynth instances gives me 0 (zero) xruns on
RT-U3, but more than 20 (twenty) on RT-V0.4.1, under a 5 minute time
frame. It was 30 (thirty something) on RT-V0.4, but overall "feel" is
about the same.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
On Wed, 2004-10-27 at 16:26 +0100, Rui Nuno Capela wrote:
> On RT-V0.4.1, xruns seem slightly reduced, but still too many for my taste.
>
Have you tried making ksoftirqd SCHED_OTHER? This drastically reduced
xruns on my system with an earlier version.
Lee
>i have released the -V0.4 Real-Time Preemption patch, which can be
>downloaded from:
I built with this (and not V0.4.1) and had the following results:
[1] No problems with the build.
[2] Booting to single user without problems.
[3] telinit 3 - still have the atomic counter underflow BUG related
to qdisc_destroy. Otherwise, normal boot messages.
[4] telinit 5 - normal boot messages and display came up OK. Able
to login and start testing.
[5] Running my stress test, the first test (X server) appeared to
run OK. The second test (/proc or top) did not run properly. The
RT audio test appeared to take the whole system (both CPUs) and
the terminal window showing top did not appear until the audio
test finished (and was quickly taken down by the script). Could not
move the mouse at all during that test. The third test (network
output) ran a short period and then the machine locked up. Had
to use the hardware reset to recover.
The only message in the system log that was unusual was
nrpe[4151]: Error: Could not complete SSL handshake.
(no messages after this one)
everything before that was OK in the system log.
Looking at the application level charts from the first two
tests (after rebooting with -T3), the measured CPU time was
VERY SMOOTH, almost no blips until the end of the first test
and start of the second test (680 usec and 570 usec respectively).
The audio loop had some (6) spikes, 4 in the first test and 2 in
the second. The longest spike was over 60 msec in duration.
My script that samples latency traces > 200 usec had no
output. Not sure if that is because it didn't run or if there
were no traces to record.
I looked at the updated patch (V0.4.1) but I am not sure it
fixes this lock up problem. Please advise.
--Mark H Johnson
<mailto:[email protected]>
Lee Revell wrote:
> On Wed, 2004-10-27 at 17:17 +0200, Ingo Molnar wrote:
>
>>>Here is a more up to date version of the rtc-debug patch:
>>>
>>>http://lkml.org/lkml/2004/9/9/307
>>>
>>>There is still a bit of 2.4 cruft in there but it works well. Maybe
>>>this could be included in future patches.
>>
>>the most natural point of inclusion would be Andrew's -mm tree i think
>>:-)
>>
>
>
> Well I suspect from looking at the comments :-) that he would not
> include it in its current form due to the way it just checks whether the
> process opening the RTC is called "amlat" and updates the RTC histogram
> if so. I am not sure what a clean way to do this would be, maybe an
> ioctl()?
>
> Anyway I am generating a cleaned up version of the patch against
> 2.6.9-mm1.
>
> Lee
>
Actually if you are cleaning it up anyway, could you fix it to work with
Ingo's changes to rtc.c? If not I will be glad to do it. Up until one of
the last couple of versions of RT PREEMPT it applied cleanly, but I just
tried it and it failed.
kr
On Wed, 2004-10-27 at 12:21 -0500, K.R. Foley wrote:
> >
> > Anyway I am generating a cleaned up version of the patch against
> > 2.6.9-mm1.
> >
> > Lee
> >
>
> Actually if you are cleaning it up anyway, could you fix it to work with
> Ingo's changes to rtc.c? If not I will be glad to do it. Up until one of
> the last couple of versions of RT PREEMPT it applied cleanly, but I just
> tried it and it failed.
Yup, here it is against 2.6.9-mm1-V0.4.1. Not yet tested (building now)
but should work. I took out the show_trace_smp part because that never
worked, I always get "Stack pointer is garbage". So now the patch is
smaller and only touches rtc.c.
--- linux-2.6.9-mm1/drivers/char/rtc.c 2004-10-27 13:19:04.000000000 -0400
+++ linux-2.6.9-mm1-V0.4.1/drivers/char/rtc.c 2004-10-27 12:55:23.000000000 -0400
@@ -86,6 +86,18 @@
#include <asm/hpet.h>
#endif
+static unsigned long long last_interrupt_time;
+
+#include <asm/timex.h>
+
+
+#define CPU_MHZ (cpu_khz / 1000)
+#define HISTSIZE 10000
+static int histogram[HISTSIZE];
+
+int rtc_debug;
+int rtc_running;
+
#ifdef __sparc__
#include <linux/pci.h>
#include <asm/ebus.h>
@@ -191,6 +203,14 @@
static const unsigned char days_in_mo[] =
{0, 31, 28, 31, 30, 31, 30, 31, 31, 30, 31, 30, 31};
+static int rtc_state;
+
+enum rtc_states {
+ S_IDLE, /* Waiting for an interrupt */
+ S_WAITING_FOR_READ, /* Signal delivered. waiting for rtc_read() */
+ S_READ_MISSED, /* Signal delivered, read() deadline missed */
+};
+
/*
* Returns true if a clock update is in progress
*/
@@ -259,7 +279,36 @@
spin_unlock_irqrestore(&rtc_task_lock, flags);
wake_up_interruptible(&rtc_wait);
- kill_fasync (&rtc_async_queue, SIGIO, POLL_IN);
+ if (!(rtc_status & RTC_IS_OPEN))
+ goto tata;
+
+ switch (rtc_state) {
+ case S_IDLE: /* Waiting for an interrupt */
+ rdtscll(last_interrupt_time);
+ kill_fasync (&rtc_async_queue, SIGIO, POLL_IN);
+ rtc_state = S_WAITING_FOR_READ;
+ break;
+ case S_WAITING_FOR_READ: /* Signal has been delivered. waiting for rtc_read() */
+ /*
+ * Well foo. The usermode application didn't schedule and read in time.
+ */
+ rtc_state = S_READ_MISSED;
+ if (strcmp(current->comm, "amlat") != 0) {
+ printk("`%s'[%d] is being piggy. "
+ "need_resched=%d, cpu=%d\n",
+ current->comm, current->pid,
+ need_resched(), smp_processor_id());
+ /* show_trace_smp(); */
+ }
+ break;
+ case S_READ_MISSED: /* Signal has been delivered, read() deadline was missed */
+ /*
+ * Not much we can do here. We're waiting for the usermode
+ * application to read the rtc
+ */
+ break;
+ }
+tata:
return IRQ_HANDLED;
}
@@ -319,8 +368,74 @@
* Now all the various file operations that we export.
*/
-static ssize_t rtc_read(struct file *file, char __user *buf,
- size_t count, loff_t *ppos)
+static ssize_t ll_rtc_read(struct file *file, char *buf,
+ size_t count, loff_t *ppos)
+{
+ ssize_t retval;
+ unsigned long long now;
+
+ rdtscll(now);
+
+ switch (rtc_state) {
+ case S_IDLE: /* Waiting for an interrupt */
+ /*
+ * err... This can't be happening
+ */
+ printk("ll_rtc_read(): called in state S_IDLE!\n");
+ break;
+ case S_WAITING_FOR_READ: /*
+ * Signal has been delivered.
+ * waiting for rtc_read()
+ */
+ /*
+ * Well done
+ */
+ case S_READ_MISSED: /*
+ * Signal has been delivered, read()
+ * deadline was missed
+ */
+ /*
+ * So, you finally got here.
+ */
+ if (last_interrupt_time == 0)
+ printk("ll_rtc_read(): we screwed up. "
+ "last_interrupt_time = 0\n");
+ rtc_state = S_IDLE;
+ {
+ unsigned long long latency = now - last_interrupt_time;
+ unsigned long delta; /* Microseconds */
+
+ delta = latency;
+ delta /= CPU_MHZ;
+ if (delta > 1000 * 1000) {
+ printk("rtc: eek\n");
+ } else {
+ unsigned long slot = delta;
+ if (slot >= HISTSIZE)
+ slot = HISTSIZE - 1;
+ histogram[slot]++;
+ if (delta > 2000)
+ printk("wow! That was a "
+ "%lu millisec bump\n",
+ delta / 1000);
+ }
+ }
+ rtc_state = S_IDLE;
+ break;
+ }
+
+ if (count < sizeof(last_interrupt_time))
+ return -EINVAL;
+
+ retval = -EIO;
+ if (copy_to_user(buf, &last_interrupt_time,
+ sizeof(last_interrupt_time)) == 0)
+ retval = sizeof(last_interrupt_time);
+ return retval;
+}
+
+static ssize_t orig_rtc_read(struct file *file, char *buf,
+ size_t count, loff_t *ppos)
{
#ifndef RTC_IRQ
return -EIO;
@@ -375,6 +490,19 @@
#endif
}
+/*
+ * If anyone reads this, please send me an email describing
+ * the superlative elegance of this conception
+ */
+static ssize_t rtc_read(struct file *file, char *buf,
+ size_t count, loff_t *ppos)
+{
+ if (strcmp(current->comm, "amlat") == 0)
+ return ll_rtc_read(file, buf, count, ppos);
+ else
+ return orig_rtc_read(file, buf, count, ppos);
+}
+
static int rtc_do_ioctl(unsigned int cmd, unsigned long arg, int kernel)
{
struct rtc_time wtime;
@@ -692,6 +820,8 @@
* needed here. Or anywhere else in this driver. */
static int rtc_open(struct inode *inode, struct file *file)
{
+ int i;
+
spin_lock_irq (&rtc_lock);
if(rtc_status & RTC_IS_OPEN)
@@ -699,7 +829,16 @@
rtc_status |= RTC_IS_OPEN;
- rtc_irq_data = 0;
+ if (strcmp(current->comm, "amlat") == 0) {
+ last_interrupt_time = 0;
+ rtc_state = S_IDLE;
+ rtc_irq_data = 0;
+ }
+
+ rtc_running = 1;
+ for (i = 0; i < HISTSIZE; i++)
+ histogram[i] = 0;
+
spin_unlock_irq (&rtc_lock);
return 0;
@@ -753,6 +892,19 @@
rtc_irq_data = 0;
rtc_status &= ~RTC_IS_OPEN;
spin_unlock_irq (&rtc_lock);
+ {
+ int i = 0;
+ unsigned long total = 0;
+ printk("rtc histogram:\n");
+ for (i = 0; i < HISTSIZE; i++) {
+ if (histogram[i]) {
+ total += histogram[i];
+ printk("%d %d\n", i, histogram[i]);
+ }
+ }
+ printk("\nTotal samples: %lu\n", total);
+ rtc_running = 0;
+ }
return 0;
}
@@ -1127,6 +1279,7 @@
wake_up_interruptible(&rtc_wait);
kill_fasync (&rtc_async_queue, SIGIO, POLL_IN);
+ return;
}
#endif
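The core of the patch above is a three-state machine plus a cycles-to-microseconds histogram. A minimal userspace sketch of that logic follows, assuming the cycle counter and CPU clock are passed in by the caller; `struct rtc_model`, `model_interrupt` and `model_read` are hypothetical names for illustration and are not part of the patch. Unlike the patch, the sketch returns -1 when a read arrives in S_IDLE instead of printing a message.

```c
#include <stdint.h>

#define HISTSIZE 10000

enum rtc_states {
    S_IDLE,              /* waiting for an interrupt */
    S_WAITING_FOR_READ,  /* interrupt seen, waiting for the read */
    S_READ_MISSED        /* a second interrupt arrived before the read */
};

/* Hypothetical userspace model of the state kept by the patch. */
struct rtc_model {
    enum rtc_states state;
    uint64_t last_interrupt_time;      /* cycle count at the interrupt */
    unsigned int histogram[HISTSIZE];  /* one slot per microsecond */
};

/* Interrupt side: timestamp only when idle, otherwise flag a missed read. */
void model_interrupt(struct rtc_model *m, uint64_t now_cycles)
{
    switch (m->state) {
    case S_IDLE:
        m->last_interrupt_time = now_cycles;
        m->state = S_WAITING_FOR_READ;
        break;
    case S_WAITING_FOR_READ:
        m->state = S_READ_MISSED;
        break;
    case S_READ_MISSED:
        break; /* nothing to do until the reader shows up */
    }
}

/*
 * Read side: returns the latency in microseconds, or -1 if no interrupt
 * was pending, the clock is unknown, or the sample exceeds one second
 * (the patch discards such samples as well).
 */
long model_read(struct rtc_model *m, uint64_t now_cycles, unsigned int cpu_mhz)
{
    uint64_t usec;
    uint64_t slot;

    if (m->state == S_IDLE || cpu_mhz == 0)
        return -1;
    m->state = S_IDLE;

    /* cycles / (cycles per microsecond) = microseconds */
    usec = (now_cycles - m->last_interrupt_time) / cpu_mhz;
    if (usec > 1000 * 1000)
        return -1;

    /* Clamp anything past the last slot into the final bucket. */
    slot = usec >= HISTSIZE ? HISTSIZE - 1 : usec;
    m->histogram[slot]++;
    return (long)usec;
}
```

Note that every latency of 10 msec or more lands in the last slot, so the final bucket is a catch-all rather than an exact value.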
Ingo Molnar wrote:
> * Michal Schmidt <[email protected]> wrote:
>
>>OK, re-reporting a network deadlock. It happens a few seconds after
>>starting Firefox. This is with -V0.3.2:
>
> i've uploaded -V0.4.1 with a fix that could fix this networking
> deadlock. Does it work any better?
>
> Ingo
Unfortunately, no. It's only slightly different:
BUG: semaphore recursion deadlock detected!
.. current task ksoftirqd/0/2 [f7c24020] is already holding c0438ec0
[w:f7c24020, d:0].
[<c01c7a08>] __rwsem_deadlock+0x188/0x1a0 (12)
[<c01c78ec>] __rwsem_deadlock+0x6c/0x1a0 (52)
[<c0134ac0>] _mutex_lock+0x40/0x50 (28)
[<c02cc153>] down_write_mutex+0x113/0x220 (24)
[<c01c8228>] down_mutex+0x8/0x10 (12)
[<c0134ac0>] _mutex_lock+0x40/0x50 (40)
[<c0281a16>] qdisc_restart+0x236/0x250 (24)
[<c0281d2e>] pfifo_fast_enqueue+0xe/0xc0 (8)
[<c02727dd>] dev_queue_xmit+0x1ad/0x260 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02727ef>] dev_queue_xmit+0x1bf/0x260 (36)
[<c0278b2e>] neigh_resolve_output+0xfe/0x240 (32)
[<c029383e>] ip_finish_output2+0xbe/0x240 (56)
[<c01c80be>] up_write_mutex+0x2e/0x60 (12)
[<c027e211>] nf_hook_slow+0xe1/0x130 (24)
[<c0293780>] ip_finish_output2+0x0/0x240 (28)
[<c029104b>] ip_finish_output+0x26b/0x270 (32)
[<c0293780>] ip_finish_output2+0x0/0x240 (24)
[<c029376a>] dst_output+0x1a/0x30 (32)
[<c027e211>] nf_hook_slow+0xe1/0x130 (12)
[<c0293750>] dst_output+0x0/0x30 (28)
[<c029170a>] ip_queue_xmit+0x45a/0x570 (32)
[<c0293750>] dst_output+0x0/0x30 (24)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (24)
[<c01c7f1e>] __up_write+0x10e/0x220 (12)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (4)
[<c0135958>] check_preempt_timing+0x58/0x2f0 (8)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (4)
[<c01c7f1e>] __up_write+0x10e/0x220 (4)
[<c013514d>] __mcount+0x1d/0x20 (32)
[<c01c775e>] rwsem_owner_del+0xe/0xf0 (4)
[<c013514d>] __mcount+0x1d/0x20 (40)
[<c02a8d1e>] tcp_v4_send_check+0xe/0xf0 (4)
[<c02a2819>] tcp_transmit_skb+0x439/0x880 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02a8d5f>] tcp_v4_send_check+0x4f/0xf0 (20)
[<c02a28c2>] tcp_transmit_skb+0x4e2/0x880 (32)
[<c0114310>] mcount+0x14/0x18 (28)
[<c02a5556>] tcp_send_ack+0xa6/0xf0 (52)
[<c02a0527>] __tcp_ack_snd_check+0x87/0xa0 (12)
[<c02a0f65>] tcp_rcv_established+0x6e5/0x920 (24)
[<c02aa20e>] tcp_v4_do_rcv+0x13e/0x150 (56)
[<c02aa8e0>] tcp_v4_rcv+0x6c0/0x8d0 (32)
[<c02cc11f>] down_write_mutex+0xdf/0x220 (28)
[<c028e09d>] ip_local_deliver_finish+0xbd/0x1a0 (52)
[<c01c80be>] up_write_mutex+0x2e/0x60 (12)
[<c027e211>] nf_hook_slow+0xe1/0x130 (24)
[<c028dfe0>] ip_local_deliver_finish+0x0/0x1a0 (28)
[<c028da76>] ip_local_deliver+0x1f6/0x220 (32)
[<c028dfe0>] ip_local_deliver_finish+0x0/0x1a0 (24)
[<c028e2c9>] ip_rcv_finish+0x149/0x300 (24)
[<c01c80be>] up_write_mutex+0x2e/0x60 (36)
[<c027e211>] nf_hook_slow+0xe1/0x130 (24)
[<c028e180>] ip_rcv_finish+0x0/0x300 (28)
[<c028df00>] ip_rcv+0x460/0x540 (32)
[<c028e180>] ip_rcv_finish+0x0/0x300 (24)
[<c0114310>] mcount+0x14/0x18 (8)
[<c0272d0d>] netif_receive_skb+0x12d/0x240 (28)
[<c0270008>] __scm_send+0x78/0x1e0 (20)
[<c027303f>] net_rx_action+0x7f/0x1a0 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c0272ea8>] process_backlog+0x88/0x1a0 (20)
[<c027303f>] net_rx_action+0x7f/0x1a0 (40)
[<c0123887>] ___do_softirq+0x87/0xd0 (36)
[<c0123958>] _do_softirq+0x8/0x30 (8)
[<c0123d44>] ksoftirqd+0xb4/0x100 (4)
[<c0123970>] _do_softirq+0x20/0x30 (28)
[<c0123d44>] ksoftirqd+0xb4/0x100 (8)
[<c013406a>] kthread+0xaa/0xb0 (24)
[<c0123c90>] ksoftirqd+0x0/0x100 (20)
[<c0133fc0>] kthread+0x0/0xb0 (12)
[<c0103319>] kernel_thread_helper+0x5/0xc (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: down_write_mutex+0x214/0x220 [<c02cc254>] /
(_mutex_lock+0x40/0x50 [<c0134ac0>])
.. entry 2: print_traces+0x1d/0x60 [<c013663d>] / (dump_stack+0x23/0x30
[<c01060b3>])
BUG: circular semaphore deadlock: firefox-bin/4792 is blocked on
c0438ec0, deadlocking ksoftirqd/0/2
f18119bc 00200086 f1fbe810 c03c1ca0 f1810000 0000196c f1810000 f1fbe810
00000000 f18119a8 00000167 4344b736 00000023 f1fbe810 f1fbeaa4
f1810000
f1fbe810 00000000 f18119e0 c02cae1f 00000054 00000008 f18119e0
00200046
Call Trace:
[<c02cae1f>] schedule+0x2f/0xe0 (80)
[<c02cc1a0>] down_write_mutex+0x160/0x220 (36)
[<c02cc268>] down_read_mutex+0x8/0x30 (12)
[<c0272251>] dev_queue_xmit_nit+0x41/0x130 (40)
[<c01c80be>] up_write_mutex+0x2e/0x60 (12)
[<c0281a03>] qdisc_restart+0x223/0x250 (24)
[<c02727dd>] dev_queue_xmit+0x1ad/0x260 (12)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02727ef>] dev_queue_xmit+0x1bf/0x260 (36)
[<c0278b2e>] neigh_resolve_output+0xfe/0x240 (32)
[<c029383e>] ip_finish_output2+0xbe/0x240 (56)
[<c01c80be>] up_write_mutex+0x2e/0x60 (12)
[<c027e211>] nf_hook_slow+0xe1/0x130 (24)
[<c0293780>] ip_finish_output2+0x0/0x240 (28)
[<c029104b>] ip_finish_output+0x26b/0x270 (32)
[<c0293780>] ip_finish_output2+0x0/0x240 (24)
[<c029376a>] dst_output+0x1a/0x30 (32)
[<c027e211>] nf_hook_slow+0xe1/0x130 (12)
[<c0293750>] dst_output+0x0/0x30 (28)
[<c029170a>] ip_queue_xmit+0x45a/0x570 (32)
[<c0293750>] dst_output+0x0/0x30 (24)
[<c02cc11f>] down_write_mutex+0xdf/0x220 (24)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (4)
[<c01c7f1e>] __up_write+0x10e/0x220 (12)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (4)
[<c0135958>] check_preempt_timing+0x58/0x2f0 (8)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (4)
[<c01c7f1e>] __up_write+0x10e/0x220 (4)
[<c013514d>] __mcount+0x1d/0x20 (32)
[<c01c775e>] rwsem_owner_del+0xe/0xf0 (4)
[<c013514d>] __mcount+0x1d/0x20 (36)
[<c02a8d1e>] tcp_v4_send_check+0xe/0xf0 (4)
[<c02a2819>] tcp_transmit_skb+0x439/0x880 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c02a8d5f>] tcp_v4_send_check+0x4f/0xf0 (20)
[<c02a28c2>] tcp_transmit_skb+0x4e2/0x880 (32)
[<c01c9f52>] memcpy+0x12/0x40 (36)
[<c02a37f4>] tcp_write_xmit+0x154/0x2d0 (44)
[<c0296eea>] tcp_sendmsg+0x50a/0x1130 (12)
[<c0114310>] mcount+0x14/0x18 (8)
[<c0296ebe>] tcp_sendmsg+0x4de/0x1130 (32)
[<c0268254>] sock_sendmsg+0xc4/0xe0 (84)
[<c02b9af0>] inet_sendmsg+0x50/0x60 (28)
[<c0268254>] sock_sendmsg+0xc4/0xe0 (28)
[<c0135958>] check_preempt_timing+0x58/0x2f0 (24)
[<c0135ef0>] sub_preempt_count+0x60/0xd0 (4)
[<c013514d>] __mcount+0x1d/0x20 (36)
[<c01c775e>] rwsem_owner_del+0xe/0xf0 (4)
[<c015ff59>] fget+0x59/0x70 (72)
[<c01c80be>] up_write_mutex+0x2e/0x60 (28)
[<c0134620>] autoremove_wake_function+0x0/0x60 (24)
[<c0267f8f>] sockfd_lookup+0x1f/0x80 (28)
[<c0114310>] mcount+0x14/0x18 (4)
[<c026990d>] sys_sendto+0xed/0x110 (20)
[<c01c80be>] up_write_mutex+0x2e/0x60 (72)
[<c0142cf2>] free_pages_bulk+0x1d2/0x2e0 (24)
[<c013514d>] __mcount+0x1d/0x20 (72)
[<c026993b>] sys_send+0xb/0x40 (4)
[<c026a1c3>] sys_socketcall+0x133/0x240 (4)
[<c0114310>] mcount+0x14/0x18 (8)
[<c026996b>] sys_send+0x3b/0x40 (20)
[<c026a1c3>] sys_socketcall+0x133/0x240 (32)
[<c010527b>] syscall_call+0x7/0xb (68)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __schedule+0x4e/0x6b0 [<c02ca78e>] / (schedule+0x2f/0xe0
[<c02cae1f>])
.. entry 2: __schedule+0xdd/0x6b0 [<c02ca81d>] / (schedule+0x2f/0xe0
[<c02cae1f>])
Michal
> Lee Revell
> On Wed, 2004-10-27 at 16:26 +0100, Rui Nuno Capela wrote:
>> On RT-V0.4.1, xruns seem slightly reduced, but plenty enough for my
>> taste.
>>
>
> Have you tried making ksoftirqd SCHED_OTHER? This drastically reduced
> xruns on my system with an earlier version.
>
> Lee
>
Hmm... ksoftirqd/0 (or also ksoftirqd/1 on 2cpu SMP) are already
SCHED_OTHER by default, at least on both of my boxes that are running
RT-V0.4.1 now.
Should I try the other way around? Let's see... 'chrt -p -f 90 `pidof
ksoftirwd/0`',... yes, apparently the xrun rate seems to decrease into
half, but IMHO not conclusive enough, though.
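What that chrt invocation asks the kernel to do can be sketched roughly with the POSIX scheduling calls. This is a minimal sketch, assuming Linux and enough privilege to change another task's policy; `set_fifo_priority` and `fifo_priority_is_valid` are hypothetical helper names, not part of chrt.

```c
#define _POSIX_C_SOURCE 200809L
#include <sched.h>
#include <string.h>
#include <sys/types.h>

/* Check a priority against the range the system allows for SCHED_FIFO. */
int fifo_priority_is_valid(int prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    return lo != -1 && hi != -1 && prio >= lo && prio <= hi;
}

/*
 * Switch the task identified by pid to SCHED_FIFO at the given priority.
 * Returns 0 on success, -1 on failure with errno set by the system call.
 */
int set_fifo_priority(pid_t pid, int prio)
{
    struct sched_param sp;

    memset(&sp, 0, sizeof(sp));
    sp.sched_priority = prio;
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

Calling `set_fifo_priority(pid, 90)` with the pid of ksoftirqd/0 would correspond to the command above; going back to SCHED_OTHER needs a separate call with priority 0.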
CU
--
rncbc aka Rui Nuno Capela
[email protected]
Lee Revell wrote:
> On Wed, 2004-10-27 at 12:21 -0500, K.R. Foley wrote:
>
>>>Anyway I am generating a cleaned up version of the patch against
>>>2.6.9-mm1.
>>>
>>>Lee
>>>
>>
>>Actually if you are cleaning it up anyway, could you fix it to work with
>>Ingo's changes to rtc.c? If not I will be glad to do it. Up until one of
>>the last couple of versions of RT PREEMPT it applied cleanly, but I just
>>tried it and it failed.
>
>
> Yup, here it is against 2.6.9-mm1-V0.4.1. Not yet tested (building now)
> but should work. I took out the show_trace_smp part because that never
> worked, I always get "Stack pointer is garbage". So now the patch is
> smaller and only touches rtc.c.
>
OH! And thanks. :)
kr
Lee Revell wrote:
> On Wed, 2004-10-27 at 12:21 -0500, K.R. Foley wrote:
>
>>>Anyway I am generating a cleaned up version of the patch against
>>>2.6.9-mm1.
>>>
>>>Lee
>>>
>>
>>Actually if you are cleaning it up anyway, could you fix it to work with
>>Ingo's changes to rtc.c? If not I will be glad to do it. Up until one of
>>the last couple of versions of RT PREEMPT it applied cleanly, but I just
>>tried it and it failed.
>
>
> Yup, here it is against 2.6.9-mm1-V0.4.1. Not yet tested (building now)
> but should work. I took out the show_trace_smp part because that never
> worked, I always get "Stack pointer is garbage". So now the patch is
> smaller and only touches rtc.c.
>
You've also eliminated the stack trace altogether then, right? Not that
I really need it. :-)
kr
On Wed, 2004-10-27 at 17:17 +0200, Ingo Molnar wrote:
> > Here is a more up to date version of the rtc-debug patch:
> >
> > http://lkml.org/lkml/2004/9/9/307
> >
> > There is still a bit of 2.4 cruft in there but it works well. Maybe
> > this could be included in future patches.
>
> the most natural point of inclusion would be Andrew's -mm tree i think
> :-)
>
Well I suspect from looking at the comments :-) that he would not
include it in its current form due to the way it just checks whether the
process opening the RTC is called "amlat" and updates the RTC histogram
if so. I am not sure what a clean way to do this would be, maybe an
ioctl()?
Anyway I am generating a cleaned up version of the patch against
2.6.9-mm1.
Lee
On Wed, 2004-10-27 at 12:40 -0500, K.R. Foley wrote:
> Lee Revell wrote:
> > On Wed, 2004-10-27 at 12:21 -0500, K.R. Foley wrote:
> >
> >>>Anyway I am generating a cleaned up version of the patch against
> >>>2.6.9-mm1.
> >>>
> >>>Lee
> >>>
> >>
> >>Actually if you are cleaning it up anyway, could you fix it to work with
> >>Ingo's changes to rtc.c? If not I will be glad to do it. Up until one of
> >>the last couple of versions of RT PREEMPT it applied cleanly, but I just
> >>tried it and it failed.
> >
> >
> > Yup, here it is against 2.6.9-mm1-V0.4.1. Not yet tested (building now)
> > but should work. I took out the show_trace_smp part because that never
> > worked, I always get "Stack pointer is garbage". So now the patch is
> > smaller and only touches rtc.c.
> >
>
> You've also eliminated the stack trace altogether then, right? Not that
> I really need it. :-)
Yes, it's commented out. I figured that a better way would be to have
it trigger Ingo's latency tracer.
Lee
--- Rui Nuno Capela <[email protected]> wrote:
> Should I try the other way around? Let's see... 'chrt -p -f 90 `pidof
> ksoftirwd/0`',... yes, apparently the xrun rate seems to decrease into
> half, but IMHO not conclusive enough, though.
>
'into half' makes me wonder:
did you also 'chrt -p -f 90 `pidof ksoftirwd/1`'?
I guess you meant that with '...'. Just in case :-)
Best,
Karsten
___________________________________________________________
Sent from Yahoo! Mail - now with 100MB of free storage - sign up here: http://mail.yahoo.de
--- Rui Nuno Capela <[email protected]> wrote:
> Should I try the other way around? Let's see... 'chrt -p -f 90 `pidof
> ksoftirwd/0`',... yes, apparently the xrun rate seems to decrease into
> half, but IMHO not conclusive enough, though.
>
'into half' makes me wonder:
did you also 'chrt -p -f 90 `pidof ksoftirwd/1`'?
(I guess you meant that with '...') And isn't it
'ksoftirqd'? Just in case :-)
Best,
Karsten
On Wed, 2004-10-27 at 12:43 -0500, K.R. Foley wrote:
> OH! And thanks. :)
>
Well I tried it and it does not seem to work exactly right. This might
be because I enabled the HPET so the RTC is not getting used. When I
run amlat for a few minutes I get a histogram with only 38 samples.
Does this work for you?
Lee
I'm testing out this patch on a Debian box.
There seems to be a problem with enable_irq in the e100 driver that
makes the network to b0rk.
What information do you need to get something useful out of this?
I saw that others have this problem, so I've got a serial console to
the box, if you want me to do any tests, tell me how.
Regards,
Magnus
Setting up IP spoofing protection: rp_filter.
Configuring network interfaces: ifconfig/924: BUG in enable_irq at
kernel/irq/manage.c:112
[<c01362a0>] enable_irq+0xe4/0x12c (8)
[<d08a44f6>] e100_up+0x119/0x224 [e100] (44)
[<d08a5743>] e100_open+0x2c/0x84 [e100] (44)
[<c0237dda>] dev_open+0x76/0x85 (28)
[<c02394a8>] dev_change_flags+0x5d/0x138 (24)
[<c0237cbd>] dev_load+0x31/0x6c (12)
[<c027915c>] devinet_ioctl+0x5f9/0x6c6 (20)
[<c027b1b8>] inet_ioctl+0xc7/0xd3 (104)
[<c022ea5e>] sock_ioctl+0x19f/0x26a (24)
[<c016c897>] sys_ioctl+0x1e8/0x249 (28)
[<c0113eea>] do_page_fault+0x0/0x5ee (24)
[<c0105bb3>] syscall_call+0x7/0xb (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: enable_irq+0x31/0x12c [<c01361ed>] / (0x0 [<00000000>])
.. entry 2: print_traces+0x14/0x47 [<c0130201>] / (0x0 [<00000000>])
* Rui Nuno Capela <[email protected]> wrote:
> On RT-V0.4.1, xruns seem slightly reduced, but plenty enough for my
> taste.
>
> Running jackd -R with 6 fluidsynth instances gives me 0 (zero) xruns
> on RT-U3, but more than 20 (twenty) on RT-V0.4.1, under a 5 minute
> time frame. It was 30 (thirty something) on RT-V0.4, but overall
> "feel" is about the same.
ok, i've uploaded RT-V0.4.2 which has more of the same: it fixes other
missed preemption checks. Does it make any difference to the xruns on
your UP box?
Ingo
Lee Revell wrote:
> On Wed, 2004-10-27 at 12:43 -0500, K.R. Foley wrote:
>
>>OH! And thanks. :)
>>
>
>
> Well I tried it and it does not seem to work exactly right. This might
> be because I enabled the HPET so the RTC is not getting used. When I
> run amlat for a few minutes I get a histogram with only 38 samples.
> Does this work for you?
>
> Lee
>
>
Sorry it took a while to get back to you. Yes I did try it a little
earlier and did seem to be getting reasonable numbers. I wouldn't want
to publish those numbers yet because I haven't done anything with
priorities and I was seeing some higher numbers. I don't have the HPET
turned on myself.
kr
On Wed, Oct 27, 2004 at 01:54:45PM -0700, Bill Huey wrote:
> It was originally meant to go in between the irq-thread wake attempt and
> the actual running of the thread body itself. Somehow this is breaking
> in my effort to integrate this logic into Ingo's (your) stuff. Brain
> farting severely right now.
Another note, it's not meant to be a high resolution latency stats patch
so much as a way to get a general feel for irq latency in the system. That
information is just useful to have in general, but won't be sufficient
to track down specific problems in the kernel. Extending this code
to track all wake ups is beyond the original intention of these
measurements. A method like this only applies to threads of high system
priority with a near RT scheduling class or similar.
bill
On Wed, Oct 27, 2004 at 05:05:48PM +0200, Ingo Molnar wrote:
>
> * K.R. Foley <[email protected]> wrote:
>
> > I use the rtc-debug and amlat to generate histograms of latencies
> > which is what I was trying to do when I found the rtc problem the
> > first time. I believe that rtc-debug/amlat is much more accurate for
> > generating histograms of latencies than realfeel is because the
> > instrumentation is in the kernel rather than a userspace program.
>
> ah, ok - nice. So rtc-debug+amlat is the only known-reliable way to
> produce latency histograms?
>
> Btw., rtc-debug's latency results could now be cross-validated with
> -V0.4's wakeup tracer (and vice versa), because the two are totally
> independent mechanisms.
I've got a latency/histogram patch here, but I've been having problems
trying to integrate it into Ingo's irq-threads and getting that simple
subtraction returning something sensible. The basic logic works otherwise.
Maybe another pair of eyes can figure this out, since I could have missed
something pretty simple...
bill
Andrew Morton wrote:
> Ingo Molnar <[email protected]> wrote:
>
>>>Here is a more up to date version of the rtc-debug patch:
>>
>> >
>> > http://lkml.org/lkml/2004/9/9/307
>> >
>> > There is still a bit of 2.4 cruft in there but it works well. Maybe
>> > this could be included in future patches.
>>
>> the most natural point of inclusion would be Andrew's -mm tree i think
>> :-)
>
>
> It's 'orrid. And iirc it breaks normal use of the RTC.
>
It definitely changes the output of /dev/rtc if anything uses that.
kr
* Rui Nuno Capela <[email protected]> wrote:
> On RT-V0.4.1, xruns seem slightly reduced, but plenty enough for my
> taste.
>
> Running jackd -R with 6 fluidsynth instances gives me 0 (zero) xruns
> on RT-U3, but more than 20 (twenty) on RT-V0.4.1, under a 5 minute
> time frame. It was 30 (thirty something) on RT-V0.4, but overall
> "feel" is about the same.
does the wakeup tracer show any high latency?
Ingo
karsten wiese wrote:
> Rui Nuno Capela wrote:
>> Should I try the other way around? Let's see... 'chrt -p
>> -f 90 `pidof ksoftirwd/0`',... yes, apparently the xrun
>> rate seems to decrease into half, but IMHO not conclusive
>> enough, though.
>>
> 'into half' makes me wonder:
> did you also 'chrt -p -f 90 `pidof ksoftirwd/1`'?
> I guess you meant that with '...'. Just in case :-)
>
Wonder no more. All my statistical-wise tests were carried out on a UP box (my
laptop), so there's no "ksoftirqd/1" in there, just a single
"ksoftirqd/0".
Speaking of which, I was not taking tests very seriously on my other
SMP/HT box, just because I don't want to rant about it anymore :) Only
recently VP and RT kernels were barely able to boot there, where even
plain vanilla 2.6.9 seems to be snappier and with far fewer xruns than
V0.4.1 or even U3 (either RT or not).
OTOH, on my laptop (P4/UP) I can testify as truth that, at least for
RT-U3, the improvement is real: I don't have a record of such a top
performer, when regarding the zero-xrun, low-latency audio setup
potential. When even compared, it just outperforms by far that old
2.4+preempt+low-latency myth ;)
Unfortunately, this is not what I see on my P4/SMP/HT desktop box. I
cannot tell a lie ;)
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
P.S. Karsten, my US-224 is working real nice on my laptop now (provided
I'm with RT-U3 :) I'm real thankful for all of your work on snd-usb-usx2y.
Cheers.
Hi,
Here is a trivial compile fix:
CC lib/string.o
CC lib/vsprintf.o
AR lib/lib.a
CC arch/i386/lib/bitops.o
CC arch/i386/lib/dec_and_lock.o
CC arch/i386/lib/delay.o
CC arch/i386/lib/memcpy.o
CC arch/i386/lib/usercopy.o
AR arch/i386/lib/lib.a
GEN .version
CHK include/linux/compile.h
UPD include/linux/compile.h
CC init/version.o
LD init/built-in.o
LD .tmp_vmlinux1
kernel/built-in.o(.text+0x8450): In function `cpu_callback':
kernel/fork.c:1421: undefined reference to `takeover_tasklets'
make: *** [.tmp_vmlinux1] Error 1
--- kernel/softirq.c.orig 2004-10-27 23:44:14.160948016 +0200
+++ kernel/softirq.c 2004-10-27 23:44:34.571845088 +0200
@@ -464,7 +464,7 @@
BUG();
}
-static void takeover_tasklets(unsigned int cpu)
+void takeover_tasklets(unsigned int cpu)
{
struct tasklet_struct **i;
Remi
* [email protected] <[email protected]> wrote:
> [5] Running my stress test, the first test (X server) appeared to run
> OK. The second test (/proc or top) did not run properly. The RT audio
> test appeared to take the whole system (both CPU's) and the terminal
> window showing top did not appear until the audio test finished (and
> was quickly taken down by the script). Could not move the mouse at all
> during that test. The third test (network output) ran a short period
> and then the machine locked up. Had to use the hardware reset to
> recover.
the network one could perhaps be related to the network deadlocks
reported by others. Would be nice to turn on RWSEM_DETECT_DEADLOCKS and
to use a serial logging if possible.
does the audio test use a lot of CPU time? In that case it would be
normal for the RT task to 'lock' the system up. In any case it would be
nice to try 0.4.2 because it has more check-preemption fixes affecting
both UP and SMP systems.
Ingo
On Wed, Oct 27, 2004 at 01:49:35PM -0700, Bill Huey wrote:
> I've got a latency/histogram patch here, but I've been having problems
> trying to integrate it into Ingo's irq-threads and getting that simple
> subtraction returning something sensible. The basic logic works otherwise.
>
> Maybe another pair of eyes can figure this out, since I could have missed
> something pretty simple...
It was originally meant to go in between the irq-thread wake attempt and
the actual running of the thread body itself. Somehow this is breaking
in my effort to integrate this logic into Ingo's (your) stuff. Brain
farting severely right now.
bill
* Ingo Molnar <[email protected]> wrote:
> ok, i've uploaded RT-V0.4.2 which has more of the same: it fixes other
> missed preemption checks. Does it make any difference to the xruns on
> your UP box?
uploaded RT-V0.4.3 - there was a thinko in the latency tracer that
caused early boot failures.
Ingo
>the network one could perhaps be related to the network deadlocks
>reported by others. Would be nice to turn on RWSEM_DETECT_DEADLOCKS and
>to use a serial logging if possible.
Would be nice but I don't have serial logging available at this point.
I may be able to set it up in a couple of days though.
>does the audio test use a lot of CPU time? In that case it would be
>normal for the RT task to 'lock' the system up. In any case it would be
>nice to try 0.4.2 because it has more check-preemption fixes affecting
>both UP and SMP systems.
I am aware of slow responses normally during tests. However the audio
test should only use one CPU out of two. The other CPU is busy as well
with a cpu burner (nice 10) but that should leave me CPU cycles
to move the mouse, swap windows, etc. The "lock up" I saw this time
was a lot more severe (no mouse motion for several minutes at a time).
I knew the system was still running since the audio continued to play.
I'll try another build in the morning with whatever your latest is.
--Mark H Johnson
<mailto:[email protected]>
Ingo Molnar wrote:
>
>> ok, i've uploaded RT-V0.4.2 which has more of the same: it fixes other
>> missed preemption checks. Does it make any difference to the xruns on
>> your UP box?
>
> uploaded RT-V0.4.3 - there was a thinko in the latency tracer that
> caused early boot failures.
>
Yes, the xrun rate has decreased, slightly. RT-V0.4.3 is now ranking under
10 per 5 min (~2/min), with jackd -R -r44100 -p128 -n2, fluidsynth x 6.
Better still, but not on par with RT-U3 under the very same conditions.
Cya.
--
rncbc aka Rui Nuno Capela
[email protected]
Ingo Molnar <[email protected]> wrote:
>
> > Here is a more up to date version of the rtc-debug patch:
> >
> > http://lkml.org/lkml/2004/9/9/307
> >
> > There is still a bit of 2.4 cruft in there but it works well. Maybe
> > this could be included in future patches.
>
> the most natural point of inclusion would be Andrew's -mm tree i think
> :-)
It's 'orrid. And iirc it breaks normal use of the RTC.
* Rui Nuno Capela <[email protected]> wrote:
> >> ok, i've uploaded RT-V0.4.2 which has more of the same: it fixes other
> >> missed preemption checks. Does it make any difference to the xruns on
> >> your UP box?
> >
> > uploaded RT-V0.4.3 - there was a thinko in the latency tracer that
> > caused early boot failures.
> >
>
> Yes, the xrun rate has decreased, slightly. RT-V0.4.3 is now ranking
> under 10 per 5 min (~2/min), with jackd -R -r44100 -p128 -n2,
> fluidsynth x 6.
>
> Better still, but not on par with RT-U3 under the very same conditions.
how much idle time do you have in the RT-U3 and in the RT-V0.4 tests,
compared? If it's close to 100% then make sure you have the following
debug options disabled:
# CONFIG_DEBUG_SLAB is not set
# CONFIG_DEBUG_PREEMPT is not set
# CONFIG_DEBUG_SPINLOCK_SLEEP is not set
# CONFIG_PREEMPT_TIMING is not set
# CONFIG_RWSEM_DEADLOCK_DETECT is not set
# CONFIG_FRAME_POINTER is not set
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
RWSEM_DEADLOCK, DEBUG_PREEMPT, PREEMPT_TIMING and LATENCY_TRACE are
especially expensive, so depending on the amount of kernel work done, it
can make or break the total balance of CPU time used and you could get
xruns not only due to kernel latencies but purely due to having not
enough CPU time to generate audio output. (fluidsynth is a software
audio generator?)
Ingo
* Magnus Naeslund(t) <[email protected]> wrote:
> What information do you need to get something useful out of this? I
> saw that others have this problem, so I've got a serial console to
> the box, if you want me to do any tests, tell me how.
also, if you hit problems make sure you have the latest patch, i
sometimes upload a small update with a trivial fix without announcing it
(or the announcement lags on lkml), and i upload larger changes roughly
daily. The current latest version is RT-V0.4.3.
Ingo
* Magnus Naeslund(t) <[email protected]> wrote:
> I'm testing out this patch on a Debian box. There seems to be a
> problem with enable_irq in the e100 driver that makes the network to
> b0rk.
this e100 driver warning seems mostly harmless - i get it too and the
device works just fine.
> What information do you need to get something useful out of this? I
> saw that others have this problem, so I've got a serial console to
> the box, if you want me to do any tests, tell me how.
even just using it and reporting any potential breakages you get during
bootup or normal use would be very useful. I'd suggest to initially
enable all the relevant debugging options:
CONFIG_DEBUG_PREEMPT=y
CONFIG_PREEMPT_TIMING=y
CONFIG_PREEMPT_TRACE=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_RWSEM_DEADLOCK_DETECT=y
CONFIG_RWSEM_MAX_OWNERS=32
CONFIG_DEBUG_INFO=y
CONFIG_EARLY_PRINTK=y
this will slow the kernel down but in case of problems there is a much
higher chance of getting a useful assert on the serial console and not
some silent lockup.
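A minimal sketch for checking a kernel .config against the option list above before booting a test kernel. The option names and values are taken from the list; the helper names (`parse_config`, `missing_options`) are made up for illustration and are not part of any kernel tooling.

```python
# Debug options suggested in the message above, with the suggested values.
SUGGESTED = {
    "CONFIG_DEBUG_PREEMPT": "y",
    "CONFIG_PREEMPT_TIMING": "y",
    "CONFIG_PREEMPT_TRACE": "y",
    "CONFIG_LATENCY_TRACE": "y",
    "CONFIG_MCOUNT": "y",
    "CONFIG_RWSEM_DEADLOCK_DETECT": "y",
    "CONFIG_RWSEM_MAX_OWNERS": "32",
    "CONFIG_DEBUG_INFO": "y",
    "CONFIG_EARLY_PRINTK": "y",
}


def parse_config(text):
    """Return {option: value} for every 'CONFIG_X=value' line.

    Comment lines such as '# CONFIG_X is not set' are skipped, so an
    unset option is simply absent from the result.
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if sep:
            options[name] = value
    return options


def missing_options(text, wanted=SUGGESTED):
    """List the suggested options that are unset or set to another value."""
    have = parse_config(text)
    return sorted(name for name, value in wanted.items()
                  if have.get(name) != value)
```

Feeding it the contents of the build tree's .config, e.g. `missing_options(open(".config").read())`, returns the names that still need to be switched on.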
if things look good for normal use then a somewhat more advanced type of testing
would be to enable the wakeup-latency tracer:
echo 4 > /proc/sys/kernel/tracing_enabled
echo 5 > /proc/sys/kernel/preempt_max_latency
this will measure the highest wakeup latency of highprio tasks, starting
at 5 microseconds. You'll get a short 1-line notification of the latest
latency in the syslog, and the latest trace will always be available in
/proc/latency_trace. Depending on the speed of the system, 'larger'
latencies should be reported to me. 'larger' means more than 20 usecs on
a 2 GHz box or more than 40 usecs on a 1 GHz box. (Don't worry about
reporting duplicates, i can skip them quickly.)
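As a rough illustration of the reporting rule above: the message only gives two data points (20 usecs at 2 GHz, 40 usecs at 1 GHz). The sketch below assumes, and this is only an assumption, that the threshold scales inversely with clock speed in between. The function names are hypothetical.

```python
def report_threshold_us(cpu_ghz):
    """Latency in microseconds above which a trace is worth reporting.

    Assumption: the two points from the message (20 us at 2 GHz,
    40 us at 1 GHz) are generalized as threshold = 40 / GHz.
    """
    if cpu_ghz <= 0:
        raise ValueError("cpu_ghz must be positive")
    return 40.0 / cpu_ghz


def worth_reporting(latency_us, cpu_ghz):
    """True if a measured wakeup latency exceeds the threshold."""
    return latency_us > report_threshold_us(cpu_ghz)
```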
then if things are still looking good (i'm not betting on it though :-),
you could try various stresstests, running LTP's:
./runalltests.sh -x 20
and things like that. If you have a stable system then it would also be
nice to try to trigger as high wakeup-latencies as possible. E.g. run a
couple of thousand tasks on it, or try to start as many mozilla
instances, or make it go into heavy swapping. I.e. if the core is ok,
explore the edges a bit. The kernel is supposed to offer very low
(wakeup-) latencies no matter what the load. [this doesn't mean your
system will necessarily feel fast - if it's running lots of tasks, even
if the highest prio one is woken up quickly, then it's gonna be slow.]
Ingo
* [email protected] <[email protected]> wrote:
> I am aware of slow responses normally during tests. However the audio
> test should only use one CPU out of two. The other CPU is busy as well
> with a cpu burner (nice 10) but that should leave me CPU cycles to
> move the mouse, swap windows, etc. The "lock up" I saw this time was
> a lot more severe (no mouse motion for several minutes at a time). I
> knew the system was still running since the audio continued to play.
the way i typically debug such scenarios is to set up a separate
'highprio console' of some sorts. E.g. log in via the network from
another box and make sure all processing of that console is SCHED_FIFO.
(in the network case that would mean the network IRQ, ksoftirqd, sshd,
login and bash.) Such a 'highprio console' should have a higher priority
than any other task in the system. If you run a lot of stuff in it then
it will surely disturb your measurements (and largely invalidate them),
but otherwise it can be useful to have it around just in case you
experience a lockup that you suspect to be some sort of livelock or
starvation. Whenever the lockup happens, check out what's going on, via
the highprio console.
another useful 'highprio console' is the text console itself - in this
case only login and bash have to be chrt-ed, and the runtime impact on
the test is smaller as well. This is only useful if you can start your
'bad' workload over ssh or via networked X.
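A small sketch of chrt-ing the processes behind such a console. It assumes the util-linux `chrt` tool is installed, that you have already collected the PIDs (of login, bash, sshd and so on) yourself, and that it runs as root; the helper names are made up for illustration.

```python
import subprocess


def chrt_commands(pids, priority=99):
    """Build one 'chrt -f -p PRIORITY PID' command line per PID.

    -f selects SCHED_FIFO, -p operates on an existing process.
    """
    if not 1 <= priority <= 99:
        raise ValueError("SCHED_FIFO priorities range from 1 to 99")
    return [["chrt", "-f", "-p", str(priority), str(pid)] for pid in pids]


def make_highprio(pids, priority=99):
    """Run the commands; raises CalledProcessError if chrt fails."""
    for command in chrt_commands(pids, priority):
        subprocess.run(command, check=True)
```

Keeping the command construction separate from the execution makes it easy to print the commands first and check them before changing any priorities.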
Ingo
* Bill Huey <[email protected]> wrote:
> On Wed, Oct 27, 2004 at 01:54:45PM -0700, Bill Huey wrote:
> > It was originally meant to go in between the irq-thread wake attempt and
> > the actual running of the thread body itself. Somehow this is breaking
> > in my effort to integrate this logic into Ingo's (your) stuff. Brain
> > farting severely right now.
>
> Another note, it's not meant to be a high resolution latency stats
> patch as much as giving a general feel of irq latency in the system.
> That information is just useful to have in general, but won't be
> sufficient enough to track down specific problems in the kernel.
> Extending this code to track all wake ups is beyond the original
> intention of these measurements. [...]
yeah, the wakeup-tracing we have now is mostly for debugging, it doesn't
generate a histogram at the moment. But you could extend it to be that -
the trace_start_sched_wakeup() and trace_stop_sched_switched() hooks in
the latest patch are precisely that. Note the magic done there, it is
important to establish that only the _highest prio_ task's wakeup
latency counts, and the tests in those functions do just that.
(otherwise we'd have to do nested latency tracing which is quite
elaborate. I've done it before and it's neither fun, nor truly useful.)
So i think you can put a histogram generator into check_wakeup_timing()
(without needing any outside code), the 'latency' variable established
there is precisely the latency you want to track.
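To make the histogram idea concrete, here is a userspace sketch of the bookkeeping only. It is not kernel code and not taken from the patch: the class name, the fixed bucket width and the bucket count are all assumptions. Each recorded value stands for the 'latency' computed in check_wakeup_timing().

```python
class LatencyHistogram:
    """Fixed-width histogram of wakeup latencies in microseconds."""

    def __init__(self, bucket_us=5, buckets=20):
        self.bucket_us = bucket_us
        self.counts = [0] * buckets
        self.overflow = 0      # samples beyond the last bucket
        self.max_us = 0        # largest latency seen so far

    def record(self, latency_us):
        """Account one latency sample (a non-negative integer)."""
        if latency_us < 0:
            raise ValueError("latency cannot be negative")
        index = latency_us // self.bucket_us
        if index < len(self.counts):
            self.counts[index] += 1
        else:
            self.overflow += 1
        self.max_us = max(self.max_us, latency_us)

    def rows(self):
        """Return (low_us, high_us, count) for every non-empty bucket."""
        return [(i * self.bucket_us, (i + 1) * self.bucket_us - 1, count)
                for i, count in enumerate(self.counts) if count]
```

An in-kernel version would need a preallocated array and no allocation in the hot path, which is why the sketch uses a fixed number of buckets plus an overflow counter.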
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> >> ok, i've uploaded RT-V0.4.2 which has more of the same: it fixes
>> >> other missed preemption checks. Does it make any difference to the
>> >> xruns on your UP box?
>> >
>> > uploaded RT-V0.4.3 - there was a thinko in the latency tracer that
>> > caused early boot failures.
>> >
>>
>> Yes, the xrun rate has decreased, slightly. RT-V0.4.3 is now ranking
>> under 10 per 5 min (~2/min), with jackd -R -r44100 -p128 -n2,
>> fluidsynth x 6.
>>
>> Better still, but not on par with RT-U3, under the very same conditions.
>
> how much idle time do you have in the RT-U3 and in the RT-V0.4 tests,
> compared? If it's close to 100% then make sure you have the following
> debug options disabled:
>
> # CONFIG_DEBUG_SLAB is not set
> # CONFIG_DEBUG_PREEMPT is not set
> # CONFIG_DEBUG_SPINLOCK_SLEEP is not set
> # CONFIG_PREEMPT_TIMING is not set
> # CONFIG_RWSEM_DEADLOCK_DETECT is not set
> # CONFIG_FRAME_POINTER is not set
> # CONFIG_DEBUG_STACKOVERFLOW is not set
> # CONFIG_DEBUG_STACK_USAGE is not set
> # CONFIG_DEBUG_PAGEALLOC is not set
>
> RWSEM_DEADLOCK, DEBUG_PREEMPT, PREEMPT_TIMING and LATENCY_TRACE are
> especially expensive, so depending on the amount of kernel work done, it
> can make or break the total balance of CPU time used and you could get
> xruns not only due to kernel latencies but purely due to having not
> enough CPU time to generate audio output. (fluidsynth is a software
> audio generator?)
>
As far as nmeter can tell, this is the rough cpu usage pattern between RT-U3
and RT-V0.4.3, during my jackd + 6*fluidsynth "benchmark" tests:
cpu usage                     RT-U3.0     RT-V0.4.3
----------------------------  ----------  ---------
system (kernel)               <10%        10%
user                          30%         60%
----------------------------  ----------  ---------
total                         <40%        70%
The following table compares the state between my RT-U3 and RT-V0.4.3
configurations, regarding only the mentioned options:
option                        RT-U3.0     RT-V0.4.3
----------------------------  ----------  ---------
CONFIG_DEBUG_SLAB             n           n
CONFIG_DEBUG_PREEMPT          y           y
CONFIG_DEBUG_SPINLOCK_SLEEP   n           -
CONFIG_PREEMPT_TIMING         n           n
CONFIG_RWSEM_DEADLOCK_DETECT  -           y
CONFIG_FRAME_POINTER          y           y
CONFIG_DEBUG_STACKOVERFLOW    y           y
CONFIG_DEBUG_STACK_USAGE      n           n
CONFIG_DEBUG_PAGEALLOC        n           n
(dash "-" means that the option is not available in the config).
As you can see, it can only be CONFIG_RWSEM_DEADLOCK_DETECT, new in
RT-V0.4.3, that is probably affecting the results there. I'll try to rebuild
and test all over without it, and see if it gets any better.
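The comparison in the table can also be done mechanically. The sketch below hard-codes the values from the table ("-" meaning the option does not exist in that config) and lists the options whose state differs; the function name is made up for illustration.

```python
# Option states copied from the table; "-" means "not available".
RT_U3 = {
    "CONFIG_DEBUG_SLAB": "n",
    "CONFIG_DEBUG_PREEMPT": "y",
    "CONFIG_DEBUG_SPINLOCK_SLEEP": "n",
    "CONFIG_PREEMPT_TIMING": "n",
    "CONFIG_RWSEM_DEADLOCK_DETECT": "-",
    "CONFIG_FRAME_POINTER": "y",
    "CONFIG_DEBUG_STACKOVERFLOW": "y",
    "CONFIG_DEBUG_STACK_USAGE": "n",
    "CONFIG_DEBUG_PAGEALLOC": "n",
}
RT_V043 = dict(RT_U3,
               CONFIG_DEBUG_SPINLOCK_SLEEP="-",
               CONFIG_RWSEM_DEADLOCK_DETECT="y")


def differing_options(a, b):
    """Return {option: (state_in_a, state_in_b)} where the states differ."""
    names = sorted(set(a) | set(b))
    return {name: (a.get(name, "-"), b.get(name, "-"))
            for name in names
            if a.get(name, "-") != b.get(name, "-")}
```

For the two configurations above this yields two entries, and only CONFIG_RWSEM_DEADLOCK_DETECT goes from unavailable to enabled.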
BBL
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> The following table compares the state between my RT-U3 and RT-V0.4.3
> configurations, regarding only the mentioned options:
>
> option                        RT-U3.0     RT-V0.4.3
> ----------------------------  ----------  ---------
> CONFIG_DEBUG_SLAB             n           n
> CONFIG_DEBUG_PREEMPT          y           y
> CONFIG_DEBUG_SPINLOCK_SLEEP   n           -
> CONFIG_PREEMPT_TIMING         n           n
> CONFIG_RWSEM_DEADLOCK_DETECT  -           y
> CONFIG_FRAME_POINTER          y           y
> CONFIG_DEBUG_STACKOVERFLOW    y           y
> CONFIG_DEBUG_STACK_USAGE      n           n
> CONFIG_DEBUG_PAGEALLOC        n           n
>
> (dash "-" means that the option is not available in the config).
>
> As you can see, it can only be CONFIG_RWSEM_DEADLOCK_DETECT, new in
> RT-V0.4.3, that is probably affecting the results there. I'll try to
> rebuild and test all over without it, and see if it gets any better.
note that DEBUG_PREEMPT got more expensive in the -V kernels. I'd
suggest to disable all the 'y' ones in both the -U and -V kernel and
compare them then.
but especially the userspace overhead seems to be significantly higher
in the -V kernel so i'm not quite sure it can all be attributed to
debugging overhead. We'll see.
also, how does the context-switching rate compare between the two tests?
This test is pretty steady when it's running, so the context-switch
rates can be directly compared, correct?
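One way to get a comparable context-switch rate is to sample the cumulative `ctxt` counter in /proc/stat twice and divide by the interval. This is only a sketch; the function names and the default 5-second interval are arbitrary choices, not something prescribed in this thread.

```python
import time


def parse_ctxt(stat_text):
    """Extract the cumulative context-switch count from /proc/stat text."""
    for line in stat_text.splitlines():
        fields = line.split()
        if len(fields) == 2 and fields[0] == "ctxt":
            return int(fields[1])
    raise ValueError("no 'ctxt' line found")


def switches_per_second(before, after, seconds):
    """Average context-switch rate between two counter samples."""
    return (after - before) / seconds


def sample_rate(interval=5.0, path="/proc/stat"):
    """Measure the system-wide context-switch rate over `interval` seconds."""
    def read():
        with open(path) as stat_file:
            return parse_ctxt(stat_file.read())

    first = read()
    time.sleep(interval)
    return switches_per_second(first, read(), interval)
```

Running `sample_rate()` during the steady phase of the test on each kernel gives two numbers that can be compared directly.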
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> The following table compares the state between my RT-U3 and RT-V0.4.3
>> configurations, regarding only the mentioned options:
>>
>> option                        RT-U3.0     RT-V0.4.3
>> ----------------------------  ----------  ---------
>> CONFIG_DEBUG_SLAB             n           n
>> CONFIG_DEBUG_PREEMPT          y           y
>> CONFIG_DEBUG_SPINLOCK_SLEEP   n           -
>> CONFIG_PREEMPT_TIMING         n           n
>> CONFIG_RWSEM_DEADLOCK_DETECT  -           y
>> CONFIG_FRAME_POINTER          y           y
>> CONFIG_DEBUG_STACKOVERFLOW    y           y
>> CONFIG_DEBUG_STACK_USAGE      n           n
>> CONFIG_DEBUG_PAGEALLOC        n           n
>>
>> (dash "-" means that the option is not available in the config).
>>
>> As you can see, it can only be CONFIG_RWSEM_DEADLOCK_DETECT, new in
>> RT-V0.4.3, that is probably affecting the results there. I'll try to
>> rebuild and test all over without it, and see if it gets any better.
>
> note that DEBUG_PREEMPT got more expensive in the -V kernels. I'd
> suggest to disable all the 'y' ones in both the -U and -V kernel and
> compare them then.
>
> but especially the userspace overhead seems to be significantly higher
> in the -V kernel so i'm not quite sure it can all be attributed to
> debugging overhead. We'll see.
>
> also, how does the context-switching rate compare between the two tests?
> This test is pretty steady when it's running, so the context-switch
> rates can be directly compared, correct?
>
OK. That was it. After switching off CONFIG_RWSEM_DEADLOCK_DETECT on
RT-V0.4.3, I can say that it's now on par with RT-U3.
Later today, I will conduct some extended testing, where I'll be able to
compare the jackd performance between vanilla, RT-U3 and RT-V0.4.3, on my
UP laptop. All kernel configurations will be stripped of all the
debug options.
I will take note of xrun rate, jackd scheduling delay histogram, and cpu
usage. Context switch rate will also be recorded.
Anything else?
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> OK. That was it. After switching off CONFIG_RWSEM_DEADLOCK_DETECT on
> RT-V0.4.3, I can say that it's now on par with RT-U3.
great!
> Later today, I will conduct some extended testing, where I'll be able
> to compare the jackd performance between vanilla, RT-U3 and RT-V0.4.3,
> on my UP laptop. All kernel configurations will be stripped of
> all the debug options.
>
> I will take note of xrun rate, jackd scheduling delay histogram, and
> cpu usage. Context switch rate will also be recorded.
>
> Anything else?
yeah, that's good enough. I'd still suggest to first test new kernels
with all the debug options on, to make sure it's stable. For performance
comparisons turn all the debug options off.
i'd also suggest to turn the NMI watchdog off (if enabled), that can
inject a 10-20 usec latency into any critical path. For the absolute
lowest latencies i'd also suggest to turn off all the APIC options
(possible in a UP kernel only, and only if the XT-PIC setup doesn't cause
unacceptable IRQ-line sharing), the IO-APIC mask handling is a bit
expensive compared to the XT-PIC.
If you find (or suspect) larger latencies anywhere then PREEMPT_TIMING +
LATENCY_TRACE + preempt_enable=4 is the preferred variant to use. (right
now it's not possible to do wakeup-timing without LATENCY_TRACE, i'll
fix that.)
Ingo
Ingo Molnar wrote:
> * Magnus Naeslund(t) <[email protected]> wrote:
>
>
>>I'm testing out this patch on a Debian box. There seems to be a
>>problem with enable_irq in the e100 driver that makes the network to
>>b0rk.
>
>
> this e100 driver warning seems mostly harmless - i get it too and the
> device works just fine.
>
>
Well, this isn't my experience.
I can't even ping another box on the same net with e100 or eepro100
driver. I'm running a UP p4 2.4 ghz box with 2.6.9-mm1-RT-V0.4.3.
After letting it ping a while, this turns up:
NETDEV WATCHDOG: eth0: transmit timed out
ksoftirqd/0/2: BUG in enable_irq at kernel/irq/manage.c:112
[<c01384d9>] enable_irq+0xe1/0x122 (12)
[<d08055ab>] e100_up+0x112/0x211 [e100] (48)
[<c0247e37>] dev_watchdog+0x0/0xb6 (36)
[<c0247eeb>] dev_watchdog+0xb4/0xb6 (12)
[<c0124391>] run_timer_softirq+0x1c9/0x48b (20)
[<c0111ac0>] mcount+0x14/0x18 (32)
[<c012018e>] ___do_softirq+0x4e/0xd6 (20)
[<c01202a0>] _do_softirq+0x8/0x22 (8)
[<c012066a>] ksoftirqd+0xa5/0xeb (4)
[<c01202b8>] _do_softirq+0x20/0x22 (28)
[<c012066a>] ksoftirqd+0xa5/0xeb (8)
[<c013015e>] kthread+0xa1/0xce (28)
[<c01205c5>] ksoftirqd+0x0/0xeb (20)
[<c01300bd>] kthread+0x0/0xce (12)
[<c0104195>] kernel_thread_helper+0x5/0xb (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: enable_irq+0x33/0x122 [<c013842b>] / (e100_up+0x112/0x211
[e100] [<d08055ab>])
.. entry 2: print_traces+0x1b/0x55 [<c013243e>] / (dump_stack+0x23/0x25
[<c0106eb8>])
Other latency traces are:
(ksoftirqd/0/2/CPU#0): new 22 us maximum-latency critical section.
=> started at timestamp 116357664: <call_console_drivers+0x8c/0x163>
=> ended at timestamp 116357686: <__sched_text_start+0x30e/0x6c1>
[<c0131d71>] sub_preempt_count+0x62/0xc5 (4)
[<c01319e9>] check_preempt_timing+0x20b/0x2da (8)
[<c0289ece>] __sched_text_start+0x30e/0x6c1 (8)
[<c0106a10>] common_interrupt+0x18/0x20 (24)
[<c013007b>] kthread_exit_files+0x14/0x56 (32)
[<c0131d71>] sub_preempt_count+0x62/0xc5 (20)
[<c0289ece>] __sched_text_start+0x30e/0x6c1 (8)
[<c0289ece>] __sched_text_start+0x30e/0x6c1 (20)
[<c028a2aa>] schedule+0x29/0xd1 (84)
[<c0111ac0>] mcount+0x14/0x18 (8)
[<c01206ae>] ksoftirqd+0xe9/0xeb (28)
[<c013015e>] kthread+0xa1/0xce (28)
[<c01205c5>] ksoftirqd+0x0/0xeb (20)
[<c01300bd>] kthread+0x0/0xce (12)
[<c0104195>] kernel_thread_helper+0x5/0xb (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x4e/0x6c1 [<c0289c0e>] /
(schedule+0x29/0xd1 [<c028a2aa>])
.. entry 2: print_traces+0x1b/0x55 [<c013243e>] / (dump_stack+0x23/0x25
[<c0106eb8>])
=> dump-end timestamp 116461059
(bash/1156/CPU#0): new 16 us maximum-latency critical section.
=> started at timestamp 227980588: <call_console_drivers+0x8c/0x163>
=> ended at timestamp 227980604: <irq_exit+0x3c/0x45>
[<c0131d71>] sub_preempt_count+0x62/0xc5 (4)
[<c01319e9>] check_preempt_timing+0x20b/0x2da (8)
[<c01380a1>] irq_exit+0x3c/0x45 (8)
[<c0115f81>] try_to_wake_up+0x135/0x137 (28)
[<c0131d71>] sub_preempt_count+0x62/0xc5 (48)
[<c01380a1>] irq_exit+0x3c/0x45 (8)
[<c01380a1>] irq_exit+0x3c/0x45 (20)
[<c0108784>] do_IRQ+0x5c/0x81 (12)
[<c0106a10>] common_interrupt+0x18/0x20 (20)
[<c0114ab3>] do_page_fault+0x8e/0x5e7 (44)
[<c0131d71>] sub_preempt_count+0x62/0xc5 (72)
[<c013183a>] check_preempt_timing+0x5c/0x2da (16)
[<c0131d71>] sub_preempt_count+0x62/0xc5 (4)
[<c01163ff>] schedule_tail+0x37/0x8f (4)
[<c0114a25>] do_page_fault+0x0/0x5e7 (36)
[<c0106acd>] error_code+0x2d/0x38 (8)
[<c011007b>] set_fixed_ranges+0x62/0xd1 (40)
[<c0116447>] schedule_tail+0x7f/0x8f (12)
[<c0114a25>] do_page_fault+0x0/0x5e7 (20)
[<c0106acd>] error_code+0x2d/0x38 (8)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __do_IRQ+0x104/0x196 [<c01382a6>] / (do_IRQ+0x57/0x81
[<c010877f>])
.. entry 2: print_traces+0x1b/0x55 [<c013243e>] / (dump_stack+0x23/0x25
[<c0106eb8>])
=> dump-end timestamp 228101445
(ksoftirqd/0/2/CPU#0): new 18 us maximum-latency critical section.
=> started at timestamp 228104930: <call_console_drivers+0x8c/0x163>
=> ended at timestamp 228104948: <__sched_text_start+0x30e/0x6c1>
[<c0131d71>] sub_preempt_count+0x62/0xc5 (4)
[<c01319e9>] check_preempt_timing+0x20b/0x2da (8)
[<c0289ece>] __sched_text_start+0x30e/0x6c1 (8)
[<c0106a10>] common_interrupt+0x18/0x20 (24)
[<c013007b>] kthread_exit_files+0x14/0x56 (32)
[<c0131d71>] sub_preempt_count+0x62/0xc5 (20)
[<c0289ece>] __sched_text_start+0x30e/0x6c1 (8)
[<c0289ece>] __sched_text_start+0x30e/0x6c1 (20)
[<c028a2aa>] schedule+0x29/0xd1 (84)
[<c0111ac0>] mcount+0x14/0x18 (8)
[<c01206ae>] ksoftirqd+0xe9/0xeb (28)
[<c013015e>] kthread+0xa1/0xce (28)
[<c01205c5>] ksoftirqd+0x0/0xeb (20)
[<c01300bd>] kthread+0x0/0xce (12)
[<c0104195>] kernel_thread_helper+0x5/0xb (16)
preempt count: 00000002
. 2-level deep critical section nesting:
.. entry 1: __sched_text_start+0x4e/0x6c1 [<c0289c0e>] /
(schedule+0x29/0xd1 [<c028a2aa>])
.. entry 2: print_traces+0x1b/0x55 [<c013243e>] / (dump_stack+0x23/0x25
[<c0106eb8>])
=> dump-end timestamp 228208260
* Michal Schmidt <[email protected]> wrote:
> > i've uploaded -V0.4.1 with a fix that could fix this networking
> > deadlock. Does it work any better?
>
> Unfortunately, no. It's only slightly different:
ok. I've uploaded -RT-V0.5 which includes a different approach to
solving these netfilter related deadlocks. It can be downloaded from the
usual place:
http://redhat.com/~mingo/realtime-preempt/
there's a fair chance that you will still see a deadlock, so in -V0.5
i've improved the deadlock-detection infrastructure with a number of new
debugging features:
- track the code (by EIP) that acquired the semaphore
- track the place (by file:line) where the locking object got defined
or initialized. Print this symbolic info out when printing locking
objects - this is easier to read than the hex address alone.
- added a global registry for all mutexes, rwsems, spinlocks and
rwlocks, which registry is printed during the deadlock-printout.
(including the current holder of the lock and the place where the
lock was acquired.)
- print out all the locks held by the task(s) involved in any deadlock
scenario.
- print out a summary of all tasks in the system, whether they are
blocked on a locking object, and if they are, on which lock.
- the 'all locks' and 'all tasks' printout can also be triggered via
sysrq-d or 'echo d > /proc/sysrq-trigger'.
- turn off deadlock tracing when the first deadlock has been detected.
This is to get a more robust printout of a stable locking state and
usually it's the first deadlock that matters so there's no point in
printing out more.
- cleaned up formatting of various existing printouts like the
preemption backtrace and messages from the deadlock detector.
all this info will greatly simplify the tracking down of deadlocks.
There is no additional configuration needed to activate the above
features, other than to enable CONFIG_RWSEM_DEADLOCK_DETECT. Below i've
attached a sample printout.
Ingo
down()-ing unlocked semaphore. should succeed.
good. now down()ing locked semaphore. should deadlock.
===============================================
[ BUG: semaphore recursion deadlock detected! ]
-----------------------------------------------
already locked: [c065eee0] {r:0,a:-1,kernel/time.c:100}
.. held by: gettimeofday/ 3008 [f1b374c0]
... acquired at: sys_gettimeofday+0xfa/0x3f8
-{backtrace}--------------------------------->
[<c0241d47>] __rwsem_deadlock+0xf9/0x188 (12)
[<c011f1eb>] sys_gettimeofday+0xfa/0x3f8 (8)
[<c058f68e>] _down_write+0xe/0x325 (16)
[<c011f201>] sys_gettimeofday+0x110/0x3f8 (4)
[<c011f201>] sys_gettimeofday+0x110/0x3f8 (20)
[<c058f7e0>] _down_write+0x160/0x325 (8)
[<c0130903>] __mcount+0x1d/0x1f (16)
[<c0242561>] down+0x8/0x11 (4)
[<c011f201>] sys_gettimeofday+0x110/0x3f8 (4)
[<c011f201>] sys_gettimeofday+0x110/0x3f8 (36)
[<c014d641>] sys_munmap+0x42/0x4b (12)
[<c010601d>] sysenter_past_esp+0x52/0x71 (28)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c058f8bd>] .... _down_write+0x23d/0x325
.....[<c011f201>] .. ( <= sys_gettimeofday+0x110/0x3f8)
.. [<c0131be1>] .... print_traces+0x1b/0x5a
.....[<c0106608>] .. ( <= dump_stack+0x23/0x25)
------------------------------
| showing all locks held by: | (gettimeofday/3008 [f1b374c0]):
------------------------------
#001: [c065eee0] {r:0,a:-1,kernel/time.c:100}
... acquired at: sys_gettimeofday+0xfa/0x3f8
showing all tasks:
s init/ 1 [c2a77080] (not blocked)
s ksoftirqd/0/ 2 [c2a76850] (not blocked)
s desched/0/ 3 [c2a76020] (not blocked)
s events/0/ 4 [c2a950c0] (not blocked)
s khelper/ 5 [c2a94890] (not blocked)
s kthread/ 10 [c2a94060] (not blocked)
s kacpid/ 18 [c2ab1100] (not blocked)
s IRQ 9/ 19 [c2ab08d0] (not blocked)
s kblockd/0/ 94 [c2ab00a0] (not blocked)
s khubd/ 107 [f7cf1140] (not blocked)
s pdflush/ 330 [f7cf0910] (not blocked)
s pdflush/ 331 [f7cf00e0] (not blocked)
s aio/0/ 333 [f7d56950] (not blocked)
s kswapd0/ 332 [f7d57180] (not blocked)
s IRQ 8/ 918 [f7d56120] (not blocked)
s IRQ 12/ 931 [f7eca990] (not blocked)
s kseriod/ 925 [f7ecb1c0] (not blocked)
s IRQ 6/ 946 [f7eca160] (not blocked)
s IRQ 5/ 978 [f7f01200] (not blocked)
s IRQ 14/ 997 [f7f009d0] (not blocked)
s IRQ 15/ 999 [f7f001a0] (not blocked)
s khpsbpkt/ 1013 [f7f51240] (not blocked)
s IRQ 11/ 1024 [f7f50a10] (not blocked)
s knodemgrd_0/ 1025 [f7f501e0] (not blocked)
s IRQ 1/ 1131 [f7f91280] (not blocked)
s IRQ 10/ 1169 [f7f90a50] (not blocked)
D kjournald/ 1180 [f7f90220] (not blocked)
s udevd/ 1242 [f5ec76c0] (not blocked)
s IRQ 4/ 1615 [f610c2e0] (not blocked)
s IRQ 3/ 1616 [f59f0a50] (not blocked)
D syslogd/ 2503 [f5d6ad90] (not blocked)
s klogd/ 2507 [f610d340] (not blocked)
s portmap/ 2526 [f5ae0120] (not blocked)
s mDNSResponder/ 2559 [f63aa420] (not blocked)
s mDNSResponder/ 2560 [f2bc6b90] (not blocked)
s acpid/ 2580 [f2af9340] (not blocked)
s sshd/ 2590 [f2af82e0] (not blocked)
s distccd/ 2619 [f610cb10] (not blocked)
s distccd/ 2620 [f59f0220] (not blocked)
s gpm/ 2630 [f2b6cb50] (not blocked)
s crond/ 2640 [f2af8b10] (not blocked)
s sshd/ 2653 [f5a59100] (not blocked)
s distccd/ 2656 [f5ae0950] (not blocked)
s sshd/ 2667 [f5a580a0] (not blocked)
s xfs/ 2670 [f5ec6660] (not blocked)
s bash/ 2672 [f2bc73c0] (not blocked)
s anacron/ 2704 [f5ec6e90] (not blocked)
s distccd/ 2706 [f2b6c320] (not blocked)
s atd/ 2714 [f59f1280] (not blocked)
s dbus-daemon-1/ 2724 [f2bc6360] (not blocked)
s cups-config-dae/ 2734 [f63aac50] (not blocked)
s agetty/ 2748 [f5a588d0] (not blocked)
s mingetty/ 2749 [f5d6b5c0] (not blocked)
s mingetty/ 2750 [f63ab480] (not blocked)
s mingetty/ 2751 [f1a73400] (not blocked)
s mingetty/ 2752 [f1a72bd0] (not blocked)
s mingetty/ 2753 [f2b6d380] (not blocked)
s mingetty/ 2754 [f1a723a0] (not blocked)
s hotplug/ 2839 [f1b36460] (not blocked)
s hotplug/ 2856 [f14e6d90] (not blocked)
s hotplug/ 2857 [f155d600] (not blocked)
s 10-udev.hotplug/ 2916 [f1b36c90] (not blocked)
s 10-udev.hotplug/ 2917 [f14e75c0] (not blocked)
s 10-udev.hotplug/ 2918 [f14e6560] (not blocked)
s udev/ 2959 [f1629680] (not blocked)
s udev/ 2965 [f1702ed0] (not blocked)
s udev/ 2981 [f17d5780] (not blocked)
s udev/ 2987 [f11240a0] (not blocked)
D gettimeofday/ 3008 [f1b374c0] (not blocked)
---------------------------
| showing all locks held: |
---------------------------
#001: [c07f809c] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#002: [c07f7bb4] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#003: [c07f7e0c] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#004: [c07f8890] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#005: [c07f83a8] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#006: [c07f8600] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#007: [c07f9084] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#008: [c07f8b9c] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#009: [c07f8df4] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#010: [c07f9878] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#011: [c07f9390] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#012: [c07f95e8] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#013: [c07fa06c] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#014: [c07f9b84] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#015: [c07f9ddc] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#016: [c07fa860] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#017: [c07fa378] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#018: [c07fa5d0] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#019: [c07fb054] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#020: [c07fab6c] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#021: [c07fadc4] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#022: [c07fb848] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#023: [c07fb360] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#024: [c07fb5b8] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#025: [c07fc03c] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#026: [c07fbb54] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#027: [c07fbdac] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#028: [c07fc830] {r:0,a:-1,drivers/ide/ide.c:228}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x8b/0x15c
#029: [c07fc348] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#030: [c07fc5a0] {r:0,a:-1,drivers/ide/ide.c:252}
.. held by: init/ 1 [c2a77080]
... acquired at: init_hwif_data+0x14b/0x15c
#031: [c06a8f40] {r:0,a:-1,drivers/ieee1394/nodemgr.c:111}
.. held by: knodemgrd_0/ 1025 [f7f501e0]
... acquired at: nodemgr_host_thread+0x89/0x186
#032: [f2b7cfcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: mingetty/ 2749 [f5d6b5c0]
... acquired at: read_chan+0x471/0x7a2
#033: [f1ba4fcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: mingetty/ 2751 [f1a73400]
... acquired at: read_chan+0x471/0x7a2
#034: [f157efcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: mingetty/ 2754 [f1a723a0]
... acquired at: read_chan+0x471/0x7a2
#035: [f1722fcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: mingetty/ 2752 [f1a72bd0]
... acquired at: read_chan+0x471/0x7a2
#036: [f10dafcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: mingetty/ 2753 [f2b6d380]
... acquired at: read_chan+0x471/0x7a2
#037: [f150efcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: mingetty/ 2750 [f63ab480]
... acquired at: read_chan+0x471/0x7a2
#038: [f1a92fcc] {r:0,a:-1,drivers/char/tty_io.c:2641}
.. held by: agetty/ 2748 [f5a588d0]
... acquired at: read_chan+0x471/0x7a2
#039: [f2a11ba4] {r:0,a:-1,fs/inode.c:198}
.. held by: syslogd/ 2503 [f5d6ad90]
... acquired at: sys_fsync+0x56/0xb7
#040: [c065eee0] {r:0,a:-1,kernel/time.c:100}
.. held by: gettimeofday/ 3008 [f1b374c0]
... acquired at: sys_gettimeofday+0xfa/0x3f8
#041: [c0748a00] {r:0,a:-1,kernel/fork.c:64}
.. held by: gettimeofday/ 3008 [f1b374c0]
... acquired at: show_all_locks+0x30/0x148
=============================================
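The 'showing all locks held' part of the printout above is regular enough to be summarized by a script, which helps when the registry is long. The sketch below assumes exactly the three-line entry layout shown in this sample; the regular expressions and function names are made up for illustration and may need adjusting if the printout format changes.

```python
import re
from collections import Counter

# '#040: [c065eee0] {r:0,a:-1,kernel/time.c:100}'
LOCK_RE = re.compile(r"#(\d+):\s+\[([0-9a-f]+)\]\s+\{[^}]*,([^,}]+:\d+)\}")
# '.. held by: gettimeofday/ 3008 [f1b374c0]'
HELD_RE = re.compile(r"\.\. held by:\s+(.+?)/\s*(\d+)\s+\[[0-9a-f]+\]")
# '... acquired at: sys_gettimeofday+0xfa/0x3f8'
ACQ_RE = re.compile(r"\.\.\. acquired at:\s+(\S+)")


def parse_locks(text):
    """Turn the lock registry printout into a list of dicts."""
    locks = []
    for line in text.splitlines():
        line = line.strip()
        match = LOCK_RE.match(line)
        if match:
            locks.append({"index": int(match.group(1)),
                          "address": match.group(2),
                          "defined_at": match.group(3),
                          "holder": None, "pid": None,
                          "acquired_at": None})
            continue
        if not locks:
            continue  # detail line before any lock header: ignore
        match = ACQ_RE.match(line)
        if match:
            locks[-1]["acquired_at"] = match.group(1)
            continue
        match = HELD_RE.match(line)
        if match:
            locks[-1]["holder"] = match.group(1)
            locks[-1]["pid"] = int(match.group(2))
    return locks


def locks_per_holder(locks):
    """Count how many locks each (task name, pid) pair holds."""
    return Counter((lock["holder"], lock["pid"])
                   for lock in locks if lock["holder"] is not None)
```

Sorting the counter by count quickly shows which task holds the bulk of the locks (init in the sample above).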
* Ingo Molnar <[email protected]> wrote:
> (right now it's not possible to do wakeup-timing without
> LATENCY_TRACE, i'll fix that.)
i've fixed this in -RT-V0.5.2. Also, the trace_enabled=4 method is
deprecated now and the new mechanism is to use:
/proc/sys/kernel/preempt_wakeup_timing
this flag is default-enabled. So starting at -RT-V0.5.2 to activate
wakeup timing it's enough to enable PREEMPT_TIMING and reset the max
after bootup:
echo 0 > /proc/sys/kernel/preempt_max_latency
this will switch back to critical-section timing/tracing:
echo 0 > /proc/sys/kernel/preempt_wakeup_timing
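For scripted test runs the two echo commands above can be wrapped in a small helper. This is a sketch only: the file names come from the message, the helper names are made up, and the `root` parameter exists so the code can be tried against a scratch directory instead of the real /proc/sys/kernel (which needs root).

```python
import os

SYSCTL_ROOT = "/proc/sys/kernel"


def write_sysctl(name, value, root=SYSCTL_ROOT):
    """Equivalent of 'echo VALUE > ROOT/NAME'."""
    with open(os.path.join(root, name), "w") as handle:
        handle.write(f"{value}\n")


def reset_max_latency(root=SYSCTL_ROOT):
    """Reset the recorded maximum so that new maxima get reported."""
    write_sysctl("preempt_max_latency", 0, root)


def use_critical_section_timing(root=SYSCTL_ROOT):
    """Switch from wakeup timing back to critical-section timing."""
    write_sysctl("preempt_wakeup_timing", 0, root)
```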
Ingo
On Thu, 28 Oct 2004 15:57:06 +0200, Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > (right now it's not possible to do wakeup-timing without
> > LATENCY_TRACE, i'll fix that.)
>
> i've fixed this in -RT-V0.5.2. Also, the trace_enabled=4 method is
> deprecated now and the new mechanism is to use:
>
> /proc/sys/kernel/preempt_wakeup_timing
>
> this flag is default-enabled. So starting at -RT-V0.5.2 to activate
> wakeup timing it's enough to enable PREEMPT_TIMING and reset the max
> after bootup:
>
> echo 0 > /proc/sys/kernel/preempt_max_latency
>
> this will switch back to critical-section timing/tracing:
>
> echo 0 > /proc/sys/kernel/preempt_wakeup_timing
What kind of benchmarking tools, apart from the in-kernel timing/tracing,
do you use for testing REALTIME_PREEMPT?
>
> Ingo
-DaMouse
--
I know I broke SOMETHING but its their fault for not fixing it before me
Ingo Molnar wrote:
> i have released the -V0.4 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
I have been having problems on my UP system at home with all of the more
recent patches (since U10.X). Some would boot and then the networking
was severely busted (slowdowns, hangs, etc.), some would not even boot.
V0.4.3 was of the no boot variety. Just for grins I disabled kudzu, and
the thing boots fine with no networking or other problems. This very
well may have been a fluke, but I have successfully booted this kernel
twice now. It did hang on a reboot at the point when it should have been
doing the actual reboot and I had to press the button. I didn't have
time this morning to turn kudzu back on to see if it was just a fluke that
it didn't boot the first time. Not sure what, if anything, this means,
but V0.4.3 is running very nicely on my UP system with no lag or
noticeable problems.
kr
* DaMouse <[email protected]> wrote:
> > echo 0 > /proc/sys/kernel/preempt_max_latency
> >
> > this will switch back to critical-section timing/tracing:
> >
> > echo 0 > /proc/sys/kernel/preempt_wakeup_timing
>
> What kind of benchmarking tools, apart from the in-kernel timing/tracing,
> do you use for testing REALTIME_PREEMPT?
amlat's 'realfeel' with the patch i posted yesterday.
Ingo
* K.R. Foley <[email protected]> wrote:
> I have been having problems on my UP system at home with all of the
> more recent patches (since U10.X). Some would boot and then the
> networking was severely busted (slowdowns, hangs, etc.), some would
> not even boot. V0.4.3 was of the no boot variety. Just for grins I
> disabled kudzu, and the thing boots fine with no networking or other
> problems. This very well may have been a fluke, but I have
> successfully booted this kernel twice now. It did hang on a reboot at
> the point when it should have been doing the actual reboot and I had
> to press the button. I didn't have time this morning to turn kudzu
> back on to see if it was just a fluke that it didn't boot the first time.
> Not sure what, if anything, this means, but V0.4.3 is running very
> nicely on my UP system with no lag or noticeable problems.
just to make sure - could you try to run kudzu manually after bootup and
observe what happens? Do you have a udev based system? I recently
corrupted my udev database via a crash and had to remove the
/dev/.udev.tdb file and had to regenerate it via 'udevstart'. (be
careful doing that though, it might mess up your system.) The symtoms
were a hung kudzu - while in reality it 'hung' because udev and udevinfo
processes looped in userspace forever. Weirdly, the stock Fedora kernel
didnt hang in this same phase, so there might still be a
PREEMPT_REALTIME bug here.
Ingo
I built with V0.5.1 and on a relatively idle system saw a number
of traces with > 4000 entries, with a wide range of latencies.
For example:
lt001.v1k1/lt.00: latency: 98910 us, entries: 4000 (24040) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.01: latency: 4021 us, entries: 4000 (11139) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.02: latency: 6597 us, entries: 4000 (13361) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.03: latency: 8873 us, entries: 4000 (16109) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.04: latency: 9092 us, entries: 4000 (17525) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.05: latency: 10079 us, entries: 4000 (22351) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.06: latency: 1976 us, entries: 2148 (2148) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.07: latency: 50675 us, entries: 4000 (12102) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.08: latency: 2822 us, entries: 4000 (7318) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.09: latency: 5817 us, entries: 4000 (9574) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.10: latency: 10811 us, entries: 4000 (26175) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.11: latency: 2511 us, entries: 4000 (6204) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.12: latency: 9194 us, entries: 4000 (28039) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.13: latency: 12068 us, entries: 4000 (37143) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.14: latency: 36768 us, entries: 4000 (101483) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.15: latency: 96154 us, entries: 4000 (21364) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.16: latency: 2759 us, entries: 4000 (7153) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.17: latency: 6653 us, entries: 4000 (15161) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.18: latency: 7322 us, entries: 4000 (12684) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.19: latency: 7440 us, entries: 4000 (15368) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
lt001.v1k1/lt.20: latency: 13838 us, entries: 4000 (30923) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
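For sorting through a batch of summaries like these, a minimal sketch in Python. It assumes the header format shown in the listing; the name parse_summaries is made up for illustration, and the number in parentheses is taken to be the total number of entries seen, which is an assumption based on the listing rather than on the tracer source. Whitespace is normalized first, so entries that the mailer wrapped across two lines are handled as well.

```python
import re

# One summary per trace file:
#   "<name>: latency: <N> us, entries: <kept> (<total>)"
SUMMARY_RE = re.compile(
    r"(\S+): latency: (\d+) us, entries: (\d+) \((\d+)\)"
)


def parse_summaries(text):
    """Return (name, latency_us, kept, total) tuples, worst latency first."""
    flat = " ".join(text.split())  # undo mail line wrapping
    rows = [
        (m.group(1), int(m.group(2)), int(m.group(3)), int(m.group(4)))
        for m in SUMMARY_RE.finditer(flat)
    ]
    return sorted(rows, key=lambda row: row[1], reverse=True)


if __name__ == "__main__":
    sample = """
    lt001.v1k1/lt.00: latency: 98910 us, entries: 4000 (24040) | [VP:0 KP:1
    SP:1 HP:1 #CPUS:2]
    lt001.v1k1/lt.06: latency: 1976 us, entries: 2148 (2148) | [VP:0 KP:1
    SP:1 HP:1 #CPUS:2]
    """
    for name, latency_us, kept, total in parse_summaries(sample):
        state = "truncated" if total > kept else "complete"
        print(f"{name}: {latency_us} us, {kept}/{total} entries ({state})")
```

Sorting by latency puts traces such as lt.00 and lt.15 at the top, which are the ones most worth checking for bogus timestamps.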
It did not take long to collect this information. These may be false
positives; here is the start of one example (note 00000000 for preempt
count in some of the lines).
preemption latency trace v1.0.7 on 2.6.9-mm1-RT-V0.5.1
-------------------------------------------------------
latency: 1976 us, entries: 2148 (2148) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: get_ltrace.sh/3673, uid:0 nice:0 policy:1 rt_prio:50
-----------------
=> started at: try_to_wake_up+0x1cc/0x330 <c011c1cc>
=> ended at: finish_task_switch+0x41/0xc0 <c011c7e1>
=======>
80000000 0.000ms (+0.000ms): _spin_unlock (try_to_wake_up)
80000000 0.000ms (+0.000ms): (105) ((140))
80000000 0.000ms (+0.000ms): (6) ((0))
80000000 0.000ms (+0.000ms): resched_task (try_to_wake_up)
80000000 0.001ms (+0.000ms): _spin_unlock_irqrestore (try_to_wake_up)
80000000 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
00000000 0.001ms (+0.000ms): preempt_schedule (cpu_idle)
80000000 0.002ms (+0.000ms): __sched_text_start (preempt_schedule)
80000000 0.002ms (+0.000ms): sched_clock (__sched_text_start)
80000000 0.002ms (+0.000ms): _spin_lock_irq (__sched_text_start)
80000000 0.003ms (+0.000ms): _spin_lock_irqsave (__sched_text_start)
80000000 0.003ms (+0.000ms): dequeue_task (__sched_text_start)
80000000 0.004ms (+0.000ms): recalc_task_prio (__sched_text_start)
80000000 0.004ms (+0.000ms): effective_prio (recalc_task_prio)
80000000 0.004ms (+0.000ms): enqueue_task (__sched_text_start)
80000000 0.005ms (+0.000ms): __switch_to (__sched_text_start)
80000000 0.005ms (+0.000ms): (0) ((6))
80000000 0.005ms (+0.000ms): (140) ((105))
80000000 0.006ms (+0.000ms): finish_task_switch (__sched_text_start)
80000000 0.006ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
80000000 0.006ms (+0.000ms): (6) ((105))
80000000 0.006ms (+0.000ms): _spin_lock (trace_stop_sched_switched)
80000000 0.007ms (+0.000ms): _spin_unlock (finish_task_switch)
80000000 0.007ms (+0.000ms): _spin_unlock_irq (finish_task_switch)
00000000 0.008ms (+0.000ms): _do_softirq (ksoftirqd)
80000000 0.008ms (+0.000ms): ___do_softirq (_do_softirq)
00000000 0.008ms (+0.000ms): run_timer_softirq (___do_softirq)
... ends like this ...
80000000 1.856ms (+0.000ms): rwsem_owner_del (__up_write)
80000000 1.857ms (+0.000ms): _spin_unlock (__up_write)
80000000 1.857ms (+0.000ms): _spin_unlock (__up_write)
00000000 1.858ms (+0.000ms): up (real_lookup)
00000000 1.858ms (+0.000ms): _up_write (up)
00000000 1.858ms (+0.000ms): __up_write (_up_write)
80000000 1.859ms (+0.000ms): _spin_lock (__up_write)
80000000 1.859ms (+0.000ms): _spin_lock (__up_write)
80000000 1.860ms (+0.000ms): rwsem_owner_del (__up_write)
80000000 1.861ms (+0.000ms): _spin_unlock (__up_write)
80000000 1.861ms (+0.000ms): _spin_unlock (__up_write)
00000000 1.862ms (+0.001ms): follow_mount (link_path_walk)
00000000 1.863ms (+33179.004ms): dput (link_path_walk)
I haven't run the stress tests yet but will send the system log
and these traces in a separate message so you can see what I'm seeing.
--Mark
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> OK. That was it. After switching off CONFIG_RWSEM_DEADLOCK_DETECT on
>> RT-V0.4.3, I can say that it's now on par to RT-U3.
>
> great!
>
>> Later today, I will conduct some extended testing, where I'll be able
>> to compare the jackd performance between vanilla, RT-U3 and RT-V0.4.3,
>> on my UP laptop. All kernel configurations will be stripped of all
>> the debug options.
>>
>> I will take note of xrun rate, jackd scheduling delay histogram, and
>> cpu usage. Context switch rate will also be collected.
>>
>> Anything else?
>
> yeah, that's good enough.
OK. Here are my early consolidated results. Feel free to comment.
                                    2.6.9      RT-U3  RT-V0.4.3
                                ---------  ---------  ---------
XRUN Rate . . . . . . . . . . .       424          8          4  /hour
Delay Rate (>spare time)  . . .       496          0          0  /hour
Delay Rate (>1000 usecs)  . . .       940          8          4  /hour
Maximum Delay . . . . . . . . .      6904        921        721  usecs
Maximum Process Cycle . . . . .      1449       1469       1590  usecs
Average DSP CPU Load  . . . . .        38         39         40  %
Average Context-Switch Rate . .      7480       8929       9726  /sec
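To put the columns in proportion, a small sketch that divides the vanilla 2.6.9 figure by each RT figure. The numbers are copied from the table; the dictionary keys and the function name ratio_vs_vanilla are made up for illustration.

```python
# Figures copied from the table: (2.6.9, RT-U3, RT-V0.4.3).
RESULTS = {
    "xrun_rate_per_hour": (424, 8, 4),
    "max_delay_usecs": (6904, 921, 721),
    "context_switches_per_sec": (7480, 8929, 9726),
}


def ratio_vs_vanilla(metric, column):
    """Return vanilla / kernel for one metric; column 1 = RT-U3, 2 = RT-V0.4.3."""
    row = RESULTS[metric]
    return row[0] / row[column]


if __name__ == "__main__":
    for metric in RESULTS:
        print(
            f"{metric}: 2.6.9 is {ratio_vs_vanilla(metric, 1):.2f}x RT-U3, "
            f"{ratio_vs_vanilla(metric, 2):.2f}x RT-V0.4.3"
        )
```

A ratio above 1 means the RT kernel's figure is lower than vanilla (fewer xruns, shorter maximum delay); a ratio below 1, as for the context-switch rate, means the RT kernel's figure is higher.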
Note: all tests were carried out running jackd -v -dalsa -dhw:0 -r44100
-p128 -n2 -S -P, loaded with 9 (nine) fluidsynth instances, on a
[email protected] laptop, against the onboard sound device (snd-ali5451).
On the RT kernels only, the following optimizations were issued:
chrt --pid --fifo 90 2 (pidof "ksoftirqd/0" = 2)
chrt --pid --fifo 60 `pidof "IRQ 5"` (snd-ali5451)
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
* [email protected] <[email protected]> wrote:
> It did not take long to collect this information. These may be false
> positives, here is the start of one example (note 00000000 for preempt
> count in some of the lines).
>
> preemption latency trace v1.0.7 on 2.6.9-mm1-RT-V0.5.1
> -------------------------------------------------------
> latency: 1976 us, entries: 2148 (2148) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
> -----------------
> | task: get_ltrace.sh/3673, uid:0 nice:0 policy:1 rt_prio:50
> -----------------
> => started at: try_to_wake_up+0x1cc/0x330 <c011c1cc>
> => ended at: finish_task_switch+0x41/0xc0 <c011c7e1>
> =======>
> 80000000 0.000ms (+0.000ms): _spin_unlock (try_to_wake_up)
> 80000000 0.000ms (+0.000ms): (105) ((140))
> 80000000 0.000ms (+0.000ms): (6) ((0))
here pid 6 got woken up and it's about to preempt pid 0 [the idle task].
> 80000000 0.000ms (+0.000ms): resched_task (try_to_wake_up)
> 80000000 0.001ms (+0.000ms): _spin_unlock_irqrestore (try_to_wake_up)
> 80000000 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
> 00000000 0.001ms (+0.000ms): preempt_schedule (cpu_idle)
> 80000000 0.002ms (+0.000ms): __sched_text_start (preempt_schedule)
> 80000000 0.002ms (+0.000ms): sched_clock (__sched_text_start)
> 80000000 0.002ms (+0.000ms): _spin_lock_irq (__sched_text_start)
> 80000000 0.003ms (+0.000ms): _spin_lock_irqsave (__sched_text_start)
> 80000000 0.003ms (+0.000ms): dequeue_task (__sched_text_start)
> 80000000 0.004ms (+0.000ms): recalc_task_prio (__sched_text_start)
> 80000000 0.004ms (+0.000ms): effective_prio (recalc_task_prio)
> 80000000 0.004ms (+0.000ms): enqueue_task (__sched_text_start)
> 80000000 0.005ms (+0.000ms): __switch_to (__sched_text_start)
> 80000000 0.005ms (+0.000ms): (0) ((6))
> 80000000 0.005ms (+0.000ms): (140) ((105))
> 80000000 0.006ms (+0.000ms): finish_task_switch (__sched_text_start)
> 80000000 0.006ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
> 80000000 0.006ms (+0.000ms): (6) ((105))
> 80000000 0.006ms (+0.000ms): _spin_lock (trace_stop_sched_switched)
the trace should have stopped here! We just successfully switched from
pid 0 to pid 6 and called trace_stop_sched_switched() - the tracer
should really have noticed it. But the trace goes on for eternity:
> 00000000 1.862ms (+0.001ms): follow_mount (link_path_walk)
> 00000000 1.863ms (+33179.004ms): dput (link_path_walk)
which is just wrong.
i think the tracer is more broken on SMP systems than i thought. If we
start tracing on one CPU and it goes over to another CPU [which might
have happened in the above case - another task on another CPU took
precedence over pid 6 on this CPU] then the timings become meaningless,
because the tracing timestamp is per-CPU.
what needs to happen is some sort of 'handover' whenever the
highest-prio task is migrated from one CPU to another. Until this is
fixed i'd not suggest to use the wakeup latency tracer on SMP :-|
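One way to spot traces of this kind mechanically, as a minimal sketch: no single step can take longer than the latency reported in the trace header, so any per-entry delta above that value points at inconsistent timestamps. The entry layout is assumed from the trace excerpts in this thread, and the name suspicious_entries is made up for illustration.

```python
import re

# "<preempt count> <timestamp>ms (+<delta>ms): <function> (<caller>)"
ENTRY_RE = re.compile(r"([0-9a-f]{8}) (\d+\.\d+)ms \(\+(\d+\.\d+)ms\): (.+)")


def suspicious_entries(trace_text, reported_latency_us):
    """Yield (timestamp_ms, delta_ms, what) for deltas above the total latency."""
    limit_ms = reported_latency_us / 1000.0
    for line in trace_text.splitlines():
        m = ENTRY_RE.search(line)
        if m is None:
            continue  # header, separator or elided part of the trace
        timestamp_ms, delta_ms = float(m.group(2)), float(m.group(3))
        if delta_ms > limit_ms:
            yield timestamp_ms, delta_ms, m.group(4).strip()
```

Fed the 1976 us trace quoted above, this flags the final dput entry with its +33179.004ms delta. Because the pattern is searched for anywhere in the line, quoted lines with a leading "> " are accepted too.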
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> OK. Here are my early consolidated results. Feel free to comment.
>
>                                     2.6.9      RT-U3  RT-V0.4.3
>                                 ---------  ---------  ---------
> XRUN Rate . . . . . . . . . . .       424          8          4  /hour
> Delay Rate (>spare time)  . . .       496          0          0  /hour
> Delay Rate (>1000 usecs)  . . .       940          8          4  /hour
> Maximum Delay . . . . . . . . .      6904        921        721  usecs
> Maximum Process Cycle . . . . .      1449       1469       1590  usecs
> Average DSP CPU Load  . . . . .        38         39         40  %
> Average Context-Switch Rate . .      7480       8929       9726  /sec
looks pretty good, doesnt it?
how is the 'maximum delay' calculated? Could you put in a tracing hook
into jackd whenever such a ~720 usecs maximum is hit? I'd _love_ to see
what such a latency path looks like, it seems a bit long.
It should be a relatively simple hack to jackd. Firstly, download the
-V0.5.3 patch and enable LATENCY_TRACE, then do:
echo 2 > /proc/sys/kernel/trace_enabled
this activates the 'application-triggered kernel tracer' functionality.
No tracing happens by default, but tracing starts if the application
executes this function:
gettimeofday(0,1);
and tracing stops if the application does:
gettimeofday(0,0);
whenever the app does this (0,0) call the trace gets saved and you can
retrieve it from /proc/latency_trace. There is
no combination of these parameters that can break the kernel, so it's a
100% safe tracing facility. You can 'ignore' a latency [e.g. if it's not
a maximum] by simply not doing the (0,0) call. The next (0,1) call done
will override the previous, already running trace.
[stupid function but this combination of the syscall parameters is not
used otherwise so the latency tracer hijacks it.]
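The start/stop rules above can be modelled in a few lines. This is a toy model in Python, not kernel code: the class and function names are made up for illustration, and the choice that a (0,0) call with no trace running does nothing is an assumption. The helper shows the suggested usage, where a trace is started for every cycle but only saved when the measured delay is a new maximum.

```python
class UserTriggeredTracer:
    """Toy model of the start/stop semantics described above."""

    def __init__(self):
        self.running = None  # label of the trace currently being recorded
        self.saved = None    # label of the last trace that was saved

    def start(self, label):
        """Models gettimeofday(0,1): begin a trace, overriding a running one."""
        self.running = label

    def stop(self):
        """Models gettimeofday(0,0): save the running trace, if there is one."""
        if self.running is not None:
            self.saved = self.running
            self.running = None


def trace_maxima(tracer, delays_us):
    """Start a trace per cycle, but save it only when the delay is a new maximum."""
    worst = 0
    for cycle, delay_us in enumerate(delays_us):
        tracer.start(cycle)
        if delay_us > worst:
            worst = delay_us
            tracer.stop()
    return worst
```

Skipping stop() for an uninteresting cycle costs nothing in this model, since the next start() simply replaces the trace that was left running.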
i dont know how Jackd does things, but i'd suggest to enable tracing the
first time possible when getting an interrupt - in theory this should
happen as soon as the wakeup-latency-tracer says - i.e. at most in like
30 usecs. The bulk of the remaining 700 usecs will be spent in jackd,
and you can trace those 700 usecs.
or if you would be willing to do a little bit of ALSA hacking, you could
add this to the ALSA interrupt handler:
#include <linux/syscalls.h>
...
sys_gettimeofday(0, 1);
[the attached patch implements this for ali5451.]
and do the gettimeofday(0,0) in jackd [if the latency measured there is
a new maximum]! This way tracing is turned on within the kernel IRQ
handler (i.e. as soon as possible) and turned off within Jackd. This will
enable us to see an even more complete latency path.
NOTE: there can only be one trace active at a time. So if there can be
multiple channels active at a time then this user-triggered tracer might
be less useful. Do these channels have any priority? Or if multiple
channels are necessary then you could modify the patch to only do the
(0,1) call for say channel #0:
if (channel == 0)
sys_gettimeofday(0, 1);
in this case the trace-off-save (0,0) call in Jackd must also only do
this for channel 0 processing! (or whichever channel you find the most
interesting.)
also, i looked at the sound/pci/ali5451/ali5451.c driver code and it has
one weird piece of code on line 988:
udelay(100);
that adds a 100 usecs latency to the main path, for no good reason! It
also spends that time burning CPU time, delaying other processing. Could
you add an IRQs/sec measurement too if possible, so that we can compare
the IRQ rates of various kernels?
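For the IRQs/sec figure, a minimal sketch that compares two snapshots of /proc/interrupts taken a known number of seconds apart. It assumes the usual layout of that file (a header row naming the CPUs, then one row per interrupt with one count per CPU); the function names are made up for illustration, and reading the file and sleeping between snapshots is left to the caller.

```python
def irq_counts(snapshot):
    """Map each IRQ label to its count summed over all CPU columns."""
    lines = snapshot.strip().splitlines()
    n_cpus = len(lines[0].split())  # header row: CPU0 CPU1 ...
    counts = {}
    for line in lines[1:]:
        label, _, rest = line.partition(":")
        fields = rest.split()[:n_cpus]
        if fields and all(f.isdigit() for f in fields):
            counts[label.strip()] = sum(int(f) for f in fields)
    return counts


def irq_rates(before, after, seconds):
    """Interrupts per second for every IRQ present in both snapshots."""
    first, second = irq_counts(before), irq_counts(after)
    return {
        irq: (second[irq] - first[irq]) / seconds
        for irq in second
        if irq in first
    }
```

Running this for the sound card's IRQ line on each kernel under the same jackd load would give directly comparable rates.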
Also, i'd suggest to simply remove that line (or apply the attached
patch) - does the driver still work fine with that?
plus i've also got questions about how Jackd interfaces with ALSA: does
it use SIGIO, or some direct driver ioctl? If SIGIO is used then how is
it done precisely - is an 'RT' queued signal used or ordinary SIGIO?
Also, how is the 'channel' information established upon getting a SIGIO,
is it in the siginfo structure?
Ingo
--- linux/sound/pci/ali5451/ali5451.c.orig
+++ linux/sound/pci/ali5451/ali5451.c
@@ -33,6 +33,7 @@
 #include <linux/pci.h>
 #include <linux/slab.h>
 #include <linux/moduleparam.h>
+#include <linux/syscalls.h>
 #include <sound/core.h>
 #include <sound/pcm.h>
 #include <sound/info.h>
@@ -985,11 +986,11 @@ static void snd_ali_update_ptr(ali_t *co
 	pvoice = &codec->synth.voices[channel];
 	runtime = pvoice->substream->runtime;
-	udelay(100);
 	spin_lock(&codec->reg_lock);
 	if (pvoice->pcm && pvoice->substream) {
 		/* pcm interrupt */
+		sys_gettimeofday((void *)0, (void *)1); // start the tracer
 #ifdef ALI_DEBUG
 		outb((u8)(pvoice->number), ALI_REG(codec, ALI_GC_CIR));
 		temp = inw(ALI_REG(codec, ALI_CSO_ALPHA_FMS + 2));
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> OK. Here are my early consolidated results. Feel free to comment.
>>
>>                                     2.6.9      RT-U3  RT-V0.4.3
>>                                 ---------  ---------  ---------
>> XRUN Rate . . . . . . . . . . .       424          8          4  /hour
>> Delay Rate (>spare time)  . . .       496          0          0  /hour
>> Delay Rate (>1000 usecs)  . . .       940          8          4  /hour
>> Maximum Delay . . . . . . . . .      6904        921        721  usecs
>> Maximum Process Cycle . . . . .      1449       1469       1590  usecs
>> Average DSP CPU Load  . . . . .        38         39         40  %
>> Average Context-Switch Rate . .      7480       8929       9726  /sec
>
> looks pretty good, doesnt it?
>
Yes indeed :)
> how is the 'maximum delay' calculated? Could you put in a tracing hook
> into jackd whenever such a ~720 usecs maximum is hit? I'd _love_ to see
> how such a latency path looks like, it seems a bit long.
>
That 'maximum delay' is collected on each jackd process cycle. AFAICS, it
is a scheduling delay figure, measured by jackd as the time interval
between the interrupt and the effective jackd process handler (re)entry.
Please note that I'm not a JACK developer. I'm just a regular user
with ancient coding skills ;) I do however subscribe to the jackit-devel
mailing list. And I'm the author of qjackctl, if that matters...
For reading this 'maximum delay' I am actually using a custom patch
against jack-0.99.7cvs, originally by Lee Revell.
> It should be a relatively simple hack to jackd. Firstly, download the
> -V0.5.3 patch and enable LATENCY_TRACE, then do:
>
> echo 2 > /proc/sys/kernel/trace_enabled
>
> this activates the 'application-triggered kernel tracer' functionality.
>
> No tracing happens by default, but tracing starts if the application
> executes this function:
>
> gettimeofday(0,1);
>
> and tracing stops if the application does:
>
> gettimeofday(0,0);
>
> whenever the app does this (0,0) call the trace gets saved and you can
> retrieve it from /proc/latency_trace. There is
> no combination of these parameters that can break the kernel, so it's a
> 100% safe tracing facility. You can 'ignore' a latency [e.g. if it's not
> a maximum] by simply not doing the (0,0) call. The next (0,1) call done
> will override the previous, already running trace.
>
> [stupid function but this combination of the syscall parameters is not
> used otherwise so the latency tracer hijacks it.]
>
> i dont know how Jackd does things, but i'd suggest to enable tracing the
> first time possible when getting an interrupt - in theory this should
> happen as soon as the wakeup-latency-tracer says - i.e. at most in like
> 30 usecs. The bulk of the remaining 700 usecs will be spent in jackd,
> and you can trace those 700 usecs.
>
> or if you would be willing to do a little bit of ALSA hacking, you could
> add this to the ALSA interrupt handler:
>
> #include <linux/syscalls.h>
>
> ...
> sys_gettimeofday(0, 1);
>
> [the attached patch implements this for ali5451.]
>
> and do the gettimeofday(0,0) in jackd [if the latency measured there is
> a new maximum]! This way tracing is turned on within the kernel IRQ
> handler (i.e. as soon as possible) and turned off within Jackd. This will
> enable us to see an even more complete latency path.
>
> NOTE: there can only be one trace active at a time. So if there can be
> multiple channels active at a time then this user-triggered tracer might
> be less useful. Do these channels have any priority? Or if multiple
> channels are necessary then you could modify the patch to only do the
> (0,1) call for say channel #0:
>
> if (channel == 0)
> sys_gettimeofday(0, 1);
>
> in this case the trace-off-save (0,0) call in Jackd must also only do
> this for channel 0 processing! (or whichever channel you find the most
> interesting.)
>
Ouch. This is a bit too much to digest in so little time :) I'll try to
re-read this from cache, erm... tomorrow ;)
BTW, this means that I have to re-enable LATENCY_TIMING back again? Notice
that all my results were taken with a production configuration, that is,
with all debug options now set to off (OK, I think I've left the
stack-overflow on, but that was the only one).
OTOH, this latency timing has been troublesome on either of my setups,
recently. But I'll give it another try...
> also, i looked at the sound/pci/ali5451/ali5451.c driver code and it has
> one weird piece of code on line 988:
>
> udelay(100);
>
> that adds a 100 usecs latency to the main path, for no good reason! It
> also spends that time burning CPU time, delaying other processing. Could
> you add an IRQs/sec measurement too if possible, so that we can compare
> the IRQ rates of various kernels?
>
Yes, I can add interrupts/sec measuring with nmeter. Neat utility indeed,
thanks to Denis Vlasenko.
> Also, i'd suggest to simply remove that line (or apply the attached
> patch) - does the driver still work fine with that?
>
Now that you mention it, I remember hacking that very same line some time
ago, but couldn't get any better than a udelay(33). Removing that line just
ended in some kind of malfunction, but I can't remember what exactly. One
thing's for sure, sound didn't come out of it :-/
> plus i've also got questions about how Jackd interfaces with ALSA: does
> it use SIGIO, or some direct driver ioctl? If SIGIO is used then how is
> it done precisely - is an 'RT' queued signal used or ordinary SIGIO?
> Also, how is the 'channel' information established upon getting a SIGIO,
> is it in the siginfo structure?
>
Now that's really pushing me over. Any ALSA-JACK developers around here to
comment?
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
On Fri, 2004-10-29 at 00:49 +0100, Rui Nuno Capela wrote:
> > plus i've also got questions about how Jackd interfaces with ALSA: does
> > it use SIGIO, or some direct driver ioctl? If SIGIO is used then how is
> > it done precisely - is an 'RT' queued signal used or ordinary SIGIO?
> > Also, how is the 'channel' information established upon getting a SIGIO,
> > is it in the siginfo structure?
> >
>
> Now that's really pushing me over. Any ALSA-JACK developers around here to
> comment?
>
I think it uses a direct driver ioctl to open the device. Then jack
uses mmap to talk to the audio device.
Anyway I forwarded your question to Paul Davis, the author of JACK, and
cc'ed jackit-devel.
Lee
* Rui Nuno Capela <[email protected]> wrote:
> BTW, this means that I have to re-enable LATENCY_TIMING back again?
yes. I'd suggest to start with the simplest setup - i.e. just one
fluidsynth instance running. I suspect 3-4 instances later on will be
enough to trigger some xruns or at least some of the bigger delays.
you possibly wont be able to debug the 'production' setup, but that's
not an issue because the latencies should show up under just 2-3
instances running as well.
> > Also, i'd suggest to simply remove that line (or apply the attached
> > patch) - does the driver still work fine with that?
> >
>
> Now that you mention it, I remember hacking that very same line some
> time ago, but couldn't get any better than a udelay(33). Removing that
> line just ended in some kind of malfunction, but I can't remember what
> exactly. One thing's for sure, sound didn't come out of it :-/
ugh. Possibly some sort of interaction with the firmware and/or an
outright driver bug?
Ingo
On Thu, 28 Oct 2004 17:33:50 +0100 (WEST)
"Rui Nuno Capela" <[email protected]> wrote:
> OK. Here are my early consolidated results. Feel free to comment.
>
>                                     2.6.9      RT-U3  RT-V0.4.3
>                                 ---------  ---------  ---------
> XRUN Rate . . . . . . . . . . .       424          8          4  /hour
> Delay Rate (>spare time)  . . .       496          0          0  /hour
> Delay Rate (>1000 usecs)  . . .       940          8          4  /hour
> Maximum Delay . . . . . . . . .      6904        921        721  usecs
> Maximum Process Cycle . . . . .      1449       1469       1590  usecs
> Average DSP CPU Load  . . . . .        38         39         40  %
> Average Context-Switch Rate . .      7480       8929       9726  /sec
>
> Note: all tests were carried out running jackd -v -dalsa -dhw:0 -r44100
> -p128 -n2 -S -P, loaded with 9 (nine) fluidsynth instances, on a
> [email protected] laptop, against the onboard sound device (snd-ali5451).
>
> On the RT kernels only, the following optimizations were issued:
>
> chrt --pid --fifo 90 2 (pidof "ksoftirqd/0" = 2)
> chrt --pid --fifo 60 `pidof "IRQ 5"` (snd-ali5451)
>
> Take care.
hi,
i tried a V0.5.2 with PREEMPT_REALTIME and all debugging off (config
attached). I cannot reproduce your results. I have experienced around 30
xruns in 10 minutes. And big ones, too (> 5ms). I don't know exactly what
kind of load triggers them. Here's a bit of the qjackctl message window (btw:
jackd was idle, no clients connected, except for qjackctl):
**** alsa_pcm: xrun of at least 0.594 msecs
**** alsa_pcm: xrun of at least 0.037 msecs
16:20:44.832 XRUN callback (2).
16:20:44.840 XRUN callback (3).
**** alsa_pcm: xrun of at least 0.024 msecs
**** alsa_pcm: xrun of at least 3.682 msecs
16:22:59.834 XRUN callback (4).
16:22:59.839 XRUN callback (5).
**** alsa_pcm: xrun of at least 0.016 msecs
**** alsa_pcm: xrun of at least 3.460 msecs
16:23:03.513 XRUN callback (6).
**** alsa_pcm: xrun of at least 2.028 msecs
16:23:03.869 XRUN callback (7).
**** alsa_pcm: xrun of at least 7.061 msecs
**** alsa_pcm: xrun of at least 4.510 msecs
16:23:03.894 XRUN callback (8).
16:23:03.895 XRUN callback (9).
**** alsa_pcm: xrun of at least 0.428 msecs
**** alsa_pcm: xrun of at least 0.157 msecs
16:23:54.546 XRUN callback (10).
16:23:54.547 XRUN callback (11).
16:23:54.549 XRUN callback (12).
**** alsa_pcm: xrun of at least 1.507 msecs
16:25:49.194 XRUN callback (13).
subgraph starting at qjackctl-1013 timed out (subgraph_wait_fd=14, status = 0, state = Triggered)
**** alsa_pcm: xrun of at least 1.534 msecs
16:28:23.530 XRUN callback (14).
**** alsa_pcm: xrun of at least 0.222 msecs
**** alsa_pcm: xrun of at least 2.674 msecs
**** alsa_pcm: xrun of at least 3.904 msecs
16:28:26.790 XRUN callback (15).
16:28:26.794 XRUN callback (16).
16:28:26.795 XRUN callback (17).
**** alsa_pcm: xrun of at least 0.701 msecs
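To turn a message-window dump like this into a few numbers, a minimal sketch that pulls the reported sizes out of the "alsa_pcm: xrun of at least ..." lines. The line format is assumed from the excerpt above; the name xrun_summary and the 1 ms threshold default are made up for illustration.

```python
import re

XRUN_RE = re.compile(r"alsa_pcm: xrun of at least (\d+\.\d+) msecs")


def xrun_summary(log_text, threshold_ms=1.0):
    """Return (count, longest_ms, count above threshold) for the reported xruns."""
    sizes = [float(m.group(1)) for m in XRUN_RE.finditer(log_text)]
    if not sizes:
        return 0, 0.0, 0
    above = sum(1 for size in sizes if size > threshold_ms)
    return len(sizes), max(sizes), above
```

The "XRUN callback" lines are ignored on purpose, since they carry a timestamp and a running count but no size.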
Here's some setup info:
ksoftirqd/0:
mango:~# chrt -p 2
pid 2's current scheduling policy: SCHED_FIFO
pid 2's current scheduling priority: 99
snd-cs46xx:
mango:~# chrt -p `pidof "IRQ 3"`
pid 118's current scheduling policy: SCHED_FIFO
pid 118's current scheduling priority: 90
mango:~# pidof jackd
1014
mango:~# chrt -p 1014
pid 1014's current scheduling policy: SCHED_OTHER
pid 1014's current scheduling priority: 0
mango:~# chrt -p 1015
pid 1015's current scheduling policy: SCHED_OTHER
pid 1015's current scheduling priority: 0
mango:~# chrt -p 1016
pid 1016's current scheduling policy: SCHED_FIFO
pid 1016's current scheduling priority: 70
mango:~# chrt -p 1017
pid 1017's current scheduling policy: SCHED_FIFO
pid 1017's current scheduling priority: 60
Hmm, it seems i haven't disabled all debugging. This is from dmesg:
BUG: atomic counter underflow at:
[<c010649e>] dump_stack+0x1e/0x20 (20)
[<c025f319>] qdisc_destroy+0xd9/0xe0 (28)
[<c025f506>] dev_shutdown+0x36/0xb0 (28)
[<c0254679>] unregister_netdevice+0x129/0x2b0 (40)
[<c0221919>] unregister_netdev+0x19/0x40 (16)
[<f0902134>] ppp_shutdown_interface+0x64/0xd0 [ppp_generic] (36)
[<f08fe0a6>] ppp_release+0x86/0x90 [ppp_generic] (16)
[<c0155c3f>] __fput+0x14f/0x170 (36)
[<c0154457>] filp_close+0x57/0x80 (28)
[<c01544f0>] sys_close+0x70/0x90 (32)
[<c0105ecf>] syscall_call+0x7/0xb (-8124)
Here's some context from syslog:
Oct 29 15:04:01 mango pppd[324]: LCP: timeout sending Config-Requests
Oct 29 15:04:01 mango pppd[324]: Connection terminated.
Oct 29 15:04:01 mango kernel: BUG: atomic counter underflow at:
Oct 29 15:04:01 mango kernel: [valid_stack_ptr+46/96] dump_stack+0x1e/0x20 (20)
Oct 29 15:04:01 mango kernel: [unregister_netdevice+553/672] qdisc_destroy+0xd9/0xe0 (28)
Oct 29 15:04:01 mango kernel: [ethtool_op_get_tso+6/32] dev_shutdown+0x36/0xb0 (28)
Oct 29 15:04:01 mango kernel: [__sock_create+185/704] unregister_netdevice+0x129/0x2b0 (40)
Oct 29 15:04:01 mango kernel: [blk_wait_queue_drained+25/304] unregister_netdev+0x19/0x40 (16)
Oct 29 15:04:01 mango kernel: [pg0+810783028/1069765632] ppp_shutdown_interface+0x64/0xd0 [ppp_generic] (36)
Oct 29 15:04:01 mango kernel: [pg0+810766502/1069765632] ppp_release+0x86/0x90 [ppp_generic] (16)
Oct 29 15:04:01 mango kernel: [shmem_writepage+463/480] __fput+0x14f/0x170 (36)
Oct 29 15:04:01 mango kernel: [sys_swapon+1591/2080] filp_close+0x57/0x80 (28)
Oct 29 15:04:01 mango kernel: [sys_swapon+1744/2080] sys_close+0x70/0x90 (32)
Oct 29 15:04:01 mango kernel: [do_notify_resume+15/76] syscall_call+0x7/0xb (-8124)
Oct 29 15:04:10 mango pppoe[326]: Timeout waiting for PADS packets
Oct 29 15:04:31 mango pppd[324]: Serial connection established.
Oct 29 15:04:31 mango pppd[324]: Using interface ppp0
Oct 29 15:04:31 mango pppd[324]: Connect: ppp0 <--> /dev/pts/0
Oct 29 15:04:47 mango pppoe[822]: PADS: Service-Name: ''
Oct 29 15:04:47 mango pppoe[822]: PPP session is 2133
Oct 29 15:04:47 mango pppd[324]: CHAP authentication succeeded
Oct 29 15:04:48 mango pppd[324]: Cannot determine ethernet address for proxy ARP
Oct 29 15:04:48 mango pppd[324]: local IP address 213.23.197.161
Oct 29 15:04:48 mango pppd[324]: remote IP address 145.253.4.148
Oct 29 15:04:48 mango pppd[324]: primary DNS address 195.50.140.250
Oct 29 15:04:48 mango pppd[324]: secondary DNS address 145.253.2.11
flo
* Florian Schmidt <[email protected]> wrote:
> Here's some setup info:
>
> ksoftirqd/0:
> mango:~# chrt -p 2
> pid 2's current scheduling policy: SCHED_FIFO
> pid 2's current scheduling priority: 99
dont do this ... ksoftirqd can spend a lot of time processing various
stuff and it should not be relevant to the audio path. It should be
SCHED_OTHER.
> Hmm, it seems i haven't disabled all debugging. This is from dmesg:
>
> BUG: atomic counter underflow at:
> [<c010649e>] dump_stack+0x1e/0x20 (20)
> [<c025f319>] qdisc_destroy+0xd9/0xe0 (28)
this is automatic and doesnt introduce a lot of overhead (unless the
printout happens while you are testing latencies). You can remove the
WARN_ON from include/asm-i386/atomic.h.
Ingo
On Fri, 29 Oct 2004 16:31:35 +0200
Florian Schmidt <[email protected]> wrote:
> i tried a V0.5.2 with PREEMPT_REALTIME and all debugging off (config
> attached). I cannot reproduce your results. I have experienced around 30
> xruns in 10 minutes. And big ones, too (> 5ms). I don't know exactly what
> kind of load triggers them. Here's a bit of qjackctl message window (btw:
> jackd was idle, no clients connected, except for qjackctl):
>
[snip]
i forgot to mention though that i do use jack in full duplex mode and with a
periodsize of 64:
/usr/bin/jackd -R -P60 -t20000 -dalsa -dhw:0 -r48000 -p64 -n2
The results are thus not really comparable. Anyways, 7ms xruns still
shouldn't happen (with the older VP patches i could run 32 frames full
duplex for hours w/o xruns).
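To see why the two setups are hard to compare, a small sketch of the period arithmetic, using the -p (frames per period) and -r (sample rate) values from the two jackd command lines in this thread. The function names are made up for illustration.

```python
def period_ms(frames, rate_hz):
    """Duration of one period of `frames` samples at `rate_hz`, in milliseconds."""
    return 1000.0 * frames / rate_hz


def periods_spanned(xrun_ms, frames, rate_hz):
    """How many whole periods fit into an xrun of the given length."""
    return int(xrun_ms // period_ms(frames, rate_hz))


if __name__ == "__main__":
    print(f"-p64  -r48000: {period_ms(64, 48000):.3f} ms per period")
    print(f"-p128 -r44100: {period_ms(128, 44100):.3f} ms per period")
    print(f"7.061 ms xrun: {periods_spanned(7.061, 64, 48000)} periods at -p64")
```

At 64 frames and 48 kHz a period lasts about 1.33 ms, less than half of the roughly 2.9 ms available at 128 frames and 44.1 kHz, so the deadline per cycle is much tighter; the 7.061 ms xrun in the log above still spans several whole periods.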
flo
On Fri, 29 Oct 2004 16:25:38 +0200
Ingo Molnar <[email protected]> wrote:
> > ksoftirqd/0:
> > mango:~# chrt -p 2
> > pid 2's current scheduling policy: SCHED_FIFO
> > pid 2's current scheduling priority: 99
>
> dont do this ... ksoftirqd can spend a lot of time processing various
> stuff and it should not be relevant to the audio path. It should be
> SCHED_OTHER.
ah ok, i was wondering about this.. i saw it in rui's setup [SCHED_FIFO with
high prio]. Doesn't seem to make a difference though on first sight. still
xruns plenty above 1ms.
playback only or rmmod'ing the network adapter driver or using the dummy
soundcard driver instead of snd-cs46xx doesn't make a difference either.
after rmmoding the network card driver i saw:
sis900 0000:00:03.0: Device was removed without properly calling
pci_disable_device(). This may need fixing.
in my dmesg. rmmod'ing snd-cs46xx gives me:
Sound Fusion CS46xx 0000:00:0f.0: Device was removed without properly
calling pci_disable_device(). This may need fixing.
too, so maybe i have already hit a BUG in the kernel and this screwed up all
further test results.
Will build a kernel with debugging stuff to see what's up. I'll also build a
VP only version to see if i still get these pci_disable_device() messages
with a "more vanilla" kernel ;)
flo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> BTW, this means that I have to re-enable LATENCY_TIMING back again?
>
> yes. I'd suggest to start with the simplest setup - i.e. just one
> fluidsynth instance running. I suspect 3-4 instances later on will be
> enough to trigger some xruns or at least some of the bigger delays.
>
> you possibly wont be able to debug the 'production' setup, but that's
> not an issue because the latencies should show up under just 2-3
> instances running as well.
>
>> > Also, i'd suggest to simply remove that line (or apply the attached
>> > patch) - does the driver still work fine with that?
>> >
>>
>> Now that you mention it, I remember hacking that very same line some
>> time ago, but couldn't get any better than a udelay(33). Removing that
>> line just ended in some kind of malfunction, but I can't remember what
>> exactly. One thing's for sure, sound didn't come out of it :-/
>
> ugh. Possibly some sort of interaction with the firmware and/or an
> outright driver bug?
>
Just confirmed that, by removing that udelay(100) line on ali5451.c, the
result is crappy sound (worse than it normally is :) and, most relevant to
the subject, I get a nasty jackd XRUN storm, without even blinking an eye.
Useless.
Regarding my test results, maybe this is just a distraction. I was just
comparing the kernels, not the hardware, jackd or the ali5451 alsa driver,
which were kept as constants along the evaluation.
In fact, those tests were only meant to confirm, with numbers from the
field, that RT-V0 is on par with RT-U3. Don't bother to compare it to your own
setup and/or hardware. These are the kind of things that YMMCV = Your
Mileage Certainly Varies :)
For example, in my own case, if those tests are done with ACPI disabled
(yes, with acpi=off), this laptop of mine just skews the results
completely: vanilla 2.6.9 gets better results, while the RT ones go
slumber. Go figure ;)
OK. I'm really running out of time now. Family's calling for the weekend.
Over the next few days I'll take the latency_trace route, as Ingo proposed,
patching ali5451 and jackd to issue a sys_gettimeofday(0, 0) and
gettimeofday(0, 1) trace on/off instrumentation respectively, while using
a proper RT-V0.5.x kernel patch line (or newer).
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
Today's run with -V0.5.11.
In short, the tracing appears to be broken on SMP systems and
may have caused the crash in the first attempt. In the second
attempt (no tracing), the system ran, but something is not
quite right in the scheduler (or in marking tasks "ready to run"),
since it appears that my CPU burner (a nice'd task) is getting
run when the system should run the X server & other applications.
Will be sending system logs and serial console output separately.
--Mark
-----
First attempt.
[1] Build / boot to X server OK. Only unusual message is the one
with the atomic counter underflow (in system log, not on console output).
[2] Set IRQ priorities & ran my "get_ltrace" script which looks for
changes in the latency trace output & dumps them out. Had one trace
and then soon after, the machine locked up. No console output for
several seconds, then finally a stream of errors.
The serial console messages appear to be incomplete at the beginning
(not sure why). Looks something like this
Starting system message bus: [ OK ]^M
(kdeinit/3196/CPU#0): new 1204 us maximum-latency wakeup.
(X/2832/CPU#1): new 68008 us maximum-latency wakeup.
(get_ltrace.sh/3482/CPU#1): new 3814 us maximum-latency wakeup.
(get_ltrace.sh/3530/CPU#1): new 3963 us maximum-latency wakeup.
(artsd/2965/CPU#1): new 78620 us maximum-latency wakeup.
(124)
[<c012454b>] profile_task_exit+0x1b/0x50 (24)
[<c012691f>] do_exit+0x1f/0x5c0 (40)
[note how the prologue data is missing & the output starts in
the middle of a line]
The critical section nesting was "unique" and looks like
| preempt count: 00010005 ]
| 5-level deep critical section nesting:
----------------------------------------
.. [<c03257cf>] .... _spin_lock+0x1f/0x70
.....[<c01e217a>] .. ( <= __up_write+0x26a/0x2a0)
.. [<c03257cf>] .... _spin_lock+0x1f/0x70
.....[<c01e1f65>] .. ( <= __up_write+0x55/0x2a0)
.. [<c0325817>] .... _spin_lock+0x67/0x70
.....[<c011b54d>] .. ( <= task_rq_lock+0x3d/0x70)
.. [<c03257cf>] .... _spin_lock+0x1f/0x70
.....[<c0115f47>] .. ( <= nmi_watchdog_tick+0x127/0x140)
.. [<c013d5bd>] .... print_traces+0x1d/0x60
.....[<c0105bec>] .. ( <= show_regs+0x14c/0x174)
The script then exits with preempt count of 3 and an atomic counter
underflow BUG message. This is followed right after with
BUG: Unable to handle kernel NULL pointer dereference at virtual address
00000020
and the appropriate dump messages. The cycle of script failure and
kernel BUG repeats a few cycles and then stops.
Let's reboot and not look at the latency traces.
Second attempt.
Boots OK and started the stress test.
[1] Got through about 80% of the X11perf stress test before the system
"live locked" again. The audio continued to come out but the display
stopped updating. I found that Alt-Sysrq-L and Alt-Sysrq-T seemed to
work so I did both while waiting for the audio test to stop on its own.
[2] Running the second test, a similar symptom occurred where the
display froze during the top test, but it finished as well.
[3] Running the third test (network output), I got a number of error
messages on the X display (but not on the serial console) indicating
segmentation violation for the command
sudo -u me rcp file remote-system
I was able to manipulate the display at this point.
[4] It then ran the fourth test (network input), and the symptom of
locking up the display repeated (still no serial console messages).
[5] Disk write test had the same display lockup (audio OK).
Did Alt-Sysrq-L and -T to see what the status of tasks was.
Noticed that the cpu_burn program (nice'd) was still the active
task even though I had higher priority jobs (e.g., the X server)
that should have been ready to run.
[6] Disk copy & read tests, same symptoms, though it unfroze sometime
later in the test (and then froze again...). Something is not
quite consistent in how the system runs under these stress tests.
Application level charts are "interesting". All of them show very
little overhead to the CPU task with a few glitches. The audio
loop is consistently getting done "early". The yellow line showing
the nominal audio cycle time is well above the white line showing
the actual duration. To recap, in latencytest, we have a loop like
this...
capture T1
CPU burn 80% of nominal audio duration
capture T2
write next audio segment
capture T3
where T2-T1 is the CPU time and T3-T1 is the audio time. Some of the
delays were extremely long, but I assume that was due to the occasional
use of Alt-Sysrq keys. Some may also be due to that scheduling problem
I described above as well.
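The loop described above can be sketched in C roughly as follows. This is only an illustration under assumptions, not latencytest's actual source: the names now_us, cycle_from_stamps and run_cycle, and the burn/write_audio callbacks, are hypothetical and exist only to show which timestamps feed which duration.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stdint.h>
#include <time.h>

/* Durations for one pass of the loop, in microseconds. */
struct cycle_times {
    int64_t cpu_us;   /* T2 - T1: time spent in the CPU burn */
    int64_t audio_us; /* T3 - T1: CPU burn plus the audio write */
};

/* Read a monotonic timestamp in microseconds. */
int64_t now_us(void)
{
    struct timespec ts;
    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000 + ts.tv_nsec / 1000;
}

/* Turn the three captured timestamps into the two charted durations. */
struct cycle_times cycle_from_stamps(int64_t t1, int64_t t2, int64_t t3)
{
    assert(t1 <= t2 && t2 <= t3); /* stamps come from a monotonic clock */
    struct cycle_times c = { t2 - t1, t3 - t1 };
    return c;
}

/* One loop pass; burn() and write_audio() are hypothetical callbacks. */
struct cycle_times run_cycle(void (*burn)(void), void (*write_audio)(void))
{
    int64_t t1 = now_us();   /* capture T1 */
    burn();                  /* CPU burn, ~80% of nominal audio duration */
    int64_t t2 = now_us();   /* capture T2 */
    write_audio();           /* write next audio segment (may block) */
    int64_t t3 = now_us();   /* capture T3 */
    return cycle_from_stamps(t1, t2, t3);
}
```

When the audio write blocks until the device has room, audio_us tracks the audio period; any scheduling delay that hits the task between T1 and T3 shows up as a spike in one or both values.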
* [email protected] <[email protected]> wrote:
> The critical section nesting was "unique" and looks like
>
> | preempt count: 00010005 ]
> | 5-level deep critical section nesting:
> ----------------------------------------
> .. [<c03257cf>] .... _spin_lock+0x1f/0x70
> .....[<c01e217a>] .. ( <= __up_write+0x26a/0x2a0)
> .. [<c03257cf>] .... _spin_lock+0x1f/0x70
> .....[<c01e1f65>] .. ( <= __up_write+0x55/0x2a0)
> .. [<c0325817>] .... _spin_lock+0x67/0x70
> .....[<c011b54d>] .. ( <= task_rq_lock+0x3d/0x70)
> .. [<c03257cf>] .... _spin_lock+0x1f/0x70
> .....[<c0115f47>] .. ( <= nmi_watchdog_tick+0x127/0x140)
> .. [<c013d5bd>] .... print_traces+0x1d/0x60
> .....[<c0105bec>] .. ( <= show_regs+0x14c/0x174)
this might as well have been the NMI watchdog interacting. Could you
turn off the NMI watchdog to see whether that stabilizes things?
> The script then exits with preempt count of 3 and an atomic counter
> underflow BUG message. This is followed right after with
>
> BUG: Unable to handle kernel NULL pointer dereference at virtual address
> 00000020
these are then probably just followup-errors. Will take a look at the
logs.
Ingo
(ksoftirqd/0/2/CPU#0): new 4 us maximum-latency wakeup.
(primert/2474/CPU#0): new 13 us maximum-latency wakeup.
(startx/2484/CPU#0): new 14 us maximum-latency wakeup.
(hotplug/2501/CPU#0): new 34 us maximum-latency wakeup.
agpgart: Found an AGP 2.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V2 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V2 device at 0000:01:00.0 into 4x mode
IRQ#11 thread RT prio: 43.
IRQ#5 thread RT prio: 42.
(ksoftirqd/0/2/CPU#0): new 1003 us maximum-latency wakeup.
On Fri, 2004-10-29 at 16:00 +0100, Rui Nuno Capela wrote:
> For example, in my own case, if those tests are done with ACPI disabled
> (yes, with acpi=off), this laptop of mine just skews the results
> completely: vanilla 2.6.9 gets better results, while the RT ones go
> slumber. Go figure ;)
I suspect that your laptop uses SMM traps to talk to the battery. That
could certainly explain the 700 us xruns, because SMM disables all
interrupts. This was covered recently in another thread. According to
Alan Cox most laptops work this way.
Has anyone been able to reproduce the problem on a desktop system?
Lee
On Fri, 2004-10-29 at 10:52 -0700, John Gilbert wrote:
> Hello all, Ingo,
> Here's a few bugs on boot with V0.5.2, and a question: what's needed to
> get back to the verbose latency messages of previous preempt patches
> (see the terse second log)?
There is a setting in /proc/sys/kernel for this, don't remember what
it's called but should be pretty clear.
Lee
* John Gilbert <[email protected]> wrote:
> Hello all, Ingo,
> Here's a few bugs on boot with V0.5.2, and a question: what's needed to
> get back to the verbose latency messages of previous preempt patches
> (see the terse second log)?
> (ksoftirqd/0/2/CPU#0): new 1003 us maximum-latency wakeup.
if you have LATENCY_TRACING enabled then the wakeup trace of the last
wakeup will be in /proc/latency_trace.
the reason that the messages are less verbose is that by default we are
not measuring critical sections anymore, but 'wakeup latency'. Wakeup
latency is measured from the point of wakeup to the point where the task
runs - so it makes no sense to dump the stack (which is why the previous
tracing output was more verbose) - a stackdump would always show the
scheduler codepath where we stop measuring.
you can switch back to critical section timing though, via:
echo 0 > /proc/sys/kernel/preempt_wakeup_timing
this will also turn the stackdumps back on. (those make sense in this
case because we measure 'start of critical section' to 'end of critical
section', in which case both a stackdump and the symbolic printout of
the start and end address are useful - because it's variable.)
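As a rough userspace model of the "new N us maximum-latency" reporting described here, the sketch below keeps a running maximum and flags each sample that exceeds it. It is not the patch's code: struct latency_max and latency_record are hypothetical names. The two modes differ only in where the start and end timestamps are taken (wakeup to first run, or start to end of a critical section).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Running maximum for one kind of measurement, in microseconds. */
struct latency_max {
    uint64_t max_us;
};

/*
 * Record one sample. Returns true when the sample sets a new maximum,
 * which is the point where a "new ... maximum-latency" line would be
 * printed and the matching trace saved.
 */
static bool latency_record(struct latency_max *m,
                           uint64_t start_us, uint64_t end_us)
{
    assert(end_us >= start_us);
    uint64_t delta = end_us - start_us;

    if (delta <= m->max_us)
        return false;   /* not a new worst case, stay quiet */
    m->max_us = delta;
    return true;
}
```

With wakeup timing the end point is always inside the scheduler, so a stack dump adds nothing; with critical-section timing the start and end addresses vary from sample to sample, which is why they are worth printing.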
Ingo
On Thu, Oct 28, 2004 at 01:57:19PM +0200, Ingo Molnar wrote:
> * Michal Schmidt <[email protected]> wrote:
>
> > > i've uploaded -V0.4.1 with a fix that could fix this networking
> > > deadlock. Does it work any better?
> >
> > Unfortunately, no. It's only slightly different:
>
> ok. I've uploaded -RT-V0.5 which includes a different approach to
> solving these netfilter related deadlocks. It can be downloaded from the
> usual place:
This is in -V5.14
bill
On Fri, 2004-10-29 at 16:31 +0200, Florian Schmidt wrote:
> i tried a V0.5.2 with PREEMPT_REALTIME and all debugging off (config
> attached). I cannot reproduce your results. I have experienced around 30
> xruns in 10 minutes. And big ones, too (> 5ms). I don't know exactly what
> kind of load triggers them. Here's a bit of qjackctl message window (btw:
> jackd was idle, no clients connected, except for qjackctl):
>
I am seeing the same behavior, about 30 xruns in 10 minutes. It seems
to be triggered by display activity, among other things. This cannot be
a jackd issue, because with an earlier version (T3) I can run for 24
hours without a single xrun.
There has to be a bug somewhere in the RT preempt patch.
Lee
On Fri, 29 Oct 2004 23:09:24 -0400, Lee Revell <[email protected]> wrote:
> On Fri, 2004-10-29 at 16:31 +0200, Florian Schmidt wrote:
> > i tried a V0.5.2 with PREEMPT_REALTIME and all debugging off (config
> > attached). I cannot reproduce your results. I have experienced around 30
> > xruns in 10 minutes. And big ones, too (> 5ms). I don't know exactly what
> > kind of load triggers them. Here's a bit of qjackctl message window (btw:
> > jackd was idle, no clients connected, except for qjackctl):
> >
>
> I am seeing the same behavior, about 30 xruns in 10 minutes. It seems
> to be triggered by display activity, among other things. This cannot be
> a jackd issue, because with an earlier version (T3) I can run for 24
> hours without a single xrun.
>
> There has to be a bug somewhere in the RT preempt patch.
>
> Lee
There is. This has been a well-known issue with many versions of
real-time voluntary preemption, and it is a main reason why I avoid it:
voluntary preemption performs flawlessly, whereas RT has been
horrendous. Hopefully the bugs will get smoothed out.
* Bill Huey <[email protected]> wrote:
> On Thu, Oct 28, 2004 at 01:57:19PM +0200, Ingo Molnar wrote:
> > * Michal Schmidt <[email protected]> wrote:
> >
> > > > i've uploaded -V0.4.1 with a fix that could fix this networking
> > > > deadlock. Does it work any better?
> > >
> > > Unfortunately, no. It's only slightly different:
> >
> > ok. I've uploaded -RT-V0.5 which includes a different approach to
> > solving these netfilter related deadlocks. It can be downloaded from the
> > usual place:
>
> This is in -V5.14
thanks - excellent trace - i hopefully fixed this in -V0.5.16, freshly
uploaded. This also made me notice an upstream buglet.
Ingo
[email protected] wrote:
>Hi,
>
>
>>I'm seeing an odd build error in the -U10.3 patch to 2.6.9-mm1:
>>
>> <snip>
>>
>> AS arch/i386/boot/compressed/head.o
>> CC arch/i386/boot/compressed/misc.o
>> OBJCOPY arch/i386/boot/compressed/vmlinux.bin
>>BFD: Warning: Writing section `.bss' to huge (ie negative) file offset
>>0xc03ac000.
>>objcopy: arch/i386/boot/compressed/vmlinux.bin: File truncated
>>make[2]: *** [arch/i386/boot/compressed/vmlinux.bin] Error 1
>>make[1]: *** [arch/i386/boot/compressed/vmlinux] Error 2
>>make: *** [bzImage] Error 2
>>
>>[root@otaku linux-2.6.9]# objdump -f vmlinux
>>
>>vmlinux: file format elf32-i386
>>architecture: i386, flags 0x00000112:
>>EXEC_P, HAS_SYMS, D_PAGED
>>start address 0x00100000
>>
>>This appears a result of changes in:
>>
>> arch/i386/kernel/vmlinux.lds.S
>>
>>apparently for support of CONFIG_KERN_PHYS_OFFSET.
>>This causes the kernel LMA start address to
>>change from 0xc0100000 to 0x100000 and objcopy to
>>gag. I rolled back to a 2.6.9-mm1 version of the
>>above linker map file and did get the kernel to
>>build and boot.
>>
>>Anyone else seeing this? .config attached.
>>
>
>Yes.
>
>You probably need to upgrade your binutils package. The .bss section's LMA
>start address is not handled the way it should be by ld.
>
>Another (bad) way to work around this compile problem is to force the .bss LMA
>start address with the following OBJCOPYFLAGS at objcopy time.
>
>OBJCOPYFLAGS := -O binary --change-section-lma .bss-0xc0000000 -R .note -R .comment -S
>
>Hope this helps,
>Remi
>
Already tried a recent (2.15) version of objcopy
with the same results. But LD was the issue.
BTW the offending version of binutils was 2.13.
Thanks!
--
[email protected]
On Fri, Oct 29, 2004 at 05:02:34PM -0700, Bill Huey wrote:
> This is in -V5.14
[nasty networking crash trace]
Didn't fix it all...
bill
* Bill Huey <[email protected]> wrote:
> On Fri, Oct 29, 2004 at 05:02:34PM -0700, Bill Huey wrote:
> > This is in -V5.14
>
> [nasty networking crash trace]
>
> Didn't fix it all...
thanks for the trace - i've uploaded -V0.6.6 to the usual place:
http://redhat.com/~mingo/realtime-preempt/
which attempts to fix this particular deadlock.
other changes in -V0.6.6:
- increased debuggability by turning deadlock detection on for ordinary
Linux semaphores and rwsems. I suspect that some of the recently
reported hangs were semaphore related deadlocks. Those who see hangs
please re-try with -V0.6.6 and deadlock detection turned on, does it
produce a usable deadlock printout?
- show_all_locks() build bug fix from Daniel Walker.
- another debuggability feature: deadlock tracing stops after the
first dump and the kernel tries to continue (we tried to do this
before but it wasn't complete). Sometimes the deadlock is in fact some
'interesting' use of Linux semaphores and the system will not
really deadlock. The dump we can use to fix up that interesting use
of semaphores, and the system won't crash.
- crash fix: the dump_own_stack() code was buggered - removed it. The
stock kernel does pretty good stackdumping by itself, all that was
needed to activate it was to set CONFIG_FRAME_POINTERS.
Ingo
On Tue, Nov 02, 2004 at 10:37:58AM +0100, Ingo Molnar wrote:
> * Bill Huey <[email protected]> wrote:
> > [nasty networking crash trace]
...
> which attempts to fix this particular deadlock.
getting closer...
http:590 BUG: lock held at task exit time!
[c03f9e84] {r:0,a:-1,kernel_sem.lock}
.. held by: http/ 590 [dc0508a0, 121]
... acquired at: __schedule+0x3ac/0x850
bill
* Bill Huey <[email protected]> wrote:
> On Tue, Nov 02, 2004 at 10:37:58AM +0100, Ingo Molnar wrote:
> > * Bill Huey <[email protected]> wrote:
> > > [nasty networking crash trace]
> ...
> > which attempts to fix this particular deadlock.
>
> getting closer...
>
> http:590 BUG: lock held at task exit time!
> [c03f9e84] {r:0,a:-1,kernel_sem.lock}
> .. held by: http/ 590 [dc0508a0, 121]
> ... acquired at: __schedule+0x3ac/0x850
hm. Something called do_exit() with the BKL held which is a no-no. Do
you have a stacktrace, is this sys_exit() or some other code calling
do_exit()?
Ingo
* Ingo Molnar <[email protected]> wrote:
> > http:590 BUG: lock held at task exit time!
> > [c03f9e84] {r:0,a:-1,kernel_sem.lock}
> > .. held by: http/ 590 [dc0508a0, 121]
> > ... acquired at: __schedule+0x3ac/0x850
>
> hm. Something called do_exit() with the BKL held which is a no-no. Do
> you have a stacktrace, is this sys_exit() or some other code calling
> do_exit()?
i've uploaded -V0.6.7 with a bug fixed in the new priority code
(affecting RT tasks and probably causing some of the deadlocks reported
while running Jackd or other RT apps). I also fixed another networking
deadlock.
Ingo
* Ingo Molnar <[email protected]> wrote:
> > getting closer...
> >
> > http:590 BUG: lock held at task exit time!
> > [c03f9e84] {r:0,a:-1,kernel_sem.lock}
> > .. held by: http/ 590 [dc0508a0, 121]
> > ... acquired at: __schedule+0x3ac/0x850
>
> hm. Something called do_exit() with the BKL held which is a no-no. Do
> you have a stacktrace, is this sys_exit() or some other code calling
> do_exit()?
ok, Thomas reported a similar one too, which comes from the NFS code.
Does the patch below fix the warning?
Ingo
--- linux/kernel/module.c.orig
+++ linux/kernel/module.c
@@ -35,6 +35,7 @@
#include <linux/notifier.h>
#include <linux/stop_machine.h>
#include <linux/device.h>
+#include <linux/smp_lock.h>
#include <asm/uaccess.h>
#include <asm/semaphore.h>
#include <asm/cacheflush.h>
@@ -97,6 +98,11 @@ static inline int strong_try_module_get(
void __module_put_and_exit(struct module *mod, long code)
{
module_put(mod);
+ /*
+ * Release the kernel lock if held:
+ */
+ while (current->lock_depth != -1)
+ unlock_kernel();
do_exit(code);
}
EXPORT_SYMBOL(__module_put_and_exit);
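The loop in the patch works because the big kernel lock is recursive: each task carries a nesting depth that is -1 when the lock is not held. A minimal userspace model of that counter, assuming nothing about the real locking beyond the depth bookkeeping (struct bkl_model and the model_* helpers are hypothetical names, not kernel APIs), looks like this:

```c
#include <assert.h>

/* Userspace model only: depth is -1 when the lock is not held. */
struct bkl_model {
    int lock_depth;
};

/* Nested acquire: only the -1 -> 0 transition would take the real lock. */
static void model_lock_kernel(struct bkl_model *t)
{
    t->lock_depth++;
}

/* Nested release: only the 0 -> -1 transition would drop the real lock. */
static void model_unlock_kernel(struct bkl_model *t)
{
    assert(t->lock_depth >= 0);
    t->lock_depth--;
}

/*
 * Drop every nested hold, mirroring the while loop added before
 * do_exit(). Returns the number of holds that were dropped.
 */
static int model_release_all(struct bkl_model *t)
{
    int dropped = 0;

    while (t->lock_depth != -1) {
        model_unlock_kernel(t);
        dropped++;
    }
    return dropped;
}
```

A single unlock would not be enough for a caller that took the lock twice, which is why the patch loops until the depth is back at -1.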
On Tuesday 02 November 2004 13:02, Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
> > > http:590 BUG: lock held at task exit time!
> > > [c03f9e84] {r:0,a:-1,kernel_sem.lock}
> > > .. held by: http/ 590 [dc0508a0, 121]
> > > ... acquired at: __schedule+0x3ac/0x850
> >
> > hm. Something called do_exit() with the BKL held which is a no-no. Do
> > you have a stacktrace, is this sys_exit() or some other code calling
> > do_exit()?
>
> i've uploaded -V0.6.7 with a bug fixed in the new priority code
> (affecting RT tasks and probably causing some of the deadlocks reported
> while running Jackd or other RT apps). I also fixed another networking
> deadlock.
>
Hi
This showed via netconsole when shutting down V0.6.7 (maybe already fixed in
V0.6.8?):
>>>>>> some noise while in runlevel 5, just for context:
EXT2-fs warning: mounting unchecked fs, running e2fsck is recommended
loop: loaded (max 8 devices)
>>>>>> Here or...
atkbd.c: Spurious ACK on isa0060/serio0. Some program, like XFree86, might be
trying access hardware directly.
>>>>>> ...here the shutdown takes over
usbcore: deregistering driver snd-usb-usx2y
ALSA /home/ka/kernel/2.6/linux-2.6.9-mm1-RT-V0.6.7/sound/usb/usbmidi.c:154:
urb status -108
ALSA /home/ka/kernel/2.6/linux-2.6.9-mm1-RT-V0.6.7/sound/usb/usbmidi.c:139:
usb_submit_urb: -90
VIA 82xx Audio 0000:00:07.5: Device was removed without properly calling
pci_disable_device(). This may need fixing.
>>>>>> (I think there was nothing mounted via nfs, so) this looks odd to me:
nfsd:2468 BUG: lock held at task exit time!
[c032d2e4] {r:0,a:-1,kernel_sem.lock}
.. held by: nfsd/ 2468 [ce1a3750, 116]
... acquired at: __sched_text_start+0x36b/0x6d0
nfsd/2468: BUG in __up_write
at /home/ka/kernel/2.6/linux-2.6.9-mm1-RT-V0.6.7/lib/rwsem-generic.c:1058
BUG: sleeping function called from invalid context nfsd(2468)
at /home/ka/kernel/2.6/linux-2.6.9-mm1-RT-V0.6.7/kernel/mutex.c:30
in_atomic():1 [00000003], irqs_disabled():0
[<c0107923>] dump_stack+0x23/0x30 (20)
[<c011a5aa>] __might_sleep+0xca/0xe0 (36)
[<c0134b89>] __mutex_lock+0x39/0x60 (24)
[<c0134bcd>] _mutex_lock+0x1d/0x30 (16)
[<c01494d5>] kmem_cache_alloc+0x45/0x110 (32)
[<c0277388>] alloc_skb+0x28/0xf0 (32)
[<c02894e6>] find_skb+0x36/0xb0 (24)
[<c0289671>] netpoll_send_udp+0x41/0x2b0 (48)
[<d08d106a>] write_msg+0x6a/0x120 [netconsole] (52)
[<c011de6e>] __call_console_drivers+0x6e/0x70 (32)
[<c011dfa6>] call_console_drivers+0xb6/0x160 (40)
[<c011e3c1>] release_console_sem+0x71/0x110 (36)
[<c011e260>] vprintk+0x110/0x180 (36)
[<c011e13d>] printk+0x1d/0x30 (16)
[<c01bb9d6>] __up_write+0x186/0x540 (68)
[<c01bc798>] up+0x78/0xd0 (36)
[<c02dcc46>] __sched_text_start+0x606/0x6d0 (84)
[<c0120852>] do_exit+0x2d2/0x560 (40)
[<c01381a1>] __module_put_and_exit+0x51/0x70 (16)
[<d0929574>] nfsd+0x2d4/0x3b0 [nfsd] (72)
[<c0105325>] kernel_thread_helper+0x5/0x10 (836747284)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c02dc68e>] .... __sched_text_start+0x4e/0x6d0
.....[<c0120852>] .. ( <= do_exit+0x2d2/0x560)
.. [<c01bc743>] .... up+0x23/0xd0
.....[<c02dcc46>] .. ( <= __sched_text_start+0x606/0x6d0)
.. [<c01bbd04>] .... __up_write+0x4b4/0x540
.....[<c01bc798>] .. ( <= up+0x78/0xd0)
.. [<c01368ad>] .... print_traces+0x1d/0x60
.....[<c0107923>] .. ( <= dump_stack+0x23/0x30)
[<c0107923>] dump_stack+0x23/0x30 (20)
[<c01bb9db>] __up_write+0x18b/0x540 (68)
[<c01bc798>] up+0x78/0xd0 (36)
[<c02dcc46>] __sched_text_start+0x606/0x6d0 (84)
[<c0120852>] do_exit+0x2d2/0x560 (40)
[<c01381a1>] __module_put_and_exit+0x51/0x70 (16)
[<d0929574>] nfsd+0x2d4/0x3b0 [nfsd] (72)
[<c0105325>] kernel_thread_helper+0x5/0x10 (836747284)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c02dc68e>] .... __sched_text_start+0x4e/0x6d0
.....[<c0120852>] .. ( <= do_exit+0x2d2/0x560)
.. [<c01bc743>] .... up+0x23/0xd0
.....[<c02dcc46>] .. ( <= __sched_text_start+0x606/0x6d0)
.. [<c01bbd04>] .... __up_write+0x4b4/0x540
.....[<c01bc798>] .. ( <= up+0x78/0xd0)
.. [<c01368ad>] .... print_traces+0x1d/0x60
.....[<c0107923>] .. ( <= dump_stack+0x23/0x30)
nfsd: last server has exited
nfsd: unexporting all filesystems
<<<<<<
Best,
Karsten
On Tue, Nov 02, 2004 at 12:45:22PM +0100, Ingo Molnar wrote:
> * Bill Huey <[email protected]> wrote:
> > getting closer...
> >
> > http:590 BUG: lock held at task exit time!
> > [c03f9e84] {r:0,a:-1,kernel_sem.lock}
> > .. held by: http/ 590 [dc0508a0, 121]
> > ... acquired at: __schedule+0x3ac/0x850
>
> hm. Something called do_exit() with the BKL held which is a no-no. Do
> you have a stacktrace, is this sys_exit() or some other code calling
> do_exit()?
Attached:
bill
i have released the -V0.7.1 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this release is mainly a merge of -V0.6.9 to 2.6.10-rc1-mm2.
I haven't done a proper changelog for a couple of days so here is a list
of bigger changes since -V0.4:
- implemented a first version of the priority inheritance handling and
priority inversion avoidance logic. This feature, after some initial
stability problems, solved the jackd and rtc_wakeup latencies that
were introduced by the ultra-finegrained locking in the -V series.
(the -T/U series had a coarser locking scheme that triggered far fewer
priority inversion scenarios. The locking in the -V series
was clearly the tipping point.)
The new PI code covers all synchronization objects in Linux (on
PREEMPT_REALTIME): spinlocks, rwlocks, semaphores and rwsems.
Feedback on the design of this code would be welcome, and patches as
well, if you have a better scheme. The code is pretty modular so feel
free to experiment with alternative schemes.
- completely reworked the debugging framework. All lock types
(spinlocks, rwlocks, semaphores and rwsems) are now tracked, both
their symbolic name and their place of acquire are traced and printed
out upon detection of a deadlock. More and better information is
printed upon a deadlock. Got rid of the 'semaphore owners array' in
debugging mode, this reduces the footprint of semaphores quite
significantly and speeds up deadlock detection.
- got rid of the separate 'counted semaphores' implementation, it was
too intrusive. Made the core 'generic semaphores' implementation
compatible with vanilla Linux counted semaphore semantics. This also
enabled the unrolling of the completion-handling cleanups which,
while being very nice, were getting intrusive as well.
- countless build and driver related reports/fixes from lots of people
- more latency breaks in the remaining critical sections. A
particularly important one was the irqs-off latency bugfix from
Thomas Gleixner.
- sped up the i8259 PIC and the PIT timer hardirq handling routines -
these are now in the path of the longest latency.
- cleaned up IRQ and signal preemption - there were missed
check-rescheds and possibilities for IRQ recursion.
- made ALSA's ioctl()s not use the BKL - this fixes more jackd
latencies.
to create a -V0.7.1 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm2/2.6.10-rc1-mm2.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm2-V0.7.1
Ingo
On Wednesday 03 November 2004 11:58, Ingo Molnar wrote:
>
> i have released the -V0.7.1 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this release is mainly a merge of -V0.6.9 to 2.6.10-rc2-mm2.
AS arch/i386/lib/checksum.o
CC arch/i386/lib/dec_and_lock.o
CC arch/i386/lib/delay.o
AS arch/i386/lib/getuser.o
CC arch/i386/lib/memcpy.o
CC arch/i386/lib/mmx.o
CC arch/i386/lib/strstr.o
CC arch/i386/lib/usercopy.o
AR arch/i386/lib/lib.a
GEN .version
CHK include/linux/compile.h
UPD include/linux/compile.h
CC init/version.o
LD init/mounts.o
LD init/built-in.o
LD .tmp_vmlinux1
net/built-in.o(.text+0x1887f): In function `netpoll_setup':
: undefined reference to `rcu_read_lock_up_read'
net/built-in.o(.text+0x188ed): In function `netpoll_setup':
: undefined reference to `rcu_read_lock_up_read'
make: *** [.tmp_vmlinux1] Error 1
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y
--
I route therefore you are
* Lorenzo Allegrucci <[email protected]> wrote:
> LD init/built-in.o
> LD .tmp_vmlinux1
> net/built-in.o(.text+0x1887f): In function `netpoll_setup':
> : undefined reference to `rcu_read_lock_up_read'
> net/built-in.o(.text+0x188ed): In function `netpoll_setup':
> : undefined reference to `rcu_read_lock_up_read'
> make: *** [.tmp_vmlinux1] Error 1
fixed in -V0.7.3.
Ingo
On Wednesday 03 November 2004 14:46, Ingo Molnar wrote:
>
> * Lorenzo Allegrucci <[email protected]> wrote:
>
> > LD init/built-in.o
> > LD .tmp_vmlinux1
> > net/built-in.o(.text+0x1887f): In function `netpoll_setup':
> > : undefined reference to `rcu_read_lock_up_read'
> > net/built-in.o(.text+0x188ed): In function `netpoll_setup':
> > : undefined reference to `rcu_read_lock_up_read'
> > make: *** [.tmp_vmlinux1] Error 1
>
> fixed in -V0.7.3.
I've just tried V0.7.3 but my PS/2 mouse and keyboard don't work.
No message from the kernel. I attach my .config
--
I route therefore you are
>i have released the -V0.7.1 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
V0.7.1 had build problems but after applying V0.7.7, I got a clean build
and was able to do some testing.
- single user and telinit 5 were uneventful
- preempt_wakeup_timing (PWT) is still generating false positives on SMP
- after disabling PWT, I ran for over an hour without any latency
traces > 200 usec.
- no crashes, lockups, or other fatal behavior in that same period
- X test was generally OK. A few bursts of high overhead (> 1 msec)
on the CPU task and worst case was > 2 msec.
- top test was a little cleaner, but its worst case was much worse
(over 17 msec).
- network tests were similar, with even longer worst cases (21 msec and
32 msec)
- disk I/O tests had some really odd results. The write test was much
cleaner than read / copy. All tests had > 25 msec worst cases. The read
test also had a pretty consistent variation on CPU overhead, about 500 usec
range (compared to 1160 usec nominal duration) in CPU loop timing.
A few other things I noticed:
- whenever the real time test was active, responses to ping from another
system would basically stop until the real time test was done. In one case
about 25 ping packets were returned after a huge delay. From that, it
appears they were received but the return was delayed.
- cat /proc/interrupts showed that LOC was increasing on both CPU's
during the tests.
- the scheduler seems to prefer to run my cpu_burn (nice'd) task instead
of updating the X display, doing the latency timing checks, ping responses,
and anything else that does useful work.
- the disk write test was REALLY SLOW, perhaps hundreds of Kbytes per
second instead of what I normally see. It took much longer than the real
time audio test. I checked with top and noticed that "fam" was taking
near 100% of CPU time. I closed my konqueror window (just happened to be
looking at my test directory) and fam usage went away and the disk writes
sped up considerably. I don't recall seeing this symptom on -T3 or
previous tests, may try that later today to see if this is an old problem
or not.
This appears to be more stable than anything else since -T3 but the
odd spikes and scheduling symptoms are quite troubling.
--Mark
Ingo Molnar wrote:
> The new PI code covers all synchronization objects in Linux (on
> PREEMPT_REALTIME): spinlocks, rwlocks, semaphores and rwsems.
> Feedback on the design of this code would be welcome, and patches as
> well, if you have a better scheme. The code is pretty modular so feel
> free to experiment with alternative schemes.
I didn't see closure being performed of a possible blocked-owner
dependency chain, but only promotion of the immediate owner. It
is possible for a mutex owner to itself be blocked on another mutex
requiring promotion of the latter mutex owner(s).
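The closure described here can be sketched as a walk along the blocked-owner chain. This is a simplified model under assumptions, not the patch's PI code: struct mutex_model, struct task and boost_owner_chain are hypothetical names, a larger prio value means higher priority, and each task blocks on at most one lock.

```c
#include <assert.h>
#include <stddef.h>

struct task;

struct mutex_model {
    struct task *owner;             /* NULL when unlocked */
};

struct task {
    int prio;                       /* larger value = higher priority here */
    struct mutex_model *blocked_on; /* NULL when the task is runnable */
};

/*
 * Boost the owner of 'lock' to the waiter's priority, then follow the
 * chain: if that owner is itself blocked, boost the owner of the lock
 * it waits on, and so on. Returns the number of tasks boosted.
 */
static int boost_owner_chain(struct task *waiter, struct mutex_model *lock)
{
    int boosted = 0;
    struct task *owner = lock->owner;

    /*
     * Stop at an unlocked mutex, at the waiter itself (a cycle), or at
     * an owner that already runs at the waiter's priority or above.
     */
    while (owner != NULL && owner != waiter && owner->prio < waiter->prio) {
        owner->prio = waiter->prio;
        boosted++;
        if (owner->blocked_on == NULL)
            break;                  /* end of the chain: owner can run */
        owner = owner->blocked_on->owner;
    }
    return boosted;
}
```

Promoting only the immediate owner would leave the last task in such a chain at low priority, so the high-priority waiter could still be delayed by whatever preempts that task.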
--
[email protected]
* Lorenzo Allegrucci <[email protected]> wrote:
> On Wednesday 03 November 2004 18:53, Lorenzo Allegrucci wrote:
> > On Wednesday 03 November 2004 14:46, Ingo Molnar wrote:
> > >
> > > * Lorenzo Allegrucci <[email protected]> wrote:
> > >
> > > > LD init/built-in.o
> > > > LD .tmp_vmlinux1
> > > > net/built-in.o(.text+0x1887f): In function `netpoll_setup':
> > > > : undefined reference to `rcu_read_lock_up_read'
> > > > net/built-in.o(.text+0x188ed): In function `netpoll_setup':
> > > > : undefined reference to `rcu_read_lock_up_read'
> > > > make: *** [.tmp_vmlinux1] Error 1
> > >
> > > fixed in -V0.7.3.
> >
> > I've just tried V0.7.3 but my PS/2 mouse and keyboard don't work.
> > No message from the kernel. I attach my .config
>
> Problem solved disabling ACPI.
does the same problem happen in vanilla 2.6.10-rc1-mm2 too?
Ingo
On Wednesday 03 November 2004 18:53, Lorenzo Allegrucci wrote:
> On Wednesday 03 November 2004 14:46, Ingo Molnar wrote:
> >
> > * Lorenzo Allegrucci <[email protected]> wrote:
> >
> > > LD init/built-in.o
> > > LD .tmp_vmlinux1
> > > net/built-in.o(.text+0x1887f): In function `netpoll_setup':
> > > : undefined reference to `rcu_read_lock_up_read'
> > > net/built-in.o(.text+0x188ed): In function `netpoll_setup':
> > > : undefined reference to `rcu_read_lock_up_read'
> > > make: *** [.tmp_vmlinux1] Error 1
> >
> > fixed in -V0.7.3.
>
> I've just tried V0.7.3 but my PS/2 mouse and keyboard don't work.
> No message from the kernel. I attach my .config
Problem solved disabling ACPI.
--
I route therefore you are
I reran my tests on -T3 to see if the same symptoms I saw with -V0.7.7
were present with the older (preempt, not RT) patches.
> - whenever the real time test was active, responses to ping from another
>system would basically stop until the real time test was done. In one case
>about 25 ping packets were returned after a huge delay. From that, it
>appears they were received but the return was delayed.
Same with -T3. What's really odd is that it stopped during the network
tests as well; may indicate that the network tests don't actually run
during the real time audio test. Hmm. Will modify the stress_neti and
stress_neto scripts I use to dump after each file transfer & see if
this is true or not. That certainly was not the case on 2.4 kernels
so this looks like a serious regression.
> - cat /proc/interrupts showed that LOC was increasing on both CPU's
>during the tests.
Did not check this, but I wasn't seeing the severe lockups of the display
on -T3 either. Yes - it is sometimes slow to update, but not stopping
display updates for extended periods.
> - the scheduler seems to prefer to run my cpu_burn (nice'd) task instead
>of updating the X display, doing the latency timing checks, ping responses,
>and anything else that does useful work.
To some extent, I see this symptom too. I watched the system with top
during the network and disk tests and it would stop updating for several
seconds (should be one second updates) during the test (and usually show
cpu_burn at > 90% CPU), then do a flurry of updates, and then sometimes
settle down to the one per second update for several seconds in a row.
> - the disk write test was REALLY SLOW, perhaps hundreds of Kbytes per
>second instead of what I normally see. It took much longer than the real
>time audio test. I checked with top and noticed that "fam" was taking
>near 100% of CPU time. I closed my konqueror window (just happened to be
>looking at my test directory) and fam usage went away and the disk writes
>sped up considerably.
This was much less severe in -T3. What I saw was that fam would show up
for several seconds and then disappear from the top list for several
seconds. The disk transfer speed also appeared to be much faster on -T3
than -V0.7.7 when fam was active. (based on how much clock time the
test took to perform)
--Mark H Johnson
<mailto:[email protected]>
On Wednesday 03 November 2004 21:43, Ingo Molnar wrote:
>
> * Lorenzo Allegrucci <[email protected]> wrote:
>
> > On Wednesday 03 November 2004 18:53, Lorenzo Allegrucci wrote:
> > > On Wednesday 03 November 2004 14:46, Ingo Molnar wrote:
> > > >
> > > > * Lorenzo Allegrucci <[email protected]> wrote:
> > > >
> > > > > LD init/built-in.o
> > > > > LD .tmp_vmlinux1
> > > > > net/built-in.o(.text+0x1887f): In function `netpoll_setup':
> > > > > : undefined reference to `rcu_read_lock_up_read'
> > > > > net/built-in.o(.text+0x188ed): In function `netpoll_setup':
> > > > > : undefined reference to `rcu_read_lock_up_read'
> > > > > make: *** [.tmp_vmlinux1] Error 1
> > > >
> > > > fixed in -V0.7.3.
> > >
> > > I've just tried V0.7.3 but my PS/2 mouse and keyboard don't work.
> > > No message from the kernel. I attach my .config
> >
> > Problem solved disabling ACPI.
>
> does the same problem happen in vanilla 2.6.10-rc1-mm2 too?
No it doesn't.
--
I route therefore you are
Ingo Molnar wrote:
> i have released the -V0.7.1 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this release is mainly a merge of -V0.6.9 to 2.6.10-rc2-mm2.
>
[snip]
I just wanted to tell you that my network problems with the e100 driver
disappeared. I still get the BUG in enable_irq, but now the network
works. I dunno if this is due to 2.6.10-rc2-mm2 fixes or your own fixes,
but i'm now happy :)
I'm going to try this patch on a network game server, that's pretty
latency demanding.
Regards,
Magnus
Hi,
Here are my latest stats, now including 2.6.10-rc1, running
jackd -R -r44100 -p64 -n2 with 8 loaded fluidsynth instances, on a P4
2.533GHz UP 512MB laptop.
The main change from my previously posted results is that the buffer/period
size has been decreased from 128 to 64 frames, making the test a bit/much
more demanding than before.
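To put the period change in time units, here is a minimal arithmetic sketch. It assumes the jackd options above mean a 44100 Hz sample rate (-r), 64 frames per period (-p) and 2 periods per buffer (-n); the helper names are hypothetical.

```python
def period_usecs(frames: int, sample_rate: int) -> float:
    """Duration of one audio period in microseconds."""
    return frames / sample_rate * 1_000_000


def buffer_latency_usecs(frames: int, sample_rate: int, periods: int) -> float:
    """Total buffering latency for `periods` periods of `frames` frames each."""
    return periods * period_usecs(frames, sample_rate)


if __name__ == "__main__":
    for frames in (128, 64):
        print(frames,
              round(period_usecs(frames, 44100)),
              round(buffer_latency_usecs(frames, 44100, 2)))
```

Under these assumptions a 64-frame period lasts about 1451 usecs, half of the roughly 2902 usecs available with 128 frames, so each processing cycle has about half the time budget.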
                               2.6.10-rc1  2.6.10-rc1-mm2  RT-V0.7.7
                               ----------  --------------  ---------
XRUN Rate . . . . . . . . . :       687.6           832.0       45.6  /hour
Delay Rate (>spare time)  . :      1354.8          1517.3       43.2  /hour
Delay Rate (>1000 usecs)  . :      1146.0          1245.3        3.6  /hour
Delay Maximum . . . . . . . :        4510            8786       1249  usecs
Cycle Maximum . . . . . . . :         957             976        946  usecs
Average DSP Load  . . . . . :        51.4            51.9       55.2  %
Average CPU System Load . . :         6.4             6.7       13.2  %
Average CPU User Load . . . :        40.3            40.4       41.9  %
Average CPU Nice Load . . . :         0.0             0.0        0.0  %
Average CPU I/O Wait Load . :         0.0             0.1        0.1  %
Average CPU IRQ Load  . . . :         0.0             0.0        0.0  %
Average CPU Soft-IRQ Load . :         0.0             0.0        0.0  %
Average Interrupt Rate  . . :      1673.7          1674.3     1675.4  /sec
Average Context-Switch Rate :     10967.4         10967.9    13940.9  /sec
At first look, the -mm2 branch is performing quite badly, even though it
has been configured with the PREEMPT_BKL option set.
S/W versions are:
jackd 0.99.10
fluidsynth 1.0.5
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Magnus Naeslund(t) <[email protected]> wrote:
> Ingo Molnar wrote:
> >i have released the -V0.7.1 Real-Time Preemption patch, which can be
> >downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> >
> >this release is mainly a merge of -V0.6.9 to 2.6.10-rc2-mm2.
> >
> [snip]
>
> I just wanted to tell you that my network problems with the e100
> driver disappeared. I still get the BUG in enable_irq, but now the
> network works. I dunno if this is due to 2.6.10-rc2-mm2 fixes or your
> own fixes, but i'm now happy :)
while doing the merge i noticed and removed an older hack i added to the
e100 driver (and the rtl8139 driver) - possibly this could be related.
Ingo
Let me follow up briefly on the regression I noticed yesterday on ping
responses from an SMP system with one real time task running. Two systems
were used for the tests, both dual 866 MHz Pentium III systems, identical
except for the software. The "old" machine is running Red Hat 7.3 with a
2.4.24 preempt, low latency kernel. The "new" machine is running Fedora
Core 2 with a 2.6 kernel as indicated below. The other difference that may
be significant is the "old" system uses OSS and the new one uses ALSA for
the audio output (source of latencytest application is unchanged for all
three tests).
The data below is from using another machine to ping the system under test.
2.4.24 preempt + low latency on "old" machine
430 seconds for the complete series of tests
0 lost packets
0.248, 0.322, 2.82, 0.145 (min, average, max, deviation)
2.6.5-1.358smp [from fedora core 2] on "new" machine
658 seconds for the complete series of tests
0 lost packets
0.148, 0.207, 1.952, 0.097 (min, average, max, deviation)
This system also lost the mouse (screaming interrupts, IRQ 10 disabled).
2.6.9-rc3-mm3-VP-T3 on "new" machine
955 seconds for the complete series of tests
539 lost packets
0.215, 17971, 287799, 63054 (min, average, max, deviation)
I did not repeat the tests on -V0.7.7, but I expect them to come out
similar to -T3 above based on what I saw yesterday. In any case, the loss
of network data appears significant with both the voluntary preempt &
realtime preempt patches on 2.6 kernels.
--Mark H Johnson
<mailto:[email protected]>
* [email protected] <[email protected]> wrote:
> Let me follow up briefly on the regression I noticed yesterday on ping
> responses from an SMP system with one real time task running. [...]
icmp/ping replies are handled by ksoftirqd. Once a networking request
has been handed to ksoftirqd it cannot be redirected to another CPU,
because softirq processing is fundamentally per-CPU. So if the network
interrupt hits the CPU where the RT-task is running then the RT task
will starve that ksoftirqd instance (and hence the reply) even if another
CPU in the system is idle.
i agree that this is an SMP/RT artifact that should be fixed. hardirq
workload can be redirected to other CPUs because it's single-threaded,
but it's not that easy for softirq workload.
i suspect the same phenomenon causes some of the other scheduling
artifacts ('frozen' X) you've noticed.
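The starvation pattern described here can be illustrated with a toy model. It is only a sketch under strong assumptions: each CPU runs the highest-priority task on its own queue, nothing migrates, and the priority numbers are placeholders (30 for the RT task, 0 for an unboosted ksoftirqd). It does not reflect the real scheduler.

```python
def run_one_tick(runqueues):
    """Pick what each CPU runs, independently of the other CPUs.

    runqueues maps cpu number -> list of (task_name, prio) tuples.
    Returns cpu number -> chosen task name, or None for an idle CPU.
    No migration between queues is modelled.
    """
    chosen = {}
    for cpu, queue in runqueues.items():
        chosen[cpu] = max(queue, key=lambda t: t[1])[0] if queue else None
    return chosen


if __name__ == "__main__":
    # RT task shares CPU 0 with the pinned ksoftirqd/0; CPU 1 is idle.
    print(run_one_tick({0: [("latencytest", 30), ("ksoftirqd/0", 0)], 1: []}))
    # ksoftirqd/0 raised above the RT task.
    print(run_one_tick({0: [("latencytest", 30), ("ksoftirqd/0", 99)], 1: []}))
    # RT task confined to CPU 1 instead.
    print(run_one_tick({0: [("ksoftirqd/0", 0)], 1: [("latencytest", 30)]}))
```

In the first scenario ksoftirqd/0 never runs although CPU 1 has nothing to do. The other two scenarios are the obvious ways out in this model: raise the thread's priority, or keep the RT task off that CPU.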
Ingo
* Ingo Molnar <[email protected]> wrote:
> icmp/ping replies are handled by ksoftirqd. Once a networking request
> has been handed to ksoftirqd it cannot be redirected to another CPU,
> because softirq processing is fundamentally per-CPU. So if the network
> interrupt hits the CPU where the RT-task is running then the RT task
> will starve that ksoftirqd instance (and hence the reply) even if
> another CPU in the system is idle.
>
> i agree that this is an SMP/RT artifact that should be fixed. hardirq
> workload can be redirected to other CPUs because it's single-threaded,
> but it's not that easy for softirq workload.
>
> i suspect the same phenomenon causes some of the other scheduling
> artifacts ('frozen' X) you've noticed.
does the ping phenomenon go away if you chrt both the networking IRQ
thread and both ksoftirqd's to above the RT task's priority? (this
doesnt solve the problem though - that task has RT priority for a reason
and on an SMP box the kernel should be able to schedule work without
getting into this starvation scenario.)
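A minimal sketch of how the chrt step could be scripted. It assumes input in the format of `ps -eo pid,comm` and only matches the ksoftirqd/N names; the IRQ thread names vary between patch versions and are not guessed here. The helper name and the default priority of 99 are illustrative choices.

```python
import re

# Matches the per-CPU softirq threads, e.g. "ksoftirqd/0".
KSOFTIRQD = re.compile(r"^ksoftirqd/\d+$")


def chrt_commands(ps_output, prio=99):
    """Build `chrt -f -p PRIO PID` argument lists for every ksoftirqd thread.

    ps_output is the text printed by `ps -eo pid,comm`; header lines and
    anything that does not start with a numeric PID are skipped.
    """
    commands = []
    for line in ps_output.splitlines():
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue
        pid, comm = parts[0], parts[1].strip()
        if KSOFTIRQD.match(comm):
            commands.append(["chrt", "-f", "-p", str(prio), pid])
    return commands
```

Each returned list can be handed to subprocess.run(cmd, check=True) as root. SCHED_FIFO (-f) is used here only because the RT task is assumed to be SCHED_FIFO as well.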
Ingo
>does the ping phenomenon go away if you chrt both the networking IRQ
>thread and both ksoftirqd's to above the RT task's priority?
For the most part, yes. I reran the test with -V0.7.7 and had continuous
ping responses until the system locked up with yet another deadlock. This
did NOT fix the display / mouse movement lockups. All IRQ and ksoftirqd
tasks were RT 99 priority for this test. latencytest ran at RT 30 priority.
The response time while RT was active looked like this...
# ping dws77
PING dws77 (192.52.216.87) from 192.52.215.17 : 56(84) bytes of data.
64 bytes from dws77 (192.52.216.87): icmp_seq=1 ttl=63 time=0.590 ms
64 bytes from dws77 (192.52.216.87): icmp_seq=2 ttl=63 time=0.468 ms
64 bytes from dws77 (192.52.216.87): icmp_seq=3 ttl=63 time=0.542 ms
64 bytes from dws77 (192.52.216.87): icmp_seq=4 ttl=63 time=0.492 ms
Note the response times are about 2x what I saw with the other kernels.
The max delay was about 200 msec.
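For comparison with the (min, average, max, deviation) summaries, the same four figures can be recomputed from raw round-trip times. This is a sketch that assumes the deviation column is a population standard deviation, which is not stated in this thread; rtt_summary is a hypothetical helper name.

```python
import math


def rtt_summary(samples_ms):
    """Return (min, average, max, deviation) for round-trip times in ms."""
    n = len(samples_ms)
    mean = sum(samples_ms) / n
    variance = sum((x - mean) ** 2 for x in samples_ms) / n
    return min(samples_ms), mean, max(samples_ms), math.sqrt(variance)


if __name__ == "__main__":
    # The four replies shown above.
    print(rtt_summary([0.590, 0.468, 0.542, 0.492]))
```

For the four replies shown, the average works out to 0.523 ms, against averages of 0.322 ms and 0.207 ms in the earlier 2.4.24 and 2.6.5 runs.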
The deadlock was between the two ksoftirqd tasks...
===============================================
BUG: circular semaphore deadlock detected!
-----------------------------------------------
ksoftirqd/1/6 is deadlocking current task ksoftirqd/0/3
1) ksoftirqd/0/3 is trying to acquire this lock:
[cb4640c0] {r:0,a:-1,&((sk)->sk_lock.slock)}
.. held by: ksoftirqd/1/ 6 [dff866f0, 0]
... acquired at: tcp_delack_timer+0x22/0x220
... trying at: tcp_v4_rcv+0x69b/0xb00
2) ksoftirqd/1/6 is blocked on this lock:
[c03c8900] {r:2,a:-1,ptype_lock}
.. held by: ksoftirqd/0/ 3 [dffe8020, 0]
... acquired at: net_rx_action+0x8e/0x200
------------------------------
| showing all locks held by: | (ksoftirqd/0/3 [dffe8020, 0]):
------------------------------
#001: [d84a7c30] {r:0,a:-1,&tp->rx_lock}
... acquired at: rtl8139_poll+0x48/0x180 [8139too]
------------------------------
| showing all locks held by: | (ksoftirqd/1/6 [dff866f0, 0]):
------------------------------
#001: [cb4640c0] {r:0,a:-1,&((sk)->sk_lock.slock)}
... acquired at: tcp_delack_timer+0x22/0x220
Appears that both were working on network operations concurrently.
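The report above describes a circular wait: each task is blocked on a lock that the other one holds. A minimal sketch of that check on a wait-for graph follows. The dictionaries are filled in by hand from the report, and the function is an illustration, not the detector used by the patch.

```python
def find_deadlock_cycle(waits_for, held_by):
    """Return the tasks forming a circular wait, or [] if there is none.

    waits_for: task -> lock the task is blocked on
    held_by:   lock -> task currently holding that lock
    """
    for start in waits_for:
        path = [start]
        task = start
        while task in waits_for:
            owner = held_by.get(waits_for[task])
            if owner is None:          # lock is free: no cycle along this path
                break
            if owner in path:          # walked back to a task already on the path
                return path[path.index(owner):]
            path.append(owner)
            task = owner
    return []


if __name__ == "__main__":
    waits_for = {"ksoftirqd/0": "sk_lock.slock", "ksoftirqd/1": "ptype_lock"}
    held_by = {"sk_lock.slock": "ksoftirqd/1", "ptype_lock": "ksoftirqd/0"}
    print(find_deadlock_cycle(waits_for, held_by))
```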
Will send the full serial console log separately.
--Mark
* [email protected] <[email protected]> wrote:
> >does the ping phenomenon go away if you chrt both the networking IRQ
> >thread and both ksoftirqd's to above the RT task's priority?
>
> For the most part, yes. I reran the test with -V0.7.7 and had
> continuous ping responses until the system locked up with yet another
> deadlock. This did NOT fix the display / mouse movement lockups. All
> IRQ and ksoftirqd tasks were RT 99 priority for this test. latencytest
> ran at RT 30 priority.
what priority does events/0 and events/1 have? keventd handles part of
the mouse/keyboard workload.
> The deadlock was between the two ksoftirqd tasks...
there was one place missing - does the patch below fix this type of
deadlock?
Ingo
--- linux/net/ipv4/tcp_timer.c.orig2
+++ linux/net/ipv4/tcp_timer.c
@@ -208,6 +208,7 @@ static void tcp_delack_timer(unsigned lo
struct sock *sk = (struct sock*)data;
struct tcp_opt *tp = tcp_sk(sk);
+ rcu_read_lock_read(&ptype_lock);
bh_lock_sock(sk);
if (sock_owned_by_user(sk)) {
/* Try again later. */
@@ -261,6 +262,7 @@ out:
sk_stream_mem_reclaim(sk);
out_unlock:
bh_unlock_sock(sk);
+ rcu_read_unlock_read(&ptype_lock);
sock_put(sk);
}
* [email protected] <[email protected]> wrote:
> >does the ping phenomenon go away if you chrt both the networking IRQ
> >thread and both ksoftirqd's to above the RT task's priority?
>
> For the most part, yes. I reran the test with -V0.7.7 and had
> continuous ping responses until the system locked up with yet another
> deadlock. This did NOT fix the display / mouse movement lockups. All
> IRQ and ksoftirqd tasks were RT 99 priority for this test. latencytest
> ran at RT 30 priority.
another method would be to set all smp_affinity values in /proc/irq/*/
to 1 (i.e. let CPU#0 handle all IRQs), and start latencytest on CPU#1,
via 'taskset'. In theory this should ensure that no hardirq workload
runs on CPU#1 and thus ksoftirqd would not be active there either. (with
the exception of kernel timers started on that CPU, by latencytest.)
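A minimal sketch of the two pieces involved: computing the hex bitmask written to an smp_affinity file, and confining a process to a CPU set the way taskset does. os.sched_setaffinity() and os.sched_getaffinity() are real Linux-only Python calls; the helper names are hypothetical.

```python
import os


def smp_affinity_mask(cpus):
    """Hex bitmask string for a set of CPU numbers, e.g. [0] -> "1"."""
    mask = 0
    for cpu in cpus:
        mask |= 1 << cpu
    return format(mask, "x")


def pin_current_process(cpus):
    """Restrict the calling process to `cpus` (Linux only) and return the result."""
    os.sched_setaffinity(0, set(cpus))
    return os.sched_getaffinity(0)
```

With two CPUs, smp_affinity_mask([0]) gives "1" (CPU#0 only) and smp_affinity_mask([1]) gives "2". Writing the mask to the files under /proc/irq/ requires root and is left out of the sketch.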
Ingo
>what priority does events/0 and events/1 have? keventd handles part of
>the mouse/keyboard workload.
The default priorities and not RT.
ps -eo pid,pri,rtprio,cmd
...
6 34 - [events/0]
7 34 - [events/1]
...
I can set those as well but then I'd probably have to follow with
the X server and everything else in the chain. The starvation problem
ripples across the system.
Will try the patch shortly and get back on the results later today.
--Mark H Johnson
<mailto:[email protected]>
* [email protected] <[email protected]> wrote:
> >what priority does events/0 and events/1 have? keventd handles part of
> >the mouse/keyboard workload.
> The default priorities and not RT.
>
> ps -eo pid,pri,rtprio,cmd
> ...
> 6 34 - [events/0]
> 7 34 - [events/1]
> ...
> I can set those as well but then I'd probably have to follow with
> the X server and everything else in the chain. The starvation problem
> ripples across the system.
X should be scheduled on the other CPU just fine. Only per-CPU kernel
threads (which are affine to their particular CPU) are affected by this
problem - ordinary tasks not. I.e. the system threads that have /0 and
/1 in their name. In theory you should not even need to chrt the hardirq
threads, those should schedule fine too.
Ingo
* Ingo Molnar <[email protected]> wrote:
> X should be scheduled on the other CPU just fine. Only per-CPU kernel
> threads (which are affine to their particular CPU) are affected by
> this problem - ordinary tasks not. I.e. the system threads that have
> /0 and /1 in their name. In theory you should not even need to chrt
> the hardirq threads, those should schedule fine too.
plus there's the 'priority inheritance dependency-chain closure' bug
noticed by John Cooper - that should only affect the latency of RT tasks
though.
Ingo
>> I can set those as well but then I'd probably have to follow with
>> the X server and everything else in the chain. The starvation problem
>> ripples across the system.
>
>X should be scheduled on the other CPU just fine. Only per-CPU kernel
>threads (which are affine to their particular CPU) are affected by this
>problem - ordinary tasks not. I.e. the system threads that have /0 and
>/1 in their name. In theory you should not even need to chrt the hardirq
>threads, those should schedule fine too.
Perhaps "should be fine" but the test I just ran indicates otherwise.
The kernel is -V0.7.7 plus the patch you just sent me.
All IRQ and /# tasks were set to RT priority 99.
Started the X test and the display locked up almost immediately while
ping responses continued to flow on a regular basis. After several seconds
I could see the display update / move the mouse & then the display locked
up again. It went back and forth a couple cycles and did not get unstuck
until the RT audio application quit (over 250000 samples).
I will let it run to see if I can reproduce the deadlock or if the symptoms
change with one of the other tests.
--Mark H Johnson
<mailto:[email protected]>
>Perhaps "should be fine" but the test I just ran indicates otherwise.
>The kernel is -V0.7.7 plus the patch you just sent me.
>All IRQ and /# tasks were set to RT priority 99.
A follow up to my previous message since the test just completed
with the following results:
2.6.10-rc1-mm2-RT-V0.7.7
181 packets lost (5%)
0.343, 2525.872, 213979, 21685 (min, average, max, deviation)
elapsed time was 3090 seconds
There was a burst of lost packets between seq 1777 and 1959,
that appears to cover all the lost ones. There was also a big
delay (up to 27305 msec) at seq 1665 through 1699 but no lost
packets. If I throw out those data points, the max drops to
something like 1800 msec and the average is in the 0.4 to 0.5
msec range.
The display lockup on the top test was a little odd. The window
that should have shown top appeared blank, stayed on the screen
for several seconds and then disappeared by itself. Apparently
top didn't even get enough CPU time to display the first cycle
before its time ran out. The audio test continued to run a while
after that & then stopped itself when the script noticed that top
was done.
The display lockups on the other tests (network and disk I/O)
were much less severe though still present at times.
--Mark H Johnson
<mailto:[email protected]>
Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
>
>
>>X should be scheduled on the other CPU just fine. Only per-CPU kernel
>>threads (which are affine to their particular CPU) are affected by
>>this problem - ordinary tasks not. I.e. the system threads that have
>>/0 and /1 in their name. In theory you should not even need to chrt
>>the hardirq threads, those should schedule fine too.
>
>
> plus there's the 'priority inheritance dependency-chain closure' bug
> noticed by John Cooper - that should only affect the latency of RT tasks
> though.
This is a fairly gnarly problem to address. The obvious
solution is to hold spinlocks in the mutexes as the dependency
tree is atomically traversed. However this will deadlock under
MP due to the unpredictable order of mutexes traversed. If the
dependency chain is not traversed (and semantics applied)
atomically, races exist which cause promotion decisions to be
made on [now] stale data.
The simple solution is a global spinlock which doesn't scale
well under MP. Another possible solution would be conditional
traversal of the chain where contention within the chain under
foot (from another chain walker) causes the traversal to
abort and retry. Though this has the down side of being
nondeterministic.
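The abort-and-retry idea can be sketched in user space with non-blocking lock attempts. This is an illustration under assumptions: the chain is a plain list of threading.Lock objects, and the retry limit and function names are invented for the example.

```python
import threading
import time


def try_lock_chain(locks):
    """Try to take every lock without blocking.

    Returns True with all locks held, or False with none held.
    """
    taken = []
    for lock in locks:
        if lock.acquire(blocking=False):
            taken.append(lock)
        else:
            # Contention from another walker: drop everything and report failure.
            for held in reversed(taken):
                held.release()
            return False
    return True


def walk_with_retry(locks, visit, max_attempts=100):
    """Call visit() with the whole chain locked, retrying on contention.

    Returns the number of attempts used.
    """
    for attempt in range(1, max_attempts + 1):
        if try_lock_chain(locks):
            try:
                visit()
            finally:
                for lock in reversed(locks):
                    lock.release()
            return attempt
        time.sleep(0)   # yield to other threads before retrying
    raise RuntimeError("chain still contended after max_attempts")
```

Because a failed attempt never holds a lock while waiting, two walkers meeting in opposite order cannot deadlock. The number of attempts depends on what the other walkers do, which is the nondeterminism mentioned above.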
--
[email protected]
>there was one place missing - does the patch below fix this type of
>deadlock?
A different deadlock this time, same two actors but apparently a
different pair of locks.
Will send the full console log shortly.
--Mark
===============================================
BUG: circular semaphore deadlock detected!
-----------------------------------------------
ksoftirqd/1/6 is deadlocking current task ksoftirqd/0/3
1) ksoftirqd/0/3 is trying to acquire this lock:
[dfb5c8a4] {r:0,a:-1,&n->lock}
.. held by: ksoftirqd/1/ 6 [dff886f0, 0]
... acquired at: arp_solicit+0x167/0x230
... trying at: neigh_update+0x2a/0x390
2) ksoftirqd/1/6 is blocked on this lock:
[c03c8900] {r:1,a:-1,ptype_lock}
.. held by: ksoftirqd/0/ 3 [dffe8020, 0]
... acquired at: net_rx_action+0x8e/0x200
------------------------------
| showing all locks held by: | (ksoftirqd/0/3 [dffe8020, 0]):
------------------------------
#001: [d9044c30] {r:0,a:-1,&tp->rx_lock}
... acquired at: rtl8139_poll+0x48/0x180 [8139too]
------------------------------
| showing all locks held by: | (ksoftirqd/1/6 [dff886f0, 0]):
------------------------------
#001: [dfb5c8a4] {r:0,a:-1,&n->lock}
... acquired at: arp_solicit+0x167/0x230
* john cooper <[email protected]> wrote:
> > plus there's the 'priority inheritance dependency-chain closure' bug
> > noticed by John Cooper - that should only affect the latency of RT
> > tasks though.
>
> This is a fairly gnarly problem to address. The obvious solution is
> to hold spinlocks in the mutexes as the dependency tree is atomically
> traversed. However this will deadlock under MP due to the
> unpredictable order of mutexes traversed. If the dependency chain is
> not traversed (and semantics applied) atomically, races exist which
> cause promotion decisions to be made on [now] stale data.
is the order of locks in the dependency chain really unpredictable? If
two chain walkers get two locks in opposite order, doesnt that mean that
the lock ordering (as attempted by the blocked tasks) is deadlock-prone
already? I.e. this scenario should not happen.
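The argument can be checked mechanically: if any two locks are ever taken in opposite nesting orders, the locking scheme is already deadlock-prone before any chain walker appears. A minimal sketch, assuming each task's acquisitions are recorded as a list of nested lock names (a hypothetical input format):

```python
def find_order_inversions(acquisition_sequences):
    """Report lock pairs that are taken in opposite orders.

    acquisition_sequences: task name -> list of lock names, in the order the
    task acquired them while still holding the earlier ones.
    Returns a sorted list of (lock_a, lock_b) pairs with lock_a < lock_b.
    """
    taken_before = set()   # (first, second): first was held when second was taken
    for sequence in acquisition_sequences.values():
        for i, first in enumerate(sequence):
            for second in sequence[i + 1:]:
                taken_before.add((first, second))
    return sorted((a, b) for (a, b) in taken_before
                  if a < b and (b, a) in taken_before)
```

An empty result means the recorded orders are consistent, so chain walkers following ownership links would meet the locks in one consistent order as well.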
Ingo
* [email protected] <[email protected]> wrote:
> 1) ksoftirqd/0/3 is trying to acquire this lock:
> [dfb5c8a4] {r:0,a:-1,&n->lock}
> .. held by: ksoftirqd/1/ 6 [dff886f0, 0]
> ... acquired at: arp_solicit+0x167/0x230
> ... trying at: neigh_update+0x2a/0x390
>
> 2) ksoftirqd/1/6 is blocked on this lock:
> [c03c8900] {r:1,a:-1,ptype_lock}
> .. held by: ksoftirqd/0/ 3 [dffe8020, 0]
> ... acquired at: net_rx_action+0x8e/0x200
this is a weird one. Note how ptype_lock is not shown to be owned by
ksoftirqd/0/3:
> ------------------------------
> | showing all locks held by: | (ksoftirqd/0/3 [dffe8020, 0]):
> ------------------------------
>
> #001: [d9044c30] {r:0,a:-1,&tp->rx_lock}
> ... acquired at: rtl8139_poll+0x48/0x180 [8139too]
neither does ptype_lock show up in the other logs you sent.
Ingo
Ingo Molnar wrote:
>i have released the -V0.7.1 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this release is mainly a merge of -V0.6.9 to 2.6.10-rc2-mm2.
>
>I havent done a proper changelog for a couple of days so here is a list
>of bigger changes since -V0.4:
>
> - implemented a first version of the priority inheritance handling and
> priority inversion avoidance logic. This feature, after some initial
> stability problems, solved the jackd and rtc_wakeup latencies that
> were introduced by the ultra-finegrained locking in the -V series.
>
> (the -T/U series had a coarser locking scheme that triggered much lower
> levels of priority inversion scenarios. The locking in the -V series
> was clearly the tipping point.)
>
> The new PI code covers all synchronization objects in Linux (on
> PREEMPT_REALTIME): spinlocks, rwlocks, semaphores and rwsems.
> Feedback on the design of this code would be welcome, and patches as
> well, if you have a better scheme. The code is pretty modular so feel
> free to experiment with alternative schemes.
>
> - completely reworked the debugging framework. All lock types
> (spinlocks, rwlocks, semaphores and rwsems) are now tracked, both
> their symbolic name and their place of acquire are traced and printed
> out upon detection of a deadlock. More and better information is
> printed upon a deadlock. Got rid of the 'semaphore owners array' in
> debugging mode, this reduces the footprint of semaphores quite
> significantly and speeds up deadlock detection.
>
> - got rid of the separate 'counted semaphores' implementation, it was
> too intrusive. Made the core 'generic semaphores' implementation
> compatible with vanilla Linux counted semaphore semantics. This also
> enabled the unrolling of the completion-handling cleanups which,
> while being very nice, were getting intrusive as well.
>
> - countless build and driver related reports/fixes from lots of people
>
> - more latency breaks in the remaining critical sections. A
> particularly important one was the irqs-off latency bugfix from
> Thomas Gleixner.
>
> - sped up the i8259 PIC and the PIT timer hardirq handling routines -
> these are now in the path of the longest latency.
>
> - cleaned up IRQ and signal preemption - there were missed
> check-rescheds and possibilities for IRQ recursion.
>
> - made ALSA's ioctl()s not use the BKL - this fixes more jackd
> latencies.
>
>to create a -V0.7.1 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm2/2.6.10-rc1-mm2.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm2-V0.7.1
>
> Ingo
Hey,
I get a lock up with my wireless pcmcia cisco card. When i try to run the
dhcpcd command or iwconfig it just hangs. Also, when i insert my card
at boot time it hangs when running the net init scripts. I had this with
versions V0.7.8 and V0.7.10; i haven't tested any other RT patches. I
didn't have this problem with VP-T3. Also i can now mount and use my
reiser4 partition.
> Ingo Molnar wrote:
>> - implemented a first version of the priority inheritance handling and
>> priority inversion avoidance logic. This feature, after some initial
>> stability problems, solved the jackd and rtc_wakeup latencies that
>> were introduced by the ultra-finegrained locking in the -V series.
How does this play with Inaky's priority inheritance patch? Could they be
combined somehow?
Chris
Ingo Molnar wrote:
> * john cooper <[email protected]> wrote:
>
>
>>>plus there's the 'priority inheritance dependency-chain closure' bug
>>>noticed by John Cooper - that should only affect the latency of RT
>>>tasks though.
>>
>>This is a fairly gnarly problem to address. The obvious solution is
>>to hold spinlocks in the mutexes as the dependency tree is atomically
>>traversed. However this will deadlock under MP due to the
>>unpredictable order of mutexes traversed. If the dependency chain is
>>not traversed (and semantics applied) atomically, races exist which
>>cause promotion decisions to be made on [now] stale data.
>
>
> is the order of locks in the dependency chain really unpredictable? If
> two chain walkers get two locks in opposite order, doesnt that mean that
> the lock ordering (as attempted by the blocked tasks) is deadlock-prone
> already? I.e. this scenario should not happen.
There does appear to be hope here. If the per-task mutex ownership
list is maintained in strict order of acquisition sequence and
reader-mutex acquisition sequence is policed this would seem to remove
the possibility of chain traversal deadlock.
As an implementation note, single-owner hard spinlocks seem
excessive for the chain walk. An approach allowing maximum
concurrency during traversal would be a reader-reference acquired
per node during the walk which would need to upgrade to an exclusive
writer-reference to effect promotion (waiter list priority reorder),
and then downgrade to a reader-reference to continue the traversal.
-john
--
[email protected]
I'm getting this on my athlon64 test machine with V0.7.11:
Freeing unused kernel memory: 208k freed
SELinux: Disabled at runtime.
SELinux: Unregistering netfilter hooks
IRQ#8 thread RT prio: 44.
ACPI: Power Button (FF) [PWRF]
ACPI: Sleep Button (CM) [SLPB]
ibm_acpi: ec object not found
BUG: Unable to handle kernel paging request at virtual address f888d940
printing eip:
c0229153
*pde = 37f40067
Oops: 0002 [#1]
PREEMPT
Modules linked in: video(U) container(U) button(U) battery(U) ac(U)
ext3(U) jbd(U) raid5(U) xor(U) sata_via(U) sata_promise(U) libata(U)
sd_mod(U) scsi_mod(U)
CPU: 0
EIP: 0060:[<c0229153>] Not tainted VLI
EFLAGS: 00010286 (2.6.9-1.520.1rV0.7.11.ll.rhfc2.ccrma)
EIP is at acpi_bus_register_driver+0x40/0x63
eax: f888d940 ebx: f88960c0 ecx: c013ef0b edx: 00000002
esi: 00000000 edi: 00000000 ebp: f68a3f94 esp: f68a3f8c
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process insmod (pid: 1012, threadinfo=f68a2000 task=f688a000)
Stack: c03ba660 f8896680 f68a3fa0 f887e039 f88960c0 f68a3fbc c0144ec9
c049a5a4
00000001 f8896680 0807a018 00000000 f68a2000 c01072a1 0807a018
00004ad4
0807a008 00000000 00000000 bfffced8 00000080 0000007b 0000007b
00000080
Call Trace:
[<c010811f>] show_stack+0x8f/0xb0 (28)
[<c01082da>] show_registers+0x16a/0x1d0 (56)
[<c01084f1>] die+0xf1/0x190 (64)
[<c011de39>] do_page_fault+0x369/0x660 (216)
[<c0107d21>] error_code+0x2d/0x38 (76)
[<f887e039>] acpi_video_init+0x39/0x59 [video] (12)
[<c0144ec9>] sys_init_module+0x169/0x220 (28)
[<c01072a1>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c010843e>] .... die+0x3e/0x190
.....[<c011de39>] .. ( <= do_page_fault+0x369/0x660)
.. [<c0140c9d>] .... print_traces+0x1d/0x60
.....[<c010811f>] .. ( <= show_stack+0x8f/0xb0)
Code: b8 ed ff ff ff eb 42 85 db b8 ea ff ff ff 74 39 68 60 a6 3b c0 e8
de 5d f1 ff c7 03 1c a7 3b c0 a1 20 a7 3b c0 89 1d 20 a7 3b c0 <89> 18
89 43 04 c7 04 24 60 a6 3b c0 e8 7c 5e f1 ff 58 89 5d 08
insmod:1012 BUG: lock held at task exit time!
[c03ba660] {drivers/acpi/scan.c:27}
.. held by: insmod/ 1012 [f688a000, 118]
... acquired at: acpi_bus_register_driver+0x2f/0x63
ACPI: PCI interrupt 0000:00:10.4[C] -> GSI 21 (level, low) -> IRQ 21
ehci_hcd 0000:00:10.4: EHCI Host Controller
ehci_hcd 0000:00:10.4: irq 21, pci mem 0xcfffed00
ehci_hcd 0000:00:10.4: new USB bus registered, assigned bus number 1
ehci_hcd 0000:00:10.4: USB 2.0 enabled, EHCI 1.00, driver 2004-May-10
The machine can boot single user. Kudzu hangs (but I can reboot with
ctrl-alt-del), if I disable kudzu it does not make it to X. Apparently
hangs, cannot reboot or do anything. With "acpi=off" it hangs after
(apparently) loading sata_via, responds to ctrl-alt-del.
-- Fernando
* Fernando Pablo Lopez-Lezcano <[email protected]> wrote:
> I'm getting this on my athlon64 test machine with V0.7.11:
> EIP is at acpi_bus_register_driver+0x40/0x63
does vanilla 2.6.10-rc1-mm2 boot fine?
Ingo
On Thu, Nov 04, 2004 at 08:44:16PM +0100, Ingo Molnar wrote:
>
> * john cooper <[email protected]> wrote:
> >
> > This is a fairly gnarly problem to address. The obvious solution is
> > to hold spinlocks in the mutexes as the dependency tree is atomically
> > traversed. However this will deadlock under MP due to the
> > unpredictable order of mutexes traversed. If the dependency chain is
> > not traversed (and semantics applied) atomically, races exist which
> > cause promotion decisions to be made on [now] stale data.
>
> is the order of locks in the dependency chain really unpredictable? If
> two chain walkers get two locks in opposite order, doesnt that mean that
> the lock ordering (as attempted by the blocked tasks) is deadlock-prone
> already? I.e. this scenario should not happen.
It *shouldn't*, but bugs do happen, and it'd be nice if a mutex
deadlock didn't get promoted into a less debuggable spinlock
deadlock. Plus, if there's any intention of ever exporting this
priority inheritance mechanism to userspace locks, we don't want to
promote a userspace deadlock into a kernel one.
Given how rarely contention should occur, I don't think that a single
lock would be a bottleneck except for obscenely large SMP machines.
-Scott
On Fri, Nov 05, 2004 at 04:42:38PM -0500, Scott Wood wrote:
> On Thu, Nov 04, 2004 at 08:44:16PM +0100, Ingo Molnar wrote:
> > is the order of locks in the dependency chain really unpredictable? If
> > two chain walkers get two locks in opposite order, doesnt that mean that
> > the lock ordering (as attempted by the blocked tasks) is deadlock-prone
> > already? I.e. this scenario should not happen.
>
> It *shouldn't*, but bugs do happen, and it'd be nice if a mutex
> deadlock didn't get promoted into a less debuggable spinlock
> deadlock. Plus, if there's any intention of ever exporting this
> priority inheritance mechanism to userspace locks, we don't want to
> promote a userspace deadlock into a kernel one.
>
> Given how rarely contention should occur, I don't think that a single
> lock would be a bottleneck except for obscenely large SMP machines.
Places that are surrounded by RCU locks are candidates for hitting that
kind of contention. There are numerous places in the kernel that use
them, but nothing can be said until there are stats on how these things
contend against each other (dcache_lock, I believe, and others).
I think of an RT kernel with N threads as an SMP machine with the same
number N of processors. If you have N threads pounding on the same
critical sections, it's effectively N physical processors hitting them
as well.
Correct me if I'm wrong, vague, etc... but that's how I understand
the problem and that's how I think it should be addressed. Ideally,
the kernel should be so efficient that these collisions never
happen and the use of priority propagation stays very shallow,
to prevent scheduling irregularities resulting from priority-boosting
lock chains. Only gathering statistics on how this system
behaves will show the technical direction that this project should
take.
BTW, we're working internally right now on a single super-mutex
that can possibly be used for proper priority propagation
for all blocking-type locks. Hopefully, with testing we'll see how
well it and the rest of the kernel perform with it. Correctness
is the highest priority, but overall behavior of the system is very
important and should be next in ranking IMO.
bill
* Scott Wood <[email protected]> wrote:
> > is the order of locks in the dependency chain really unpredictable? If
> > two chain walkers get two locks in opposite order, doesnt that mean that
> > the lock ordering (as attempted by the blocked tasks) is deadlock-prone
> > already? I.e. this scenario should not happen.
>
> It *shouldn't*, but bugs do happen, and it'd be nice if a mutex
> deadlock didn't get promoted into a less debuggable spinlock deadlock.
> [...]
well, deadlock detection happens at lock-acquire time, so the deadlock
will be detected _first_, any PI spinlock-locking will happen on already
blocked (== no deadlock detected) tasks. This would also serve as a nice
secondary check for the deadlock detector.
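As an illustration of what detection at lock-acquire time can look like, here is a minimal sketch in C. It is not the detector from the patch: dl_task, dl_lock, DL_MAX_DEPTH and dl_would_deadlock are hypothetical names, and the caller is assumed to hold whatever lock protects the chain while it is walked.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

struct dl_lock;

/* Hypothetical task: only the lock it is currently blocked on matters. */
struct dl_task {
    struct dl_lock *blocked_on;  /* lock this task waits for, or NULL */
};

/* Hypothetical lock: only its current owner matters. */
struct dl_lock {
    struct dl_task *owner;       /* current holder, or NULL if free */
};

/* Arbitrary bound so a corrupted chain cannot make the walk run forever. */
#define DL_MAX_DEPTH 64

/*
 * Called before 'self' blocks on 'lock'. Follows owner -> blocked_on ->
 * owner links; if the walk reaches 'self', blocking would close a cycle,
 * i.e. deadlock. Returns true in that case.
 */
static bool dl_would_deadlock(const struct dl_task *self,
                              const struct dl_lock *lock)
{
    int depth = 0;

    assert(self != NULL);

    while (lock != NULL && lock->owner != NULL && depth++ < DL_MAX_DEPTH) {
        if (lock->owner == self)
            return true;                 /* chain leads back to us */
        lock = lock->owner->blocked_on;  /* next link in the chain */
    }
    return false;
}
```

Because the check runs before the task blocks, any priority-inheritance walk done afterwards only ever sees chains that passed it, which is the ordering described above.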
> [...] Plus, if there's any intention of ever exporting this priority
> inheritance mechanism to userspace locks, we don't want to promote a
> userspace deadlock into a kernel one.
agreed.
> Given how rarely contention should occur, I don't think that a single
> lock would be a bottleneck except for obscenely large SMP machines.
well, blocking on a mutex happens quite frequently. But i dont have a
problem with the big lock other than the usual "if we can do better then
we should do better" attitude :-)
Ingo
i have released the -V0.7.18 Real-Time Preemption patch, which can be
downloaded from:
http://redhat.com/~mingo/realtime-preempt/
this release includes fixes and cleanups.
Changes between -V0.7.12 and -V0.7.18:
- merged to 2.6.10-rc1-mm3
- fixed the e1000 xmit warnings reported by Amit Shah. Same fix for tg3
too.
- added irq-latency fix from Thomas Gleixner: re-enable interrupts
before adding timings to the entropy pool.
- fixed excessive ksoftirqd overhead during outgoing TCP traffic.
ksoftirqd kept getting re-woken while it had no way to progress.
- added upstream fix from Andi Kleen for the vmalloc bug causing module
load problems/crashes. Added ipc/shm fix too.
Changes between -V0.7.1 and -V0.7.12:
- big source level cleanups: completely rearranged the mutex type
definitions and source files, to make it reflect the code. Made
all locking objects based on a new, central lock type: struct
rt_mutex. This type is never exposed externally, it is internal to
the RT code. Unified all the RT locking code in kernel/rt.c, this
also simplified and sped things up. Undid collateral damage to the
generic rwsem code - it is now untouched and independent of the RT
code.
- rearranged the way spinlocks interact with the RT code and cleaned
up the RT spinlock definitions. Found and fixed a bug in the process:
rwlocks were dropping the BKL upon contention.
- small x86 speedup: call __schedule not preempt_schedule_irq from
work_resched.
- ported PREEMPT_RT to x64. This resulted in the generalization of some
of the x86 changes.
- hopefully fixed fbcon kernel logging
- hacked reiser4 to make it work on PREEMPT_REALTIME.
- dropped the swap-layout-improvements patch. While it was working fine
it's not necessary for latencies anymore under the PREEMPT_REALTIME
approach, and the swap-patch was getting intrusive.
- fixed preemption-bug in drain_cpu_caches() on SMP [bug introduced by
PREEMPT_REALTIME]
- new attempt at getting rid of the networking related deadlocks
- selinux deadlock fix and RCU-code conversion to RT semantics
to create a -V0.7.18 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.18
Ingo
Ingo Molnar wrote:
>i have released the -V0.7.18 Real-Time Preemption patch, which can be
>downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this release includes fixes and cleanups.
Got a bug:
Assertion failure in __journal_unfile_buffer() at
fs/jbd/transaction.c:1447: "jbd_is_locked_bh_state(bh)"
BUG at fs/jbd/transaction.c:1447!
------------[ cut here ]------------
kernel BUG at fs/jbd/transaction.c:1447!
invalid operand: 0000 [#1]
PREEMPT
Modules linked in: reiser4 radeon airo_cs airo ohci_hcd ehci_hcd 8139cp
mii ohci1394 ieee1394 snd_intel8x0 snd_ac97_codec usbhid uhci_hcd
intel_agp agpgart snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd usbcore
vfat fat
CPU: 0
EIP: 0060:[<c01d69d2>] Not tainted VLI
EFLAGS: 00010246 (2.6.10-rc1-mm3-RT-V0.7.18)
EIP is at __journal_unfile_buffer+0x102/0x230
eax: 00000022 ebx: c0ef211c ecx: c0378081 edx: cf951d7c
esi: cb2185c4 edi: cb2185c4 ebp: 00000000 esp: cf951d78
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process kjournald (pid: 6286, threadinfo=cf950000 task=cf941320)
Stack: c0378081 c037f479 000005a7 c0386db4 c036b2a3 c037f479 000005a7 c037f5e4
       c0ef211c cb2185c4 00000000 cb299580 c01d7aec c0ef211c cf651250 00000001
       cb2995b8 cf651014 cf950000 cf950000 c0137f43 00000000 00000000 00000000
Call Trace:
[<c01d7aec>] journal_commit_transaction+0x2dc/0x12a0 (52)
[<c0137f43>] __up_mutex+0x293/0x470 (32)
[<c011ccf0>] recalc_task_prio+0xd0/0x210 (28)
[<c011ce8b>] activate_task+0x5b/0x90 (40)
[<c013947a>] trace_start_sched_wakeup+0xca/0xf0 (16)
[<c01daf39>] kjournald+0xe9/0x260 (284)
[<c0136720>] autoremove_wake_function+0x0/0x40 (32)
[<c0136720>] autoremove_wake_function+0x0/0x40 (32)
[<c011d3de>] finish_task_switch+0x3e/0xb0 (20)
[<c012a4c6>] __mod_timer+0x36/0x1c0 (48)
[<c01dae30>] commit_timeout+0x0/0x10 (24)
[<c01dae50>] kjournald+0x0/0x260 (12)
[<c01042b5>] kernel_thread_helper+0x5/0x10 (16)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c0106971>] .... die+0x31/0x180
.....[<00000000>] .. ( <= 0x0)
.. [<c013960c>] .... print_traces+0xc/0x40
.....[<00000000>] .. ( <= 0x0)
Code: 05 00 00 68 79 f4 37 c0 68 a3 b2 36 c0 68 b4 6d 38 c0 e8 a2 b4 f4
ff 68 a7 05 00 00 68 79 f4 37 c0 68 81 80 37 c0 e8 8e b4 f4 ff <0f> 0b
a7 05 79 f4 37 c0 83 c4 20 e9 07 ff ff ff 8d b4 26 00 00
kjournald:6286 BUG: lock held at task exit time!
[cf651250] {&journal->j_list_lock}
.. held by: kjournald/ 6286 [cf941320, 116]
... acquired at: journal_commit_transaction+0x2a9/0x12a0
BUG at kernel/timer.c:166!
------------[ cut here ]------------
kernel BUG at kernel/timer.c:166!
invalid operand: 0000 [#2]
PREEMPT
Modules linked in: reiser4 radeon airo_cs airo ohci_hcd ehci_hcd 8139cp
mii ohci1394 ieee1394 snd_intel8x0 snd_ac97_codec usbhid uhci_hcd
intel_agp agpgart snd_seq_oss snd_seq_midi_event snd_seq snd_seq_device
snd_pcm_oss snd_pcm snd_timer snd_page_alloc snd_mixer_oss snd usbcore
vfat fat
CPU: 0
EIP: 0060:[<c012a5f2>] Not tainted VLI
EFLAGS: 00010296 (2.6.10-rc1-mm3-RT-V0.7.18)
EIP is at __mod_timer+0x162/0x1c0
eax: 0000001b ebx: 00000000 ecx: c0378081 edx: c13ddcc8
esi: cf951f90 edi: cb299480 ebp: cf651014 esp: c13ddcc4
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process pdflush (pid: 165, threadinfo=c13dc000 task=c13db970)
Stack: c0378081 c037a8eb 000000a6 00000000 c037f48e cb299480 cf651000 cb299480
       cf651014 c01d4c37 cf951f90 0008ec57 00000000 cf651000 c01d4d60 cf651000
       cb299480 cf651014 c13e7780 c13e77c4 c0138573 c13e77c4 00000001 c0149843
Call Trace:
[<c01d4c37>] get_transaction+0x67/0xd0 (40)
[<c01d4d60>] start_this_handle+0xc0/0x3e0 (20)
[<c0138573>] up_mutex+0x23/0xa0 (24)
[<c0149843>] cache_grow+0x133/0x250 (12)
[<c0137f43>] __up_mutex+0x293/0x470 (36)
[<c0149e5b>] kmem_cache_alloc+0x4b/0xd0 (16)
[<c0149e5b>] kmem_cache_alloc+0x4b/0xd0 (12)
[<c0138573>] up_mutex+0x23/0xa0 (12)
[<c0149e5b>] kmem_cache_alloc+0x4b/0xd0 (12)
[<c0149e5b>] kmem_cache_alloc+0x4b/0xd0 (12)
[<c01d5154>] journal_start+0x94/0xc0 (32)
[<c01c7dd5>] ext3_ordered_writepage+0x65/0x190 (24)
[<c0183fc1>] mpage_writepages+0x261/0x410 (28)
[<c01c7d70>] ext3_ordered_writepage+0x0/0x190 (24)
[<c0137f43>] __up_mutex+0x293/0x470 (52)
[<c0182063>] __sync_single_inode+0x53/0x280 (16)
[<c01466f9>] do_writepages+0x39/0x40 (28)
[<c018206f>] __sync_single_inode+0x5f/0x280 (16)
[<c01822be>] __writeback_single_inode+0x2e/0x140 (36)
[<c036483d>] __down_mutex+0xed/0x390 (24)
[<c036483d>] __down_mutex+0xed/0x390 (12)
[<c0364a04>] __down_mutex+0x2b4/0x390 (16)
[<c0182569>] generic_sync_sb_inodes+0x199/0x300 (40)
[<c01827cc>] writeback_inodes+0xcc/0xe0 (48)
[<c0146f80>] pdflush+0x0/0x30 (24)
[<c01464d3>] wb_kupdate+0x93/0x110 (4)
[<c0146e78>] __pdflush+0xb8/0x1c0 (56)
[<c0146e80>] __pdflush+0xc0/0x1c0 (36)
[<c0146f9e>] pdflush+0x1e/0x30 (20)
[<c0146440>] wb_kupdate+0x0/0x110 (20)
[<c03634f1>] schedule+0x31/0x100 (20)
[<c0136243>] kthread+0x83/0xc0 (8)
[<c01361c0>] kthread+0x0/0xc0 (20)
[<c01042b5>] kernel_thread_helper+0x5/0x10 (16)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c0106971>] .... die+0x31/0x180
.....[<00000000>] .. ( <= 0x0)
.. [<c013960c>] .... print_traces+0xc/0x40
.....[<00000000>] .. ( <= 0x0)
Code: e4 00 00 39 5e 4c 58 0f 84 6a ff ff ff 68 e0 32 3c c0 e9 27 ff ff
ff 68 a6 00 00 00 68 eb a8 37 c0 68 81 80 37 c0 e8 6e 78 ff ff <0f> 0b
a6 00 eb a8 37 c0 83 c4 0c e9 ab fe ff ff 68 a5 00 00 00
pdflush:165 BUG: lock held at task exit time!
[cf725840] {&s->s_umount}
.. held by: pdflush/ 165 [c13db970, 115]
... acquired at: writeback_inodes+0x4b/0xe0
pdflush:165 BUG: lock held at task exit time!
[cf651014] {&journal->j_state_lock}
.. held by: pdflush/ 165 [c13db970, 115]
... acquired at: start_this_handle+0x61/0x3e0
Ingo Molnar wrote:
>
> i have released the -V0.7.18 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this release includes fixes and cleanups.
>
Sorry. Build error for a non-debug configuration:
drivers/char/drm/drm_stub.c: In function `drm_fill_in_dev':
drivers/char/drm/drm_stub.c:60: error: parse error before '{' token
make[3]: *** [drivers/char/drm/drm_stub.o] Error 1
make[2]: *** [drivers/char/drm] Error 2
make[1]: *** [drivers/char] Error 2
make: *** [drivers] Error 2
If CONFIG_DEBUG_PREEMPT and/or CONFIG_RT_DEADLOCK_DETECT are set, the
RT-V0.7.18 build succeeds.
No big deal. However, my personal benchmarks (jackd-R + 8*fluidsynth) are
based on production-like kernel configurations (no kernel debug
options set), so I guess they will have to wait :)
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
On Sat, 2004-11-06 at 17:55 +0000, Rui Nuno Capela wrote:
>
> Sorry. Build error for a non-debug configuration:
>
> drivers/char/drm/drm_stub.c: In function `drm_fill_in_dev':
> drivers/char/drm/drm_stub.c:60: error: parse error before '{' token
> make[3]: *** [drivers/char/drm/drm_stub.o] Error 1
> make[2]: *** [drivers/char/drm] Error 2
> make[1]: *** [drivers/char] Error 2
> make: *** [drivers] Error 2
>
make that line look like:
spin_lock_init( &dev->count_lock );
> If CONFIG_DEBUG_PREEMPT and/or CONFIG_RT_DEADLOCK_DETECT are set, the
> RT-V0.7.18 build succeeds.
>
> No big deal. However, my personal benchmarks (jackd-R + 8*fluidsynth) are
> based on production-like kernel configurations (no kernel debug
> options set), so I guess they will have to wait :)
>
> Cheers.
--
Peter Zijlstra <[email protected]>
On Saturday 06 November 2004 18:17, Gunther Persoons wrote:
> Got a bug:
>
> Assertion failure in __journal_unfile_buffer() at
> fs/jbd/transaction.c:1447: "jbd_is_locked_bh_state(bh)"
> BUG at fs/jbd/transaction.c:1447!
<AOL>
Me too.
</AOL>
--
I route therefore you are
Ingo Molnar wrote:
>
> i have released the -V0.7.18 Real-Time Preemption patch, which can be
> downloaded from:
>
> http://redhat.com/~mingo/realtime-preempt/
>
I've been having trouble modprobe'ing the ALSA sound drivers ever since
RT-V0.7.11, up to the latest RT-V0.7.18, and I suspect this applies to -mm3 as
well.
The most evident trouble happens while unloading the modules, either on
shutdown or when doing a simple modprobe -r or rmmod. The system hangs
completely. On my P4/SMT machine it doesn't spit anything out to the
serial console. Nada. OTOH, my P4/UP laptop is a little more verbose,
as the following syslog excerpt shows:
Nov 6 20:29:52 lambda alsa: Shutting down ALSA sound driver (version 1.0.6):
Nov 6 20:29:55 lambda kernel: usbcore: deregistering driver snd-usb-usx2y
Nov 6 20:29:55 lambda alsa: /etc/rc6.d/K70alsa: line 287: 15938
Segmentation fault /sbin/rmmod `echo $line | cut -d ' ' -f 1`
>/dev/null 2>&1
Nov 6 20:29:55 lambda kernel: BUG: Unable to handle kernel NULL pointer
dereference at virtual address 00000000
Nov 6 20:29:55 lambda kernel: printing eip:
Nov 6 20:29:55 lambda kernel: c012ae47
Nov 6 20:29:55 lambda kernel: *pde = 00000000
Nov 6 20:29:55 lambda kernel: Oops: 0000 [#1]
Nov 6 20:29:55 lambda kernel: PREEMPT
Nov 6 20:29:55 lambda kernel: Modules linked in: realtime commoncap
snd_usb_usx2y snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd_ali5451
snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore prism2_cs
p80211 pcmcia yenta_socket pcmcia_core natsemi crc32 loop subfs evdev
pl2303 usbserial ohci_hcd usbcore
Nov 6 20:29:55 lambda kernel: CPU: 0
Nov 6 20:29:55 lambda kernel: EIP: 0060:[__up_write+109/721] Not
tainted VLI
Nov 6 20:29:55 lambda kernel: EIP: 0060:[<c012ae47>] Not tainted VLI
Nov 6 20:29:55 lambda kernel: EFLAGS: 00010083 (2.6.10-rc1-mm2-RT-V0.7.7)
Nov 6 20:29:55 lambda kernel: EIP is at __up_write+0x6d/0x2d1
Nov 6 20:29:55 lambda kernel: eax: 00000000 ebx: d6cb8000 ecx:
00000064 edx: 00000064
Nov 6 20:29:55 lambda alsa: /etc/rc6.d/K70alsa: line 287: 15965
Segmentation fault /sbin/rmmod `echo $line | cut -d ' ' -f 1`
>/dev/null 2>&1
Nov 6 20:29:55 lambda kernel: esi: e011b5cc edi: ddd7ce30 ebp:
e0061e78 esp: d6cb9eb8
Nov 6 20:29:55 lambda kernel: ds: 007b es: 007b ss: 0068 preempt:
00000004
Nov 6 20:29:55 lambda kernel: Process rmmod (pid: 15938,
threadinfo=d6cb8000 task=ded203b0)
Nov 6 20:29:55 lambda kernel: Stack: c015258f df760280 00000008 c013baa4
ddc02028 00000246 d6cb8000 00000246
Nov 6 20:29:55 lambda kernel: df767400 00000000 d6cb8000 e011b5c8
e0061e68 e0061e78 c012b957 00000246
Nov 6 20:29:55 lambda kernel: ded203b0 d6cb8000 00000246 c02ff3c0
de3db4e8 e011b5e8 c030ac28 c01b1cef
Nov 6 20:29:55 lambda kernel: Call Trace:
Nov 6 20:29:55 lambda kernel: [invalidate_inode_buffers+26/240]
invalidate_inode_buffers+0x1a/0xf0 (4)
Nov 6 20:29:55 lambda kernel: [<c015258f>]
invalidate_inode_buffers+0x1a/0xf0 (4)
Nov 6 20:29:55 lambda kernel: [cache_flusharray+69/161]
cache_flusharray+0x45/0xa1 (12)
Nov 6 20:29:55 lambda kernel: [<c013baa4>] cache_flusharray+0x45/0xa1 (12)
Nov 6 20:29:55 lambda kernel: [up+80/173] up+0x50/0xad (44)
Nov 6 20:29:55 lambda kernel: [<c012b957>] up+0x50/0xad (44)
Nov 6 20:29:55 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:55 lambda kernel: [<c01b1cef>] kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:55 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 6 20:29:55 lambda kernel: [<c01b1cf1>] kobject_release+0x0/0x8 (8)
Nov 6 20:29:55 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 6 20:29:55 lambda kernel: [<c01b2597>] kref_put+0x51/0xc2 (12)
Nov 6 20:29:55 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:55 lambda kernel: [<c01f79d2>] bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:55 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 6 20:29:55 lambda kernel: [<c01f7d44>] driver_unregister+0xb/0x1a (8)
Nov 6 20:29:55 lambda kernel: [pg0+533397933/1069954048]
usb_deregister+0x31/0x3f [usbcore] (8)
Nov 6 20:29:55 lambda kernel: [<e004b1ad>] usb_deregister+0x31/0x3f
[usbcore] (8)
Nov 6 20:29:55 lambda kernel: [sys_delete_module+287/299]
sys_delete_module+0x11f/0x12b (20)
Nov 6 20:29:55 lambda kernel: [<c012d91f>] sys_delete_module+0x11f/0x12b
(20)
Nov 6 20:29:55 lambda kernel: [blk_queue_bounce+29/132]
blk_queue_bounce+0x1d/0x84 (20)
Nov 6 20:29:55 lambda kernel: [<c0140079>] blk_queue_bounce+0x1d/0x84 (20)
Nov 6 20:29:55 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(12)
Nov 6 20:29:55 lambda kernel: [<c0144f7a>] do_munmap+0x11a/0x176 (12)
Nov 6 20:29:55 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:55 lambda kernel: [<c014500e>] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:55 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:55 lambda kernel: [<c0103bc1>] sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:55 lambda kernel: Code: 75 0b 8b 46 04 85 c0 0f 84 71 01 00
00 b8 00 e0 ff ff 21 e0 83 40 14 01 8b 46 0c e8 d0 81 fe ff 8b 7e 0c 89 c2
8b 87 60 05 00 00 <8b> 08 0f 18 01 90 8d 9f 60 05 00 00 eb 10 8b 40 0c 39
d0 0f 4c
Nov 6 20:29:55 lambda kernel: <6>note: rmmod[15938] exited with
preempt_count 3
Nov 6 20:29:55 lambda kernel: BUG: scheduling while atomic:
rmmod/0x00000003/15938
Nov 6 20:29:55 lambda kernel: caller is do_exit+0x289/0x4b2
Nov 6 20:29:55 lambda kernel: [__schedule+1194/1525]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:55 lambda kernel: [<c02ac2fa>]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:55 lambda kernel: [exit_notify+1154/2290]
exit_notify+0x482/0x8f2 (24)
Nov 6 20:29:55 lambda kernel: [<c01187ba>] exit_notify+0x482/0x8f2 (24)
Nov 6 20:29:55 lambda kernel: [kmem_cache_free+72/197]
kmem_cache_free+0x48/0xc5 (24)
Nov 6 20:29:55 lambda kernel: [<c013bd6d>] kmem_cache_free+0x48/0xc5 (24)
Nov 6 20:29:55 lambda kernel: [do_exit+649/1202] do_exit+0x289/0x4b2 (32)
Nov 6 20:29:55 lambda kernel: [<c0118eb3>] do_exit+0x289/0x4b2 (32)
Nov 6 20:29:55 lambda kernel: [do_divide_error+0/330]
do_divide_error+0x0/0x14a (40)
Nov 6 20:29:55 lambda kernel: [<c0104d83>] do_divide_error+0x0/0x14a (40)
Nov 6 20:29:55 lambda kernel: [do_page_fault+920/1425]
do_page_fault+0x398/0x591 (64)
Nov 6 20:29:55 lambda kernel: [<c0111313>] do_page_fault+0x398/0x591 (64)
Nov 6 20:29:55 lambda kernel: [preempt_schedule+80/106]
preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:55 lambda kernel: [<c02ac5ab>] preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:55 lambda kernel: [queue_work+44/101] queue_work+0x2c/0x65 (20)
Nov 6 20:29:55 lambda kernel: [<c0125d66>] queue_work+0x2c/0x65 (20)
Nov 6 20:29:55 lambda kernel: [call_usermodehelper+274/316]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:55 lambda kernel: [<c0125caa>]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:55 lambda kernel: [__call_usermodehelper+0/72]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:55 lambda kernel: [<c0125b50>]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:55 lambda kernel: [do_page_fault+0/1425]
do_page_fault+0x0/0x591 (52)
Nov 6 20:29:55 lambda kernel: [<c0110f7b>] do_page_fault+0x0/0x591 (52)
Nov 6 20:29:55 lambda kernel: [error_code+45/56] error_code+0x2d/0x38 (8)
Nov 6 20:29:55 lambda kernel: [<c010461d>] error_code+0x2d/0x38 (8)
Nov 6 20:29:55 lambda kernel: [__up_write+109/721] __up_write+0x6d/0x2d1
(52)
Nov 6 20:29:55 lambda kernel: [<c012ae47>] __up_write+0x6d/0x2d1 (52)
Nov 6 20:29:55 lambda kernel: [invalidate_inode_buffers+26/240]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:55 lambda kernel: [<c015258f>]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:55 lambda kernel: [cache_flusharray+69/161]
cache_flusharray+0x45/0xa1 (12)
Nov 6 20:29:55 lambda kernel: [<c013baa4>] cache_flusharray+0x45/0xa1 (12)
Nov 6 20:29:55 lambda kernel: [up+80/173] up+0x50/0xad (44)
Nov 6 20:29:55 lambda kernel: [<c012b957>] up+0x50/0xad (44)
Nov 6 20:29:55 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:55 lambda kernel: [<c01b1cef>] kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:55 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 6 20:29:55 lambda kernel: [<c01b1cf1>] kobject_release+0x0/0x8 (8)
Nov 6 20:29:55 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 6 20:29:55 lambda kernel: [<c01b2597>] kref_put+0x51/0xc2 (12)
Nov 6 20:29:55 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:55 lambda kernel: [<c01f79d2>] bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:55 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 6 20:29:55 lambda kernel: [<c01f7d44>] driver_unregister+0xb/0x1a (8)
Nov 6 20:29:55 lambda kernel: [pg0+533397933/1069954048]
usb_deregister+0x31/0x3f [usbcore] (8)
Nov 6 20:29:55 lambda kernel: [<e004b1ad>] usb_deregister+0x31/0x3f
[usbcore] (8)
Nov 6 20:29:55 lambda kernel: [sys_delete_module+287/299]
sys_delete_module+0x11f/0x12b (20)
Nov 6 20:29:55 lambda kernel: [<c012d91f>] sys_delete_module+0x11f/0x12b
(20)
Nov 6 20:29:55 lambda kernel: [blk_queue_bounce+29/132]
blk_queue_bounce+0x1d/0x84 (20)
Nov 6 20:29:55 lambda kernel: [<c0140079>] blk_queue_bounce+0x1d/0x84 (20)
Nov 6 20:29:55 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(12)
Nov 6 20:29:55 lambda kernel: [<c0144f7a>] do_munmap+0x11a/0x176 (12)
Nov 6 20:29:55 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:55 lambda kernel: [<c014500e>] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:55 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:55 lambda kernel: [<c0103bc1>] sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:55 lambda kernel: ALI 5451 0000:00:06.0: Device was removed
without properly calling pci_disable_device(). This may need fixing.
Nov 6 20:29:55 lambda kernel: BUG: Unable to handle kernel NULL pointer
dereference at virtual address 00000000
Nov 6 20:29:55 lambda kernel: printing eip:
Nov 6 20:29:55 lambda kernel: c012ae47
Nov 6 20:29:55 lambda kernel: *pde = 00000000
Nov 6 20:29:55 lambda kernel: Oops: 0000 [#2]
Nov 6 20:29:55 lambda kernel: PREEMPT
Nov 6 20:29:55 lambda kernel: Modules linked in: realtime commoncap
snd_usb_usx2y snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd_ali5451
snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore prism2_cs
p80211 pcmcia yenta_socket pcmcia_core natsemi crc32 loop subfs evdev
pl2303 usbserial ohci_hcd usbcore
Nov 6 20:29:55 lambda kernel: CPU: 0
Nov 6 20:29:55 lambda kernel: EIP: 0060:[__up_write+109/721] Not
tainted VLI
Nov 6 20:29:55 lambda kernel: EIP: 0060:[<c012ae47>] Not tainted VLI
Nov 6 20:29:55 lambda kernel: EFLAGS: 00010083 (2.6.10-rc1-mm2-RT-V0.7.7)
Nov 6 20:29:55 lambda kernel: EIP is at __up_write+0x6d/0x2d1
Nov 6 20:29:55 lambda kernel: eax: 00000000 ebx: d6cb8000 ecx:
00000064 edx: 00000064
Nov 6 20:29:55 lambda kernel: esi: e00ef894 edi: ddd7ce30 ebp:
c0303198 esp: d6cb9ec4
Nov 6 20:29:55 lambda kernel: ds: 007b es: 007b ss: 0068 preempt:
00000004
Nov 6 20:29:55 lambda kernel: Process rmmod (pid: 15965,
threadinfo=d6cb8000 task=ded203b0)
Nov 6 20:29:55 lambda kernel: Stack: c015258f ded203b0 00000000 00000096
de536d84 00000246 d6cb8000 00000246
Nov 6 20:29:56 lambda kernel: df767400 00000000 d6cb8000 e00ef890
c0303188 c0303198 c012b957 00000246
Nov 6 20:29:56 lambda kernel: ded203b0 d6cb8000 00000246 c02ff3c0
deb50cc8 e00ef8b0 c030ac28 c01b1cef
Nov 6 20:29:56 lambda kernel: Call Trace:
Nov 6 20:29:56 lambda kernel: [invalidate_inode_buffers+26/240]
invalidate_inode_buffers+0x1a/0xf0 (4)
Nov 6 20:29:56 lambda kernel: [<c015258f>]
invalidate_inode_buffers+0x1a/0xf0 (4)
Nov 6 20:29:56 lambda kernel: [up+80/173] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [<c012b957>] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [<c01b1cef>] kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [<c01b1cf1>] kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [<c01b2597>] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [<c01f79d2>] bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [<c01f7d44>] driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [pci_unregister_driver+11/19]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [<c01b95e4>]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [sys_delete_module+287/299]
sys_delete_module+0x11f/0x12b (8)
Nov 6 20:29:56 lambda kernel: [<c012d91f>] sys_delete_module+0x11f/0x12b
(8)
Nov 6 20:29:56 lambda kernel: [unmap_vma_list+14/23]
unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [<c0144c7d>] unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(12)
Nov 6 20:29:56 lambda kernel: [<c0144f7a>] do_munmap+0x11a/0x176 (12)
Nov 6 20:29:56 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [<c014500e>] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: [<c0103bc1>] sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: Code: 75 0b 8b 46 04 85 c0 0f 84 71 01 00
00 b8 00 e0 ff ff 21 e0 83 40 14 01 8b 46 0c e8 d0 81 fe ff 8b 7e 0c 89 c2
8b 87 60 05 00 00 <8b> 08 0f 18 01 90 8d 9f 60 05 00 00 eb 10 8b 40 0c 39
d0 0f 4c
Nov 6 20:29:56 lambda kernel: <6>note: rmmod[15965] exited with
preempt_count 3
Nov 6 20:29:56 lambda kernel: BUG: scheduling while atomic:
rmmod/0x10000003/15965
Nov 6 20:29:56 lambda kernel: caller is __cond_resched+0x36/0x41
Nov 6 20:29:56 lambda kernel: [__schedule+1194/1525]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:56 lambda kernel: [<c02ac2fa>]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:56 lambda kernel: [__cond_resched+54/65]
__cond_resched+0x36/0x41 (80)
Nov 6 20:29:56 lambda kernel: [<c0113715>] __cond_resched+0x36/0x41 (80)
Nov 6 20:29:56 lambda kernel: [cond_resched+28/37]
cond_resched+0x1c/0x25 (12)
Nov 6 20:29:56 lambda kernel: [<c02acedb>] cond_resched+0x1c/0x25 (12)
Nov 6 20:29:56 lambda kernel: [unmap_vmas+374/385]
unmap_vmas+0x176/0x181 (8)
Nov 6 20:29:56 lambda kernel: [<c0140cff>] unmap_vmas+0x176/0x181 (8)
Nov 6 20:29:56 lambda kernel: [exit_mmap+79/268] exit_mmap+0x4f/0x10c (64)
Nov 6 20:29:56 lambda kernel: [<c01452df>] exit_mmap+0x4f/0x10c (64)
Nov 6 20:29:56 lambda kernel: [mmput+78/295] mmput+0x4e/0x127 (40)
Nov 6 20:29:56 lambda kernel: [<c01141d7>] mmput+0x4e/0x127 (40)
Nov 6 20:29:56 lambda kernel: [do_exit+304/1202] do_exit+0x130/0x4b2 (24)
Nov 6 20:29:56 lambda kernel: [<c0118d5a>] do_exit+0x130/0x4b2 (24)
Nov 6 20:29:56 lambda kernel: [do_divide_error+0/330]
do_divide_error+0x0/0x14a (40)
Nov 6 20:29:56 lambda kernel: [<c0104d83>] do_divide_error+0x0/0x14a (40)
Nov 6 20:29:56 lambda kernel: [do_page_fault+920/1425]
do_page_fault+0x398/0x591 (64)
Nov 6 20:29:56 lambda kernel: [<c0111313>] do_page_fault+0x398/0x591 (64)
Nov 6 20:29:56 lambda kernel: [preempt_schedule+80/106]
preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:56 lambda kernel: [<c02ac5ab>] preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:56 lambda kernel: [queue_work+44/101] queue_work+0x2c/0x65 (20)
Nov 6 20:29:56 lambda kernel: [<c0125d66>] queue_work+0x2c/0x65 (20)
Nov 6 20:29:56 lambda kernel: [call_usermodehelper+274/316]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:56 lambda kernel: [<c0125caa>]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:56 lambda kernel: [__call_usermodehelper+0/72]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:56 lambda kernel: [<c0125b50>]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:56 lambda kernel: [do_page_fault+0/1425]
do_page_fault+0x0/0x591 (52)
Nov 6 20:29:56 lambda kernel: [<c0110f7b>] do_page_fault+0x0/0x591 (52)
Nov 6 20:29:56 lambda kernel: [error_code+45/56] error_code+0x2d/0x38 (8)
Nov 6 20:29:56 lambda kernel: [<c010461d>] error_code+0x2d/0x38 (8)
Nov 6 20:29:56 lambda kernel: [__up_write+109/721] __up_write+0x6d/0x2d1
(52)
Nov 6 20:29:56 lambda kernel: [<c012ae47>] __up_write+0x6d/0x2d1 (52)
Nov 6 20:29:56 lambda kernel: [invalidate_inode_buffers+26/240]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:56 lambda kernel: [<c015258f>]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:56 lambda kernel: [up+80/173] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [<c012b957>] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [<c01b1cef>] kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [<c01b1cf1>] kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [<c01b2597>] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [<c01f79d2>] bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [<c01f7d44>] driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [pci_unregister_driver+11/19]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [<c01b95e4>]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [sys_delete_module+287/299]
sys_delete_module+0x11f/0x12b (8)
Nov 6 20:29:56 lambda kernel: [<c012d91f>] sys_delete_module+0x11f/0x12b
(8)
Nov 6 20:29:56 lambda kernel: [unmap_vma_list+14/23]
unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [<c0144c7d>] unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(12)
Nov 6 20:29:56 lambda kernel: [<c0144f7a>] do_munmap+0x11a/0x176 (12)
Nov 6 20:29:56 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [<c014500e>] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: [<c0103bc1>] sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: BUG: scheduling while atomic:
rmmod/0x10000003/15965
Nov 6 20:29:56 lambda kernel: caller is __cond_resched+0x36/0x41
Nov 6 20:29:56 lambda kernel: [__schedule+1194/1525]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:56 lambda kernel: [<c02ac2fa>]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:56 lambda kernel: [do_exit+328/1202] do_exit+0x148/0x4b2 (36)
Nov 6 20:29:56 lambda kernel: [<c0118d72>] do_exit+0x148/0x4b2 (36)
Nov 6 20:29:56 lambda kernel: [down_write_mutex+333/518]
down_write_mutex+0x14d/0x206 (4)
Nov 6 20:29:56 lambda kernel: [<c02ad5f9>] down_write_mutex+0x14d/0x206 (4)
Nov 6 20:29:56 lambda kernel: [remove_vm_struct+93/129]
remove_vm_struct+0x5d/0x81 (8)
Nov 6 20:29:56 lambda kernel: [<c0143446>] remove_vm_struct+0x5d/0x81 (8)
Nov 6 20:29:56 lambda kernel: [__cond_resched+54/65]
__cond_resched+0x36/0x41 (32)
Nov 6 20:29:56 lambda kernel: [<c0113715>] __cond_resched+0x36/0x41 (32)
Nov 6 20:29:56 lambda kernel: [cond_resched+28/37]
cond_resched+0x1c/0x25 (12)
Nov 6 20:29:56 lambda kernel: [<c02acedb>] cond_resched+0x1c/0x25 (12)
Nov 6 20:29:56 lambda kernel: [put_files_struct+146/269]
put_files_struct+0x92/0x10d (8)
Nov 6 20:29:56 lambda kernel: [<c0117ec3>] put_files_struct+0x92/0x10d (8)
Nov 6 20:29:56 lambda kernel: [do_exit+352/1202] do_exit+0x160/0x4b2 (32)
Nov 6 20:29:56 lambda kernel: [<c0118d8a>] do_exit+0x160/0x4b2 (32)
Nov 6 20:29:56 lambda kernel: [do_divide_error+0/330]
do_divide_error+0x0/0x14a (40)
Nov 6 20:29:56 lambda kernel: [<c0104d83>] do_divide_error+0x0/0x14a (40)
Nov 6 20:29:56 lambda kernel: [do_page_fault+920/1425]
do_page_fault+0x398/0x591 (64)
Nov 6 20:29:56 lambda kernel: [<c0111313>] do_page_fault+0x398/0x591 (64)
Nov 6 20:29:56 lambda kernel: [preempt_schedule+80/106]
preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:56 lambda kernel: [<c02ac5ab>] preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:56 lambda kernel: [queue_work+44/101] queue_work+0x2c/0x65 (20)
Nov 6 20:29:56 lambda kernel: [<c0125d66>] queue_work+0x2c/0x65 (20)
Nov 6 20:29:56 lambda kernel: [call_usermodehelper+274/316]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:56 lambda kernel: [<c0125caa>] call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:56 lambda kernel: [__call_usermodehelper+0/72]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:56 lambda kernel: [<c0125b50>]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:56 lambda kernel: [do_page_fault+0/1425]
do_page_fault+0x0/0x591 (52)
Nov 6 20:29:56 lambda kernel: [<c0110f7b>] do_page_fault+0x0/0x591 (52)
Nov 6 20:29:56 lambda kernel: [error_code+45/56] error_code+0x2d/0x38 (8)
Nov 6 20:29:56 lambda kernel: [<c010461d>] error_code+0x2d/0x38 (8)
Nov 6 20:29:56 lambda kernel: [__up_write+109/721] __up_write+0x6d/0x2d1
(52)
Nov 6 20:29:56 lambda kernel: [<c012ae47>] __up_write+0x6d/0x2d1 (52)
Nov 6 20:29:56 lambda kernel: [invalidate_inode_buffers+26/240]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:56 lambda kernel: [<c015258f>]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:56 lambda kernel: [up+80/173] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [<c012b957>] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [<c01b1cef>] kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [<c01b1cf1>] kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [<c01b2597>] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [<c01f79d2>] bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [<c01f7d44>] driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [pci_unregister_driver+11/19]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [<c01b95e4>]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [sys_delete_module+287/299]
sys_delete_module+0x11f/0x12b (8)
Nov 6 20:29:56 lambda kernel: [<c012d91f>] sys_delete_module+0x11f/0x12b
(8)
Nov 6 20:29:56 lambda kernel: [unmap_vma_list+14/23]
unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [<c0144c7d>] unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(12)
Nov 6 20:29:56 lambda kernel: [<c0144f7a>] do_munmap+0x11a/0x176 (12)
Nov 6 20:29:56 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [<c014500e>] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: [<c0103bc1>] sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: BUG: scheduling while atomic:
rmmod/0x00000003/15965
Nov 6 20:29:56 lambda kernel: caller is do_exit+0x289/0x4b2
Nov 6 20:29:56 lambda kernel: [__schedule+1194/1525]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:56 lambda kernel: [<c02ac2fa>]
__sched_text_start+0x4aa/0x5f5 (8)
Nov 6 20:29:56 lambda kernel: [exit_notify+1154/2290]
exit_notify+0x482/0x8f2 (24)
Nov 6 20:29:56 lambda kernel: [<c01187ba>] exit_notify+0x482/0x8f2 (24)
Nov 6 20:29:56 lambda kernel: [kmem_cache_free+72/197]
kmem_cache_free+0x48/0xc5 (24)
Nov 6 20:29:56 lambda kernel: [<c013bd6d>] kmem_cache_free+0x48/0xc5 (24)
Nov 6 20:29:56 lambda kernel: [do_exit+649/1202] do_exit+0x289/0x4b2 (32)
Nov 6 20:29:56 lambda kernel: [<c0118eb3>] do_exit+0x289/0x4b2 (32)
Nov 6 20:29:56 lambda kernel: [do_divide_error+0/330]
do_divide_error+0x0/0x14a (40)
Nov 6 20:29:56 lambda kernel: [<c0104d83>] do_divide_error+0x0/0x14a (40)
Nov 6 20:29:56 lambda kernel: [do_page_fault+920/1425]
do_page_fault+0x398/0x591 (64)
Nov 6 20:29:56 lambda kernel: [<c0111313>] do_page_fault+0x398/0x591 (64)
Nov 6 20:29:56 lambda kernel: [preempt_schedule+80/106]
preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:56 lambda kernel: [<c02ac5ab>] preempt_schedule+0x50/0x6a (80)
Nov 6 20:29:56 lambda kernel: [queue_work+44/101] queue_work+0x2c/0x65 (20)
Nov 6 20:29:56 lambda kernel: [<c0125d66>] queue_work+0x2c/0x65 (20)
Nov 6 20:29:56 lambda kernel: [call_usermodehelper+274/316]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:56 lambda kernel: [<c0125caa>]
call_usermodehelper+0x112/0x13c (24)
Nov 6 20:29:56 lambda kernel: [__call_usermodehelper+0/72]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:56 lambda kernel: [<c0125b50>]
__call_usermodehelper+0x0/0x48 (20)
Nov 6 20:29:56 lambda kernel: [do_page_fault+0/1425]
do_page_fault+0x0/0x591 (52)
Nov 6 20:29:56 lambda kernel: [<c0110f7b>] do_page_fault+0x0/0x591 (52)
Nov 6 20:29:56 lambda kernel: [error_code+45/56] error_code+0x2d/0x38 (8)
Nov 6 20:29:56 lambda kernel: [<c010461d>] error_code+0x2d/0x38 (8)
Nov 6 20:29:56 lambda kernel: [__up_write+109/721] __up_write+0x6d/0x2d1
(52)
Nov 6 20:29:56 lambda kernel: [<c012ae47>] __up_write+0x6d/0x2d1 (52)
Nov 6 20:29:56 lambda kernel: [invalidate_inode_buffers+26/240]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:56 lambda kernel: [<c015258f>]
invalidate_inode_buffers+0x1a/0xf0 (12)
Nov 6 20:29:56 lambda kernel: [up+80/173] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [<c012b957>] up+0x50/0xad (56)
Nov 6 20:29:56 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [<c01b1cef>] kobject_cleanup+0x8e/0x90 (36)
Nov 6 20:29:56 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [<c01b1cf1>] kobject_release+0x0/0x8 (8)
Nov 6 20:29:56 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [<c01b2597>] kref_put+0x51/0xc2 (12)
Nov 6 20:29:56 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [<c01f79d2>] bus_remove_driver+0x3f/0x48 (36)
Nov 6 20:29:56 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [<c01f7d44>] driver_unregister+0xb/0x1a (8)
Nov 6 20:29:56 lambda kernel: [pci_unregister_driver+11/19]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [<c01b95e4>]
pci_unregister_driver+0xb/0x13 (8)
Nov 6 20:29:56 lambda kernel: [sys_delete_module+287/299]
sys_delete_module+0x11f/0x12b (8)
Nov 6 20:29:56 lambda kernel: [<c012d91f>] sys_delete_module+0x11f/0x12b
(8)
Nov 6 20:29:56 lambda kernel: [unmap_vma_list+14/23]
unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [<c0144c7d>] unmap_vma_list+0xe/0x17 (20)
Nov 6 20:29:56 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(12)
Nov 6 20:29:56 lambda kernel: [<c0144f7a>] do_munmap+0x11a/0x176 (12)
Nov 6 20:29:56 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [<c014500e>] sys_munmap+0x38/0x45 (36)
Nov 6 20:29:56 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda kernel: [<c0103bc1>] sysenter_past_esp+0x52/0x71 (12)
Nov 6 20:29:56 lambda alsa: succeeded
Nov 6 20:29:56 lambda rc: Stopping alsa: succeeded
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
On Saturday 06 November 2004 16:57, Ingo Molnar wrote:
>
> i have released the -V0.7.18 Real-Time Preemption patch, which can be
> downloaded from:
>
Hi Ingo,
I got this on a netconsole when I hit <TAB> in bash to complete "cat /proc/acpi":
>>>>>>
BUG: sleeping function called from invalid context bash(5364) at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1322
in_atomic():1 [00000001], irqs_disabled():0
[<c010803e>] <3>BUG: scheduling while atomic: bash/0x00000001/5364
caller is schedule+0x38/0x12e
[<c010803e>] dump_stack+0x23/0x25 (20)
[<c0311096>] __sched_text_start+0x9f6/0xdb4 (124)
[<c031148c>] schedule+0x38/0x12e (36)
[<c0312904>] __down_mutex+0x231/0x31a (84)
[<c013d23a>] __spin_lock+0x46/0x50 (24)
[<c013d2be>] _spin_lock_irqsave+0x1d/0x21 (16)
[<c025f03b>] e100_xmit_frame+0x3c/0x2d8 (60)
[<c02ae6c0>] netpoll_send_skb+0x23/0xb3 (28)
[<c0263066>] write_msg+0x56/0xfa (52)
[<c0124099>] __call_console_drivers+0x59/0x6c (32)
[<c01241c0>] call_console_drivers+0x8c/0x163 (36)
[<c01245bd>] release_console_sem+0x33/0xde (32)
[<c01244c3>] vprintk+0x134/0x16d (36)
[<c012438d>] printk+0x1d/0x1f (16)
[<c0107f18>] show_trace+0x65/0xcd (36)
[<c010803e>] dump_stack+0x23/0x25 (20)
[<c01203ab>] __might_sleep+0xbc/0xcf (36)
[<c013d228>] __spin_lock+0x34/0x50 (24)
[<c013d2be>] _spin_lock_irqsave+0x1d/0x21 (16)
[<c01437cd>] search_module_extables+0x1c/0x87 (32)
[<c01375d9>] search_exception_tables+0x39/0x3b (24)
[<c011b5d6>] fixup_exception+0x1a/0x34 (16)
[<c011ae50>] do_page_fault+0x37e/0x64e (220)
[<c0107c63>] error_code+0x2b/0x30 (100)
[<c0199214>] proc_lookup+0x81/0xc9 (52)
[<c0176318>] real_lookup+0xb2/0xd6 (36)
[<c017663f>] do_lookup+0x82/0x8d (32)
[<c0176dbf>] link_path_walk+0x775/0x1071 (108)
[<c01779e2>] path_lookup+0xa5/0x1b0 (32)
[<c0177c8f>] __user_walk+0x30/0x4d (32)
[<c017211b>] vfs_stat+0x1f/0x5a (92)
[<c01727b4>] sys_stat64+0x1e/0x3d (100)
[<c0107191>] sysenter_past_esp+0x52/0x71 (-4028)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c031300b>] .... _raw_spin_lock+0x1c/0x73
.....[<c011bb74>] .. ( <= task_rq_lock+0x32/0x5b)
.. [<c013ec19>] .... print_traces+0x1b/0x52
.....[<c010803e>] .. ( <= dump_stack+0x23/0x25)
<<<<<<
then follow lots of
>>>>>>
BUG: scheduling while atomic: bash/0x00000001/5364
caller is schedule+0x38/0x12e
[<c010803e>] dump_stack+0x23/0x25 (20)
[<c0311096>] __sched_text_start+0x9f6/0xdb4 (124)
[<c031148c>] schedule+0x38/0x12e (36)
[<c0312904>] __down_mutex+0x231/0x31a (84)
[<c013d23a>] __spin_lock+0x46/0x50 (24)
[<c013d2be>] _spin_lock_irqsave+0x1d/0x21 (16)
[<c025f03b>] e100_xmit_frame+0x3c/0x2d8 (60)
[<c02ae6c0>] netpoll_send_skb+0x23/0xb3 (28)
[<c0263066>] write_msg+0x56/0xfa (52)
[<c0124099>] __call_console_drivers+0x59/0x6c (32)
[<c01241e0>] call_console_drivers+0xac/0x163 (36)
[<c01245bd>] release_console_sem+0x33/0xde (32)
[<c01244c3>] vprintk+0x134/0x16d (36)
[<c012438d>] printk+0x1d/0x1f (16)
[<c0107f18>] show_trace+0x65/0xcd (36)
[<c010803e>] dump_stack+0x23/0x25 (20)
[<c01203ab>] __might_sleep+0xbc/0xcf (36)
[<c013d228>] __spin_lock+0x34/0x50 (24)
[<c013d2be>] _spin_lock_irqsave+0x1d/0x21 (16)
[<c01437cd>] search_module_extables+0x1c/0x87 (32)
[<c01375d9>] search_exception_tables+0x39/0x3b (24)
[<c011b5d6>] fixup_exception+0x1a/0x34 (16)
[<c011ae50>] do_page_fault+0x37e/0x64e (220)
[<c0107c63>] error_code+0x2b/0x30 (100)
[<c0199214>] proc_lookup+0x81/0xc9 (52)
[<c0176318>] real_lookup+0xb2/0xd6 (36)
[<c017663f>] do_lookup+0x82/0x8d (32)
[<c0176dbf>] link_path_walk+0x775/0x1071 (108)
[<c01779e2>] path_lookup+0xa5/0x1b0 (32)
[<c0177c8f>] __user_walk+0x30/0x4d (32)
[<c017211b>] vfs_stat+0x1f/0x5a (92)
[<c01727b4>] sys_stat64+0x1e/0x3d (100)
[<c0107191>] sysenter_past_esp+0x52/0x71 (-4028)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c031300b>] .... _raw_spin_lock+0x1c/0x73
.....[<c011bb74>] .. ( <= task_rq_lock+0x32/0x5b)
.. [<c013ec19>] .... print_traces+0x1b/0x52
.....[<c010803e>] .. ( <= dump_stack+0x23/0x25)
<<<<<<
then the machine is frozen.
Best regards,
Karsten
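For readers decoding the "scheduling while atomic" lines above: the three slash-separated fields are the task name, the preempt count in hex, and the pid. Below is a minimal sketch of splitting that count into its parts. It assumes the stock 2.6-era i386 bit layout (8 preemption-depth bits, 8 softirq bits, 12 hardirq bits, PREEMPT_ACTIVE at 0x10000000); the -RT tree may lay these out differently, and the struct and function names are made up for illustration.

```c
#include <stdint.h>

/* Assumed layout (stock 2.6-era i386); not taken from the -RT patch itself. */
#define PREEMPT_MASK_SKETCH   0x000000ffu  /* preempt_disable() nesting depth */
#define SOFTIRQ_MASK_SKETCH   0x0000ff00u  /* softirq nesting */
#define HARDIRQ_MASK_SKETCH   0x0fff0000u  /* hardirq nesting */
#define PREEMPT_ACTIVE_SKETCH 0x10000000u  /* task is being preempted right now */

/* Hypothetical container for the decoded fields. */
struct preempt_fields {
    unsigned preempt_depth;
    unsigned softirq_depth;
    unsigned hardirq_depth;
    int preempt_active;
};

/* Split a raw preempt count, as printed in the BUG line, into its fields. */
struct preempt_fields decode_preempt_count(uint32_t count)
{
    struct preempt_fields f;

    f.preempt_depth  = count & PREEMPT_MASK_SKETCH;
    f.softirq_depth  = (count & SOFTIRQ_MASK_SKETCH) >> 8;
    f.hardirq_depth  = (count & HARDIRQ_MASK_SKETCH) >> 16;
    f.preempt_active = (count & PREEMPT_ACTIVE_SKETCH) != 0;
    return f;
}
```

Under that assumption, 0x10000003 from the rmmod trace would read as preemption depth 3 with PREEMPT_ACTIVE set, and 0x00000001 from the bash trace as preemption depth 1 with nothing else set.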
* Karsten Wiese <[email protected]> wrote:
> got this on a netconsole when I hit <TAB> in bash to complete "cat
> /proc/acpi":
does the patch below help?
Ingo
--- linux/kernel/module.c.orig
+++ linux/kernel/module.c
@@ -53,7 +54,7 @@
 #define INIT_OFFSET_MASK (1UL << (BITS_PER_LONG-1))
 /* Protects module list */
-static spinlock_t modlist_lock = SPIN_LOCK_UNLOCKED;
+static DECLARE_RAW_SPINLOCK(modlist_lock);
 /* List of modules, protected by module_mutex AND modlist_lock */
 static DECLARE_MUTEX(module_mutex);
[<c01080bf>] show_stack+0x7f/0xa0 (28)
[<c0108275>] show_registers+0x165/0x1d0 (56)
[<c010848c>] die+0xec/0x190 (64)
[<c0108a5e>] do_invalid_op+0x11e/0x120 (192)
[<c0107d07>] error_code+0x2b/0x30 (80)
[<c01e643d>] __journal_clean_checkpoint_list+0x5d/0x90 (44)
[<c01e3563>] journal_commit_transaction+0x273/0x1ac0 (416)
[<c01e7d9d>] kjournald+0x18d/0x430 (196)
[<c0105315>] kernel_thread_helper+0x5/0x10 (811573268)
Code: 4f 00 00 89 1c 24 e8 7b de f7 ff eb 9b c7 04 24 70 cd 3e c0 b9 ba 40 3f c0 be ff 02 00 00 89 74 24 08 89 4c 24 04 e8 2b 9f f3 ff <0f> 0b ff 02 ba 40 3f c0 eb bd 90 55 89 e5 57 56 53 83 ec 08 8b
kjournald:1445 BUG: lock held at task exit time!
[cf4ebc14] {&journal->j_state_lock}
.. held by: kjournald/ 1445 [cf9866c0, 115]
... acquired at: journal_commit_transaction+0xdf/0x1ac0
kjournald:1445 BUG: lock held at task exit time!
[cf4ebe50] {&journal->j_list_lock}
.. held by: kjournald/ 1445 [cf9866c0, 115]
... acquired at: journal_commit_transaction+0x268/0x1ac0
IRQ#7 thread RT prio: 39.
============================================
[ BUG: circular locking deadlock detected! ]
--------------------------------------------
bash/3986 is deadlocking current task bash/10630
1) bash/10630 is trying to acquire this lock:
[cdc048e0] {&inode->i_sem}
.. held by: bash/ 3986 [c8c87890, 115]
... acquired at: vfs_readdir+0x53/0xa0
... trying at: vfs_readdir+0x53/0xa0
2) bash/3986 is blocked on this lock:
[cf4ebc14] {&journal->j_state_lock}
.. held by: bash/10630 [cf9866c0, 115]
... acquired at: journal_commit_transaction+0xdf/0x1ac0
------------------------------
| showing all locks held by: | (bash/10630 [cf9866c0, 115]):
------------------------------
------------------------------
| showing all locks held by: | (bash/3986 [c8c87890, 115]):
------------------------------
#001: [cdc048e0] {&inode->i_sem}
... acquired at: vfs_readdir+0x53/0xa0
bash/3986's [blocked] stackdump:
c3873d70 00000086 c8c87890 c0560060 c04a0440 00000001 00000246 c8c87890
c01451f9 00000001 c044f318 c044f318 00000001 00000000 24535cc0 000f7cf3
c8c87a54 c3872000 c3872000 c8c87890 c3873d94 c03c981b c044f318 00000282
Call Trace:
[<c03c981b>] schedule+0x3b/0x130 (36)
[<c03cae11>] __down_mutex+0x311/0x3b0 (84)
[<c0138091>] __spin_lock+0x31/0x40 (24)
[<c01380b8>] _spin_lock+0x18/0x20 (16)
[<c01dfb96>] start_this_handle+0x66/0x4f0 (152)
[<c01e0136>] journal_start+0xc6/0xf0 (40)
[<c01d41e8>] ext3_dirty_inode+0x38/0xd0 (36)
[<c01855e1>] __mark_inode_dirty+0x1d1/0x1e0 (64)
[<c017cde1>] update_atime+0xd1/0xe0 (52)
[<c01747b3>] vfs_readdir+0x93/0xa0 (36)
[<c0174bdb>] sys_getdents64+0x6b/0xb0 (48)
[<c010720d>] sysenter_past_esp+0x52/0x71 (-8124)
bash/10630's [current] stackdump:
[<c01080fe>] dump_stack+0x1e/0x30 (20)
[<c0136313>] check_deadlock+0x1c3/0x2a0 (44)
[<c0136a22>] task_blocks_on_lock+0x1b2/0x1d0 (48)
[<c03ca976>] __down+0x226/0x350 (76)
[<c0137abe>] down+0x2e/0x160 (48)
[<c0174773>] vfs_readdir+0x53/0xa0 (36)
[<c0174bdb>] sys_getdents64+0x6b/0xb0 (48)
[<c010720d>] sysenter_past_esp+0x52/0x71 (-8124)
showing all tasks:
s init/ 1 [c1279770, 116] (not blocked)
s ksoftirqd/0/ 2 [c12791a0, 105] (not blocked)
s desched/0/ 3 [c1278bd0, 105] (not blocked)
s events/0/ 4 [c1278600, 98] (not blocked)
s khelper/ 5 [c1278030, 106] (not blocked)
s kthread/ 10 [cffdb790, 105] (not blocked)
s kacpid/ 18 [cffdb1c0, 115] (not blocked)
s IRQ 9/ 19 [cffdabf0, 115] (not blocked)
s kblockd/0/ 67 [cffda620, 105] (not blocked)
s pdflush/ 133 [cffda050, 116] (not blocked)
s pdflush/ 134 [cfebb7b0, 116] (not blocked)
s aio/0/ 136 [cfebac10, 106] (not blocked)
s kswapd0/ 135 [cfebb1e0, 116] (not blocked)
s IRQ 12/ 748 [cfeba070, 53] (not blocked)
s IRQ 6/ 763 [c13ab7d0, 50] (not blocked)
s kseriod/ 742 [cfeba640, 125] (not blocked)
s IRQ 14/ 786 [c13ab200, 51] (not blocked)
s IRQ 15/ 789 [c13aac30, 52] (not blocked)
s IRQ 1/ 819 [c13aa660, 54] (not blocked)
s reiserfs/0/ 824 [c13aa090, 105] (not blocked)
s IRQ 8/ 1163 [cf987260, 55] (not blocked)
s khubd/ 1357 [cf1b4820, 119] (not blocked)
s IRQ 4/ 1604 [cf987830, 56] (not blocked)
s IRQ 3/ 1605 [cfbb2700, 57] (not blocked)
s IRQ 7/ 1637 [cf3a7770, 60] (not blocked)
s IRQ 10/ 1970 [cf9860f0, 58] (not blocked)
s IRQ 11/ 2044 [cf5c87a0, 59] (not blocked)
s dhcpcd/ 2056 [cf5c8d70, 117] (not blocked)
s syslogd/ 2097 [cfbb3870, 116] (not blocked)
s klogd/ 2101 [cf2a39b0, 116] (not blocked)
s portmap/ 2127 [cf1b5990, 116] (not blocked)
s rpc.statd/ 2146 [cf1b53c0, 119] (not blocked)
s rsync/ 2223 [cf2a2270, 118] (not blocked)
s atd/ 2281 [cfd0c0b0, 116] (not blocked)
s smartd/ 2291 [cf3a6600, 116] (not blocked)
s sshd/ 2300 [cf1b4df0, 121] (not blocked)
s xinetd/ 2313 [cd4a2bf0, 117] (not blocked)
s dhcpd/ 2322 [cf2a33e0, 119] (not blocked)
s sendmail/ 2339 [cf1b4250, 116] (not blocked)
s sendmail/ 2347 [cfd0cc50, 117] (not blocked)
s gpm/ 2357 [cd4a3790, 116] (not blocked)
s crond/ 2366 [cfd0d7f0, 116] (not blocked)
s xfs/ 2394 [cf986c90, 116] (not blocked)
s dbus-daemon-1/ 2411 [cd4a31c0, 116] (not blocked)
s cupsd/ 2422 [cfd0c680, 116] (not blocked)
s mingetty/ 2632 [cfbb2cd0, 119] (not blocked)
s mingetty/ 2633 [cf5c9910, 119] (not blocked)
s mingetty/ 2634 [cf3a6bd0, 119] (not blocked)
s mingetty/ 2635 [cf2a2e10, 118] (not blocked)
s mingetty/ 2638 [cf5c9340, 118] (not blocked)
s mingetty/ 2639 [cf2a2840, 118] (not blocked)
s gdm-binary/ 2640 [cf5c81d0, 116] (not blocked)
s gdm-binary/ 2875 [cc3271e0, 117] (not blocked)
s X/ 2888 [cc366c30, 115] (not blocked)
s gnome-session/ 2923 [cfd0d220, 115] (not blocked)
s ssh-agent/ 2985 [cd4a2620, 116] (not blocked)
s gconfd-2/ 2996 [cc366090, 116] (not blocked)
s bonobo-activati/ 3519 [c8d1ed10, 116] (not blocked)
s gnome-settings-/ 3521 [cc326070, 115] (not blocked)
s xscreensaver/ 3917 [cc367200, 115] (not blocked)
s gnome-smproxy/ 3944 [cc366660, 117] (not blocked)
s metacity/ 3946 [cf3a6030, 115] (not blocked)
s gnome-panel/ 3948 [c92ff260, 115] (not blocked)
s nautilus/ 3950 [c92fec90, 115] (not blocked)
s nautilus/ 3955 [cc3677d0, 116] (not blocked)
s nautilus/ 3956 [cfbb32a0, 117] (not blocked)
s nautilus/ 3957 [c892a230, 116] (not blocked)
s nautilus/ 3958 [c8d1f2e0, 115] (not blocked)
s nautilus/ 3959 [c8c86150, 117] (not blocked)
s nautilus/ 3960 [c8d1f8b0, 116] (not blocked)
s nautilus/ 3961 [c8d1e170, 116] (not blocked)
s nautilus/ 3962 [c8d1e740, 116] (not blocked)
s nautilus/ 3963 [cd4a2050, 116] (not blocked)
s nautilus/ 3964 [cf3a71a0, 116] (not blocked)
s magicdev/ 3952 [cc326640, 115] (not blocked)
R gnome-terminal/ 3954 [c8ac4e50, 116] (not blocked)
s mapping-daemon/ 3966 [c8ac4880, 116] (not blocked)
s gnome-pty-helpe/ 3985 [c8c86720, 116] (not blocked)
D bash/ 3986 [c8c87890, 115] blocked on: [cf4ebc14] {&journal->j_state_lock}
.. held by: bash/10630 [cf9866c0, 115]
... acquired at: journal_commit_transaction+0xdf/0x1ac0
s notification-ar/ 3988 [c8c872c0, 115] (not blocked)
s bash/ 3989 [c8c86cf0, 115] (not blocked)
s wnck-applet/ 4029 [c892b970, 115] (not blocked)
s multiload-apple/ 4071 [c892a800, 116] (not blocked)
s gkb-applet-2/ 4079 [c892b3a0, 115] (not blocked)
s tail/ 4166 [c8ac42b0, 116] (not blocked)
s thunderbird/10586 [cc3277b0, 118] (not blocked)
s run-mozilla.sh/10598 [c92ff830, 119] (not blocked)
s thunderbird-bin/10603 [cc326c10, 115] (not blocked)
s thunderbird-bin/10604 [cfbb2130, 116] (not blocked)
s thunderbird-bin/10606 [c8ac5420, 116] (not blocked)
D bash/10630 [cf9866c0, 115] (not blocked)
---------------------------
| showing all locks held: |
---------------------------
#001: [c057a080] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#002: [c0579c94] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#003: [c0579e70] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#004: [c057a6fc] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#005: [c057a310] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#006: [c057a4ec] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#007: [c057ad78] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#008: [c057a98c] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#009: [c057ab68] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#010: [c057b3f4] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#011: [c057b008] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#012: [c057b1e4] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#013: [c057ba70] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#014: [c057b684] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#015: [c057b860] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#016: [c057c0ec] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#017: [c057bd00] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#018: [c057bedc] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#019: [c057c768] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#020: [c057c37c] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#021: [c057c558] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#022: [c057cde4] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#023: [c057c9f8] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#024: [c057cbd4] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#025: [c057d460] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#026: [c057d074] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#027: [c057d250] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#028: [c057dadc] {&hwif->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x98/0x180
#029: [c057d6f0] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#030: [c057d8cc] {&drive->gendev_rel_sem}
.. held by: init/ 1 [c1279770, 116]
... acquired at: init_hwif_data+0x16d/0x180
#031: [cfd07cf0] {&tty->atomic_read}
.. held by: mingetty/ 2632 [cfbb2cd0, 119]
... acquired at: read_chan+0x6c9/0x720
#032: [cc280cf0] {&tty->atomic_read}
.. held by: mingetty/ 2633 [cf5c9910, 119]
... acquired at: read_chan+0x6c9/0x720
#033: [cc349cf0] {&tty->atomic_read}
.. held by: mingetty/ 2638 [cf5c9340, 118]
... acquired at: read_chan+0x6c9/0x720
#034: [cc39fcf0] {&tty->atomic_read}
.. held by: mingetty/ 2639 [cf2a2840, 118]
... acquired at: read_chan+0x6c9/0x720
#035: [cc32bcf0] {&tty->atomic_read}
.. held by: mingetty/ 2635 [cf2a2e10, 118]
... acquired at: read_chan+0x6c9/0x720
#036: [cc3e1cf0] {&tty->atomic_read}
.. held by: mingetty/ 2634 [cf3a6bd0, 119]
... acquired at: read_chan+0x6c9/0x720
#037: [c4031cf0] {&tty->atomic_read}
.. held by: bash/ 3989 [c8c86cf0, 115]
... acquired at: read_chan+0x6c9/0x720
#038: [cdc048e0] {&inode->i_sem}
.. held by: bash/ 3986 [c8c87890, 115]
... acquired at: vfs_readdir+0x53/0xa0
=============================================
[ turning off deadlock detection. Please report this trace. ]
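The report above boils down to a two-step chain: bash/10630 wants i_sem, which is held by bash/3986, which is itself blocked on j_state_lock, which is recorded as held by bash/10630. A minimal userspace sketch of that kind of chain walk follows. It is not the -RT check_deadlock() code; the types, field names and the depth limit are assumptions made for illustration.

```c
#include <stdbool.h>
#include <stddef.h>

struct task_sketch;

/* Hypothetical lock record: only the owner matters for the walk. */
struct lock_sketch {
    const char *name;
    struct task_sketch *owner;       /* NULL if nobody holds it */
};

/* Hypothetical task record: a task waits on at most one lock. */
struct task_sketch {
    int pid;
    struct lock_sketch *blocked_on;  /* NULL if the task is not blocked */
};

/*
 * Follow "lock -> owner -> lock the owner waits for -> ..." starting at
 * the lock that `current` is about to block on. If the chain leads back
 * to `current`, blocking would close a cycle. max_depth bounds the walk
 * in case the chain loops without ever passing through `current`.
 */
bool would_deadlock(const struct task_sketch *current,
                    const struct lock_sketch *wanted, int max_depth)
{
    const struct lock_sketch *lock = wanted;

    for (int depth = 0; lock != NULL && depth < max_depth; depth++) {
        const struct task_sketch *owner = lock->owner;

        if (owner == NULL)
            return false;            /* lock is free, chain ends */
        if (owner == current)
            return true;             /* chain leads back to us */
        lock = owner->blocked_on;    /* continue with what the owner waits for */
    }
    return false;
}
```

With the two tasks and two locks from the report plugged in, the walk goes i_sem, bash/3986, j_state_lock, bash/10630 and stops there, which matches the "1)" and "2)" steps printed by the detector.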
i have released the -V0.7.19 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this release includes fixes only.
Changes since -V0.7.18:
- fixed a merge bug introduced in -V0.7.18, breaking bit-spinlocks used
by ext3's journalling code. This could/should fix the kjournald crash
reported by Adam Heath, Gunther Persoons and Eran Mann. Bug triggered
on !SMP kernels only.
- added upstream patch to fix a crash in bttv/btcx_riscmem_free(),
reported by Shane Shrybman.
- made modlist_lock raw again - this could fix the /proc/acpi related
asserts reported by Karsten Wiese.
- fixed -RT locking bug in zap_completion_queue(), this could fix the
asserts reported by Shane Shrybman and others.
to create a -V0.7.19 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.19
Ingo
* Eran Mann <[email protected]> wrote:
> Ingo Molnar wrote:
> >i have released the -V0.7.18 Real-Time Preemption patch, which can be
> >downloaded from:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> I got the attached oops on 2.6.10-rc1-mm3-RT-V0.7.18 (probably during the
> daily cron job). Later in the morning when I tried to access some
> filesystems I got the attached deadlock report.
> Nov 8 04:19:32 eran kernel: BUG at include/linux/spinlock.h:767!
> Nov 8 04:19:32 eran kernel: ------------[ cut here ]------------
> Nov 8 04:19:32 eran kernel: kernel BUG at include/linux/spinlock.h:767!
ok, your bug report pinpointed the bug: an RT-patch merging mistake i made
when i merged -RT onto the spinlock-checker changes in recent BK.
the fix is below, but i've also put it into -V0.7.20 (which i released a
couple of minutes ago). Does this patch (or -V0.7.20) fix the kjournald
crash for you?
Ingo
--- linux/include/linux/spinlock.h.orig
+++ linux/include/linux/spinlock.h
@@ -750,7 +750,7 @@ static inline void bit_spin_lock(int bit
  */
 static inline int bit_spin_trylock(int bitnum, unsigned long *addr)
 {
-#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK)
+#if defined(CONFIG_SMP) || defined(CONFIG_DEBUG_SPINLOCK) || defined(CONFIG_PREEMPT)
 	if (test_and_set_bit(bitnum, addr))
 		return 0;
 #endif
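The point of the one-line change is that the lock bit has to be really set whenever another context can run while the lock is held. With the test compiled out, nothing records ownership, which is presumably only safe when nothing can preempt the holder. A minimal userspace sketch of the bit-lock idea is shown below. It uses C11 atomics as a stand-in for the kernel's test_and_set_bit(), leaves out all preemption handling, and every name in it is hypothetical.

```c
#include <stdatomic.h>
#include <stdbool.h>

/*
 * Stand-in for test_and_set_bit(): atomically set the bit and report
 * whether it was already set before the call.
 */
static bool test_and_set_bit_sketch(int bitnum, _Atomic unsigned long *addr)
{
    unsigned long mask = 1UL << bitnum;
    unsigned long old = atomic_fetch_or(addr, mask);

    return (old & mask) != 0;
}

/* Try to take the lock kept in one bit of *addr; true on success. */
bool bit_trylock_sketch(int bitnum, _Atomic unsigned long *addr)
{
    if (test_and_set_bit_sketch(bitnum, addr))
        return false;   /* bit was already set: someone else holds it */
    return true;        /* we set the bit, so we own the lock */
}

/* Release the lock by clearing its bit; other bits are left alone. */
void bit_unlock_sketch(int bitnum, _Atomic unsigned long *addr)
{
    atomic_fetch_and(addr, ~(1UL << bitnum));
}
```

In this sketch a second trylock on the same bit fails until the first holder unlocks. If the test-and-set were skipped, as in the unpatched !SMP build, both callers would be told they own the lock.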
i have released the -V0.7.20 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this release includes a single fix relative to -V0.7.19: it fixes the
nondebug build errors reported by Rui Nuno Capela and Peter Zijlstra,
introduced in -V0.7.18.
to create a -V0.7.20 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.20
Ingo
On Monday 08 November 2004 10:16, Ingo Molnar wrote:
>
> i have released the -V0.7.19 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this release includes fixes only.
>
> Changes since -V0.7.18:
>
> - fixed a merge bug introduced in -V0.7.18, breaking bit-spinlocks used
> by ext3's journalling code. This could/should fix the kjournald crash
> reported by Adam Heath, Gunther Persoons and Eran Mann. Bug triggered
> on !SMP kernels only.
>
> - added upstream patch to fix a crash in bttv/btcx_riscmem_free(),
> reported by Shane Shrybman.
>
> - made modlist_lock raw again - this could fix the /proc/acpi related
> asserts reported by Karsten Wiese.
Doesn't. Please see attached logs.
RT-V0.7.19-dmesg_after_boot_rl3.log is a freshly booted dmesg output after logging on via ssh.
RT-V0.7.19-proc_acpi_TAB.log was captured via netconsole. This log started after logging in locally,
then typing in "cat /proc/acpi"; the first <TAB> gives an additional "/", the 2nd <TAB> gives no visual effect, and the 3rd <TAB> produces what's in the log.
thanks,
Karsten
* Karsten Wiese <[email protected]> wrote:
> RT-V0.7.19-dmesg_after_boot_rl3.log is a freshly booted dmesg output
> after logging on via ssh. RT-V0.7.19-proc_acpi_TAB.log is captured via
> netconsole. This log started after logging in locally, then typing in
> "cat /proc/acpi", then first <TAB> gives an additional "/", 2nd <TAB>
> gives no visual effect, 3rd <TAB> produces whats in the log.
could you try this with vanilla -mm3 too? The crash seems to be generic:
[<c011ae89>] do_page_fault+0x3b7/0x64e (220)
[<c0107c63>] error_code+0x2b/0x30 (100)
[<c01991e4>] proc_lookup+0x81/0xc9 (52)
[<c01762e8>] real_lookup+0xb2/0xd6 (36)
[<c017660f>] do_lookup+0x82/0x8d (32)
[<c0176d8f>] link_path_walk+0x775/0x1071 (108)
[<c01779b2>] path_lookup+0xa5/0x1b0 (32)
[<c0177c5f>] __user_walk+0x30/0x4d (32)
[<c01720eb>] vfs_stat+0x1f/0x5a (92)
[<c0172784>] sys_stat64+0x1e/0x3d (100)
[<c0107191>] sysenter_past_esp+0x52/0x71 (-4028)
while -RT made it a bit more murky by emitting an assert while the
kernel tried to crash in a critical section, it doesn't seem to be a
genuine -RT related crash.
(if it doesn't trigger in vanilla -mm3 then could you try -RT with
PREEMPT_REALTIME disabled?)
Ingo
* Karsten Wiese <[email protected]> wrote:
> RT-V0.7.19-dmesg_after_boot_rl3.log is a freshly booted dmesg output
> after logging on via ssh. RT-V0.7.19-proc_acpi_TAB.log is captured via
> netconsole. This log started after logging in locally, then typing in
> "cat /proc/acpi", then first <TAB> gives an additional "/", 2nd <TAB>
> gives no visual effect, 3rd <TAB> produces whats in the log.
there's at least one more netconsole buglet causing asserts, which
should be fixed by the patch below.
Ingo
--- linux/net/core/netpoll.c.orig
+++ linux/net/core/netpoll.c
@@ -194,7 +194,7 @@ repeat:
}
spin_lock(&np->dev->xmit_lock);
- np->dev->xmit_lock_owner = smp_processor_id();
+ np->dev->xmit_lock_owner = _smp_processor_id();
/*
* network drivers do not expect to be called if the queue is
On Monday 08 November 2004 11:19, Ingo Molnar wrote:
>
> * Karsten Wiese <[email protected]> wrote:
>
> > RT-V0.7.19-dmesg_after_boot_rl3.log is a freshly booted dmesg output
> > after logging on via ssh. RT-V0.7.19-proc_acpi_TAB.log is captured via
> > netconsole. This log started after logging in locally, then typing in
> > "cat /proc/acpi", then first <TAB> gives an additional "/", 2nd <TAB>
> > gives no visual effect, 3rd <TAB> produces whats in the log.
>
> could you try this with vanilla -mm3 too? The crash seems to be generic:
>
> [<c011ae89>] do_page_fault+0x3b7/0x64e (220)
> [<c0107c63>] error_code+0x2b/0x30 (100)
> [<c01991e4>] proc_lookup+0x81/0xc9 (52)
> [<c01762e8>] real_lookup+0xb2/0xd6 (36)
> [<c017660f>] do_lookup+0x82/0x8d (32)
> [<c0176d8f>] link_path_walk+0x775/0x1071 (108)
> [<c01779b2>] path_lookup+0xa5/0x1b0 (32)
> [<c0177c5f>] __user_walk+0x30/0x4d (32)
> [<c01720eb>] vfs_stat+0x1f/0x5a (92)
> [<c0172784>] sys_stat64+0x1e/0x3d (100)
> [<c0107191>] sysenter_past_esp+0x52/0x71 (-4028)
right, on -mm3 bash crashes the same way on "cat /proc/acpi" + 3*<TAB>.
Sent a dmesg output to Andrew under the "2.6.10-rc1-mm3"-thread.
Thanks,
Karsten
On Monday 08 November 2004 10:50, you wrote:
>
> i have released the -V0.7.20 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
At boot time:
BUG: sleeping function called from invalid context modprobe(1782) at kernel/rt.c:1322
in_atomic():1 [00000001], irqs_disabled():1
[dump_stack+35/48] dump_stack+0x23/0x30 (20)
[__might_sleep+194/224] __might_sleep+0xc2/0xe0 (36)
[__spin_lock+56/96] __spin_lock+0x38/0x60 (24)
[_spin_lock+29/32] _spin_lock+0x1d/0x20 (16)
[kmem_cache_alloc+71/272] kmem_cache_alloc+0x47/0x110 (32)
[use_module+164/320] use_module+0xa4/0x140 (32)
[resolve_symbol+171/192] resolve_symbol+0xab/0xc0 (48)
[simplify_symbols+178/288] simplify_symbols+0xb2/0x120 (44)
[load_module+1395/2704] load_module+0x573/0xa90 (160)
[sys_init_module+107/576] sys_init_module+0x6b/0x240 (32)
[syscall_call+7/11] syscall_call+0x7/0xb (-4028)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [resolve_symbol+33/192] .... resolve_symbol+0x21/0xc0
.....[simplify_symbols+178/288] .. ( <= simplify_symbols+0xb2/0x120)
.. [print_traces+29/144] .... print_traces+0x1d/0x90
.....[dump_stack+35/48] .. ( <= dump_stack+0x23/0x30)
--
I route therefore you are
Ingo Molnar wrote:
>
> i have released the -V0.7.20 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
I'm seeing many of these in dmesg, almost every time a module is getting
loaded:
BUG: sleeping function called from invalid context insmod(1357) at
kernel/rt.c:1322
in_atomic():1 [00000001], irqs_disabled():1
[<c0105040>] dump_stack+0x23/0x25 (20)
[<c0116026>] __might_sleep+0xbc/0xcf (36)
[<c01321d1>] __spin_lock+0x38/0x57 (24)
[<c013220d>] _spin_lock+0x1d/0x1f (16)
[<c0145386>] kmem_cache_alloc+0x3f/0xfe (32)
[<c01358ff>] use_module+0xa8/0x13a (32)
[<c01363db>] resolve_symbol+0x79/0x8c (40)
[<c0136a1d>] simplify_symbols+0xc4/0xfe (44)
[<c013779b>] load_module+0x64f/0x989 (144)
[<c0137b2c>] sys_init_module+0x57/0x23d (32)
[<c0104201>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c013637f>] .... resolve_symbol+0x1d/0x8c
.....[<c0136a1d>] .. ( <= simplify_symbols+0xc4/0xfe)
.. [<c0133cdf>] .... print_traces+0x1b/0x54
.....[<c0105040>] .. ( <= dump_stack+0x23/0x25)
Another critical issue is that USB is not working properly; the ohci_hcd
module gets loaded but devices don't get listed by lsusb at all, which I
think is a "bonus" from upstream 2.6.9-rc1-mm3.
In fact I think -mm3 is breaking many things around here, at least on my
laptop; another notorious one is my wifi stuff (linux-wlan-ng/prism2_cs) now
failing due to unresolved symbols, such as these:
prism2_cs: Unknown symbol p80211netdev_rx
prism2_cs: Unknown symbol register_wlandev
prism2_cs: Unknown symbol wlan_unsetup
prism2_cs: Unknown symbol unregister_wlandev
prism2_cs: Unknown symbol p80211netdev_hwremoved
prism2_cs: Unknown symbol wlan_setup
prism2_cs: Unknown symbol p80211skb_rxmeta_attach
But I guess this is off-topic by now.
Take care.
--
rncbc aka Rui Nuno Capela
[email protected]
On Fri, 5 Nov 2004, Bill Huey wrote:
> [...]
> I think of an RT kernel with N threads in terms where it's an SMP
> machine with same N number of processors. If you have N threads
> pounding on the same critical sections, it's effectively N physical
> processors hitting as well.
>
Not quite. On a UP RT system you know that all lower priority tasks are
not running when your task is running. This gives some nice
properties. If you take care not to sleep, your high priority task
effectively blocks all preemption by the lower priority tasks.
On an SMP system you don't have these nice properties. You always have to
take into account that N processes are really running at the same time.
On UP RT systems it is legal to think like this: The processor can
only do one thing. For the overall performance I can just as well lock the
rest out with one big mutex and get my work done in a hurry. This gives
simpler code and it is more efficient since locking/unlocking sections
takes time. It doesn't destroy the latency of the subsystem as long as you
can verify that the maximum locking time is less than the required latency
for that subsystem. If I can even keep my locking time below the maximum
allowed interrupt latency I can optimize it further by using
interrupt disable/enable instead of a mutex!
On SMP systems, on the other hand, such big locking sections give bad
performance. Especially if you use the equivalent of interrupt
disable/enable (spinlocks) you are not only slowing down your own
subsystem but the whole system.
In short: For SMP you have to think about parallelization much more than on a
UP RT system. Of course, thinking of a UP RT system as an SMP system doesn't
make your system fail, but it might give you a suboptimal system. On the
other hand a system running on a UP with full preemption might not be
portable to an SMP system as you might be saved by "the nice properties".
So if you want to be portable, think of it as an SMP system :-)
> [...]
>
> bill
>
Esben
* Esben Nielsen <[email protected]> wrote:
> On a SMP system you don't have these nice properties. You always have
> to take into account that N processes are really running at the same
> time.
not necessarily. In theory we could introduce the notion of
"hyper-high-priority tasks" (e.g. SCHED_HYPER_FIFO), which tasks not
only get preempted on one CPU immediately, but cause the kernel to stop
and loop on all other CPUs as well. That way the same 'nice' properties
of UP kernels get carried over to the SMP system as well, at the cost of
serializing all execution while the hyper-high-prio task is running.
Once the task stops running, the other CPUs can continue as well.
Ingo
i have released the -V0.7.21 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this release includes fixes and debugging-improvements.
Changes since -V0.7.20:
- reverted the modlist_lock change - it caused more problems than it
solved.
- implemented irqs-off critical section timing/tracing, inspired by the
positive results Thomas Gleixner got with a different kind of cli/sti
tracer. To activate it, enable CONFIG_CRITICAL_TIMING and
CONFIG_CRITICAL_IRQSOFF_TIMING and cli/sti latencies will be reported
'integrated' into the preempt on/off latencies.
- sped up tracing in a number of ways. Performance of the tracer slowly
eroded in the past week or two; it needed alignment and size fixes,
inlining/branch-prediction updates and i got rid of unnecessary code.
The max latency is now traced in cycles - this got rid of an
expensive 64-bit division in the fastpath. (the /proc/sys tunables
are still in usecs so userspace should not notice anything.) It's
still not cheap but roughly 5 times faster than -V0.7.20's tracer, on
a fast desktop box.
- renamed CONFIG_PREEMPT_REALTIME to CONFIG_PREEMPT_RT - it's shorter.
- renamed CONFIG_PREEMPT_TIMING to CONFIG_CRITICAL_TIMING and
introduced CONFIG_CRITICAL_IRQSOFF_TIMING to enable cli/sti timing.
to create a -V0.7.21 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.21
Ingo
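
The "traced in cycles" change above can be illustrated with a small sketch. The idea, as described, is to convert the usec tunable to cycles once, so the per-measurement fastpath is only a subtraction and a compare, and to divide only when a value is reported. This is not the -RT tracer itself: struct sketch_tracer, its fields and the cycles_per_usec calibration value are assumptions made for illustration.

```c
#include <stdint.h>

/* Hypothetical tracer state; cycles_per_usec is an assumed calibration
 * value (e.g. 2000 for a 2 GHz CPU). */
struct sketch_tracer {
	uint64_t max_cycles;      /* current maximum, kept in cycles */
	uint32_t cycles_per_usec; /* conversion factor, set up once */
};

/* Slow path: the tunable is written in usecs, converted with a multiply. */
static void sketch_set_max_usecs(struct sketch_tracer *t, uint64_t usecs)
{
	t->max_cycles = usecs * t->cycles_per_usec;
}

/* Fast path: subtraction and compare only, no division.
 * Returns 1 if a new maximum was recorded. */
static int sketch_check_latency(struct sketch_tracer *t,
				uint64_t start, uint64_t end)
{
	uint64_t delta = end - start;

	if (delta <= t->max_cycles)
		return 0;
	t->max_cycles = delta;
	return 1;
}

/* Slow path: the 64-bit division happens only when reporting in usecs. */
static uint64_t sketch_max_usecs(const struct sketch_tracer *t)
{
	return t->max_cycles / t->cycles_per_usec;
}
```

With this split the userspace-visible unit stays usecs, while every measured critical section pays only for the compare.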
On Mon, 8 Nov 2004 10:50:48 +0100
Ingo Molnar <[email protected]> wrote:
> i have released the -V0.7.20 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this release includes a single fix relative to -V0.7.19: it fixes the
> nondebug build errors reported by Rui Nuno Capela and Peter Zijlstra,
> introduced in -V0.7.18.
>
> to create a -V0.7.20 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.20
Hi,
i just wanted to let you know that this one doesn't lock up for me. I
actually built for 486 [to be able to run the image in qemu first]. After
the run in qemu showed no problems, i went to boot the kernel on my real
machine. It seems to work fine so far with rtc_wakeup showing max. jitters
around 30 usec (at f=1024) under load (find's + kernel compile)..
Will build a kernel (0.7.21) with debugging enabled to see if i miss any
BUG's.
flo
Ingo Molnar wrote:
>i have released the -V0.7.21 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this release includes fixes and debugging-improvements.
>
>Changes since -V0.7.20:
>
> - reverted the modlist_lock change - it caused more problems than it
> solved.
>
> - implemented irqs-off critical section timing/tracing, inspired by the
> positive results Thomas Gleixner got with a different kind of cli/sti
> tracer. To activate it, enable CONFIG_CRITICAL_TIMING and
> CONFIG_CRITICAL_IRQSOFF_TIMING and cli/sti latencies will be reported
> 'integrated' into the preempt on/off latencies.
>
> - sped up tracing in a number of ways. Performance of the tracer slowly
> eroded in the past week or two, it needed alignment and size fixes ,
> inlining/branch-prediction updates and i got rid of unnecessary code.
> The max latency is now traced in cycles - this got rid of an
> expensive 64-bit division in the fastpath. (the /proc/sys tunables
> are still in usecs so userspace should not notice anything.) It's
> still not cheap but roughly 5 times faster than -V0.7.20's tracer, on
> a fast desktop box.
>
> - renamed CONFIG_PREEMPT_REALTIME to CONFIG_PREEMPT_RT - it's shorter.
>
> - renamed CONFIG_PREEMPT_TIMING to CONFIG_CRITICAL_TIMING and
> introduced CONFIG_CRITICAL_IRQSOFF_TIMING to enable cli/sti timing.
>
>to create a -V0.7.21 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.21
>
> Ingo
>
>
>
When trying to patch my kernel I get the following notice:
patching file include/linux/highmem.h
patch unexpectedly ends in middle of line
patch: **** unexpected end of file in patch
Hello Ingo,
Ingo Molnar wrote:
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.21
Unfortunately, it seems corrupted.
Regards,
Norberto
Norberto Bensa wrote:
> Hello Ingo,
>
> Ingo Molnar wrote:
>
>>http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.21
>
>
> Unfortunately, it seems corrupted.
>
> Regards,
> Norberto
>
Yeah, what he said. :)
kr
On Mon, 8 Nov 2004, Ingo Molnar wrote:
>
> i have released the -V0.7.19 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this release includes fixes only.
>
> Changes since -V0.7.18:
>
> - fixed a merge bug introduced in -V0.7.18, breaking bit-spinlocks used
> by ext3's journalling code. This could/should fix the kjournald crash
> reported by Adam Heath, Gunther Persoons and Eran Mann. Bug triggered
> on !SMP kernels only.
The last kernel I tried was v0.7.13, so I doubt it was a recent introduction.
Will try something newer soon.
* Gunther Persoons <[email protected]> wrote:
> When trying to patch my kernel i get following notice:
> patching file include/linux/highmem.h
> patch unexpectedly ends in middle of line
> patch: **** unexpected end of file in patch
yeah ... the result of an incomplete upload. I've uploaded -V0.7.22 that
is a full patch. (no other changes)
Ingo
* K.R. Foley <[email protected]> wrote:
> >Unfortunately, it seems corrupted.
> Yeah, what he said. :)
i've uploaded -V0.7.22, please try that instead.
Ingo
On Mon, Nov 08, 2004 at 03:35:38PM +0100, Esben Nielsen wrote:
> Not quite. On a UP RT system you know that all lower priority tasks are
> not running when your task is running. This gives some nice
> properties. If you take care not to sleep your high priority task
> effectively blocks all preemption by the lower priority tasks.
...
> In short: For SMP you have to think about parallelization much more than on a
> UP RT system. Of course, thinking of a UP RT system as an SMP system doesn't
> make your system fail, but it might give you a suboptimal system. On the
> other hand a system running on a UP with full preemption might not be
> portable to an SMP system as you might be saved by "the nice properties".
> So if you want to be portable, think of it as an SMP system :-)
Yeah, good points. I know, I'm being paranoid intentionally since much
of the kernel is so SMP oriented. I'm thinking in terms of large scale
concurrency since the structures of the kernel are fundamentally SMP,
which is tightly related to RT performance. There's a lot of overlap
between both worlds as far as locking structures and fine-grainedness
are concerned.
The really appealing aspect of this project is that the same things that
make Linux such a high performance SMP system out of the box are exactly
what will also give it outstanding RT performance in both UP and SMP
configurations. The fine grained locking issues seem to be largely done
and this work is going to push those boundaries even more.
I know what you're saying about thread run subsets with relation to
priority (again good points), but I'm looking at large picture issues
and how this system is going to behave as all parts work together. We
haven't done this just yet and it's too immature a system to do so
until it becomes more stable and popular. It's a different point of
view than what you're talking about. :)
bill
i have released the -V0.7.23 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this release includes fixes, development/debugging improvements and
latency improvements and other speedups.
the biggest change is the reworked timing/tracing framework. Wakeup
timing became a compile-time thing and can be selected independently of
the preemption mode - i.e. it can now be used on near-vanilla !PREEMPT
kernels too, providing good wakeup-latency comparison of the various
preemption models.
irqs-off and preempt-off critical section timing/tracing can be selected
if wakeup timing is disabled; the two options can be selected separately
or together.
another improvement is that wakeup-timing now works correctly on SMP too
- the tracer 'follows' the highest-priority task in the system as it
gets bounced between CPUs and always traces the CPU where the task is
pending.
other changes since -V0.7.22:
- semaphore livelock fix: feedback from Mark H. Johnson pinpointed a
bug in the down_trylock() semaphore code: if preempted at the wrong
moment a lower-prio task could cause a higher-prio task to livelock
indefinitely. This fix could solve the 'keventd looping' problem
reported by Mark.
the fix is to make the down_trylock() code a bit simpler, but this
also introduces the potential for down_trylock() to get 'spurious'
locking-rejects. Hopefully this won't be a big problem - we don't
really guarantee that down_trylock() succeeds - but code using higher
semaphore counts to track resources could be negatively impacted by
this. We'll see.
- console assert fix: implemented a different type of fbcon
RT-preemption handling variant, this could solve the assert reported
by Amit Shah.
- debugging improvement: implemented a sequence counter for max latency
traces. This has the advantage of solving the 'slow console on SMP
problem': the latency-timing code used to get confused by another CPU
printing a timing message to a slow console and thus delaying that
other CPU. Now any latency that gets generated while a maximum is
printed is skipped.
- further shrunk the non-debug size of struct rt_mutex by moving the
save_state logic into the debug paths - size is now 4 machine-words.
- fixed CONFIG_HIGHMEM latencies: all atomic-kmap APIs are now wrapped
seamlessly and in a preemptible way.
- implemented an IO-APIC register cache to speed up the IRQ-redirection
latency hotpath. Also, made the POST flush a bit faster.
- disable KGDB if PREEMPT_RT - it's broken for now.
- move some runtime checks to under DEBUG_PREEMPT - this speeds up
CRITICAL_PREEMPT_TIMING.
to create a -V0.7.23 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.23
Ingo
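
The sequence counter for max latency traces described above can be sketched roughly as follows. This is a single-threaded illustration of the idea only: a measurement remembers the counter value when it starts, and is thrown away if a new maximum was reported (and the counter bumped) before it finished, since the slow console may have inflated it. The struct and function names are hypothetical, and a real implementation would need proper SMP-safe access.

```c
/* Hypothetical tracer state for the sketch. */
struct sketch_seq_tracer {
	unsigned long max_seq;     /* bumped every time a new maximum is reported */
	unsigned long max_latency; /* current maximum, arbitrary time units */
};

/* Start of a measurement: remember the current sequence number. */
static unsigned long sketch_trace_begin(const struct sketch_seq_tracer *t)
{
	return t->max_seq;
}

/* End of a measurement. Returns 1 if a new maximum was recorded,
 * 0 if the sample was below the maximum or had to be skipped. */
static int sketch_trace_end(struct sketch_seq_tracer *t,
			    unsigned long start_seq, unsigned long latency)
{
	/* A maximum was reported while this measurement was running:
	 * the sample may include console-printing delay, so skip it. */
	if (start_seq != t->max_seq)
		return 0;
	if (latency <= t->max_latency)
		return 0;
	t->max_latency = latency;
	t->max_seq++;
	return 1;
}
```

The cost of this approach is that a genuine latency overlapping a report is also dropped; it will only be recorded if it recurs in a later measurement.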
>i have released the -V0.7.23 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
A few notes on results from running with this patch (well, actually -EA
that Ingo provided separately).
[1] Build problems (separately reported to Ingo) when CONFIG_PREEMPT_RT
is enabled on x86 and modules use kunmap_atomic. Fix by adding
kunmap_virt and kmap_to_page to the list of exports.
[2] The live lock that I was having seems to have been killed based
on an hour of testing (I could usually cause it in 5 minutes or less).
[3] I am not so sure that the latency tracing works. I do not get any
trace output, even if I set preempt_max_latency to zero. I also noticed
that /proc/sys/kernel/preempt_wakeup_timing was removed at .20 but
not sure if that was deliberate. As a result, I have no reports from
the kernel tracing.
[4] Application level latencies are OK but not great.
X test - only 90% of CPU loops are within 100 usec of nominal value.
In previous RT kernels I got > 99% with 100 usec.
top test - looks much nicer than X test, but still have up to 30%
overhead on CPU loop.
network I/O tests - smoothest of all test results, very nice
disk write - very noisy, bursts of long delays with only 82% within
100 usec and worst case has over 100% overhead (2.5 msec vs 1.16 nominal)
disk copy - fewer bursts, but worst case is similar to disk write.
About 95% within 100 usec.
disk read - relatively clean with a pair of modest bursts early in
the test and then settled out a little worse than the network tests.
99.9% of samples within 100 usec and max of 1.65 msec.
[5] concurrent ping of system had over 13% lost packets (1089 out
of 10723 - plus I let this run after the tests had completed). The
2.4 RT kernel I use has no lost packets.
[6] I did a separate run of a script Ingo suggested that samples
the kernel profile data. It shows basically no CPU time for the
events/0 and events/1 tasks. I took 11 samples in about 1/2 hour
of testing but nothing seems to jump out of the data.
--Mark H Johnson
<mailto:[email protected]>
* [email protected] <[email protected]> wrote:
> [3] I am not so sure that the latency tracing works. I do not get any
> trace output, even if I set preempt_max_latency to zero.
What is the value of preempt_thresh? If it's set then it overrides the
preempt_max_latency value. (previously the bogus timing values probably
triggered even a relatively high preempt-threshold, but now with a
correct tracer it's not possible anymore.)
> [...] I also noticed that /proc/sys/kernel/preempt_wakeup_timing was
> removed at .20 but not sure if that was deliberate. [...]
yeah, this was deliberate - it's a side-effect of separating it from the
other timing options.
Ingo
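
The interaction between the two tunables, as explained above, can be sketched like this: a nonzero preempt_thresh takes precedence and only latencies above it are reported, while with preempt_thresh at zero the tracer reports each new maximum above preempt_max_latency. The struct and function below are hypothetical illustrations, not the tracer's actual code, and the exact comparison operators are an assumption.

```c
/* Hypothetical mirror of the two /proc/sys/kernel tunables, in usecs. */
struct sketch_tunables {
	unsigned long preempt_thresh;      /* 0 means "not set" */
	unsigned long preempt_max_latency; /* running maximum */
};

/* Returns 1 if a latency of 'usecs' would be reported. */
static int sketch_should_report(struct sketch_tunables *t, unsigned long usecs)
{
	/* A set threshold overrides max-latency tracking entirely. */
	if (t->preempt_thresh)
		return usecs > t->preempt_thresh;

	/* Otherwise report (and record) only new maxima. */
	if (usecs > t->preempt_max_latency) {
		t->preempt_max_latency = usecs;
		return 1;
	}
	return 0;
}
```

So with a threshold of, say, 200 usecs, writing zero to preempt_max_latency produces no output until something exceeds 200 usecs.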
* [email protected] <[email protected]> wrote:
> >i have released the -V0.7.23 Real-Time Preemption patch, which can be
> >downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> A few notes on results from running with this patch (well, actually
> -EA that Ingo provided separately).
>
> [1] Build problems, separately reported to Ingo with CONFIG_PREEMPT_RT
> enabled on x86 and you have modules using kunmap_atomic. Fix by adding
> kunmap_virt and kmap_to_page to the list of exports.
these should be fixed in -V0.7.24 which i uploaded shortly after
-V0.7.23, based on your reports. In any case, the export problems
trigger with CONFIG_HIGHMEM and the affected subsystems compiled as
modules.
> [2] The live lock that I was having seems to have been killed based on
> an hour of testing (I could usually cause it in 5 minutes or less).
great!
> [4] Application level latencies are OK but not great.
> X test - only 90% of CPU loops are within 100 usec of nominal value.
> In previous RT kernels I got > 99% with 100 usec.
this might be a side-effect of the chrt-ing of events/[0|1] and/or
ksoftirqd (which we did to debug the 'freeze' problems) - are those
still chrt-ed? Please review and double-check all SCHED_FIFO tasks in
the system and keep only those that are absolutely necessary for
latencytest's operation [i.e. the soundcard IRQ and latencytest itself]
- everything else should be SCHED_OTHER. Do latencies get any better if
you do this?
Ingo
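
For reference, moving a task back to SCHED_OTHER can also be done programmatically with the standard sched_setscheduler() call. The helper name below is hypothetical; the sketch assumes the caller has permission to change the target task (changing another user's task or a kernel thread generally needs root).

```c
#include <sched.h>
#include <sys/types.h>

/* Hypothetical helper: put a task back into the default SCHED_OTHER class.
 * pid 0 means the calling process. Returns 0 on success, -1 on error
 * (errno is set by sched_setscheduler()). */
static int sketch_make_sched_other(pid_t pid)
{
	/* The static priority must be 0 for SCHED_OTHER. */
	struct sched_param param = { .sched_priority = 0 };

	return sched_setscheduler(pid, SCHED_OTHER, &param);
}
```

Calling sketch_make_sched_other() on the PIDs of events/0, events/1 and ksoftirqd would undo an earlier chrt to SCHED_FIFO; sched_getscheduler() can be used afterwards to confirm the policy.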
* Karsten Wiese <[email protected]> wrote:
> Hi
>
> On SMP/HT/P4 I get:
> BUG: lock held at task exit time!
> sh/5429: BUG in __up_mutex at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1064
> BUG: sleeping function called from invalid context sh(5429) at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1314
> in_atomic():1 [00000003], irqs_disabled():0
hm, apparently something leaked a BKL count. Unfortunately we don't know
precisely what did it, only that it happened. Did this happen during
bootup, or during normal use? Can you trigger it arbitrarily?
Ingo
>> [4] Application level latencies are OK but not great.
>> X test - only 90% of CPU loops are within 100 usec of nominal value.
>> In previous RT kernels I got > 99% with 100 usec.
>
>this might be a side-effect of the chrt-ing of events/[0|1] and/or
>ksoftirqd (which we did to debug the 'freeze' problems) - are those
>still chrt-ed?
For reference:
# ps -eo pid,pri,rtprio,cmd | grep '\['
1 23 - init [5]
2 139 99 [migration/0]
3 34 - [ksoftirqd/0]
4 34 - [desched/0]
5 139 99 [migration/1]
6 34 - [ksoftirqd/1]
7 34 - [desched/1]
8 41 1 [events/0]
9 41 1 [events/1]
10 34 - [khelper]
15 32 - [kthread]
27 34 - [kblockd/0]
28 34 - [kblockd/1]
36 24 - [khubd]
103 23 - [kswapd0]
104 32 - [aio/0]
105 33 - [aio/1]
180 139 99 [IRQ 8]
195 14 - [kseriod]
201 139 99 [IRQ 12]
237 139 99 [IRQ 14]
239 139 99 [IRQ 15]
278 139 99 [IRQ 1]
310 24 - [kirqd]
313 139 99 [IRQ 4]
320 24 - [kjournald]
605 139 99 [IRQ 10]
1206 24 - [kjournald]
1207 24 - [kjournald]
1309 139 99 [IRQ 3]
1323 31 - [IRQ 7]
1494 139 99 [IRQ 6]
1748 139 99 [IRQ 11]
14131 23 - [pdflush]
14242 24 - [pdflush]
17337 21 - grep \[
>Please review and double-check all SCHED_FIFO tasks in
>the system and keep only those that are absolutely necessary for
>latencytest's operation [i.e. the soundcard IRQ and latencytest itself]
>- everything else should be SCHED_OTHER. Do latencies get any better if
>you do this?
I can, but that is not necessarily an "apples to apples" comparison.
When I compare with 2.4 preempt + low latency kernels, the X stress
test had > 99% of the samples within 100 usec of the nominal value.
Don't forget - on a 2.4 kernel, the IRQ's are all unthreaded. On
the 2.4 kernel, heavy disk I/O is where I get the worst behavior
and even then, I get > 90% of samples within 100 usec.
I still maintain that a 2.6 RT kernel has to do as well or better
than a 2.4 RT kernel (or else, why would I step up??).
--Mark
On Wednesday 10 November 2004 14:52, Karsten Wiese wrote:
> On Tuesday 09 November 2004 17:05, Ingo Molnar wrote:
> >
> > i have released the -V0.7.23 Real-Time Preemption patch, which can be
> > downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> Hi
>
> On SMP/HT/P4 I get:
> BUG: lock held at task exit time!
Forgot to write that this happened with V0.7.24.
>* [email protected] <[email protected]> wrote:
>
>> [3] I am not so sure that the latency tracing works. I do not get any
>> trace output, even if I set preempt_max_latency to zero.
>
>What is the value of preempt_thresh?
It was zero at boot time. Now that I check, something set it to 200.
Setting it back to zero, I now see that I have some extremely
small reports, max so far is 63 usec. Will run my big test again
to see what turns up.
--Mark
On Tuesday 09 November 2004 17:05, Ingo Molnar wrote:
>
> i have released the -V0.7.23 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
Hi
On SMP/HT/P4 I get:
BUG: lock held at task exit time!
This was captured via netconsole:
<NETCONSOLELOG>
apm: disabled - APM is not SMP safe.
sh:5429 BUG: lock held at task exit time!
[c03af1c4] {kernel_sem.lock}
.. held by: sh/ 5429 [f4d2c420, 117]
... acquired at: __reacquire_kernel_lock+0x2e/0x60
sh/5429: BUG in __up_mutex at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1064
BUG: sleeping function called from invalid context sh(5429) at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1314
in_atomic():1 [00000003], irqs_disabled():0
[<c010704e>] dump_stack+0x23/0x25 (20)
[<c011d10b>] __might_sleep+0xbc/0xcf (36)
[<c013a136>] __spin_lock+0x34/0x50 (24)
[<c013a16f>] _spin_lock+0x1d/0x1f (16)
[<c014e36b>] kmem_cache_alloc+0x37/0xf1 (32)
[<c02c95d4>] alloc_skb+0x28/0xe6 (32)
[<c02da818>] find_skb+0x31/0x9b (24)
[<c02da97a>] netpoll_send_udp+0x40/0x25a (48)
[<c028f246>] write_msg+0x56/0xfa (52)
[<c0120e06>] __call_console_drivers+0x56/0x65 (32)
[<c0120f49>] call_console_drivers+0xac/0x163 (36)
[<c0121326>] release_console_sem+0x33/0xde (32)
[<c012122c>] vprintk+0x134/0x16d (36)
[<c01210f6>] printk+0x1d/0x1f (16)
[<c01390ee>] __up_mutex+0x270/0x443 (68)
[<c0139fc2>] up+0xe0/0xec (36)
[<c0342e95>] __schedule+0x985/0xdb4 (124)
[<c0123afc>] do_exit+0x331/0x59f (48)
[<c0123e81>] do_group_exit+0x39/0xd1 (40)
[<c01061a1>] sysenter_past_esp+0x52/0x71 (-4028)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c0142848>] .. ( <= __do_IRQ+0xf6/0x16e)
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c010b340>] .. ( <= timer_interrupt+0xd3/0x10f)
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c013b1c9>] .. ( <= trace_start_sched_wakeup+0x22/0x7e)
.. [<c013b764>] .... print_traces+0x1b/0x52
.....[<c010704e>] .. ( <= dump_stack+0x23/0x25)
[<c010704e>] dump_stack+0x23/0x25 (20)
[<c01390f3>] __up_mutex+0x275/0x443 (68)
[<c0139fc2>] up+0xe0/0xec (36)
[<c0342e95>] __schedule+0x985/0xdb4 (124)
[<c0123afc>] do_exit+0x331/0x59f (48)
[<c0123e81>] do_group_exit+0x39/0xd1 (40)
[<c01061a1>] sysenter_past_esp+0x52/0x71 (-4028)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c0142848>] .. ( <= __do_IRQ+0xf6/0x16e)
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c010b340>] .. ( <= timer_interrupt+0xd3/0x10f)
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c013b1c9>] .. ( <= trace_start_sched_wakeup+0x22/0x7e)
.. [<c013b764>] .... print_traces+0x1b/0x52
.....[<c010704e>] .. ( <= dump_stack+0x23/0x25)
autom4te/5419: BUG in lock_new_owner at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:651
[<c010704e>] dump_stack+0x23/0x25 (20)
BUG: scheduling while atomic: autom4te/0x00000001/5419
caller is schedule+0x38/0x12e
[<c010704e>] dump_stack+0x23/0x25 (20)
[<c0342f06>] __schedule+0x9f6/0xdb4 (124)
[<c03432fc>] schedule+0x38/0x12e (36)
[<c0344758>] __down_mutex+0x225/0x30f (84)
[<c013a148>] __spin_lock+0x46/0x50 (24)
[<c013a1cc>] _spin_lock_irqsave+0x1d/0x21 (16)
[<c028b31e>] e100_xmit_frame+0x13f/0x2d8 (60)
[<c02da8a5>] netpoll_send_skb+0x23/0xb8 (28)
[<c028f246>] write_msg+0x56/0xfa (52)
[<c0120e06>] __call_console_drivers+0x56/0x65 (32)
[<c0120f49>] call_console_drivers+0xac/0x163 (36)
[<c0121326>] release_console_sem+0x33/0xde (32)
[<c012122c>] vprintk+0x134/0x16d (36)
[<c01210f6>] printk+0x1d/0x1f (16)
[<c0106f4c>] show_trace+0x89/0xcd (36)
[<c010704e>] dump_stack+0x23/0x25 (20)
[<c0138be0>] lock_new_owner+0xbe/0xe2 (40)
[<c034462e>] __down_mutex+0xfb/0x30f (84)
[<c013a148>] __spin_lock+0x46/0x50 (24)
[<c013a1cc>] _spin_lock_irqsave+0x1d/0x21 (16)
[<c012a6d8>] del_timer+0x2c/0xe4 (32)
[<c012a7e1>] del_timer_sync+0x51/0x14b (104)
[<c0330469>] __rpc_execute+0x8d/0x3db (112)
[<c032bd4e>] rpc_call_sync+0x9b/0xc8 (40)
[<c01b7b4d>] nfs3_rpc_wrapper+0x3d/0x82 (40)
[<c01b7d31>] nfs3_proc_getattr+0x50/0x81 (40)
[<c01aeeec>] __nfs_revalidate_inode+0xf3/0x3b7 (192)
[<c01aac1c>] nfs_lookup_revalidate+0x37e/0x54c (352)
[<c0173220>] do_lookup+0x53/0x8d (32)
[<c01733da>] link_path_walk+0x180/0x1071 (108)
[<c01745f2>] path_lookup+0xa5/0x1b0 (32)
[<c017489f>] __user_walk+0x30/0x4d (32)
[<c016ed2b>] vfs_stat+0x1f/0x5a (92)
[<c016f3c4>] sys_stat64+0x1e/0x3d (100)
[<c01061a1>] sysenter_past_esp+0x52/0x71 (-4028)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c0344e5b>] .... _raw_spin_lock+0x1c/0x73
.....[<c0118894>] .. ( <= task_rq_lock+0x32/0x5b)
.. [<c013b764>] .... print_traces+0x1b/0x52
.....[<c010704e>] .. ( <= dump_stack+0x23/0x25)
</NETCONSOLELOG>
The "lock_new_owner" BUGs then repeated until the machine was restarted.
With HIGHMEM enabled, there were really weird things happening.
<HIGHMEMENABLEDLOG>
BUG at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/mm/highmem.c:191!
------------[ cut here ]------------
BUG: sleeping function called from invalid context init(1) at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1314
in_atomic():1 [00000001], irqs_disabled():0
[<c010704e>] dump_stack+0x23/0x25 (20)
[<c011d3db>] __might_sleep+0xbc/0xcf (36)
[<c013a3e6>] __spin_lock+0x34/0x50 (24)
[<c013a41f>] _spin_lock+0x1d/0x1f (16)
[<c014e632>] kmem_cache_alloc+0x37/0xf1 (32)
[<c02ca4f4>] alloc_skb+0x28/0xe6 (32)
[<c02db938>] find_skb+0x31/0x9b (24)
[<c02dba9a>] netpoll_send_udp+0x40/0x25a (48)
[<c02900c6>] write_msg+0x56/0xfa (52)
[<c01210d6>] __call_console_drivers+0x56/0x65 (32)
[<c0121219>] call_console_drivers+0xac/0x163 (36)
[<c01215f6>] release_console_sem+0x33/0xde (32)
[<c01214fc>] vprintk+0x134/0x16d (36)
[<c01213c6>] printk+0x1d/0x1f (16)
[<c0107299>] handle_BUG+0x63/0x9e (28)
[<c0107356>] die+0x82/0x195 (64)
[<c01079ac>] do_invalid_op+0x127/0x129 (192)
[<c0106c73>] error_code+0x2b/0x30 (76)
[<c01541dc>] copy_pte_range+0xd8/0x1e3 (48)
[<c015439b>] copy_pmd_range+0xb4/0xca (52)
[<c015441e>] copy_pgd_range+0x6d/0x8c (48)
[<c0154483>] copy_page_range+0x46/0x4e (48)
[<c011e76c>] copy_mm+0x28a/0x3ac (76)
[<c011f352>] copy_process+0x54a/0xf1b (76)
[<c011fe2c>] do_fork+0x6e/0x1ce (132)
[<c0104d08>] sys_fork+0x3d/0x3f (32)
[<c01061a1>] sysenter_past_esp+0x52/0x71 (-4028)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c0345fad>] .... _raw_spin_lock_irqsave+0x1b/0x70
.....[<c0107318>] .. ( <= die+0x44/0x195)
.. [<c013ba14>] .... print_traces+0x1b/0x52
.....[<c010704e>] .. ( <= dump_stack+0x23/0x25)
</HIGHMEMENABLEDLOG>
looks like some machine code was indigestible for the CPU, no?
Best regards,
Karsten
* [email protected] <[email protected]> wrote:
> >- everything else should be SCHED_OTHER. Do latencies get any better if
> >you do this?
> I can, but that is not necessarily an "apples to apples" comparison.
the goal now would be to simplify the test and work down the issues in
isolation, instead of looking at a complex setup of mixed workloads and
just seeing 'it sucks' without knowing which component causes what.
That's why e.g. rtc_wakeup is so useful - it's simple and dependable and
still it showed a good deal of problems and helped debug/fix them.
Ingo
On Wednesday 10 November 2004 16:01, Ingo Molnar wrote:
>
> * Karsten Wiese <[email protected]> wrote:
>
> > Hi
> >
> > On SMP/HT/P4 I get:
> > BUG: lock held at task exit time!
>
> > sh/5429: BUG in __up_mutex at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1064
> > BUG: sleeping function called from invalid context sh(5429) at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1314
> > in_atomic():1 [00000003], irqs_disabled():0
>
> hm, apparently something leaked a BKL count. Unfortunately we dont know
> precisely what did it, only that it happened. Did this happen during
> bootup, or during normal use. Can you trigger it arbitrarily?
Yes, it always happens when calling the ./cvscompile script of a project that is mounted via nfs.
Haven't tried running ./cvscompile locally, should I?
* Karsten Wiese <[email protected]> wrote:
> Yes, it always happens when calling the ./cvscompile script of a
> project that is mounted via nfs. Haven't tried running
> ./cvscompile locally, should I?
very interesting, nfs is indeed one of the frequent BKL users.
a 'BKL leak' is an unbalanced lock, e.g.:
    lock_kernel();
    ... do stuff ...
    if (error)
        return;
    ... do more stuff ...
    unlock_kernel();
the BKL is a very special type of lock, and one side-effect of that is
that in the stock kernel a 'BKL leak' can go unnoticed very easily: it
causes no problems other than hard-to-debug (but severe) scalability
regressions. The moment the BKL count leaks from the NFS code, that
process is 'scalability-poisoned' and will be handicapped until it
exits.
In the PREEMPT_RT patchset i added a strict locking checker to do_exit()
that found this apparent NFS bug. Unfortunately the deadlock detector
only reported a pretty common place where the BKL gets
dropped/reacquired frequently so we dont know where the NFS code has the
lock/unlock imbalance. Could you report this bug to the NFS maintainers
(along with the above and the previous analysis)?
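To make the imbalance described above concrete, here is a minimal userspace sketch. It is not the NFS code and not the checker from rt.c: lock_depth, lock_kernel_sim(), unlock_kernel_sim() and lock_held_at_exit() are hypothetical stand-ins, and the counter starts at 0 purely for simplicity. It only shows how an early return leaks one nesting level and how a single exit path avoids that.

```c
#include <stdbool.h>

/* Hypothetical stand-in for the per-task BKL nesting count. */
int lock_depth = 0;

void lock_kernel_sim(void)   { lock_depth++; }
void unlock_kernel_sim(void) { lock_depth--; }

/* Leaky variant: the early return skips the unlock. */
int leaky_op(bool error)
{
	lock_kernel_sim();
	/* ... do stuff ... */
	if (error)
		return -1;	/* BUG: the lock is still held here */
	/* ... do more stuff ... */
	unlock_kernel_sim();
	return 0;
}

/* Balanced variant: every path leaves through the same unlock. */
int balanced_op(bool error)
{
	int ret = 0;

	lock_kernel_sim();
	/* ... do stuff ... */
	if (error) {
		ret = -1;
		goto out;
	}
	/* ... do more stuff ... */
out:
	unlock_kernel_sim();
	return ret;
}

/* Exit-time check in the spirit of a "lock held at task exit" test. */
bool lock_held_at_exit(void)
{
	return lock_depth != 0;
}
```

Calling balanced_op() on the error path leaves the count at zero, while leaky_op() on the same path leaves it at one, which is the condition an exit-time check can flag even though it cannot say where the leak happened.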
Ingo
On Wednesday 10 November 2004 15:20, Karsten Wiese wrote:
> On Wednesday 10 November 2004 16:01, Ingo Molnar wrote:
> >
> > * Karsten Wiese <[email protected]> wrote:
> >
> > > Hi
> > >
> > > On SMP/HT/P4 I get:
> > > BUG: lock held at task exit time!
> >
> > > sh/5429: BUG in __up_mutex at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1064
> > > BUG: sleeping function called from invalid context sh(5429) at /home/ka/kernel/2.6/linux-2.6.9-rc1-mm3-RT/kernel/rt.c:1314
> > > in_atomic():1 [00000003], irqs_disabled():0
> >
> > hm, apparently something leaked a BKL count. Unfortunately we dont know
> > precisely what did it, only that it happened. Did this happen during
> > bootup, or during normal use. Can you trigger it arbitrarily?
>
> Yes, it always happens when calling the ./cvscompile script of a project that is mounted via nfs.
> Haven't tried running ./cvscompile locally, should I?
./cvscompile locally is ok.
Also, if I disable HT in the BIOS, the machine survives the crash while running ./cvscompile via nfs,
and subsequent ./cvscompile runs over nfs are ok, also after I unmount and remount the nfs share.
So it always happens the first time ./cvscompile is called via nfs.
>* [email protected] <[email protected]> wrote:
>
>> >- everything else should be SCHED_OTHER. Do latencies get any better if
>> >you do this?
>
>> I can, but that is not necessarily an "apples to apples" comparison.
>
>the goal now would be to simplify the test and work down the issues in
>isolation, instead of looking at a complex setup of mixed workloads and
>just seeing 'it sucks' without knowing which component causes what.
However, based on the results of the last several weeks, it is apparent
to me that the simple tests are finding only a subset of the problems.
The stressful series of tests is finding a number of symptoms much
sooner and more repeatably than those simple tests.
I was thinking about this problem this morning and was wondering if
we could do something like an "end trigger" to help determine the cause
of some of these pauses. Something like:
- start to fill / refresh the trace buffer (already doing this?)
- run RT CPU loop & sample TSC every 100 iterations or so
- if delta T exceeds 100 usec (or so), then set "end trigger" and
dump the data from /proc/latency_trace.
Repeat with some rate limit so we don't get too much data.
I can still run the stressful test cases to cause the situations and
get the "just in time" data for the analysis. Perhaps a variant of
the interface you provided before on tracing a specific path.
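A rough sketch of the "end trigger" logic outlined above, assuming the caller supplies timestamps in microseconds (from the TSC or any monotonic clock). struct end_trigger and end_trigger_sample() are hypothetical names for illustration only; dumping /proc/latency_trace when the function returns 1 is left to the caller.

```c
#include <stdint.h>

struct end_trigger {
	uint64_t threshold_usec;  /* fire when two samples are further apart than this */
	uint64_t min_gap_usec;    /* rate limit: minimum time between two firings */
	uint64_t last_sample;     /* timestamp of the previous sample */
	uint64_t last_fire;       /* timestamp of the previous firing */
	int      armed;           /* 0 until the first sample has been seen */
	int      fired_once;      /* 0 until the trigger has fired at least once */
};

/*
 * Feed one timestamp taken every N iterations of the RT CPU loop.
 * Returns 1 if the gap since the previous sample exceeds the threshold
 * and the rate limit allows another dump, 0 otherwise.
 */
int end_trigger_sample(struct end_trigger *t, uint64_t now)
{
	int fire = 0;

	if (t->armed && now - t->last_sample > t->threshold_usec) {
		if (!t->fired_once || now - t->last_fire >= t->min_gap_usec) {
			fire = 1;
			t->fired_once = 1;
			t->last_fire = now;
		}
	}
	t->last_sample = now;
	t->armed = 1;
	return fire;
}
```

With threshold_usec = 100 and min_gap_usec = 1000, a 150 usec gap fires once, and further long gaps within the next millisecond are counted as samples but do not trigger another dump.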
I may do a variant on this anyway. I think it's important to see if
the symptom (> 100 usec CPU delay) is really:
- lots of short delays
OR
- relatively few long delays
and I have an idea of how to code that up and add to latencytrace.
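One possible way to tell the two cases apart, sketched under the assumption that the test records the elapsed time of each fixed-size chunk of loop iterations. This is not the latencytrace code; struct delay_stats and summarize_delays() are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

struct delay_stats {
	size_t   count;       /* chunks that took longer than the threshold */
	uint64_t total_usec;  /* summed excess over the threshold */
	uint64_t max_usec;    /* largest single excess */
};

/* Summarize per-chunk elapsed times against a threshold (all in usec). */
struct delay_stats summarize_delays(const uint64_t *elapsed_usec, size_t n,
				    uint64_t threshold_usec)
{
	struct delay_stats s = { 0, 0, 0 };
	size_t i;

	for (i = 0; i < n; i++) {
		if (elapsed_usec[i] > threshold_usec) {
			uint64_t excess = elapsed_usec[i] - threshold_usec;

			s.count++;
			s.total_usec += excess;
			if (excess > s.max_usec)
				s.max_usec = excess;
		}
	}
	return s;
}
```

Two runs with the same total_usec can then be separated by count and max_usec: a high count with a small maximum points at lots of short delays, a low count with a large maximum at a few long ones.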
--Mark H Johnson
<mailto:[email protected]>
>> [...] I also noticed that /proc/sys/kernel/preempt_wakeup_timing was
>> removed at .20 but not sure if that was deliberate. [...]
>
>yeah, this was deliberate - it's a side-effect of separating it from the
>other timing options.
OK. So maybe I didn't understand what you said previously. Now, if I build
to get maximum-latency wakeup values, I can't get the IRQ off or
preempt off timing and traces? If that's not true, how do I switch
between the different sampling methods?
--Mark H Johnson
<mailto:[email protected]>
* [email protected] <[email protected]> wrote:
> OK. So maybe I didn't understand what you said previously. Now, if I
> build to get maximum-latency wakeup values, I can't get the IRQ off or
> preempt off timing and traces? If that's not true, how do I switch
> between the different sampling methods?
you have to build another kernel for them. irqs-off and preempt-off
timing can be mixed freely (and both can be enabled in the same kernel),
but wakeup timing deserves its own .config space and since it's not
mixable with the other two methods i didnt see much point in enabling
all 3 at once with strange dependencies between them. Is this a big
issue? Normally i think the wakeup timing is more than enough to get a
feel of latencies, and if something specific is suspected the other ones
can be turned on.
Ingo
>I may do a variant on this anyway. I think it's important to see if
>the symptom (> 100 usec CPU delay) is really:
>- lots of short delays
>OR
>- relatively few long delays
>and I have an idea of how to code that up and add to latencytrace.
A follow up on this message. My first test completed with the following
results. The new code indicates:
X - Min delay was 0. Max delay was 3. Ave delay was 0.015295.
top - Min delay was 0. Max delay was 23. Ave delay was 0.025659.
netO - Min delay was 0. Max delay was 31. Ave delay was 1.169024.
netI - Min delay was 1. Max delay was 35. Ave delay was 1.182864.
diskW - Min delay was 0. Max delay was 18. Ave delay was 1.166944.
diskC - Min delay was 0. Max delay was 18. Ave delay was 1.080036.
diskR - Min delay was 0. Max delay was 7. Ave delay was 0.803804.
A "delay" was counted if 1000 iterations of the CPU inner loop took
longer than 10 usec. For comparison, I moved this code to my 2.4 system
and got the following results:
X - Min delay was 0. Max delay was 17. Ave delay was 1.277730.
top - Min delay was 0. Max delay was 12. Ave delay was 1.452692.
netO - Min delay was 0. Max delay was 12. Ave delay was 1.633742.
netI - Min delay was 0. Max delay was 12. Ave delay was 1.626565.
diskW - Min delay was 0. Max delay was 14. Ave delay was 1.566188.
diskC - Min delay was 0. Max delay was 12. Ave delay was 1.701542.
diskR - Min delay was 0. Max delay was 12. Ave delay was 1.650909.
Looks pretty comparable at this level. The 2.4 results appear to be more
consistent.
Grr. The new code does have an impact on the application measurements
under 2.4. It appears the TSC accesses are being delayed while the disk
is active. The within 100 usec rate was only 77% (was 90%) but the peak
is still pretty close (2.60 vs. 2.38 msec).
I am also not sure this is the "right" measurement either. I probably
need to count the delays or divide the overall loop time by the number
of delays to see if that is a more meaningful value.
--Mark
>you have to build another kernel for them. irqs-off and preempt-off
>timing can be mixed freely (and both can be enabled in the same kernel),
>but wakeup timing deserves its own .config space and since it's not
>mixable with the other two methods i didnt see much point in enabling
>all 3 at once with strange dependencies between them. Is this a big
>issue? Normally i think the wakeup timing is more than enough to get a
>feel of latencies, and if something specific is suspected the other ones
>can be turned on.
Just that it takes an hour or so to rebuild the kernel plus the
disk storage to keep two kernels instead of one.
The wakeup latencies I am seeing are all quite small, but the overhead
I am seeing at the application level has been quite high. Only 40 wakeup
latencies > 50 usec in a half hour of testing. I guess I'll build a
.24 without wakeup timing to see what kind of problems I'm having.
I can send you the wakeup timing traces if you are interested but
they are all really short (< 100 usec) or appear to indicate a hardware
contention problem (one step at 100 usec or so).
--Mark
Ingo Molnar wrote:
> i have released the -V0.7.23 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
I have found some test results that I find interesting with -V0.7.24. I
modified the rtc-debug patch to work with any program that can use the
rtc driver. It measures the latency between handling an interrupt and
the actual read. The amlat program normally used with this patch sets up
a signal handler that reads /dev/rtc when there is data available and
then sleeps until it receives the signal. Realfeel just does a blocking
read on /dev/rtc. Oh, and both of them set up the rtc for periodic
interrupts. I would probably expect the blocking read to be a bit faster
but not dramatically. Here are the results of a couple of very short
runs to illustrate the difference:
amlat results (sleep/sighandler):
Nov 10 21:10:39 daffy kernel: rtc histogram:
Nov 10 21:10:39 daffy kernel: 26 1
Nov 10 21:10:39 daffy kernel: 28 2152
Nov 10 21:10:39 daffy kernel: 29 4286
Nov 10 21:10:39 daffy kernel: 30 6857
Nov 10 21:10:39 daffy kernel: 31 408
Nov 10 21:10:39 daffy kernel: 32 9
Nov 10 21:10:39 daffy kernel: 33 32
Nov 10 21:10:39 daffy kernel: 34 217
Nov 10 21:10:39 daffy kernel: 35 145
Nov 10 21:10:39 daffy kernel: 36 26
Nov 10 21:10:39 daffy kernel: 40 2
Nov 10 21:10:39 daffy kernel: 41 9
Nov 10 21:10:39 daffy kernel: 42 100
Nov 10 21:10:39 daffy kernel: 43 180
Nov 10 21:10:39 daffy kernel: 44 113
Nov 10 21:10:39 daffy kernel: 45 30
Nov 10 21:10:39 daffy kernel: 46 2
Nov 10 21:10:39 daffy kernel: 47 4
Nov 10 21:10:39 daffy kernel: 48 15
Nov 10 21:10:39 daffy kernel: 49 4
Nov 10 21:10:39 daffy kernel: 50 5
Nov 10 21:10:39 daffy kernel:
Nov 10 21:10:39 daffy kernel: Total samples: 14597
realfeel results (blocking read):
Nov 10 21:11:32 daffy kernel: rtc histogram:
Nov 10 21:11:32 daffy kernel: 3 17844
Nov 10 21:11:32 daffy kernel: 4 859
Nov 10 21:11:32 daffy kernel: 5 34
Nov 10 21:11:32 daffy kernel: 6 1
Nov 10 21:11:32 daffy kernel: 8 1
Nov 10 21:11:32 daffy kernel: 12 19
Nov 10 21:11:32 daffy kernel: 13 41
Nov 10 21:11:32 daffy kernel: 14 7
Nov 10 21:11:32 daffy kernel:
Nov 10 21:11:32 daffy kernel: Total samples: 18806
Both of the above runs were with 'IRQ 8' running at SCHED_FIFO 99. Amlat
and realfeel were both running at SCHED_FIFO 98. The updated rtc-debug
patch will follow in case anyone is interested.
kr
Ingo Molnar wrote:
> i have released the -V0.7.23 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
Here is the updated rtc-debug patch. This version unlike previous
versions doesn't change the way the rtc driver works. The output of
/dev/rtc is preserved also so it doesn't break the existing
functionality of rtc. By the same token it won't produce output usable
by amlat, but it works for measuring the latency from interrupt to read.
It doesn't measure when a poll is satisfied yet because I didn't need
that yet. It doesn't trigger the end until the read.
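For readers without the patch at hand, the bookkeeping behind an interrupt-to-read histogram could look roughly like the sketch below. It is not the rtc-debug code: struct latency_hist, hist_add() and the 100-bucket size are assumptions made for illustration, and timestamps are taken to be in microseconds already.

```c
#include <stddef.h>
#include <stdint.h>

#define HIST_SIZE 101  /* buckets for 0..99 usec plus one overflow bucket */

struct latency_hist {
	unsigned long bucket[HIST_SIZE];  /* bucket[i] = samples with i usec latency */
	unsigned long total;              /* total number of samples */
};

/* Record one sample: time of the interrupt and time of the matching read. */
void hist_add(struct latency_hist *h, uint64_t irq_usec, uint64_t read_usec)
{
	uint64_t delta = read_usec - irq_usec;
	size_t idx = delta < HIST_SIZE - 1 ? (size_t)delta : HIST_SIZE - 1;

	h->bucket[idx]++;
	h->total++;
}
```

Printing only the non-empty buckets as "latency count" pairs, followed by the total, gives output of the same shape as the histograms posted earlier in the thread.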
kr
* K.R. Foley <[email protected]> wrote:
> Ingo Molnar wrote:
> >i have released the -V0.7.23 Real-Time Preemption patch, which can be
> >downloaded from the usual place:
> >
>
> Here is the updated rtc-debug patch. This version unlike previous
> versions doesn't change the way the rtc driver works. The output of
> /dev/rtc is preserved also so it doesn't break the existing
> functionality of rtc. By the same token it won't produce output usable
> by amlat, but it works for measuring the latency from interrupt to
> read.
looks good - i've added this to my tree. (with minor portability fixes:
rdtscll -> get_cycles(), long long -> cycles_t)
Ingo
* K.R. Foley <[email protected]> wrote:
> Here is the updated rtc-debug patch. This version unlike previous
> versions doesn't change the way the rtc driver works. The output of
> /dev/rtc is preserved also so it doesn't break the existing
> functionality of rtc. By the same token it won't produce output usable
> by amlat, but it works for measuring the latency from interrupt to
> read. It doesn't measure when a poll is satisfied yet because I
> didn't need that yet. It doesn't trigger the end until the read.
i've done some further cleanups: made it .config configurable
(CONFIG_RTC_HISTOGRAM), moved the latency-histogram construction code
into separate functions to make it more apparent that there is no impact
to the normal codepaths. Patch attached.
Ingo
* [email protected] <[email protected]> wrote:
> >you have to build another kernel for them. irqs-off and preempt-off
> >timing can be mixed freely (and both can be enabled in the same kernel),
> >but wakeup timing deserves its own .config space and since it's not
> >mixable with the other two methods i didnt see much point in enabling
> >all 3 at once with strange dependencies between them. Is this a big
> >issue? Normally i think the wakeup timing is more than enough to get a
> >feel of latencies, and if something specific is suspected the other ones
> >can be turned on.
>
> Just that it takes an hour or so to rebuild the kernel plus the disk
> storage to keep two kernels instead of one.
ok, i added runtime configurability of the 3 options back, so that you
can build all 3 in and switch between them via the preempt_wakeup_timing
flag. Will be in the next release.
Ingo
* [email protected] <[email protected]> wrote:
> I was thinking about this problem this morning and was wondering if
> we could do something like an "end trigger" to help determine the cause
> of some of these pauses. Something like:
> - start to fill / refresh the trace buffer (already doing this?)
> - run RT CPU loop & sample TSC every 100 iterations or so
> - if delta T exceeds 100 usec (or so), then set "end trigger" and
> dump the data from /proc/latency_trace.
> Repeat with some rate limit so we don't get too much data.
we had most of this in the tracer already, just obscured a bit.
In the current tree i've separated all the functionality into the
following flags: trace_enabled, trace_user_triggered, trace_freerunning.
When user_triggered is enabled all the other timing-related tracing
activities stop (wakeup & critical/irq timing), and userspace is the
master of when tracing starts and stops. The way to turn the tracer
on/off is still the gettimeofday API hack:
gettimeofday(0,1);
gettimeofday(0,0);
i enhanced this user-defined tracing with a timing mechanism: the kernel
measures the latency between on and off and does the usual max-latency
or threshold tracking and saves the latency trace into
/proc/latency_trace only if the latency condition triggers. So userspace
only has to add the start/stop hooks and the kernel will deal with the
rest.
the freerunning flag can be used to generate traces where there's no
natural 'start' event, only some well-defined "we have a latency
problem" point. The application can start tracing at startup (or you can
start it via a different app), and it's enough to stop tracing when
there's a problem - the last ~4000 trace entries will be copied into
/proc/latency_trace. (the timing mechanism is still present in this mode
as well.)
(Freerunning mode can also be used if for some reason the delays are too
long to be fully traced and if the 'interesting' stuff is at the end of
the delay.)
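The "last ~4000 entries" behaviour of freerunning mode is essentially a ring buffer that keeps overwriting its oldest slot. A small sketch of that idea, with a deliberately tiny buffer; struct trace_ring, ring_record() and ring_snapshot() are hypothetical names and not the tracer's real data structures:

```c
#include <stddef.h>
#include <stdint.h>

#define TRACE_ENTRIES 4  /* tiny for illustration; the text mentions ~4000 */

struct trace_ring {
	uint32_t entry[TRACE_ENTRIES];
	size_t   next;   /* slot that will be overwritten next */
	size_t   count;  /* total number of entries ever recorded */
};

/* Record one trace entry, overwriting the oldest one when full. */
void ring_record(struct trace_ring *r, uint32_t e)
{
	r->entry[r->next] = e;
	r->next = (r->next + 1) % TRACE_ENTRIES;
	r->count++;
}

/* Copy the surviving entries, oldest first, into out; returns how many. */
size_t ring_snapshot(const struct trace_ring *r, uint32_t *out)
{
	size_t n = r->count < TRACE_ENTRIES ? r->count : TRACE_ENTRIES;
	size_t start = r->count < TRACE_ENTRIES ? 0 : r->next;
	size_t i;

	for (i = 0; i < n; i++)
		out[i] = r->entry[(start + i) % TRACE_ENTRIES];
	return n;
}
```

Stopping the trace then corresponds to taking one snapshot: whatever happened long before the stop point has already been overwritten, and only the tail of the delay remains.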
This stuff will show up in the next release, later today. It should
cover the needs that your tests have at the moment, correct?
Ingo
* Ingo Molnar <[email protected]> wrote:
> i've done some further cleanups: made it .config configurable
> (CONFIG_RTC_HISTOGRAM), moved the latency-histogram construction code
> into separate functions to make it more apparent that there is no
> impact to the normal codepaths. Patch attached.
i've attached another update with a few more smaller details fixed:
- only print the histogram if a /dev/rtc using application indeed used
it to get interrupts. This removes bogus printouts triggered by
hwclock.
- skip the first RTC interrupt from the histogram - most of the
/dev/rtc applications do not handle the first event very well,
skewing the histogram.
Ingo
Ingo Molnar wrote:
> * Ingo Molnar <[email protected]> wrote:
>
>
>>i've done some further cleanups: made it .config configurable
>>(CONFIG_RTC_HISTOGRAM), moved the latency-histogram construction code
>>into separate functions to make it more apparent that there is no
>>impact to the normal codepaths. Patch attached.
>
>
> i've attached another update with a few more smaller details fixed:
>
> - only print the histogram if a /dev/rtc using application indeed used
> it to get interrupts. This removes bogus printouts triggered by
> hwclock.
>
> - skip the first RTC interrupt from the histogram - most of the
> /dev/rtc applications do not handle the first event very well,
> skewing the histogram.
>
> Ingo
Very nicely done. Also much less of a hack now the way you took all of
the code out of the normal codepaths and made it into inlines that
pretty much compile out if not being used.
kr
i have released the -V0.7.25-0 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this release includes fixes, new features and latency improvements.
the biggest change is the threading of the lone remaining non-threaded
external interrupt: the timer interrupt (IRQ#0).
It was not threaded until now because unlike device interrupts the timer
interrupt is quite deeply attached to Linux's architecture, driving
process timing, profiling and the scheduler. I separated these into
stuff that can be done from a thread context and stuff that needs to
execute directly.
Fortunately most of the expensive and latency-generating stuff could be
pushed into the irq thread. As a side-effect this enabled the turning of
rtc_lock and xtime_lock into a mutex. Also, it removed some ugly hacks
from rtc.c and should improve worst-case latencies.
the other bigger change is the reworking of the .config space. Instead
of the deep hierarchy of hard-to-understand technical PREEMPT options,
there's now a flattened out choice of 4 preemption models:
( ) No Forced Preemption (Server)
( ) Voluntary Kernel Preemption (Desktop)
( ) Preemptible Kernel (Low-Latency Desktop)
(X) Complete Preemption (Real-Time)
and updated help texts. Plus for the preemption models where it can be
freely turned off, softirq and hardirq threading is an additional
option.
the third bigger change is the reworking of the tracer to make it easier
to drive it from user-space. There are 3 runtime flags now:
/proc/sys/kernel/trace_enabled (default: 1)
/proc/sys/kernel/trace_user_triggered (default: 0)
/proc/sys/kernel/trace_freerunning (default: 0)
semantics of user-triggered tracing: if enabled then any active tracing
of wakeup and/or critical sections stops and userspace drives start/stop
events via gettimeofday(0,1)/gettimeofday(0,0). The latter saves the
current trace into /proc/latency_trace, the former clears the trace
buffer and starts tracing anew. Doing another gettimeofday(0,1) on an
already running tracer clears the trace and restarts it without saving
the current trace into /proc/latency_trace. Doing a gettimeofday(0,0) on
an already stopped tracer has no effect (i.e. /proc/latency_trace wont
be saved a second time). The tracer does timing for userspace
automatically the same way it does it for the built-in timing
mechanisms, and it can be configured via the preempt_max_latency and
preempt_thresh tunables.
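The start/stop semantics above can be modelled as a small state machine. The sketch below is a userspace model, not the tracer implementation: struct user_tracer and its functions are hypothetical, and the save rule (a nonzero threshold saves anything above it, otherwise only a new maximum is saved) is an assumption about how the two tunables interact.

```c
#include <stdbool.h>
#include <stdint.h>

struct user_tracer {
	bool     running;
	uint64_t start_usec;
	uint64_t max_latency_usec;  /* models the max-latency tunable */
	uint64_t thresh_usec;       /* models the threshold tunable; 0 = track the maximum */
	unsigned saved;             /* how often a trace would have been saved */
};

/* Models gettimeofday(0,1): always clears and (re)starts, never saves. */
void tracer_start(struct user_tracer *t, uint64_t now)
{
	t->running = true;
	t->start_usec = now;
}

/* Models gettimeofday(0,0): returns true if the trace would be saved. */
bool tracer_stop(struct user_tracer *t, uint64_t now)
{
	uint64_t latency;
	bool save;

	if (!t->running)
		return false;  /* stopping a stopped tracer has no effect */
	t->running = false;

	latency = now - t->start_usec;
	if (t->thresh_usec) {
		save = latency > t->thresh_usec;
	} else {
		save = latency > t->max_latency_usec;
		if (save)
			t->max_latency_usec = latency;
	}
	if (save)
		t->saved++;
	return save;
}
```

In this model a second start before the stop simply moves the start time forward, and a second stop returns false without touching the saved count.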
also, wakeup-timing, irq-off and preempt-off critical section timing can
now be built into the same kernel again; the
/proc/sys/kernel/preempt_wakeup_timing flag switches between the modes.
(default: 1)
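Switching modes from a test harness just means writing an integer to the flag file. A minimal helper, assuming the flag accepts "0" and "1" like other boolean sysctls; write_sysctl_flag() is a hypothetical name, and which value selects which timing mode beyond the stated default of 1 is not spelled out here:

```c
#include <stdio.h>

/*
 * Write an integer flag to a sysctl-style file such as
 * /proc/sys/kernel/preempt_wakeup_timing.
 * Returns 0 on success, -1 if the file cannot be opened or written.
 */
int write_sysctl_flag(const char *path, int value)
{
	FILE *f = fopen(path, "w");

	if (!f)
		return -1;
	if (fprintf(f, "%d\n", value) < 0) {
		fclose(f);
		return -1;
	}
	return fclose(f) == 0 ? 0 : -1;
}
```

A harness would call write_sysctl_flag("/proc/sys/kernel/preempt_wakeup_timing", 0) before a run and restore the default afterwards; this needs root, and the file only exists on a kernel built with the timing options.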
other changes since -V0.7.24:
- debug feature: added the RTC-debug patch sanitized by K.R. Foley,
plus further cleanups.
- added upstream fix for kobject related crash, pointed out by Shane
Shrybman.
- cleanup: Kconfig help text fixes from Amit Shah
- latency improvement: on UP-IOAPIC, when redirecting an interrupt, do
not ack the APIC. This is the method used for direct interrupts and
on UP it might as well work out fine. It is certainly faster than
masking/unmasking, making UP-IOAPIC the fastest PIC mode again.
- livelock fix: the timer-irq threading unearthed a seqlock related
livelock scenario, which triggered in do_gettimeoffset() big-time.
The solution is to serialize seqlock readers with writers _iff_ the
seqlock status is 'invalid'. This is a rare event, but when it
happens it saves the day.
- debugging helper: the /proc/sys/kernel/debug_direct_keyboard flag
(default: 0) will hack the keyboard IRQ into being direct. NOTE: the
keyboard in this mode should only be used to access SysRq
functionality that is not possible via the threaded keyboard handler.
The direct keyboard IRQ can crash the system.
- new kernel profiling features: added profile=preempt to profile
whether interrupts hit the kernel in preemptible mode or in a
critical section. Added /proc/sys/kernel/prof_pid to profile a
specific PID only. (default: -1, meaning all tasks profiled) Added
/proc/irq/prof_cpu_mask back.
- robustness improvement: do not report atomicity-warnings during
kernel oopses - it's more important to get the oops out to the
console.
to create a -V0.7.25-0 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.25-0
Ingo
* Gunther Persoons <[email protected]> wrote:
> Got 2 times a hard lock up with this one. Happened while i was typing
> something and downloading both after 15-20 minutes.
.config?
Ingo
Ingo Molnar wrote:
>i have released the -V0.7.25-0 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
Got a hard lockup twice with this one. Both times it happened after
15-20 minutes, while I was typing something and downloading.
* Ingo Molnar <[email protected]> wrote:
> * Gunther Persoons <[email protected]> wrote:
>
> > Got 2 times a hard lock up with this one. Happened while i was typing
> > something and downloading both after 15-20 minutes.
>
> .config?
just in case you are using UP-IOAPIC, could you enable CONFIG_SMP (even
if you are running an UP box) and see whether the lockup goes away?
Which was the last -RT kernel that you tried that didn't lock up in this
fashion?
Ingo
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.10-rc1-mm3-RT-V0.7.25-0
# Thu Nov 11 16:07:49 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
# CONFIG_POSIX_MQUEUE is not set
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
CONFIG_HPET_EMULATE_RTC=y
# CONFIG_SMP is not set
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT_DESKTOP is not set
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_UP_APIC=y
# CONFIG_X86_UP_IOAPIC is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y
# CONFIG_REGPARM is not set
#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
CONFIG_KERN_PHYS_OFFSET=1
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
# CONFIG_PM_DEBUG is not set
# CONFIG_SOFTWARE_SUSPEND is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_VIDEO=m
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
CONFIG_ACPI_THINKPAD=m
CONFIG_ACPI_IBM=m
# CONFIG_ACPI_TOSHIBA is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
# CONFIG_X86_PM_TIMER is not set
# CONFIG_ACPI_CONTAINER is not set
#
# APM (Advanced Power Management) BIOS Support
#
CONFIG_APM=y
# CONFIG_APM_IGNORE_USER_SUSPEND is not set
# CONFIG_APM_DO_ENABLE is not set
# CONFIG_APM_CPU_IDLE is not set
# CONFIG_APM_DISPLAY_BLANK is not set
# CONFIG_APM_RTC_IS_GMT is not set
# CONFIG_APM_ALLOW_INTS is not set
# CONFIG_APM_REAL_MODE_POWER_OFF is not set
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
# CONFIG_CPU_FREQ_DEBUG is not set
# CONFIG_CPU_FREQ_PROC_INTF is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE=y
# CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE is not set
CONFIG_CPU_FREQ_GOV_PERFORMANCE=y
CONFIG_CPU_FREQ_GOV_POWERSAVE=y
CONFIG_CPU_FREQ_GOV_USERSPACE=y
# CONFIG_CPU_FREQ_24_API is not set
# CONFIG_CPU_FREQ_GOV_ONDEMAND is not set
CONFIG_CPU_FREQ_TABLE=y
#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=y
CONFIG_X86_POWERNOW_K6=y
CONFIG_X86_POWERNOW_K7=y
CONFIG_X86_POWERNOW_K7_ACPI=y
CONFIG_X86_POWERNOW_K8=y
CONFIG_X86_POWERNOW_K8_ACPI=y
CONFIG_X86_GX_SUSPMOD=y
CONFIG_X86_SPEEDSTEP_CENTRINO=y
CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI=y
CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y
CONFIG_X86_SPEEDSTEP_ICH=y
CONFIG_X86_SPEEDSTEP_SMI=y
CONFIG_X86_P4_CLOCKMOD=y
# CONFIG_X86_CPUFREQ_NFORCE2 is not set
CONFIG_X86_LONGRUN=y
CONFIG_X86_LONGHAUL=y
#
# shared options
#
# CONFIG_X86_ACPI_CPUFREQ_PROC_INTF is not set
CONFIG_X86_SPEEDSTEP_LIB=y
# CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# PCCARD (PCMCIA/CardBus) support
#
CONFIG_PCCARD=y
# CONFIG_PCMCIA_DEBUG is not set
# CONFIG_PCMCIA_OBSOLETE is not set
CONFIG_PCMCIA=y
CONFIG_CARDBUS=y
#
# PC-card bridges
#
CONFIG_YENTA=m
# CONFIG_PD6729 is not set
# CONFIG_I82092 is not set
# CONFIG_I82365 is not set
# CONFIG_TCIC is not set
CONFIG_PCMCIA_PROBE=y
#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=m
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set
#
# Protocols
#
# CONFIG_ISAPNP is not set
# CONFIG_PNPBIOS is not set
# CONFIG_PNPACPI is not set
#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=8192
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
CONFIG_LBD=y
# CONFIG_CDROM_PKTCDVD is not set
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
CONFIG_BLK_DEV_IDESCSI=m
CONFIG_IDE_TASK_IOCTL=y
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_CMD640=y
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_AEC62XX=y
CONFIG_BLK_DEV_ALI15X3=y
# CONFIG_WDC_ALI15X3 is not set
CONFIG_BLK_DEV_AMD74XX=y
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
CONFIG_BLK_DEV_HPT34X=y
# CONFIG_HPT34X_AUTODMA is not set
CONFIG_BLK_DEV_HPT366=y
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NS87415 is not set
CONFIG_BLK_DEV_PDC202XX_OLD=y
# CONFIG_PDC202XX_BURST is not set
CONFIG_BLK_DEV_PDC202XX_NEW=y
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
CONFIG_BLK_DEV_SIS5513=y
CONFIG_BLK_DEV_SLC90E66=y
CONFIG_BLK_DEV_TRM290=y
CONFIG_BLK_DEV_VIA82CXXX=y
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=m
# CONFIG_SCSI_PROC_FS is not set
#
# SCSI support type (disk, tape, CD-ROM)
#
# CONFIG_BLK_DEV_SD is not set
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
# CONFIG_CHR_DEV_SG is not set
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
CONFIG_SCSI_SPI_ATTRS=m
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
# CONFIG_SCSI_SATA is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLOGIC_1280_1040 is not set
CONFIG_SCSI_QLA2XXX=m
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# PCMCIA SCSI adapter support
#
# CONFIG_PCMCIA_AHA152X is not set
# CONFIG_PCMCIA_FDOMAIN is not set
# CONFIG_PCMCIA_NINJA_SCSI is not set
# CONFIG_PCMCIA_QLOGIC is not set
# CONFIG_PCMCIA_SYM53C500 is not set
#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
CONFIG_FUSION=m
CONFIG_FUSION_MAX_SGE=40
CONFIG_FUSION_CTL=m
#
# IEEE 1394 (FireWire) support
#
CONFIG_IEEE1394=m
#
# Subsystem Options
#
# CONFIG_IEEE1394_VERBOSEDEBUG is not set
# CONFIG_IEEE1394_OUI_DB is not set
# CONFIG_IEEE1394_EXTRA_CONFIG_ROMS is not set
#
# Device Drivers
#
#
# Texas Instruments PCILynx requires I2C
#
CONFIG_IEEE1394_OHCI1394=m
#
# Protocol Drivers
#
# CONFIG_IEEE1394_VIDEO1394 is not set
CONFIG_IEEE1394_SBP2=m
# CONFIG_IEEE1394_SBP2_PHYS_DMA is not set
# CONFIG_IEEE1394_ETH1394 is not set
CONFIG_IEEE1394_DV1394=m
CONFIG_IEEE1394_RAWIO=m
CONFIG_IEEE1394_CMP=m
CONFIG_IEEE1394_AMDTP=m
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
CONFIG_NET_KEY=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
# CONFIG_INET_IPCOMP is not set
# CONFIG_INET_TUNNEL is not set
CONFIG_IP_TCPDIAG=y
# CONFIG_IP_TCPDIAG_IPV6 is not set
# CONFIG_IPV6 is not set
# CONFIG_NETFILTER is not set
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
# CONFIG_NET_CLS_ROUTE is not set
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
CONFIG_LANCE=m
CONFIG_NET_VENDOR_SMC=y
# CONFIG_WD80x3 is not set
# CONFIG_ULTRA is not set
# CONFIG_SMC9194 is not set
# CONFIG_NET_VENDOR_RACAL is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
CONFIG_HP100=m
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
CONFIG_PCNET32=m
CONFIG_AMD8111_ETH=m
# CONFIG_AMD8111E_NAPI is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
CONFIG_APRICOT=m
CONFIG_B44=m
CONFIG_FORCEDETH=m
# CONFIG_CS89x0 is not set
CONFIG_DGRS=m
# CONFIG_EEPRO100 is not set
CONFIG_E100=m
# CONFIG_E100_NAPI is not set
# CONFIG_FEALNX is not set
CONFIG_NATSEMI=m
CONFIG_NE2K_PCI=m
CONFIG_8139CP=m
CONFIG_8139TOO=m
# CONFIG_8139TOO_PIO is not set
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_SIS900=m
CONFIG_EPIC100=m
CONFIG_SUNDANCE=m
# CONFIG_SUNDANCE_MMIO is not set
CONFIG_TLAN=m
CONFIG_VIA_RHINE=m
# CONFIG_VIA_RHINE_MMIO is not set
# CONFIG_NET_POCKET is not set
#
# Ethernet (1000 Mbit)
#
CONFIG_ACENIC=m
# CONFIG_ACENIC_OMIT_TIGON_I is not set
# CONFIG_DL2K is not set
CONFIG_E1000=m
# CONFIG_E1000_NAPI is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
CONFIG_R8169=m
# CONFIG_R8169_NAPI is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
CONFIG_TIGON3=m
#
# Ethernet (10000 Mbit)
#
CONFIG_IXGB=m
# CONFIG_IXGB_NAPI is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
CONFIG_NET_RADIO=y
#
# Obsolete Wireless cards support (pre-802.11)
#
# CONFIG_STRIP is not set
# CONFIG_ARLAN is not set
# CONFIG_WAVELAN is not set
# CONFIG_PCMCIA_WAVELAN is not set
# CONFIG_PCMCIA_NETWAVE is not set
#
# Wireless 802.11 Frequency Hopping cards support
#
# CONFIG_PCMCIA_RAYCS is not set
#
# Wireless 802.11b ISA/PCI cards support
#
# CONFIG_AIRO is not set
# CONFIG_HERMES is not set
# CONFIG_ATMEL is not set
#
# Wireless 802.11b Pcmcia/Cardbus cards support
#
CONFIG_AIRO_CS=m
# CONFIG_PCMCIA_WL3501 is not set
#
# Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support
#
# CONFIG_PRISM54 is not set
CONFIG_NET_WIRELESS=y
#
# PCMCIA network device support
#
# CONFIG_NET_PCMCIA is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
# CONFIG_PPP_MULTILINK is not set
# CONFIG_PPP_FILTER is not set
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
CONFIG_GAMEPORT=m
CONFIG_SOUND_GAMEPORT=m
# CONFIG_GAMEPORT_NS558 is not set
# CONFIG_GAMEPORT_L4 is not set
# CONFIG_GAMEPORT_EMU10K1 is not set
# CONFIG_GAMEPORT_VORTEX is not set
# CONFIG_GAMEPORT_FM801 is not set
# CONFIG_GAMEPORT_CS461x is not set
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
# CONFIG_SERIAL_8250_CS is not set
# CONFIG_SERIAL_8250_ACPI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
CONFIG_HW_RANDOM=y
CONFIG_NVRAM=y
CONFIG_RTC=y
CONFIG_RTC_HISTOGRAM=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
CONFIG_SONYPI=m
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=m
CONFIG_AGP_ALI=m
CONFIG_AGP_ATI=m
CONFIG_AGP_AMD=m
CONFIG_AGP_AMD64=m
CONFIG_AGP_INTEL=m
# CONFIG_AGP_INTEL_MCH is not set
CONFIG_AGP_NVIDIA=m
CONFIG_AGP_SIS=m
CONFIG_AGP_SWORKS=m
CONFIG_AGP_VIA=m
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=y
CONFIG_DRM_TDFX=m
CONFIG_DRM_R128=m
CONFIG_DRM_RADEON=m
CONFIG_DRM_I810=m
# CONFIG_DRM_I830 is not set
# CONFIG_DRM_I915 is not set
CONFIG_DRM_MGA=m
CONFIG_DRM_SIS=m
#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
# CONFIG_HPET is not set
# CONFIG_HANGCHECK_TIMER is not set
#
# I2C support
#
# CONFIG_I2C is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
CONFIG_FB=y
# CONFIG_FB_MODE_HELPERS is not set
# CONFIG_FB_TILEBLITTING is not set
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
# CONFIG_FB_HGA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_I810 is not set
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON_OLD is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_SAVAGE is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_VIRTUAL is not set
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=y
# CONFIG_FONTS is not set
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
#
# Logo configuration
#
CONFIG_LOGO=y
CONFIG_LOGO_LINUX_MONO=y
CONFIG_LOGO_LINUX_VGA16=y
CONFIG_LOGO_LINUX_CLUT224=y
#
# Sound
#
CONFIG_SOUND=y
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=m
CONFIG_SND_TIMER=m
CONFIG_SND_PCM=m
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=m
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=m
CONFIG_SND_PCM_OSS=m
CONFIG_SND_SEQUENCER_OSS=y
# CONFIG_SND_RTCTIMER is not set
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
#
# Generic devices
#
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# ISA devices
#
# CONFIG_SND_AD1848 is not set
# CONFIG_SND_CS4231 is not set
# CONFIG_SND_CS4232 is not set
# CONFIG_SND_CS4236 is not set
# CONFIG_SND_ES1688 is not set
# CONFIG_SND_ES18XX is not set
# CONFIG_SND_GUSCLASSIC is not set
# CONFIG_SND_GUSEXTREME is not set
# CONFIG_SND_GUSMAX is not set
# CONFIG_SND_INTERWAVE is not set
# CONFIG_SND_INTERWAVE_STB is not set
# CONFIG_SND_OPTI92X_AD1848 is not set
# CONFIG_SND_OPTI92X_CS4231 is not set
# CONFIG_SND_OPTI93X is not set
# CONFIG_SND_SB8 is not set
# CONFIG_SND_SB16 is not set
# CONFIG_SND_SBAWE is not set
# CONFIG_SND_WAVEFRONT is not set
# CONFIG_SND_CMI8330 is not set
# CONFIG_SND_OPL3SA2 is not set
# CONFIG_SND_SGALAXY is not set
# CONFIG_SND_SSCAPE is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=m
CONFIG_SND_ALI5451=m
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
CONFIG_SND_AZT3328=m
# CONFIG_SND_BT87X is not set
CONFIG_SND_CS46XX=m
# CONFIG_SND_CS46XX_NEW_DSP is not set
CONFIG_SND_CS4281=m
CONFIG_SND_EMU10K1=m
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
CONFIG_SND_NM256=m
# CONFIG_SND_RME32 is not set
CONFIG_SND_RME96=m
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
CONFIG_SND_TRIDENT=m
CONFIG_SND_YMFPCI=m
CONFIG_SND_ALS4000=m
CONFIG_SND_CMIPCI=m
CONFIG_SND_ENS1370=m
CONFIG_SND_ENS1371=m
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
CONFIG_SND_ICE1712=m
CONFIG_SND_ICE1724=m
CONFIG_SND_INTEL8X0=m
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
CONFIG_SND_VIA82XX=m
# CONFIG_SND_VX222 is not set
#
# USB devices
#
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_USX2Y is not set
#
# PCMCIA devices
#
# CONFIG_SND_VXPOCKET is not set
# CONFIG_SND_VXP440 is not set
# CONFIG_SND_PDAUDIOCF is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set
#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_BANDWIDTH is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_SUSPEND is not set
# CONFIG_USB_OTG is not set
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
# CONFIG_USB_EHCI_SPLIT_ISO is not set
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
CONFIG_USB_OHCI_HCD=m
CONFIG_USB_UHCI_HCD=m
#
# USB Device Class drivers
#
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_MIDI is not set
# CONFIG_USB_ACM is not set
CONFIG_USB_PRINTER=m
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_RW_DETECT is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_DPCM is not set
# CONFIG_USB_STORAGE_HP8200e is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
#
# USB Input Devices
#
CONFIG_USB_HID=m
CONFIG_USB_HIDINPUT=y
# CONFIG_HID_FF is not set
CONFIG_USB_HIDDEV=y
#
# USB HID Boot Protocol drivers
#
# CONFIG_USB_KBD is not set
# CONFIG_USB_MOUSE is not set
# CONFIG_USB_AIPTEK is not set
# CONFIG_USB_WACOM is not set
# CONFIG_USB_KBTAB is not set
# CONFIG_USB_POWERMATE is not set
# CONFIG_USB_MTOUCH is not set
# CONFIG_USB_EGALAX is not set
# CONFIG_USB_XPAD is not set
# CONFIG_USB_ATI_REMOTE is not set
#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USB_HPUSBSCSI is not set
#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set
#
# Video4Linux support is needed for USB Multimedia device support
#
#
# USB Network adaptors
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
#
# USB port drivers
#
#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set
#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_TIGL is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGETKIT is not set
# CONFIG_USB_PHIDGETSERVO is not set
# CONFIG_USB_TEST is not set
#
# USB ATM/DSL drivers
#
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISER4_FS=m
CONFIG_REISER4_LARGE_KEY=y
# CONFIG_REISER4_CHECK is not set
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
CONFIG_REISERFS_PROC_INFO=y
# CONFIG_REISERFS_FS_XATTR is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
CONFIG_ROMFS_FS=y
# CONFIG_QUOTA is not set
CONFIG_DNOTIFY=y
# CONFIG_AUTOFS_FS is not set
# CONFIG_AUTOFS4_FS is not set
#
# Caches
#
# CONFIG_FSCACHE is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=y
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set
#
# Network File Systems
#
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
# CONFIG_NFSD_TCP is not set
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
# CONFIG_RPCSEC_GSS_KRB5 is not set
# CONFIG_RPCSEC_GSS_SPKM3 is not set
CONFIG_SMB_FS=m
# CONFIG_SMB_NLS_DEFAULT is not set
CONFIG_CIFS=m
# CONFIG_CIFS_STATS is not set
# CONFIG_CIFS_XATTR is not set
# CONFIG_CIFS_POSIX is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set
#
# Profiling support
#
# CONFIG_PROFILING is not set
#
# Kernel hacking
#
# CONFIG_DEBUG_KERNEL is not set
CONFIG_DEBUG_PREEMPT=y
CONFIG_WAKEUP_TIMING=y
CONFIG_PREEMPT_TRACE=y
# CONFIG_CRITICAL_PREEMPT_TIMING is not set
# CONFIG_CRITICAL_IRQSOFF_TIMING is not set
CONFIG_LATENCY_TIMING=y
# CONFIG_LATENCY_TRACE is not set
CONFIG_RT_DEADLOCK_DETECT=y
# CONFIG_USE_FRAME_POINTER is not set
CONFIG_EARLY_PRINTK=y
# CONFIG_4KSTACKS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
#
# Security options
#
# CONFIG_KEYS is not set
# CONFIG_SECURITY is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
# CONFIG_CRYPTO_MD4 is not set
CONFIG_CRYPTO_MD5=m
CONFIG_CRYPTO_SHA1=m
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_BLOWFISH=m
CONFIG_CRYPTO_TWOFISH=m
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
CONFIG_CRYPTO_ARC4=m
# CONFIG_CRYPTO_KHAZAD is not set
CONFIG_CRYPTO_DEFLATE=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=y
# CONFIG_LIBCRC32C is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_PC=y
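[A minimal sketch for reading the options Ingo asked about out of a .config like the one above. The helper name config_state is hypothetical; it relies only on the two line formats visible in the file, "CONFIG_FOO=value" and "# CONFIG_FOO is not set".]

```shell
# Hypothetical helper: print the state of one option in a kernel .config.
# Usage: config_state FILE OPTION
config_state() {
    cfg=$1
    opt=$2
    if line=$(grep -E "^${opt}=" "$cfg"); then
        # Option is set: print everything after the first '='.
        echo "${line#*=}"
    elif grep -q -x "# ${opt} is not set" "$cfg"; then
        echo "not set"
    else
        # Option does not appear in the file at all.
        echo "absent"
    fi
}

# Assumed usage:
#   config_state .config CONFIG_SMP
#   config_state .config CONFIG_X86_UP_IOAPIC
```

[For the config posted above, both CONFIG_SMP and CONFIG_X86_UP_IOAPIC come back as "not set", while CONFIG_X86_UP_APIC is "y".]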
Ingo Molnar wrote:
>* Ingo Molnar <[email protected]> wrote:
>
>>* Gunther Persoons <[email protected]> wrote:
>>
>>>Got 2 times a hard lock up with this one. Happened while i was typing
>>>something and downloading both after 15-20 minutes.
>>
>>.config?
>
>just in case you are using UP-IOAPIC, could you enable CONFIG_SMP (even
>if you are running an UP box) and see whether the lockup goes away?
>
Ok. Going to build a new kernel.
>Which was the last -RT kernel that you tried that didn't lock up in this
>fashion?
>
V0.7.24.
> Ingo
>
>
>
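A minimal sketch of how a posted .config could be checked for the UP-IOAPIC case Ingo asks about, assuming it means "IO-APIC enabled while CONFIG_SMP is off". The helper names parse_kernel_config and looks_like_up_ioapic are made up for illustration; the sample lines are copied from the 2.6.10-rc1-mm3-RT-V0.7.25-0 .config posted later in this thread.

```python
import re


def parse_kernel_config(text):
    """Map option name -> value; options marked 'is not set' map to None."""
    options = {}
    for line in text.splitlines():
        line = line.strip()
        unset = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if unset:
            options[unset.group(1)] = None
            continue
        assigned = re.fullmatch(r"(CONFIG_\w+)=(.*)", line)
        if assigned:
            options[assigned.group(1)] = assigned.group(2)
    return options


def looks_like_up_ioapic(options):
    """Assumption: UP-IOAPIC means IO-APIC support on a non-SMP build."""
    return (options.get("CONFIG_X86_IO_APIC") == "y"
            and options.get("CONFIG_SMP") != "y")


# Sample lines taken from the .config posted later in this thread.
sample = """\
CONFIG_SMP=y
CONFIG_NR_CPUS=8
# CONFIG_SCHED_SMT is not set
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
"""

print(looks_like_up_ioapic(parse_kernel_config(sample)))  # prints False
```

To check a real build, the same two functions could be fed the contents of the .config file in the kernel tree instead of the inline sample.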
* Ingo Molnar <[email protected]> wrote:
> just in case you are using UP-IOAPIC, could you enable CONFIG_SMP
> (even if you are running an UP box) and see whether the lockup goes
> away? Which was the last -RT kernel that you tried that didn't lock up
> in this fashion?
if with CONFIG_SMP it's more stable, could you try the following: turn
off CONFIG_SMP again and edit arch/i386/kernel/io_apic.c and remove this
string:
&& defined(CONFIG_SMP)
(there should be only one occurrence of this string.)
this will turn the previous IO-APIC logic (used in -V0.7.24) back on.
Ingo
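The edit described above (remove the one occurrence of the string from arch/i386/kernel/io_apic.c) could be scripted roughly as below. This is only a sketch: remove_single_occurrence is a made-up helper, and the sample line is a hypothetical stand-in, not the actual line from io_apic.c. It refuses to touch anything unless the string occurs exactly once, matching the "only one occurrence" expectation.

```python
NEEDLE = "&& defined(CONFIG_SMP)"


def remove_single_occurrence(text, needle=NEEDLE):
    """Return text without needle, but only if it occurs exactly once."""
    count = text.count(needle)
    if count != 1:
        raise ValueError(f"expected exactly one occurrence, found {count}")
    return text.replace(needle, "")


# Hypothetical sample line, NOT copied from the real io_apic.c.
sample = "#if defined(SOME_HYPOTHETICAL_OPTION) && defined(CONFIG_SMP)\n"
patched = remove_single_occurrence(sample)
print("CONFIG_SMP" in patched)  # prints False
```

Applied to the real file, one would read arch/i386/kernel/io_apic.c, pass its contents through the function, and write the result back; a ValueError would mean the tree does not match what the mail assumes.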
Ingo Molnar wrote:
>* Ingo Molnar <[email protected]> wrote:
>
>>just in case you are using UP-IOAPIC, could you enable CONFIG_SMP
>>(even if you are running an UP box) and see whether the lockup goes
>>away? Which was the last -RT kernel that you tried that didn't lock up
>>in this fashion?
>
>if with CONFIG_SMP it's more stable, could you try the following: turn
>off CONFIG_SMP again and edit arch/i386/kernel/io_apic.c and remove this
>string:
>
> && defined(CONFIG_SMP)
>
>(there should be only one occurrence of this string.)
>
>this will turn the previous IO-APIC logic (used in -V0.7.24) back on.
>
> Ingo
>
It locked up after 20 minutes with CONFIG_SMP enabled.
>i have released the -V0.7.25-0 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this release includes fixes, new features and latency improvements.
It may be coincidence, but when I did
chrt -p -f 99 2
(to set IRQ 0 to max RT priority, like the other IRQs)
I got the following deadlock.
==========================================
[ BUG: lock recursion deadlock detected! |
------------------------------------------
already locked: [c140c2e0] {&base->lock}
.. held by: ksoftirqd/0: 4 [c17953f0, 105]
... acquired at: run_timer_softirq+0x108/0x470
------------------------------
| showing all locks held by: | (ksoftirqd/0/4 [c17953f0, 105]):
------------------------------
#001: [c140c2e0] {&base->lock}
... acquired at: run_timer_softirq+0x108/0x470
#002: [c0576b6c] {&timer->lock}
... acquired at: __mod_timer+0x47/0x1d0
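For readers unfamiliar with this kind of report: the output above lists &base->lock as already held by ksoftirqd/0, the same task whose held locks are being dumped, which is what a "lock recursion" check would flag. The toy model below sketches the idea only. It is not the kernel's detector, and the class and method names are invented for illustration.

```python
class RecursionCheckedLock:
    """Toy non-recursive lock that remembers which task holds it."""

    def __init__(self, name):
        self.name = name
        self.owner = None  # name of the task holding the lock, or None

    def acquire(self, task):
        if self.owner == task:
            # The task would wait forever for a lock it holds itself.
            raise RuntimeError(
                f"lock recursion: {task} already holds {self.name}")
        if self.owner is not None:
            # A real lock would block here; the toy model just reports it.
            raise RuntimeError(f"{self.name} is held by {self.owner}")
        self.owner = task

    def release(self, task):
        if self.owner != task:
            raise RuntimeError(f"{task} does not hold {self.name}")
        self.owner = None


base_lock = RecursionCheckedLock("&base->lock")
base_lock.acquire("ksoftirqd/0")
try:
    base_lock.acquire("ksoftirqd/0")  # second acquire by the same task
    detected = False
except RuntimeError:
    detected = True
print(detected)  # prints True
```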
There are a LOT of messages that stream out after this problem.
I will be sending the full serial console log separately.
Will reboot shortly and see if this is a repeatable problem or not.
--Mark
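For reference, the chrt command in the report above asks the kernel to put PID 2 (the IRQ 0 thread on Mark's box, per his description) under SCHED_FIFO at priority 99. A rough Python equivalent, assuming a Linux system and sufficient privileges, might look like this; set_fifo_priority is a made-up helper name, while the os.sched_* calls are the standard library wrappers around the scheduler syscalls.

```python
import os


def set_fifo_priority(pid, priority):
    """Switch pid to SCHED_FIFO at the given priority (usually needs root)."""
    lowest = os.sched_get_priority_min(os.SCHED_FIFO)
    highest = os.sched_get_priority_max(os.SCHED_FIFO)
    if not lowest <= priority <= highest:
        raise ValueError(
            f"priority {priority} outside SCHED_FIFO range "
            f"{lowest}..{highest}")
    os.sched_setscheduler(pid, os.SCHED_FIFO, os.sched_param(priority))


# Rough equivalent of `chrt -p -f 99 2`; left commented out because it
# changes the scheduling of a live task and needs privileges:
# set_fifo_priority(2, 99)
```

The range check runs before the syscall, so an out-of-range priority fails early with a readable error.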
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.10-rc1-mm3-RT-V0.7.25-0
# Thu Nov 11 19:53:49 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
# CONFIG_CLEAN_COMPILE is not set
CONFIG_BROKEN=y
CONFIG_BROKEN_ON_SMP=y
CONFIG_LOCK_KERNEL=y
#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
CONFIG_BSD_PROCESS_ACCT_V3=y
CONFIG_SYSCTL=y
CONFIG_AUDIT=y
CONFIG_AUDITSYSCALL=y
CONFIG_LOG_BUF_SHIFT=15
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_KALLSYMS_ALL=y
CONFIG_KALLSYMS_EXTRA_PASS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_CPUSETS=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
CONFIG_MODULE_SRCVERSION_ALL=y
CONFIG_KMOD=y
CONFIG_STOP_MACHINE=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
CONFIG_MPENTIUMIII=y
# CONFIG_MPENTIUMM is not set
# CONFIG_MPENTIUM4 is not set
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
CONFIG_HPET_TIMER=y
CONFIG_SMP=y
CONFIG_NR_CPUS=8
# CONFIG_SCHED_SMT is not set
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT_DESKTOP is not set
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
CONFIG_X86_MCE_P4THERMAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
#
# Firmware Drivers
#
# CONFIG_EDD is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_IRQBALANCE=y
CONFIG_HAVE_DEC_LOCK=y
#
# Performance-monitoring counters support
#
CONFIG_PERFCTR=y
CONFIG_PERFCTR_INIT_TESTS=y
CONFIG_PERFCTR_VIRTUAL=y
CONFIG_PERFCTR_INTERRUPT_SUPPORT=y
CONFIG_PERFCTR_CPUS_FORBIDDEN_MASK=y
CONFIG_KERN_PHYS_OFFSET=1
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
CONFIG_PM_DEBUG=y
CONFIG_SOFTWARE_SUSPEND=y
CONFIG_PM_STD_PARTITION=""
#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
# CONFIG_ACPI_VIDEO is not set
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
# CONFIG_ACPI_HOTPLUG_CPU is not set
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
CONFIG_ACPI_THINKPAD=m
CONFIG_ACPI_IBM=y
# CONFIG_ACPI_TOSHIBA is not set
# CONFIG_ACPI_CUSTOM_DSDT is not set
CONFIG_ACPI_BLACKLIST_YEAR=0
CONFIG_ACPI_DEBUG=y
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
CONFIG_X86_PM_TIMER=y
# CONFIG_ACPI_CONTAINER is not set
#
# APM (Advanced Power Management) BIOS Support
#
# CONFIG_APM is not set
#
# CPU Frequency scaling
#
CONFIG_CPU_FREQ=y
# CONFIG_CPU_FREQ_DEBUG is not set
CONFIG_CPU_FREQ_PROC_INTF=m
# CONFIG_CPU_FREQ_DEFAULT_GOV_PERFORMANCE is not set
CONFIG_CPU_FREQ_DEFAULT_GOV_USERSPACE=y
CONFIG_CPU_FREQ_GOV_PERFORMANCE=m
CONFIG_CPU_FREQ_GOV_POWERSAVE=m
CONFIG_CPU_FREQ_GOV_USERSPACE=y
CONFIG_CPU_FREQ_24_API=y
CONFIG_CPU_FREQ_GOV_ONDEMAND=m
CONFIG_CPU_FREQ_TABLE=y
#
# CPUFreq processor drivers
#
CONFIG_X86_ACPI_CPUFREQ=m
# CONFIG_X86_POWERNOW_K6 is not set
# CONFIG_X86_POWERNOW_K7 is not set
# CONFIG_X86_POWERNOW_K8 is not set
# CONFIG_X86_GX_SUSPMOD is not set
CONFIG_X86_SPEEDSTEP_CENTRINO=m
CONFIG_X86_SPEEDSTEP_CENTRINO_ACPI=y
CONFIG_X86_SPEEDSTEP_CENTRINO_TABLE=y
CONFIG_X86_SPEEDSTEP_ICH=m
CONFIG_X86_SPEEDSTEP_SMI=m
CONFIG_X86_P4_CLOCKMOD=m
CONFIG_X86_CPUFREQ_NFORCE2=m
# CONFIG_X86_LONGRUN is not set
# CONFIG_X86_LONGHAUL is not set
#
# shared options
#
CONFIG_X86_ACPI_CPUFREQ_PROC_INTF=y
CONFIG_X86_SPEEDSTEP_LIB=m
CONFIG_X86_SPEEDSTEP_RELAXED_CAP_CHECK=y
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_MSI=y
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
CONFIG_EISA=y
CONFIG_EISA_VLB_PRIMING=y
CONFIG_EISA_PCI_EISA=y
CONFIG_EISA_VIRTUAL_ROOT=y
CONFIG_EISA_NAMES=y
CONFIG_MCA=y
CONFIG_MCA_LEGACY=y
CONFIG_MCA_PROC_FS=y
CONFIG_SCx200=m
CONFIG_HOTPLUG_CPU=y
#
# PCCARD (PCMCIA/CardBus) support
#
CONFIG_PCCARD=m
CONFIG_PCMCIA_DEBUG=y
CONFIG_PCMCIA_OBSOLETE=y
CONFIG_PCMCIA=m
CONFIG_CARDBUS=y
#
# PC-card bridges
#
CONFIG_YENTA=m
# CONFIG_PD6729 is not set
# CONFIG_I82092 is not set
# CONFIG_I82365 is not set
# CONFIG_TCIC is not set
CONFIG_PCMCIA_PROBE=y
#
# PCI Hotplug Support
#
CONFIG_HOTPLUG_PCI=m
CONFIG_HOTPLUG_PCI_FAKE=m
# CONFIG_HOTPLUG_PCI_COMPAQ is not set
# CONFIG_HOTPLUG_PCI_IBM is not set
CONFIG_HOTPLUG_PCI_ACPI=m
# CONFIG_HOTPLUG_PCI_ACPI_IBM is not set
# CONFIG_HOTPLUG_PCI_CPCI is not set
CONFIG_HOTPLUG_PCI_PCIE=m
CONFIG_HOTPLUG_PCI_PCIE_POLL_EVENT_MODE=y
# CONFIG_HOTPLUG_PCI_SHPC is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
# CONFIG_STANDALONE is not set
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=m
CONFIG_DEBUG_DRIVER=y
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
CONFIG_PARPORT_PC_CML1=y
# CONFIG_PARPORT_SERIAL is not set
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_PC_PCMCIA is not set
# CONFIG_PARPORT_OTHER is not set
# CONFIG_PARPORT_1284 is not set
#
# Plug and Play support
#
# CONFIG_PNP is not set
#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_PS2 is not set
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
CONFIG_BLK_DEV_CRYPTOLOOP=m
CONFIG_BLK_DEV_NBD=m
CONFIG_BLK_DEV_SX8=m
CONFIG_BLK_DEV_UB=m
CONFIG_BLK_DEV_RAM=m
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_INITRAMFS_SOURCE=""
CONFIG_LBD=y
CONFIG_CDROM_PKTCDVD=m
CONFIG_CDROM_PKTCDVD_BUFFERS=8
CONFIG_CDROM_PKTCDVD_WCACHE=y
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
CONFIG_BLK_DEV_IDE_SATA=y
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECS=m
CONFIG_BLK_DEV_IDECD=m
CONFIG_BLK_DEV_IDETAPE=m
CONFIG_BLK_DEV_IDEFLOPPY=m
CONFIG_BLK_DEV_IDESCSI=m
CONFIG_IDE_TASK_IOCTL=y
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=m
# CONFIG_BLK_DEV_CMD640 is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
# CONFIG_BLK_DEV_RZ1000 is not set
CONFIG_BLK_DEV_IDEDMA_PCI=y
CONFIG_BLK_DEV_IDEDMA_FORCED=y
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
CONFIG_BLK_DEV_AMD74XX=y
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=y
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
CONFIG_SCSI_MULTI_LUN=y
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
CONFIG_SCSI_FC_ATTRS=m
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_3W_9XXX is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AHA1740 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_SCSI_SATA=y
CONFIG_SCSI_SATA_AHCI=m
# CONFIG_SCSI_SATA_SVW is not set
CONFIG_SCSI_ATA_PIIX=y
# CONFIG_SCSI_SATA_NV is not set
# CONFIG_SCSI_SATA_PROMISE is not set
# CONFIG_SCSI_SATA_SX4 is not set
# CONFIG_SCSI_SATA_SIL is not set
# CONFIG_SCSI_SATA_SIS is not set
# CONFIG_SCSI_SATA_ULI is not set
# CONFIG_SCSI_SATA_VIA is not set
# CONFIG_SCSI_SATA_VITESSE is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_FD_MCS is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IBMMCA is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INITIO is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_NCR_D700 is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_NCR_Q720 is not set
# CONFIG_SCSI_MCA_53C9X is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PCI2000 is not set
# CONFIG_SCSI_PCI2220I is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
# CONFIG_SCSI_QLOGIC_1280_1040 is not set
CONFIG_SCSI_QLA2XXX=y
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_SEAGATE is not set
# CONFIG_SCSI_SIM710 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
CONFIG_SCSI_DEBUG=m
#
# PCMCIA SCSI adapter support
#
# CONFIG_PCMCIA_AHA152X is not set
# CONFIG_PCMCIA_FDOMAIN is not set
# CONFIG_PCMCIA_NINJA_SCSI is not set
# CONFIG_PCMCIA_QLOGIC is not set
# CONFIG_PCMCIA_SYM53C500 is not set
#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK_DEV=m
CONFIG_UNIX=y
CONFIG_NET_KEY=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
CONFIG_IP_PNP=y
CONFIG_IP_PNP_DHCP=y
CONFIG_IP_PNP_BOOTP=y
CONFIG_IP_PNP_RARP=y
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
CONFIG_ARPD=y
CONFIG_SYN_COOKIES=y
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
CONFIG_INET_TUNNEL=m
# CONFIG_IP_TCPDIAG is not set
# CONFIG_IP_TCPDIAG_IPV6 is not set
#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
CONFIG_IPV6=m
CONFIG_IPV6_PRIVACY=y
CONFIG_INET6_AH=m
CONFIG_INET6_ESP=m
CONFIG_INET6_IPCOMP=m
CONFIG_INET6_TUNNEL=m
CONFIG_IPV6_TUNNEL=m
CONFIG_NETFILTER=y
CONFIG_NETFILTER_DEBUG=y
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
CONFIG_IP_NF_CT_ACCT=y
# CONFIG_IP_NF_CONNTRACK_MARK is not set
CONFIG_IP_NF_CT_PROTO_SCTP=m
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
CONFIG_IP_NF_TFTP=m
CONFIG_IP_NF_AMANDA=m
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
CONFIG_IP_NF_MATCH_IPRANGE=m
CONFIG_IP_NF_MATCH_MAC=m
# CONFIG_IP_NF_MATCH_PKTTYPE is not set
# CONFIG_IP_NF_MATCH_MARK is not set
# CONFIG_IP_NF_MATCH_MULTIPORT is not set
# CONFIG_IP_NF_MATCH_TOS is not set
# CONFIG_IP_NF_MATCH_RECENT is not set
# CONFIG_IP_NF_MATCH_ECN is not set
# CONFIG_IP_NF_MATCH_DSCP is not set
# CONFIG_IP_NF_MATCH_AH_ESP is not set
# CONFIG_IP_NF_MATCH_LENGTH is not set
# CONFIG_IP_NF_MATCH_TTL is not set
# CONFIG_IP_NF_MATCH_TCPMSS is not set
# CONFIG_IP_NF_MATCH_HELPER is not set
# CONFIG_IP_NF_MATCH_STATE is not set
# CONFIG_IP_NF_MATCH_CONNTRACK is not set
# CONFIG_IP_NF_MATCH_OWNER is not set
# CONFIG_IP_NF_MATCH_ADDRTYPE is not set
# CONFIG_IP_NF_MATCH_REALM is not set
# CONFIG_IP_NF_MATCH_SCTP is not set
# CONFIG_IP_NF_MATCH_COMMENT is not set
# CONFIG_IP_NF_MATCH_HASHLIMIT is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
# CONFIG_IP_NF_TARGET_ULOG is not set
# CONFIG_IP_NF_TARGET_TCPMSS is not set
# CONFIG_IP_NF_NAT is not set
# CONFIG_IP_NF_MANGLE is not set
# CONFIG_IP_NF_RAW is not set
# CONFIG_IP_NF_ARPTABLES is not set
# CONFIG_IP_NF_COMPAT_IPCHAINS is not set
# CONFIG_IP_NF_COMPAT_IPFWADM is not set
#
# IPv6: Netfilter Configuration
#
CONFIG_IP6_NF_QUEUE=m
CONFIG_IP6_NF_IPTABLES=m
CONFIG_IP6_NF_MATCH_LIMIT=m
CONFIG_IP6_NF_MATCH_MAC=m
CONFIG_IP6_NF_MATCH_RT=m
# CONFIG_IP6_NF_MATCH_OPTS is not set
# CONFIG_IP6_NF_MATCH_FRAG is not set
# CONFIG_IP6_NF_MATCH_HL is not set
# CONFIG_IP6_NF_MATCH_MULTIPORT is not set
# CONFIG_IP6_NF_MATCH_OWNER is not set
# CONFIG_IP6_NF_MATCH_MARK is not set
# CONFIG_IP6_NF_MATCH_IPV6HEADER is not set
# CONFIG_IP6_NF_MATCH_AHESP is not set
# CONFIG_IP6_NF_MATCH_LENGTH is not set
# CONFIG_IP6_NF_MATCH_EUI64 is not set
CONFIG_IP6_NF_FILTER=m
CONFIG_IP6_NF_TARGET_LOG=m
# CONFIG_IP6_NF_MANGLE is not set
# CONFIG_IP6_NF_RAW is not set
CONFIG_XFRM=y
# CONFIG_XFRM_USER is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
CONFIG_ATM=m
CONFIG_ATM_CLIP=m
CONFIG_ATM_CLIP_NO_ICMP=y
CONFIG_ATM_LANE=m
CONFIG_ATM_MPOA=m
CONFIG_ATM_BR2684=m
CONFIG_ATM_BR2684_IPFILTER=y
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
CONFIG_LLC=m
CONFIG_LLC2=m
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
# CONFIG_NET_CLS_ROUTE is not set
#
# Network testing
#
CONFIG_NET_PKTGEN=m
# CONFIG_KGDBOE is not set
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
CONFIG_BONDING=m
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_ETHERTAP is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
CONFIG_NET_VENDOR_3COM=y
# CONFIG_EL1 is not set
# CONFIG_EL2 is not set
# CONFIG_ELPLUS is not set
# CONFIG_EL16 is not set
CONFIG_EL3=m
# CONFIG_3C515 is not set
# CONFIG_ELMC is not set
# CONFIG_ELMC_II is not set
# CONFIG_VORTEX is not set
# CONFIG_TYPHOON is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
# CONFIG_SKMC is not set
# CONFIG_NE2_MCA is not set
# CONFIG_IBMLANA is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_CS89x0 is not set
# CONFIG_DGRS is not set
CONFIG_EEPRO100=m
CONFIG_E100=m
# CONFIG_E100_NAPI is not set
# CONFIG_LNE390 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_NE3210 is not set
# CONFIG_ES3210 is not set
# CONFIG_8139CP is not set
# CONFIG_8139TOO is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_NET_POCKET is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_VIA_VELOCITY is not set
# CONFIG_TIGON3 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
CONFIG_NET_RADIO=y
#
# Obsolete Wireless cards support (pre-802.11)
#
# CONFIG_STRIP is not set
# CONFIG_ARLAN is not set
# CONFIG_WAVELAN is not set
# CONFIG_PCMCIA_WAVELAN is not set
# CONFIG_PCMCIA_NETWAVE is not set
#
# Wireless 802.11 Frequency Hopping cards support
#
# CONFIG_PCMCIA_RAYCS is not set
#
# Wireless 802.11b ISA/PCI cards support
#
# CONFIG_AIRO is not set
# CONFIG_HERMES is not set
# CONFIG_ATMEL is not set
#
# Wireless 802.11b Pcmcia/Cardbus cards support
#
# CONFIG_AIRO_CS is not set
# CONFIG_PCMCIA_WL3501 is not set
#
# Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support
#
# CONFIG_PRISM54 is not set
CONFIG_NET_WIRELESS=y
#
# PCMCIA network device support
#
# CONFIG_NET_PCMCIA is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
#
# ATM drivers
#
CONFIG_ATM_TCP=m
# CONFIG_ATM_LANAI is not set
# CONFIG_ATM_ENI is not set
# CONFIG_ATM_FIRESTREAM is not set
# CONFIG_ATM_ZATM is not set
# CONFIG_ATM_NICSTAR is not set
# CONFIG_ATM_IDT77252 is not set
# CONFIG_ATM_AMBASSADOR is not set
# CONFIG_ATM_HORIZON is not set
# CONFIG_ATM_IA is not set
# CONFIG_ATM_FORE200E_MAYBE is not set
# CONFIG_ATM_HE is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
CONFIG_PPP_BSDCOMP=m
CONFIG_PPPOE=m
CONFIG_PPPOATM=m
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
CONFIG_NETCONSOLE=y
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
CONFIG_INPUT_EVDEV=m
CONFIG_INPUT_EVBUG=m
#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=m
# CONFIG_SERIO_CT82C710 is not set
CONFIG_SERIO_PARKBD=m
CONFIG_SERIO_PCIPS2=m
CONFIG_SERIO_LIBPS2=y
CONFIG_SERIO_RAW=m
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
CONFIG_KEYBOARD_XTKBD=m
CONFIG_KEYBOARD_NEWTON=m
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
CONFIG_MOUSE_SERIAL=m
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_NONSTANDARD=y
# CONFIG_COMPUTONE is not set
# CONFIG_ROCKETPORT is not set
# CONFIG_CYCLADES is not set
# CONFIG_DIGIEPCA is not set
# CONFIG_DIGI is not set
# CONFIG_ESPSERIAL is not set
# CONFIG_MOXA_INTELLIO is not set
# CONFIG_MOXA_SMARTIO is not set
# CONFIG_ISI is not set
# CONFIG_SYNCLINK is not set
# CONFIG_SYNCLINKMP is not set
CONFIG_N_HDLC=m
# CONFIG_RISCOM8 is not set
# CONFIG_SPECIALIX is not set
# CONFIG_SX is not set
# CONFIG_RIO is not set
# CONFIG_STALDRV is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
CONFIG_SERIAL_8250_CONSOLE=y
# CONFIG_SERIAL_8250_CS is not set
CONFIG_SERIAL_8250_ACPI=y
CONFIG_SERIAL_8250_NR_UARTS=4
CONFIG_SERIAL_8250_EXTENDED=y
CONFIG_SERIAL_8250_MANY_PORTS=y
CONFIG_SERIAL_8250_SHARE_IRQ=y
# CONFIG_SERIAL_8250_DETECT_IRQ is not set
# CONFIG_SERIAL_8250_MULTIPORT is not set
# CONFIG_SERIAL_8250_RSA is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_SERIAL_CORE_CONSOLE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_PRINTER=y
CONFIG_LP_CONSOLE=y
CONFIG_PPDEV=m
# CONFIG_TIPAR is not set
#
# IPMI
#
CONFIG_IPMI_HANDLER=m
CONFIG_IPMI_PANIC_EVENT=y
CONFIG_IPMI_PANIC_STRING=y
CONFIG_IPMI_DEVICE_INTERFACE=m
CONFIG_IPMI_SI=m
CONFIG_IPMI_WATCHDOG=m
CONFIG_IPMI_POWEROFF=m
#
# Watchdog Cards
#
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
#
# Watchdog Device Drivers
#
# CONFIG_SOFT_WATCHDOG is not set
# CONFIG_ACQUIRE_WDT is not set
# CONFIG_ADVANTECH_WDT is not set
# CONFIG_ALIM1535_WDT is not set
# CONFIG_ALIM7101_WDT is not set
# CONFIG_SC520_WDT is not set
# CONFIG_EUROTECH_WDT is not set
# CONFIG_IB700_WDT is not set
# CONFIG_WAFER_WDT is not set
# CONFIG_I8XX_TCO is not set
# CONFIG_SC1200_WDT is not set
# CONFIG_SCx200_WDT is not set
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_W83627HF_WDT is not set
# CONFIG_W83877F_WDT is not set
# CONFIG_MACHZ_WDT is not set
#
# ISA-based Watchdog Cards
#
# CONFIG_PCWATCHDOG is not set
# CONFIG_MIXCOMWD is not set
# CONFIG_WDT is not set
#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
# CONFIG_WDTPCI is not set
#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
CONFIG_HW_RANDOM=m
CONFIG_NVRAM=m
CONFIG_RTC=m
CONFIG_RTC_HISTOGRAM=m
CONFIG_GEN_RTC=m
CONFIG_GEN_RTC_X=y
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=y
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=y
# CONFIG_AGP_INTEL_MCH is not set
CONFIG_AGP_NVIDIA=m
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=y
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_GAMMA is not set
# CONFIG_DRM_R128 is not set
# CONFIG_DRM_RADEON is not set
CONFIG_DRM_I810=m
CONFIG_DRM_I830=m
# CONFIG_DRM_I915 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_MWAVE is not set
# CONFIG_SCx200_GPIO is not set
CONFIG_RAW_DRIVER=m
CONFIG_HPET=y
CONFIG_HPET_RTC_IRQ=y
CONFIG_HPET_MMAP=y
CONFIG_MAX_RAW_DEVS=256
CONFIG_HANGCHECK_TIMER=m
#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
# CONFIG_I2C_ALGOPCA is not set
#
# I2C Hardware Bus support
#
# CONFIG_I2C_ALI1535 is not set
# CONFIG_I2C_ALI1563 is not set
# CONFIG_I2C_ALI15X3 is not set
# CONFIG_I2C_AMD756 is not set
# CONFIG_I2C_AMD8111 is not set
# CONFIG_I2C_ELEKTOR is not set
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
# CONFIG_I2C_ISA is not set
CONFIG_I2C_NFORCE2=m
# CONFIG_I2C_PARPORT is not set
# CONFIG_I2C_PARPORT_LIGHT is not set
CONFIG_I2C_PIIX4=m
CONFIG_I2C_PROSAVAGE=m
CONFIG_I2C_SAVAGE4=m
# CONFIG_SCx200_ACB is not set
# CONFIG_I2C_SIS5595 is not set
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_STUB is not set
# CONFIG_I2C_VIA is not set
# CONFIG_I2C_VIAPRO is not set
# CONFIG_I2C_VOODOO3 is not set
# CONFIG_I2C_PCA_ISA is not set
#
# Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=m
# CONFIG_SENSORS_ADM1021 is not set
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ASB100 is not set
# CONFIG_SENSORS_DS1621 is not set
# CONFIG_SENSORS_FSCHER is not set
# CONFIG_SENSORS_GL518SM is not set
# CONFIG_SENSORS_IT87 is not set
CONFIG_SENSORS_LM75=m
# CONFIG_SENSORS_LM77 is not set
# CONFIG_SENSORS_LM78 is not set
# CONFIG_SENSORS_LM80 is not set
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
# CONFIG_SENSORS_VIA686A is not set
# CONFIG_SENSORS_W83781D is not set
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83627HF is not set
#
# Other I2C Chip support
#
# CONFIG_SENSORS_EEPROM is not set
# CONFIG_SENSORS_PCF8574 is not set
# CONFIG_SENSORS_PCF8591 is not set
# CONFIG_SENSORS_RTC8564 is not set
CONFIG_I2C_DEBUG_CORE=y
CONFIG_I2C_DEBUG_ALGO=y
CONFIG_I2C_DEBUG_BUS=y
CONFIG_I2C_DEBUG_CHIP=y
#
# Dallas's 1-wire bus
#
CONFIG_W1=m
# CONFIG_W1_MATROX is not set
# CONFIG_W1_DS9490 is not set
CONFIG_W1_THERM=m
# CONFIG_W1_SMEM is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
CONFIG_VIDEO_DEV=m
#
# Video For Linux
#
#
# Video Adapters
#
# CONFIG_VIDEO_BT848 is not set
# CONFIG_VIDEO_PMS is not set
# CONFIG_VIDEO_BWQCAM is not set
# CONFIG_VIDEO_CQCAM is not set
# CONFIG_VIDEO_CPIA is not set
# CONFIG_VIDEO_SAA5246A is not set
# CONFIG_VIDEO_SAA5249 is not set
# CONFIG_TUNER_3036 is not set
# CONFIG_VIDEO_STRADIS is not set
# CONFIG_VIDEO_ZORAN is not set
# CONFIG_VIDEO_ZR36120 is not set
# CONFIG_VIDEO_SAA7134 is not set
# CONFIG_VIDEO_MXB is not set
# CONFIG_VIDEO_DPC is not set
# CONFIG_VIDEO_HEXIUM_ORION is not set
# CONFIG_VIDEO_HEXIUM_GEMINI is not set
# CONFIG_VIDEO_CX88 is not set
# CONFIG_VIDEO_OVCAMCHIP is not set
#
# Radio Adapters
#
# CONFIG_RADIO_CADET is not set
# CONFIG_RADIO_RTRACK is not set
# CONFIG_RADIO_RTRACK2 is not set
# CONFIG_RADIO_AZTECH is not set
# CONFIG_RADIO_GEMTEK is not set
# CONFIG_RADIO_GEMTEK_PCI is not set
# CONFIG_RADIO_MAXIRADIO is not set
# CONFIG_RADIO_MAESTRO is not set
# CONFIG_RADIO_SF16FMI is not set
# CONFIG_RADIO_SF16FMR2 is not set
# CONFIG_RADIO_TERRATEC is not set
# CONFIG_RADIO_TRUST is not set
# CONFIG_RADIO_TYPHOON is not set
# CONFIG_RADIO_ZOLTRIX is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
CONFIG_FB=y
CONFIG_FB_MODE_HELPERS=y
# CONFIG_FB_TILEBLITTING is not set
# CONFIG_FB_CIRRUS is not set
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
# CONFIG_FB_HGA is not set
CONFIG_FB_RIVA=m
# CONFIG_FB_RIVA_I2C is not set
# CONFIG_FB_RIVA_DEBUG is not set
CONFIG_FB_I810=m
CONFIG_FB_I810_GTF=y
# CONFIG_FB_INTEL is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON_OLD is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
CONFIG_FB_SAVAGE=m
CONFIG_FB_SAVAGE_I2C=m
CONFIG_FB_SAVAGE_ACCEL=m
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_PM3 is not set
# CONFIG_FB_VIRTUAL is not set
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_MDA_CONSOLE=m
CONFIG_DUMMY_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE=m
CONFIG_FONTS=y
CONFIG_FONT_8x8=y
CONFIG_FONT_8x16=y
# CONFIG_FONT_6x11 is not set
# CONFIG_FONT_PEARL_8x8 is not set
# CONFIG_FONT_ACORN_8x8 is not set
CONFIG_FONT_MINI_4x6=y
CONFIG_FONT_SUN8x16=y
# CONFIG_FONT_SUN12x22 is not set
#
# Logo configuration
#
CONFIG_LOGO=y
CONFIG_LOGO_LINUX_MONO=y
# CONFIG_LOGO_LINUX_VGA16 is not set
# CONFIG_LOGO_LINUX_CLUT224 is not set
#
# Sound
#
CONFIG_SOUND=y
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_HWDEP=m
CONFIG_SND_RAWMIDI=m
CONFIG_SND_SEQUENCER=y
CONFIG_SND_SEQ_DUMMY=m
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_SEQUENCER_OSS=y
CONFIG_SND_RTCTIMER=m
CONFIG_SND_VERBOSE_PRINTK=y
CONFIG_SND_DEBUG=y
CONFIG_SND_DEBUG_MEMORY=y
CONFIG_SND_DEBUG_DETECT=y
#
# Generic devices
#
CONFIG_SND_MPU401_UART=m
CONFIG_SND_OPL3_LIB=m
CONFIG_SND_DUMMY=m
CONFIG_SND_VIRMIDI=m
CONFIG_SND_MTPAV=m
CONFIG_SND_SERIAL_U16550=m
CONFIG_SND_MPU401=m
#
# ISA devices
#
# CONFIG_SND_AD1848 is not set
# CONFIG_SND_CS4231 is not set
# CONFIG_SND_CS4232 is not set
# CONFIG_SND_CS4236 is not set
# CONFIG_SND_ES1688 is not set
# CONFIG_SND_ES18XX is not set
# CONFIG_SND_GUSCLASSIC is not set
# CONFIG_SND_GUSEXTREME is not set
# CONFIG_SND_GUSMAX is not set
# CONFIG_SND_INTERWAVE is not set
# CONFIG_SND_INTERWAVE_STB is not set
# CONFIG_SND_OPTI92X_AD1848 is not set
# CONFIG_SND_OPTI92X_CS4231 is not set
# CONFIG_SND_OPTI93X is not set
# CONFIG_SND_SB8 is not set
# CONFIG_SND_SB16 is not set
# CONFIG_SND_SBAWE is not set
# CONFIG_SND_WAVEFRONT is not set
# CONFIG_SND_CMI8330 is not set
# CONFIG_SND_OPL3SA2 is not set
# CONFIG_SND_SGALAXY is not set
# CONFIG_SND_SSCAPE is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=y
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_ATIIXP_MODEM is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
CONFIG_SND_INTEL8X0=y
# CONFIG_SND_INTEL8X0M is not set
CONFIG_SND_SONICVIBES=m
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VX222 is not set
#
# USB devices
#
# CONFIG_SND_USB_AUDIO is not set
# CONFIG_SND_USB_USX2Y is not set
#
# PCMCIA devices
#
# CONFIG_SND_VXPOCKET is not set
# CONFIG_SND_VXP440 is not set
# CONFIG_SND_PDAUDIOCF is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
CONFIG_USB=y
CONFIG_USB_DEBUG=y
#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
CONFIG_USB_BANDWIDTH=y
CONFIG_USB_DYNAMIC_MINORS=y
CONFIG_USB_SUSPEND=y
# CONFIG_USB_OTG is not set
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=m
# CONFIG_USB_EHCI_SPLIT_ISO is not set
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
CONFIG_USB_OHCI_HCD=m
CONFIG_USB_UHCI_HCD=m
#
# USB Device Class drivers
#
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_MIDI is not set
# CONFIG_USB_ACM is not set
# CONFIG_USB_PRINTER is not set
# CONFIG_USB_STORAGE is not set
#
# USB Input Devices
#
# CONFIG_USB_HID is not set
#
# USB HID Boot Protocol drivers
#
# CONFIG_USB_KBD is not set
# CONFIG_USB_MOUSE is not set
# CONFIG_USB_AIPTEK is not set
# CONFIG_USB_WACOM is not set
# CONFIG_USB_KBTAB is not set
# CONFIG_USB_POWERMATE is not set
# CONFIG_USB_MTOUCH is not set
# CONFIG_USB_EGALAX is not set
# CONFIG_USB_XPAD is not set
# CONFIG_USB_ATI_REMOTE is not set
#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USB_HPUSBSCSI is not set
#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set
# CONFIG_USB_VICAM is not set
# CONFIG_USB_DSBR is not set
# CONFIG_USB_IBMCAM is not set
# CONFIG_USB_KONICAWC is not set
# CONFIG_USB_OV511 is not set
# CONFIG_USB_SE401 is not set
# CONFIG_USB_SN9C102 is not set
# CONFIG_USB_STV680 is not set
#
# USB Network adaptors
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
#
# USB port drivers
#
# CONFIG_USB_USS720 is not set
#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
# CONFIG_USB_SERIAL_BELKIN is not set
# CONFIG_USB_SERIAL_WHITEHEAT is not set
# CONFIG_USB_SERIAL_DIGI_ACCELEPORT is not set
# CONFIG_USB_SERIAL_CYPRESS_M8 is not set
# CONFIG_USB_SERIAL_EMPEG is not set
# CONFIG_USB_SERIAL_FTDI_SIO is not set
# CONFIG_USB_SERIAL_VISOR is not set
# CONFIG_USB_SERIAL_IPAQ is not set
# CONFIG_USB_SERIAL_IR is not set
# CONFIG_USB_SERIAL_EDGEPORT is not set
# CONFIG_USB_SERIAL_EDGEPORT_TI is not set
# CONFIG_USB_SERIAL_IPW is not set
# CONFIG_USB_SERIAL_KEYSPAN_PDA is not set
# CONFIG_USB_SERIAL_KEYSPAN is not set
# CONFIG_USB_SERIAL_KLSI is not set
# CONFIG_USB_SERIAL_KOBIL_SCT is not set
# CONFIG_USB_SERIAL_MCT_U232 is not set
# CONFIG_USB_SERIAL_PL2303 is not set
# CONFIG_USB_SERIAL_SAFE is not set
# CONFIG_USB_SERIAL_CYBERJACK is not set
# CONFIG_USB_SERIAL_XIRCOM is not set
# CONFIG_USB_SERIAL_OMNINET is not set
#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_TIGL is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGETKIT is not set
# CONFIG_USB_PHIDGETSERVO is not set
# CONFIG_USB_TEST is not set
#
# USB ATM/DSL drivers
#
# CONFIG_USB_ATM is not set
# CONFIG_USB_SPEEDTOUCH is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
CONFIG_EXT2_FS_XATTR=y
CONFIG_EXT2_FS_POSIX_ACL=y
CONFIG_EXT2_FS_SECURITY=y
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
CONFIG_EXT3_FS_SECURITY=y
CONFIG_JBD=y
CONFIG_JBD_DEBUG=y
CONFIG_FS_MBCACHE=y
# CONFIG_REISERFS_FS is not set
# CONFIG_JFS_FS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
CONFIG_QUOTA=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=m
CONFIG_QUOTACTL=y
CONFIG_DNOTIFY=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=y
#
# Caches
#
CONFIG_FSCACHE=m
CONFIG_CACHEFS=m
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=y
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
CONFIG_DEVPTS_FS_XATTR=y
CONFIG_DEVPTS_FS_SECURITY=y
CONFIG_TMPFS=y
CONFIG_TMPFS_XATTR=y
CONFIG_TMPFS_SECURITY=y
CONFIG_HUGETLBFS=y
CONFIG_HUGETLB_PAGE=y
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
#
# Network File Systems
#
CONFIG_NFS_FS=y
CONFIG_NFS_V3=y
CONFIG_NFS_V4=y
CONFIG_NFS_DIRECTIO=y
CONFIG_NFSD=y
CONFIG_NFSD_V3=y
CONFIG_NFSD_V4=y
CONFIG_NFSD_TCP=y
# CONFIG_ROOT_NFS is not set
CONFIG_LOCKD=y
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
CONFIG_SUNRPC_GSS=y
CONFIG_RPCSEC_GSS_KRB5=y
CONFIG_RPCSEC_GSS_SPKM3=m
# CONFIG_SMB_FS is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
# CONFIG_OSF_PARTITION is not set
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
# CONFIG_BSD_DISKLABEL is not set
# CONFIG_MINIX_SUBPARTITION is not set
# CONFIG_SOLARIS_X86_PARTITION is not set
# CONFIG_UNIXWARE_DISKLABEL is not set
# CONFIG_LDM_PARTITION is not set
# CONFIG_SGI_PARTITION is not set
# CONFIG_ULTRIX_PARTITION is not set
# CONFIG_SUN_PARTITION is not set
# CONFIG_EFI_PARTITION is not set
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
CONFIG_NLS_ISO8859_15=y
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
CONFIG_NLS_UTF8=y
#
# Profiling support
#
CONFIG_PROFILING=y
CONFIG_OPROFILE=m
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
CONFIG_SCHEDSTATS=y
# CONFIG_DEBUG_SLAB is not set
CONFIG_DEBUG_PREEMPT=y
CONFIG_WAKEUP_TIMING=y
CONFIG_PREEMPT_TRACE=y
CONFIG_CRITICAL_PREEMPT_TIMING=y
CONFIG_CRITICAL_IRQSOFF_TIMING=y
CONFIG_CRITICAL_TIMING=y
CONFIG_LATENCY_TIMING=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_RT_DEADLOCK_DETECT=y
CONFIG_DEBUG_KOBJECT=y
CONFIG_DEBUG_INFO=y
CONFIG_FRAME_POINTER=y
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_STACKOVERFLOW=y
CONFIG_KPROBES=y
CONFIG_DEBUG_STACK_USAGE=y
# CONFIG_DEBUG_PAGEALLOC is not set
CONFIG_4KSTACKS=y
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
#
# Security options
#
CONFIG_KEYS=y
CONFIG_KEYS_DEBUG_PROC_KEYS=y
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_CAPABILITIES=m
CONFIG_SECURITY_ROOTPLUG=m
# CONFIG_SECURITY_SECLVL is not set
CONFIG_SECURITY_SELINUX=y
CONFIG_SECURITY_SELINUX_BOOTPARAM=y
CONFIG_SECURITY_SELINUX_BOOTPARAM_VALUE=1
CONFIG_SECURITY_SELINUX_DISABLE=y
CONFIG_SECURITY_SELINUX_DEVELOP=y
CONFIG_SECURITY_SELINUX_MLS=y
#
# Cryptographic options
#
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=y
CONFIG_CRYPTO_MD5=y
CONFIG_CRYPTO_SHA1=y
CONFIG_CRYPTO_SHA256=m
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_WP512 is not set
CONFIG_CRYPTO_DES=y
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES_586 is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
CONFIG_CRYPTO_DEFLATE=y
# CONFIG_CRYPTO_MICHAEL_MIC is not set
CONFIG_CRYPTO_CRC32C=y
CONFIG_CRYPTO_TEST=m
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=y
CONFIG_LIBCRC32C=y
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=y
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_PC=y
[email protected] wrote:
>>i have released the -V0.7.25-0 Real-Time Preemption patch, which can be
>>downloaded from the usual place:
>>
>> http://redhat.com/~mingo/realtime-preempt/
>>
>>this release includes fixes, new features and latency improvements.
>
>
> It may be coincidence, but when I did
> chrt -p -f 99 2
> (to set IRQ 0 to max RT priority, like the other IRQ's)
>
> I got the following deadlock.
>
I tried reproducing this but could not. Of course that doesn't mean
much. :) Also, is this really the way you typed it in? If so, it is
very possible that the pid it tried to chrt was 99 not 2.
kr
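For reference, chrt is a front end to sched_setscheduler(2). The sketch below shows roughly what a request like the one quoted above asks of the kernel, assuming Linux with glibc. The helper name set_fifo_priority is invented for this illustration and is not part of chrt or of the patch.

```c
#include <errno.h>
#include <sched.h>
#include <sys/types.h>

/*
 * Hypothetical helper (not part of chrt): put `pid` into SCHED_FIFO at
 * priority `prio`. Returns 0 on success, -1 with errno set on failure.
 * Changing another task's policy normally needs root or CAP_SYS_NICE.
 */
int set_fifo_priority(pid_t pid, int prio)
{
    struct sched_param sp = { .sched_priority = prio };
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    /* Reject priorities outside the range the kernel reports. */
    if (lo < 0 || hi < 0 || prio < lo || prio > hi) {
        errno = EINVAL;
        return -1;
    }
    return sched_setscheduler(pid, SCHED_FIFO, &sp);
}
```

With this helper, set_fifo_priority(2, 99) would express the stated intent (PID 2, priority 99). Which argument chrt itself takes as the PID depends on how it parses its command line, which is the question raised above.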
Remi Colinet wrote:
> Ingo Molnar wrote:
>
>> i have released the -V0.7.25-0 Real-Time Preemption patch, which can be
>> downloaded from the usual place:
>>
>> http://redhat.com/~mingo/realtime-preempt/
>>
>>
>>
>>
> Hi,
>
> I'm getting the following warning with V0.7.25-0
>
> INSTALL sound/drivers/opl3/snd-opl3-lib.ko
> INSTALL sound/drivers/opl3/snd-opl3-synth.ko
> INSTALL sound/drivers/snd-dummy.ko
> INSTALL sound/drivers/snd-mtpav.ko
> INSTALL sound/drivers/snd-serial-u16550.ko
> INSTALL sound/drivers/snd-virmidi.ko
> INSTALL sound/pci/snd-sonicvibes.ko
> if [ -r System.map ]; then /sbin/depmod -ae -F System.map 2.6.10-rc1-mm3-RT-V0.7.25-0; fi
> WARNING: /lib/modules/2.6.10-rc1-mm3-RT-V0.7.25-0/kernel/drivers/char/rtc.ko needs unknown symbol rtc_close_event
> WARNING: /lib/modules/2.6.10-rc1-mm3-RT-V0.7.25-0/kernel/drivers/char/rtc.ko needs unknown symbol rtc_open_event
> [root@tigre01 im]#
>
> .config file attached
>
> Remi
>
Does the patch below fix this?
kr
--- linux-2.6.10-rc1-mm3/drivers/char/rtc.c.orig 2004-11-11 11:35:00.898841565 -0600
+++ linux-2.6.10-rc1-mm3/drivers/char/rtc.c 2004-11-11 12:07:54.019642611 -0600
@@ -863,7 +863,9 @@
if(rtc_status & RTC_IS_OPEN)
goto out_busy;
+#ifdef RTC_IRQ
rtc_open_event();
+#endif
rtc_status |= RTC_IS_OPEN;
rtc_irq_data = 0;
@@ -920,7 +922,9 @@
rtc_irq_data = 0;
rtc_status &= ~RTC_IS_OPEN;
spin_unlock_irq (&rtc_lock);
+#ifdef RTC_IRQ
rtc_close_event();
+#endif
return 0;
}
found the bug that i think caused the freezes and deadlocks reported by
Mark and Gunther. Here's the announcement of a debug feature:
> - debugging helper: the /proc/sys/kernel/debug_direct_keyboard flag
> (default: 0) will hack the keyboard IRQ into being direct. NOTE: the
> keyboard in this mode should only be used to access SysRq
> functionality that is not possible via the threaded keyboard handler.
> The direct keyboard IRQ can crash the system.
it turns out i accidentally left debug_direct_keyboard default-enabled
... no wonder it caused lockups!
Ingo
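To check which value the flag has on a running system, reading the sysctl file is enough. A minimal sketch, assuming the flag is exposed as a plain integer under /proc/sys as the announcement describes; read_flag and write_flag are illustrative helper names, not kernel or libc functions.

```c
#include <stdio.h>

/* Read an integer flag from a /proc/sys style file; -1 on error. */
int read_flag(const char *path)
{
    FILE *f = fopen(path, "r");
    int value;

    if (!f)
        return -1;
    if (fscanf(f, "%d", &value) != 1)
        value = -1;
    fclose(f);
    return value;
}

/* Write an integer flag; returns 0 on success, -1 on error. */
int write_flag(const char *path, int value)
{
    FILE *f = fopen(path, "w");
    int ok;

    if (!f)
        return -1;
    ok = fprintf(f, "%d\n", value) > 0;
    if (fclose(f) != 0)
        ok = 0;
    return ok ? 0 : -1;
}
```

On a kernel carrying this patch, read_flag("/proc/sys/kernel/debug_direct_keyboard") returning 1 would indicate the direct keyboard IRQ is active, and write_flag(..., 0) as root should switch it off again.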
* K.R. Foley <[email protected]> wrote:
> +#ifdef RTC_IRQ
> rtc_open_event();
> +#endif
> +#ifdef RTC_IRQ
> rtc_close_event();
> +#endif
indeed. I fixed it a bit differently in my tree, will upload a new patch
soon.
Ingo
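One common alternative to guarding every call site is to provide empty stubs when the feature is compiled out. The sketch below illustrates that pattern only. It is not taken from Ingo's tree, and the demo_* functions and counters are invented so the effect can be observed outside the kernel.

```c
/* Demo-only counters so the effect of the hooks can be observed. */
static int open_events;
static int close_events;

#ifdef RTC_IRQ
/* Real hooks: only built when the IRQ-driven code is compiled in. */
static void rtc_open_event(void)  { open_events++; }
static void rtc_close_event(void) { close_events++; }
#else
/* No-op stubs: call sites compile and link without any #ifdef. */
static inline void rtc_open_event(void)  { }
static inline void rtc_close_event(void) { }
#endif

/* Stand-ins for the driver's open and release paths. */
int demo_rtc_open(void)
{
    rtc_open_event();
    return 0;
}

int demo_rtc_release(void)
{
    rtc_close_event();
    return 0;
}

/* Total number of hook invocations that actually did something. */
int demo_event_count(void)
{
    return open_events + close_events;
}
```

Built without -DRTC_IRQ, the calls resolve to the stubs and no external symbol is needed, which is the property the depmod warnings show was missing.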
i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes-only release that resolves a couple of bugs that slipped
into -V0.7.25-0:
- lockup/deadlock fix: make debug_direct_keyboard default to 0. It is
only a debug helper to be used for development, it was never intended
to be enabled. This fix should resolve the bugs reported by Gunther
Persoons and Mark H. Johnson.
- fix symbol export problems in rtc.ko, reported by Remi Colinet, based
on the patch from K.R. Foley.
- make preempt_wakeup_timing default to 1 if enabled in the .config, as
originally intended.
to create a -V0.7.25-1 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc1.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc1/2.6.10-rc1-mm3/2.6.10-rc1-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc1-mm3-V0.7.25-1
Ingo
K.R. Foley wrote:
> Remi Colinet wrote:
>
>> Ingo Molnar wrote:
>>
>>> i have released the -V0.7.25-0 Real-Time Preemption patch, which can be
>>> downloaded from the usual place:
>>>
>>> http://redhat.com/~mingo/realtime-preempt/
>>>
>> Hi,
>>
>> I'm getting the following warning with V0.7.25-0
>>
>> INSTALL sound/drivers/opl3/snd-opl3-lib.ko
>> INSTALL sound/drivers/opl3/snd-opl3-synth.ko
>> INSTALL sound/drivers/snd-dummy.ko
>> INSTALL sound/drivers/snd-mtpav.ko
>> INSTALL sound/drivers/snd-serial-u16550.ko
>> INSTALL sound/drivers/snd-virmidi.ko
>> INSTALL sound/pci/snd-sonicvibes.ko
>> if [ -r System.map ]; then /sbin/depmod -ae -F System.map 2.6.10-rc1-mm3-RT-V0.7.25-0; fi
>> WARNING: /lib/modules/2.6.10-rc1-mm3-RT-V0.7.25-0/kernel/drivers/char/rtc.ko needs unknown symbol rtc_close_event
>> WARNING: /lib/modules/2.6.10-rc1-mm3-RT-V0.7.25-0/kernel/drivers/char/rtc.ko needs unknown symbol rtc_open_event
>> [root@tigre01 im]#
>>
>> .config file attached
>>
>> Remi
>
>
> Damn. Here is the patch again. Last one was hosed by wrap. Sorry.
>
> kr
Solved with V0.7.25-1
Thanks
Remi
On Thu, Nov 11, 2004 at 10:51:22PM +0100, Ingo Molnar wrote:
> i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
Patch to get rudimentary kgdb support working.
bill
On Thu, Nov 11, 2004 at 08:08:45PM -0800, Bill Huey wrote:
> Patch to get rudimentary kgdb support working.
Resent with some contamination removed from it.
bill
* Bill Huey <[email protected]> wrote:
> > Patch to get rudimentary kgdb support working.
thanks, the patch looks good. Is this one really needed:
> -static inline unsigned long long cycles_2_ns(unsigned long long cyc)
> +//static inline
> +//#error
> +unsigned long long cycles_2_ns(unsigned long long cyc)
?
Ingo
On Fri, Nov 12, 2004 at 09:39:38AM +0100, Ingo Molnar wrote:
> * Bill Huey <[email protected]> wrote:
> > > Patch to get rudimentary kgdb support working.
>
> thanks, the patch looks good. Is this one really needed:
No, it's not. It's for my timing stuff that's going to be released
as soon as I figure out how to deal with the irq balancing code.
(I'm still learning this as I go along)
I'm a newbie to releasing patches, so scold me when you feel it's
appropriate. :)
bill
On Thu, 2004-11-11 at 16:51, Ingo Molnar wrote:
> i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes-only release that resolves a couple of bugs that slipped
> into -V0.7.25-0:
>
> - lockup/deadlock fix: make debug_direct_keyboard default to 0. It is
> only a debug helper to be used for development, it was never intended
> to be enabled. This fix should resolve the bugs reported by Gunther
> Persoons and Mark H. Johnson.
Ahh, that probably explains the problems I had with it!
V0.7.25-1 has been stable here with the ivtv driver for 11 hrs. No sign
of the ide dma time out issue either. Out of curiosity, do we know what
solved that problem?
Regards,
Shane
>i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
This builds just fine and runs without any serious failures. The RT
performance still has rough edges with several bursts of long
delays in the CPU loop of latencytest. These tests are all on an
SMP system with CONFIG_PREEMPT_RT and full tracing enabled.
I have sent several log files to Ingo separately but the following
summarizes what I am seeing:
[1] major network delays while latencytest is running (ping drops packets
or they get delayed by minutes). I did not see this on some previous tests
where I made more of the /0 and /1 tasks RT. May have to do that again.
[2] display / keyboard / mouse will occasionally freeze or act much more
slowly than on a non PREEMPT_RT kernel.
[3] I no longer see the major delays in events/0 and /1. This particular
live lock appears to be solved. I actually don't see any tasks or
applications taking a lot of CPU time except cpu_burn and latencytest
(both should). This is a little puzzling since with the poor response
to non RT / interactive activities, I would expect to see something
take up the CPU time.
[4] the user latency traces show the RT CPU loop in latencytest being
delayed by a variety of kernel activities. I also note that preempt_count
is zero during several periods of the trace so I'm surprised that we did
not continue to run the RT task (and do this stuff on the other CPU).
These delays have hundreds of entries with over 100 usec delay overall.
[5] I can make the symptoms MUCH worse by simply running my cpu_burn
application as a non realtime application. (no I/O or system calls, just
a simple loop) This is run as a nice'd application so should run only
when nothing else is ready to run.
[6] the latency trace may have some SMP race conditions where the entries
displayed do not match the header. Examples are a 100 usec trace header
followed by 8 entries that last about 4 usec.
[7] both the wakeup and the preempt disable traces do not show any
significant periods of delays. The most I can get out of these is
roughly 100 usec which I believe correlates with disk DMA and the
particular motherboard chip set on the system under test. This
looks really good if [4] can be fixed.
[8] Some samples of /proc/loadavg during my big test showed some
extremely large numbers. For example:
5.07 402.44 0.58 5/120 4448
6.35 195.67 1.63 7/122 4663
5.39 130.82 2.20 18/122 4705
2.10 43.17 3.00 8/122 5912
8.90 8.89 4.70 10/123 7780
8.33 8.52 4.95 6/124 7887
Not quite sure what a 5 minute loadavg of 402 means when I have
only 120 tasks in the system. May be a symptom of some bug in
the load average calculations (and not PREEMPT_RT related) but
not sure.
--Mark
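For anyone collecting similar samples: the columns of /proc/loadavg are the 1-, 5- and 15-minute load averages, then runnable/total scheduling entities, then the most recently created PID. A small sketch of parsing one such line follows. The struct and function names are made up for this example.

```c
#include <stdio.h>

struct loadavg {
    double avg1, avg5, avg15;   /* 1-, 5- and 15-minute load averages */
    int running, total;         /* runnable / total scheduling entities */
    int last_pid;               /* most recently created PID */
};

/* Parse one line in /proc/loadavg format; 0 on success, -1 otherwise. */
int parse_loadavg(const char *line, struct loadavg *out)
{
    int n = sscanf(line, "%lf %lf %lf %d/%d %d",
                   &out->avg1, &out->avg5, &out->avg15,
                   &out->running, &out->total, &out->last_pid);
    return n == 6 ? 0 : -1;
}
```

Fed the first sample above, this yields a 5-minute figure of 402.44 alongside a total of 120, which is the mismatch being pointed out.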
Shane Shrybman wrote:
> On Thu, 2004-11-11 at 16:51, Ingo Molnar wrote:
>
>>i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
>>downloaded from the usual place:
>>
>> http://redhat.com/~mingo/realtime-preempt/
>>
>>this is a fixes-only release that resolves a couple of bugs that slipped
>>into -V0.7.25-0:
>>
>> - lockup/deadlock fix: make debug_direct_keyboard default to 0. It is
>> only a debug helper to be used for development, it was never intended
>> to be enabled. This fix should resolve the bugs reported by Gunther
>> Persoons and Mark H. Johnson.
>
>
> Ahh, that probably explains the problems I had with it!
>
> V0.7.25-1 has been stable here with the ivtv driver for 11 hrs. No sign
> of the ide dma time out issue either. Out of curiosity, do we know what
> solved that problem?
>
> Regards,
>
> Shane
>
What sort of errors did you get about the ide dma timeouts?
kr
On Fri, 2004-11-12 at 12:27, K.R. Foley wrote:
> Shane Shrybman wrote:
> > On Thu, 2004-11-11 at 16:51, Ingo Molnar wrote:
> >
> >>i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
> >>downloaded from the usual place:
> >>
> >> http://redhat.com/~mingo/realtime-preempt/
> >>
> >>this is a fixes-only release that resolves a couple of bugs that slipped
> >>into -V0.7.25-0:
> >>
> >> - lockup/deadlock fix: make debug_direct_keyboard default to 0. It is
> >> only a debug helper to be used for development, it was never intended
> >> to be enabled. This fix should resolve the bugs reported by Gunther
> >> Persoons and Mark H. Johnson.
> >
> >
> > Ahh, that probably explains the problems I had with it!
> >
> > V0.7.25-1 has been stable here with the ivtv driver for 11 hrs. No sign
> > of the ide dma time out issue either. Out of curiosity, do we know what
> > solved that problem?
> >
> > Regards,
> >
> > Shane
> >
>
> What sort of errors did you get about the ide dma timeouts?
>
Typical example of the error message:
kernel: hde: dma_timer_expiry: dma status == 0x24
kernel: ALSA sound/core/pcm_native.c:1424: playback drain error (DMA or IRQ trouble?)
kernel: PDC202XX: Primary channel reset.
kernel: hde: DMA interrupt recovery
kernel: hde: lost interrupt
This was on a Promise TX2 133 ide card with one IDE disk. The problem
would show itself if using the RT patches and APIC. But the problem seems
to have been resolved now.
> kr
>
Regards,
Shane
>Typical example of the error message:
>
>kernel: hde: dma_timer_expiry: dma status == 0x24
>kernel: ALSA sound/core/pcm_native.c:1424: playback drain error (DMA or IRQ trouble?)
>kernel: PDC202XX: Primary channel reset.
>kernel: hde: DMA interrupt recovery
>kernel: hde: lost interrupt
>
>This was on a Promise TX2 133 ide card with one IDE disk. The problem
>would show itself if using the RT patches and APIC. But the problem seems
>to have been resolved now.
I had errors like that one when the IDE IRQ was at a priority less than
the real time task. Since then, I run with all the IRQ's at max RT priority
and will continue to do so until I get a better assessment of what my real
application (not these audio tests...) needs for IRQ priorities.
This may have been fixed as a side effect of Ingo setting the IRQ threads
at RT priorities in the 40's.
--Mark H Johnson
<mailto:[email protected]>
#
# Automatically generated make config: don't edit
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_STANDALONE=y
CONFIG_BROKEN_ON_SMP=y
#
# General setup
#
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
# CONFIG_BSD_PROCESS_ACCT is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
CONFIG_HOTPLUG=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
CONFIG_FUTEX=y
CONFIG_EPOLL=y
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
#
# Loadable module support
#
CONFIG_MODULES=y
CONFIG_MODULE_UNLOAD=y
CONFIG_MODULE_FORCE_UNLOAD=y
CONFIG_OBSOLETE_MODPARM=y
# CONFIG_MODVERSIONS is not set
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
# CONFIG_X86_GENERIC is not set
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_RWSEM_XCHGADD_ALGORITHM=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_HPET_TIMER is not set
# CONFIG_HPET_EMULATE_RTC is not set
# CONFIG_SMP is not set
CONFIG_PREEMPT=y
# CONFIG_X86_UP_APIC is not set
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
CONFIG_X86_MCE_NONFATAL=y
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
# CONFIG_MICROCODE is not set
# CONFIG_X86_MSR is not set
# CONFIG_X86_CPUID is not set
#
# Firmware Drivers
#
# CONFIG_EDD is not set
# CONFIG_SMBIOS is not set
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
CONFIG_BADRAM=y
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
# CONFIG_EFI is not set
CONFIG_HAVE_DEC_LOCK=y
CONFIG_REGPARM=y
#
# Power management options (ACPI, APM)
#
CONFIG_PM=y
CONFIG_SOFTWARE_SUSPEND=y
# CONFIG_PM_DISK is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
CONFIG_ACPI=y
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_INTERPRETER=y
CONFIG_ACPI_SLEEP=y
CONFIG_ACPI_SLEEP_PROC_FS=y
CONFIG_ACPI_AC=y
CONFIG_ACPI_BATTERY=y
CONFIG_ACPI_BUTTON=y
CONFIG_ACPI_FAN=y
CONFIG_ACPI_PROCESSOR=y
CONFIG_ACPI_THERMAL=y
# CONFIG_ACPI_ASUS is not set
# CONFIG_ACPI_TOSHIBA is not set
# CONFIG_ACPI_DEBUG is not set
CONFIG_ACPI_BUS=y
CONFIG_ACPI_EC=y
CONFIG_ACPI_POWER=y
CONFIG_ACPI_PCI=y
CONFIG_ACPI_SYSTEM=y
# CONFIG_X86_PM_TIMER is not set
# CONFIG_AMD76X_PM is not set
CONFIG_ACPI_INITRD=y
#
# APM (Advanced Power Management) BIOS Support
#
# CONFIG_APM is not set
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
CONFIG_HZ_1000=y
# CONFIG_HZ_512 is not set
# CONFIG_HZ_100 is not set
CONFIG_HZ=1000
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
CONFIG_PCI_MMCONFIG=y
CONFIG_PCI_LEGACY_PROC=y
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
#
# PCMCIA/CardBus support
#
CONFIG_PCMCIA=m
# CONFIG_PCMCIA_DEBUG is not set
CONFIG_YENTA=m
CONFIG_CARDBUS=y
# CONFIG_I82092 is not set
# CONFIG_I82365 is not set
# CONFIG_TCIC is not set
CONFIG_PCMCIA_PROBE=y
#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=y
CONFIG_BINFMT_MISC=y
#
# Device Drivers
#
#
# Generic Driver Options
#
# CONFIG_FW_LOADER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
CONFIG_PARPORT=y
CONFIG_PARPORT_PC=y
CONFIG_PARPORT_PC_CML1=y
# CONFIG_PARPORT_SERIAL is not set
# CONFIG_PARPORT_PC_FIFO is not set
# CONFIG_PARPORT_PC_SUPERIO is not set
# CONFIG_PARPORT_PC_PCMCIA is not set
# CONFIG_PARPORT_OTHER is not set
# CONFIG_PARPORT_1284 is not set
#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set
#
# Protocols
#
# CONFIG_ISAPNP is not set
# CONFIG_PNPBIOS is not set
#
# Block devices
#
CONFIG_BLK_DEV_FD=y
# CONFIG_BLK_DEV_XD is not set
# CONFIG_PARIDE is not set
# CONFIG_BLK_CPQ_DA is not set
# CONFIG_BLK_CPQ_CISS_DA is not set
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=y
CONFIG_BLK_DEV_CRYPTOLOOP=y
# CONFIG_BLK_DEV_NBD is not set
# CONFIG_BLK_DEV_CARMEL is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=4096
CONFIG_BLK_DEV_INITRD=y
CONFIG_LBD=y
# CONFIG_CDROM_PKTCDVD is not set
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
# CONFIG_IDEDISK_STROKE is not set
# CONFIG_BLK_DEV_IDECS is not set
CONFIG_BLK_DEV_IDECD=y
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
CONFIG_IDE_TASKFILE_IO=y
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_CMD640=y
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_ADMA=y
# CONFIG_BLK_DEV_AEC62XX is not set
# CONFIG_BLK_DEV_ALI15X3 is not set
# CONFIG_BLK_DEV_AMD74XX is not set
# CONFIG_BLK_DEV_ATIIXP is not set
# CONFIG_BLK_DEV_CMD64X is not set
# CONFIG_BLK_DEV_TRIFLEX is not set
# CONFIG_BLK_DEV_CY82C693 is not set
# CONFIG_BLK_DEV_CS5520 is not set
# CONFIG_BLK_DEV_CS5530 is not set
# CONFIG_BLK_DEV_HPT34X is not set
# CONFIG_BLK_DEV_HPT366 is not set
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NS87415 is not set
# CONFIG_BLK_DEV_PDC202XX_OLD is not set
# CONFIG_BLK_DEV_PDC202XX_NEW is not set
# CONFIG_BLK_DEV_SVWKS is not set
# CONFIG_BLK_DEV_SIIMAGE is not set
# CONFIG_BLK_DEV_SIS5513 is not set
# CONFIG_BLK_DEV_SLC90E66 is not set
# CONFIG_BLK_DEV_TRM290 is not set
# CONFIG_BLK_DEV_VIA82CXXX is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=y
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=y
# CONFIG_CHR_DEV_ST is not set
# CONFIG_CHR_DEV_OSST is not set
# CONFIG_BLK_DEV_SR is not set
CONFIG_CHR_DEV_SG=y
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_REPORT_LUNS=y
# CONFIG_SCSI_CONSTANTS is not set
# CONFIG_SCSI_LOGGING is not set
#
# SCSI Transport Attributes
#
# CONFIG_SCSI_SPI_ATTRS is not set
# CONFIG_SCSI_FC_ATTRS is not set
#
# SCSI low-level drivers
#
# CONFIG_BLK_DEV_3W_XXXX_RAID is not set
# CONFIG_SCSI_7000FASST is not set
# CONFIG_SCSI_ACARD is not set
# CONFIG_SCSI_AHA152X is not set
# CONFIG_SCSI_AHA1542 is not set
# CONFIG_SCSI_AACRAID is not set
# CONFIG_SCSI_AIC7XXX is not set
# CONFIG_SCSI_AIC7XXX_OLD is not set
# CONFIG_SCSI_AIC79XX is not set
# CONFIG_SCSI_DPT_I2O is not set
# CONFIG_SCSI_ADVANSYS is not set
# CONFIG_SCSI_IN2000 is not set
# CONFIG_SCSI_MEGARAID is not set
CONFIG_SCSI_SATA=y
# CONFIG_SCSI_SATA_SVW is not set
CONFIG_SCSI_ATA_PIIX=y
# CONFIG_SCSI_ATA_ITE is not set
# CONFIG_SCSI_SATA_PROMISE is not set
# CONFIG_SCSI_SATA_SX4 is not set
# CONFIG_SCSI_SATA_SIL is not set
# CONFIG_SCSI_SATA_SIS is not set
# CONFIG_SCSI_SATA_VIA is not set
# CONFIG_SCSI_SATA_VITESSE is not set
# CONFIG_SCSI_BUSLOGIC is not set
# CONFIG_SCSI_CPQFCTS is not set
# CONFIG_SCSI_DMX3191D is not set
# CONFIG_SCSI_DTC3280 is not set
# CONFIG_SCSI_EATA is not set
# CONFIG_SCSI_EATA_PIO is not set
# CONFIG_SCSI_FUTURE_DOMAIN is not set
# CONFIG_SCSI_GDTH is not set
# CONFIG_SCSI_GENERIC_NCR5380 is not set
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_IPS is not set
# CONFIG_SCSI_INIA100 is not set
# CONFIG_SCSI_PPA is not set
# CONFIG_SCSI_IMM is not set
# CONFIG_SCSI_NCR53C406A is not set
# CONFIG_SCSI_SYM53C8XX_2 is not set
# CONFIG_SCSI_IPR is not set
# CONFIG_SCSI_PAS16 is not set
# CONFIG_SCSI_PSI240I is not set
# CONFIG_SCSI_QLOGIC_FAS is not set
# CONFIG_SCSI_QLOGIC_ISP is not set
# CONFIG_SCSI_QLOGIC_FC is not set
# CONFIG_SCSI_QLOGIC_1280 is not set
CONFIG_SCSI_QLA2XXX=y
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
# CONFIG_SCSI_SYM53C416 is not set
# CONFIG_SCSI_DC395x is not set
# CONFIG_SCSI_DC390T is not set
# CONFIG_SCSI_T128 is not set
# CONFIG_SCSI_U14_34F is not set
# CONFIG_SCSI_ULTRASTOR is not set
# CONFIG_SCSI_NSP32 is not set
# CONFIG_SCSI_DEBUG is not set
#
# PCMCIA SCSI adapter support
#
# CONFIG_PCMCIA_AHA152X is not set
# CONFIG_PCMCIA_FDOMAIN is not set
# CONFIG_PCMCIA_NINJA_SCSI is not set
# CONFIG_PCMCIA_QLOGIC is not set
# CONFIG_PCMCIA_SYM53C500 is not set
#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set
#
# Multi-device support (RAID and LVM)
#
# CONFIG_MD is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
CONFIG_IEEE1394=y
#
# Subsystem Options
#
# CONFIG_IEEE1394_VERBOSEDEBUG is not set
# CONFIG_IEEE1394_OUI_DB is not set
# CONFIG_IEEE1394_EXTRA_CONFIG_ROMS is not set
#
# Device Drivers
#
#
# Texas Instruments PCILynx requires I2C
#
CONFIG_IEEE1394_OHCI1394=y
#
# Protocol Drivers
#
# CONFIG_IEEE1394_VIDEO1394 is not set
# CONFIG_IEEE1394_SBP2 is not set
# CONFIG_IEEE1394_ETH1394 is not set
# CONFIG_IEEE1394_DV1394 is not set
CONFIG_IEEE1394_RAWIO=y
# CONFIG_IEEE1394_CMP is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
# CONFIG_PACKET_MMAP is not set
# CONFIG_NETLINK_DEV is not set
CONFIG_UNIX=y
# CONFIG_NET_KEY is not set
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
# CONFIG_IP_ADVANCED_ROUTER is not set
# CONFIG_IP_PNP is not set
# CONFIG_NET_IPIP is not set
# CONFIG_NET_IPGRE is not set
# CONFIG_IP_MROUTE is not set
# CONFIG_ARPD is not set
# CONFIG_SYN_COOKIES is not set
# CONFIG_INET_AH is not set
# CONFIG_INET_ESP is not set
# CONFIG_INET_IPCOMP is not set
#
# IP: Virtual Server Configuration
#
# CONFIG_IP_VS is not set
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=y
# CONFIG_IP_NF_FTP is not set
# CONFIG_IP_NF_IRC is not set
# CONFIG_IP_NF_TFTP is not set
# CONFIG_IP_NF_AMANDA is not set
CONFIG_IP_NF_QUEUE=y
CONFIG_IP_NF_IPTABLES=y
CONFIG_IP_NF_MATCH_LIMIT=y
CONFIG_IP_NF_MATCH_IPRANGE=y
CONFIG_IP_NF_MATCH_MAC=y
# CONFIG_IP_NF_MATCH_LAYER7 is not set
# CONFIG_IP_NF_MATCH_CHILDLEVEL is not set
CONFIG_IP_NF_MATCH_PKTTYPE=y
CONFIG_IP_NF_MATCH_MARK=y
CONFIG_IP_NF_MATCH_MULTIPORT=y
CONFIG_IP_NF_MATCH_TOS=y
CONFIG_IP_NF_MATCH_RECENT=y
CONFIG_IP_NF_MATCH_ECN=y
CONFIG_IP_NF_MATCH_DSCP=y
CONFIG_IP_NF_MATCH_AH_ESP=y
CONFIG_IP_NF_MATCH_LENGTH=y
CONFIG_IP_NF_MATCH_TTL=y
CONFIG_IP_NF_MATCH_TCPMSS=y
CONFIG_IP_NF_MATCH_HELPER=y
CONFIG_IP_NF_MATCH_STATE=y
CONFIG_IP_NF_MATCH_CONNTRACK=y
CONFIG_IP_NF_MATCH_OWNER=y
CONFIG_IP_NF_FILTER=y
CONFIG_IP_NF_TARGET_REJECT=y
CONFIG_IP_NF_NAT=y
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=y
CONFIG_IP_NF_TARGET_REDIRECT=y
CONFIG_IP_NF_TARGET_NETMAP=y
CONFIG_IP_NF_TARGET_SAME=y
# CONFIG_IP_NF_NAT_LOCAL is not set
# CONFIG_IP_NF_NAT_SNMP_BASIC is not set
CONFIG_IP_NF_MANGLE=y
CONFIG_IP_NF_TARGET_TOS=y
CONFIG_IP_NF_TARGET_ECN=y
CONFIG_IP_NF_TARGET_DSCP=y
CONFIG_IP_NF_TARGET_MARK=y
CONFIG_IP_NF_TARGET_CLASSIFY=y
CONFIG_IP_NF_TARGET_LOG=y
CONFIG_IP_NF_TARGET_ULOG=y
CONFIG_IP_NF_TARGET_TCPMSS=y
CONFIG_IP_NF_ARPTABLES=y
CONFIG_IP_NF_ARPFILTER=y
CONFIG_IP_NF_ARP_MANGLE=y
# CONFIG_IP_NF_RAW is not set
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
# CONFIG_ATM is not set
# CONFIG_BRIDGE is not set
# CONFIG_VLAN_8021Q is not set
# CONFIG_DECNET is not set
# CONFIG_LLC2 is not set
# CONFIG_IPX is not set
# CONFIG_ATALK is not set
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
# CONFIG_NET_DIVERT is not set
# CONFIG_ECONET is not set
# CONFIG_WAN_ROUTER is not set
# CONFIG_NET_FASTROUTE is not set
# CONFIG_NET_HW_FLOWCONTROL is not set
#
# QoS and/or fair queueing
#
# CONFIG_NET_SCHED is not set
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
# CONFIG_NETPOLL is not set
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
# CONFIG_NET_POLL_CONTROLLER is not set
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
# CONFIG_TUN is not set
# CONFIG_NET_SB1000 is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=y
# CONFIG_HAPPYMEAL is not set
# CONFIG_SUNGEM is not set
# CONFIG_NET_VENDOR_3COM is not set
# CONFIG_LANCE is not set
# CONFIG_NET_VENDOR_SMC is not set
# CONFIG_NET_VENDOR_RACAL is not set
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
# CONFIG_AT1700 is not set
# CONFIG_DEPCA is not set
# CONFIG_HP100 is not set
# CONFIG_NET_ISA is not set
CONFIG_NET_PCI=y
# CONFIG_PCNET32 is not set
# CONFIG_AMD8111_ETH is not set
# CONFIG_ADAPTEC_STARFIRE is not set
# CONFIG_AC3200 is not set
# CONFIG_APRICOT is not set
# CONFIG_B44 is not set
# CONFIG_FORCEDETH is not set
# CONFIG_CS89x0 is not set
# CONFIG_DGRS is not set
# CONFIG_EEPRO100 is not set
# CONFIG_E100 is not set
# CONFIG_FEALNX is not set
# CONFIG_NATSEMI is not set
# CONFIG_NE2K_PCI is not set
# CONFIG_8139CP is not set
CONFIG_8139TOO=y
CONFIG_8139TOO_PIO=y
# CONFIG_8139TOO_TUNE_TWISTER is not set
# CONFIG_8139TOO_8129 is not set
# CONFIG_8139_OLD_RX_RESET is not set
# CONFIG_SIS900 is not set
# CONFIG_EPIC100 is not set
# CONFIG_SUNDANCE is not set
# CONFIG_TLAN is not set
# CONFIG_VIA_RHINE is not set
# CONFIG_NET_POCKET is not set
#
# Ethernet (1000 Mbit)
#
# CONFIG_ACENIC is not set
# CONFIG_DL2K is not set
# CONFIG_E1000 is not set
# CONFIG_NS83820 is not set
# CONFIG_HAMACHI is not set
# CONFIG_YELLOWFIN is not set
# CONFIG_R8169 is not set
# CONFIG_SK98LIN is not set
# CONFIG_TIGON3 is not set
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
# CONFIG_TR is not set
#
# Wireless LAN (non-hamradio)
#
CONFIG_NET_RADIO=y
#
# Obsolete Wireless cards support (pre-802.11)
#
# CONFIG_STRIP is not set
# CONFIG_ARLAN is not set
# CONFIG_WAVELAN is not set
# CONFIG_PCMCIA_WAVELAN is not set
# CONFIG_PCMCIA_NETWAVE is not set
#
# Wireless 802.11 Frequency Hopping cards support
#
# CONFIG_PCMCIA_RAYCS is not set
#
# Wireless 802.11b ISA/PCI cards support
#
# CONFIG_AIRO is not set
# CONFIG_HERMES is not set
# CONFIG_ATMEL is not set
#
# Wireless 802.11b Pcmcia/Cardbus cards support
#
CONFIG_AIRO_CS=m
# CONFIG_PCMCIA_WL3501 is not set
#
# Prism GT/Duette 802.11(a/b/g) PCI/Cardbus support
#
# CONFIG_PRISM54 is not set
CONFIG_NET_WIRELESS=y
#
# PCMCIA network device support
#
# CONFIG_NET_PCMCIA is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
# CONFIG_PLIP is not set
# CONFIG_PPP is not set
# CONFIG_SLIP is not set
# CONFIG_NET_FC is not set
# CONFIG_SHAPER is not set
# CONFIG_NETCONSOLE is not set
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
CONFIG_INPUT_MOUSEDEV_PSAUX=y
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
# CONFIG_INPUT_JOYDEV is not set
# CONFIG_INPUT_TSDEV is not set
# CONFIG_INPUT_EVDEV is not set
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
# CONFIG_SERIO_SERPORT is not set
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PARKBD is not set
# CONFIG_SERIO_PCIPS2 is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
# CONFIG_INPUT_MISC is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_NR_TTY_DEVICES=63
CONFIG_HW_CONSOLE=y
# CONFIG_SERIAL_NONSTANDARD is not set
#
# Serial drivers
#
CONFIG_SERIAL_8250=y
# CONFIG_SERIAL_8250_CONSOLE is not set
# CONFIG_SERIAL_8250_CS is not set
# CONFIG_SERIAL_8250_ACPI is not set
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=y
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
CONFIG_PRINTER=y
# CONFIG_LP_CONSOLE is not set
# CONFIG_PPDEV is not set
# CONFIG_TIPAR is not set
# CONFIG_QIC02_TAPE is not set
#
# IPMI
#
# CONFIG_IPMI_HANDLER is not set
#
# Watchdog Cards
#
# CONFIG_WATCHDOG is not set
# CONFIG_HW_RANDOM is not set
# CONFIG_NVRAM is not set
# CONFIG_RTC is not set
# CONFIG_GEN_RTC is not set
# CONFIG_DTLK is not set
# CONFIG_R3964 is not set
# CONFIG_APPLICOM is not set
# CONFIG_SONYPI is not set
#
# Ftape, the floppy tape device driver
#
# CONFIG_FTAPE is not set
CONFIG_AGP=y
# CONFIG_AGP_ALI is not set
# CONFIG_AGP_ATI is not set
# CONFIG_AGP_AMD is not set
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=y
# CONFIG_AGP_INTEL_MCH is not set
# CONFIG_AGP_NVIDIA is not set
# CONFIG_AGP_SIS is not set
# CONFIG_AGP_SWORKS is not set
# CONFIG_AGP_VIA is not set
# CONFIG_AGP_EFFICEON is not set
CONFIG_DRM=y
# CONFIG_DRM_TDFX is not set
# CONFIG_DRM_GAMMA is not set
# CONFIG_DRM_R128 is not set
CONFIG_DRM_RADEON=y
# CONFIG_DRM_I810 is not set
# CONFIG_DRM_I830 is not set
# CONFIG_DRM_MGA is not set
# CONFIG_DRM_SIS is not set
# CONFIG_DRM_VIA is not set
#
# PCMCIA character devices
#
# CONFIG_SYNCLINK_CS is not set
# CONFIG_MWAVE is not set
# CONFIG_RAW_DRIVER is not set
CONFIG_HPET=y
# CONFIG_HPET_RTC_IRQ is not set
# CONFIG_HPET_NOMMAP is not set
# CONFIG_HANGCHECK_TIMER is not set
#
# I2C support
#
# CONFIG_I2C is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
CONFIG_FB=y
# CONFIG_FB_PM2 is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
# CONFIG_FB_VGA16 is not set
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
# CONFIG_FB_HGA is not set
# CONFIG_FB_RIVA is not set
# CONFIG_FB_CLE266 is not set
# CONFIG_FB_I810 is not set
# CONFIG_FB_MATROX is not set
# CONFIG_FB_RADEON_OLD is not set
# CONFIG_FB_RADEON is not set
# CONFIG_FB_ATY128 is not set
# CONFIG_FB_ATY is not set
# CONFIG_FB_SIS is not set
# CONFIG_FB_NEOMAGIC is not set
# CONFIG_FB_KYRO is not set
# CONFIG_FB_3DFX is not set
# CONFIG_FB_VOODOO1 is not set
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_VIRTUAL is not set
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
# CONFIG_MDA_CONSOLE is not set
CONFIG_DUMMY_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE is not set
#
# Logo configuration
#
CONFIG_LOGO=y
CONFIG_LOGO_WALKEN=y
CONFIG_LOGO_LINUX_MONO=y
CONFIG_LOGO_LINUX_VGA16=y
CONFIG_LOGO_LINUX_CLUT224=y
#
# Bootsplash configuration
#
#
# Sound
#
CONFIG_SOUND=y
#
# Advanced Linux Sound Architecture
#
CONFIG_SND=y
CONFIG_SND_TIMER=y
CONFIG_SND_PCM=y
CONFIG_SND_RAWMIDI=y
CONFIG_SND_SEQUENCER=y
# CONFIG_SND_SEQ_DUMMY is not set
CONFIG_SND_OSSEMUL=y
CONFIG_SND_MIXER_OSS=y
CONFIG_SND_PCM_OSS=y
CONFIG_SND_SEQUENCER_OSS=y
# CONFIG_SND_VERBOSE_PRINTK is not set
# CONFIG_SND_DEBUG is not set
#
# Generic devices
#
CONFIG_SND_MPU401_UART=y
# CONFIG_SND_DUMMY is not set
# CONFIG_SND_VIRMIDI is not set
# CONFIG_SND_MTPAV is not set
# CONFIG_SND_SERIAL_U16550 is not set
# CONFIG_SND_MPU401 is not set
#
# ISA devices
#
# CONFIG_SND_AD1848 is not set
# CONFIG_SND_CS4231 is not set
# CONFIG_SND_CS4232 is not set
# CONFIG_SND_CS4236 is not set
# CONFIG_SND_ES1688 is not set
# CONFIG_SND_ES18XX is not set
# CONFIG_SND_GUSCLASSIC is not set
# CONFIG_SND_GUSEXTREME is not set
# CONFIG_SND_GUSMAX is not set
# CONFIG_SND_INTERWAVE is not set
# CONFIG_SND_INTERWAVE_STB is not set
# CONFIG_SND_OPTI92X_AD1848 is not set
# CONFIG_SND_OPTI92X_CS4231 is not set
# CONFIG_SND_OPTI93X is not set
# CONFIG_SND_SB8 is not set
# CONFIG_SND_SB16 is not set
# CONFIG_SND_SBAWE is not set
# CONFIG_SND_WAVEFRONT is not set
# CONFIG_SND_CMI8330 is not set
# CONFIG_SND_OPL3SA2 is not set
# CONFIG_SND_SGALAXY is not set
# CONFIG_SND_SSCAPE is not set
#
# PCI devices
#
CONFIG_SND_AC97_CODEC=y
# CONFIG_SND_ALI5451 is not set
# CONFIG_SND_ATIIXP is not set
# CONFIG_SND_AU8810 is not set
# CONFIG_SND_AU8820 is not set
# CONFIG_SND_AU8830 is not set
# CONFIG_SND_AZT3328 is not set
# CONFIG_SND_BT87X is not set
# CONFIG_SND_CS46XX is not set
# CONFIG_SND_CS4281 is not set
# CONFIG_SND_EMU10K1 is not set
# CONFIG_SND_KORG1212 is not set
# CONFIG_SND_MIXART is not set
# CONFIG_SND_NM256 is not set
# CONFIG_SND_RME32 is not set
# CONFIG_SND_RME96 is not set
# CONFIG_SND_RME9652 is not set
# CONFIG_SND_HDSP is not set
# CONFIG_SND_TRIDENT is not set
# CONFIG_SND_YMFPCI is not set
# CONFIG_SND_ALS4000 is not set
# CONFIG_SND_CMIPCI is not set
# CONFIG_SND_ENS1370 is not set
# CONFIG_SND_ENS1371 is not set
# CONFIG_SND_ES1938 is not set
# CONFIG_SND_ES1968 is not set
# CONFIG_SND_MAESTRO3 is not set
# CONFIG_SND_FM801 is not set
# CONFIG_SND_ICE1712 is not set
# CONFIG_SND_ICE1724 is not set
CONFIG_SND_INTEL8X0=y
# CONFIG_SND_INTEL8X0M is not set
# CONFIG_SND_SONICVIBES is not set
# CONFIG_SND_VIA82XX is not set
# CONFIG_SND_VX222 is not set
#
# ALSA USB devices
#
# CONFIG_SND_USB_AUDIO is not set
#
# PCMCIA devices
#
# CONFIG_SND_VXPOCKET is not set
# CONFIG_SND_VXP440 is not set
# CONFIG_SND_PDAUDIOCF is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
CONFIG_USB=y
# CONFIG_USB_DEBUG is not set
#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_BANDWIDTH is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
#
# USB Host Controller Drivers
#
CONFIG_USB_EHCI_HCD=y
# CONFIG_USB_EHCI_SPLIT_ISO is not set
# CONFIG_USB_EHCI_ROOT_HUB_TT is not set
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=y
#
# USB Device Class drivers
#
# CONFIG_USB_AUDIO is not set
# CONFIG_USB_BLUETOOTH_TTY is not set
# CONFIG_USB_MIDI is not set
# CONFIG_USB_ACM is not set
CONFIG_USB_PRINTER=y
CONFIG_USB_STORAGE=y
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_DATAFAB is not set
# CONFIG_USB_STORAGE_FREECOM is not set
# CONFIG_USB_STORAGE_ISD200 is not set
# CONFIG_USB_STORAGE_DPCM is not set
# CONFIG_USB_STORAGE_HP8200e is not set
# CONFIG_USB_STORAGE_SDDR09 is not set
# CONFIG_USB_STORAGE_SDDR55 is not set
# CONFIG_USB_STORAGE_JUMPSHOT is not set
#
# USB Human Interface Devices (HID)
#
CONFIG_USB_HID=y
CONFIG_USB_HIDINPUT=y
# CONFIG_HID_FF is not set
# CONFIG_USB_HIDDEV is not set
# CONFIG_USB_AIPTEK is not set
# CONFIG_USB_WACOM is not set
# CONFIG_USB_KBTAB is not set
# CONFIG_USB_POWERMATE is not set
# CONFIG_USB_MTOUCH is not set
# CONFIG_USB_EGALAX is not set
# CONFIG_USB_XPAD is not set
# CONFIG_USB_ATI_REMOTE is not set
#
# USB Imaging devices
#
# CONFIG_USB_MDC800 is not set
# CONFIG_USB_MICROTEK is not set
# CONFIG_USB_HPUSBSCSI is not set
#
# USB Multimedia devices
#
# CONFIG_USB_DABUSB is not set
#
# Video4Linux support is needed for USB Multimedia device support
#
#
# USB Network adaptors
#
# CONFIG_USB_CATC is not set
# CONFIG_USB_KAWETH is not set
# CONFIG_USB_PEGASUS is not set
# CONFIG_USB_RTL8150 is not set
# CONFIG_USB_USBNET is not set
#
# USB port drivers
#
# CONFIG_USB_USS720 is not set
#
# USB Serial Converter support
#
# CONFIG_USB_SERIAL is not set
#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
# CONFIG_USB_TIGL is not set
# CONFIG_USB_AUERSWALD is not set
# CONFIG_USB_RIO500 is not set
# CONFIG_USB_LEGOTOWER is not set
# CONFIG_USB_LCD is not set
# CONFIG_USB_LED is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGETSERVO is not set
# CONFIG_USB_TEST is not set
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=y
CONFIG_EXT3_FS_XATTR=y
# CONFIG_EXT3_FS_POSIX_ACL is not set
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=y
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
CONFIG_REISER4_FS=y
# CONFIG_REISER4_FS_SYSCALL is not set
CONFIG_REISER4_LARGE_KEY=y
# CONFIG_REISER4_CHECK is not set
CONFIG_REISER4_USE_EFLUSH=y
# CONFIG_REISER4_COPY_ON_CAPTURE is not set
# CONFIG_REISER4_BADBLOCKS is not set
CONFIG_REISERFS_FS=y
# CONFIG_REISERFS_CHECK is not set
# CONFIG_REISERFS_PROC_INFO is not set
# CONFIG_REISERFS_FS_XATTR is not set
# CONFIG_JFS_FS is not set
# CONFIG_XFS_FS is not set
# CONFIG_MINIX_FS is not set
# CONFIG_ROMFS_FS is not set
# CONFIG_QUOTA is not set
# CONFIG_AUTOFS_FS is not set
CONFIG_AUTOFS4_FS=y
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
# CONFIG_ZISOFS is not set
CONFIG_UDF_FS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=y
CONFIG_MSDOS_FS=y
CONFIG_VFAT_FS=y
CONFIG_NTFS_FS=m
# CONFIG_NTFS_DEBUG is not set
# CONFIG_NTFS_RW is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
CONFIG_DEVFS_FS=y
CONFIG_DEVFS_MOUNT=y
# CONFIG_DEVFS_DEBUG is not set
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
# CONFIG_SUPERMOUNT is not set
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
# CONFIG_CRAMFS is not set
# CONFIG_SQUASHFS is not set
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
# CONFIG_SYSV_FS is not set
# CONFIG_UFS_FS is not set
# CONFIG_LUFS_FS is not set
#
# Network File Systems
#
CONFIG_NFS_FS=y
# CONFIG_NFS_V3 is not set
# CONFIG_NFS_V4 is not set
# CONFIG_NFS_DIRECTIO is not set
CONFIG_NFSD=y
# CONFIG_NFSD_V3 is not set
# CONFIG_NFSD_TCP is not set
CONFIG_LOCKD=y
CONFIG_EXPORTFS=y
CONFIG_SUNRPC=y
# CONFIG_RPCSEC_GSS_KRB5 is not set
CONFIG_SMB_FS=m
# CONFIG_SMB_NLS_DEFAULT is not set
# CONFIG_CIFS is not set
# CONFIG_NCP_FS is not set
# CONFIG_CODA_FS is not set
# CONFIG_AFS_FS is not set
#
# Partition Types
#
# CONFIG_PARTITION_ADVANCED is not set
CONFIG_MSDOS_PARTITION=y
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=y
# CONFIG_NLS_CODEPAGE_737 is not set
# CONFIG_NLS_CODEPAGE_775 is not set
# CONFIG_NLS_CODEPAGE_850 is not set
# CONFIG_NLS_CODEPAGE_852 is not set
# CONFIG_NLS_CODEPAGE_855 is not set
# CONFIG_NLS_CODEPAGE_857 is not set
# CONFIG_NLS_CODEPAGE_860 is not set
# CONFIG_NLS_CODEPAGE_861 is not set
# CONFIG_NLS_CODEPAGE_862 is not set
# CONFIG_NLS_CODEPAGE_863 is not set
# CONFIG_NLS_CODEPAGE_864 is not set
# CONFIG_NLS_CODEPAGE_865 is not set
# CONFIG_NLS_CODEPAGE_866 is not set
# CONFIG_NLS_CODEPAGE_869 is not set
# CONFIG_NLS_CODEPAGE_936 is not set
# CONFIG_NLS_CODEPAGE_950 is not set
# CONFIG_NLS_CODEPAGE_932 is not set
# CONFIG_NLS_CODEPAGE_949 is not set
# CONFIG_NLS_CODEPAGE_874 is not set
# CONFIG_NLS_ISO8859_8 is not set
# CONFIG_NLS_CODEPAGE_1250 is not set
# CONFIG_NLS_CODEPAGE_1251 is not set
CONFIG_NLS_ISO8859_1=y
# CONFIG_NLS_ISO8859_2 is not set
# CONFIG_NLS_ISO8859_3 is not set
# CONFIG_NLS_ISO8859_4 is not set
# CONFIG_NLS_ISO8859_5 is not set
# CONFIG_NLS_ISO8859_6 is not set
# CONFIG_NLS_ISO8859_7 is not set
# CONFIG_NLS_ISO8859_9 is not set
# CONFIG_NLS_ISO8859_13 is not set
# CONFIG_NLS_ISO8859_14 is not set
# CONFIG_NLS_ISO8859_15 is not set
# CONFIG_NLS_KOI8_R is not set
# CONFIG_NLS_KOI8_U is not set
# CONFIG_NLS_UTF8 is not set
#
# Profiling support
#
CONFIG_PROFILING=y
CONFIG_OPROFILE=y
#
# NeTraverse Win4Lin Support
#
# CONFIG_MKI is not set
#
# Kernel hacking
#
# CONFIG_DEBUG_KERNEL is not set
CONFIG_EARLY_PRINTK=y
CONFIG_DEBUG_SPINLOCK_SLEEP=y
# CONFIG_KGDB_MORE is not set
# CONFIG_FRAME_POINTER is not set
# CONFIG_4KSTACKS is not set
CONFIG_SCHEDSTATS=y
#
# Security options
#
# CONFIG_SECURITY is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
# CONFIG_CRYPTO_HMAC is not set
# CONFIG_CRYPTO_NULL is not set
# CONFIG_CRYPTO_MD4 is not set
# CONFIG_CRYPTO_MD5 is not set
# CONFIG_CRYPTO_SHA1 is not set
# CONFIG_CRYPTO_SHA256 is not set
# CONFIG_CRYPTO_SHA512 is not set
# CONFIG_CRYPTO_DES is not set
# CONFIG_CRYPTO_BLOWFISH is not set
# CONFIG_CRYPTO_TWOFISH is not set
# CONFIG_CRYPTO_SERPENT is not set
# CONFIG_CRYPTO_AES is not set
# CONFIG_CRYPTO_CAST5 is not set
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_DEFLATE is not set
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC32=y
# CONFIG_LIBCRC32C is not set
# CONFIG_QSORT is not set
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_STD_RESOURCES=y
CONFIG_PC=y
* Shane Shrybman <[email protected]> wrote:
> V0.7.25-1 has been stable here with the ivtv driver for 11 hrs. No
> sign of the ide dma time out issue either. Out of curiosity, do we
> know what solved that problem?
could you try the attached patch - does it trigger the DMA timeouts
again? There were 3 changes to the IOAPIC code that could have affected
your dma-timeout problem, this patch reverts all of them.
Mark's suggestion sounds quite plausible too - but the question is, your
timeout problems went away previously by tweaking io_apic.c, so it would
be nice to see that they are still gone even with the old 'broken'
io_apic.c logic. (none of the io_apic.c changes fixes any particular
bug, they are only latency optimizations, so i'd be surprised if they
really impacted your timeout problems.)
if the DMA timeouts are still gone even with this patch applied then i
think it's safe to conclude that Mark's explanation is the correct one,
and that it was starvation of the SCHED_OTHER IDE irq-thread that caused
the timeouts: it _really_ was a timeout. (a workaround would be to make
the timeout longer)
Ingo
--- linux/arch/i386/kernel/io_apic.c.orig2
+++ linux/arch/i386/kernel/io_apic.c
@@ -150,7 +150,7 @@ static void update_io_apic_cache(unsigne
}
}
-#define IOAPIC_CACHE
+// #define IOAPIC_CACHE
/*
* Some systems need a POST flush or else level-triggered interrupts
* generate lots of spurious interrupts due to the POST-ed write not
@@ -188,7 +188,7 @@ static void __modify_IO_APIC_irq (unsign
/*
* Force POST flush by reading:
*/
- reg = *(IO_APIC_BASE(entry->apic)+4);
+ reg = io_apic_read(entry->apic, 0x10 + pin*2);
#endif
if (!entry->next)
break;
@@ -1940,7 +1940,7 @@ static unsigned int startup_level_ioapic
* unacked local APIC is dangerous on SMP as it can prevent the
* delivery of IPIs and can thus cause deadlocks.)
*/
-#if defined(CONFIG_PREEMPT_HARDIRQS) && defined(CONFIG_SMP)
+#if defined(CONFIG_PREEMPT_HARDIRQS)
static void mask_and_ack_level_ioapic_irq(unsigned int irq)
{
* Gunther Persoons <[email protected]> wrote:
> I can't use my pcmcia wireless network card with this version; I can
> use it with V0.7.25-0. dhcpcd and ifconfig lock up when I try to use
> them. config attached.
extremely weird - there simply was no change between -0 and -1 that
could have affected it. If you do this on the -1 kernel:
    echo 0 > /proc/sys/kernel/preempt_wakeup_timing
    echo 1 > /proc/sys/kernel/debug_direct_keyboard
then you'll get precisely the -0 kernel, bit for bit. (plus the symbol
export fix in rtc.ko, which should have zero relevance to your setup.)
[note that debug_direct_keyboard is dangerous.]
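For anyone repeating this comparison, here is a minimal shell sketch that records the old value of each knob before overwriting it, so the settings can be put back afterwards. The set_knob helper is hypothetical (it is not part of the patch), and the two /proc files are assumed to exist only on kernels carrying this patch:

```shell
# set_knob FILE VALUE: hypothetical helper, not part of the patch.
# Prints the knob's previous value, then writes the new one.
set_knob() {
    knob=$1
    value=$2
    if [ ! -w "$knob" ]; then
        echo "cannot write $knob" >&2
        return 1
    fi
    cat "$knob"
    echo "$value" > "$knob"
}

# Usage (as root, on a -1 kernel):
#   old_timing=$(set_knob /proc/sys/kernel/preempt_wakeup_timing 0)
#   old_kbd=$(set_knob /proc/sys/kernel/debug_direct_keyboard 1)
#   ... run the test, then restore:
#   set_knob /proc/sys/kernel/preempt_wakeup_timing "$old_timing" >/dev/null
#   set_knob /proc/sys/kernel/debug_direct_keyboard "$old_kbd" >/dev/null
```

Restoring debug_direct_keyboard afterwards matters more than usual here, given the warning above.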
so i believe the explanation has to be something else:
 - are you sure the build is correct?
 - are you sure it still works with the -0 kernel, maybe the bug is
   transient?
Ingo
On Fri, 2004-11-12 at 13:23, [email protected] wrote:
> >Typical example of the error message:
> >
> >kernel: hde: dma_timer_expiry: dma status == 0x24
> >kernel: ALSA sound/core/pcm_native.c:1424: playback drain error (DMA or IRQ trouble?)
> >kernel: PDC202XX: Primary channel reset.
> >kernel: hde: DMA interrupt recovery
> >kernel: hde: lost interrupt
> >
> >This was on a Promise TX2 133 ide card with one IDE disk. The problem
> >would show itself if using the RT patches and APIC. But the problem seems
> >to have been resolved now.
>
> I had errors like that one when the IDE IRQ was at a priority less than
> the real time task. Since then, I run with all the IRQ's at max RT priority
> and will continue to do so until I get a better assessment of what my real
> application (not these audio tests...) needs for IRQ priorities.
>
Ok, I wasn't comparing apples to apples. I forgot I had to remove the sb live
card from this machine a few days ago. So the hardware config wasn't exactly
the same. I have reinstalled the sb live card now and I am retesting on 0.7.25.
The sb live shares an irq with ide2 (the Promise card):
           CPU0
  0:     835791  IO-APIC-edge   timer  0/35791
  1:       2207  IO-APIC-edge   i8042  1/2207
  8:          4  IO-APIC-edge   rtc  0/4
  9:          0  IO-APIC-level  acpi  0/0
 15:         11  IO-APIC-edge   ide1  1/9
 16:      70527  IO-APIC-level  ide2, EMU10K1  0/70527
 17:       1093  IO-APIC-level  eth0  0/1092
 18:      37440  IO-APIC-level  bttv0, Bt87x audio  173/37439
 19:      39147  IO-APIC-level  aic7xxx, ivtv0  340/39143
 21:      25091  IO-APIC-level  uhci_hcd, uhci_hcd, uhci_hcd, uhci_hcd  52/25091
 22:      85425  IO-APIC-level  VIA8237  494/71991
NMI:     836519
LOC:     836340
ERR:          0
MIS:          1
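The lines of interest are the ones that name more than one device. A minimal sketch for picking them out, assuming the /proc/interrupts layout shown above, where devices sharing a line are separated by commas (shared_irqs is a hypothetical helper name):

```shell
# shared_irqs [FILE]: print the numbers of IRQ lines that list more than
# one device. Relies on device names being comma-separated; reads
# /proc/interrupts when no file is given.
shared_irqs() {
    awk '/^ *[0-9]+:/ && /,/ { sub(/:$/, "", $1); print $1 }' \
        "${1:-/proc/interrupts}"
}
```

Run against the listing above, this should report IRQs 16, 18, 19 and 21, one per line.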
With the sb live card back in use the system has hung once with the sound looping.
It hung after I started playing a video in a second instance of the mythfrontend
application. I had the nmi_watchdog on and netconsole logging to another machine
but there was nothing in the logs.
I have rebooted and I am trying to verify the dma_timer_expiry issue is still gone
with the sb live in use.
> This may have been fixed as a side effect of Ingo setting the IRQ threads
> at RT priorities in the 40's.
>
I had originally thought this might be the cause as well, so I jacked up
the ide2 priority, but it didn't help. However, it does share the irq, so
maybe that is a factor here.
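For reference, a minimal sketch of how an IRQ thread's priority can be changed from the shell with chrt from util-linux. The helper names are hypothetical, the PID of the IRQ thread has to be looked up first (for example with ps; how the threads are named depends on the patch version), and 1..99 is the SCHED_FIFO priority range on Linux:

```shell
# valid_rt_prio N: succeed if N is an integer in the SCHED_FIFO range 1..99.
valid_rt_prio() {
    case "$1" in
        ''|*[!0-9]*) return 1 ;;
    esac
    [ "$1" -ge 1 ] && [ "$1" -le 99 ]
}

# set_irq_thread_prio PID PRIO: hypothetical wrapper around chrt.
# Switches the given thread to SCHED_FIFO at priority PRIO (needs root).
set_irq_thread_prio() {
    pid=$1
    prio=$2
    if ! valid_rt_prio "$prio"; then
        echo "bad priority: $prio" >&2
        return 1
    fi
    chrt -f -p "$prio" "$pid"
}

# Usage (as root), with 1234 standing in for the IRQ thread's PID:
#   set_irq_thread_prio 1234 50
#   chrt -p 1234        # show the resulting policy and priority
```

Because the thread here serves a shared interrupt line, raising its priority affects both devices on that line.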
> --Mark H Johnson
> <mailto:[email protected]>
>
shane
On Fri, 2004-11-12 at 15:13, Ingo Molnar wrote:
> * Shane Shrybman <[email protected]> wrote:
>
> > V0.7.25-1 has been stable here with the ivtv driver for 11 hrs. No
> > sign of the ide dma time out issue either. Out of curiosity, do we
> > know what solved that problem?
>
> could you try the attached patch - does it trigger the DMA timeouts
> again? There were 3 changes to the IOAPIC code that could have affected
> your dma-timeout problem, this patch reverts all of them.
>
Yes, it does trigger the DMA timeouts again. Just the addition of the
CONFIG_SMP dep is enough to trigger it, which isn't surprising since
the hack was to put #if 0 there, wasn't it?
> Mark's suggestion sounds quite plausible too - but the question is, your
> timeout problems went away previously by tweaking io_apic.c, so it would
> be nice to see that they are still gone even with the old 'broken'
> io_apic.c logic. (none of the io_apic.c changes fixes any particular
> bug, they are only latency optimizations, so i'd be surprised if they
> really impacted your timeout problems.)
>
> if the DMA timeouts are still gone even with this patch applied then i
> think it's safe to conclude that Mark's explanation is the correct one,
> and that it was starvation of the SCHED_OTHER IDE irq-thread that caused
> the timeouts: it _really_ was a timeout. (a workaround would be to make
> the timeout longer)
>
> Ingo
>
> --- linux/arch/i386/kernel/io_apic.c.orig2
> +++ linux/arch/i386/kernel/io_apic.c
> @@ -150,7 +150,7 @@ static void update_io_apic_cache(unsigne
> }
> }
>
> -#define IOAPIC_CACHE
> +// #define IOAPIC_CACHE
> /*
> * Some systems need a POST flush or else level-triggered interrupts
> * generate lots of spurious interrupts due to the POST-ed write not
> @@ -188,7 +188,7 @@ static void __modify_IO_APIC_irq (unsign
> /*
> * Force POST flush by reading:
> */
> - reg = *(IO_APIC_BASE(entry->apic)+4);
> + reg = io_apic_read(entry->apic, 0x10 + pin*2);
> #endif
> if (!entry->next)
> break;
> @@ -1940,7 +1940,7 @@ static unsigned int startup_level_ioapic
> * unacked local APIC is dangerous on SMP as it can prevent the
> * delivery of IPIs and can thus cause deadlocks.)
> */
> -#if defined(CONFIG_PREEMPT_HARDIRQS) && defined(CONFIG_SMP)
> +#if defined(CONFIG_PREEMPT_HARDIRQS)
>
> static void mask_and_ack_level_ioapic_irq(unsigned int irq)
> {
>
shane
On Fri, 2004-11-12 at 15:13, Ingo Molnar wrote:
> * Shane Shrybman <[email protected]> wrote:
>
> > V0.7.25-1 has been stable here with the ivtv driver for 11 hrs. No
> > sign of the ide dma time out issue either. Out of curiosity, do we
> > know what solved that problem?
>
> could you try the attached patch - does it trigger the DMA timeouts
> again? There were 3 changes to the IOAPIC code that could have affected
> your dma-timeout problem, this patch reverts all of them.
>
Ok, V0.7.25-1 seems to have resolved the DMA timeout problem.
I don't know how useful it is but this patch also seems to have resolved
that problem.
--- linux-2.6.10-rc1mm3-RT3/arch/i386/kernel/io_apic.c 2004-11-11 16:41:37.000000000 -0500
+++ linux-2.6.10-rc1mm3-RT3.T5/arch/i386/kernel/io_apic.c 2004-11-12 17:54:31.000000000 -0500
@@ -156,7 +156,7 @@
* generate lots of spurious interrupts due to the POST-ed write not
* reaching the IOAPIC before the IRQ is ACK-ed in the local APIC.
*/
-#define IOAPIC_POSTFLUSH
+//#define IOAPIC_POSTFLUSH
static void __modify_IO_APIC_irq (unsigned int irq, unsigned long enable, unsigned long disable)
{
@@ -1940,7 +1940,7 @@
* unacked local APIC is dangerous on SMP as it can prevent the
* delivery of IPIs and can thus cause deadlocks.)
*/
-#if defined(CONFIG_PREEMPT_HARDIRQS) && defined(CONFIG_SMP)
+#if defined(CONFIG_PREEMPT_HARDIRQS)
static void mask_and_ack_level_ioapic_irq(unsigned int irq)
{
Regards,
Shane
Ingo Molnar wrote:
>* Gunther Persoons <[email protected]> wrote:
>
>
>
>>I cant use my pcmcia wireless network card with this version, i can
>>use it with V0.7.25-0. dhcpcd and ifconfig lock when i try to use
>>them. config attached.
>>
>>
>
>extremely weird - there simply was no change between -0 and -1 that
>could have affected it. If you do this on the -1 kernel:
>
> echo 0 > /proc/sys/kernel/preempt_wakeup_timing
> echo 1 > /proc/sys/kernel/debug_direct_keyboard
>
>then you'll get precisely the -0 kernel, bit for bit. (plus the symbol
>export fix in rtc.ko, which should have zero relevance to your setup.)
>
>[note that debug_direct_keyboard is dangerous.]
>
>so i believe the explanation has to be something else:
>
> - are you sure the build is correct?
>
> - are you sure it still works with the -0 kernel, maybe the bug is
> transient?
>
> Ingo
>
>
>
Removing every software update i did between 25.0 and 25.1 resolved the
problem; i think something was wrong with my gentoo init scripts,
although 25.0 was working fine with the software updates. I am now going
to reinstall the updates one by one to see which one caused it.
Ingo Molnar wrote:
>* Gunther Persoons <[email protected]> wrote:
>
>
>
>>I cant use my pcmcia wireless network card with this version, i can
>>use it with V0.7.25-0. dhcpcd and ifconfig lock when i try to use
>>them. config attached.
>>
>>
>
>extremely weird - there simply was no change between -0 and -1 that
>could have affected it. If you do this on the -1 kernel:
>
> echo 0 > /proc/sys/kernel/preempt_wakeup_timing
> echo 1 > /proc/sys/kernel/debug_direct_keyboard
>
>then you'll get precisely the -0 kernel, bit for bit. (plus the symbol
>export fix in rtc.ko, which should have zero relevance to your setup.)
>
>[note that debug_direct_keyboard is dangerous.]
>
>so i believe the explanation has to be something else:
>
> - are you sure the build is correct?
>
> - are you sure it still works with the -0 kernel, maybe the bug is
> transient?
>
> Ingo
>
>
>
As i thought, the init scripts were my problem. But i have another question.
I recently started to use NFS. With the mainline kernel cpu usage is
100%, and when i look in top, si shows between 40 and 60% cpu usage. With
your kernel si is 0%, but ksoftirqd/0 shows around 38% cpu usage and
total cpu usage is around 52%. Is this normal? On my server cpu usage is
2%, but it uses an intel network card. My laptop is using a wireless
pcmcia card (cisco).
Ingo Molnar wrote:
>* Gunther Persoons <[email protected]> wrote:
>
>
>
>>I cant use my pcmcia wireless network card with this version, i can
>>use it with V0.7.25-0. dhcpcd and ifconfig lock when i try to use
>>them. config attached.
>>
>>
>
>extremely weird - there simply was no change between -0 and -1 that
>could have affected it. If you do this on the -1 kernel:
>
> echo 0 > /proc/sys/kernel/preempt_wakeup_timing
> echo 1 > /proc/sys/kernel/debug_direct_keyboard
>
>then you'll get precisely the -0 kernel, bit for bit. (plus the symbol
>export fix in rtc.ko, which should have zero relevance to your setup.)
>
>[note that debug_direct_keyboard is dangerous.]
>
>so i believe the explanation has to be something else:
>
> - are you sure the build is correct?
>
> - are you sure it still works with the -0 kernel, maybe the bug is
> transient?
>
> Ingo
>
>
>
this bug i got with .26
wget:12388 BUG: lock held at task exit time!
[c03ec764] {kernel_sem.lock}
.. held by: wget:12388 [c87d2680, 116]
... acquired at: __lock_text_start+0x2c/0x63
wget/12388: BUG in __up_mutex at kernel/rt.c:1076
[<c01395b0>] __up_mutex+0x2a3/0x509 (8)
[<c037f3b0>] __sched_text_start+0x508/0x64b (36)
[<c013a637>] up+0xef/0x104 (24)
[<c037f3b0>] __sched_text_start+0x508/0x64b (12)
[<c037f3b0>] __sched_text_start+0x508/0x64b (20)
[<c012480d>] do_exit+0x2d8/0x515 (8)
[<c0138126>] printk_lock+0x7f/0xc1 (4)
[<c0381136>] __lock_text_start+0x2c/0x63 (36)
[<c012480d>] do_exit+0x2d8/0x515 (32)
[<c012ea47>] get_signal_to_deliver+0x21e/0x379 (16)
[<c0124ab8>] do_group_exit+0x3f/0xcc (28)
[<c012ea47>] get_signal_to_deliver+0x21e/0x379 (8)
[<c012ea73>] get_signal_to_deliver+0x24a/0x379 (24)
[<c0105f88>] do_signal+0xa4/0x174 (44)
[<c014725b>] free_hot_page+0x20/0x24 (112)
[<c0177541>] poll_freewait+0x38/0x40 (12)
[<c0178254>] sys_poll+0x18b/0x21f (16)
[<c0177549>] __pollwait+0x0/0xc6 (36)
[<c010608d>] do_notify_resume+0x35/0x38 (24)
[<c010620e>] work_notifysig+0x13/0x15 (8)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c037eef2>] .... __sched_text_start+0x4a/0x64b
.....[<00000000>] .. ( <= 0x0)
.. [<c013a5f1>] .... up+0xa9/0x104
.....[<00000000>] .. ( <= 0x0)
.. [<c0139665>] .... __up_mutex+0x358/0x509
.....[<00000000>] .. ( <= 0x0)
.. [<c013b1fe>] .... print_traces+0x14/0x44
.....[<00000000>] .. ( <= 0x0)
* Gunther Persoons <[email protected]> wrote:
> this bug i got with .26
> wget:12388 BUG: lock held at task exit time!
> [c03ec764] {kernel_sem.lock}
> .. held by: wget:12388 [c87d2680, 116]
> ... acquired at: __lock_text_start+0x2c/0x63
i've uploaded .26-1 which has special BKL-debugging code added, which
will (hopefully) pinpoint where the BKL count leaked. (Karsten had
similar problems, with NFS.)
so, could you try .26-1 from the usual place:
http://redhat.com/~mingo/realtime-preempt/
and make sure you still have CONFIG_RT_DEADLOCK_DETECT enabled. When
this warning message hits next time around it should print some more
info about the place that last acquired the BKL.
Ingo
On Thu, 11 Nov 2004 22:51:22 +0100
Ingo Molnar <[email protected]> wrote:
> i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
Hi,
i just built and booted into 26-3 (w/o debugging stuff) and put a little
load on the system (find /'s plus kernel compile plus rtc_wakeup -f 8192).
Got this on the console:
`IRQ 8` [14] is being piggy. need_resched=0, cpu=0
and the machine locked. will build with debugging and try to reproduce.
flo
* Gunther Persoons <[email protected]> wrote:
> As i thought, the init scripts were my problem. But i have another
> question. I recently started to use NFS. With the mainline kernel
> cpu usage is 100%, and when i look in top, si shows between 40 and 60%
> cpu usage. With your kernel si is 0%, but ksoftirqd/0 shows around 38%
> cpu usage and total cpu usage is around 52%. Is this normal? On my
> server cpu usage is 2%, but it uses an intel network card. My laptop is
> using a wireless pcmcia card (cisco).
normally the RT kernel has higher system overhead (all IRQ traffic goes
to separate thread contexts, involving context-switching, etc.) so a
_reduction_ in system overhead looks a bit strange. Is there a
difference in performance?
Ingo
Ingo Molnar wrote:
>* Gunther Persoons <[email protected]> wrote:
>
>
>
>>As i thought, the init scripts were my problem. But i have another
>>question. I recently started to use NFS. With the mainline kernel
>>cpu usage is 100%, and when i look in top, si shows between 40 and 60%
>>cpu usage. With your kernel si is 0%, but ksoftirqd/0 shows around 38%
>>cpu usage and total cpu usage is around 52%. Is this normal? On my
>>server cpu usage is 2%, but it uses an intel network card. My laptop is
>>using a wireless pcmcia card (cisco).
>>
>>
>
>normally the RT kernel has higher system overhead (all IRQ traffic goes
>to separate thread contexts, involving context-switching, etc.) so a
>_reduction_ in system overhead looks a bit strange. Is there a
>difference in performance?
>
> Ingo
>
>
>
With the mainline kernel i get speeds around 600-700kb/s and with the RT
kernel i get speeds around 550kb/s. No other differences except the cpu
usage and that the RT kernel feels much more responsive.
Florian Schmidt wrote:
> On Thu, 11 Nov 2004 22:51:22 +0100
> Ingo Molnar <[email protected]> wrote:
>
>
>>i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
>>downloaded from the usual place:
>>
>> http://redhat.com/~mingo/realtime-preempt/
>
>
> Hi,
>
> i just build and booted into 26-3 (w/o debugging stuff) and put a little
> load on the system (find /'s plus kernel compile plus rtc_wakeup -f 8192).
> Got this on the console:
>
> `IRQ 8` [14] is being piggy. need_resched=0, cpu=0
>
> and the machine locked. will build with debugging and try to reproduce.
>
> flo
>
Did you get any other messages in the log? This message is harmless as
far as the machine locking up is concerned. It gets printed from rtc when a
read of /dev/rtc is missed before another interrupt arrives.
kr
On Sun, 14 Nov 2004 07:26:46 -0600
"K.R. Foley" <[email protected]> wrote:
> > `IRQ 8` [14] is being piggy. need_resched=0, cpu=0
> >
> > and the machine locked. will build with debugging and try to reproduce.
> >
> > flo
> >
>
> Did you get any other messages in the log? This message is harmless as
> far as the machine locking. This gets printed from rtc when a read of
> /dev/rtc is missed before another interrupt arrives.
I see. I have rebuilt and run the kernel with debugging, but it seems the
console dump is pretty useless when an ncurses app is running on the active
console (1 line at the bottom showed that there was more output, but i
couldn't see it). Will rerun and try to reproduce again without any ncurses
stuff running :)
flo
* Shane Shrybman <[email protected]> wrote:
> -#define IOAPIC_POSTFLUSH
> +//#define IOAPIC_POSTFLUSH
> -#if defined(CONFIG_PREEMPT_HARDIRQS) && defined(CONFIG_SMP)
> +#if defined(CONFIG_PREEMPT_HARDIRQS)
unfortunately the POST-flush is still needed. Without it i can see lots
of spurious interrupts on SMP systems. (most likely caused by the ACK
reaching the IO-APIC _before_ the mask-the-irq PCI-space write [which
gets delayed in the chipset due to write optimizations], so the IO-APIC
still thinks that the IRQ is enabled and for level-triggered IRQs this
means that another interrupt is sent to the CPU.)
Ingo
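To make the ordering argument above concrete, here is a toy model in Python. It is only a sketch: the class and function names (ToyIoApic, mask_and_ack) are made up for illustration and none of this is kernel code. It assumes a chipset that queues ("posts") the mask write, and that a read from the same device drains the queue before it returns.

```python
from collections import deque


class ToyIoApic:
    """Toy model of one IO-APIC pin behind a chipset that posts writes."""

    def __init__(self):
        self.masked = False        # mask state the IO-APIC actually sees
        self.line_asserted = True  # level-triggered device still asserting
        self.posted = deque()      # mask writes still buffered in the chipset
        self.spurious = 0          # interrupts re-sent right after an ACK

    def write_mask(self, value):
        # The write is only queued; the IO-APIC has not seen it yet.
        self.posted.append(value)

    def read_back(self):
        # Assumption: a read cannot pass earlier writes, so it drains them.
        while self.posted:
            self.masked = self.posted.popleft()
        return self.masked

    def ack_local_apic(self):
        # On ACK, an unmasked and still-asserted level IRQ is delivered again.
        if self.line_asserted and not self.masked:
            self.spurious += 1


def mask_and_ack(apic, post_flush):
    """Mask the pin, optionally force the POST flush, then ACK."""
    apic.write_mask(True)
    if post_flush:
        apic.read_back()
    apic.ack_local_apic()
    return apic.spurious
```

Without the read-back the ACK overtakes the queued mask write and the model counts one spurious interrupt; with it, none.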
K.R. Foley wrote:
> Florian Schmidt wrote:
>
>> On Thu, 11 Nov 2004 22:51:22 +0100
>> Ingo Molnar <[email protected]> wrote:
>>
>>
>>> i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
>>> downloaded from the usual place:
>>>
>>> http://redhat.com/~mingo/realtime-preempt/
>>
>>
>>
>> Hi,
>>
>> i just build and booted into 26-3 (w/o debugging stuff) and put a little
>> load on the system (find /'s plus kernel compile plus rtc_wakeup -f
>> 8192).
>> Got this on the console:
>>
>> `IRQ 8` [14] is being piggy. need_resched=0, cpu=0
>>
>> and the machine locked. will build with debugging and try to reproduce.
>>
>> flo
>>
>
> Did you get any other messages in the log? This message is harmless as
> far as the machine locking. This gets printed from rtc when a read of
> /dev/rtc is missed before another interrupt arrives.
>
> kr
Actually this message should probably be removed, because the only
process that will ever show up as being a piggy anymore will be 'IRQ
8', right?
kr
On Sun, 14 Nov 2004 07:26:46 -0600
"K.R. Foley" <[email protected]> wrote:
> Did you get any other messages in the log? This message is harmless as
> far as the machine locking. This gets printed from rtc when a read of
> /dev/rtc is missed before another interrupt arrives.
Arr, this time it just locked silently. sys-rq keys were still available
but didn't produce any console output (sys-rq-b still rebooted the machine
though :))
I suppose this doesn't really make sense w/o a serial console. I will get a
second machine next friday. Then i can hopefully provide more useful info
than "hey, my machine locked up"..
flo
* Florian Schmidt <[email protected]> wrote:
> i just build and booted into 26-3 (w/o debugging stuff) and put a
> little load on the system (find /'s plus kernel compile plus
> rtc_wakeup -f 8192). Got this on the console:
>
> `IRQ 8` [14] is being piggy. need_resched=0, cpu=0
>
> and the machine locked. will build with debugging and try to
> reproduce.
hm, i tried and couldnt reproduce this, so i'm curious what your
debugging build yields.
Ingo
On Sun, 14 Nov 2004 15:15:51 +0100
Ingo Molnar <[email protected]> wrote:
> > i just build and booted into 26-3 (w/o debugging stuff) and put a
> > little load on the system (find /'s plus kernel compile plus
> > rtc_wakeup -f 8192). Got this on the console:
> >
> > `IRQ 8` [14] is being piggy. need_resched=0, cpu=0
> >
> > and the machine locked. will build with debugging and try to
> > reproduce.
>
> hm, i tried and couldnt reproduce this, so i'm curious what your
> debugging build yields.
not much, sadly. I tried booting into it once more and had to wait quite a
while (around 30 minutes) until the lock. I got this around 10 minutes before
the lock though:
Nov 15 00:09:23 mango kernel: bug in rtc_read(): called in state S_IDLE!
The system locked up quietly again. no console dump. sys rq kept working (i
could sync, remount ro and reboot). Does sys rq offer diagnostics which would
be useful for you?
flo
Florian Schmidt wrote:
> On Sun, 14 Nov 2004 15:15:51 +0100
> Ingo Molnar <[email protected]> wrote:
>
>
>>>i just build and booted into 26-3 (w/o debugging stuff) and put a
>>>little load on the system (find /'s plus kernel compile plus
>>>rtc_wakeup -f 8192). Got this on the console:
>>>
>>>`IRQ 8` [14] is being piggy. need_resched=0, cpu=0
>>>
>>>and the machine locked. will build with debugging and try to
>>>reproduce.
>>
>>hm, i tried and couldnt reproduce this, so i'm curious what your
>>debugging build yields.
>
>
> not mch sadly. I tried booting into it once more and had to wait quite a
> while (around 30minutes) until the lock. I got this around 10 minutes before
> the lock though:
>
> Nov 15 00:09:23 mango kernel: bug in rtc_read(): called in state S_IDLE!
Still don't think this has anything to do with the lock. This message is
usually produced by reading the rtc with a program that is running at a
higher priority than 'IRQ 8'. Did you chrt the 'IRQ 8' thread? Make sure
the reader priority is at least 1 less than the handler.
>
> The system locked up quitly again. no console dump. sys rq kept working (i
> could sync, remount ro and reboot). Does sys rq offer diagnosis which would
> be useful for you?
Possibly 't' for trace?
kr
>
> flo
>
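The priority rule above (reader at least 1 below the 'IRQ 8' handler) can be sketched as a small helper. This is an illustration only: reader_priority and apply_to_self are hypothetical names, SCHED_FIFO is an assumed policy, and the handler priority has to be looked up on the actual system (98 is used below merely as an example value).

```python
import os


def reader_priority(handler_prio, margin=1):
    """Return an RT priority at least `margin` below the IRQ handler's."""
    if margin < 1:
        raise ValueError("margin must be at least 1")
    prio = handler_prio - margin
    if prio < 1:
        raise ValueError("handler priority too low to leave room for a reader")
    return prio


def apply_to_self(handler_prio):
    """Switch the calling process to SCHED_FIFO just below the handler."""
    # Linux only, and normally requires root or CAP_SYS_NICE.
    prio = reader_priority(handler_prio)
    os.sched_setscheduler(0, os.SCHED_FIFO, os.sched_param(prio))
    return prio
```

With a handler at priority 98, reader_priority(98) gives 97 for the /dev/rtc reader.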
Ingo Molnar wrote:
>
> i have released the -V0.7.25-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
Hi,
I've been running RT-0.7.26-3 already on both of my machines (P4/UP laptop
and P4/SMP-HT desktop) and I must say that overall stability seems to be
good.
However I still have some pending complaints ;) These are the ones that
are troubling my confidence:
1) Almost every time, the P4/SMP box locks up while unloading the ALSA
modules, e.g. on shutdown. This has been an issue for quite some time on the
latest RT patches, not exclusive to RT-V0.7.26-3. Probably it
started with the merge into -mm3, but I'm not sure.
One thing to note is that, when the nmi_watchdog=1 boot parameter is
set, this lockup behavior seems to be avoided.
This isn't quite an issue on my other P4/UP (laptop), but it segfaults
sometimes too, while rmmod'ing the alsa modules. It doesn't lock up
though, and the corresponding trace dump can be pasted from syslog (see
attachment). Unfortunately this is the only cross-evidence I could gather,
and I hope it gives a clue.
2) Serial console (or netconsole, if that matters) aren't showing
anything relevant for debugging; SysRq-T is just silent, only printing a
"Show State" one liner. No traces, no dumps.
3) USB hotplugging is not working as it should on my P4/UP laptop
(ohci_hcd), although it seems to work on the P4/SMP-HT desktop
(uhci_hcd). USB devices are recognized only if already plugged in at
boot/init time; a device plugged in later doesn't get listed by 'lsusb',
but a single 'wakeup' message shows _once_, and only once, in
syslog/dmesg.
Unplugging and/or plugging it back in again gives you nothing, not even
that 'wakeup' message. As reported a few days ago, this really seems to
have been introduced by -mm3 (and is still an issue on -mm4, FWIW).
I'm just asking for hints here, as one of the main uses of the RT kernel
on my laptop is driving a Tascam US-224 USB Audio/MIDI controller
interface, which is USB 1.1 based. I have been quite successful with it,
at least up to (and including) -mm2-RT-V0.7.11.
OK. Just some last resort questions: are there any plans (or a recipe) for
merging the RT patch(es) against the 2.6.10(-rc1) vanilla kernel? Or, at
least for my laptop's sake, on top of this late and "well" behaved -mm2 ?
Hope someone knows it better ;)
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> 2) Serial console (or netconsole, if that matters) aren't showing
> anything relevant for debugging; SysRq-T is just silent, only printing
> a "Show State" one liner. No traces, no dumps.
next time around could you try SysRq-D first?
> OK. Just some last resort questions: is there any plans (or recipe) on
> merging the RT patch(es) against the 2.6.10(-rc1) vanilla kernel? Or,
> at least for my laptop's sake, on top of this late and "well" behaved
> -mm2 ?
there should be an -rc2-mm1 kernel out within the next day or two, at
which point i'll merge. (-rc1-mm5 has some problems.)
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> 1) Almost every time, the P4/SMP box locks up while unloading the ALSA
> modules, e.g. on shutdown. This has been an issue for quite some time on
> the latest RT patches, not exclusive to RT-V0.7.26-3. Probably it
> started with the merge into -mm3, but I'm not sure.
hm, the syslog you sent suggests that it's the 2.6.10-rc1-mm3-RT-V0.7.24
kernel that crashed:
Nov 11 12:39:46 lambda kernel: EFLAGS: 00010083 (2.6.10-rc1-mm3-RT-V0.7.24)
not -V0.7.26-3. The particular rmmod crash you got:
Nov 11 12:39:46 lambda kernel: [<c013b72c>] kmem_cache_free+0x4a/0xc7 (8)
Nov 11 12:39:46 lambda kernel: [kobject_cleanup+142/144] kobject_cleanup+0x8e/0x90 (12)
Nov 11 12:39:46 lambda kernel: [<c01b0f08>] kobject_cleanup+0x8e/0x90 (12)
seems to be quite related to one of the fixes that -V0.7.25 includes:
- added upstream fix for kobject related crash, pointed out by Shane
Shrybman.
so ... unless you got similar crashes with -V0.7.25 or later kernels
(but no syslog traces), please try the latest one (-V0.7.26-4), does
that one crash in rmmod too?
Ingo
On Sun, 14 Nov 2004 15:15:51 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > i just build and booted into 26-3 (w/o debugging stuff) and put a
> > little load on the system (find /'s plus kernel compile plus
> > rtc_wakeup -f 8192). Got this on the console:
> >
> > `IRQ 8` [14] is being piggy. need_resched=0, cpu=0
> >
> > and the machine locked. will build with debugging and try to
> > reproduce.
>
> hm, i tried and couldnt reproduce this, so i'm curious what your
> debugging build yields.
Ok, i found time to boot into it once more. Now i'm pretty certain that the
rtc triggers the lock. This time the kernel ran fine again for ca. 30
minutes, and then it locked right at the moment of spitting one of the rtc
being piggy messages to the console (i get about 1 of those per minute or
so, so it still might have been coincidence, but i was busy typing at the
moment, and right when the piggy message appeared the keyboard stopped working).
The sys-rq-t didn't help much as i only have 50 lines on my vga console.
The only thing i got to see was a list of held locks. I wrote down the
unique ones:
lock               held by       acquired at
atomic_read        bash          read_char
gendev_rel_sem     init          init_hwif_data
serio_lock         IRQ 1         serio_interrupt
&mm->mmap_sem      rtc_wakeup    do_page_fault
sysrq_key_table    IRQ 1         __handle_sysrq
Btw: i do have access to another machine on the internet, but i connect to
the net via ppp0, thus netconsole won't help, right? Would it maybe be
feasible to add some sort of netconsole support which just dumps printk's
over any net interface to any IP, at the price of not being able to catch
very early printk's (i'm probably talking out of my ass here. you'll set me
straight :))
Flo
.config attached
And FYI: some latency traces from before the lock:
preemption latency trace v1.0.7 on 2.6.10-rc1-mm3-RT-V0.7.26-4-NORT
-------------------------------------------------------
latency: 985 us, entries: 19 (19) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:49
-----------------
=> started at: try_to_wake_up+0x5a/0x110 <c01148da>
=> ended at: finish_task_switch+0x51/0xc0 <c0114dc1>
=======>
5 80000000 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): (50) ((98))
5 80000000 0.000ms (+0.000ms): (2) ((5))
5 80000000 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80000000 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000000 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000000 0.000ms (+0.983ms): __do_softirq (do_softirq)
5 00000000 0.983ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 80000000 0.984ms (+0.000ms): __schedule (preempt_schedule)
5 80000000 0.984ms (+0.000ms): profile_hit (__schedule)
5 80000000 0.984ms (+0.000ms): sched_clock (__schedule)
2 80000000 0.984ms (+0.000ms): __switch_to (__schedule)
2 80000000 0.984ms (+0.000ms): (5) ((2))
2 80000000 0.984ms (+0.000ms): (98) ((50))
2 80000000 0.985ms (+0.000ms): finish_task_switch (__schedule)
2 80000000 0.985ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000000 0.985ms (+0.003ms): (2) ((50))
2 80000000 0.989ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
preemption latency trace v1.0.7 on 2.6.10-rc1-mm3-RT-V0.7.26-4-NORT
-------------------------------------------------------
latency: 1035 us, entries: 19 (19) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:49
-----------------
=> started at: try_to_wake_up+0x5a/0x110 <c01148da>
=> ended at: finish_task_switch+0x51/0xc0 <c0114dc1>
=======>
5 80000000 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): (50) ((98))
5 80000000 0.000ms (+0.000ms): (2) ((5))
5 80000000 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80000000 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000000 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000000 0.000ms (+1.033ms): __do_softirq (do_softirq)
5 00000000 1.033ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 80000000 1.034ms (+0.000ms): __schedule (preempt_schedule)
5 80000000 1.034ms (+0.000ms): profile_hit (__schedule)
5 80000000 1.034ms (+0.000ms): sched_clock (__schedule)
2 80000000 1.034ms (+0.000ms): __switch_to (__schedule)
2 80000000 1.034ms (+0.000ms): (5) ((2))
2 80000000 1.035ms (+0.000ms): (98) ((50))
2 80000000 1.035ms (+0.000ms): finish_task_switch (__schedule)
2 80000000 1.035ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000000 1.035ms (+0.003ms): (2) ((50))
2 80000000 1.038ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
preemption latency trace v1.0.7 on 2.6.10-rc1-mm3-RT-V0.7.26-4-NORT
-------------------------------------------------------
latency: 1048 us, entries: 19 (19) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:49
-----------------
=> started at: try_to_wake_up+0x5a/0x110 <c01148da>
=> ended at: finish_task_switch+0x51/0xc0 <c0114dc1>
=======>
5 80000000 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): (50) ((98))
5 80000000 0.000ms (+0.000ms): (2) ((5))
5 80000000 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80000000 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000000 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000000 0.000ms (+1.046ms): __do_softirq (do_softirq)
5 00000000 1.046ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 80000000 1.047ms (+0.000ms): __schedule (preempt_schedule)
5 80000000 1.047ms (+0.000ms): profile_hit (__schedule)
5 80000000 1.047ms (+0.000ms): sched_clock (__schedule)
2 80000000 1.047ms (+0.000ms): __switch_to (__schedule)
2 80000000 1.047ms (+0.000ms): (5) ((2))
2 80000000 1.047ms (+0.000ms): (98) ((50))
2 80000000 1.048ms (+0.000ms): finish_task_switch (__schedule)
2 80000000 1.048ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000000 1.048ms (+0.002ms): (2) ((50))
2 80000000 1.050ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
preemption latency trace v1.0.7 on 2.6.10-rc1-mm3-RT-V0.7.26-4-NORT
-------------------------------------------------------
latency: 56 us, entries: 19 (19) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 8/14, uid:0 nice:-10 policy:1 rt_prio:98
-----------------
=> started at: try_to_wake_up+0x5a/0x110 <c01148da>
=> ended at: finish_task_switch+0x51/0xc0 <c0114dc1>
=======>
5 80000000 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): (1) ((98))
5 80000000 0.000ms (+0.000ms): (14) ((5))
5 80000000 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80000000 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80000000 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000000 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000000 0.000ms (+0.054ms): __do_softirq (do_softirq)
5 00000000 0.055ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 80000000 0.055ms (+0.000ms): __schedule (preempt_schedule)
5 80000000 0.055ms (+0.000ms): profile_hit (__schedule)
5 80000000 0.055ms (+0.000ms): sched_clock (__schedule)
14 80000000 0.055ms (+0.000ms): __switch_to (__schedule)
14 80000000 0.056ms (+0.000ms): (5) ((14))
14 80000000 0.056ms (+0.000ms): (98) ((1))
14 80000000 0.056ms (+0.000ms): finish_task_switch (__schedule)
14 80000000 0.056ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
14 80000000 0.056ms (+0.003ms): (14) ((1))
14 80000000 0.059ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
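For picking the interesting entry out of traces like the ones above, a small parser helps. The following is a rough sketch, not a tool from the patch: the regular expression is inferred from the trace lines in this mail, the name largest_gap is made up, and it assumes the "(+N.NNNms)" column is the time until the next entry (which matches the timestamps shown).

```python
import re

# pid, preempt-count word, timestamp, delta to the next entry, description
ENTRY = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<preempt>[0-9a-f]{8})\s+"
    r"(?P<at>\d+\.\d+)ms\s+\(\+(?P<delta>\d+\.\d+)ms\):\s+(?P<what>.+)$"
)


def largest_gap(trace_text):
    """Return (delta_ms, description) of the entry with the biggest delta."""
    best = None
    for line in trace_text.splitlines():
        m = ENTRY.match(line)
        if not m:
            continue  # header, separator, or unrelated text
        delta = float(m.group("delta"))
        if best is None or delta > best[0]:
            best = (delta, m.group("what").strip())
    return best


sample = """\
5 80000000 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000000 0.000ms (+0.983ms): __do_softirq (do_softirq)
5 00000000 0.983ms (+0.000ms): preempt_schedule (_mmx_memcpy)
"""
print(largest_gap(sample))
```

On the first trace above this points at the __do_softirq entry, i.e. nearly all of the 985 us is spent in softirq processing before the next preemption point.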
* [email protected] <[email protected]> wrote:
> [1] major network delays while latencytest is running (ping drops
> packets or they get delayed by minutes). I did not see this on some
> previous tests where I made more of the /0 and /1 tasks RT. May have
> to do that again.
i think this is directly related to what priority the ksoftirqd threads
have.
> [6] the latency trace may have some SMP race conditions where the
> entries displayed do not match the header. Examples are a 100 usec
> trace header followed by 8 entries that last about 4 usec.
i think i fixed a related bug in the latest kernel(s):
touch_preempt_timing() was mistakenly 'touching' a live user-triggered
trace and could interfere in a similar fashion. Please re-report if this
still happens with -V0.7.26-3-ish or later kernels.
> [8] Some samples of /proc/loadavg during my big test showed some
> extremely large numbers. For example:
> 5.07 402.44 0.58 5/120 4448
i'm currently trying to track down this one. The rq->nr_uninterruptible
count got out of sync during one of the scheduler changes - and this
causes large negative task counts, messing up the load-average.
Ingo
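To see why a wrong task count shows up so dramatically in /proc/loadavg, here is a sketch of the fixed-point averaging step. The constants (FSHIFT 11, EXP_5 2014) are quoted from memory of the 2.6 scheduler headers and should be treated as assumptions; calc_load and simulate are illustrative names, and the bogus count of 100000 is an arbitrary stand-in for an out-of-sync counter.

```python
FSHIFT = 11            # bits of fixed-point precision
FIXED_1 = 1 << FSHIFT  # 1.0 in fixed point
EXP_5 = 2014           # decay factor for the 5-minute average (assumed)


def calc_load(load, exp, active_tasks):
    """One 5-second update step of a fixed-point load average."""
    active = active_tasks * FIXED_1
    return (load * exp + active * (FIXED_1 - exp)) >> FSHIFT


def simulate(samples, exp=EXP_5):
    """Feed a sequence of active-task counts and return the final average."""
    load = 0
    for n in samples:
        load = calc_load(load, exp, n)
    return load / FIXED_1


steady = simulate([5] * 600)             # 5 active tasks for 50 minutes
skewed = simulate([5] * 600 + [100000])  # one sample with a bogus count
print(round(steady, 2), round(skewed, 2))
```

A single bad sample is enough to push the 5-minute figure into the hundreds or thousands, and it then decays only slowly.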
* Ingo Molnar <[email protected]> wrote:
> > [8] Some samples of /proc/loadavg during my big test showed some
> > extremely large numbers. For example:
> > 5.07 402.44 0.58 5/120 4448
>
> i'm currently trying to track down this one. The
> rq->nr_uninterruptible count got out of sync during one of the
> scheduler changes - and this causes large negative task counts,
> messing up the load-average.
ok, found it - it's an upstream bug in fact. I've uploaded -V0.7.26-5
with the fix.
Ingo
Hi Ingo,
>
> Rui Nuno Capela wrote:
>
>> 1) Almost every time, the P4/SMP box locks up while unloading the ALSA
>> modules, e.g. on shutdown. This has been an issue for quite some time on
>> the latest RT patches, not exclusive to RT-V0.7.26-3. Probably it
>> started with the merge into -mm3, but I'm not sure.
>
> hm, the syslog you sent suggests that it's the 2.6.10-rc1-mm3-RT-V0.7.24
> kernel that crashed:
>
> Nov 11 12:39:46 lambda kernel: EFLAGS: 00010083
> (2.6.10-rc1-mm3-RT-V0.7.24)
>
> not -V0.7.26-3. The particular rmmod crash you got:
>
Yes, but as I said, I couldn't get any relevant trace on the P4/SMP
box, where the issue means real trouble -- the system just locks up while
the serial console stays annoyingly quiet about it.
Did you notice the nmi_watchdog=1 remark? With it set, '/etc/init.d/alsasound
stop' just runs smoothly.
The dump I sent is in fact taken from my P4/UP desktop, and I thought it
was somewhat related. Indeed, I haven't seen it happen anymore since
running RT-0.7.25-1.
I will try RT-0.7.26-4 later on.
Seeya.
--
rncbc aka Rui Nuno Capela
[email protected]
Didn't see an announcement for V0.7.26, so I'll post this summary
under this title.
Built today with V0.7.26-4 without any problems. System booted up
and telinit 5 was uneventful as well.
Ran two series of tests with latencytest and my stress tests with
the following results:
[1] Appear to have a new symptom of 200 usec delays at raw_read_unlock
which doesn't make any sense to me. Have included a latency trace
at the end of this message with an example. An occasional 100 usec
hit I understand (disk DMA) but I don't recall seeing this symptom
before.
[2] Still get the symptoms with truncated trace output and bad
ping responses. Refer to my previous messages for examples.
[3] The logging script (sleep 5 seconds, record data if slept for
over 10 seconds) was triggered about 30 times in an hour of testing.
None have the huge load average values reported last time but
several have 1 minute load averages above 15 (expect 6-8).
[4] All of the tests have bursts of long application level delays.
I'll be running another test program to see if I can find anything
with the user level tracing. Disk activity seems to make it worse
but all the tests had at least one CPU delay over a millisecond.
There seems to be a "short" (well > 500 usec) delay related to
disk reads and a longer one for disk writes.
[5] System after testing was done had a major "time shift"
as noted in the system log.
Nov 15 12:33:55 dws77 ntpd[2359]: synchronized to 192.52.216.4, stratum=3
Nov 15 12:33:57 dws77 ntpd[2359]: synchronized to 192.52.216.1, stratum=2
Nov 15 12:33:36 dws77 ntpd[2359]: time reset -21.466037 s
Nov 15 12:33:36 dws77 ntpd[2359]: frequency error -512 PPM exceeds tolerance 500 PPM
No crashes nor any major stability problems.
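The logging script in item [3] (sleep 5 seconds, record data if the sleep took over 10 seconds) can be sketched roughly as follows. This is only a minimal illustration, not Mark's actual script: the function names are hypothetical, and it assumes a POSIX system with a monotonic clock, chosen here so that an ntpd time reset like the one in item [5] does not distort the measurement.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <time.h>

/* Sleep for `request` seconds and return how long the sleep really took,
 * measured on the monotonic clock (hypothetical helper). */
static double timed_sleep(double request)
{
    struct timespec start, end, req;

    assert(request >= 0.0);
    req.tv_sec = (time_t)request;
    req.tv_nsec = (long)((request - (double)req.tv_sec) * 1e9);

    clock_gettime(CLOCK_MONOTONIC, &start);
    nanosleep(&req, NULL);   /* may return early on a signal; ignored in this sketch */
    clock_gettime(CLOCK_MONOTONIC, &end);

    return (double)(end.tv_sec - start.tv_sec)
         + (double)(end.tv_nsec - start.tv_nsec) / 1e9;
}

/* Nonzero when the measured sleep exceeded the limit
 * (a 10 s limit for a 5 s sleep in the report above). */
static int overslept(double elapsed, double limit)
{
    return elapsed > limit;
}
```

A monitoring loop would call `timed_sleep(5.0)` repeatedly and log system state whenever `overslept(elapsed, 10.0)` is true.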
--Mark
--- 200 usec latency example ---
preemption latency trace v1.0.7 on 2.6.10-rc1-mm3-RT-V0.7.26-4
-------------------------------------------------------
latency: 206 us, entries: 12 (12) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: kjournald/1209, uid:0 nice:0 policy:0 rt_prio:0
-----------------
=> started at: __down_mutex+0x3f/0x300 <c032d9af>
=> ended at: __down_mutex+0x1a6/0x300 <c032db16>
=======>
1209 80000000 0.000ms (+0.002ms): __down_mutex (__spin_lock)
1209 80000000 0.002ms (+0.001ms): _raw_spin_lock (__down_mutex)
1209 80000000 0.004ms (+0.000ms): _raw_spin_lock (__down_mutex)
1209 80000000 0.004ms (+0.000ms): do_nmi (__down_mutex)
1209 80000000 0.005ms (+0.000ms): do_nmi (mcount)
1209 80000000 0.005ms (+0.000ms): do_nmi (<00200286>)
1209 80000000 0.006ms (+0.000ms): profile_hook (profile_tick)
1209 80000000 0.006ms (+0.000ms): _raw_read_lock (profile_hook)
1209 80000000 0.007ms (+0.196ms): _raw_read_unlock (profile_tick)
1209 80000000 0.204ms (+0.001ms): set_new_owner (__down_mutex)
1209 80000000 0.205ms (+0.000ms): _raw_spin_unlock (__down_mutex)
1209 80000000 0.205ms (+0.000ms): _raw_spin_unlock (__down_mutex)
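To find where the time went in a trace like the one above, it helps to look for the entry with the largest "(+...ms)" value. The following is a minimal sketch of that idea; the struct and function names are hypothetical, and the line layout is inferred from the entries shown here rather than from any format specification.

```c
#include <assert.h>
#include <stdio.h>

/* One decoded trace line (field names are illustrative). */
struct trace_entry {
    int pid;             /* first column: PID of the running task */
    unsigned int flags;  /* second column: preempt count and flag bits, hex */
    double at_ms;        /* timestamp relative to the start of the trace */
    double delta_ms;     /* time until the next entry, the "(+...)" value */
    char func[64];       /* traced function */
    char parent[64];     /* its caller, printed in parentheses */
};

/* Parse one trace line; returns 1 on success, 0 if the line has another shape. */
static int parse_trace_line(const char *line, struct trace_entry *e)
{
    assert(line != NULL && e != NULL);
    return sscanf(line, "%d %x %lfms (+%lfms): %63s (%63[^)])",
                  &e->pid, &e->flags, &e->at_ms, &e->delta_ms,
                  e->func, e->parent) == 6;
}

/* Index of the line with the largest delta, or -1 if none could be parsed. */
static int largest_gap(const char *const lines[], int n)
{
    int best = -1;
    double best_delta = -1.0;

    for (int i = 0; i < n; i++) {
        struct trace_entry e;
        if (parse_trace_line(lines[i], &e) && e.delta_ms > best_delta) {
            best_delta = e.delta_ms;
            best = i;
        }
    }
    return best;
}
```

Applied to the trace above, this would point at the `_raw_read_unlock (profile_tick)` entry with its +0.196ms gap.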
* [email protected] <[email protected]> wrote:
> 1209 80000000 0.005ms (+0.000ms): do_nmi (mcount)
> 1209 80000000 0.005ms (+0.000ms): do_nmi (<00200286>)
> 1209 80000000 0.006ms (+0.000ms): profile_hook (profile_tick)
> 1209 80000000 0.006ms (+0.000ms): _raw_read_lock (profile_hook)
> 1209 80000000 0.007ms (+0.196ms): _raw_read_unlock (profile_tick)
> 1209 80000000 0.204ms (+0.001ms): set_new_owner (__down_mutex)
i've seen NMIs causing such problems before. Could you try a testrun
with all debug options disabled in the .config (and REGPARM enabled,
etc.) plus nmi_watchdog=0? Just to see how many of the artifacts are
related to debugging overhead.
Ingo
meself writes:
>
> Ingo Molnar wrote:
>>
>>> 1) Almost everytime the P4/SMP box locks up while unloading the ALSA
>>> modules e.g.on shutdown. This has been an issue for quite some time on
>>> the latest RT patches, not exclusive to RT-V0.7.26-3. Probably it
>>> started since the merge into -mm3, but not sure.
>>
>> hm, the syslog you sent suggests that it's the
>> 2.6.10-rc1-mm3-RT-V0.7.24 kernel that crashed:
>>
>> Nov 11 12:39:46 lambda kernel: EFLAGS: 00010083
>> (2.6.10-rc1-mm3-RT-V0.7.24)
>>
>> not -V0.7.26-3. The particular rmmod crash you got:
>>
>
> Yes, but as I said, I couldn't get any relevant trace on the P4/SMP
> box, where the issue means real trouble -- the system just locks up
> while serial console's annoyingly quiet about it.
>
> I will try RT-0.7.26-4 later on.
>
Already testing with RT-0.7.26-5 now. No good. Same lockup behavior on
alsa shutdown, although not always, but very frequently. Nothing comes out
via serial console. Not even SysRq is of any help, pretty hard these
lockups are.
Oh, about the nmi_watchdog=1 trick: forget what I told you before; I
already saw a couple of freezes while it's on.
(config.gz file attached).
/me out of ideas.
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> Already testing with RT-0.7.26-5 now. No good. Same lockup behavior on
> alsa shutdown, although not always, but very frequently. Nothing comes
> out via serial console. Not even SysRq is of any help, pretty hard
> these lockups are.
i'm rebasing to -rc2-mm1 currently, it should be completed today and
we'll see whether those ALSA problems are upstream related.
is it stable if you don't unload the ALSA modules?
Ingo
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> Already testing with RT-0.7.26-5 now. No good. Same lockup behavior on
>> alsa shutdown, although not always, but very frequently. Nothing comes
>> out via serial console. Not even SysRq is of any help, pretty hard
>> these lockups are.
>
> i'm rebasing to -rc2-mm1 currently, it should be completed today and
> we'll see whether those ALSA problems are upstream related.
>
> is it stable if you dont unload the ALSA modules?
>
Yes, it looks like the most stable of the RTs I've tested to date. Trouble
only comes when '/etc/init.d/alsasound stop' is called.
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
i have released the -V0.7.27-1 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this quick update fixes a couple of build bugs.
Changes since -V0.7.27-0:
- fix iptables compilation error
- fix selinux compilation error
to create a -V0.7.27-1 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm1/2.6.10-rc2-mm1.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-1
Ingo
i have released the -V0.7.27-3 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is another quick update to fix a couple of bugs. Sorry about the
fast pace of updates but these fixes are worth having ASAP:
Changes since -V0.7.27-1:
- fix module-put BKL count bug - this could explain/fix the lockups
reported by Rui Nuno Capela.
- fixed a netfilter related networking deadlock reported by Mark H.
Johnson two weeks ago, it triggered on my testbox today. This (rare)
bug could potentially explain some of the other lockup reports that
are still open.
- fix load average constant +1.0 offset when PREEMPT_RT is enabled.
This was an artifact of the IRQ-threading of the timer interrupt.
to create a -V0.7.27-3 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm1/2.6.10-rc2-mm1.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
Ingo
i have released the -V0.7.27-0 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a pure merge of -V0.7.26-5 to 2.6.10-rc2-mm1, there are no other
changes.
to create a -V0.7.27-0 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm1/2.6.10-rc2-mm1.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-0
Ingo
* Ingo Molnar <[email protected]> wrote:
> > Already testing with RT-0.7.26-5 now. No good. Same lockup behavior on
> > alsa shutdown, although not always, but very frequently. Nothing comes
> > out via serial console. Not even SysRq is of any help, pretty hard
> > these lockups are.
>
> i'm rebasing to -rc2-mm1 currently, it should be completed today and
> we'll see whether those ALSA problems are upstream related.
>
> is it stable if you dont unload the ALSA modules?
i just found a potential problem that could cause a near-lockup during
module removal. This code in __module_put_and_exit() could loop for
quite a long time:
	while (current->lock_depth != -1)
		unlock_kernel();
since, specifically for ALSA's no-BKL purposes, i've introduced the notion
of ->lock_depth going below -1. So if we happen to put the module while
->lock_depth is -2, it could take quite some time for it to go down to
zero again ... (and it could cause other problems as well)
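The effect can be illustrated with a small user-space model. This is a sketch under two assumptions that are not stated in the mail: that each unlock_kernel() call decrements the counter by exactly one, and that an 8-bit counter is an acceptable stand-in for the kernel's int (so the wrap-around takes 255 steps here instead of roughly 2^32). The function name is hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Count how many "unlock" steps the loop needs before the counter
 * reads -1. The counter is kept unsigned so that wrapping below the
 * minimum is well defined in C. */
static unsigned count_unlocks(int8_t lock_depth)
{
    uint8_t depth = (uint8_t)lock_depth;
    unsigned n = 0;

    while (depth != UINT8_MAX) {   /* UINT8_MAX is -1 in 8-bit two's complement */
        assert(n < 255);           /* the 8-bit model always stops within 255 steps */
        depth--;                   /* assumed effect of one unlock_kernel() call */
        n++;
    }
    return n;
}
```

Starting from 0 (BKL held once) the loop ends after a single step, while starting from -2 it has to wrap through the whole value range first.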
i fixed this in the -V0.7.27-2 release, freshly uploaded to the usual
place:
http://redhat.com/~mingo/realtime-preempt/
do you still see lockups with this patch?
Ingo
On Tue, 16 Nov 2004 14:40:27 +0100
Ingo Molnar <[email protected]> wrote:
> i have released the -V0.7.27-3 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is another quick update to fix a couple of bugs. Sorry about the
> fast pace of updates but these fixes are worth having ASAP:
Hi,
i built a 27-4 kernel and tried to boot into it. It hangs after:
Uncompressing linux.. Ok, booting the kernel
Will try plain 2.6.10-rc2-mm1 and 2.6.10-rc2
.config is here:
http://affenbande.org/~tapas/config
flo
On Tue, 16 Nov 2004 15:20:21 +0100
Florian Schmidt <[email protected]> wrote:
> i built a 27-4 kernel and tried to boot into it. It hangs after:
>
> Uncompressing linux.. Ok, booting the kernel
>
> Will try plain 2.6.10-rc2-mm1 and 2.6.10-rc2
>
> .config is here:
>
> http://affenbande.org/~tapas/config
Ok, both 2.6.10-rc2 and 2.6.10-rc2-mm1 boot fine.. Rebuilding 27-4 kernel to
see if i can still reproduce above mentioned hang..
flo
On Tue, 16 Nov 2004 16:08:22 +0100
Florian Schmidt <[email protected]> wrote:
> On Tue, 16 Nov 2004 15:20:21 +0100
> Florian Schmidt <[email protected]> wrote:
>
> > i built a 27-4 kernel and tried to boot into it. It hangs after:
> >
> > Uncompressing linux.. Ok, booting the kernel
> >
> > Will try plain 2.6.10-rc2-mm1 and 2.6.10-rc2
> >
> > .config is here:
> >
> > http://affenbande.org/~tapas/config
>
> Ok, both 2.6.10-rc2 and 2.6.10-rc2-mm1 boot fine.. Rebuilding 27-4 kernel to
> see if i can still reproduce above mentioned hang..
ok, this new build still hangs at the same spot.
flo
On Tue, 16 Nov 2004 16:29:24 +0100, Florian Schmidt <[email protected]> wrote:
> ok, this new build still hangs at the same spot.
same problem here
Florian Schmidt <[email protected]> wrote:
>ok, this new build still hangs at the same spot.
Me too. The serial console output follows at the end. Will try a
few boot alternatives and let you know if I can get this to run.
From what I can tell, it was attempting to test the NMI watchdog
when it failed.
--Mark
-----
Linux version 2.6.10-rc2-mm1-RT-V0.7.27-4 (root@dws77) (gcc version 3.3.3 20040412 (Red Hat Linux 3.3.3-7)) #1 SMP Tue Nov 16 09:18:20 CST 2004
BIOS-provided physical RAM map:
BIOS-e820: 0000000000000000 - 000000000009fc00 (usable)
BIOS-e820: 000000000009fc00 - 00000000000a0000 (reserved)
BIOS-e820: 00000000000f0000 - 0000000000100000 (reserved)
BIOS-e820: 0000000000100000 - 000000001fff0000 (usable)
BIOS-e820: 000000001fff0000 - 000000001fff8000 (ACPI data)
BIOS-e820: 000000001fff8000 - 0000000020000000 (ACPI NVS)
BIOS-e820: 00000000fec00000 - 00000000fec01000 (reserved)
BIOS-e820: 00000000fee00000 - 00000000fee01000 (reserved)
BIOS-e820: 00000000ffff0000 - 0000000100000000 (reserved)
0MB HIGHMEM available.
511MB LOWMEM available.
found SMP MP-table at 000fb170
DMI 2.3 present.
Using APIC driver default
ACPI: LAPIC (acpi_id[0x01] lapic_id[0x00] enabled)
Processor #0 6:8 APIC version 17
ACPI: LAPIC (acpi_id[0x02] lapic_id[0x01] enabled)
Processor #1 6:8 APIC version 17
Using ACPI for processor (LAPIC) configuration information
Intel MultiProcessor Specification v1.1
Virtual Wire compatibility mode.
OEM ID: VIA Product ID: VT3075 APIC at: 0xFEE00000
I/O APIC #2 Version 17 at 0xFEC00000.
Enabling APIC mode: Flat. Using 1 I/O APICs
Processors: 2
Real-Time Preemption Support (c) Ingo Molnar
Built 1 zonelists
Initializing CPU#0
Kernel command line: ro root=LABEL=/ nmi_watchdog=1 single console=ttyS0,9600n8r profile=2
kernel CPU profiling enabled
kernel profiling shift: 2
PID hash table entries: 2048 (order: 11, 32768 bytes)
Detected 864.206 MHz processor.
Using tsc for high-res timesource
Console: colour VGA+ 80x25
Dentry cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 65536 (order: 6, 262144 bytes)
Memory: 511224k/524224k available (2231k kernel code, 12612k reserved, 658k data, 232k init, 0k highmem)
Checking if this processor honours the WP bit even in supervisor mode... Ok.
Security Framework v1.0.0 initialized
SELinux: Initializing.
SELinux: Starting in permissive mode
selinux_register_security: Registering secondary module capability
Capability LSM initialized as secondary
Mount-cache hash table entries: 512 (order: 0, 4096 bytes)
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#0.
Enabling fast FPU save and restore... done.
Enabling unmasked SIMD FPU exception support... done.
Checking 'hlt' instruction... OK.
CPU0: Intel Pentium III (Coppermine) stepping 06
per-CPU timeslice cutoff: 730.77 usecs.
task migration cache decay timeout: 1 msecs.
Booting processor 1/1 eip 2000
Initializing CPU#1
CPU: L1 I cache: 16K, L1 D cache: 16K
CPU: L2 cache: 256K
Intel machine check architecture supported.
Intel machine check reporting enabled on CPU#1.
CPU1: Intel Pentium III (Coppermine) stepping 06
Total of 2 processors activated (3411.96 BogoMIPS).
ENABLING IO-APIC IRQs
..TIMER: vector=0x31 pin1=2 pin2=0
t<
[from a good boot of -V0.7.26-4...]
testing NMI watchdog ... OK.
checking TSC synchronization across 2 CPUs: passed.
IRQ#0 thread RT prio: 49.
spawn_desched_task(00000000)
desched cpu_callback 3/00000000
ksoftirqd started up.
softirq RT prio: 24.
desched cpu_callback 2/00000000
desched thread 0 started up.
desched cpu_callback 3/00000001
desched cpu_callback 2/00000001
ksoftirqd started up.
softirq RT prio: 24.
Brought up 2 CPUs
... and so on ...
>From what I can tell, it was attempting to test the NMI watchdog
>when it failed.
Confirmed, clean boot when I removed
nmi_watchdog=1 profile=2
from the boot parameters. Will be doing some tests without it.
--Mark
[email protected] wrote:
>>From what I can tell, it was attempting to test the NMI watchdog
>
>>when it failed.
>
>
> Confirmed, clean boot when I removed
> nmi_watchdog=1 profile=2
> from the boot parameters. Will be doing some tests without it.
>
> --Mark
>
>
I have no such boot parameters and I still couldn't get it to boot on my
SMP workstation at the office.
kr
* [email protected] <[email protected]> wrote:
> Florian Schmidt <[email protected]> wrote:
>
> >ok, this new build still hangs at the same spot.
>
> Me too. The serial console output follows at the end. Will try a few
> boot alternatives and let you know if I can get this to run.
> >From what I can tell, it was attempting to test the NMI watchdog
> when it failed.
i've uploaded -5 with a fix in profile_tick() - does it boot fine for
you now?
Btw., a good way to catch such early bootup bugs is to activate
early-printk over the serial console:
earlyprintk=serial,ttyS0,38400 console=ttyS0,38400 console=tty0
and in this particular case the most effective serial logging method is:
earlyprintk=serial,ttyS0,38400,keep console=ttyS0,38400 console=tty0
the 'keep' tells the kernel to keep the early console a bit longer -
which in this particular timer-interrupt crash case produces a more
usable log. (the 'keep' parameter makes the serial console a bit less
useful as a regular console later on, so it should only be used for
crashes that the normal early console doesn't catch.)
Ingo
Ingo Molnar wrote:
> * [email protected] <[email protected]> wrote:
>
>
>>Florian Schmidt <[email protected]> wrote:
>>
>>
>>>ok, this new build still hangs at the same spot.
>>
>>Me too. The serial console output follows at the end. Will try a few
>>boot alternatives and let you know if I can get this to run.
>>>From what I can tell, it was attempting to test the NMI watchdog
>>when it failed.
>
>
> i've uploaded -5 with a fix in profile_tick() - does it boot fine for
> you now?
>
This one boots for me now.
kr
On Tue, 16 Nov 2004, Ingo Molnar wrote:
> to create a -V0.7.27-3 tree from scratch, the patching order is:
>
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm1/2.6.10-rc2-mm1.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
>
root@Atlantis:/usr/src# wget http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
--18:43:08-- http://redhat.com/%7Emingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
=> `realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3'
Resolving redhat.com... 209.132.177.50
Connecting to redhat.com[209.132.177.50]:80... connected.
HTTP request sent, awaiting response... 302 Found
Location: http://www.redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3 [following]
--18:43:11-- http://www.redhat.com/%7Emingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
=> `realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3'
Resolving www.redhat.com... 209.132.177.50
Connecting to www.redhat.com[209.132.177.50]:80... connected.
HTTP request sent, awaiting response... 301 Moved Permanently
Location: http://people.redhat.com/mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3 [following]
--18:43:12-- http://people.redhat.com/mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
=> `realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3'
Resolving people.redhat.com... 66.187.233.237
Connecting to people.redhat.com[66.187.233.237]:80... connected.
HTTP request sent, awaiting response... 404 Not Found
18:43:12 ERROR 404: Not Found.
Mind Booster Noori
--
/* *************************************************************** */
Marcos Daniel Marado Torres AKA Mind Booster Noori
http://student.dei.uc.pt/~marado - [email protected]
() Join the ASCII ribbon campaign against html email, Microsoft
/\ attachments and Software patents. They endanger the World.
Sign a petition against patents: http://petition.eurolinux.org
/* *************************************************************** */
* Marcos D. Marado Torres <[email protected]> wrote:
> root@Atlantis:/usr/src# wget
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.27-3
> 18:43:12 ERROR 404: Not Found.
yes. If you get a 404 then a newer patch has been uploaded - check the
home directory for the latest patch. Old patches are in the older/
subdirectory.
Ingo
Ingo Molnar wrote:
> * [email protected] <[email protected]> wrote:
>
>
>>Florian Schmidt <[email protected]> wrote:
>>
>>
>>>ok, this new build still hangs at the same spot.
>>
>>Me too. The serial console output follows at the end. Will try a few
>>boot alternatives and let you know if I can get this to run.
>>>From what I can tell, it was attempting to test the NMI watchdog
>>when it failed.
>
>
> i've uploaded -5 with a fix in profile_tick() - does it boot fine for
> you now?
>
I now have both of my SMP systems booted on -V0.7.27-6 without any
problems.
kr
* K.R. Foley <[email protected]> wrote:
> >i've uploaded -5 with a fix in profile_tick() - does it boot fine for
> >you now?
> >
>
> I now have both of my SMP systems booted on -V0.7.27-6 now without any
> problems.
great. The current release is meanwhile at -V0.7.27-10, which includes
other minor updates:
- two fixes to the wakeup timing code - this should resolve some of the
weird traces reported by Mark H. Johnson.
- two minor tweaks to the wakeup/reschedule path which should improve
wakeup latencies.
Ingo
On Tue, 16 Nov 2004 22:24:01 +0100
Ingo Molnar <[email protected]> wrote:
> great. The current release is meanwhile at -V0.7.27-10, which includes
> other minor updates:
>
Ok, this one boots fine again for me (didn't test the ones between my last
report and this one).
I have not yet tried to get this kernel to lock up, but i made another
interesting observation:
irq 8 at prio 98 (only irq 1 with higher prio 99). running rtc_wakeup in the
console (it runs SCHED_FIFO all right). Switching consoles (different text
consoles - not switching to X, though this basically produces similar
results) produces large jitters (around 1 ms) and occasional missed irq's
and piggy messages. This is completely reproducible here. The rtc histogram
doesn't show any large wakeup latencies.
/proc/latency_trace doesn't show latencies that high on console
switch either:
preemption latency trace v1.0.7 on 2.6.10-rc2-mm1-RT-V0.7.27-10
-------------------------------------------------------
latency: 63 us, entries: 22 (22) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 8/13, uid:0 nice:-5 policy:1 rt_prio:98
-----------------
=> started at: try_to_wake_up+0x51/0x170 <c010f3a1>
=> ended at: finish_task_switch+0x51/0xb0 <c010f911>
=======>
5 80010004 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80010003 0.000ms (+0.000ms): (1) ((98))
5 80010003 0.000ms (+0.000ms): (13) ((5))
5 80010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
5 80010003 0.000ms (+0.000ms): (0) ((1))
5 80010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
5 80010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000002 0.001ms (+0.061ms): __do_softirq (do_softirq)
5 00000000 0.062ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 90000000 0.062ms (+0.000ms): __schedule (preempt_schedule)
5 90000000 0.062ms (+0.000ms): profile_hit (__schedule)
5 90000001 0.062ms (+0.000ms): sched_clock (__schedule)
13 80000002 0.062ms (+0.000ms): __switch_to (__schedule)
13 80000002 0.062ms (+0.000ms): (5) ((13))
13 80000002 0.062ms (+0.000ms): (98) ((1))
13 80000002 0.062ms (+0.000ms): finish_task_switch (__schedule)
13 80000001 0.062ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
13 80000001 0.063ms (+0.003ms): (13) ((1))
13 80000001 0.066ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
I sometimes do get large values in /proc/latency_trace, but they seem to be
unrelated to the console switching.
flo
* Florian Schmidt <[email protected]> wrote:
> I have not yet tried to get this kernel to lock up, but i made
> another interesting observation:
>
> irq 8 at prio 98 (only irq 1 with higher prio 99). running rtc_wakeup
> in the console (it runs SCHED_FIFO all right). Switching consoles
> (different text consoles - not switching to X, though this basically
> produces similar results) produces large jitters (around 1 ms) and
> occasional missed irq's and piggy messages. This is completely
> reproducible here. The rtc histogram doesn't show any large wakeup
> latencies.
interesting, i'll try to reproduce this.
btw., for best rtc_wakeup results you should chrt IRQ#0 to prio 99 too,
because it uses rtc_lock, otherwise it's an extra PI pass to undo the
lock inversion, which adds another 10 usecs or so to the worst-case
path.
and i'd suggest to chrt irq 1 back to below prio 90, maybe this explains
the console-switching latency? If you do a console-switch via the
keyboard then its priority 99 can get inherited by events/0 which then
does the quite expensive VGA console clearing/copying with priority 99,
possibly delaying rtc_wakeup quite significantly, easily for a
millisecond or so. So what you are seeing might be priority inheritance
handling at work!
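The inheritance effect described here can be pictured with a toy model: a task that holds a contended lock runs at the highest priority among itself and the tasks blocked on that lock. The sketch below is only an illustration of that rule, not the kernel's implementation; the function name is hypothetical, and the base priority of 1 for events/0 used in the note after the code is an assumption.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model of priority inheritance: return the priority a lock holder
 * effectively runs at, given its own priority and the priorities of the
 * tasks currently waiting on its lock (higher number = higher priority). */
static int effective_prio(int own_prio, const int *waiter_prios, size_t n)
{
    assert(waiter_prios != NULL || n == 0);

    int prio = own_prio;
    for (size_t i = 0; i < n; i++) {
        if (waiter_prios[i] > prio)
            prio = waiter_prios[i];
    }
    return prio;
}
```

With the keyboard IRQ thread at 99 as the waiter, a worker with base priority 1 is boosted to 99 and can delay anything running below that; with the keyboard IRQ thread moved to 40, the boosted worker stays at 40.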
> I sometimes do get large values in /proc/latency_trace, but they seem
> to be unrelated to the console switching.
could you post such a large latency trace? Are they like the bad traces
Mark is occasionally seeing, with some ridiculously large latency and a
ridiculously short execution trace?
Ingo
On Tue, 16 Nov 2004 22:20:39 +0100
Florian Schmidt <[email protected]> wrote:
> /proc/latency_trace doesn't show that high latencies either on console
> switch:
>
correction: now i do see large wakeup latencies in /proc/latency_trace, but
only when rtc_wakeup is not running [??? :)]. something's fishy..
(IRQ 1/18/CPU#0): new 2 us maximum-latency wakeup.
(IRQ 1/18/CPU#0): new 3 us maximum-latency wakeup.
(syslogd/302/CPU#0): new 4 us maximum-latency wakeup.
(ksoftirqd/0/3/CPU#0): new 4 us maximum-latency wakeup.
(ksoftirqd/0/3/CPU#0): new 5 us maximum-latency wakeup.
(IRQ 0/2/CPU#0): new 857 us maximum-latency wakeup.
(IRQ 0/2/CPU#0): new 891 us maximum-latency wakeup.
(IRQ 0/2/CPU#0): new 976 us maximum-latency wakeup.
(IRQ 0/2/CPU#0): new 1117 us maximum-latency wakeup.
It seems this excerpt from the trace below is characteristic of all the long
traces:
5 80000002 0.001ms (+1.114ms): __do_softirq (do_softirq)
5 00000000 1.115ms (+0.000ms): preempt_schedule (_mmx_memcpy)
preemption latency trace v1.0.7 on 2.6.10-rc2-mm1-RT-V0.7.27-10
-------------------------------------------------------
latency: 1117 us, entries: 22 (22) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:50
-----------------
=> started at: try_to_wake_up+0x51/0x170 <c010f3a1>
=> ended at: finish_task_switch+0x51/0xb0 <c010f911>
=======>
5 80010004 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80010003 0.000ms (+0.000ms): (49) ((98))
5 80010003 0.000ms (+0.000ms): (2) ((5))
5 80010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
5 80010003 0.000ms (+0.000ms): (0) ((1))
5 80010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
5 80010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000002 0.001ms (+1.114ms): __do_softirq (do_softirq)
5 00000000 1.115ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 90000000 1.115ms (+0.000ms): __schedule (preempt_schedule)
5 90000000 1.115ms (+0.000ms): profile_hit (__schedule)
5 90000001 1.116ms (+0.000ms): sched_clock (__schedule)
2 80000002 1.116ms (+0.000ms): __switch_to (__schedule)
2 80000002 1.116ms (+0.000ms): (5) ((2))
2 80000002 1.116ms (+0.000ms): (98) ((49))
2 80000002 1.116ms (+0.000ms): finish_task_switch (__schedule)
2 80000001 1.116ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000001 1.116ms (+0.003ms): (2) ((49))
2 80000001 1.120ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
Ah, and one more thing: When i boot up the computer my init script sets irq
3 to prio 98. But it seems the irq handler's priority actually changes when
the soundcard is used for the first time. So i need to re-set the irq prio
_after_ i have used the soundcard for the first time..
flo
On Tue, 16 Nov 2004 23:31:35 +0100
Ingo Molnar <[email protected]> wrote:
> and i'd suggest to chrt irq 1 back to below prio 90, maybe this explains
> the console-switching latency? If you do a console-switch via the
> keyboard then its priority 99 can get inherited by events/0 which then
> does the quite expensive VGA console clearing/copying with priority 99,
> possibly delaying rtc_wakeup quite significantly, easily for a
> millisecond or so. So what you are seeing might be priority inheritance
> handling at work!
>
ah, will try this right away..
flo
* Florian Schmidt <[email protected]> wrote:
> It seems this excerpt from below trace is characteristic for all the long
> traces:
>
> 5 80000002 0.001ms (+1.114ms): __do_softirq (do_softirq)
> 5 00000000 1.115ms (+0.000ms): preempt_schedule (_mmx_memcpy)
i've seen this before, it's still unsolved. This trace shows it nicely:
> 5 80010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
> 5 80010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
> 5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
> 5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
> 5 80000002 0.001ms (+1.114ms): __do_softirq (do_softirq)
> 5 00000000 1.115ms (+0.000ms): preempt_schedule (_mmx_memcpy)
> 5 90000000 1.115ms (+0.000ms): __schedule (preempt_schedule)
> 5 90000000 1.115ms (+0.000ms): profile_hit (__schedule)
this is either a false positive, or we missed a preemption. To see which
one, could you apply the attached patch and try to reproduce this long
trace? The new trace will tell us whether need_resched is set during
that ~1 msec window.
Ingo
--- linux/kernel/latency.c.orig
+++ linux/kernel/latency.c
@@ -184,6 +184,7 @@ ____trace(struct cpu_trace *tr, unsigned
 	 * Encode irqs-off into the preempt count:
 	 */
 		+ (irqs_disabled() ? 0x80000000 : 0)
+		+ (need_resched() ? 0x08000000 : 0)
 #endif
 	;
 }
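With this patch applied, the second column of each trace entry carries two flag bits on top of the preempt count. A minimal decoding sketch follows; the bit values are taken from the patch above, while the macro and function names are hypothetical and do not exist in the kernel source.

```c
#include <assert.h>
#include <stdint.h>

#define TRACE_IRQS_OFF     UINT32_C(0x80000000)  /* set when irqs_disabled() */
#define TRACE_NEED_RESCHED UINT32_C(0x08000000)  /* set when need_resched(), new in the patch */

static_assert((TRACE_IRQS_OFF & TRACE_NEED_RESCHED) == 0,
              "flag bits must not overlap");

/* Nonzero if the entry was recorded with interrupts disabled. */
static int trace_irqs_off(uint32_t word)
{
    return (word & TRACE_IRQS_OFF) != 0;
}

/* Nonzero if a reschedule was already pending when the entry was recorded. */
static int trace_need_resched(uint32_t word)
{
    return (word & TRACE_NEED_RESCHED) != 0;
}
```

In a new trace, an entry such as 88000002 (a made-up value) at the start of the ~1 msec window would mean need_resched was set, pointing at a missed preemption; an entry like 80000002 would point at a false positive.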
Florian Schmidt wrote:
> On Tue, 16 Nov 2004 22:24:01 +0100
> Ingo Molnar <[email protected]> wrote:
>
>
>>great. The current release is meanwhile at -V0.7.27-10, which includes
>>other minor updates:
>>
>
>
> Ok, this one boots fine again for me (didn't test the ones between my last
> report and this one).
>
> I have not yet tried to get this kernel to lock up, but i made another
> interesting observation:
>
> irq 8 at prio 98 (only irq 1 with higher prio 99). running rtc_wakeup in the
> console (it runs SCHED_FIFO all right). Switching consoles (different text
> consoles - not switching to X, though this basically produces similar
> results) produces large jitters (around 1 ms) and occasional missed irq's
> and piggy messages. This is completely reproducible here. The rtc histogram
> doesn't show any large wakeup latencies.
Just a thought. What priority are you running rtc_wakeup at? If you are
doing something like:
schp.sched_priority = sched_get_priority_max(SCHED_FIFO); // equates to a priority of 99
then it is actually running at a higher priority than the rtc, and
it won't work very well. I tend to run rtc (IRQ 8) at 99 and the
programs accessing it at 98, which seems to work reasonably well.
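A minimal sketch of that arrangement, assuming a POSIX system: pick a SCHED_FIFO priority one step below the IRQ thread's priority instead of taking the maximum. The helper names are hypothetical, and the IRQ thread's priority has to be passed in by the caller (it is set separately, for example with chrt).

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <sched.h>

/* Return a SCHED_FIFO priority one step below irq_prio, clamped to the
 * range the system allows. */
static int prio_below(int irq_prio)
{
    int lo = sched_get_priority_min(SCHED_FIFO);
    int hi = sched_get_priority_max(SCHED_FIFO);

    assert(lo >= 0 && hi > lo);

    int prio = irq_prio - 1;
    if (prio < lo)
        prio = lo;
    if (prio > hi)
        prio = hi;
    return prio;
}

/* Switch the calling process to SCHED_FIFO just below the given IRQ
 * thread priority. Returns 0 on success, -1 on error (typically EPERM
 * when not running as root). */
static int run_below_irq(int irq_prio)
{
    struct sched_param sp = { .sched_priority = prio_below(irq_prio) };

    return sched_setscheduler(0, SCHED_FIFO, &sp);
}
```

With the rtc IRQ thread at 99, `run_below_irq(99)` would put the calling program at 98, matching the setup described above.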
>
> /proc/latency_trace doesn't show that high latencies either on console
> switch:
>
> preemption latency trace v1.0.7 on 2.6.10-rc2-mm1-RT-V0.7.27-10
> -------------------------------------------------------
> latency: 63 us, entries: 22 (22) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
> -----------------
> | task: IRQ 8/13, uid:0 nice:-5 policy:1 rt_prio:98
> -----------------
> => started at: try_to_wake_up+0x51/0x170 <c010f3a1>
> => ended at: finish_task_switch+0x51/0xb0 <c010f911>
> =======>
> 5 80010004 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
> 5 80010003 0.000ms (+0.000ms): (1) ((98))
> 5 80010003 0.000ms (+0.000ms): (13) ((5))
> 5 80010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
> 5 80010003 0.000ms (+0.000ms): (0) ((1))
> 5 80010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
> 5 80010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
> 5 80010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
> 5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
> 5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
> 5 80000002 0.001ms (+0.061ms): __do_softirq (do_softirq)
> 5 00000000 0.062ms (+0.000ms): preempt_schedule (_mmx_memcpy)
> 5 90000000 0.062ms (+0.000ms): __schedule (preempt_schedule)
> 5 90000000 0.062ms (+0.000ms): profile_hit (__schedule)
> 5 90000001 0.062ms (+0.000ms): sched_clock (__schedule)
> 13 80000002 0.062ms (+0.000ms): __switch_to (__schedule)
> 13 80000002 0.062ms (+0.000ms): (5) ((13))
> 13 80000002 0.062ms (+0.000ms): (98) ((1))
> 13 80000002 0.062ms (+0.000ms): finish_task_switch (__schedule)
> 13 80000001 0.062ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
> 13 80000001 0.063ms (+0.003ms): (13) ((1))
> 13 80000001 0.066ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
>
> I sometimes do get large values in /proc/latency_trace, but they seem to be
> unrelated to the console switching.
>
> flo
>
On Tue, 16 Nov 2004 15:42:52 -0600
"K.R. Foley" <[email protected]> wrote:
> Just a thought. What priority are you running rtc_wakeup at? If you are
> doing something like:
>
> schp.sched_priority = sched_get_priority_max(SCHED_FIFO); // equates to a priority of 99
>
> then it is actually running at a higher priority than the rtc, and
> it won't work very well. I tend to run rtc (IRQ 8) at 99 and the
> programs accessing it at 98, which seems to work reasonably well.
yah, the default for rtc_wakeup is 91 for the read() thread and 90 for the
reporting thread. So rtc's prio is above that.
flo
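A minimal sketch of the priority check discussed above: the thread that waits on the rtc should sit strictly below the IRQ 8 thread. The helper name check_rtc_setup is hypothetical, and the fallback value 99 is an assumption for platforms where Python does not expose SCHED_FIFO.

```python
import os

# Highest SCHED_FIFO priority on this system; 99 on Linux. The fallback is an
# assumption for platforms where Python does not expose SCHED_FIFO.
FIFO_MAX = (
    os.sched_get_priority_max(os.SCHED_FIFO) if hasattr(os, "SCHED_FIFO") else 99
)


def check_rtc_setup(irq_prio, waiter_prios):
    """Return the waiter priorities that are not strictly below the IRQ thread."""
    return [p for p in waiter_prios if p >= irq_prio]


# K.R. Foley's setup: IRQ 8 at 99, programs at 98.
print(check_rtc_setup(99, [98]))
# rtc_wakeup defaults (91 and 90) against an IRQ 8 thread at 98.
print(check_rtc_setup(98, [91, 90]))
# A program at the SCHED_FIFO maximum against an IRQ 8 thread at 98.
print(check_rtc_setup(98, [FIFO_MAX]))
```

An empty list means every waiter runs below the IRQ thread, which is the arrangement both posters describe as working.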
On Tue, 16 Nov 2004 22:33:12 +0100
Florian Schmidt <[email protected]> wrote:
> On Tue, 16 Nov 2004 23:31:35 +0100
> Ingo Molnar <[email protected]> wrote:
>
> > and i'd suggest to chrt irq 1 back to below prio 90, maybe this explains
> > the console-switching latency? If you do a console-switch via the
> > keyboard then its priority 99 can get inherited by events/0 which then
> > does the quite expensive VGA console clearing/copying with priority 99,
> > possibly delaying rtc_wakeup quite significantly, easily for a
> > millisecond or so. So what you are seeing might be priority inheritance
> > handling at work!
> >
>
> ah, will try this right away..
now, doesn't seem to make a difference.. i gave IRQ 1 prio 40 and the
behaviour is the same.. BTW: what about IRQ 0? I tried at different prios
[40 and 99]. Doesn't make a difference either. As far as i understand it, it
really shouldn't make a difference either as in the interesting cases (app
woken up by irq) the scheduler is explicitly run anyways, right?
Here's another trace:
preemption latency trace v1.0.7 on 2.6.10-rc2-mm1-RT-V0.7.27-10
-------------------------------------------------------
latency: 1056 us, entries: 22 (22) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:99
-----------------
=> started at: try_to_wake_up+0x51/0x170 <c010f3a1>
=> ended at: finish_task_switch+0x51/0xb0 <c010f911>
=======>
5 80010004 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 80010003 0.000ms (+0.000ms): (0) ((98))
5 80010003 0.000ms (+0.000ms): (2) ((5))
5 80010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
5 80010003 0.000ms (+0.000ms): (0) ((1))
5 80010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 80010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
5 80010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
5 80000002 0.001ms (+1.054ms): __do_softirq (do_softirq)
5 00000000 1.055ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 90000000 1.055ms (+0.000ms): __schedule (preempt_schedule)
5 90000000 1.055ms (+0.000ms): profile_hit (__schedule)
5 90000001 1.055ms (+0.000ms): sched_clock (__schedule)
2 80000002 1.056ms (+0.000ms): __switch_to (__schedule)
2 80000002 1.056ms (+0.000ms): (5) ((2))
2 80000002 1.056ms (+0.000ms): (98) ((0))
2 80000002 1.056ms (+0.000ms): finish_task_switch (__schedule)
2 80000001 1.056ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000001 1.056ms (+0.003ms): (2) ((0))
2 80000001 1.060ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
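When reading a trace like the one above, the entry followed by the longest gap is usually the place to look. A small sketch that picks it out, assuming the line layout shown in these traces (pid, flags, timestamp, delta, description); the regular expression and the function name largest_gap are illustrative, not part of the tracer.

```python
import re

# Matches entries such as:
#   5 80000002 0.001ms (+1.054ms): __do_softirq (do_softirq)
ENTRY = re.compile(
    r"^\s*(?P<pid>\d+)\s+(?P<flags>[0-9a-fA-F]{8})\s+"
    r"(?P<t>\d+\.\d+)ms\s+\(\+(?P<delta>\d+\.\d+)ms\):\s+(?P<what>.*)$"
)


def largest_gap(trace_text):
    """Return (delta_ms, pid, description) for the entry followed by the longest gap."""
    best = None
    for line in trace_text.splitlines():
        # Strip mail quoting ("> ") so quoted traces parse as well.
        m = ENTRY.match(line.lstrip("> "))
        if not m:
            continue
        entry = (float(m["delta"]), int(m["pid"]), m["what"])
        if best is None or entry[0] > best[0]:
            best = entry
    return best


SAMPLE = """\
 5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
 5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
 5 80000002 0.001ms (+1.054ms): __do_softirq (do_softirq)
 5 00000000 1.055ms (+0.000ms): preempt_schedule (_mmx_memcpy)
"""

print(largest_gap(SAMPLE))
```

On the sample lines, taken from the trace above, this points at the __do_softirq entry, which accounts for almost all of the 1056 us.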
ah, it's a missed reschedule:
> 5 80010004 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
> 5 80010003 0.000ms (+0.000ms): (0) ((98))
> 5 80010003 0.000ms (+0.000ms): (2) ((5))
> 5 80010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
> 5 80010003 0.000ms (+0.000ms): (0) ((1))
> 5 80010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
> 5 80010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
> 5 80010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
> 5 80010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
> 5 80000002 0.000ms (+0.000ms): do_softirq (irq_exit)
> 5 80000002 0.001ms (+1.054ms): __do_softirq (do_softirq)
> 5 00000000 1.055ms (+0.000ms): preempt_schedule (_mmx_memcpy)
> 5 90000000 1.055ms (+0.000ms): __schedule (preempt_schedule)
note this entry:
> 5 80010003 0.000ms (+0.000ms): (0) ((1))
this was generated by:
__trace(0, need_resched());
so need_resched() is definitely set. The kernel should have rescheduled.
The other trace entries corroborate this:
> 5 80010003 0.000ms (+0.000ms): (0) ((98))
> 5 80010003 0.000ms (+0.000ms): (2) ((5))
these two entries mean that PID 2 got woken up by PID 5, and that PID 2
has a priority of 0, which is much higher than PID 5's prio 98 [the
kernel-internal priority scale is inverted], so no wonder need_resched()
is set.
Ingo
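The inverted scale can be sketched as follows. This assumes MAX_RT_PRIO = 100 as in the stock 2.6 scheduler, which agrees with the numbers in the traces (rt_prio:99 shows up as 0, rt_prio:98 as 1); the function names are illustrative only.

```python
MAX_RT_PRIO = 100  # assumption: value used by the stock 2.6 scheduler


def internal_prio(rt_prio):
    """Map a user-visible RT priority (1..99) to the kernel-internal scale."""
    return MAX_RT_PRIO - 1 - rt_prio


def should_preempt(woken_prio, current_prio):
    """On the internal scale a lower number wins, so it should preempt."""
    return woken_prio < current_prio


# IRQ 0 thread at rt_prio 99 is woken while PID 5 (internal prio 98) runs.
woken = internal_prio(99)
print(woken, should_preempt(woken, 98))
```

With these inputs the woken task has internal priority 0 and should preempt the running one, which is why need_resched() being set is the expected outcome.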
On Tue, 16 Nov 2004 23:42:57 +0100
Ingo Molnar <[email protected]> wrote:
> this is either a false positive, or we missed a preemption. To see which
> one, could you apply the attached patch and try to reproduce this long
> trace? The new trace will tell us whether need_resched is set during
> that ~1 msec window.
hmm, the output doesn't look so much different (or am i just blind?). maybe
i need to do make clean before building with this patch applied?
flo
preemption latency trace v1.0.7 on 2.6.10-rc2-mm1-RT-V0.7.27-10
-------------------------------------------------------
latency: 992 us, entries: 22 (22) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:49
-----------------
=> started at: try_to_wake_up+0x51/0x170 <c010f3a1>
=> ended at: finish_task_switch+0x51/0xb0 <c010f911>
=======>
5 80010004 0.000ms (+0.000ms): trace_start_sched_wakeup (try_to_wake_up)
5 88010003 0.000ms (+0.000ms): (50) ((98))
5 88010003 0.000ms (+0.000ms): (2) ((5))
5 88010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
5 88010003 0.000ms (+0.000ms): (0) ((1))
5 88010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
5 88010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
5 88010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
5 88010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
5 88000002 0.000ms (+0.000ms): do_softirq (irq_exit)
5 88000002 0.001ms (+0.990ms): __do_softirq (do_softirq)
5 08000000 0.991ms (+0.000ms): preempt_schedule (_mmx_memcpy)
5 98000000 0.991ms (+0.000ms): __schedule (preempt_schedule)
5 98000000 0.991ms (+0.000ms): profile_hit (__schedule)
5 98000001 0.991ms (+0.000ms): sched_clock (__schedule)
2 80000002 0.991ms (+0.000ms): __switch_to (__schedule)
2 80000002 0.991ms (+0.000ms): (5) ((2))
2 80000002 0.992ms (+0.000ms): (98) ((50))
2 80000002 0.992ms (+0.000ms): finish_task_switch (__schedule)
2 80000001 0.992ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000001 0.992ms (+0.001ms): (2) ((50))
2 80000001 0.993ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
* Florian Schmidt <[email protected]> wrote:
> hmm, the output doesn't look so much different (or am i just blind?).
> maybe i need to do make clean before building with this patch applied?
the trace is fine, note the extra 0x08000000:
> 5 88010003 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
> 5 88010003 0.000ms (+0.000ms): (0) ((1))
> 5 88010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
> 5 88010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
^--- this one
> 5 88010001 0.000ms (+0.000ms): preempt_schedule (__do_IRQ)
> 5 88010001 0.000ms (+0.000ms): irq_exit (do_IRQ)
> 5 88000002 0.000ms (+0.000ms): do_softirq (irq_exit)
> 5 88000002 0.001ms (+0.990ms): __do_softirq (do_softirq)
> 5 08000000 0.991ms (+0.000ms): preempt_schedule (_mmx_memcpy)
> 5 98000000 0.991ms (+0.000ms): __schedule (preempt_schedule)
it was zero before - indeed hard to notice optically :-|
found the reason for this latency meanwhile: it's kernel_fpu_begin() and
kernel_fpu_end() disabling/enabling preemption. (these are used by the
mmx memcpy's)
i've uploaded the -11 patch with a preliminary fix:
http://redhat.com/~mingo/realtime-preempt/
which turns off the FPU-based ops if PREEMPT_RT is specified. The speed
difference should be small but the latency difference is large ...
could you try -11, do you still see these large latencies?
Ingo
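Since the extra bit is hard to spot by eye, here is a small sketch that XORs the flags column of the old and new traces entry by entry. The values are copied from the two traces in this thread; the helper name flag_diff is hypothetical, and the sketch does not say what the bit means beyond what the message above states.

```python
def flag_diff(old_flags, new_flags):
    """XOR matching flag words from two traces; non-zero bits are the ones that changed."""
    return [int(o, 16) ^ int(n, 16) for o, n in zip(old_flags, new_flags)]


# Second column of matching entries, before and after applying the debug patch.
OLD = ["80010004", "80010003", "80010002", "80000002", "00000000", "90000000"]
NEW = ["80010004", "88010003", "88010002", "88000002", "08000000", "98000000"]

for old, new, diff in zip(OLD, NEW, flag_diff(OLD, NEW)):
    print(f"{old} -> {new}  changed bits: {diff:#010x}")
```

Every entry after the first differs by exactly 0x08000000, the bit pointed out above.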
On Wed, 17 Nov 2004 00:11:45 +0100
Ingo Molnar <[email protected]> wrote:
> > 5 88010002 0.000ms (+0.000ms): wake_up_process (redirect_hardirq)
> ^--- this one
>
> it was zero before - indeed hard to notice optically :-|
nah, i just didn't know what to look for :)
> i've uploaded the -11 patch with a preliminary fix:
>
> which turns off the FPU-based ops if PREEMPT_RT is specified. The speed
> difference should be small but the latency difference is large ...
>
> could you try -11, do you still see these large latencies?
yes, this seems to fix it. no more extra jitter or large latencies on
console switches.
Now, on to trying to lock up the machine ;)
Ah, btw: one thing i observed with my soundcard. I load the module at bootup
and chrt its IRQ handler to prio 98 (a check with chrt shows this prio
all right). Now it seems that the first time the soundcard is actually used
the thread gets back its original prio (from dmesg):
IRQ#3 thread RT prio: 42.
Maybe the sounddriver (snd-cs46xx) i use never initializes its irq before
the first time it gets used to play something. Well anyways, the workaround
is to change its prio after the first time it is used and not directly after
module loading..
flo
On Tue, 16 Nov 2004 23:55:35 +0100
Florian Schmidt <[email protected]> wrote:
> yes, this seems to fix it. no more extra jitter or large latencies on
> console switches.
>
> Now, on to trying to lock up the machine ;)
>
Arr, it did lock up again. This time i was in X, so i couldn't use any sysrq
stuff to see something. Will try tomorrow again..
flo
Florian Schmidt wrote:
> On Tue, 16 Nov 2004 23:55:35 +0100
> Florian Schmidt <[email protected]> wrote:
>
>
>>yes, this seems to fix it. no more extra jitter or large latencies on
>>console switches.
>>
>>Now, on to trying to lock up the machine ;)
>>
>
>
> Arr, it did lock up again. This time i was in X, so i couldn't use any sysrq
> stuff to see something. Will try tomorrow again..
Was this random or under some particular stress/load?
--
[email protected]
On Tue, 2004-11-16 at 18:43 +0000, Marcos D. Marado Torres wrote:
> HTTP request sent, awaiting response... 404 Not Found
> 18:43:12 ERROR 404: Not Found.
Usually means a bug was found and Ingo is uploading a new version.
Lee
On Tue, Nov 16, 2004 at 02:40:27PM +0100, Ingo Molnar wrote:
> i have released the -V0.7.27-3 Real-Time Preemption patch, which can be
> downloaded from the usual place:
Against some version of V0.7.25... that I just deleted.
bill
Initializing Cryptographic API
kgdb <20030915.1651.33> : port =3f8, IRQ=4, divisor =1
BUG: scheduling while atomic: swapper/0x00000001/1
caller is schedule+0x3f/0x13c
[<c01041f4>] dump_stack+0x23/0x27 (20)
[<c02ce307>] __sched_text_start+0xc97/0xce7 (116)
[<c02ce396>] schedule+0x3f/0x13c (36)
[<c02ce60a>] wait_for_completion+0x95/0x137 (96)
[<c0138cd8>] kthread_create+0x22a/0x22c (368)
[<c0145a30>] start_irq_thread+0x4f/0x83 (32)
[<c01453ec>] setup_irq+0x55/0x140 (36)
[<c0145655>] request_irq+0x90/0x107 (44)
[<c01e1bc1>] kgdb_enable_ints_now+0xa5/0xb0 (28)
[<c03bfb89>] kgdb_enable_ints+0x2c/0x63 (16)
[<c03a8a23>] do_initcalls+0x31/0xc6 (32)
[<c01003bb>] init+0x87/0x19a (28)
[<c0101329>] kernel_thread_helper+0x5/0xb (1047322644)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c02cfd66>] .... _raw_spin_trylock+0x1c/0x57
.....[<c01e1b31>] .. ( <= kgdb_enable_ints_now+0x15/0xb0)
.. [<c013dbe3>] .... print_traces+0x1d/0x56
.....[<c01041f4>] .. ( <= dump_stack+0x23/0x27)
--
Amit Shah
Codito Technologies Pvt. Ltd.
On Wed, Nov 17, 2004 at 01:29:33PM +0530, Amit Shah wrote:
> Initializing Cryptographic API
> kgdb <20030915.1651.33> : port =3f8, IRQ=4, divisor =1
> BUG: scheduling while atomic: swapper/0x00000001/1
> caller is schedule+0x3f/0x13c
> [<c01041f4>] dump_stack+0x23/0x27 (20)
> [<c02ce307>] __sched_text_start+0xc97/0xce7 (116)
> [<c02ce396>] schedule+0x3f/0x13c (36)
> [<c02ce60a>] wait_for_completion+0x95/0x137 (96)
> [<c0138cd8>] kthread_create+0x22a/0x22c (368)
> [<c0145a30>] start_irq_thread+0x4f/0x83 (32)
> [<c01453ec>] setup_irq+0x55/0x140 (36)
> [<c0145655>] request_irq+0x90/0x107 (44)
> [<c01e1bc1>] kgdb_enable_ints_now+0xa5/0xb0 (28)
> [<c03bfb89>] kgdb_enable_ints+0x2c/0x63 (16)
> [<c03a8a23>] do_initcalls+0x31/0xc6 (32)
> [<c01003bb>] init+0x87/0x19a (28)
> [<c0101329>] kernel_thread_helper+0x5/0xb (1047322644)
Woops, it means that KGDB needs a direct irq and not an irq-thread.
Let me see if I can work up something tonight before I head to bed.
bill
On Wed, Nov 17, 2004 at 12:26:20AM -0800, Bill Huey wrote:
> On Wed, Nov 17, 2004 at 01:29:33PM +0530, Amit Shah wrote:
> > Initializing Cryptographic API
> > kgdb <20030915.1651.33> : port =3f8, IRQ=4, divisor =1
> > BUG: scheduling while atomic: swapper/0x00000001/1
> > caller is schedule+0x3f/0x13c
> > [<c01041f4>] dump_stack+0x23/0x27 (20)
> > [<c02ce307>] __sched_text_start+0xc97/0xce7 (116)
> > [<c02ce396>] schedule+0x3f/0x13c (36)
> > [<c02ce60a>] wait_for_completion+0x95/0x137 (96)
> > [<c0138cd8>] kthread_create+0x22a/0x22c (368)
> > [<c0145a30>] start_irq_thread+0x4f/0x83 (32)
> > [<c01453ec>] setup_irq+0x55/0x140 (36)
> > [<c0145655>] request_irq+0x90/0x107 (44)
> > [<c01e1bc1>] kgdb_enable_ints_now+0xa5/0xb0 (28)
> > [<c03bfb89>] kgdb_enable_ints+0x2c/0x63 (16)
> > [<c03a8a23>] do_initcalls+0x31/0xc6 (32)
> > [<c01003bb>] init+0x87/0x19a (28)
> > [<c0101329>] kernel_thread_helper+0x5/0xb (1047322644)
>
> Woops, it means that KGDB needs a direct irq and not an irq-thread.
> Let me see if I can work up something tonight before I head to bed.
It could be horribly wrong for a number of reasons (wait for Ingo
for a proper irq code fix and additional support), but try this:
[attachment]
It should be a good hint as to how to fix this problem.
bill
* Bill Huey <[email protected]> wrote:
> + if (irqflags & SA_NODELAYFORCED) {
> + irqflags &= ~SA_NODELAYFORCED;
> + irqflags |= SA_NODELAY;
i've removed the SA_NODELAY-clearing hack from manage.c, that makes
things much cleaner.
Ingo
On Tue, 16 Nov 2004 19:06:50 -0500
john cooper <[email protected]> wrote:
> > Arr, it did lock up again. This time i was in X, so i couldn't use any sysrq
> > stuff to see something. Will try tomorrow again..
>
> Was this random or under some particular stress/load?
I had rtc_wakeup running with a rtc frequency of 8192 hz at the time plus
some general usage (reading mails, etc..) In earlier kernels it seemed that
the lock fell together with the rtc IRQ being piggy. will try to reproduce
now with the freshest RP kernel.
flo
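For scale, a worked bit of arithmetic: at 8192 Hz the rtc fires roughly every 122 us, so a latency like the 1056 us trace earlier in the thread covers several periods. The function names below are illustrative only.

```python
def period_us(hz):
    """Length of one rtc period in microseconds."""
    return 1_000_000 / hz


def periods_spanned(latency_us, hz):
    """Whole rtc periods that fit inside a latency of latency_us microseconds."""
    return latency_us * hz // 1_000_000


print(period_us(8192))
print(periods_spanned(1056, 8192))
```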
* Florian Schmidt <[email protected]> wrote:
> > Was this random or under some particular stress/load?
>
> I had rtc_wakeup running with a rtc frequency of 8192 hz at the time
> plus some general usage (reading mails, etc..) In earlier kernels it
> seemed that the lock fell together with the rtc IRQ being piggy. will
> try to reproduce now with the freshest RP kernel.
i've just uploaded the -28-0 kernel with a couple of robustness updates,
could you try that one?
Ingo
i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes & latency-reduction release.
Changes since -V0.7.27-3:
- made the UP-ioapic code a bit more conservative again - maybe some of
the lockups are related?
- removed the BKL from the sound code in a cleaner way and
removed the quite fragile 'negative ->lock_depth' code. Much less
intrusive than i originally thought, and much cleaner as well.
- more fixes to the wakeup-timing logic, 4 false positives fixed in
total, mostly related to new-task-wakeup not accurately starting the
tracer.
- fixed the mmx-memcpy related latency reported by Florian Schmidt and
others. Also turned off the MMX/SSE ops in the RAID code, which
can introduce similar latencies.
- kgdb fix from Bill Huey
- knfsd shutdown with-BKL-held fix
- highmem compilation fix
- profiling related crash fix
- implemented 'direct-path' rescheduling to further reduce scheduling
latency: the kernel will now in most cases go from try_to_wake_up()
into the scheduler directly without re-enabling interrupts ever again
(and thus not giving irq handlers a window to increase latency). This
is also the final fix for irq nesting and irq-stack recursion.
- turn off sync wakeups on PREEMPT_RT -> they are latency generators
to create a -V0.7.28-0 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm1/2.6.10-rc2-mm1.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.28-0
Ingo
* Florian Schmidt <[email protected]> wrote:
> > Was this random or under some particular stress/load?
>
> I had rtc_wakeup running with a rtc frequency of 8192 hz at the time
> plus some general usage (reading mails, etc..) In earlier kernels it
> seemed that the lock fell together with the rtc IRQ being piggy. will
> try to reproduce now with the freshest RP kernel.
could you send me the latest .config you are using on this box?
Ingo
On Wed, 17 Nov 2004 14:02:36 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > > Was this random or under some particular stress/load?
> >
> > I had rtc_wakeup running with a rtc frequency of 8192 hz at the time
> > plus some general usage (reading mails, etc..) In earlier kernels it
> > seemed that the lock fell together with the rtc IRQ being piggy. will
> > try to reproduce now with the freshest RP kernel.
I am not at all certain that there really is a correlation like this. It might
have been coincidence. This boot locked again when i was in X for 1 minute
for checking mails. So again no console output.
I will change the keyboard IRQ handler back to prio 99. Maybe sysrq is
usable then..
Will stay on the console from now until next lockup to see if i get anything
out of sysrq (although the 50 lines vga console is probably of limited use
here).
>
> could you send me the latest .config you are using on this box?
sure. attached.
flo
p.s.: some more system info:
~$ cat /proc/cpuinfo
processor : 0
vendor_id : AuthenticAMD
cpu family : 6
model : 4
model name : AMD Athlon(tm) Processor
stepping : 2
cpu MHz : 1195.144
cache size : 256 KB
fdiv_bug : no
hlt_bug : no
f00f_bug : no
coma_bug : no
fpu : yes
fpu_exception : yes
cpuid level : 1
wp : yes
flags : fpu vme de pse tsc msr pae mce cx8 sep mtrr pge mca cmov pat pse36 mmx fxsr pni syscall mmxext 3dnowext 3dnow
bogomips : 2359.29
0000:00:00.0 Host bridge: Silicon Integrated Systems [SiS] 735 Host (rev 01)
Flags: bus master, medium devsel, latency 32
Memory at d0000000 (32-bit, non-prefetchable) [size=256M]
Capabilities: <available only to root>
0000:00:01.0 PCI bridge: Silicon Integrated Systems [SiS] Virtual PCI-to-PCI bridge (AGP) (prog-if 00 [Normal decode])
Flags: bus master, fast devsel, latency 64
Bus: primary=00, secondary=01, subordinate=01, sec-latency=64
Memory behind bridge: cdc00000-cfcfffff
Prefetchable memory behind bridge: bd900000-cdafffff
0000:00:02.0 ISA bridge: Silicon Integrated Systems [SiS] SiS85C503/5513 (LPC Bridge)
Flags: bus master, medium devsel, latency 0
0000:00:02.1 SMBus: Silicon Integrated Systems [SiS]: Unknown device 0016
Flags: medium devsel
I/O ports at 0c00 [size=32]
0000:00:02.5 IDE interface: Silicon Integrated Systems [SiS] 5513 [IDE] (rev d0) (prog-if 80 [Master])
Subsystem: Silicon Integrated Systems [SiS] SiS5513 EIDE Controller (A,B step)
Flags: bus master, fast devsel, latency 128
I/O ports at ff00 [size=16]
0000:00:02.7 Multimedia audio controller: Silicon Integrated Systems [SiS] Sound Controller (rev a0)
Subsystem: Elitegroup Computer Systems: Unknown device 0a14
Flags: bus master, medium devsel, latency 64, IRQ 10
I/O ports at dc00 [size=256]
I/O ports at d800 [size=64]
Capabilities: <available only to root>
0000:00:03.0 Ethernet controller: Silicon Integrated Systems [SiS] SiS900 PCI Fast Ethernet (rev 90)
Subsystem: Elitegroup Computer Systems: Unknown device 0a14
Flags: bus master, medium devsel, latency 64, IRQ 5
I/O ports at d400 [size=256]
Memory at cfffe000 (32-bit, non-prefetchable) [size=4K]
Expansion ROM at cffc0000 [disabled] [size=128K]
Capabilities: <available only to root>
0000:00:0f.0 Multimedia audio controller: Cirrus Logic CS 4614/22/24 [CrystalClear SoundFusion Audio Accelerator] (rev 01)
Subsystem: TERRATEC Electronic GmbH: Unknown device 112e
Flags: bus master, medium devsel, latency 64, IRQ 3
Memory at cfffd000 (32-bit, non-prefetchable) [size=4K]
Memory at cfe00000 (32-bit, non-prefetchable) [size=1M]
Capabilities: <available only to root>
0000:01:00.0 VGA compatible controller: nVidia Corporation NV20 [GeForce3 Ti 200] (rev a3) (prog-if 00 [VGA])
Subsystem: Micro-Star International Co., Ltd.: Unknown device 8503
Flags: bus master, 66MHz, medium devsel, latency 64, IRQ 11
Memory at ce000000 (32-bit, non-prefetchable) [size=16M]
Memory at c0000000 (32-bit, prefetchable) [size=128M]
Memory at cda80000 (32-bit, prefetchable) [size=512K]
Expansion ROM at cfcf0000 [disabled] [size=64K]
Capabilities: <available only to root>
what else would be interesting for you?
flo
* Florian Schmidt <[email protected]> wrote:
> > could you send me the latest .config you are using on this box?
>
> sure. attached.
> what else would be interesting for you?
have you kicked the latency tracer by clearing preempt_max_latency, or
is it at the default (off) value?
Ingo
* Florian Schmidt <[email protected]> wrote:
> > > I had rtc_wakeup running with a rtc frequency of 8192 hz at the time
> > > plus some general usage (reading mails, etc..) In earlier kernels it
> > > seemed that the lock fell together with the rtc IRQ being piggy. will
> > > try to reproduce now with the freshest RP kernel.
>
> I am not at all certain that there really is a correlation like this. It
> might have been coincidence. This boot locked again when i was in X
> for 1 minute for checking mails. So again no console output.
managed to reproduce the lockup on my testbox, using your .config,
running rtc_wakeup -f 8192 and starting X. Hard hang and i had IRQ1 at
prio 99. Will turn on the NMI watchdog now, hopefully this lockup will
stay easy to reproduce.
Ingo
On Wed, 17 Nov 2004 14:41:50 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > > could you send me the latest .config you are using on this box?
> >
> > sure. attached.
>
> > what else would be interesting for you?
>
> have you kicked the latency tracer by clearing preempt_max_latency, or
> is it at the default (off) value?
I didn't touch it.
flo
On Wed, 17 Nov 2004 14:45:09 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > > > I had rtc_wakeup running with a rtc frequency of 8192 hz at the time
> > > > plus some general usage (reading mails, etc..) In earlier kernels it
> > > > seemed that the lock fell together with the rtc IRQ being piggy. will
> > > > try to reproduce now with the freshest RP kernel.
> >
> > I am not at all certain that there really is a correlation like this. It
> > might have been coincidence. This boot locked again when i was in X
> > for 1 minute for checking mails. So again no console output.
>
> managed to reproduce the lockup on my testbox, using your .config,
> running rtc_wakeup -f 8192 and starting X. Hard hang and i had IRQ1 at
> prio 99. Will turn on the NMI watchdog now, hopefully this lockup will
> stay easy to reproduce.
Hi,
i experienced another one. But as i stayed on the console sysrq was
available, so i can send you the last locks listed by sysrq-t.
The scenario was this:
rtc_wakeup -f 8192 in one console
some find /'s in another
Now i changed to a third console and put some load on the system by doing
make clean bzImage in some kernel source dir.
right after hitting enter after typing "make clean bzImage" i got another
piggy message and the machine locked. It seems (to my uneducated mind) cc1
and rtc_wakeup both are involved with this as the list of held locks (or the
part of the list which i can see) shows them.
there were 5 locks of the following form
&drive->gendev_rel_sem
init
init_hwif_data
2 locks of this form:
&tty->atomic_read
getty
read_chan
and these:
&mm->page_table_lock
cc1
exit_mmap
&mm->mmap_sem
rtc_wakeup
do_page_fault
&mm->page_table_lock
rtc_wakeup
handle_mm_fault
&serio_lock
IRQ 1
serio_interrupt
sysrq_table_lock
IRQ 1
__handle_sysrq
flo
Hi,
After I finished using my USB HDD, I unmounted it.
When I unplugged it from my computer, I got the following output:
Nov 17 21:13:11 magf1 kernel: usb 1-5: USB disconnect, address 2
Nov 17 21:13:11 magf1 kernel: slab error in kmem_cache_destroy(): cache `scsi_cmd_cache': Can't free all objects
Nov 17 21:13:11 magf1 kernel: [<c0104164>] dump_stack+0x23/0x27 (20)
Nov 17 21:13:11 magf1 kernel: [<c0151ba5>] kmem_cache_destroy+0x106/0x194 (28)
Nov 17 21:13:11 magf1 kernel: [<c02c99b5>] scsi_destroy_command_freelist+0x53/0x81 (28)
Nov 17 21:13:11 magf1 kernel: [<c02ca866>] scsi_host_dev_release+0x43/0x164 (172)
Nov 17 21:13:11 magf1 kernel: [<c0295b4d>] device_release+0x7c/0x80 (32)
Nov 17 21:13:11 magf1 kernel: [<c0226b94>] kobject_cleanup+0x94/0x96 (32)
Nov 17 21:13:11 magf1 kernel: [<c02276ba>] kref_put+0x46/0xf0 (40)
Nov 17 21:13:11 magf1 kernel: [<c0226bd2>] kobject_put+0x25/0x29 (16)
Nov 17 21:13:11 magf1 kernel: [<d8880146>] usb_stor_release_resources+0x7e/0x133 [usb_storage] (24)
Nov 17 21:13:11 magf1 kernel: [<d88806d6>] storage_disconnect+0x9a/0xa8 [usb_storage] (16)
Nov 17 21:13:11 magf1 kernel: [<d88a118c>] usb_unbind_interface+0x8b/0x8d [usbcore] (28)
Nov 17 21:13:11 magf1 kernel: [<c0296d92>] device_release_driver+0x86/0x88 (28)
Nov 17 21:13:11 magf1 kernel: [<c0296f98>] bus_remove_device+0x56/0x86 (20)
Nov 17 21:13:11 magf1 kernel: [<c0295f91>] device_del+0x4f/0x88 (20)
Nov 17 21:13:11 magf1 kernel: [<d88a947c>] usb_disable_device+0xd1/0x172 [usbcore] (44)
Nov 17 21:13:11 magf1 kernel: [<d88a3c97>] usb_disconnect+0xb0/0x187 [usbcore] (44)
Nov 17 21:13:11 magf1 kernel: [<d88a5309>] hub_port_connect_change+0x471/0x4a0 [usbcore] (72)
Nov 17 21:13:11 magf1 kernel: [<d88a5779>] hub_events+0x441/0x52d [usbcore] (76)
Nov 17 21:13:11 magf1 kernel: [<d88a589c>] hub_thread+0x37/0x120 [usbcore] (96)
Nov 17 21:13:11 magf1 kernel: [<c0101329>] kernel_thread_helper+0x5/0xb (690024468)
Nov 17 21:13:11 magf1 kernel: ---------------------------
Nov 17 21:13:11 magf1 kernel: | preempt count: 00000001 ]
Nov 17 21:13:11 magf1 kernel: | 1-level deep critical section nesting:
Nov 17 21:13:11 magf1 kernel: ----------------------------------------
Nov 17 21:13:11 magf1 kernel: .. [<c013ce8c>] .... print_traces+0x1d/0x56
Nov 17 21:13:11 magf1 kernel: .....[<c0104164>] .. ( <= dump_stack+0x23/0x27)
Nov 17 21:13:11 magf1 kernel:
On linux-2.6.6/2.6.7 and Fedora Core 2, my USB HDD works properly. On the
realtime-preempt kernel, I have this problem.
Paul
----- Original Message -----
From: "Florian Schmidt" <[email protected]>
To: "Ingo Molnar" <[email protected]>
Cc: "john cooper" <[email protected]>; "K.R. Foley" <[email protected]>;
<[email protected]>; <[email protected]>; "Lee Revell"
<[email protected]>; "Rui Nuno Capela" <[email protected]>; "Bill Huey"
<[email protected]>; "Adam Heath" <[email protected]>; "Thomas Gleixner"
<[email protected]>; "Michal Schmidt" <[email protected]>;
"Fernando Pablo Lopez-Lezcano" <[email protected]>; "Karsten Wiese"
<[email protected]>; "Gunther Persoons"
<[email protected]>; <[email protected]>; "Shane Shrybman"
<[email protected]>; "Amit Shah" <[email protected]>; "Stefan Schweizer"
<[email protected]>
Sent: Wednesday, November 17, 2004 8:59 PM
Subject: Re: [patch] Real-Time Preemption, -RT-2.6.10-rc2-mm1-V0.7.27-3
> On Wed, 17 Nov 2004 14:41:50 +0100
> Ingo Molnar <[email protected]> wrote:
>
> >
> > * Florian Schmidt <[email protected]> wrote:
> >
> > > > could you send me the latest .config you are using on this box?
> > >
> > > sure. attached.
> >
> > > what else would be interesting for you?
> >
> > have you kicked the latency tracer by clearing preempt_max_latency, or
> > is it at the default (off) value?
>
> I didn't touch it.
>
> flo
>great. The current release is meanwhile at -V0.7.27-10, which includes
>other minor updates:
A kernel built with -V0.7.27-10 had the following BUG early in the
boot sequence. All boot messages prior to this point were basically
the same as I sent in the previous report.
--Mark
[this is where -4 died...]
testing NMI watchdog ... OK.
checking TSC synchronization across 2 CPUs: passed.
IRQ#0 thread RT prio: 49.
BUG at kernel/softirq.c:514!
------------[ cut here ]------------
kernel BUG at kernel/softirq.c:514!
invalid operand: 0000 [#1]
PREEMPT SMP
Modules linked in:
CPU: 0
EIP: 0060:[<c012751e>] Not tainted VLI
EFLAGS: 00010286 (2.6.10-rc2-mm1-RT-V0.7.27-10)
EIP is at cpu_callback+0xfe/0x130
eax: 0000001d ebx: 00000000 ecx: c032ab6e edx: dff82000
esi: 00000000 edi: 00000000 ebp: dff83fb0 esp: dff83f98
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process swapper (pid: 1, threadinfo=dff82000 task=dff81460)
Stack: c033cefe c033f16e 00000202 c013b924 00000000 00000000 dff83fc8 c03eb7ee
       c037843c 00000003 00000000 c0100350 dff83fd0 c0100302 dff83fec c0100387
       00000008 00000000 0000007b c0100350 00000000 00000000 c0102019 00000000
Call Trace:
[<c0104e3f>] show_stack+0x8f/0xb0 (28)
[<c0104fff>] show_registers+0x16f/0x1e0 (56)
[<c0105236>] die+0x106/0x190 (64)
[<c0105810>] do_invalid_op+0x130/0x140 (192)
[<c0104a7f>] error_code+0x2b/0x30 (84)
[<c03eb7ee>] spawn_ksoftirqd+0x2e/0x60 (24)
[<c0100302>] do_pre_smp_initcalls+0x12/0x20 (8)
[<c0100387>] init+0x37/0x1b0 (28)
[<c0102019>] kernel_thread_helper+0x5/0xc (537378836)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c032c702>] .... _raw_spin_lock_irqsave+0x22/0x90
.....[<c0105174>] .. ( <= die+0x44/0x190)
.. [<c013dfbd>] .... print_traces+0x1d/0x60
.....[<c0104e3f>] .. ( <= show_stack+0x8f/0xb0)
Code: 0f 0b 03 02 6e f1 33 c0 e9 77 ff ff ff c7 04 24 fe ce 33 c0 b8 02 02 00 00 89 44 24 08 b8 6e f1 33 c0 89 44 24 04 e8 62 9c ff ff <0f> 0b 02 02 6e f1 33 c0 8b 14 b5 20 20 41 c0 e9 39 ff ff ff 8b
<0>Kernel panic - not syncing: Attempted to kill init!
* Florian Schmidt <[email protected]> wrote:
> > managed to reproduce the lockup on my testbox, using your .config,
> > running rtc_wakeup -f 8192 and starting X. Hard hang and i had IRQ1 at
> > prio 99. Will turn on the NMI watchdog now, hopefully this lockup will
> > stay easy to reproduce.
>
> Hi,
>
> i experienced another one. But as i stayed on the console sysrq was
> available, so i can send you the last locks listed by sysrq-t.
i can now reproduce it at will. It's some sort of MM deadlock (infinite
pagefaults in rtc_wakeup), i'm debugging it now.
Ingo
On Wed, 17 Nov 2004, Ingo Molnar wrote:
>
> i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes & latency-reduction release.
>
> Changes since -V0.7.27-3:
Running .26-5. Been running almost 2 days. All small latency values. Then,
just a few minutes ago, got a 133us value:
preemption latency trace v1.0.7 on 2.6.10-rc1-mm3-RT-V0.7.26-5
-------------------------------------------------------
latency: 133 us, entries: 41 (41) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 15/19, uid:0 nice:-10 policy:1 rt_prio:46
-----------------
=> started at: try_to_wake_up+0x5a/0x110 <c0115f1a>
=> ended at: finish_task_switch+0x51/0xc0 <c0116401>
=======>
0 80000000 00000000 [0284618592175289] 0.000ms (+0.000ms): trace_start_sched_wakeup+0x8e/0xc0 <c013623e> (try_to_wake_up+0x5a/0x110 <c0115f1a>)
0 80000000 00000001 [0284618592175411] 0.000ms (+0.000ms): <00000034> (<0000008c>)
0 80000000 00000002 [0284618592175481] 0.000ms (+0.000ms): <00000013> (<00000000>)
0 80000000 00000003 [0284618592175763] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (try_to_wake_up+0xf6/0x110 <c0115fb6>)
0 80000000 00000004 [0284618592175975] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (__do_IRQ+0x13d/0x170 <c013f3cd>)
0 80000000 00000005 [0284618592176118] 0.000ms (+0.127ms): irq_exit+0xb/0x50 <c013f10b> (do_IRQ+0x53/0x70 <c01080d3>)
0 80000000 00000006 [0284618592387634] 0.127ms (+0.000ms): do_IRQ+0x0/0x70 <c0108080> (<0000a253>)
0 80000000 00000007 [0284618592387690] 0.127ms (+0.000ms): do_IRQ+0x0/0x70 <c0108080> (<0000000e>)
0 80000000 00000008 [0284618592387909] 0.128ms (+0.001ms): mask_and_ack_8259A+0xe/0x110 <c010b98e> (__do_IRQ+0x86/0x170 <c013f316>)
0 80000000 00000009 [0284618592390653] 0.129ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (__do_IRQ+0x86/0x170 <c013f316>)
0 80000000 0000000a [0284618592390834] 0.129ms (+0.000ms): redirect_hardirq+0xe/0xa0 <c013f15e> (__do_IRQ+0xbe/0x170 <c013f34e>)
0 80000000 0000000b [0284618592390971] 0.129ms (+0.000ms): wake_up_process+0xb/0x30 <c0115fdb> (redirect_hardirq+0x6e/0xa0 <c013f1be>)
0 80000000 0000000c [0284618592391058] 0.129ms (+0.000ms): try_to_wake_up+0xe/0x110 <c0115ece> (wake_up_process+0x2b/0x30 <c0115ffb>)
0 80000000 0000000d [0284618592391145] 0.130ms (+0.000ms): task_rq_lock+0xb/0x30 <c0115a8b> (try_to_wake_up+0x27/0x110 <c0115ee7>)
0 80000000 0000000e [0284618592391351] 0.130ms (+0.000ms): activate_task+0x11/0xa0 <c0115df1> (try_to_wake_up+0x5a/0x110 <c0115f1a>)
0 80000000 0000000f [0284618592391455] 0.130ms (+0.000ms): sched_clock+0x14/0x80 <c01103c4> (activate_task+0x1c/0xa0 <c0115dfc>)
0 80000000 00000010 [0284618592391614] 0.130ms (+0.000ms): recalc_task_prio+0xe/0x190 <c0115c5e> (activate_task+0x32/0xa0 <c0115e12>)
0 80000000 00000011 [0284618592391773] 0.130ms (+0.000ms): effective_prio+0x8/0x60 <c0115bf8> (recalc_task_prio+0x98/0x190 <c0115ce8>)
0 80000000 00000012 [0284618592391929] 0.130ms (+0.000ms): enqueue_task+0x12/0x50 <c0115bb2> (activate_task+0x6c/0xa0 <c0115e4c>)
0 80000000 00000013 [0284618592392059] 0.130ms (+0.000ms): trace_start_sched_wakeup+0xe/0xc0 <c01361be> (try_to_wake_up+0x5a/0x110 <c0115f1a>)
0 80000000 00000014 [0284618592392333] 0.130ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (try_to_wake_up+0x5a/0x110 <c0115f1a>)
0 80000000 00000015 [0284618592392439] 0.130ms (+0.000ms): <00000034> (<0000008c>)
0 80000000 00000016 [0284618592392510] 0.130ms (+0.000ms): <00000012> (<00000000>)
0 80000000 00000017 [0284618592392714] 0.130ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (try_to_wake_up+0xf6/0x110 <c0115fb6>)
0 80000000 00000018 [0284618592392916] 0.131ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (__do_IRQ+0x13d/0x170 <c013f3cd>)
0 80000000 00000019 [0284618592393022] 0.131ms (+0.000ms): irq_exit+0xb/0x50 <c013f10b> (do_IRQ+0x53/0x70 <c01080d3>)
0 80000000 0000001a [0284618592393709] 0.131ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (apm_bios_call_simple+0xd5/0xf0 <c0110ff5>)
0 80000000 0000001b [0284618592393899] 0.131ms (+0.000ms): apm_do_busy+0xb/0x40 <c01111eb> (cpu_idle+0x4c/0x70 <c0103f9c>)
0 80000000 0000001c [0284618592393987] 0.131ms (+0.000ms): apm_bios_call_simple+0xe/0xf0 <c0110f2e> (apm_do_busy+0x2e/0x40 <c011120e>)
0 80000000 0000001d [0284618592394648] 0.132ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (apm_bios_call_simple+0xd5/0xf0 <c0110ff5>)
0 00000000 0000001e [0284618592394819] 0.132ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (cpu_idle+0x64/0x70 <c0103fb4>)
0 80000000 0000001f [0284618592394966] 0.132ms (+0.000ms): __schedule+0xe/0x6b0 <c02bb38e> (preempt_schedule+0x58/0x80 <c02bbbc8>)
0 80000000 00000020 [0284618592395053] 0.132ms (+0.000ms): profile_hit+0x9/0x50 <c011bec9> (__schedule+0x43/0x6b0 <c02bb3c3>)
0 80000000 00000021 [0284618592395254] 0.132ms (+0.000ms): sched_clock+0x14/0x80 <c01103c4> (__schedule+0x70/0x6b0 <c02bb3f0>)
19 80000000 00000022 [0284618592395691] 0.132ms (+0.000ms): __switch_to+0xe/0x190 <c01049ae> (__schedule+0x2fc/0x6b0 <c02bb67c>)
19 80000000 00000023 [0284618592395824] 0.132ms (+0.000ms): <00000000> (<00000013>)
19 80000000 00000024 [0284618592395876] 0.132ms (+0.000ms): <0000008c> (<00000034>)
19 80000000 00000025 [0284618592395941] 0.132ms (+0.000ms): finish_task_switch+0x14/0xc0 <c01163c4> (__schedule+0x345/0x6b0 <c02bb6c5>)
19 80000000 00000026 [0284618592396082] 0.133ms (+0.000ms): trace_stop_sched_switched+0x14/0x190 <c0136284> (finish_task_switch+0x51/0xc0 <c0116401>)
19 80000000 00000027 [0284618592396150] 0.133ms (+0.002ms): <00000013> (<00000034>)
19 80000000 00000028 [0284618592399999] 0.135ms (+0.000ms): trace_stop_sched_switched+0x133/0x190 <c01363a3> (finish_task_switch+0x51/0xc0 <c0116401>)
Note the jump in irq_exit/do_IRQ.
I have .27-11 installed, but haven't rebooted into it yet. Will compile .28-0,
and hopefully boot into it today.
Ingo Molnar wrote:
> i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes & latency-reduction release.
>
I know I am late reporting this but I didn't figure it out until late
this afternoon. I had trouble booting this one on my SMP workstation at
the office. It would hang after it had almost finished booting. Anyway
the solution was to disable tracing in /etc/rc.local and then re-enable
it after it has finished booting. I know this happens late in the boot
but it works for me.
echo 0 > /proc/sys/kernel/trace_enabled
#echo 0 > /proc/sys/kernel/preempt_wakeup_timing
#echo 50 > /proc/sys/kernel/preempt_max_latency
To be honest I am not sure which of the above fixes the late boot hang
and I didn't have time to figure it out either. This doesn't happen on
my SMP system here.
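
For anyone who wants to script the workaround above rather than edit
/etc/rc.local by hand, here is a minimal sketch. The file name
trace_enabled and the directory /proc/sys/kernel come from the message;
the helper name set_tracing, its sysctl_dir parameter, and the idea that
writing 1 turns tracing back on are assumptions made for illustration.

```python
import os

TRACE_KNOB = "trace_enabled"  # file name taken from the message above


def set_tracing(enabled, sysctl_dir="/proc/sys/kernel"):
    """Write 1 or 0 to the trace_enabled knob and return the value written.

    sysctl_dir is a parameter only so the helper can be tried against a
    scratch directory; on a real system the write needs root.
    Assumption: writing "1" re-enables tracing.
    """
    value = "1" if enabled else "0"
    with open(os.path.join(sysctl_dir, TRACE_KNOB), "w") as knob:
        knob.write(value + "\n")
    return value
```

Calling set_tracing(False) early in boot and set_tracing(True) once boot
has finished would mirror what the rc.local lines do.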
kr
* K.R. Foley wrote:
> I know I am late reporting this but I didn't figure it out until late
> this afternoon. I had trouble booting this one on my SMP workstation
> at the office. It would hang after it had almost finished booting.
> Anyway the solution was to disable tracing in /etc/rc.local and then
> re-enable it after it has finished booting. I know this happens late
> in the boot but it works for me.
>
> echo 0 > /proc/sys/kernel/trace_enabled
> #echo 0 > /proc/sys/kernel/preempt_wakeup_timing
> #echo 50 > /proc/sys/kernel/preempt_max_latency
>
> To be honest I am not sure which of the above fixes the late boot hang
> and I didn't have time to figure it out either. This doesn't happen on
> my SMP system here.
there's a generic bug i'm chasing right now that seems to get worse with
tracing enabled. The symptom of the bug is typically a system hang.
Ingo
On Wed, 2004-11-17 at 13:42 +0100, Ingo Molnar wrote:
> i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes & latency-reduction release.
Hi,
here's another message log this time on my Dell laptop when removing my
prism wlan pccard running the hostap driver.
BTW I didn't see my previous report about my vdr/router box on lkml so I
hope it got through. Otherwise just tell me and I'll resend it.
Christian
--
Christian Meder
The Way-Seeking Mind of a tenzo is actualized
by rolling up your sleeves.
(Eihei Dogen Zenji)
i have released the -V0.7.28-1 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this should fix the lockup bug reported by Florian Schmidt.
there's a generic PREEMPT bug in the upstream kernel: there is a
single-instruction race window in __flush_tlb(). For the bug to trigger,
all of the following have to happen: the kernel is preempted exactly in
that window while running a lazy-TLB thread; certain other, rare
scheduling and MM properties are true as well (a certain constellation
of threads and lazy-TLB kernel threads occurs); the lazy-TLB task then
gets another user TLB to inherit; and it switches to a task from which
it inherited that new TLB, so that the wrong cr3 is loaded and inherited
by this next, non-lazy-TLB task. then (and only then) this scenario
would typically manifest itself in the form of an infinite pagefault
lockup occurring long after the fact, upon the next userspace access (to
the joy of a totally baffled kernel developer). I suspect from the
description you can guess how much fun it was to debug it =B-)
the bug is even more rare in the generic kernel, because there most (but
not all) TLB flush points are in a critical section.
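
To make the race window easier to picture, here is a toy model in
Python. It is a sketch under assumptions, not the kernel code: the
message does not show __flush_tlb(), so the model assumes the flush is
"read cr3 into a temporary, then write the temporary back", with the
window sitting between the two steps. ToyCPU, racy_flush, safe_flush
and switch_to_b are hypothetical names, and safe_flush is just one
possible way to close the window, not necessarily what the patch does.

```python
class ToyCPU:
    """Models only the page-table base register (cr3 on x86)."""

    def __init__(self, cr3):
        self.cr3 = cr3


def switch_to_b(cpu):
    """Stand-in for a context switch that loads another address space."""
    cpu.cr3 = "mm_B"


def racy_flush(cpu, preempt=None):
    # Step 1: read the current cr3 into a temporary.
    tmp = cpu.cr3
    # Race window: if the task is preempted here and resumes with a
    # different address space loaded, tmp is stale.
    if preempt is not None:
        preempt(cpu)
    # Step 2: write the temporary back (this is the actual flush).
    cpu.cr3 = tmp


def safe_flush(cpu, preempt=None):
    # Read and write-back happen with nothing in between; a pending
    # preemption can only run after the pair has completed.
    tmp = cpu.cr3
    cpu.cr3 = tmp
    if preempt is not None:
        preempt(cpu)


cpu = ToyCPU("mm_A")
racy_flush(cpu, preempt=switch_to_b)
# cpu.cr3 is now the stale "mm_A" although "mm_B" was switched in.
```

In the toy model the stale value is visible immediately; in the real
scenario described above it only shows up on the next userspace access.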
this fix could resolve some of the other 'my box just locked up'
reports.
Changes since -V0.7.28-0:
- reverted the UP-ioapic change - it was unrelated to the lockup and it
is known to cause problems on certain IDE/soundcard combinations.
- fixed and improved the trace_print_on_crash tracing feature - it was
highly needed to find the TLB bug ...
to create a -V0.7.28-1 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm1/2.6.10-rc2-mm1.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm1-V0.7.28-1
Ingo
On Thu, 18 Nov 2004 13:35:21 +0100
Ingo Molnar wrote:
>
> i have released the -V0.7.28-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this should fix the lockup bug reported by Florian Schmidt.
great news! did you get any sleep at all? anyways, built and booted fine.
have been putting load on the system for 15 minutes now. If it locks up
again, i'll write another mail.
flo
* Adam Heath wrote:
> Running .26-5. Been running almost 2 days. All small latency values.
> Then, just a few minutes ago, got a 133us value:
this entry has most of the overhead:
> 0 80000000 00000004 [0284618592175975] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (__do_IRQ+0x13d/0x170 <c013f3cd>)
> 0 80000000 00000005 [0284618592176118] 0.000ms (+0.127ms): irq_exit+0xb/0x50 <c013f10b> (do_IRQ+0x53/0x70 <c01080d3>)
> 0 80000000 00000006 [0284618592387634] 0.127ms (+0.000ms): do_IRQ+0x0/0x70 <c0108080> (<0000a253>)
> 0 80000000 00000007 [0284618592387690] 0.127ms (+0.000ms): do_IRQ+0x0/0x70 <c0108080> (<0000000e>)
this shows that we interrupted some longer critical section - in this
case it seems to be BIOS/APM code.
> Note the jump in irq_exit/do_IRQ.
that jump is a delayed interrupt hitting the BIOS on its way out of APM
idle mode it seems:
> 0 80000000 0000001b [0284618592393899] 0.131ms (+0.000ms): apm_do_busy+0xb/0x40 <c01111eb> (cpu_idle+0x4c/0x70 <c0103f9c>)
> 0 80000000 0000001c [0284618592393987] 0.131ms (+0.000ms): apm_bios_call_simple+0xe/0xf0 <c0110f2e> (apm_do_busy+0x2e/0x40 <c011120e>)
There's nothing to be done about that, except to disable APM. Perhaps
you could try ACPI, maybe that doesn't have such latencies in the BIOS.
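
For long traces, the entry carrying most of the overhead can be picked
out mechanically. The sketch below parses lines in the format quoted
above and returns the one with the largest (+delta). The function names
are made up for illustration, and treating the bracketed number as a
raw cycle counter is an assumption; the thread does not say what it is.

```python
import re

# One trace line; leading "> " quote markers are tolerated.
# Assumption: the bracketed number is a raw cycle counter.
TRACE_RE = re.compile(
    r"^(?:>\s*)*\s*\d+\s+[0-9a-f]{8}\s+(?P<idx>[0-9a-f]{8})\s+"
    r"\[(?P<cycles>\d+)\]\s+(?P<t_ms>\d+\.\d+)ms\s+"
    r"\(\+(?P<delta_ms>\d+\.\d+)ms\):\s+(?P<what>.*)$"
)

SAMPLE = """\
> 0 80000000 00000004 [0284618592175975] 0.000ms (+0.000ms): preempt_schedule+0x11/0x80 <c02bbb81> (__do_IRQ+0x13d/0x170 <c013f3cd>)
> 0 80000000 00000005 [0284618592176118] 0.000ms (+0.127ms): irq_exit+0xb/0x50 <c013f10b> (do_IRQ+0x53/0x70 <c01080d3>)
> 0 80000000 00000006 [0284618592387634] 0.127ms (+0.000ms): do_IRQ+0x0/0x70 <c0108080> (<0000a253>)
"""


def parse_trace(text):
    """Return one dict per trace line that matches the expected format."""
    entries = []
    for line in text.splitlines():
        m = TRACE_RE.match(line)
        if m:
            entries.append({
                "idx": int(m.group("idx"), 16),
                "cycles": int(m.group("cycles")),
                "t_ms": float(m.group("t_ms")),
                "delta_ms": float(m.group("delta_ms")),
                "what": m.group("what"),
            })
    return entries


def largest_gap(entries):
    """Entry after which the most time passed before the next entry."""
    return max(entries, key=lambda e: e["delta_ms"])


def cycles_per_us(first, second):
    """Counter ticks per microsecond between two entries."""
    elapsed_us = (second["t_ms"] - first["t_ms"]) * 1000.0
    return (second["cycles"] - first["cycles"]) / elapsed_us


entries = parse_trace(SAMPLE)
worst = largest_gap(entries)  # the irq_exit entry with +0.127ms
```

On the three sample lines, cycles_per_us(entries[1], entries[2]) works
out to roughly 1665 ticks per microsecond, which is at least consistent
with the bracketed field being a cycle counter on this machine.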
Ingo
Ingo Molnar wrote:
>
> i have released the -V0.7.28-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this should fix the lockup bug reported by Florian Schmidt.
>
I'm still seeing this sometimes (not every time) on my P4/UP laptop while
shutting down ALSA modules. This isn't the same as the lockup I've been
reporting lately (that happens on my P4/SMT desktop) but may be remotely
related.
Nov 18 12:22:21 lambda kernel: BUG: Unable to handle kernel NULL pointer
dereference at virtual address 00000000
Nov 18 12:22:21 lambda kernel: printing eip:
Nov 18 12:22:21 lambda kernel: c0129a4b
Nov 18 12:22:21 lambda kernel: *pde = 00000000
Nov 18 12:22:21 lambda kernel: Oops: 0000 [#1]
Nov 18 12:22:21 lambda kernel: PREEMPT
Nov 18 12:22:21 lambda kernel: Modules linked in: realtime commoncap
snd_ali5451 snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore
pcmcia yenta_socket pcmcia_core natsemi crc32 loop evdev ohci_hcd usbcore
Nov 18 12:22:21 lambda kernel: CPU: 0
Nov 18 12:22:21 lambda kernel: EIP: 0060:[__up_mutex+59/384] Not
tainted VLI
Nov 18 12:22:21 lambda kernel: EIP: 0060:[<c0129a4b>] Not tainted VLI
Nov 18 12:22:21 lambda kernel: EFLAGS: 00010006
(2.6.10-rc2-mm1-RT-V0.7.28-1)
Nov 18 12:22:21 lambda kernel: EIP is at __up_mutex+0x3b/0x180
Nov 18 12:22:21 lambda kernel: eax: 00000000 ebx: d70c2000 ecx:
de2510d0 edx: 00000063
Nov 18 12:22:21 lambda kernel: esi: de2510d0 edi: e00f2894 ebp:
c02fa090 esp: d70c3ef0
Nov 18 12:22:21 lambda kernel: ds: 007b es: 007b ss: 0068 preempt:
00000004
Nov 18 12:22:21 lambda kernel: Process rmmod (pid: 6836,
threadinfo=d70c2000 task=dca7ad70)
Nov 18 12:22:21 lambda kernel: Stack: 00000282 c0139cdc 00000286 00000282
c01ad4cb 00000000 d70c2000 c0301908
Nov 18 12:22:21 lambda kernel: c02fa080 c02fa090 c012a165 e00f28a4
c01ad4cb e00f28bc c01ad4cd bfffd3f0
Nov 18 12:22:21 lambda kernel: d70c2000 c01adadf e00f28a4 00000000
bfffd3f0 d70c2000 e00f28a4 00000000
Nov 18 12:22:21 lambda kernel: Call Trace:
Nov 18 12:22:21 lambda kernel: [kmem_cache_free+74/199]
kmem_cache_free+0x4a/0xc7 (8)
Nov 18 12:22:21 lambda kernel: [<c0139cdc>] kmem_cache_free+0x4a/0xc7 (8)
Nov 18 12:22:21 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (12)
Nov 18 12:22:21 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (12)
Nov 18 12:22:21 lambda kernel: [up+53/61] up+0x35/0x3d (24)
Nov 18 12:22:21 lambda kernel: [<c012a165>] up+0x35/0x3d (24)
Nov 18 12:22:21 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (8)
Nov 18 12:22:21 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (8)
Nov 18 12:22:21 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 18 12:22:21 lambda kernel: [<c01ad4cd>] kobject_release+0x0/0x8 (8)
Nov 18 12:22:21 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 18 12:22:21 lambda kernel: [<c01adadf>] kref_put+0x51/0xc2 (12)
Nov 18 12:22:21 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 18 12:22:21 lambda kernel: [<c01f3aff>] bus_remove_driver+0x3f/0x48 (36)
Nov 18 12:22:21 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 18 12:22:21 lambda kernel: [<c01f3e98>] driver_unregister+0xb/0x1a (8)
Nov 18 12:22:21 lambda kernel: [pci_unregister_driver+11/19]
pci_unregister_driver+0xb/0x13 (8)
Nov 18 12:22:21 lambda kernel: [<c01b4b10>]
pci_unregister_driver+0xb/0x13 (8)
Nov 18 12:22:21 lambda kernel: [sys_delete_module+292/304]
sys_delete_module+0x124/0x130 (8)
Nov 18 12:22:21 lambda kernel: [<c012c044>] sys_delete_module+0x124/0x130
(8)
Nov 18 12:22:21 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(32)
Nov 18 12:22:21 lambda kernel: [<c01431b3>] do_munmap+0x11a/0x176 (32)
Nov 18 12:22:21 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (12)
Nov 18 12:22:21 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (12)
Nov 18 12:22:21 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (24)
Nov 18 12:22:21 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (24)
Nov 18 12:22:21 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 18 12:22:21 lambda kernel: [<c01023f1>] sysenter_past_esp+0x52/0x71 (12)
Nov 18 12:22:21 lambda kernel: Code: 10 9c 8f 44 24 08 fa 9c 58 b8 00 e0
ff ff 21 e0 83 40 14 01 83 40 14 01 8b 47 08 e8 50 7f fe ff 8b 77 08 89 c2
8b 86 48 05 00 00 <8b> 08 0f 18 01 90 8d 9e 48 05 00 00 eb 10 8b 40 0c 39
d0 0f 4c
Nov 18 12:22:21 lambda kernel: <6>note: rmmod[6836] exited with
preempt_count 3
Nov 18 12:22:21 lambda kernel: BUG: scheduling while atomic:
rmmod/0x00000003/6836
Nov 18 12:22:21 lambda kernel: caller is do_exit+0x2a5/0x4ce
Nov 18 12:22:21 lambda kernel: [__schedule+1155/1484]
__sched_text_start+0x483/0x5cc (8)
Nov 18 12:22:21 lambda kernel: [<c02a6507>]
__sched_text_start+0x483/0x5cc (8)
Nov 18 12:22:21 lambda kernel: [exit_notify+1154/2281]
exit_notify+0x482/0x8e9 (24)
Nov 18 12:22:21 lambda kernel: [<c011743a>] exit_notify+0x482/0x8e9 (24)
Nov 18 12:22:21 lambda kernel: [do_exit+677/1230] do_exit+0x2a5/0x4ce (56)
Nov 18 12:22:21 lambda kernel: [<c0117b46>] do_exit+0x2a5/0x4ce (56)
Nov 18 12:22:21 lambda kernel: [do_divide_error+0/320]
do_divide_error+0x0/0x140 (48)
Nov 18 12:22:21 lambda kernel: [<c01035a9>] do_divide_error+0x0/0x140 (48)
Nov 18 12:22:21 lambda kernel: [do_page_fault+865/1341]
do_page_fault+0x361/0x53d (64)
Nov 18 12:22:21 lambda kernel: [<c010fc18>] do_page_fault+0x361/0x53d (64)
Nov 18 12:22:21 lambda kernel: [call_usermodehelper+346/364]
call_usermodehelper+0x15a/0x16c (72)
Nov 18 12:22:21 lambda kernel: [<c012493e>]
call_usermodehelper+0x15a/0x16c (72)
Nov 18 12:22:21 lambda kernel: [__call_usermodehelper+0/72]
__call_usermodehelper+0x0/0x48 (56)
Nov 18 12:22:21 lambda kernel: [<c012479c>]
__call_usermodehelper+0x0/0x48 (56)
Nov 18 12:22:21 lambda kernel: [__down_mutex+73/322]
__down_mutex+0x49/0x142 (16)
Nov 18 12:22:21 lambda kernel: [<c02a7600>] __down_mutex+0x49/0x142 (16)
Nov 18 12:22:21 lambda kernel: [dput+121/657] dput+0x79/0x291 (4)
Nov 18 12:22:21 lambda kernel: [<c0165899>] dput+0x79/0x291 (4)
Nov 18 12:22:21 lambda kernel: [kfree+81/237] kfree+0x51/0xed (28)
Nov 18 12:22:21 lambda kernel: [<c0139e0e>] kfree+0x51/0xed (28)
Nov 18 12:22:21 lambda kernel: [do_page_fault+0/1341]
do_page_fault+0x0/0x53d (28)
Nov 18 12:22:21 lambda kernel: [<c010f8b7>] do_page_fault+0x0/0x53d (28)
Nov 18 12:22:21 lambda kernel: [error_code+43/48] error_code+0x2b/0x30 (8)
Nov 18 12:22:21 lambda kernel: [<c0102e57>] error_code+0x2b/0x30 (8)
Nov 18 12:22:21 lambda kernel: [locate_fd+56/137] locate_fd+0x38/0x89 (32)
Nov 18 12:22:21 lambda kernel: [<c016007b>] locate_fd+0x38/0x89 (32)
Nov 18 12:22:21 lambda kernel: [__up_mutex+59/384] __up_mutex+0x3b/0x180
(12)
Nov 18 12:22:21 lambda kernel: [<c0129a4b>] __up_mutex+0x3b/0x180 (12)
Nov 18 12:22:21 lambda kernel: [kmem_cache_free+74/199]
kmem_cache_free+0x4a/0xc7 (16)
Nov 18 12:22:21 lambda kernel: [<c0139cdc>] kmem_cache_free+0x4a/0xc7 (16)
Nov 18 12:22:21 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (12)
Nov 18 12:22:21 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (12)
Nov 18 12:22:21 lambda kernel: [up+53/61] up+0x35/0x3d (24)
Nov 18 12:22:21 lambda kernel: [<c012a165>] up+0x35/0x3d (24)
Nov 18 12:22:21 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (8)
Nov 18 12:22:21 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (8)
Nov 18 12:22:21 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 18 12:22:21 lambda kernel: [<c01ad4cd>] kobject_release+0x0/0x8 (8)
Nov 18 12:22:21 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 18 12:22:21 lambda kernel: [<c01adadf>] kref_put+0x51/0xc2 (12)
Nov 18 12:22:21 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 18 12:22:21 lambda kernel: [<c01f3aff>] bus_remove_driver+0x3f/0x48 (36)
Nov 18 12:22:21 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 18 12:22:21 lambda kernel: [<c01f3e98>] driver_unregister+0xb/0x1a (8)
Nov 18 12:22:21 lambda kernel: [pci_unregister_driver+11/19]
pci_unregister_driver+0xb/0x13 (8)
Nov 18 12:22:21 lambda kernel: [<c01b4b10>]
pci_unregister_driver+0xb/0x13 (8)
Nov 18 12:22:21 lambda kernel: [sys_delete_module+292/304]
sys_delete_module+0x124/0x130 (8)
Nov 18 12:22:21 lambda kernel: [<c012c044>] sys_delete_module+0x124/0x130
(8)
Nov 18 12:22:21 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(32)
Nov 18 12:22:21 lambda kernel: [<c01431b3>] do_munmap+0x11a/0x176 (32)
Nov 18 12:22:21 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (12)
Nov 18 12:22:21 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (12)
Nov 18 12:22:21 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (24)
Nov 18 12:22:21 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (24)
Nov 18 12:22:21 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 18 12:22:21 lambda kernel: [<c01023f1>] sysenter_past_esp+0x52/0x71 (12)
Nov 18 12:22:21 lambda kernel: Kernel logging (proc) stopped.
Nov 18 12:22:21 lambda kernel: Kernel log daemon terminating.
Nov 18 12:22:22 lambda exiting on signal 15
Another one, a couple of times later:
Nov 18 13:21:59 lambda alsa: Shutting down ALSA sound driver (version 1.0.7):
Nov 18 13:22:01 lambda kernel: usbcore: deregistering driver snd-usb-usx2y
Nov 18 13:22:01 lambda alsa: /etc/rc0.d/K70alsa: line 287: 5700
Segmentation fault /sbin/rmmod `echo $line | cut -d ' ' -f 1`
>/dev/null 2>&1
Nov 18 13:22:03 lambda kernel: BUG: Unable to handle kernel paging request
at virtual address f39d2483
Nov 18 13:22:03 lambda kernel: printing eip:
Nov 18 13:22:03 lambda kernel: c0129a4b
Nov 18 13:22:03 lambda kernel: *pde = 00000000
Nov 18 13:22:03 lambda kernel: Oops: 0000 [#1]
Nov 18 13:22:03 lambda kernel: PREEMPT
Nov 18 13:22:03 lambda kernel: Modules linked in: realtime commoncap
snd_usb_usx2y snd_usb_lib snd_rawmidi snd_seq_device snd_hwdep snd_ali5451
snd_ac97_codec snd_pcm snd_timer snd_page_alloc snd soundcore prism2_cs
p80211 pcmcia pcmcia_core natsemi crc32 loop subfs evdev ohci_hcd usbcore
Nov 18 13:22:03 lambda kernel: CPU: 0
Nov 18 13:22:03 lambda kernel: EIP: 0060:[__up_mutex+59/384] Not
tainted VLI
Nov 18 13:22:03 lambda kernel: EIP: 0060:[<c0129a4b>] Not tainted VLI
Nov 18 13:22:03 lambda kernel: EFLAGS: 00010093
(2.6.10-rc2-mm1-RT-V0.7.28-1)
Nov 18 13:22:03 lambda kernel: EIP is at __up_mutex+0x3b/0x180
Nov 18 13:22:03 lambda kernel: eax: f39d2483 ebx: deb66000 ecx:
deaae1b0 edx: a1a407da
Nov 18 13:22:03 lambda kernel: esi: deaae1b0 edi: e010b58c ebp:
e00609b0 esp: deb67ee4
Nov 18 13:22:03 lambda kernel: ds: 007b es: 007b ss: 0068 preempt:
00000004
Nov 18 13:22:03 lambda kernel: Process rmmod (pid: 5700,
threadinfo=deb66000 task=de5c5450)
Nov 18 13:22:03 lambda kernel: Stack: 00000296 c0139cdc 00000286 00000296
c01ad4cb 00000000 deb66000 c0301908
Nov 18 13:22:03 lambda kernel: e00609a0 e00609b0 c012a165 e010b59c
c01ad4cb e010b5b4 c01ad4cd bfffd3f0
Nov 18 13:22:03 lambda kernel: deb66000 c01adadf e010b59c 00000000
bfffd3f0 deb66000 e010b59c 00000000
Nov 18 13:22:03 lambda kernel: Call Trace:
Nov 18 13:22:03 lambda kernel: [kmem_cache_free+74/199]
kmem_cache_free+0x4a/0xc7 (8)
Nov 18 13:22:03 lambda kernel: [<c0139cdc>] kmem_cache_free+0x4a/0xc7 (8)
Nov 18 13:22:03 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (12)
Nov 18 13:22:03 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (12)
Nov 18 13:22:03 lambda kernel: [up+53/61] up+0x35/0x3d (24)
Nov 18 13:22:03 lambda kernel: [<c012a165>] up+0x35/0x3d (24)
Nov 18 13:22:03 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (8)
Nov 18 13:22:03 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (8)
Nov 18 13:22:03 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 18 13:22:03 lambda kernel: [<c01ad4cd>] kobject_release+0x0/0x8 (8)
Nov 18 13:22:03 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 18 13:22:03 lambda kernel: [<c01adadf>] kref_put+0x51/0xc2 (12)
Nov 18 13:22:03 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 18 13:22:03 lambda kernel: [<c01f3aff>] bus_remove_driver+0x3f/0x48 (36)
Nov 18 13:22:03 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 18 13:22:03 lambda kernel: [<c01f3e98>] driver_unregister+0xb/0x1a (8)
Nov 18 13:22:03 lambda kernel: [pg0+533373460/1069945856]
usb_deregister+0x31/0x3f [usbcore] (8)
Nov 18 13:22:03 lambda kernel: [<e0047214>] usb_deregister+0x31/0x3f
[usbcore] (8)
Nov 18 13:22:03 lambda kernel: [sys_delete_module+292/304]
sys_delete_module+0x124/0x130 (20)
Nov 18 13:22:03 lambda kernel: [<c012c044>] sys_delete_module+0x124/0x130
(20)
Nov 18 13:22:03 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(32)
Nov 18 13:22:03 lambda kernel: [<c01431b3>] do_munmap+0x11a/0x176 (32)
Nov 18 13:22:03 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (12)
Nov 18 13:22:03 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (12)
Nov 18 13:22:03 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (24)
Nov 18 13:22:03 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (24)
Nov 18 13:22:03 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 18 13:22:03 lambda kernel: [<c01023f1>] sysenter_past_esp+0x52/0x71 (12)
Nov 18 13:22:03 lambda kernel: Code: 10 9c 8f 44 24 08 fa 9c 58 b8 00 e0
ff ff 21 e0 83 40 14 01 83 40 14 01 8b 47 08 e8 50 7f fe ff 8b 77 08 89 c2
8b 86 48 05 00 00 <8b> 08 0f 18 01 90 8d 9e 48 05 00 00 eb 10 8b 40 0c 39
d0 0f 4c
Nov 18 13:22:03 lambda kernel: <6>note: rmmod[5700] exited with
preempt_count 3
Nov 18 13:22:03 lambda kernel: BUG: scheduling while atomic:
rmmod/0x00000003/5700
Nov 18 13:22:03 lambda kernel: caller is do_exit+0x2a5/0x4ce
Nov 18 13:22:03 lambda kernel: [__schedule+1155/1484]
__sched_text_start+0x483/0x5cc (8)
Nov 18 13:22:03 lambda kernel: [<c02a6507>]
__sched_text_start+0x483/0x5cc (8)
Nov 18 13:22:03 lambda kernel: [exit_notify+1154/2281]
exit_notify+0x482/0x8e9 (24)
Nov 18 13:22:03 lambda kernel: [<c011743a>] exit_notify+0x482/0x8e9 (24)
Nov 18 13:22:03 lambda kernel: [do_exit+677/1230] do_exit+0x2a5/0x4ce (56)
Nov 18 13:22:03 lambda kernel: [<c0117b46>] do_exit+0x2a5/0x4ce (56)
Nov 18 13:22:03 lambda kernel: [do_divide_error+0/320]
do_divide_error+0x0/0x140 (48)
Nov 18 13:22:03 lambda kernel: [<c01035a9>] do_divide_error+0x0/0x140 (48)
Nov 18 13:22:03 lambda kernel: [do_page_fault+865/1341]
do_page_fault+0x361/0x53d (64)
Nov 18 13:22:03 lambda kernel: [<c010fc18>] do_page_fault+0x361/0x53d (64)
Nov 18 13:22:03 lambda kernel: [call_usermodehelper+346/364]
call_usermodehelper+0x15a/0x16c (72)
Nov 18 13:22:03 lambda kernel: [<c012493e>]
call_usermodehelper+0x15a/0x16c (72)
Nov 18 13:22:03 lambda kernel: [__schedule+662/1484]
__sched_text_start+0x296/0x5cc (40)
Nov 18 13:22:03 lambda kernel: [<c02a631a>]
__sched_text_start+0x296/0x5cc (40)
Nov 18 13:22:03 lambda kernel: [__call_usermodehelper+0/72]
__call_usermodehelper+0x0/0x48 (16)
Nov 18 13:22:03 lambda kernel: [<c012479c>]
__call_usermodehelper+0x0/0x48 (16)
Nov 18 13:22:03 lambda kernel: [__down_mutex+73/322]
__down_mutex+0x49/0x142 (16)
Nov 18 13:22:03 lambda kernel: [<c02a7600>] __down_mutex+0x49/0x142 (16)
Nov 18 13:22:03 lambda kernel: [dput+121/657] dput+0x79/0x291 (4)
Nov 18 13:22:03 lambda kernel: [<c0165899>] dput+0x79/0x291 (4)
Nov 18 13:22:03 lambda kernel: [kfree+81/237] kfree+0x51/0xed (28)
Nov 18 13:22:03 lambda kernel: [<c0139e0e>] kfree+0x51/0xed (28)
Nov 18 13:22:03 lambda kernel: [do_page_fault+0/1341]
do_page_fault+0x0/0x53d (28)
Nov 18 13:22:03 lambda kernel: [<c010f8b7>] do_page_fault+0x0/0x53d (28)
Nov 18 13:22:03 lambda kernel: [error_code+43/48] error_code+0x2b/0x30 (8)
Nov 18 13:22:03 lambda kernel: [<c0102e57>] error_code+0x2b/0x30 (8)
Nov 18 13:22:03 lambda kernel: [locate_fd+56/137] locate_fd+0x38/0x89 (32)
Nov 18 13:22:03 lambda kernel: [<c016007b>] locate_fd+0x38/0x89 (32)
Nov 18 13:22:03 lambda kernel: [__up_mutex+59/384] __up_mutex+0x3b/0x180
(12)
Nov 18 13:22:03 lambda kernel: [<c0129a4b>] __up_mutex+0x3b/0x180 (12)
Nov 18 13:22:03 lambda kernel: [kmem_cache_free+74/199]
kmem_cache_free+0x4a/0xc7 (16)
Nov 18 13:22:03 lambda kernel: [<c0139cdc>] kmem_cache_free+0x4a/0xc7 (16)
Nov 18 13:22:03 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (12)
Nov 18 13:22:03 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (12)
Nov 18 13:22:03 lambda kernel: [up+53/61] up+0x35/0x3d (24)
Nov 18 13:22:03 lambda kernel: [<c012a165>] up+0x35/0x3d (24)
Nov 18 13:22:03 lambda kernel: [kobject_cleanup+142/144]
kobject_cleanup+0x8e/0x90 (8)
Nov 18 13:22:03 lambda kernel: [<c01ad4cb>] kobject_cleanup+0x8e/0x90 (8)
Nov 18 13:22:03 lambda kernel: [kobject_release+0/8]
kobject_release+0x0/0x8 (8)
Nov 18 13:22:03 lambda kernel: [<c01ad4cd>] kobject_release+0x0/0x8 (8)
Nov 18 13:22:03 lambda kernel: [kref_put+81/194] kref_put+0x51/0xc2 (12)
Nov 18 13:22:03 lambda kernel: [<c01adadf>] kref_put+0x51/0xc2 (12)
Nov 18 13:22:03 lambda kernel: [bus_remove_driver+63/72]
bus_remove_driver+0x3f/0x48 (36)
Nov 18 13:22:03 lambda kernel: [<c01f3aff>] bus_remove_driver+0x3f/0x48 (36)
Nov 18 13:22:03 lambda kernel: [driver_unregister+11/26]
driver_unregister+0xb/0x1a (8)
Nov 18 13:22:03 lambda kernel: [<c01f3e98>] driver_unregister+0xb/0x1a (8)
Nov 18 13:22:03 lambda kernel: [pg0+533373460/1069945856]
usb_deregister+0x31/0x3f [usbcore] (8)
Nov 18 13:22:03 lambda kernel: [<e0047214>] usb_deregister+0x31/0x3f
[usbcore] (8)
Nov 18 13:22:03 lambda kernel: [sys_delete_module+292/304]
sys_delete_module+0x124/0x130 (20)
Nov 18 13:22:03 lambda kernel: [<c012c044>] sys_delete_module+0x124/0x130
(20)
Nov 18 13:22:03 lambda kernel: [do_munmap+282/374] do_munmap+0x11a/0x176
(32)
Nov 18 13:22:03 lambda kernel: [<c01431b3>] do_munmap+0x11a/0x176 (32)
Nov 18 13:22:03 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (12)
Nov 18 13:22:03 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (12)
Nov 18 13:22:03 lambda kernel: [sys_munmap+56/69] sys_munmap+0x38/0x45 (24)
Nov 18 13:22:03 lambda kernel: [<c0143247>] sys_munmap+0x38/0x45 (24)
Nov 18 13:22:03 lambda kernel: [sysenter_past_esp+82/113]
sysenter_past_esp+0x52/0x71 (12)
Nov 18 13:22:03 lambda kernel: [<c01023f1>] sysenter_past_esp+0x52/0x71 (12)
Nov 18 13:22:03 lambda kernel: Kernel logging (proc) stopped.
Nov 18 13:22:03 lambda kernel: Kernel log daemon terminating.
Nov 18 13:22:04 lambda exiting on signal 15
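
A quick way to cross-check reports like the two above is to compare the
faulting address with the registers and to look at the byte marked
<..> in the Code: line. The sketch below does that for the second
report. It is a minimal illustration: fault_summary and decode_load are
hypothetical helper names, and the decoder only handles opcode 0x8b in
its plain (%reg) form, nothing else.

```python
import re

REG32 = ["eax", "ecx", "edx", "ebx", "esp", "ebp", "esi", "edi"]

OOPS = """\
Nov 18 13:22:03 lambda kernel: BUG: Unable to handle kernel paging request
at virtual address f39d2483
Nov 18 13:22:03 lambda kernel: EIP is at __up_mutex+0x3b/0x180
Nov 18 13:22:03 lambda kernel: eax: f39d2483 ebx: deb66000 ecx:
deaae1b0 edx: a1a407da
Nov 18 13:22:03 lambda kernel: Code: 10 9c 8f 44 24 08 fa 9c 58 b8 00 e0
ff ff 21 e0 83 40 14 01 83 40 14 01 8b 47 08 e8 50 7f fe ff 8b 77 08 89 c2
8b 86 48 05 00 00 <8b> 08 0f 18 01 90 8d 9e 48 05 00 00 eb 10 8b 40 0c 39
d0 0f 4c
"""


def decode_load(opcode, modrm):
    """Decode a 32-bit register load (opcode 0x8b), plain (%reg) form only."""
    mod, reg, rm = modrm >> 6, (modrm >> 3) & 7, modrm & 7
    if opcode != 0x8B or mod != 0 or rm in (4, 5):
        return None  # SIB, displacement and other opcodes are not handled
    return "mov (%{}),%{}".format(REG32[rm], REG32[reg])


def fault_summary(oops_text):
    """Pull fault address, eax and the marked instruction out of an oops."""
    flat = " ".join(oops_text.split())  # undo the mail/syslog line wrapping
    addr = re.search(r"virtual address ([0-9a-f]{8})", flat).group(1)
    eax = re.search(r"eax: ([0-9a-f]{8})", flat).group(1)
    code = re.search(r"Code:((?: <?[0-9a-f]{2}>?)+)", flat).group(1).split()
    i = next(n for n, tok in enumerate(code) if tok.startswith("<"))
    insn = decode_load(int(code[i].strip("<>"), 16), int(code[i + 1], 16))
    return {"fault_addr": addr, "eax": eax, "insn": insn}


summary = fault_summary(OOPS)
```

Here the marked bytes 8b 08 decode to a load through %eax, and eax
equals the faulting address; the same holds for the first report, where
both are 00000000. That only says which pointer was bad at
__up_mutex+0x3b, not why it was bad.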
Bye now.
--
rncbc aka Rui Nuno Capela
* Christian Meder wrote:
> after successfully running the last couple of rt patches on my Dell
> Inspiron laptop I thought I'd give it a try on my combined vdr/router
> box which is probably more interesting from a rt point of view. This
> box is bridging wireless/ADSL and working as a digital vdr using the
> kernel DVB-S drivers.
>
> I got the appended logging messages with the appended config. Is
> there anything else I should provide for debugging purposes or are the
> messages just harmless ?
the messages mean that i haven't converted the bridge code's RCU locking
to PREEMPT_RT yet. I've done this in the -V0.7.28-2 patch that i've just
uploaded to:
http://redhat.com/~mingo/realtime-preempt/
does bridging work fine with this patch, and if yes, do you get any
(other) warning messages?
Ingo
* Christian Meder wrote:
> On Wed, 2004-11-17 at 13:42 +0100, Ingo Molnar wrote:
> > i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
> > downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
> >
> > this is a fixes & latency-reduction release.
>
> Hi,
>
> here's another message log this time on my Dell laptop when removing
> my prism wlan pccard running the hostap driver.
could you try this with the vanilla 2.6.10-rc2-mm1 kernel too? The crash
you got is an escalation of a crash within a critical section, but that
original crash does not seem to be directly related to PREEMPT_RT.
(also, please enable CONFIG_USE_FRAME_POINTER, to make the backtraces
easier to parse.)
Ingo
On Thu, 2004-11-18 at 16:54 +0100, Ingo Molnar wrote:
> * Christian Meder wrote:
>
> > after successfully running the last couple of rt patches on my Dell
> > Inspiron laptop I thought I'd give it a try on my combined vdr/router
> > box which is probably more interesting from a rt point of view. This
> > box is bridging wireless/ADSL and working as a digital vdr using the
> > kernel DVB-S drivers.
> >
> > I got the appended logging messages with the appended config. Is
> > there anything else I should provide for debugging purposes or are the
> > messages just harmless ?
>
> the messages mean that i haven't converted the bridge code's RCU locking
> to PREEMPT_RT yet. I've done this in the -V0.7.28-2 patch that i've just
> uploaded to:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> does bridging work fine with this patch, and if yes, do you get any
> (other) warning messages?
Thanks, will try soon and report. There was another trace in my log of
the vdr/router box which seemed unrelated to the bridging traces.
Christian
Nov 17 22:44:41 verena kernel: dvb-ttpci: found av7110-0.
Nov 17 22:44:43 verena kernel: ves1x93: Detected ves1893a rev2
Nov 17 22:44:43 verena kernel: DVB: registering frontend 0 (VES1893)...
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: ==========================================
Nov 17 22:44:44 verena kernel: [ BUG: lock recursion deadlock detected! |
Nov 17 22:44:44 verena kernel: ------------------------------------------
Nov 17 22:44:44 verena kernel: already locked: [c9c8fa10] {&fe->sem}
Nov 17 22:44:44 verena kernel: .. held by: vdr: 2147 [ca3a6950, 118]
Nov 17 22:44:44 verena kernel: ... acquired at: dvb_frontend_start+0x75/0x100
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: ------------------------------
Nov 17 22:44:44 verena kernel: | showing all locks held by: | (vdr/2147 [ca3a6950, 118]):
Nov 17 22:44:44 verena kernel: ------------------------------
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #001: [c9c8fa10] {&fe->sem}
Nov 17 22:44:44 verena kernel: ... acquired at: dvb_frontend_start+0x75/0x100
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #002: [c045e444] {kernel_sem.lock}
Nov 17 22:44:44 verena kernel: ... acquired at: lock_kernel+0x27/0x50
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: -{current task's backtrace}----------------->
Nov 17 22:44:44 verena kernel: [check_deadlock+608/640] check_deadlock+0x260/0x280 (8)
Nov 17 22:44:44 verena kernel: [dvb_frontend_ioctl+171/832] dvb_frontend_ioctl+0xab/0x340 (28)
Nov 17 22:44:44 verena kernel: [dvb_frontend_ioctl+171/832] dvb_frontend_ioctl+0xab/0x340 (8)
Nov 17 22:44:44 verena kernel: [task_blocks_on_lock+243/256] task_blocks_on_lock+0xf3/0x100 (12)
Nov 17 22:44:44 verena kernel: [dvb_frontend_ioctl+171/832] dvb_frontend_ioctl+0xab/0x340 (16)
Nov 17 22:44:44 verena kernel: [__down_interruptible+534/1072] __down_interruptible+0x216/0x430 (4)
Nov 17 22:44:44 verena kernel: [dvb_frontend_ioctl+171/832] dvb_frontend_ioctl+0xab/0x340 (4)
Nov 17 22:44:44 verena kernel: [pg0+264556013/1068020736] av7110_send_fw_cmd+0x4d/0xc0 [dvb_ttpci] (12)
Nov 17 22:44:44 verena kernel: [up+190/256] up+0xbe/0x100 (24)
Nov 17 22:44:44 verena kernel: [down_interruptible+173/480] down_interruptible+0xad/0x1e0 (36)
Nov 17 22:44:44 verena kernel: [lru_cache_add_active+13/64] lru_cache_add_active+0xd/0x40 (4)
Nov 17 22:44:44 verena kernel: [dvb_frontend_ioctl+171/832] dvb_frontend_ioctl+0xab/0x340 (40)
Nov 17 22:44:44 verena kernel: [__kmalloc+133/272] __kmalloc+0x85/0x110 (52)
Nov 17 22:44:44 verena kernel: [dvb_usercopy+132/272] dvb_usercopy+0x84/0x110 (32)
Nov 17 22:44:44 verena kernel: [__down+470/800] __down+0x1d6/0x320 (44)
Nov 17 22:44:44 verena kernel: [lock_kernel+39/80] lock_kernel+0x27/0x50 (4)
Nov 17 22:44:44 verena kernel: [__down_mutex+470/864] __down_mutex+0x1d6/0x360 (12)
Nov 17 22:44:44 verena kernel: [fget+47/96] fget+0x2f/0x60 (4)
Nov 17 22:44:44 verena kernel: [__down_mutex+470/864] __down_mutex+0x1d6/0x360 (4)
Nov 17 22:44:44 verena kernel: [kmem_cache_free+53/208] kmem_cache_free+0x35/0xd0 (4)
Nov 17 22:44:44 verena kernel: [down+173/480] down+0xad/0x1e0 (48)
Nov 17 22:44:44 verena kernel: [fget+72/96] fget+0x48/0x60 (16)
Nov 17 22:44:44 verena kernel: [dvb_generic_ioctl+62/80] dvb_generic_ioctl+0x3e/0x50 (24)
Nov 17 22:44:44 verena kernel: [dvb_frontend_ioctl+0/832] dvb_frontend_ioctl+0x0/0x340 (8)
Nov 17 22:44:44 verena kernel: [sys_ioctl+184/528] sys_ioctl+0xb8/0x210 (12)
Nov 17 22:44:44 verena kernel: [syscall_call+7/11] syscall_call+0x7/0xb (36)
Nov 17 22:44:44 verena kernel: ---------------------------
Nov 17 22:44:44 verena kernel: | preempt count: 00000002 ]
Nov 17 22:44:44 verena kernel: | 2-level deep critical section nesting:
Nov 17 22:44:44 verena kernel: ----------------------------------------
Nov 17 22:44:44 verena kernel: .. [__down_interruptible+1057/1072] .... __down_interruptible+0x421/0x430
Nov 17 22:44:44 verena kernel: .....[<00000000>] .. ( <= 0x0)
Nov 17 22:44:44 verena kernel: .. [print_traces+13/64] .... print_traces+0xd/0x40
Nov 17 22:44:44 verena kernel: .....[<00000000>] .. ( <= 0x0)
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: showing all tasks:
Nov 17 22:44:44 verena kernel: s init: 1 [c1239370, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 0: 2 [c1238d00, 50] (not blocked)
Nov 17 22:44:44 verena kernel: s ksoftirqd/0: 3 [c1238690, 105] (not blocked)
Nov 17 22:44:44 verena kernel: s desched/0: 4 [c1238020, 105] (not blocked)
Nov 17 22:44:44 verena kernel: s events/0: 5 [cf7db390, 98] (not blocked)
Nov 17 22:44:44 verena kernel: s khelper: 6 [cf7dad20, 113] (not blocked)
Nov 17 22:44:44 verena kernel: s kthread: 11 [cf7da6b0, 110] (not blocked)
Nov 17 22:44:44 verena kernel: s kacpid: 19 [cf7da040, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 11: 20 [cf64d3b0, 58] (not blocked)
Nov 17 22:44:44 verena kernel: s kblockd/0: 91 [cf64cd40, 110] (not blocked)
Nov 17 22:44:44 verena kernel: s khubd: 104 [cf64c6d0, 115] (not blocked)
Nov 17 22:44:44 verena kernel: s pdflush: 183 [cf64c060, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s pdflush: 184 [cf6c53d0, 115] (not blocked)
Nov 17 22:44:44 verena kernel: s aio/0: 186 [cf6c46f0, 112] (not blocked)
Nov 17 22:44:44 verena kernel: s kswapd0: 185 [cf6c4d60, 125] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 8: 775 [cf6c4080, 56] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 12: 785 [c139ed80, 54] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 7: 804 [c139e710, 112] (not blocked)
Nov 17 22:44:44 verena kernel: s kseriod: 779 [c139f3f0, 125] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 6: 807 [c139e0a0, 51] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 14: 821 [c13e9450, 52] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 15: 824 [c13dc0e0, 53] (not blocked)
Nov 17 22:44:44 verena kernel: s scsi_eh_0: 837 [c13dcdc0, 117] blocked on: [c13e1fac] {sem.lock}
Nov 17 22:44:44 verena kernel: .. held by: scsi_eh_0: 837 [c13dcdc0, 117]
Nov 17 22:44:44 verena kernel: ... acquired at: scsi_error_handler+0x50/0x100
Nov 17 22:44:44 verena kernel: s IRQ 1: 864 [c13cd410, 55] (not blocked)
Nov 17 22:44:44 verena kernel: s kjournald: 903 [c13e8770, 115] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 5: 1074 [c13dc750, 57] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 10: 1116 [c13e8100, 59] (not blocked)
Nov 17 22:44:44 verena kernel: s kjournald: 1153 [cf439470, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s portmap: 1320 [ce6ace20, 115] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 4: 1328 [c13cc0c0, 112] (not blocked)
Nov 17 22:44:44 verena kernel: s IRQ 3: 1331 [cf438e00, 112] (not blocked)
Nov 17 22:44:44 verena kernel: s syslogd: 1433 [c13ccda0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s klogd: 1436 [ce6ac7b0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s pppd: 1448 [cdbb2e40, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s pppoe: 1451 [c13dd430, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s gnugk: 1484 [c13e8de0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s gnugk: 1508 [cd366e60, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s gnugk: 1509 [cd3667f0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s gnugk: 1510 [cd366180, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s gnugk: 1511 [cc5014f0, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s inetd: 1489 [cdbb34b0, 122] (not blocked)
Nov 17 22:44:44 verena kernel: s jabberd: 1492 [ce6ad490, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s ip-up: 1502 [cd3674d0, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s run-parts: 1504 [cdbb27d0, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s exim: 1506 [cdbb2160, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s exim: 1507 [cf438120, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s jabberd: 1516 [cc5001a0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s jabberd: 1519 [c13cc730, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1907 [cc500e80, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1908 [cc58eea0, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1909 [cc58e830, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1910 [cf438790, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1911 [cc58e1c0, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1912 [cc7a7530, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1913 [cc7a6ec0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s nfsd: 1914 [cc7a6850, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s lockd: 1916 [cc7a61e0, 121] (not blocked)
Nov 17 22:44:44 verena kernel: s rpciod/0: 1917 [cc783550, 111] (not blocked)
Nov 17 22:44:44 verena kernel: s rpc.mountd: 1920 [cc782ee0, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s postmaster: 1962 [cc58f510, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s postmaster: 1966 [cc7fc890, 121] (not blocked)
Nov 17 22:44:44 verena kernel: s postmaster: 1967 [cc7fcf00, 122] (not blocked)
Nov 17 22:44:44 verena kernel: s slpd: 1981 [cc7fd570, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s sshd: 1987 [cc500810, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s vdrconvert.sh: 1996 [cbcd8f20, 135] (not blocked)
Nov 17 22:44:44 verena kernel: s rpc.statd: 2004 [cc782870, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s Xvfb: 2005 [cbcd9590, 131] (not blocked)
Nov 17 22:44:44 verena kernel: s atd: 2018 [cbcd8240, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s cron: 2021 [cc782200, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s fcron: 2024 [cbee6f40, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s apache: 2031 [cbee68d0, 116] (not blocked)
Nov 17 22:44:44 verena kernel: s vdr: 2039 [cb1f4f60, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s getty: 2040 [cb1f48f0, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s getty: 2043 [ce6ac140, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s getty: 2044 [cbee75b0, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s getty: 2045 [cbcd88b0, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s getty: 2046 [cb1f55d0, 117] (not blocked)
Nov 17 22:44:44 verena kernel: s lircd: 2059 [cbee6260, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s wait2enc.py: 2060 [cb2815f0, 135] (not blocked)
Nov 17 22:44:44 verena kernel: s apache: 2063 [cb280f80, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s apache: 2064 [cb280910, 120] (not blocked)
Nov 17 22:44:44 verena kernel: s apache: 2065 [cb2802a0, 121] (not blocked)
Nov 17 22:44:44 verena kernel: s apache: 2066 [caead610, 122] (not blocked)
Nov 17 22:44:44 verena kernel: s apache: 2067 [caeacfa0, 123] (not blocked)
Nov 17 22:44:44 verena kernel: s vdradmind.pl: 2078 [caeac2c0, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s runvdr: 2079 [cb1f4280, 118] (not blocked)
Nov 17 22:44:44 verena kernel: s arm_mon: 2119 [ca3a7630, 119] (not blocked)
Nov 17 22:44:44 verena kernel: s sleep: 2139 [c9de0fe0, 136] (not blocked)
Nov 17 22:44:44 verena kernel: s vdr: 2147 [ca3a6950, 118] (not blocked)
Nov 17 22:44:44 verena kernel: R vdr: 2148 [caeac930, 119] (not blocked)
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: ---------------------------
Nov 17 22:44:44 verena kernel: | showing all locks held: |
Nov 17 22:44:44 verena kernel: ---------------------------
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #001: [c053ee6c] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #002: [c053ea90] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #003: [c053ec64] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #004: [c053f4b4] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #005: [c053f0d8] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #006: [c053f2ac] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #007: [c053fafc] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #008: [c053f720] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #009: [c053f8f4] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #010: [c0540144] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #011: [c053fd68] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #012: [c053ff3c] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #013: [c054078c] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #014: [c05403b0] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #015: [c0540584] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #016: [c0540dd4] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #017: [c05409f8] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #018: [c0540bcc] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #019: [c054141c] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #020: [c0541040] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #021: [c0541214] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #022: [c0541a64] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #023: [c0541688] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #024: [c054185c] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #025: [c05420ac] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #026: [c0541cd0] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #027: [c0541ea4] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #028: [c05426f4] {&hwif->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x8e/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #029: [c0542318] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #030: [c05424ec] {&drive->gendev_rel_sem}
Nov 17 22:44:44 verena kernel: .. held by: swapper: 0 [c044cd80, 140]
Nov 17 22:44:44 verena kernel: ... acquired at: init_hwif_data+0x160/0x180
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #031: [ccd3dcf0] {&tty->atomic_read}
Nov 17 22:44:44 verena kernel: .. held by: getty: 2040 [cb1f48f0, 117]
Nov 17 22:44:44 verena kernel: ... acquired at: read_chan+0x6f1/0x750
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #032: [cd66acf0] {&tty->atomic_read}
Nov 17 22:44:44 verena kernel: .. held by: getty: 2044 [cbee75b0, 117]
Nov 17 22:44:44 verena kernel: ... acquired at: read_chan+0x6f1/0x750
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #033: [cda85cf0] {&tty->atomic_read}
Nov 17 22:44:44 verena kernel: .. held by: getty: 2045 [cbcd88b0, 117]
Nov 17 22:44:44 verena kernel: ... acquired at: read_chan+0x6f1/0x750
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #034: [cdfe6cf0] {&tty->atomic_read}
Nov 17 22:44:44 verena kernel: .. held by: getty: 2046 [cb1f55d0, 117]
Nov 17 22:44:44 verena kernel: ... acquired at: read_chan+0x6f1/0x750
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #035: [cb22ccf0] {&tty->atomic_read}
Nov 17 22:44:44 verena kernel: .. held by: getty: 2043 [ce6ac140, 117]
Nov 17 22:44:44 verena kernel: ... acquired at: read_chan+0x6f1/0x750
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #036: [c9c8fa10] {&fe->sem}
Nov 17 22:44:44 verena kernel: .. held by: vdr: 2147 [ca3a6950, 118]
Nov 17 22:44:44 verena kernel: ... acquired at: dvb_frontend_start+0x75/0x100
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: #037: [c045e444] {kernel_sem.lock}
Nov 17 22:44:44 verena kernel: .. held by: vdr: 2147 [ca3a6950, 118]
Nov 17 22:44:44 verena kernel: ... acquired at: lock_kernel+0x27/0x50
Nov 17 22:44:44 verena kernel: =============================================
Nov 17 22:44:44 verena kernel:
Nov 17 22:44:44 verena kernel: [ turning off deadlock detection. Please report this trace. ]
--
Christian Meder, email: [email protected]
The Way-Seeking Mind of a tenzo is actualized
by rolling up your sleeves.
(Eihei Dogen Zenji)
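For readers following the trace above: the report says that vdr/2147
already holds {&fe->sem} (acquired in dvb_frontend_start) and is trying to
take it again through down_interruptible() in dvb_frontend_ioctl. The
userspace sketch below shows the general idea of such an owner check. It
assumes the detector only compares the lock's recorded owner with the
current task; dbg_lock, dbg_lock_acquire and dbg_lock_release are
hypothetical names, and this is not the PREEMPT_RT check_deadlock() code.

```c
#include <assert.h>

/* Hypothetical debug lock that records which task holds it. */
struct dbg_lock {
    int owner;          /* task id of the holder, 0 when the lock is free */
    const char *name;   /* label used only for reporting */
};

enum dbg_result { DBG_ACQUIRED, DBG_WOULD_BLOCK, DBG_RECURSION };

/*
 * Classify an attempt by task `tid` (non-zero) to take the lock.
 * A real lock would sleep in the DBG_WOULD_BLOCK case; the sketch only
 * reports what would happen.
 */
enum dbg_result dbg_lock_acquire(struct dbg_lock *l, int tid)
{
    if (l->owner == 0) {
        l->owner = tid;
        return DBG_ACQUIRED;
    }
    if (l->owner == tid)
        return DBG_RECURSION;   /* the holder would wait on itself forever */
    return DBG_WOULD_BLOCK;
}

/* Release the lock; only the recorded owner is allowed to do so. */
void dbg_lock_release(struct dbg_lock *l, int tid)
{
    if (l->owner == tid)
        l->owner = 0;
}
```

With task id 2147 taking the same lock twice, the second attempt is
classified as DBG_RECURSION, which corresponds to the "lock recursion
deadlock detected" banner in the log.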
On Thu, 2004-11-18 at 13:35 +0100, Ingo Molnar wrote:
> i have released the -V0.7.28-1 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this should fix the lockup bug reported by Florian Schmidt.
>
> there's a generic PREEMPT bug in the upstream kernel: there exists a
> single-instruction race window in __flush_tlb(), if the kernel preempted
> exactly there in a lazy-TLB thread and certain other, rare scheduling
> and MM properties were true as well (a certain constellation of threads
> and lazy-TLB kernel threads occured), and the lazy-TLB task then got
> another user TLB to inherit, and switched to a task from which it
> inherited that new TLB, thus the wrong cr3 was loaded and inherited by
> this next, non-lazy-TLB next task; then (and only then) this scenario
> would typically manifest itself in the form of an infinite pagefault
> lockup occuring much after the fact, upon the next userspace access (to
> the joy of a totally baffled kernel developer). I suspect from the
> description you can guess how much fun it was to debug it =B-)
>
> the bug is even more rare in the generic kernel, because there most (but
> not all) TLB flush points are in a critical section.
>
> this fix could resolve some of the other 'my box just locked up'
> reports.
Hi,
I've got one of those 'my box just locked up' cases. I can reproduce it with
0.7.25-1, 0.7.28-0 and 0.7.28-1 by starting the Jetty servlet container
with our in-house Java project under a Blackdown 1.4 JDK. Within a minute
the laptop just locks up: no mouse, no ping, no console switching, no
sysrq-t, nothing. The peculiar thing is that I was running 0.7.25-1 for two
or three days before and it was rock solid. It was only when I started to
work with the JVM that things fell apart.
Any chance of getting any interesting and helpful data in this setup?
Christian
>I've got one of those 'my box just locked up'. I can reproduce it with
>0.7.25-1, 0.7.28-0 and 0.7.28-1 by starting the Jetty servlet container
>with our inhouse java project under a Blackdown 1.4 jdk. Within a minute
>the laptop just locks up: no mouse, no ping, console switching sysrq-t
Sysrq+T itself might work, it's just a matter of getting to the console! If you
can get to a console, I'm sure it does, because hitting the keys generates an
interrupt which will be delivered to the sysrq code if interrupts are not
currently disabled. So even if your box appears totally dead, you might be
surprised when you hit Sysrq+B.
To get any data, I'd start a VNC server on it and run all graphical apps inside
it, let the machine itself stay on tty10 (or wherever kernel output goes), and
then try hitting Sysrq+T when it hangs.
Jan Engelhardt
--
Gesellschaft für Wissenschaftliche Datenverarbeitung
Am Fassberg, 37077 Göttingen, http://www.gwdg.de
i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a pure merge of -V0.7.28-2 to 2.6.10-rc2-mm2. -rc2-mm2 itself is
a fixes-only release.
to create a -V0.7.29-0 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm2/2.6.10-rc2-mm2.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm2-V0.7.29-0
Ingo
On Thu, 2004-11-18 at 17:11 +0100, Ingo Molnar wrote:
> * Christian Meder <[email protected]> wrote:
>
> > On Wed, 2004-11-17 at 13:42 +0100, Ingo Molnar wrote:
> > > i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
> > > downloaded from the usual place:
> > >
> > > http://redhat.com/~mingo/realtime-preempt/
> > >
> > > this is a fixes & latency-reduction release.
> >
> > Hi,
> >
> > here's another message log this time on my Dell laptop when removing
> > my prism wlan pccard running the hostap driver.
>
> could you try this with the vanilla 2.6.10-rc2-mm1 kernel too? The crash
> you got is an escallation of a crash within a critical section, but that
> original crash does not seem to be directly related to PREEMPT_RT.
2.6.10-rc2-mm1 would have been my next try anyway cause otherwise I'd be
rather unable to get any work done for my job due to the jvm lockup
problem I described in my other mail ;-)
> (also, please enable CONFIG_USE_FRAME_POINTER, to make the backtraces
> easier to parse.)
Sure.
Christian
On Thu, 2004-11-18 at 17:11 +0100, Ingo Molnar wrote:
> * Christian Meder <[email protected]> wrote:
>
> > On Wed, 2004-11-17 at 13:42 +0100, Ingo Molnar wrote:
> > > i have released the -V0.7.28-0 Real-Time Preemption patch, which can be
> > > downloaded from the usual place:
> > >
> > > http://redhat.com/~mingo/realtime-preempt/
> > >
> > > this is a fixes & latency-reduction release.
> >
> > Hi,
> >
> > here's another message log this time on my Dell laptop when removing
> > my prism wlan pccard running the hostap driver.
>
> could you try this with the vanilla 2.6.10-rc2-mm1 kernel too? The crash
> you got is an escallation of a crash within a critical section, but that
> original crash does not seem to be directly related to PREEMPT_RT.
Ok, tried it now. The output from 2.6.10-rc2-mm1 on removal of the prism
pccard is pretty innocuous and everything works fine:
Nov 18 17:29:27 localhost kernel: hostap_cs: CS_EVENT_CARD_REMOVAL
Nov 18 17:29:27 localhost kernel: wifi0: card already removed or not configured during shutdown
Nov 18 17:29:27 localhost kernel: wifi0: Interrupt, but dev not OK
Nov 18 17:29:27 localhost kernel: hostap_cs: Driver unloaded
Christian
* Christian Meder <[email protected]> wrote:
> I've got one of those 'my box just locked up'. I can reproduce it with
> 0.7.25-1, 0.7.28-0 and 0.7.28-1 by starting the Jetty servlet
> container with our inhouse java project under a Blackdown 1.4 jdk.
> Within a minute the laptop just locks up: no mouse, no ping, console
> switching sysrq-t or anything. The peculiar thing is that I was
> running 0.7.25-1 for two or three days before and it was rocksolid. It
> was just when I started to work with the jvm that things fell apart.
>
> Any chance to get any interesting and helpful data in this setup ?
best would be to have a reproducer. Can you trigger it over the network,
using a remote session? If yes then you might want to try this: let the
box boot, switch it to a text console and don't touch the keyboard
after that. Do this over the remote session:
echo 1 > /proc/sys/kernel/debug_direct_keyboard
this activates a direct interrupt line for the keyboard only. Keep the
box on the text console, and try to reproduce the hang over the network.
Once it triggers, try SysRq - does it work? (LEDs won't work, but normal
keys should work.)
if the keyboard still doesn't work then you could try nmi_watchdog=1 or
nmi_watchdog=2 (the latter on IO-APIC-less systems), and serial logging,
to capture a dump of the hard lockup.
Ingo
* Christian Meder <[email protected]> wrote:
> > could you try this with the vanilla 2.6.10-rc2-mm1 kernel too? The crash
> > you got is an escallation of a crash within a critical section, but that
> > original crash does not seem to be directly related to PREEMPT_RT.
>
> Ok, tried it now. The output from 2.6.10-rc2-mm1 on removal of the prism
> pccard is pretty innocuous and everything works fine:
>
> Nov 18 17:29:27 localhost kernel: hostap_cs: CS_EVENT_CARD_REMOVAL
> Nov 18 17:29:27 localhost kernel: wifi0: card already removed or not
> configured during shutdown
> Nov 18 17:29:27 localhost kernel: wifi0: Interrupt, but dev not OK
> Nov 18 17:29:27 localhost kernel: hostap_cs: Driver unloaded
ok. Could you please retry with the latest kernel and USE_FRAME_POINTERS
enabled? It wasn't completely clear from your previous log precisely
which function generated the fault so it would be easier for me to sort
it out if you could reproduce it once more.
Ingo
On Thu, 18 Nov 2004, Ingo Molnar wrote:
>
> * Christian Meder <[email protected]> wrote:
>
> > > could you try this with the vanilla 2.6.10-rc2-mm1 kernel too? The crash
> > > you got is an escallation of a crash within a critical section, but that
> > > original crash does not seem to be directly related to PREEMPT_RT.
> >
> > Ok, tried it now. The output from 2.6.10-rc2-mm1 on removal of the prism
> > pccard is pretty innocuous and everything works fine:
> >
> > Nov 18 17:29:27 localhost kernel: hostap_cs: CS_EVENT_CARD_REMOVAL
> > Nov 18 17:29:27 localhost kernel: wifi0: card already removed or not
> > configured during shutdown
> > Nov 18 17:29:27 localhost kernel: wifi0: Interrupt, but dev not OK
> > Nov 18 17:29:27 localhost kernel: hostap_cs: Driver unloaded
>
> ok. Could you please retry with the latest kernel and USE_FRAME_POINTERS
> enabled? It wasnt completely clear from your previous log precisely
> which function generated the fault so it would be easier for me to sort
> it out if you could reproduce it once more.
That's CONFIG_FRAME_POINTER, btw.
* Rui Nuno Capela <[email protected]> wrote:
> I'm still seeing this sometimes (not everytime) on my P4/UP laptop
> while shutting down ALSA modules. This isn't the same as the lockup
> I've been reporting lately (that happens on my P4/SMT desktop) but may
> be remotely related.
could you send me the latest .config of your laptop?
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> I'm still seeing this sometimes (not everytime) on my P4/UP laptop
> while shutting down ALSA modules. This isn't the same as the lockup
> I've been reporting lately (that happens on my P4/SMT desktop) but may
> be remotely related.
could you (and Christian) try the patch below, on top of a current-ish
tree - does the unload crash still occur? (this is an earlier cleanup
patch from Thomas Gleixner, but it could fix a real PREEMPT_RT bug in
this particular case.)
Ingo
--- linux/drivers/base/driver.c.orig
+++ linux/drivers/base/driver.c
@@ -79,14 +79,13 @@ void put_driver(struct device_driver * d
* since most of the things we have to do deal with the bus
* structures.
*
- * The one interesting aspect is that we initialize @drv->unload_sem
- * to a locked state here. It will be unlocked when the driver
- * reference count reaches 0.
+ * We init the completion struct here. When the reference
+ * count reaches zero, complete() is called from bus_release().
*/
int driver_register(struct device_driver * drv)
{
INIT_LIST_HEAD(&drv->devices);
- init_MUTEX_LOCKED(&drv->unload_sem);
+ init_completion(&drv->unload_done);
return bus_add_driver(drv);
}
@@ -97,18 +96,16 @@ int driver_register(struct device_driver
*
* Again, we pass off most of the work to the bus-level call.
*
- * Though, once that is done, we attempt to take @drv->unload_sem.
- * This will block until the driver refcount reaches 0, and it is
- * released. Only modular drivers will call this function, and we
+ * Though, once that is done, we wait until the driver refcount
+ * reaches 0, and complete() is called in bus_release().
+ * Only modular drivers will call this function, and we
* have to guarantee that it won't complete, letting the driver
* unload until all references are gone.
*/
-
void driver_unregister(struct device_driver * drv)
{
bus_remove_driver(drv);
- down(&drv->unload_sem);
- up(&drv->unload_sem);
+ wait_for_completion(&drv->unload_done);
}
/**
--- linux/drivers/base/bus.c.orig
+++ linux/drivers/base/bus.c
@@ -65,7 +65,7 @@ static struct sysfs_ops driver_sysfs_ops
static void driver_release(struct kobject * kobj)
{
struct device_driver * drv = to_driver(kobj);
- up(&drv->unload_sem);
+ complete(&drv->unload_done);
}
static struct kobj_type ktype_driver = {
--- linux/include/linux/device.h.orig
+++ linux/include/linux/device.h
@@ -102,7 +102,7 @@ struct device_driver {
char * name;
struct bus_type * bus;
- struct semaphore unload_sem;
+ struct completion unload_done;
struct kobject kobj;
struct list_head devices;
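For illustration, the handshake this patch sets up can be modelled in
userspace. This is a minimal sketch in Python, not kernel code: the
DeviceDriver class, its refcount handling and the timeout parameter are
assumptions made up for the example, and threading.Event merely stands in
for struct completion.

```python
import threading


class DeviceDriver:
    """Hypothetical stand-in for struct device_driver (illustration only)."""

    def __init__(self, name):
        self.name = name
        self.refcount = 1                      # reference held by registration
        self.unload_done = threading.Event()   # plays the role of struct completion
        self._lock = threading.Lock()

    def get(self):
        """Take an extra reference, e.g. on behalf of some other user."""
        with self._lock:
            self.refcount += 1

    def put(self):
        """Drop a reference; the last put signals the waiter."""
        with self._lock:
            self.refcount -= 1
            last = self.refcount == 0
        if last:
            self.unload_done.set()             # analogous to complete()


def driver_unregister(drv, timeout=None):
    """Drop the registration reference, then wait for all others to go away.

    Returns True once the event fired, False if `timeout` seconds elapsed.
    The timeout exists only so the sketch can be exercised without hanging.
    """
    drv.put()
    return drv.unload_done.wait(timeout)       # analogous to wait_for_completion()
```

With one extra reference outstanding, driver_unregister(drv, timeout=0.05)
returns False; once the holder calls drv.put() the event is set and any
later wait returns immediately. The event has no notion of an owner, so in
this model it does not matter that the signalling side is not the side
that initialized it.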
* Christian Meder <[email protected]> wrote:
> Thanks, will try soon and report. There was another trace in my log of
> the vdr/router box which seemed unrelated to the bridging traces.
does the patch below fix that message?
Ingo
--- linux/drivers/media/dvb/dvb-core/dvb_frontend.c.orig2
+++ linux/drivers/media/dvb/dvb-core/dvb_frontend.c
@@ -658,7 +658,7 @@ static void dvb_frontend_stop (struct dv
printk("dvb_frontend_stop: thread PID %d already died\n",
fe->thread_pid);
/* make sure the mutex was not held by the thread */
- init_MUTEX (&fe->sem);
+ sema_init_nocheck (&fe->sem, 1);
return;
}
@@ -1127,10 +1127,10 @@ dvb_register_frontend (int (*ioctl) (str
memset (fe, 0, sizeof (struct dvb_frontend_data));
- init_MUTEX (&fe->sem);
+ sema_init_nocheck (&fe->sem, 1);
init_waitqueue_head (&fe->wait_queue);
init_waitqueue_head (&fe->events.wait_queue);
- init_MUTEX (&fe->events.sem);
+ sema_init_nocheck (&fe->events.sem, 1);
fe->events.eventw = fe->events.eventr = 0;
fe->events.overflow = 0;
fe->module = module;
Ingo Molnar wrote:
> i have released the -V0.7.29-0 Real-Time Preemption patch,
Hi,
is it supposed to work on x86_64? It doesn't compile.
I tried to fix the compilation errors with these 2 one-liners:
diff -Nurp linux-2.6.10-rc2-mm2-RT0.7.29-0/arch/x86_64/kernel/time.c linux-2.6.10-rc2-mm2-RT/arch/x86_64/kernel/time.c
--- linux-2.6.10-rc2-mm2-RT0.7.29-0/arch/x86_64/kernel/time.c 2004-11-18 22:16:10.728832816 +0100
+++ linux-2.6.10-rc2-mm2-RT/arch/x86_64/kernel/time.c 2004-11-18 22:00:57.000000000 +0100
@@ -49,7 +49,7 @@ static void cpufreq_delayed_get(void);
extern int using_apic_timer;
-DEFINE_RAW_SPINLOCK(rtc_lock);
+DEFINE_SPINLOCK(rtc_lock);
DEFINE_RAW_SPINLOCK(i8253_lock);
static int nohpet __initdata = 0;
diff -Nurp linux-2.6.10-rc2-mm2-RT0.7.29-0/include/asm-x86_64/vsyscall.h linux-2.6.10-rc2-mm2-RT/include/asm-x86_64/vsyscall.h
--- linux-2.6.10-rc2-mm2-RT0.7.29-0/include/asm-x86_64/vsyscall.h 2004-11-18 22:16:11.739679144 +0100
+++ linux-2.6.10-rc2-mm2-RT/include/asm-x86_64/vsyscall.h 2004-11-18 21:56:30.000000000 +0100
@@ -52,7 +52,7 @@ extern struct vxtime_data vxtime;
extern unsigned long wall_jiffies;
extern struct timezone sys_tz;
extern int sysctl_vsyscall;
-extern raw_seqlock_t xtime_lock;
+//extern raw_seqlock_t xtime_lock;
#define ARCH_HAVE_XTIME_LOCK 1
It now compiles. When I try to run it, I get instant reboot
after "BIOS data check successful".
This is not necessarily a new issue. I haven't tried running
realtime-preempt on x86_64 before.
Michal
On Thu, 2004-11-18 at 16:54 +0100, Ingo Molnar wrote:
> * Christian Meder <[email protected]> wrote:
>
> > after successfully running the last couple of rt patches on my Dell
> > Inspiron laptop I thought I'd give it a try on my combined vdr/router
> > box which is probably more interesting from a rt point of view. This
> > box is bridging wireless/ADSL and working as a digital vdr using the
> > kernel DVB-S drivers.
> >
> > I got the appended logging messages with the appended config. Is
> > there anything else I should provide for debugging purposes or are the
> > messages just harmless ?
>
> the messages mean that i havent converted the bridge code's RCU locking
> to PREEMPT_RT yet. I've done this in the -V0.7.28-2 patch that i've just
> uploaded to:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> does bridging work fine with this patch, and if yes, do you get any
> (other) warning messages?
I'm running 0.7.29 now on the vdr/router box and bridging is working
fine and there are no new warning messages.
Thanks,
Christian
--
Christian Meder, email: [email protected]
The Way-Seeking Mind of a tenzo is actualized
by rolling up your sleeves.
(Eihei Dogen Zenji)
On Thu, 2004-11-18 at 21:42 +0100, Ingo Molnar wrote:
> * Christian Meder <[email protected]> wrote:
>
> > Thanks, will try soon and report. There was another trace in my log of
> > the vdr/router box which seemed unrelated to the bridging traces.
>
> does the patch below fix that message?
Yesss. This patch applied on top of 0.7.29-0 removes all warning
messages (bridging/dvb) in my vdr/router box.
Thanks,
Christian
On Thu, 2004-11-18 at 22:05 +0100, Ingo Molnar wrote:
> * Rui Nuno Capela <[email protected]> wrote:
>
> > I'm still seeing this sometimes (not everytime) on my P4/UP laptop
> > while shutting down ALSA modules. This isn't the same as the lockup
> > I've been reporting lately (that happens on my P4/SMT desktop) but may
> > be remotely related.
>
> could you (and Christian) try the patch below, on top of a current-ish
> tree - does the unload crash still occur? (this is an earlier cleanup
> patch from Thomas Gleixner, but it could fix a real PREEMPT_RT bug in
> this particular case.)
This patch on top of 0.7.29-0 fixes the prism pccard unload crash on my
Dell laptop.
This just leaves me with the mysterious traceless jvm related crash.
I'll do my best to get a trace ;-)
Thanks,
Christian
On Thu, 2004-11-18 at 14:46 +0000, Rui Nuno Capela wrote:
> I'm still seeing this sometimes (not everytime) on my P4/UP laptop while
> shutting down ALSA modules.
Same thing here (BUG on unload ALSA modules) with 0.7.27-10 FWIW.
Lee
* Rui Nuno Capela <[email protected]> wrote:
> I'm still seeing this sometimes (not everytime) on my P4/UP laptop
> while shutting down ALSA modules. This isn't the same as the lockup
> I've been reporting lately (that happens on my P4/SMT desktop) but may
> be remotely related.
this seems to be quite similar to the module-unload crash Christian
Meder reported, and it seems to be a genuine PREEMPT_RT bug. (or an
upstream bug that the vanilla kernel ignores silently.)
Ingo
* Christian Meder <[email protected]> wrote:
> This just leaves me with the mysterious traceless jvm related crash.
> I'll do my best to get a trace ;-)
it would be equally useful to somehow reproduce the lockup with public
software only, and post the precise steps how to reproduce it.
Ingo
* Lee Revell <[email protected]> wrote:
> On Thu, 2004-11-18 at 14:46 +0000, Rui Nuno Capela wrote:
> > I'm still seeing this sometimes (not everytime) on my P4/UP laptop while
> > shutting down ALSA modules.
>
> Same thing here (BUG on unload ALSA modules) with 0.7.27-10 FWIW.
0.7.29-1 fixes a similar module-unload problem reported by Christian
Meder, could you try it?
Ingo
* Michal Schmidt <[email protected]> wrote:
> Ingo Molnar wrote:
> >i have released the -V0.7.29-0 Real-Time Preemption patch,
>
> Hi,
> is it supposed to work on x86_64? It doesn't compile.
not at the moment - it needs the timer interrupt threading changes.
Ingo
I'm getting a bug print (really a warning) from enable_irq spawned from
the e100 driver. The warning fires because enable_irq is called while
the irq depth is already zero.
Looking into this, it is because the e100 uses a shared interrupt. On
setup (see drivers/net/e100.c: e100_up) it disables the irq that it will
use, and then calls request_irq which calls setup_irq which zeros out
the depth of the irq if it is not shared. So if the e100 is the first
to be loaded, then you get this message.
I know that for now this doesn't hurt anything, but besides annoying me
in my print outs (I can't stop panicking when I see it ;-), is this
really a bug and thus a design flaw of the e100? How else can a shared
irq initialize without turning off the irq before setting itself up?
Should it enable the irq before it requests it, and thus open the race
of a spurious interrupt, or just disable all interrupts?
Thanks,
--
Steven Rostedt
Senior Engineer
Kihon Technologies
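The sequence described above can be replayed with a toy model of the irq
depth counter. This is a simplified sketch in Python under stated
assumptions, not the kernel's implementation: IrqDesc, its fields, the
initial depth of 1 and the bring_up() helper are hypothetical names and
values chosen for illustration.

```python
class IrqDesc:
    """Hypothetical, simplified stand-in for the kernel's per-IRQ descriptor."""

    def __init__(self):
        self.depth = 1        # assumption: an unused line starts out disabled
        self.handlers = 0     # number of installed handlers on this line
        self.warnings = []    # collected instead of printing a stack trace


def disable_irq(desc):
    desc.depth += 1           # nesting: every disable needs a matching enable


def enable_irq(desc):
    if desc.depth == 0:
        # Line is already enabled: this is the unbalanced case that warns.
        desc.warnings.append("enable_irq with depth 0")
        return
    desc.depth -= 1


def setup_irq(desc):
    if desc.handlers == 0:
        desc.depth = 0        # first handler: nesting is reset, line enabled
    desc.handlers += 1


def bring_up(desc):
    """The disable / request / enable sequence described for e100_up."""
    disable_irq(desc)
    setup_irq(desc)           # reached via request_irq in the description
    enable_irq(desc)
```

Calling bring_up() on a fresh IrqDesc records one warning, because
setup_irq() resets the depth that disable_irq() had just raised. Calling
it a second time on the same descriptor (a second device sharing the
line) records none, since the depth is left alone and the disable/enable
pair balances.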
On Fri, 2004-11-19 at 04:56, Ingo Molnar wrote:
> 0.7.29-1 fixes a similar module-unload problem reported by Christian
> Meder, could you try it?
>
> Ingo
>
Hi, just tried V0.7.29-1 with the ivtv module and I got this oops:
Attached scsi generic sg2 at scsi0, channel 0, id 2, lun 0, type 0
IRQ#22 thread RT prio: 38.
BUG: Unable to handle kernel NULL pointer dereference at virtual address 00000000
printing eip:
e0ab77f8
*pde = 00000000
Oops: 0002 [#1]
PREEMPT
Modules linked in: sg msp3400 saa7115 tveeprom ivtv nfsd exportfs lockd sunrpc tuner tvaudio bttv video_buf firmware_class btcx_risc snd_via82xx snd_mpu401_uart i2c_viapro joydev usbhid uhci_hcd usbcore 3c59x mii emu10k1_gp gameport snd_emu10k1 snd_rawmidi snd_seq_device snd_ac97_codec snd_pcm_oss snd_mixer_oss snd_pcm snd_timer snd_page_alloc snd_util_mem snd_hwdep snd soundcore via_agp agpgart dm_mod realtime rtc
CPU: 0
EIP: 0060:[<e0ab77f8>] Not tainted VLI
EFLAGS: 00210286 (2.6.10-rc22-V29)
EIP is at buffer_queue+0x38/0x70 [bttv]
eax: de7d4744 ebx: 00000000 ecx: d745c2a0 edx: d745c304
esi: de7d40e8 edi: e0ad34e0 ebp: d2571c34 esp: d2571c24
ds: 007b es: 007b ss: 0068 preempt: 00000001
Process mythbackend (pid: 4071, threadinfo=d2570000 task=d27fb800)
Stack: d2571c34 00200286 00200286 e0ad39dc d2571ec8 e0ab8fb8 de7d4000 d745c2a0
e0ac37b4 00000160 000001e0 00000004 00001c3c e0ad34e0 de7d4010 d27fb800
d2571c78 c01112d0 00000002 d745c2a0 c01dcf75 de7d4000 c01339a1 00200286
Call Trace:
[<c0104093>] show_stack+0x83/0xa0 (28)
[<c010424c>] show_registers+0x16c/0x1d0 (56)
[<c0104447>] die+0xf7/0x190 (64)
[<c0114fb0>] do_page_fault+0x360/0x6b0 (220)
[<c0103cab>] error_code+0x2b/0x30 (76)
[<e0ab8fb8>] bttv_do_ioctl+0x538/0x15f0 [bttv] (660)
[<c0276a4e>] video_usercopy+0x8e/0x160 (168)
[<e0aba0b4>] bttv_ioctl+0x44/0x70 [bttv] (32)
[<c01741a1>] sys_ioctl+0xe1/0x280 (44)
[<c0103245>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c010438f>] .... die+0x3f/0x190
.....[<c0114fb0>] .. ( <= do_page_fault+0x360/0x6b0)
.. [<c0136f4d>] .... print_traces+0x1d/0x60
.....[<c0104093>] .. ( <= show_stack+0x83/0xa0)
Code: eb 9a 65 df 8b 45 08 8b 4d 0c 8b 80 ec 00 00 00 8d 51 64 8b 30 c7 41 20 02 00 00 00 8d 86 5c 06 00 00 8b 58 04 89 41 64 89 50 04 <89> 13 89 5a 04 8b 8e 78 06 00 00 85 c9 74 08 8b 5d f8 8b 75 fc
BUG: mythbackend/4071, BKL held at task exit time!
BKL acquired at: sys_ioctl+0x5e/0x280
[c03b1d44] {kernel_sem.lock}
.. held by: mythbackend: 4071 [d27fb800, 119]
... acquired at: lock_kernel+0x2f/0x50
BUG: mythbackend/4071, lock held at task exit time!
[de7d4014] {&q->lock}
.. held by: mythbackend: 4071 [d27fb800, 119]
... acquired at: bttv_do_ioctl+0x476/0x15f0 [bttv]
BUG: mythbackend/4071, lock held at task exit time!
[e0ad39dc] {&btv->s_lock}
.. held by: mythbackend: 4071 [d27fb800, 119]
... acquired at: bttv_do_ioctl+0x51b/0x15f0 [bttv]
mythbackend/4071: BUG in __up_mutex at kernel/rt.c:1101
[<c01040d3>] dump_stack+0x23/0x30 (20)
[<c01340a2>] __up_mutex+0x302/0x530 (52)
[<c01351b1>] up+0x101/0x110 (36)
[<c0342df1>] __schedule+0x6b1/0x750 (72)
[<c011e797>] do_exit+0x2f7/0x590 (48)
[<c01044e0>] do_trap+0x0/0x100 (64)
[<c0114fb0>] do_page_fault+0x360/0x6b0 (220)
[<c0103cab>] error_code+0x2b/0x30 (76)
[<e0ab8fb8>] bttv_do_ioctl+0x538/0x15f0 [bttv] (660)
[<c0276a4e>] video_usercopy+0x8e/0x160 (168)
[<e0aba0b4>] bttv_ioctl+0x44/0x70 [bttv] (32)
[<c01741a1>] sys_ioctl+0xe1/0x280 (44)
[<c0103245>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c034278f>] .... __schedule+0x4f/0x750
.....[<c011e797>] .. ( <= do_exit+0x2f7/0x590)
.. [<c0135161>] .... up+0xb1/0x110
.....[<c0342df1>] .. ( <= __schedule+0x6b1/0x750)
.. [<c01342b7>] .... __up_mutex+0x517/0x530
.....[<c01351b1>] .. ( <= up+0x101/0x110)
.. [<c0136f4d>] .... print_traces+0x1d/0x60
.....[<c01040d3>] .. ( <= dump_stack+0x23/0x30)
Regards,
Shane
Steven Rostedt wrote:
> I'm getting a bug print (really a warning) from enable_irq spawned from
> the e100 driver. The warning fires because enable_irq is called while
> the irq depth is already zero.
>
> Looking into this, it is because the e100 uses a shared interrupt. On
> setup (see drivers/net/e100.c: e100_up) it disables the irq that it will
> use, and then calls request_irq which calls setup_irq which zeros out
> the depth of the irq if it is not shared. So if the e100 is the first
> to be loaded, then you get this message.
>
> I know that for now this doesn't hurt anything, but besides annoying me
> in my print outs (I can't stop panicking when I see it ;-), is this
> really a bug and thus a design flaw of the e100? How else can a shared
> irq initialize without turning off the irq before setting itself up?
>
> Should it enable the irq before it requests it, and thus open the race
> of a spurious interrupt, or just disable all interrupts?
>
> Thanks,
>
Actually I think it shouldn't call either enable or disable because it
is shared (or allowed to be shared). After creating a patch myself to
fix this I realized that it had already been fixed in the newest version
of the driver on sourceforge. Anyway if you are interested in this fix
temporarily, here it is.
kr
Hi Ingo,
in 29-4 I get a lot of these lines in my log:
drivers/usb/input/hid-core.c: input irq status -71 received
29-0 didn't do that.
Kind regards,
--
Peter Zijlstra <[email protected]>
On Thu, 2004-11-18 at 17:46 +0100, Ingo Molnar wrote:
> i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
I tried this with CONFIG_PREEMPT_VOLUNTARY (which should theoretically
work like the earlier VP patches, right?) to test for regressions. The
boot process hung after initializing my IDE controller.
Lee
On Fri, 19 Nov 2004 22:22:42 -0500
Lee Revell <[email protected]> wrote:
> On Thu, 2004-11-18 at 17:46 +0100, Ingo Molnar wrote:
> > i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
> > downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> I tried this with CONFIG_PREEMPT_VOLUNTARY (which should theoretically
> work like the earlier VP patches, right?) to test for regressions. The
> boot process hung after initializing my IDE controller.
I thought so, too, until ingo set me straight (quote ingo):
here are the different layers of preemption:
- !PREEMPT
- PREEMPT_VOLUNTARY
- PREEMPT
- PREEMPT_RT
each step forward decreases latencies, at the cost of more runtime
overhead.
so PREEMPT_VOLUNTARY is not "the" feature anymore, it's "a" feature down
the hierarchy. In fact the focus is mostly on PREEMPT_RT now.
quote end..
So with RP kernels, PREEMPT is what gives the best latency when full realtime
preemption is not an option.
flo
* Lee Revell <[email protected]> wrote:
> On Thu, 2004-11-18 at 17:46 +0100, Ingo Molnar wrote:
> > i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
> > downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> I tried this with CONFIG_PREEMPT_VOLUNTARY (which should theoretically
> work like the earlier VP patches, right?) to test for regressions.
> The boot process hung after initializing my IDE controller.
which patch did you try? I fixed the 'lower' preemption levels in
-V0.7.29-4, earlier kernels are broken.
Ingo
* Florian Schmidt <[email protected]> wrote:
> here are the different layers of preemption:
>
> - !PREEMPT
> - PREEMPT_VOLUNTARY
> - PREEMPT
> - PREEMPT_RT
>
> each step forward decreases latencies, at the cost of more runtime
> overhead.
yes, and here is how they show up in the config:
( ) No Forced Preemption (Server)
( ) Voluntary Kernel Preemption (Desktop)
( ) Preemptible Kernel (Low-Latency Desktop)
(X) Complete Preemption (Real-Time)
measurements are needed to find out how latencies and runtime overhead
vary between these levels.
Ingo
* Shane Shrybman <[email protected]> wrote:
> On Fri, 2004-11-19 at 04:56, Ingo Molnar wrote:
>
> > 0.7.29-1 fixes a similar module-unload problem reported by Christian
> > Meder, could you try it?
> >
> > Ingo
> >
>
> Hi, just tried V0.7.29-1 with the ivtv module and I got this oops:
hm, does vanilla 2.6.10-rc2-mm2 work?
Ingo
* Peter Zijlstra <[email protected]> wrote:
> Hi Ingo,
>
> in 29-4 I get a lot of these lines in my log:
>
> drivers/usb/input/hid-core.c: input irq status -71 received
>
> 29-0 didn't do that.
weird. I've attached the diff between -0 and -4 - nothing should affect
USB, except perhaps the manage.c bits. Could you try to revert the
smaller plaintext patch below, does that solve this problem?
Ingo
--- linux.old/kernel/irq/manage.c
+++ linux.new/kernel/irq/manage.c
@@ -509,9 +509,7 @@ static int start_irq_thread(int irq, str
* such a case:
*/
smp_mb();
-
- if (desc->status & IRQ_INPROGRESS)
- wake_up_process(desc->thread);
+ wake_up_process(desc->thread);
return 0;
}
On Sat, 2004-11-20 at 13:55 +0100, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
>
> > On Thu, 2004-11-18 at 17:46 +0100, Ingo Molnar wrote:
> > > i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
> > > downloaded from the usual place:
> > >
> > > http://redhat.com/~mingo/realtime-preempt/
> >
> > I tried this with CONFIG_PREEMPT_VOLUNTARY (which should theoretically
> > work like the earlier VP patches, right?) to test for regressions.
> > The boot process hung after initializing my IDE controller.
>
> which patch did you try? I fixed the 'lower' preemption levels in
> -V0.7.29-4, earlier kernels are broken.
>
This was the version I tried. I will try to provide some more info.
Lee
* Lee Revell <[email protected]> wrote:
> > > > http://redhat.com/~mingo/realtime-preempt/
> > >
> > > I tried this with CONFIG_PREEMPT_VOLUNTARY (which should theoretically
> > > work like the earlier VP patches, right?) to test for regressions.
> > > The boot process hung after initializing my IDE controller.
> >
> > which patch did you try? I fixed the 'lower' preemption levels in
> > -V0.7.29-4, earlier kernels are broken.
> >
>
> This was the version I tried. I will try to provide some more info.
i only tried the !PREEMPT version though - does that one work for you?
Also, please send me the .config that produces the failing kernel.
Ingo
On Sat, 2004-11-20 at 20:14 +0100, Ingo Molnar wrote:
> i only tried the !PREEMPT version though - does that one work for you?
Not sure, will test. My goal was to see if I could get the stability
and low latency of T3 (this is low enough latency for me!) with the new
versions.
> Also, please send me the .config that produces the failing kernel.
Sent (off-list).
Lee
On Sat, 20 Nov 2004 13:35:44 -0500
Lee Revell <[email protected]> wrote:
> On Sat, 2004-11-20 at 20:14 +0100, Ingo Molnar wrote:
> > i only tried the !PREEMPT version though - does that one work for you?
>
> Not sure, will test. My goal was to see if I could get the stability
> and low latency of T3 (this is low enough latency for me!) with the new
> versions.
>
> > Also, please send me the .config that produces the failing kernel.
>
> Sent (off-list).
Hi,
29-4 with PREEMPT works very well (jackd at 64 frames: 0 xruns (running for
1h now), soundcard irq unthreaded). Opposed to 29-1 PREEMPT_REALTIME which
showed some very weird jackd behaviour (xruns from 10usec to 50msec [!!!]).
rtc_wakeup was showing no large jitter for that kernel though, nor did the
different traces show anything that might have caused the jackd xruns. And
yes, i configured the irq handlers sanely :)
Will build 29-4 PREEMPT_REALTIME now and see how this one behaves.
flo
On Sat, 2004-11-20 at 20:14 +0100, Ingo Molnar wrote:
> i only tried the !PREEMPT version though - does that one work for you?
> Also, please send me the .config that produces the failing kernel.
>
OK it allows me to set PREEMPT_NONE, PREEMPT_SOFTIRQS, and
PREEMPT_HARDIRQS. This should be an illegal combination, right?
Lee
Ingo Molnar wrote:
> i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
I have some latency test results from -V0.7.29-4 generated using the rtc
histograms and realfeel. The test runs were just over an hour under
heavy load from stress-kernel. One is from a slower 450 UP system and
one is from a 933 SMP system. I will be doing more testing but these are
a start.
http://www.cybsft.com/testresults/histograms/up450test1.hist.png
http://www.cybsft.com/testresults/histograms/up450test1.hist.png
kr
On Sat, 20 Nov 2004 20:11:55 +0100
Florian Schmidt <[email protected]> wrote:
> 29-4 with PREEMPT works very well (jackd at 64 frames: 0 xruns (running for
> 1h now), soundcard irq unthreaded). Opposed to 29-1 PREEMPT_REALTIME which
> showed some very weird jackd behaviour (xruns from 10usec to 50msec [!!!]).
> rtc_wakeup was showing no large jitter for that kernel though, nor did the
> different traces show anything that might have caused the jackd xruns. And
> yes, i configured the irq handlers sanely :)
>
> Will build 29-4 PREEMPT_REALTIME now and see how this one behaves.
Pretty much as bad as 29-1. Sadly i have no idea on how to find out what is
causing jackd to act so weird under a PREEMPT_REALTIME kernel. It seems
there is some correlation to activity on X. Hiding and showing windows has a
certain chance of triggering a large xrun.
Hmm, the max jitter rtc_wakeup shows at 1024hz is around 150us. Which seems
a tiny bit large, too, as the rtc histogram shows a max wakeup latency of
16us..
It seems it's not the threaded irq handlers as jackd performed quite well
under 29-4 PREEMPT with the soundcard irq handler threaded and at high prio
(which i forgot to mention in my previous mail).
So i don't really know how to go about this. I suppose i just run PREEMPT
kernels instead of PREEMPT_REALTIME. Maybe it's the overhead which is
killing jackd performance with PREEMPT_REALTIME, but i don't believe so
(50ms? nah!).
flo
P.S.: There are so many variables here: PREEMPT vs. PREEMPT_REALTIME,
handlers threaded vs. unthreaded, IRQ handler thread priorities. It would
probably be cool if we could create some testing procedure which produces
comparable results. Ideally this procedure would be automated. Any takers?
K.R. Foley wrote:
> Ingo Molnar wrote:
>
>> i have released the -V0.7.29-0 Real-Time Preemption patch, which can be
>> downloaded from the usual place:
>>
>
> I have some latency test results from -V0.7.29-4 generated using the rtc
> histograms and realfeel. The test runs were just over an hour under
> heavy load from stress-kernel. One is from a slower 450 UP system and
> one is from a 933 SMP system. I will be doing more testing but these are
> a start.
>
> http://www.cybsft.com/testresults/histograms/up450test1.hist.png
>
> http://www.cybsft.com/testresults/histograms/up450test1.hist.png
OK. I feel stupid. The above URL should be:
http://www.cybsft.com/testresults/histograms/smp933test1.hist.png
kr
On Sat, 2004-11-20 at 20:14 +0100, Ingo Molnar wrote:
> i only tried the !PREEMPT version though - does that one work for you?
Yup, !PREEMPT works fine. Testing PREEMPT next. So far only
PREEMPT_VOLUNTARY fails to boot.
Lee
* Florian Schmidt <[email protected]> wrote:
> > Will build 29-4 PREEMPT_REALTIME now and see how this one behaves.
>
> Pretty much as bad as 29-1. Sadly i have no idea on how to find out
> what is causing jackd to act so weird under a PREEMPT_REALTIME kernel.
> It seems there is some correlation to activity on X. Hiding and
> showing windows has a certain chance of triggering a large xrun.
have you chrt-ed the IRQ#0 thread and the soundcard thread as well?
> So i don't really know how to go about this. I suppose i just run
> PREEMPT kernels instead of PREEMPT_REALTIME. Maybe it's the overhead
> which is killing jackd performance with PREEMPT_REALTIME, but i don't
> believe so (50ms? nah!).
agreed, no way can 50msec be related to overhead.
Ingo
* Lee Revell <[email protected]> wrote:
> > i only tried the !PREEMPT version though - does that one work for you?
> > Also, please send me the .config that produces the failing kernel.
>
> OK it allows me to set PREEMPT_NONE, PREEMPT_SOFTIRQS, and
> PREEMPT_HARDIRQS. This should be an illegal combination, right?
in theory it should work just fine.
Ingo
* Florian Schmidt <[email protected]> wrote:
> So i don't really know how to go about this. [...]
you could try to use user-triggered tracing to capture a trace of one
such longer delay.
Ingo
* Florian Schmidt <[email protected]> wrote:
> Hmm, the max jitter rtc_wakeup shows at 1024hz is around 150us. Which
> seems a tiny bit large, too, as the rtc histogram shows a max wakeup
> latency of 16us..
yep, that's a bit too large too. What type of load does it need to
trigger such a 150 usec delay reliably?
Ingo
* Ingo Molnar <[email protected]> wrote:
> * Lee Revell <[email protected]> wrote:
>
> > > i only tried the !PREEMPT version though - does that one work for you?
> > > Also, please send me the .config that produces the failing kernel.
> >
> > OK it allows me to set PREEMPT_NONE, PREEMPT_SOFTIRQS, and
> > PREEMPT_HARDIRQS. This should be an illegal combination, right?
>
> in theory it should work just fine.
hm, in practice it doesnt work - this is what causes the boot-time hang
you saw during PREEMPT_VOLUNTARY. I'll make irq threading depend on
PREEMPT, for the time being.
Ingo
* Ingo Molnar <[email protected]> wrote:
> * Ingo Molnar <[email protected]> wrote:
>
> > * Lee Revell <[email protected]> wrote:
> >
> > > > i only tried the !PREEMPT version though - does that one work for you?
> > > > Also, please send me the .config that produces the failing kernel.
> > >
> > > OK it allows me to set PREEMPT_NONE, PREEMPT_SOFTIRQS, and
> > > PREEMPT_HARDIRQS. This should be an illegal combination, right?
> >
> > in theory it should work just fine.
>
> hm, in practice it doesnt work - this is what causes the boot-time
> hang you saw during PREEMPT_VOLUNTARY. I'll make irq threading depend
> on PREEMPT, for the time being.
this change is in the -5 kernel i just uploaded.
Ingo
> * Florian Schmidt <[email protected]> wrote:
>
> > Hmm, the max jitter rtc_wakeup shows at 1024hz is around 150us. Which
> > seems a tiny bit large, too, as the rtc histogram shows a max wakeup
> > latency of 16us..
>
> yep, that's a bit too large too. What type of load does it need to
> trigger such a 150 usec delay reliably?
on a 2 GHz UP box the worst-case max jitter i can trigger via rtc_wakeup
is 11 usecs, using the -5 kernel. The workload i used was 40 parallel
copies of LTP plus a few hackbench runs. This is how i started
rtc_wakeup:
chrt -f 80 -p `pidof 'IRQ 0'`
chrt -f 98 -p `pidof 'IRQ 8'`
cd rtc_wakeup
./rtc_wakeup -f 1024 -t 100000
i.e. IRQ0 is below IRQ8 and the rtc_wakeup threads, but above every
other IRQ thread. Here's the histogram of a short (~5 minutes) run:
1 247383
2 34842
3 1488
4 3188
5 125
6 1
so this is a 6 usecs max delay measured by /dev/rtc. So on your box, if the
max histogram delay was 16 usecs, i'd not expect a worse than ~30 usecs
jitter measured by rtc_wakeup. Can you reproduce the 150 usecs jitter
with the above IRQ setup?
Ingo
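The histogram above can be summarized with a few lines of arithmetic. A
minimal sketch in Python follows; it assumes the first column is the delay
in usecs, the second column is the sample count, and that there is one
sample per RTC interrupt at 1024 Hz. The function names are made up for
this example.

```python
# Histogram quoted above: delay in usecs -> number of samples.
HIST = {1: 247383, 2: 34842, 3: 1488, 4: 3188, 5: 125, 6: 1}


def summarize(hist, rate_hz=1024):
    """Return (samples, max_delay_us, mean_delay_us, duration_s).

    Assumes one histogram sample per RTC interrupt at `rate_hz`.
    """
    samples = sum(hist.values())
    max_delay = max(delay for delay, count in hist.items() if count > 0)
    mean_delay = sum(delay * count for delay, count in hist.items()) / samples
    return samples, max_delay, mean_delay, samples / rate_hz


def fraction_at_or_below(hist, limit_us):
    """Share of samples whose delay is <= limit_us."""
    samples = sum(hist.values())
    return sum(c for d, c in hist.items() if d <= limit_us) / samples
```

For the numbers above, summarize(HIST) gives 287027 samples, a maximum of
6 usecs, a mean of roughly 1.17 usecs and about 280 seconds of runtime at
1024 Hz, which is consistent with the "~5 minutes" run. About 98.3% of the
samples are at 2 usecs or below.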
* Ingo Molnar <[email protected]> wrote:
> > So i don't really know how to go about this. I suppose i just run
> > PREEMPT kernels instead of PREEMPT_REALTIME. Maybe it's the overhead
> > which is killing jackd performance with PREEMPT_REALTIME, but i don't
> > believe so (50ms? nah!).
>
> agreed, no way can 50msec be related to overhead.
there's one exception: if the RT workload is _just_ below 100% CPU
utilization, then PREEMPT_RT's overhead could push it above 100% and
trigger CPU overload, with big delays. What is the maximum CPU usage
during the test, while the system is otherwise idle?
Ingo
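The overload case described above comes down to simple arithmetic on the
audio period. The sketch below is in Python and all concrete numbers in it
are assumptions: the thread does not state the sample rate, so 48 kHz is
assumed, and the 1300 usecs of work per period and the 5% overhead are
invented purely to show the effect.

```python
def period_us(frames, sample_rate_hz):
    """Length of one audio period in microseconds."""
    return frames * 1_000_000 / sample_rate_hz


def overloaded(work_us, period, overhead_fraction):
    """True if the per-period work no longer fits once overhead is added."""
    return work_us * (1.0 + overhead_fraction) > period
```

At an assumed 48 kHz, period_us(64, 48000) is about 1333 usecs. A
hypothetical workload needing 1300 usecs per period fits with no extra
overhead, but 5% overhead pushes it to 1365 usecs and past the deadline
on every period, while a workload needing 600 usecs is unaffected. A
single 50 msec xrun would span about 37 such periods, which is why it is
hard to attribute to overhead unless the workload is already near 100%.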
* Florian Schmidt <[email protected]> wrote:
> [...] Opposed to 29-1 PREEMPT_REALTIME which showed some very weird
> jackd behaviour (xruns from 10usec to 50msec [!!!]). rtc_wakeup was
> showing no large jitter for that kernel though, nor did the different
> traces show anything that might have caused the jackd xruns. And yes,
> i configured the irq handlers sanely :)
i too am seeing constant, periodic xruns coming every 100 millisecs or
so. Simply running current Jack CVS via 'jackd -R -p1024 -d alsa' gives
tons of periodic xruns. Are you seeing the same?
Ingo
* Ingo Molnar <[email protected]> wrote:
> i too am seeing constant, periodic xruns coming every 100 millisecs or
> so. Simply running current Jack CVS via 'jackd -R -p1024 -d alsa'
> gives tons of periodic xruns. Are you seeing the same?
but i'm seeing the same with !PREEMPT too - perhaps something broke in
ALSA?
Ingo
On Sun, 21 Nov 2004 16:18:45 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > i too am seeing constant, periodic xruns coming every 100 millisecs or
> > so. Simply running current Jack CVS via 'jackd -R -p1024 -d alsa'
> > gives tons of periodic xruns. Are you seeing the same?
>
> but i'm seeing the same with !PREEMPT too - perhaps something broke in
> ALSA?
possible, especially as i see those large xruns with jackd but nothing
comparable with rtc_wakeup. OTOH, plain PREEMPT seems to work just fine.
So far i have only tested PREEMPT and PREEMPT_RT.
Ah, i use jackd 0.99 from a debian package btw..
flo
On Sun, 21 Nov 2004 13:45:55 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > > Will build 29-4 PREEMPT_REALTIME now and see how this one behaves.
> >
> > Pretty much as bad as 29-1. Sadly i have no idea on how to find out
> > what is causing jackd to act so weird under a PREEMPT_REALTIME kernel.
> > It seems there is some correlation to activity on X. Hiding and
> > showing windows has a certain chance of triggering a large xrun.
>
> have you chrt-ed the IRQ#0 thread and the soundcard thread as well?
yes. I tried the following combinations with PREEMPT_RT:
- IRQ 0 prio 40, IRQ 3 (soundcard) prio 98
- IRQ 0 prio 99, IRQ 3 prio 98
all other IRQ's at prios around 40-50 (default or set explicitly to 40).
What is the recommended setting for IRQ 0? I thought in this typical
thread-wakeup-by-IRQ scenario the scheduler is "shorted" anyways when an IRQ
occurs, so IRQ 0's prio shouldn't really matter.
flo
On Sun, 21 Nov 2004 13:50:23 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > So i don't really know how to go about this. [...]
>
> you could try to use user-triggered tracing to capture a trace of one
> such longer delay.
Ok, will do (searching relevant email).
flo
On Sun, 21 Nov 2004 13:54:39 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > Hmm, the max jitter rtc_wakeup shows at 1024hz is around 150us. Which
> > seems a tiny bit large, too, as the rtc histogram shows a max wakeup
> > latency of 16us..
>
> yep, that's a bit too large too. What type of load does it need to
> trigger such a 150 usec delay reliably?
I cannot trigger it reliably yet. :(
flo
On Sun, 21 Nov 2004 14:43:54 +0100
Ingo Molnar <[email protected]> wrote:
> on a 2 GHz UP box the worst-case max jitter i can trigger via rtc_wakeup
> is 11 usecs, using the -5 kernel. The workload i used was 40 parallel
> copies of LTP plus a few hackbench runs. This is how i started
> rtc_wakeup:
>
> chrt -f 80 -p `pidof 'IRQ 0'`
> chrt -f 98 -p `pidof 'IRQ 8'`
>
> cd rtc_wakeup
> ./rtc_wakeup -f 1024 -t 100000
>
> i.e. IRQ0 is below IRQ8 and the rtc_wakeup threads, but above every
> other IRQ thread. Here's the histogram of a short (~5 minutes) run:
Ah, ok, this makes sense.. Will try the same. Btw: one more question wrt the
IRQ prios:
Let's assume i have IRQ 0 at 80, my soundcard and the rtc irq both at prio
98 and all others around 40. Now the rtc handler should never get in the way
of the soundcard irq if the rtc is simply not used right? And of course, the
other way around, too. the soundcard irq should not get in the way of the
rtc handler if the soundcard simply is not used and not generating IRQ's?
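The rule quoted above (IRQ 0 below the latency-critical IRQ threads but above every other IRQ thread) can be written down as a small check. This is only a sketch: the function names, the thread-name keys and the pid map are made up for illustration, and `apply_layout` assumes a Linux host, root privileges and pids looked up beforehand (e.g. via pidof).

```python
import os


def validate_layout(prios, timer, critical):
    """Return a list of problems; an empty list means the layout follows the rule."""
    problems = []
    # The timer thread must sit strictly below every latency-critical thread.
    for name in critical:
        if prios[timer] >= prios[name]:
            problems.append(f"{timer} ({prios[timer]}) is not below {name} ({prios[name]})")
    # Every remaining thread must sit strictly below the timer thread.
    for name, prio in prios.items():
        if name != timer and name not in critical and prio >= prios[timer]:
            problems.append(f"{name} ({prio}) is not below {timer} ({prios[timer]})")
    return problems


def apply_layout(prios, pids):
    """Set SCHED_FIFO priorities for the given pids (needs root, Linux only)."""
    for name, prio in prios.items():
        os.sched_setscheduler(pids[name], os.SCHED_FIFO, os.sched_param(prio))


# Layout from the question above; "IRQ 14" stands in for "all others around 40".
layout = {"IRQ 0": 80, "IRQ 3": 98, "IRQ 8": 98, "IRQ 14": 40}
print(validate_layout(layout, timer="IRQ 0", critical=["IRQ 3", "IRQ 8"]))
```

The check says nothing about whether an idle IRQ thread can interfere with another one; it only validates the ordering.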
>
> 1 247383
> 2 34842
> 3 1488
> 4 3188
> 5 125
> 6 1
>
> so this a 6 usecs max delay measured by /dev/rtc. So on your box, if the
> max histogram delay was 16 usecs, i'd not expect a worse than ~30 usecs
> jitter measured by rtc_wakeup. Can you reproduce the 150 usecs jitter
> with the above IRQ setup?
not yet.
flo
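For reference, the histogram quoted above can be summarized with a few lines of Python. This is a sketch only: it assumes the first column is the delay in usecs and the second the number of samples, and that each sample corresponds to one RTC interrupt at the 1024 Hz given on the rtc_wakeup command line. `summarize` is an illustrative name, not part of rtc_wakeup.

```python
def summarize(hist, freq_hz):
    """Summarize a {delay_us: count} histogram sampled at freq_hz."""
    samples = sum(hist.values())
    worst = max(us for us, count in hist.items() if count > 0)
    mean = sum(us * count for us, count in hist.items()) / samples
    return {
        "samples": samples,
        "worst_us": worst,
        "mean_us": mean,
        # Run length implied by the sample count, assuming one sample per tick.
        "seconds": samples / freq_hz,
    }


# Values copied from the quoted histogram.
HIST = {1: 247383, 2: 34842, 3: 1488, 4: 3188, 5: 125, 6: 1}

print(summarize(HIST, freq_hz=1024))
```

Under those assumptions the sample count works out to a bit under five minutes of run time, which matches the "~5 minutes" mentioned in the quoted text.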
On Sun, 21 Nov 2004 16:12:25 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > [...] Opposed to 29-1 PREEMPT_REALTIME which showed some very weird
> > jackd behaviour (xruns from 10usec to 50msec [!!!]). rtc_wakeup was
> > showing no large jitter for that kernel though, nor did the different
> > traces show anything that might have caused the jackd xruns. And yes,
> > i configured the irq handlers sanely :)
>
> i am too seeing constant, periodic xruns coming every 100 millisecs or
> so. Simply running current Jack CVS via 'jackd -R -p1024 -d alsa' gives
> tons of periodic xruns. Are you seeing the same?
Nope, i don't see those.
- my soundcard doesn't allow periodsizes > 512, so i ran with 512 now
- the xruns are very sporadic. My gnome desktop has a little toolbar thingy
which can hide all windows on a click and show em all again on the next
click. I _think_ there _might_ be a correlation to rapidly clicking that
hide/show all windows thing. But i'm not sure at all since it is not really
reproducible.
.config attached (29-4)
Will try 29-5 today (although my time might be limited due to some visitors)
flo
* K.R. Foley <[email protected]> wrote:
> >Looking into this, it is because the e100 uses a shared interrupt. On
> >setup (see drivers/net/e100.c: e100_up) it disables the irq that it will
> >use, and then calls request_irq which calls setup_irq which zeros out
> >the depth of the irq if it is not shared. So if the e100 is the first
> >to be loaded, then you get this message.
> Actually I think it shouldn't call either enable or disable because it
> is shared (or allowed to be shared). After creating a patch myself to
> fix this I realized that it had already been fixed in the newest
> version of the driver on sourceforge. Anyway if you are interested in
> this fix temporarily, here it is.
i've included this in my tree, will drop it once -mm merges the
sourceforge e100 driver.
Ingo
i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
the biggest change in this release is a set of fixes for
priority-inheritance bugs uncovered by Esben Nielsen's pi_test suite.
These bugs could explain some of the jackd-under-load latencies reported.
Changes since -V0.7.29-0:
- priority inheritance handling fixes:
- sort the RT wakees at wakeup time, not at block-time: an RT task
might have gotten boosted while it slept.
- fix priority-restoration bug at mutex-release time
- use task_rt() not p->policy to determine whether a task needs
PI handling - a SCHED_OTHER task might be boosted to RT prio.
- fix mutex_setprio() bug: queue now-RT tasks to the active array,
otherwise expired SCHED_OTHER tasks will not be properly boosted.
- went back to the mask-and-delay method of handling hardirqs on
UP-IOAPIC as well. Due to APIC prioritization hardirqs can get
delayed by another, unacked hardirq, so the quick method needs more
work before it can be used.
- added Thomas Gleixner's semaphore -> completion changes for
drv->unload_sem. This fixes the module unload crashes reported by
Rui Nuno Capela and Shane Shrybman.
- dvb mutex updates for RT, this fixes the bug reported by Christian
Meder.
- e100 fix from K.R. Foley - this should fix the boot-time e100
enable_irq warning.
- NFS lockd mutex RT fixes from Thomas Gleixner - this could fix some
of the bugs reported by Bill Huey.
- PREEMPT_VOLUNTARY fixes - this could fix the boot-time hang reported
by Lee Revell.
- wake up irq thread upon creation - this solves the 'irq thread only
changes priority after first interrupt arrives' anomaly reported.
to create a -V0.7.30-2 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm2/2.6.10-rc2-mm2.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm2-V0.7.30-2
Ingo
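The first priority-inheritance item in the changelog above (sort the waiters at wakeup time, not at block time) can be illustrated with a toy model. This is not the -RT mutex code. `Waiter`, `enqueue_sorted` and the two pick functions are hypothetical names, and the model assumes the kernel-internal convention that a lower prio value means a higher priority.

```python
from dataclasses import dataclass


@dataclass
class Waiter:
    name: str
    prio: int  # assumed kernel-style: lower value means higher priority


def enqueue_sorted(queue, waiter):
    """Block-time ordering: insert by the priority the task has when it blocks."""
    idx = 0
    while idx < len(queue) and queue[idx].prio <= waiter.prio:
        idx += 1
    queue.insert(idx, waiter)


def pick_block_order(queue):
    """Trust the block-time order and wake the head of the queue."""
    return queue[0]


def pick_at_wakeup(queue):
    """Re-evaluate priorities at wakeup time and wake the best waiter."""
    return min(queue, key=lambda w: w.prio)


queue = []
rt_task = Waiter("rt_task", 50)
other_task = Waiter("other_task", 120)
enqueue_sorted(queue, rt_task)
enqueue_sorted(queue, other_task)

# other_task gets boosted while it sleeps; the queue order is now stale.
other_task.prio = 10

print(pick_block_order(queue).name, pick_at_wakeup(queue).name)
```

In this model the block-time order still wakes rt_task first, while the wakeup-time pick selects the boosted other_task.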
On Mon, 22 Nov 2004 01:54:11 +0100
Ingo Molnar <[email protected]> wrote:
> i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> downloaded from the usual place:
> the biggest change in this release are fixes for priority-inheritance
> bugs uncovered by Esben Nielsen pi_test suite. These bugs could explain
> some of the jackd-under-load latencies reported.
It seems these large load related xruns are gone :) At least i wasn't able
to trigger any during my uptime of 52 min. Will report if i ever see any of
those again.
flo
* Florian Schmidt <[email protected]> wrote:
> On Mon, 22 Nov 2004 01:54:11 +0100
> Ingo Molnar <[email protected]> wrote:
>
> > i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> > downloaded from the usual place:
>
> > the biggest change in this release are fixes for priority-inheritance
> > bugs uncovered by Esben Nielsen pi_test suite. These bugs could explain
> > some of the jackd-under-load latencies reported.
>
> It seems these large load related xruns are gone :) At least i wasn't
> able to trigger any during my uptime of 52 min. Will report if i ever
> see any of those again.
great. I now suspect that some of the xrun problems Rui was observing on
-RT kernel could be (positively) affected by these fixes too.
Ingo
Nov 21 19:22:42 eran kernel: (multiload-apple/5506/CPU#0): 1694 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 1694 us, entries: 21 (21) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: multiload-apple/5506, uid:500 nice:0 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
131 88000006 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): (115) ((116))
131 88000005 0.000ms (+0.000ms): (5506) ((131))
131 88000005 0.001ms (+0.000ms): try_to_wake_up (wake_up_process)
131 88000005 0.001ms (+0.000ms): (0) ((1))
131 88000004 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000004 0.001ms (+0.000ms): wake_up_process (__up_mutex)
131 88000003 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 88000002 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 08000001 0.002ms (+0.000ms): preempt_schedule (__schedule)
131 08000001 0.002ms (+0.000ms): sched_clock (__schedule)
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.687ms): dequeue_task (deactivate_task)
5506 80000002 1.691ms (+0.001ms): __switch_to (__schedule)
5506 80000002 1.692ms (+0.000ms): (131) ((5506))
5506 80000002 1.693ms (+0.000ms): (116) ((115))
5506 80000002 1.693ms (+0.000ms): finish_task_switch (__schedule)
5506 80000001 1.693ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
5506 80000001 1.693ms (+0.005ms): (5506) ((115))
5506 80000001 1.699ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
==================
Nov 21 19:30:58 eran kernel: (firefox-bin/14054/CPU#0): 1991 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 1991 us, entries: 21 (21) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: firefox-bin/14054, uid:500 nice:0 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
131 88000006 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): (115) ((116))
131 88000005 0.000ms (+0.000ms): <000036e6> ((131))
131 88000005 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
131 88000005 0.001ms (+0.000ms): (0) ((1))
131 88000004 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000004 0.001ms (+0.000ms): wake_up_process (__up_mutex)
131 88000003 0.001ms (+0.000ms): preempt_schedule (__up_mutex)
131 88000002 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 08000001 0.002ms (+0.000ms): preempt_schedule (__schedule)
131 08000001 0.002ms (+0.000ms): sched_clock (__schedule)
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.986ms): dequeue_task (deactivate_task)
14054 80000002 1.989ms (+0.000ms): __switch_to (__schedule)
14054 80000002 1.989ms (+0.000ms): (131) (<000036e6>)
14054 80000002 1.989ms (+0.000ms): (116) ((115))
14054 80000002 1.990ms (+0.000ms): finish_task_switch (__schedule)
14054 80000001 1.990ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
14054 80000001 1.990ms (+0.006ms): <000036e6> ((115))
14054 80000001 1.996ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
==================
Nov 21 19:31:44 eran kernel: (IRQ 0/2/CPU#0): 1213 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 1213 us, entries: 23 (23) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:49
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
3 88010003 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
3 88010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
3 88010002 0.000ms (+0.000ms): (50) ((105))
3 88010002 0.000ms (+0.000ms): (2) ((3))
3 88010002 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
3 88010002 0.001ms (+0.000ms): (0) ((1))
3 88010001 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
3 88010001 0.001ms (+0.000ms): wake_up_process (redirect_hardirq)
3 88010000 0.001ms (+0.000ms): preempt_schedule (__do_IRQ)
3 88010000 0.002ms (+0.000ms): irq_exit (do_IRQ)
3 88000001 0.002ms (+0.000ms): do_softirq (irq_exit)
3 88000001 0.002ms (+0.000ms): __do_softirq (do_softirq)
3 88000000 0.002ms (+0.000ms): preempt_schedule_irq (need_resched)
3 98000000 0.003ms (+0.000ms): __schedule (preempt_schedule_irq)
3 98000000 0.003ms (+0.000ms): profile_hit (__schedule)
3 98000001 0.003ms (+0.000ms): sched_clock (__schedule)
2 80000002 0.004ms (+1.207ms): __switch_to (__schedule)
2 80000002 1.212ms (+0.000ms): (3) ((2))
2 80000002 1.212ms (+0.000ms): (105) ((50))
2 80000002 1.212ms (+0.000ms): finish_task_switch (__schedule)
2 80000001 1.212ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000001 1.212ms (+0.006ms): (2) ((50))
2 80000001 1.219ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
==================
Nov 21 19:36:18 eran kernel: (firefox-bin/14054/CPU#0): 1881 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 1881 us, entries: 21 (21) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: firefox-bin/14054, uid:500 nice:0 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
131 88000006 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): (115) ((116))
131 88000005 0.000ms (+0.000ms): <000036e6> ((131))
131 88000005 0.001ms (+0.000ms): try_to_wake_up (wake_up_process)
131 88000005 0.001ms (+0.000ms): (0) ((1))
131 88000004 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000004 0.001ms (+0.000ms): wake_up_process (__up_mutex)
131 88000003 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 88000002 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 08000001 0.002ms (+0.000ms): preempt_schedule (__schedule)
131 08000001 0.002ms (+0.000ms): sched_clock (__schedule)
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.875ms): dequeue_task (deactivate_task)
14054 80000002 1.878ms (+0.001ms): __switch_to (__schedule)
14054 80000002 1.880ms (+0.000ms): (131) (<000036e6>)
14054 80000002 1.880ms (+0.000ms): (116) ((115))
14054 80000002 1.880ms (+0.000ms): finish_task_switch (__schedule)
14054 80000001 1.880ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
14054 80000001 1.881ms (+0.008ms): <000036e6> ((115))
14054 80000001 1.890ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
==================
Nov 21 19:37:18 eran kernel: (firefox-bin/14054/CPU#0): 1347 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 1347 us, entries: 21 (21) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: firefox-bin/14054, uid:500 nice:0 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
131 88000006 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): (115) ((116))
131 88000005 0.000ms (+0.000ms): <000036e6> ((131))
131 88000005 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
131 88000005 0.001ms (+0.000ms): (0) ((1))
131 88000004 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000004 0.001ms (+0.000ms): wake_up_process (__up_mutex)
131 88000003 0.001ms (+0.000ms): preempt_schedule (__up_mutex)
131 88000002 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 08000001 0.002ms (+0.000ms): preempt_schedule (__schedule)
131 08000001 0.002ms (+0.000ms): sched_clock (__schedule)
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+0.000ms): dequeue_task (deactivate_task)
14054 80000002 0.003ms (+0.000ms): __switch_to (__schedule)
14054 80000002 0.004ms (+0.000ms): (131) (<000036e6>)
14054 80000002 0.004ms (+0.000ms): (116) ((115))
14054 80000002 0.004ms (+0.000ms): finish_task_switch (__schedule)
14054 80000001 0.005ms (+1.342ms): trace_stop_sched_switched (finish_task_switch)
14054 80000001 1.347ms (+0.103ms): <000036e6> ((115))
14054 80000001 1.450ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
==================
Nov 21 19:49:21 eran kernel: (gnome-terminal/5434/CPU#0): 1683 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 1683 us, entries: 21 (21) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: gnome-terminal/5434, uid:500 nice:0 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
131 88000006 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000005 0.000ms (+0.000ms): (115) ((116))
131 88000005 0.001ms (+0.000ms): (5434) ((131))
131 88000005 0.001ms (+0.000ms): try_to_wake_up (wake_up_process)
131 88000005 0.001ms (+0.000ms): (0) ((1))
131 88000004 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
131 88000004 0.001ms (+0.000ms): wake_up_process (__up_mutex)
131 88000003 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 88000002 0.002ms (+0.000ms): preempt_schedule (__up_mutex)
131 08000001 0.002ms (+0.000ms): preempt_schedule (__schedule)
131 08000001 0.002ms (+0.000ms): sched_clock (__schedule)
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.676ms): dequeue_task (deactivate_task)
5434 80000002 1.680ms (+0.001ms): __switch_to (__schedule)
5434 80000002 1.681ms (+0.000ms): (131) ((5434))
5434 80000002 1.682ms (+0.000ms): (116) ((115))
5434 80000002 1.682ms (+0.000ms): finish_task_switch (__schedule)
5434 80000001 1.682ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
5434 80000001 1.682ms (+0.006ms): (5434) ((115))
5434 80000001 1.689ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
==================
Nov 21 19:52:18 eran kernel: (IRQ 0/2/CPU#0): 2009 us wakeup latency violates 1000 us threshold.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.29-5
-------------------------------------------------------
latency: 2009 us, entries: 37 (37) | [VP:0 KP:1 SP:1 HP:1 #CPUS:1]
-----------------
| task: IRQ 0/2, uid:0 nice:0 policy:1 rt_prio:49
-----------------
=> started at: try_to_wake_up+0x165/0x1d0 <c0117af5>
=> ended at: finish_task_switch+0x4f/0xc0 <c0117fbf>
=======>
783 88010003 0.000ms (+0.000ms): __trace_start_sched_wakeup (try_to_wake_up)
783 88010002 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
783 88010002 0.000ms (+0.000ms): (50) ((54))
783 88010002 0.000ms (+0.000ms): (2) ((783))
783 88010002 0.000ms (+0.000ms): try_to_wake_up (wake_up_process)
783 88010002 0.001ms (+0.000ms): (0) ((1))
783 88010001 0.001ms (+0.000ms): preempt_schedule (try_to_wake_up)
783 88010001 0.001ms (+0.000ms): wake_up_process (redirect_hardirq)
783 88010000 0.001ms (+0.000ms): preempt_schedule (__do_IRQ)
783 88010000 0.002ms (+0.000ms): irq_exit (do_IRQ)
783 88000001 0.002ms (+0.000ms): do_softirq (irq_exit)
783 88000001 0.002ms (+0.000ms): __do_softirq (do_softirq)
783 88000001 0.003ms (+0.000ms): wake_up_process (do_softirq)
783 88000001 0.003ms (+0.000ms): try_to_wake_up (wake_up_process)
783 88000001 0.003ms (+0.000ms): task_rq_lock (try_to_wake_up)
783 88000002 0.003ms (+0.000ms): activate_task (try_to_wake_up)
783 88000002 0.004ms (+0.000ms): sched_clock (activate_task)
783 88000002 0.004ms (+0.000ms): recalc_task_prio (activate_task)
783 88000002 0.004ms (+0.000ms): effective_prio (recalc_task_prio)
783 88000002 0.004ms (+0.000ms): enqueue_task (activate_task)
783 88000002 0.005ms (+0.000ms): (105) ((54))
783 88000002 0.005ms (+0.000ms): (3) ((783))
783 88000002 0.005ms (+0.000ms): try_to_wake_up (wake_up_process)
783 88000002 0.006ms (+1.999ms): (0) ((1))
783 88000001 2.005ms (+0.000ms): preempt_schedule (try_to_wake_up)
783 88000001 2.005ms (+0.000ms): wake_up_process (do_softirq)
783 88000000 2.006ms (+0.000ms): preempt_schedule_irq (need_resched)
783 98000000 2.006ms (+0.000ms): __schedule (preempt_schedule_irq)
783 98000000 2.006ms (+0.000ms): profile_hit (__schedule)
783 98000001 2.007ms (+0.000ms): sched_clock (__schedule)
2 80000002 2.007ms (+0.000ms): __switch_to (__schedule)
2 80000002 2.008ms (+0.000ms): (783) ((2))
2 80000002 2.008ms (+0.000ms): (54) ((50))
2 80000002 2.008ms (+0.000ms): finish_task_switch (__schedule)
2 80000001 2.008ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
2 80000001 2.009ms (+0.006ms): (2) ((50))
2 80000001 2.015ms (+0.000ms): trace_stop_sched_switched (finish_task_switch)
* Eran Mann <[email protected]> wrote:
> Ingo Molnar wrote:
> >i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> >downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> Hi,
> I'm seeing latencies of up to ~2000 microseconds. see attached traces
> file for a small sample. I think I'm missing something obvious
> config-wise but I don't know what...
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.687ms): dequeue_task (deactivate_task)
5506 80000002 1.691ms (+0.001ms): __switch_to (__schedule)
this seems to be hardware-generated. As you can see from the trace,
the codepath between __schedule()'s deactivate_task() and __switch_to()
has all interrupts and preemption disabled. The O(1) scheduler there has
constant overhead and that codepath should at most take ~1 usec.
(there is one exception, if both LATENCY_TRACING and RT_DEADLOCK_DETECT
are enabled in -V0.7.30-2 and later kernels then the overhead within
__schedule is O(nr_running), because the tracer adds entries for every
runnable task. But this is not the case for your trace because then
you'd see those entries in /proc/latency_trace.)
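A quick way to spot such a gap in a saved trace is to look for the row with the largest (+delta). The sketch below assumes the row layout shown in the excerpt above (pid, flags, timestamp, delta, function and caller) and that the delta is the time spent before the next entry. `ROW` and `largest_gap` are illustrative names, not part of the tracer.

```python
import re

# pid, 8-digit hex flags, timestamp in ms, delta in ms, then "function (caller)".
ROW = re.compile(
    r"^\s*(\d+)\s+([0-9a-f]{8})\s+([\d.]+)ms\s+\(\+([\d.]+)ms\):\s+(.*)$"
)


def largest_gap(lines):
    """Return (delta_ms, pid, entry) for the row with the largest delta, or None."""
    best = None
    for line in lines:
        match = ROW.match(line)
        if not match:
            continue  # headers, separators and syslog lines are skipped
        pid, _flags, _at_ms, delta_ms, entry = match.groups()
        row = (float(delta_ms), int(pid), entry)
        if best is None or row[0] > best[0]:
            best = row
    return best


SAMPLE = """\
131 88000002 0.003ms (+0.000ms): deactivate_task (__schedule)
131 88000002 0.003ms (+1.687ms): dequeue_task (deactivate_task)
5506 80000002 1.691ms (+0.001ms): __switch_to (__schedule)
""".splitlines()

print(largest_gap(SAMPLE))
```

On the three rows above this points at the dequeue_task entry, i.e. the same stretch between deactivate_task() and __switch_to().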
> The "load" during the traces consisted of a kernel build in a
> gnome-terminal, and 2 browser windows with a heavy site (Flash ads
> etc.) in each. This load causes a >1 ms latency every 5 minutes on
> average. After the kernel build ended the rate dropped dramatically to
> ~2 traces an hour.
this seems to imply IDE DMA related hardware overhead. Apparently what
happens is that with certain motherboards/chipsets, if IDE DMA happens
then that DMA transfer _completely locks up_ the system bus. Nothing
happens, and the CPU is stalled in essence until the end of the DMA
request.
there's nothing the kernel can do about a hardware latency like that,
but you can try to work around it. Mark H. Johnson has reported up to
500 usec latencies that had a pattern similar to yours, and he has
experimented with lesser DMA modes (udma2?) via hdparm. YMMV and be
careful with hdparm settings.
> The traces were from V-0.7.29-5 but I've seen these latencies in all
> RT kernels I tested (2.6.9-mm1-RT-V0.2 was the first). I'll try
> V0.7.30-2 next. The machine is a PIII 733 Mhz, 256MB RAM, IDE disks.
this very strongly implies some sort of hardware overhead. Btw., the
likely reason why this often shows up within __schedule() is that 1)
it's a very common operation, especially on the -RT kernel 2) we do a
TLB flush there, which can be quite memory-intense, so if the system bus
(the memory bus) is locked up, there is a high likelihood that this
function generates a cachemiss.
Ingo
On Fri, 2004-11-19 at 10:54 +0100, Ingo Molnar wrote:
> * Christian Meder <[email protected]> wrote:
>
> > This just leaves me with the mysterious traceless jvm related crash.
> > I'll do my best to get a trace ;-)
>
> it would be equally useful to somehow reproduce the lockup with public
> software only, and post the precise steps how to reproduce it.
Hi Ingo,
after two evenings of experimenting this is the current status
(everything based on 0.7.29-0, will try 0.7.30-x during the day):
* the lockup can't be triggered from the console or using a remote
session and I really tried to torture the box ;-)
* the real trigger is mouse activity in X
* the other important factor is running the jvm in profiling mode,
running without jvm or with the jvm in non-profiling mode leaves the box
stable
* I couldn't yet figure out the pattern of java program which is
triggering. Not every java program is triggering but at least I found
several publicly available ones. I wrote some small test programs doing
simple multithreading but they didn't trigger.
So the simplest setup I found til now is the following:
chris@blue:~$ java -version
java version "1.4.1"
Java(TM) 2 Runtime Environment, Standard Edition (build Blackdown-1.4.1-01)
Java HotSpot(TM) Client VM (build Blackdown-1.4.1-01, mixed mode)
chris@blue:~$ JAVA_OPTIONS=-Xrunhprof:cpu=samples,file=crap.log,depth=3 jython
Jython 2.1 on java1.4.1 (JIT: null)
Type "copyright", "credits" or "license" for more information.
>>>
Now moving the mouse around in X will make the box lockup in less than
10 seconds.
I'm not sure if JAVA_OPTIONS is a standard jython feature but at least
it's part of the jython-wrapper script of Debian.
Christian
--
Christian Meder, email: [email protected]
The Way-Seeking Mind of a tenzo is actualized
by rolling up your sleeves.
(Eihei Dogen Zenji)
On Mon, Nov 22, 2004 at 10:31:00AM +0100, Christian Meder wrote:
> Hi Ingo,
>
> after two evenings of experimenting this is the current status
> (everything based on 0.7.29-0, will try 0.7.30-x during the day):
>
> * the lockup can't be triggered from the console or using a remote
> session and I really tried to torture the box ;-)
> * the real trigger is mouse activity in X
> * the other important factor is running the jvm in profiling mode,
> running without jvm or with the jvm in non-profiling mode leaves the box
> stable
> * I couldn't yet figure out the pattern of java program which is
> triggering. Not every java program is triggering but at least I found
> several public available ones. I wrote some small test programs doing
> simple multithreading but they didn't trigger.
>
> So the simplest setup I found til now is the following:
>
> chris@blue:~$ java -version
> java version "1.4.1"
> Java(TM) 2 Runtime Environment, Standard Edition (build Blackdown-1.4.1-01)
> Java HotSpot(TM) Client VM (build Blackdown-1.4.1-01, mixed mode)
> chris@blue:~$ JAVA_OPTIONS=-Xrunhprof:cpu=samples,file=crap.log,depth=3 jython
> Jython 2.1 on java1.4.1 (JIT: null)
> Type "copyright", "credits" or "license" for more information.
> >>>
>
> Now moving the mouse around in X will make the box lockup in less than
> 10 seconds.
HotSpot is pretty heavy on VM faulting as well as signal handling, SIGUSR1,
during per thread GC safepointing operations to get the current thread
ucontext for GC traversal roots. Debugging the HotSpot VM is nearly impossible
without heavy unit testing and even that isn't going to push through that
almost pure voodoo code easily. I'm a former HotSpot tweaker for the BSDs so
I know a thing or two about that particular VM.
Try a small Java app to see if this triggers the same lock ups. Are you
pushing it using Swing ?
I'd try pushing an incremental load on it, maybe some rapid object creation
and destruction with increasing number of threads.
Also, CC the Sun/Blackdown folks about this. It could very well be some
kind of NPTL glue problem triggering.
bill
Ingo Molnar wrote:
>
> * Florian Schmidt wrote:
>
>>
>> > i have released the -V0.7.30-2 Real-Time Preemption patch, which can
>> > be downloaded from the usual place:
>>
>> > the biggest change in this release are fixes for priority-inheritance
>> > bugs uncovered by Esben Nielsen pi_test suite. These bugs could
>> > explain some of the jackd-under-load latencies reported.
>>
>> It seems these large load related xruns are gone :) At least i wasn't
>> able to trigger any during my uptime of 52 min. Will report if i ever
>> see any of those again.
>
> great. I now suspect that some of the xrun problems Rui was observing on
> -RT kernel could be (positively) affected by these fixes too.
>
Just made some test-runs with RT-V0.7.30-2, with my jackd-R + 8*fluidsynth
benchmark on my laptop (P4/UP), and the results don't seem to be eligible
to the hall of fame, at least when compared with RT-0.7.7 as the ones I
last posted here a few weeks ago.
I hate to say this, but the XRUN rate has increased since RT-0.7.7, and
the maximum scheduling delay reported by jackd has also degraded to 1000
usecs (was around 600 usecs).
Please note that this is not unique to latest RT-V0.7.30-2, but also
applies to each one of the previous iterations I've been testing, only not
reported until now. Again, all test conditions were kept the same
(hardware, jackd, alsa), only the kernel has changed (obviously).
OTOH, there's another thing: I don't seem to be able to build an initrd
image under the latest RT kernels. Something related to the loopback
device. When trying to run mkinitrd it stalls, somewhere under this
process:
mount -t ext2 /root/tmp/initrd.img /root/tmp/initrd.mnt -o loop
This happens on my laptop (P4/UP, Mandrake 10.1c) but not on my desktop
(P4/SMT, SUSE 9.2pro). Speaking of which, the lockups experienced while
unloading the ALSA modules seem to be over, at least as far as I could
try with RT-V0.7.29-4 (probably not enough yet).
Bye now.
--
rncbc aka Rui Nuno Capela
[email protected]
* Christian Meder <[email protected]> wrote:
> * the other important factor is running the jvm in profiling mode,
> running without jvm or with the jvm in non-profiling mode leaves the
> box stable
ah ... CPU profiling i suspect needs SIGPROF, and that is one of the
things that i had to disable in -RT. But it seems this disabling wasn't
fully correct - could you try the patch i attached, does it change the
symptoms?
> So the simplest setup I found til now is the following:
>
> chris@blue:~$ java -version
> java version "1.4.1"
> Java(TM) 2 Runtime Environment, Standard Edition (build Blackdown-1.4.1-01)
> Java HotSpot(TM) Client VM (build Blackdown-1.4.1-01, mixed mode)
> chris@blue:~$ JAVA_OPTIONS=-Xrunhprof:cpu=samples,file=crap.log,depth=3 jython
> Jython 2.1 on java1.4.1 (JIT: null)
> Type "copyright", "credits" or "license" for more information.
> >>>
>
> Now moving the mouse around in X will make the box lockup in less than
> 10 seconds.
>
> I'm not sure if JAVA_OPTIONS is a standard jython feature but at least
> it's part of the jython-wrapper script of Debian.
on a FC3 box this gives me:
saturn:~> JAVA_OPTIONS=-Xrunhprof:cpu=samples,file=crap.log,depth=3 jython
Exception in thread "main" java.lang.NoClassDefFoundError: error:
(i'm getting the same message when purely running 'jython')
i've got:
saturn:~> java -version
java version "1.4.2_03"
Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)
is my java setup botched perhaps?
Ingo
--- linux/kernel/sched.c.orig
+++ linux/kernel/sched.c
@@ -2354,12 +2354,12 @@ static inline void account_it_virt(struc
if (cputime_gt(it_virt, cputime_zero) &&
cputime_gt(cputime, cputime_zero)) {
-#if 0
if (cputime_ge(cputime, it_virt)) {
it_virt = cputime_add(it_virt, p->it_virt_incr);
+#if 0
send_sig(SIGVTALRM, p, 1);
- }
#endif
+ }
it_virt = cputime_sub(it_virt, cputime);
p->it_virt_value = it_virt;
}
@@ -2376,12 +2376,12 @@ static void account_it_prof(struct task_
if (cputime_gt(it_prof, cputime_zero) &&
cputime_gt(cputime, cputime_zero)) {
-#if 0
if (cputime_ge(cputime, it_prof)) {
it_prof = cputime_add(it_prof, p->it_prof_incr);
+#if 0
send_sig(SIGPROF, p, 1);
- }
#endif
+ }
it_prof = cputime_sub(it_prof, cputime);
p->it_prof_value = it_prof;
}
@@ -2395,7 +2395,6 @@ static void account_it_prof(struct task_
*/
static void check_rlimit(struct task_struct *p, cputime_t cputime)
{
-#if 0
cputime_t total, tmp;
total = cputime_add(p->utime, p->stime);
@@ -2403,14 +2402,19 @@ static void check_rlimit(struct task_str
if (unlikely(cputime_gt(total, tmp))) {
/* Send SIGXCPU every second. */
tmp = cputime_sub(total, cputime);
- if (cputime_to_secs(tmp) < cputime_to_secs(total))
+ if (cputime_to_secs(tmp) < cputime_to_secs(total)) {
+#if 0
send_sig(SIGXCPU, p, 1);
+#endif
+ }
/* and SIGKILL when we go over max.. */
tmp = jiffies_to_cputime(p->signal->rlim[RLIMIT_CPU].rlim_max);
- if (cputime_gt(total, tmp))
+ if (cputime_gt(total, tmp)) {
+#if 0
send_sig(SIGKILL, p, 1);
- }
#endif
+ }
+ }
}
/*
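A note on the check_rlimit() hunk above: the "Send SIGXCPU every second" test compares the whole-second count before and after the current slice of CPU time is added, so it fires only on the tick that crosses a second boundary. Below is a minimal Python model of that comparison. It assumes cputime is a plain tick counter and that cputime_to_secs() is an integer division; the function name and the ticks_per_sec value of 1000 are illustrative and do not come from the patch.

```python
def crossed_second_boundary(total_ticks, delta_ticks, ticks_per_sec=1000):
    """Model of: cputime_to_secs(total - cputime) < cputime_to_secs(total).

    total_ticks   -- CPU time consumed so far, including the current slice
    delta_ticks   -- the slice that was just accounted
    ticks_per_sec -- hypothetical tick rate, chosen only for illustration
    """
    before = (total_ticks - delta_ticks) // ticks_per_sec
    after = total_ticks // ticks_per_sec
    return before < after


if __name__ == "__main__":
    # The tick that takes the task from 999 to 1000 ticks crosses a boundary.
    print(crossed_second_boundary(1000, 1))
    # A tick in the middle of a second does not.
    print(crossed_second_boundary(1500, 1))
```

With the patch applied the comparison still runs, but the send_sig() call it guards is compiled out.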
* Rui Nuno Capela <[email protected]> wrote:
> > great. I now suspect that some of the xrun problems Rui was observing on
> > -RT kernel could be (positively) affected by these fixes too.
> >
>
> Just made some test-runs with RT-V0.7.30-2, with my jackd-R +
> 8*fluidsynth benchmark on my laptop (P4/UP), and the results don't
> seem to be eligible to the hall of fame, at least when compared with
> RT-0.7.7 as the ones I last posted here a few weeks ago.
>
> I hate to say this, but the XRUN rate has increased since RT-0.7.7,
> and the maximum scheduling delay reported by jackd has also degraded
> to 1000 usecs (was around 600 usecs).
well, life would be too easy if two bugs were fixed at once ;) These
were nodebug runs, right? Could you give me a description of the precise
commands of how you started jackd and fluidsynth (and their versions) -
so that i could try to reproduce & debug your setup. It is certainly a
complex scheduling scenario.
(perhaps with a link to the .sf2 and .mid files you used, if they are
public - or whether it's fine if i use the VintageDreamsWaves-v2.sf2
sound-font that comes with fluidsynth plus a random .mid file from the
net?)
> OTOH, there's another thing: I don't seem to be able to build an
> initrd image under the latest RT kernels. Something related to the
> loopback device. When trying to run mkinitrd it stalls, somewhere
> under this process:
>
> mount -t ext2 /root/tmp/initrd.img /root/tmp/initrd.mnt -o loop
Do you know when this started, roughly?
Ingo
* Bill Huey <[email protected]> wrote:
> Also, CC the Sun/Blackdown folks about this. It could very well be
> some kind of NPTL glue problem triggering.
i'd suggest to not yet bother any upstream folks at this point, this
could easily be an -RT related bug. Hard kernel lockups are almost
always -RT's fault.
Ingo
Ingo Molnar wrote:
>
> Rui Nuno Capela wrote:
>
>> > great. I now suspect that some of the xrun problems Rui was observing
>> > on -RT kernel could be (positively) affected by these fixes too.
>> >
>>
>> Just made some test-runs with RT-V0.7.30-2, with my jackd-R +
>> 8*fluidsynth benchmark on my laptop (P4/UP), and the results don't
>> seem to be eligible to the hall of fame, at least when compared with
>> RT-0.7.7 as the ones I last posted here a few weeks ago.
>>
>> I hate to say this, but the XRUN rate has increased since RT-0.7.7,
>> and the maximum scheduling delay reported by jackd has also degraded
>> to 1000 usecs (was around 600 usecs).
>
> well, life would be too easy if two bugs were fixed at once ;) These
> were nodebug runs, right? Could you give me a description of the precise
> commands of how you started jackd and fluidsynth (and their versions) -
> so that i could try to reproduce & debug your setup. It is certainly a
> complex scheduling scenario.
>
> (perhaps with a link to the .sf2 and .mid files you used, if they are
> public - or whether it's fine if i use the VintageDreamsWaves-v2.sf2
> sound-fonts that comes with fluidsynth plus a random .mid file from the
> net?)
>
These are the command-lines of my test suite:
jackd -R -dalsa -dhw:0 -P20 -r44100 -p64 -n2 -S -P &
fluidsynth -s -i -a jack -j -o jack.audio.id=fluid1 -o shell.port=9800
ct4mgm.sf2 &
fluidsynth -s -i -a jack -j -o jack.audio.id=fluid2 -o shell.port=9801
ct4mgm.sf2 &
.
.
.
fluidsynth -s -i -a jack -j -o jack.audio.id=fluid8 -o shell.port=9807
ct4mgm.sf2 &
Versions are:
jack 0.99.10cvs
fluidsynth 1.0.5
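For context on the jackd options above: -r44100 -p64 -n2 select a 44100 Hz sample rate, 64 frames per period and 2 periods per buffer. The period length this implies can be worked out as below. This is a rough sketch; the function names are made up for illustration, and comparing the result against jackd's reported scheduling delay is an assumption about how to read those figures, not something stated in this thread.

```python
def period_usecs(frames_per_period, sample_rate_hz):
    """Duration of one audio period in microseconds."""
    return frames_per_period * 1_000_000 / sample_rate_hz


def buffer_usecs(frames_per_period, sample_rate_hz, periods):
    """Duration of the whole buffer (all periods) in microseconds."""
    return periods * period_usecs(frames_per_period, sample_rate_hz)


if __name__ == "__main__":
    period = period_usecs(64, 44100)
    total = buffer_usecs(64, 44100, 2)
    print(f"period: {period:.0f} usecs, buffer: {total:.0f} usecs")
    # Share of one period taken up by a 1000 usec scheduling delay.
    print(f"1000 usec delay = {1000 / period:.0%} of a period")
```

At these settings one period lasts roughly 1.45 msec, so a 1000 usec delay leaves the clients well under half a millisecond to do their work.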
>> OTOH, there's another thing: I don't seem to be able to build an
>> initrd image under the latest RT kernels. Something related to the
>> loopback device. When trying to run mkinitrd it stalls, somewhere
>> under this process:
>>
>> mount -t ext2 /root/tmp/initrd.img /root/tmp/initrd.mnt -o loop
>
> Do you know when this started, roughly?
>
Not sure, but the first time I've noticed it was on RT-0.7.29-2 and that
was purely by chance. Problem is that I've been building the RT kernels
under a non-RT stock kernel, so I can't say how long or when it all
started exactly. I remember however that this is a revisited issue,
though.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
On Mon, 22 Nov 2004 14:24:59 +0100
Ingo Molnar <[email protected]> wrote:
> > Just made some test-runs with RT-V0.7.30-2, with my jackd-R +
> > 8*fluidsynth benchmark on my laptop (P4/UP), and the results don't
> > seem to be eligible to the hall of fame, at least when compared with
> > RT-0.7.7 as the ones I last posted here a few weeks ago.
> >
> > I hate to say this, but the XRUN rate has increased since RT-0.7.7,
> > and the maximum scheduling delay reported by jackd has also degraded
> > to 1000 usecs (was around 600 usecs).
>
> well, life would be too easy if two bugs were fixed at once ;)
Hi,
i just wanted to mention that a good share of jack clients have issues themselves, doing all kinds of funky stuff in the RT thread which they shouldn't do. Maybe the RP kernel just exposes this misuse in a more visible way. I don't know if fluidsynth is one of them. We could only find out by code inspection.
Another way to test a more complex scenario than just jackd running with an empty graph (assuming that jackd itself isn't to blame) while avoiding the risk of getting bad data due to insane clients would be to code up an example jackd client that does nothing but put some load onto the jackd graph, but in a strictly RT fashion (no blocking stuff whatsoever).
Attached you will find what is probably the most minimal jack client thinkable; it does nothing but copy data from its input to its output port. Its only parameter is the time in seconds it will run (default 60). The jack client name is determined by the PID, so it can be started multiple times (jackd requires a unique name for each client).
compile with
g++ -o jack_test jack_test.cc -ljack
This code can easily be adapted to produce more load (just do some math stuff with the data in the process callback).
It seems jackd has a limitation to 14 clients atm (don't ask me why). The 15th kills jackd ;)
Also i wanted to mention that a good share of ALSA drivers have issues, too, and aren't necessarily suited to low latency audio work. I don't know how to rule these out except for using the ALSA dummy soundcard driver (which might have its own issues, but it's probably simple enough to work reliably. it just doesn't use any hw IRQ's so it's maybe not a good measure for what we want to test) or to use a soundcard with a proven good driver.
flo
On Mon, 2004-11-22 at 14:05 +0100, Ingo Molnar wrote:
> * Christian Meder <[email protected]> wrote:
>
> > * the other important factor is running the jvm in profiling mode,
> > running without jvm or with the jvm in non-profiling mode leaves the
> > box stable
>
> ah ... CPU profiling i suspect needs SIGPROF, and that is one of the
> things that i had to disable in -RT. But it seems this disabling wasnt
> fully correct - could you try the patch i attached, does it change the
> symptoms?
I tried it on top of 0.7.29-0 and it seemed to survive a little bit
longer but it doesn't change fundamentally i.e. I've got to wiggle the
mouse for maybe 20 seconds instead of 10 before the kernel locks up.
>
> > So the simplest setup I found til now is the following:
> >
> > chris@blue:~$ java -version
> > java version "1.4.1"
> > Java(TM) 2 Runtime Environment, Standard Edition (build Blackdown-1.4.1-01)
> > Java HotSpot(TM) Client VM (build Blackdown-1.4.1-01, mixed mode)
> > chris@blue:~$ JAVA_OPTIONS=-Xrunhprof:cpu=samples,file=crap.log,depth=3 jython
> > Jython 2.1 on java1.4.1 (JIT: null)
> > Type "copyright", "credits" or "license" for more information.
> > >>>
> >
> > Now moving the mouse around in X will make the box lockup in less than
> > 10 seconds.
> >
> > I'm not sure if JAVA_OPTIONS is a standard jython feature but at least
> > it's part of the jython-wrapper script of Debian.
>
> on a FC3 box this gives me:
>
> saturn:~> JAVA_OPTIONS=-Xrunhprof:cpu=samples,file=crap.log,depth=3 jython
> Exception in thread "main" java.lang.NoClassDefFoundError: error:
>
> (i'm getting the same message when purely running 'jython')
>
> i've got:
>
> saturn:~> java -version
> java version "1.4.2_03"
> Java(TM) 2 Runtime Environment, Standard Edition (build 1.4.2_03-b02)
> Java HotSpot(TM) Client VM (build 1.4.2_03-b02, mixed mode)
>
> is my java setup botched perhaps?
I'd rather guess your jython setup is botched. I'm sending you offlist
another simple test case which just involves a simple start of the Jetty
servlet container.
Christian
--
Christian Meder, email: [email protected]
The Way-Seeking Mind of a tenzo is actualized
by rolling up your sleeves.
(Eihei Dogen Zenji)
* Christian Meder <[email protected]> wrote:
> On Mon, 2004-11-22 at 14:05 +0100, Ingo Molnar wrote:
> > * Christian Meder <[email protected]> wrote:
> >
> > > * the other important factor is running the jvm in profiling mode,
> > > running without jvm or with the jvm in non-profiling mode leaves the
> > > box stable
> >
> > ah ... CPU profiling i suspect needs SIGPROF, and that is one of the
> > things that i had to disable in -RT. But it seems this disabling wasnt
> > fully correct - could you try the patch i attached, does it change the
> > symptoms?
>
> I tried it on top of 0.7.29-0 and it seemed to survive a little bit
> longer but it doesn't change fundamentally i.e. I've got to wiggle the
> mouse for maybe 20 seconds instead of 10 before the kernel locks up.
do you have serial logging (or working netdump) from that box? If yes
then you could try the debug_direct_keyboard switch still, and when the
lockup happens, do SysRq-P a number of times, and then do a SysRq-D and
a SysRq-T and send me any log output that this might produce.
> > is my java setup botched perhaps?
>
> I'd rather guess your jython setup is botched. I'm sending you offlist
> another simple test case which just involves a simple start of the
> Jetty servlet container.
ok.
Ingo
Ingo Molnar wrote:
> * Eran Mann <[email protected]> wrote:
>>
>>Hi,
>>I'm seeing latencies of up to ~2000 microseconds. see attached traces
>>file for a small sample. I think I'm missing something obvious
>>config-wise but I don't know what...
...
> this seems to imply IDE DMA related hardware overhead. Apparently what
> happens is that with certain motherboards/chipsets, if IDE DMA happens
> then that DMA transfer _completely locks up_ the system bus. Nothing
> happens, and the CPU is stalled in essence until the end of the DMA
> request.
>
....
> Ingo
Right on.
After hdparm -d0 I see maximum latency of 35 us after a full kernel
build with a few GUI apps in the background. I'll try to find a
reasonable compromise.
--
Eran Mann
* Eran Mann <[email protected]> wrote:
> >>I'm seeing latencies of up to ~2000 microseconds. see attached traces
> >>file for a small sample. I think I'm missing something obvious
> >>config-wise but I don't know what...
> ...
>
> >this seems to imply IDE DMA related hardware overhead. Apparently what
> >happens is that with certain motherboards/chipsets, if IDE DMA happens
> >then that DMA transfer _completely locks up_ the system bus. Nothing
> >happens, and the CPU is stalled in essence until the end of the DMA
> >request.
> Right on.
> After hdparm -d0 I see maximum latency of 35 us after a full kernel
> build with a few GUI apps in the background. I'll try to find a
> reasonable compromise.
it might make sense to report this to the hw vendor as well, as these
latencies dont occur at _every_ IDE DMA, it might be some sort of
chipset (or BIOS) bug they might want to see resolved as well (if this
isnt a ship-and-forget vendor). 2 msec stalls are not nice to a fair
number of applications.
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> These are the command-lines of my test suite:
>
> jackd -R -dalsa -dhw:0 -P20 -r44100 -p64 -n2 -S -P &
> fluidsynth -s -i -a jack -j -o jack.audio.id=fluid1 -o shell.port=9800
> ct4mgm.sf2 &
> fluidsynth -s -i -a jack -j -o jack.audio.id=fluid2 -o shell.port=9801
> ct4mgm.sf2 &
is this enough to generate the xruns in jackd? Shouldnt fluidsynth be
given a MIDI file to play back? (if yes, what is the method i should use
- should i give it on the command line?)
Ingo
* Ingo Molnar <[email protected]> wrote:
> > jackd -R -dalsa -dhw:0 -P20 -r44100 -p64 -n2 -S -P &
> > fluidsynth -s -i -a jack -j -o jack.audio.id=fluid1 -o shell.port=9800
> > ct4mgm.sf2 &
> > fluidsynth -s -i -a jack -j -o jack.audio.id=fluid2 -o shell.port=9801
> > ct4mgm.sf2 &
>
> is this enough to generate the xruns in jackd? Shouldnt fluidsynth be
> given a MIDI file to play back? (if yes, what is the method i should
> use - should i give it on the command line?)
ah, i think i understand: fluidsynth has roughly the same CPU overhead
when it is 'silent' (it's generating small static noise in that case),
compared to when it's playing a MIDI file - so i should be able to see
the xruns if i just run jackd and 8 fluidsynth instances, and then load
the box - correct?
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> Yes, I know all this, and I warned before that these tests were
> strictly specific to the hardware, jackd and fluidsynth code which
> are intentionally kept the same along the several RT kernels that have
> been issued.
i'm aware of this - and i'm perfectly happy about all the testing that
is done, even if a latency problem in the end turns out to be some
'side issue', an application or configuration bug. You are doing very
useful QA no matter what the problem turns out to be in the end.
when fixing stuff i tend to go from the simpler testcases to the more
complex ones (it's naturally simpler to validate the simpler ones), but
currently all the simple ones are working fine, so i'm looking at more
complex ones again.
Having said that, (not in small part based on the care you are taking to
keep your test environment consistent) it very much looks as if the
latency problems you are reporting are related to -RT itself. It could
easily be something else, but as usual, there's only one way to find out
...
anyway, dont worry about presenting me with some problem that in the end
turns out to be something else. As Florian's jackd-xrun report from two
days ago has proven, jackd is still triggering genuine -RT bugs that
none of the simple workloads/apps are triggering. Less than one day
after half a dozen such latency bugs were fixed in the -RT patchset i
have no basis to go and blame Jackd or ALSA for any of the remaining
latency problems =B-)
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> > It seems jackd has a limitation to 14 clients atm (don't ask me why). The
> > 15th kills jackd ;)
> >
>
> So true.
is there any fix for that? Loading 14 jack_test clients only uses up
~33% of CPU time on my testbox. Or should it be possible for me to
trigger xruns using so many clients and 33% CPU utilization already?
Could you perhaps try 14 instances of jack_test on your box, and see
whether you can generate similar xruns as you could generate with
fluidsynth? jack_test should certainly exclude as much jack-client side
complexity as possible.
Ingo
Florian Schmidt wrote:
> Ingo Molnar wrote:
>
>> > Just made some test-runs with RT-V0.7.30-2, with my jackd-R +
>> > 8*fluidsynth benchmark on my laptop (P4/UP), and the results don't
>> > seem to be eligible to the hall of fame, at least when compared with
>> > RT-0.7.7 as the ones I last posted here a few weeks ago.
>> >
>> > I hate to say this, but the XRUN rate has increased since RT-0.7.7,
>> > and the maximum scheduling delay reported by jackd has also degraded
>> > to 1000 usecs (was around 600 usecs).
>>
>> well, life would be too easy if two bugs were fixed at once ;)
>
> Hi,
>
> i just wanted to mention that a good share of jack clients have issues
> themselves, doing all kinds of funky stuff in the RT thread which they
> shouldn't do. Maybe the RP kernel just exposes this misuse in a more
> visible way. I don't know if fluidsynth is one of them. We could only find
> out by code inspection.
>
Yes, I know all this, and I warned before that these tests were strictly
specific to the hardware, jackd and fluidsynth code, which are
intentionally kept the same across the several RT kernels that have been
issued.
Note that I've kept this consistency myself, and it applies /only/ to my
laptop, where the tests are being evaluated. Again, this test-suite of
mine has the sole intention to compare the jackd workload performance
across kernels, in an almost real softsynth scenario. All kernels tested
are built with no debug options, resembling production ones as far as
possible.
For example, these are the results-du-jour, which serve as a straight
comparison of RT-V0.7.30-2 with the previously posted ones from RT-V0.7.7:
                                RT-V0.7.7  RT-V0.7.30-2
                                ---------  ------------
XRUN Rate . . . . . . . . . . :      45.6         292.0 /hour
Delay Rate (>spare time)  . . :      43.2         265.3 /hour
Delay Rate (>1000 usecs)  . . :       3.6          29.3 /hour
Delay Maximum . . . . . . . . :      1249          1045 usecs
Cycle Maximum . . . . . . . . :       946          1127 usecs
Average DSP Load. . . . . . . :      55.2          60.1 %
Average CPU System Load . . . :      13.2          15.5 %
Average CPU User Load . . . . :      41.9          46.2 %
Average CPU Nice Load . . . . :       0.0           0.0 %
Average CPU I/O Wait Load . . :       0.1           0.1 %
Average CPU IRQ Load  . . . . :       0.0           0.0 %
Average CPU Soft-IRQ Load . . :       0.0           0.0 %
Average Interrupt Rate  . . . :    1675.4        1673.8 /sec
Average Context-Switch Rate . :   13940.9       14894.7 /sec
The only thing that has changed here was the kernel image, as everything
else has remained constant.
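To put the two columns side by side, the relative change per metric can be computed directly from the figures posted above. The snippet below is only a convenience sketch; the dictionary layout and function name are illustrative, and the numbers are copied from the table rather than re-measured.

```python
# (RT-V0.7.7, RT-V0.7.30-2) pairs copied from the posted results.
RESULTS = {
    "XRUN Rate (/hour)": (45.6, 292.0),
    "Delay Rate >spare time (/hour)": (43.2, 265.3),
    "Delay Rate >1000 usecs (/hour)": (3.6, 29.3),
    "Delay Maximum (usecs)": (1249, 1045),
    "Cycle Maximum (usecs)": (946, 1127),
    "Average DSP Load (%)": (55.2, 60.1),
    "Average Context-Switch Rate (/sec)": (13940.9, 14894.7),
}


def ratio(old, new):
    """How many times larger the new figure is than the old one."""
    return new / old


if __name__ == "__main__":
    for name, (old, new) in RESULTS.items():
        print(f"{name:36s} x{ratio(old, new):.2f}")
```

By this measure the XRUN rate is about 6.4 times higher under RT-V0.7.30-2, while the recorded delay maximum is in fact slightly lower (about 0.84 times the RT-V0.7.7 figure).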
> Another way to test a more complex scenario than just jackd running with
> an empty graph (assuming that jackd itself isn't to blame) while avoiding
> the risk of getting bad data due to insane clients would be to code up an
> example jackd client that does nothing but put some load onto the
> jackd graph but in a strictly RT fashion (no blocking stuff whatsoever).
>
> Attached you will find what is probably the most minimal jack client
> thinkable; it does nothing but copy data from its input to its output
> port. Its only parameter is the time in seconds it will run (default 60).
> The jack client name is determined by the PID, so it can be started
> multiple times (jackd requires a unique name for each client).
>
> compile with
>
> g++ -o jack_test jack_test.cc -ljack
>
> This code can easily be adapted to produce more load (just do some math
> stuff with the data in the process callback).
>
> It seems jackd has a limitation to 14 clients atm (don't ask me why). The
> 15th kills jackd ;)
>
So true.
> Also i wanted to mention that a good share of ALSA drivers have issues,
> too, and aren't necessarily suited to low latency audio work. I don't know
> how to rule these out except for using the ALSA dummy soundcard driver
> (which might have its own issues, but it's probably simple enough to work
> reliably. it just doesn't use any hw IRQ's so it's maybe not a good
> measure for what we want to test) or to use a soundcard with a proven good
> driver.
>
Of course, and the "reference" driver used on my tests is no exception
(snd-ali5451). But again, it's been kept the same on all tests, and the
improvement along the progression of the RT kernel development has been
outstanding nevertheless.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
* Ingo Molnar <[email protected]> wrote:
> ah, i think i understand: fluidsynth has roughly the same CPU overhead
> when it is 'silent' (it's generating small static noise in that case),
> compared to when it's playing a MIDI file - so i should be able to see
> the xruns if i just run jackd and 8 fluidsynth instances, and then
> load the box - correct?
another question: it seems the fluidsynth threads are running as non-RT
threads - they are soft-synthesizing sound that jackd will mix and play.
Now, can any delay of these fluidsynth threads (they are non-RT tasks
and can get delayed indefinitely) cause an xrun in Jackd, or should it
'only' make the sound more choppy?
Ingo
>Just did a build with -V0.7.30-2 and was about to start testing when
>the system locked up (no keyboard response, display frozen, etc.). ...
Same symptom with a slightly different set of steps leading to the
problem.
- boot / telinit 5 OK
- su'd to get privileges
- cat /proc/sys/kernel/preempt_wakeup_latency (showed 0)
- echo 1 > /proc/sys/kernel/preempt_wakeup_latency
- set RT priorities as before
- started scripts to record data
- system-config-soundcard (newline shown)
At this point, the system is locked up again with no response to any
inputs.
One thing I did notice from the previous test, I had two output files
from preempt_trace showing a couple minor (just over 50 usec each)
wakeup traces.
I have a few ideas to simplify the set up to see if I can get some
useful data out of the system.
--Mark H Johnson
<mailto:[email protected]>
> Ingo Molnar
>
> * Rui Nuno Capela <[email protected]> wrote:
>
>> > It seems jackd has a limitation to 14 clients atm (don't ask me why). The
>> > 15th kills jackd ;)
>> >
>>
>> So true.
>
> is there any fix for that? Loading 14 jack_test clients only uses up
> ~33% of CPU time on my testbox. Or should it be possible for me to
> trigger xruns using so many clients and 33% CPU utilization already?
>
> Could you perhaps try 14 instances of jack_test on your box, and see
> whether you can generate similar xruns as you could generate with
> fluidsynth? jack_test should certainly exclude as much jack-client side
> complexity as possible.
>
> Ingo
>
OK. I tried 14 instances of jack_test. I even modded Florian's original
source code, to let each client instance have 4 ins and 4 outs, and to
make things a little bit heavier, all 4 inputs are mixed into each of the 4
outputs.
Saw at least a couple of XRUNs in a 20 (4*5) minute test-run. CPU load
doesn't get above 30% on my laptop (P4/UP 2.533 GHz).
Attached you may find my "4-multiplex" version of jack_test(.cpp), along
with the jack_test3.sh shell script which has been used for my test runs.
There's also a modified version of nmeter(.c) that served the purpose to
log system performance counters (CPU usage, IRQs and Context Switch
rate).
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
>i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
After simplifying the test, I have some data on wakeup times that don't
look too good. The set up was...
- booted / telinit 5
- did NOT change any IRQ nor system task priorities
- ran data collection script
- ran a simple script that exercised the disk (write, copy, read)
Nothing was at RT priority except system tasks & data collection script
[script was RT fifo priority 1].
Symptoms seen include:
[1] still see occasional truncated latency_trace output (see below)
[2] variety of long > 1 msec wakeup latencies (see below)
[3] primary long latencies with ksoftirqd/[01] and IRQ 10 tasks
I have over 30 traces in about 5-10 minutes of testing. Let me know if
you want all of them.
--Mark
Truncated example:
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.30-2
-------------------------------------------------------
latency: 2097 us, entries: 1 (1) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: ksoftirqd/0/4, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x379/0x3e0 <c0118d99>
=> ended at: finish_task_switch+0x4f/0xc0 <c01192af>
=======>
4 88000001 0.000ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
Long trace example [> 2 msec]
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.30-2
-------------------------------------------------------
latency: 1537 us, entries: 36 (36) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: ksoftirqd/0/4, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x379/0x3e0 <c0118d99>
=> ended at: finish_task_switch+0x4f/0xc0 <c01192af>
=======>
0 88000004 0.000ms (+0.000ms): __trace_start_sched_wakeup
(try_to_wake_up)
0 88000004 0.000ms (+0.000ms): _raw_spin_unlock (try_to_wake_up)
0 88000003 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
0 88000003 0.000ms (+0.000ms): (4) ((0))
0 88000003 0.001ms (+0.000ms): (105) ((140))
0 88000003 0.001ms (+0.000ms): try_to_wake_up (wake_up_process)
0 88000003 0.001ms (+0.000ms): (0) ((1))
0 88000003 0.001ms (+0.000ms): _raw_spin_unlock (try_to_wake_up)
0 88000002 0.002ms (+0.000ms): preempt_schedule (try_to_wake_up)
0 88000002 0.002ms (+0.275ms): wake_up_process (do_softirq)
0 88010001 0.277ms (+0.207ms): do_nmi (default_idle)
0 88010001 0.484ms (+0.001ms): do_nmi (__mcount)
0 88010001 0.486ms (+0.077ms): do_nmi (<00200046>)
0 88010001 0.563ms (+0.275ms): preempt_schedule (nmi_watchdog_tick)
0 08000000 0.838ms (+0.000ms): preempt_schedule (cpu_idle)
0 98000000 0.839ms (+0.000ms): __sched_text_start (preempt_schedule)
0 98000001 0.839ms (+0.000ms): sched_clock (__sched_text_start)
0 98000001 0.839ms (+0.000ms): _raw_spin_lock_irq (__sched_text_start)
0 98000001 0.840ms (+0.000ms): _raw_spin_lock_irqsave
(__sched_text_start)
0 88000002 0.840ms (+0.000ms): dequeue_task (__sched_text_start)
0 88000002 0.840ms (+0.000ms): recalc_task_prio (__sched_text_start)
0 88000002 0.841ms (+0.000ms): effective_prio (recalc_task_prio)
0 88000002 0.841ms (+0.000ms): enqueue_task (__sched_text_start)
0 80000002 0.841ms (+0.001ms): trace_array (__sched_text_start)
0 80000002 0.843ms (+0.000ms): (4) ((105))
0 80000002 0.843ms (+0.000ms): (0) ((110))
0 80000002 0.844ms (+0.002ms): trace_array (__sched_text_start)
4 80000002 0.847ms (+0.000ms): __switch_to (__sched_text_start)
4 80000002 0.847ms (+0.000ms): (0) ((4))
4 80000002 0.847ms (+0.000ms): (140) ((105))
4 80000002 0.847ms (+0.000ms): finish_task_switch (__sched_text_start)
4 80000002 0.848ms (+0.000ms): _raw_spin_unlock (finish_task_switch)
4 80000001 0.848ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
4 80000001 0.848ms (+0.000ms): (4) ((105))
4 80000001 0.848ms (+1.318ms): _raw_spin_lock_irqsave
(trace_stop_sched_switched)
4 80000001 2.167ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
Another long one...
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.30-2
-------------------------------------------------------
latency: 1956 us, entries: 36 (36) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: ksoftirqd/0/4, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x379/0x3e0 <c0118d99>
=> ended at: finish_task_switch+0x4f/0xc0 <c01192af>
=======>
28 88000003 0.000ms (+0.000ms): __trace_start_sched_wakeup
(try_to_wake_up)
28 88000003 0.000ms (+0.000ms): _raw_spin_unlock (try_to_wake_up)
28 88000002 0.000ms (+0.001ms): preempt_schedule (try_to_wake_up)
28 88000002 0.001ms (+0.000ms): (4) ((28))
28 88000002 0.002ms (+0.000ms): (105) ((110))
28 88000002 0.002ms (+0.000ms): try_to_wake_up (wake_up_process)
28 88000002 0.002ms (+0.000ms): (0) ((1))
28 88000002 0.002ms (+0.000ms): _raw_spin_unlock (try_to_wake_up)
28 88000001 0.003ms (+0.000ms): preempt_schedule (try_to_wake_up)
28 88000001 0.003ms (+0.000ms): wake_up_process (do_softirq)
28 88000000 0.003ms (+0.000ms): preempt_schedule_irq (need_resched)
28 98000000 0.004ms (+0.000ms): __sched_text_start
(preempt_schedule_irq)
28 98000001 0.004ms (+0.000ms): sched_clock (__sched_text_start)
28 98000001 0.004ms (+0.000ms): _raw_spin_lock_irq (__sched_text_start)
28 98000001 0.005ms (+0.000ms): _raw_spin_lock_irqsave
(__sched_text_start)
28 88000002 0.005ms (+0.000ms): dequeue_task (__sched_text_start)
28 88000002 0.006ms (+0.000ms): recalc_task_prio (__sched_text_start)
28 88000002 0.006ms (+0.000ms): effective_prio (recalc_task_prio)
28 88000002 0.006ms (+0.000ms): enqueue_task (__sched_text_start)
28 80000002 0.006ms (+0.001ms): trace_array (__sched_text_start)
28 80000002 0.008ms (+0.000ms): (4) ((105))
28 80000002 0.009ms (+0.000ms): (0) ((110))
28 80000002 0.009ms (+0.000ms): (28) ((110))
28 80000002 0.009ms (+0.000ms): (0) ((115))
28 80000002 0.009ms (+0.000ms): (4344) ((118))
28 80000002 0.010ms (+0.000ms): (0) ((120))
28 80000002 0.010ms (+0.052ms): trace_array (__sched_text_start)
4 80000002 0.063ms (+1.256ms): __switch_to (__sched_text_start)
4 80000002 1.319ms (+0.000ms): (28) ((4))
4 80000002 1.319ms (+0.000ms): (110) ((105))
4 80000002 1.319ms (+0.000ms): finish_task_switch (__sched_text_start)
4 80000002 1.319ms (+0.139ms): _raw_spin_unlock (finish_task_switch)
4 80000001 1.459ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
4 80000001 1.460ms (+0.000ms): (4) ((105))
4 80000001 1.460ms (+0.420ms): _raw_spin_lock_irqsave
(trace_stop_sched_switched)
4 88000001 1.880ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
A third long one, note inconsistent header / total time.
preemption latency trace v1.0.7 on 2.6.10-rc2-mm2-V0.7.30-2
-------------------------------------------------------
latency: 1000 us, entries: 32 (32) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
-----------------
| task: ksoftirqd/0/4, uid:0 nice:-10 policy:0 rt_prio:0
-----------------
=> started at: try_to_wake_up+0x379/0x3e0 <c0118d99>
=> ended at: finish_task_switch+0x4f/0xc0 <c01192af>
=======>
0 88000004 0.000ms (+0.000ms): __trace_start_sched_wakeup
(try_to_wake_up)
0 88000004 0.000ms (+0.000ms): _raw_spin_unlock (try_to_wake_up)
0 88000003 0.000ms (+0.000ms): preempt_schedule (try_to_wake_up)
0 88000003 0.000ms (+0.000ms): (4) ((0))
0 88000003 0.001ms (+0.000ms): (105) ((140))
0 88000003 0.001ms (+0.000ms): try_to_wake_up (wake_up_process)
0 88000003 0.001ms (+0.000ms): (0) ((1))
0 88000003 0.001ms (+0.000ms): _raw_spin_unlock (try_to_wake_up)
0 88000002 0.002ms (+0.000ms): preempt_schedule (try_to_wake_up)
0 88000002 0.002ms (+0.000ms): wake_up_process (do_softirq)
0 08000000 0.003ms (+0.000ms): preempt_schedule (cpu_idle)
0 98000000 0.003ms (+0.000ms): __sched_text_start (preempt_schedule)
0 98000001 0.004ms (+0.000ms): sched_clock (__sched_text_start)
0 98000001 0.004ms (+0.000ms): _raw_spin_lock_irq (__sched_text_start)
0 98000001 0.004ms (+0.000ms): _raw_spin_lock_irqsave
(__sched_text_start)
0 88000002 0.005ms (+0.000ms): dequeue_task (__sched_text_start)
0 88000002 0.005ms (+0.000ms): recalc_task_prio (__sched_text_start)
0 88000002 0.005ms (+0.000ms): effective_prio (recalc_task_prio)
0 88000002 0.006ms (+0.000ms): enqueue_task (__sched_text_start)
0 80000002 0.006ms (+0.001ms): trace_array (__sched_text_start)
0 80000002 0.008ms (+0.000ms): (4) ((105))
0 80000002 0.008ms (+0.000ms): (0) ((110))
0 80000002 0.009ms (+0.002ms): trace_array (__sched_text_start)
4 80000002 0.011ms (+0.000ms): __switch_to (__sched_text_start)
4 80000002 0.012ms (+0.000ms): (0) ((4))
4 80000002 0.012ms (+0.000ms): (140) ((105))
4 80000002 0.012ms (+0.000ms): finish_task_switch (__sched_text_start)
4 80000002 0.012ms (+0.000ms): _raw_spin_unlock (finish_task_switch)
4 80000001 0.013ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
4 80000001 0.013ms (+0.000ms): (4) ((105))
4 80000001 0.013ms (+1.307ms): _raw_spin_lock_irqsave
(trace_stop_sched_switched)
4 80000001 1.321ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
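The header/total mismatch noted for the third trace can be checked mechanically. The sketch below assumes the trace layout shown in these dumps (a "latency: N us" header, then entries of the form "<time>ms (+<delta>ms): function"); the regular expressions and the function name are illustrative guesses at that layout, not an official parser for latency_trace output. SAMPLE is a few lines excerpted from the third trace.

```python
import re

HEADER_RE = re.compile(r"latency:\s*(\d+)\s*us")
ENTRY_RE = re.compile(r"(\d+\.\d+)ms \(\+(\d+\.\d+)ms\):\s*(\S+)")


def summarize_trace(text):
    """Return header latency, final timestamp and the largest single gap."""
    header = HEADER_RE.search(text)
    entries = [(float(t), float(d), fn) for t, d, fn in ENTRY_RE.findall(text)]
    if header is None or not entries:
        raise ValueError("no latency header or trace entries found")
    worst = max(entries, key=lambda e: e[1])  # entry followed by the longest gap
    return {
        "header_us": int(header.group(1)),
        "last_us": round(entries[-1][0] * 1000),
        "worst_delta_us": round(worst[1] * 1000),
        "worst_function": worst[2],
    }


SAMPLE = """\
latency: 1000 us, entries: 32 (32) | [VP:0 KP:1 SP:1 HP:1 #CPUS:2]
4 80000001 0.013ms (+0.000ms): (4) ((105))
4 80000001 0.013ms (+1.307ms): _raw_spin_lock_irqsave
(trace_stop_sched_switched)
4 80000001 1.321ms (+0.000ms): trace_stop_sched_switched
(finish_task_switch)
"""

if __name__ == "__main__":
    print(summarize_trace(SAMPLE))
```

For the excerpt, the header claims 1000 us while the last entry is stamped at 1321 us, and nearly all of that time sits in the gap after _raw_spin_lock_irqsave, i.e. inside trace_stop_sched_switched itself.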
> [2] variety of long > 1 msec wakeup latencies (see below)
> [3] primary long latencies with ksoftirqd/[01] and IRQ 10 tasks
Never mind on the long latencies.
I forgot to set udma2 on the disk drive. Setting that changed
the wakeup latencies back to the sub 100 usec range.
Will be trying the (non wakeup) traces next.
--Mark H Johnson
<mailto:[email protected]>
>I have a few ideas to simplify the set up to see if I can get some
>useful data out of the system.
It appears the lockup only occurs after I do
echo 0 > /proc/sys/kernel/preempt_wakeup_timing
I did manage to get a few messages out of the system before the lockup
this time and here is what was on the serial console.
[I didn't do "dmesg -n 1" like I usually do...]
--Mark
--------
[starts with a couple of the wakeup latency messages, then the
messages from changing the tracing type]
(X/2869/CPU#1): new 76 us maximum-latency wakeup.
(ksoftirqd/1/7/CPU#1): new 110 us maximum-latency wakeup.
(X/2869/CPU#0): new 825 us maximum-latency critical section.
=> started at timestamp 4156290879: <try_to_wake_up+0x379/0x3e0>
=> ended at timestamp 4156291705: <__up_mutex+0x469/0x4d0>
[<c0104e83>] dump_stack+0x23/0x30 (20)
[<c013d107>] check_critical_timing+0x1d7/0x390 (88)
[<c013d58d>] trace_irqs_on+0x7d/0x90 (24)
[<c013ad19>] __up_mutex+0x469/0x4d0 (60)
[<c013b4c6>] up_mutex+0xb6/0x110 (40)
[<c016e234>] fget+0x54/0x70 (28)
[<c0182c37>] do_select+0x207/0x2b0 (120)
[<c0182fee>] sys_select+0x2be/0x580 (92)
[<c0103f8d>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c013e32d>] .... print_traces+0x1d/0x60
.....[<c0104e83>] .. ( <= dump_stack+0x23/0x30)
=> dump-end timestamp 4156528745
(IRQ 0/2/CPU#0): new 51 us maximum-latency critical section.
=> started at timestamp 4208681904: <__up_mutex+0x9c/0x4d0>
=> ended at timestamp 4208681955: <schedule+0x4c/0x140>
[<c0104e83>] dump_stack+0x23/0x30 (20)
[<c013d107>] check_critical_timing+0x1d7/0x390 (88)
[<c013d58d>] trace_irqs_on+0x7d/0x90 (24)
[<c032ad7c>] schedule+0x4c/0x140 (36)
[<c032c386>] __down_mutex+0x2a6/0x310 (84)
[<c013bccb>] __spin_lock+0x4b/0x60 (24)
[<c013bcfd>] _spin_lock+0x1d/0x30 (16)
[<c0109260>] timer_interrupt+0x20/0x110 (32)
[<c0147b63>] handle_IRQ_event+0x53/0xa0 (40)
[<c01483d5>] do_hardirq+0xa5/0x100 (40)
[<c0148571>] do_irqd+0x141/0x210 (48)
[<c0138b6b>] kthread+0xbb/0xc0 (48)
[<c0102019>] kernel_thread_helper+0x5/0xc (536952852)
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c013e32d>] .... print_traces+0x1d/0x60
.....[<c0104e83>] .. ( <= dump_stack+0x23/0x30)
=> dump-end timestamp 4208969387
(get_ltrace.sh/6305/CPU#1): new 64 us maximum-latency critical section.
=> started at timestamp 4209005150: <do_exit+0x32c/0x5e0>
=> ended at timestamp 4209005214: <try_to_wake_up+0x23d/0x3e0>
[<c0104e83>] dump_stack+0x23/0x30 (20)
[<c013d107>] check_critical_timing+0x1d7/0x390 (88)
[<c013d320>] touch_critical_timing+0x60/0x90 (24)
[<c0118c5d>] try_to_wake_up+0x23d/0x3e0 (72)
[<c0118e2b>] wake_up_process+0x2b/0x40 (28)
[<c01201c7>] __mmdrop_delayed+0x67/0xa0 (20)
[<c01192f7>] finish_task_switch+0x97/0xc0 (24)
[<c032a857>] __sched_text_start+0x457/0x930 (112)
[<c032ad70>] schedule+0x40/0x140 (36)
[<c01257c6>] do_wait+0x1d6/0x540 (140)
[<c0125c01>] sys_wait4+0x41/0x50 (28)
[<c0125c3a>] sys_waitpid+0x2a/0x30 (24)
[<c0103f8d>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000004 ]
| 4-level deep critical section nesting:
----------------------------------------
.. [<c032a44f>] .... __sched_text_start+0x4f/0x930
.....[<c032ad70>] .. ( <= schedule+0x40/0x140)
.. [<c012017f>] .... __mmdrop_delayed+0x1f/0xa0
.....[<c01192f7>] .. ( <= finish_task_switch+0x97/0xc0)
.. [<c032ca3f>] .... _raw_spin_lock+0x1f/0x70
.....[<c0117f62>] .. ( <= task_rq_lock+0x42/0x80)
.. [<c013e32d>] .... print_traces+0x1d/0x60
.....[<c0104e83>] .. ( <= dump_stack+0x23/0x30)
=> dump-end timestamp 4209394021
(get_ltrace.sh/6305/CPU#1): new 77 us maximum-latency critical section.
=> started at timestamp 4209417650: <__sched_text_start+0x4f/0x930>
=> ended at timestamp 4209417728: <preempt_schedule+0x6e/0x80>
[<c0104e83>] dump_stack+0x23/0x30 (20)
[<c013d107>] check_critical_timing+0x1d7/0x390 (88)
[<c013d58d>] trace_irqs_on+0x7d/0x90 (24)
[<c032aede>] preempt_schedule+0x6e/0x80 (20)
[<c01190fb>] wake_up_new_task+0x16b/0x240 (44)
[<c011fce9>] do_fork+0x129/0x1d0 (132)
[<c01029be>] sys_clone+0x3e/0x50 (32)
[<c0103f8d>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000001 ]
| 1-level deep critical section nesting:
----------------------------------------
.. [<c013e32d>] .... print_traces+0x1d/0x60
.....[<c0104e83>] .. ( <= dump_stack+0x23/0x30)
=> dump-end timestamp 4209652960
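A quick sanity check on the reports above: the "started at" and "ended at" timestamps look like microsecond counts, because their difference lands within 1 us of the reported latency in each critical-section report (826 vs. 825 us for the first one, 78 vs. 77 us for the last). That unit is inferred from the numbers in this log, not taken from the patch source. Below is a small Python sketch that pulls the reports out of a console capture and prints both figures side by side. The function name parse_reports and the dictionary keys are hypothetical, chosen only for this example.

```python
import re

# "(task/pid/CPU#n): new N us maximum-latency <kind>."
REPORT = re.compile(
    r"^\((.+)/CPU#(\d+)\): new (\d+) us maximum-latency (.+)\.$"
)
# "=> started at timestamp T: <...>" / "=> ended at timestamp T: <...>"
STAMP = re.compile(r"^=>\s+(started|ended)\s+at timestamp (\d+):")


def parse_reports(log):
    """Collect latency reports and any start/end timestamps that follow."""
    reports = []
    for raw in log.splitlines():
        line = raw.strip()
        m = REPORT.match(line)
        if m:
            reports.append({
                "task": m.group(1),
                "cpu": int(m.group(2)),
                "reported_us": int(m.group(3)),
                "kind": m.group(4),
            })
            continue
        m = STAMP.match(line)
        if m and reports:
            reports[-1][m.group(1)] = int(m.group(2))
    for r in reports:
        # Wakeup reports carry no timestamps; leave their span unset.
        if "started" in r and "ended" in r:
            r["span_us"] = r["ended"] - r["started"]
        else:
            r["span_us"] = None
    return reports


# Lines copied from the console log above.
SAMPLE = """\
(ksoftirqd/1/7/CPU#1): new 110 us maximum-latency wakeup.
(X/2869/CPU#0): new 825 us maximum-latency critical section.
=> started at timestamp 4156290879: <try_to_wake_up+0x379/0x3e0>
=> ended at timestamp 4156291705: <__up_mutex+0x469/0x4d0>
(get_ltrace.sh/6305/CPU#1): new 77 us maximum-latency critical section.
=> started at timestamp 4209417650: <__sched_text_start+0x4f/0x930>
=> ended at timestamp 4209417728: <preempt_schedule+0x6e/0x80>
"""

if __name__ == "__main__":
    for r in parse_reports(SAMPLE):
        print(r["task"], r["kind"], r["reported_us"], r["span_us"])
```

The same arithmetic applied to the "dump-end timestamp" lines suggests each dump took a few hundred milliseconds to print, which fits output going to a serial console.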
* [email protected] <[email protected]> wrote:
> - echo 0 > /proc/sys/kernel/preempt_wakeup_timing [entered, but
> display was frozen at this point and did not see newline nor any
> further output]
managed to reproduce this on an SMP box but not on a UP box, so i think
this is SMP related. It definitely happens almost immediately after
preempt_wakeup_timing is reset - or after preempt_max_timing is reset.
(Perhaps a dump_stack() from the wrong place, or something like that.)
Ingo
#
# Automatically generated make config: don't edit
# Linux kernel version: 2.6.10-rc2-mm2-V0.7.30-2
# Mon Nov 22 09:24:34 2004
#
CONFIG_X86=y
CONFIG_MMU=y
CONFIG_UID16=y
CONFIG_GENERIC_ISA_DMA=y
CONFIG_GENERIC_IOMAP=y
#
# Code maturity level options
#
CONFIG_EXPERIMENTAL=y
CONFIG_CLEAN_COMPILE=y
CONFIG_LOCK_KERNEL=y
#
# General setup
#
CONFIG_LOCALVERSION=""
CONFIG_SWAP=y
CONFIG_SYSVIPC=y
CONFIG_POSIX_MQUEUE=y
CONFIG_BSD_PROCESS_ACCT=y
# CONFIG_BSD_PROCESS_ACCT_V3 is not set
CONFIG_SYSCTL=y
# CONFIG_AUDIT is not set
CONFIG_LOG_BUF_SHIFT=14
CONFIG_HOTPLUG=y
CONFIG_KOBJECT_UEVENT=y
CONFIG_IKCONFIG=y
CONFIG_IKCONFIG_PROC=y
# CONFIG_EMBEDDED is not set
CONFIG_KALLSYMS=y
# CONFIG_KALLSYMS_ALL is not set
# CONFIG_KALLSYMS_EXTRA_PASS is not set
CONFIG_FUTEX=y
CONFIG_EPOLL=y
# CONFIG_CPUSETS is not set
# CONFIG_CC_OPTIMIZE_FOR_SIZE is not set
CONFIG_SHMEM=y
CONFIG_CC_ALIGN_FUNCTIONS=0
CONFIG_CC_ALIGN_LABELS=0
CONFIG_CC_ALIGN_LOOPS=0
CONFIG_CC_ALIGN_JUMPS=0
# CONFIG_TINY_SHMEM is not set
#
# Loadable module support
#
CONFIG_MODULES=y
# CONFIG_MODULE_UNLOAD is not set
CONFIG_OBSOLETE_MODPARM=y
CONFIG_MODVERSIONS=y
# CONFIG_MODULE_SRCVERSION_ALL is not set
CONFIG_KMOD=y
#
# Processor type and features
#
CONFIG_X86_PC=y
# CONFIG_X86_ELAN is not set
# CONFIG_X86_VOYAGER is not set
# CONFIG_X86_NUMAQ is not set
# CONFIG_X86_SUMMIT is not set
# CONFIG_X86_BIGSMP is not set
# CONFIG_X86_VISWS is not set
# CONFIG_X86_GENERICARCH is not set
# CONFIG_X86_ES7000 is not set
# CONFIG_M386 is not set
# CONFIG_M486 is not set
# CONFIG_M586 is not set
# CONFIG_M586TSC is not set
# CONFIG_M586MMX is not set
# CONFIG_M686 is not set
# CONFIG_MPENTIUMII is not set
# CONFIG_MPENTIUMIII is not set
# CONFIG_MPENTIUMM is not set
CONFIG_MPENTIUM4=y
# CONFIG_MK6 is not set
# CONFIG_MK7 is not set
# CONFIG_MK8 is not set
# CONFIG_MCRUSOE is not set
# CONFIG_MEFFICEON is not set
# CONFIG_MWINCHIPC6 is not set
# CONFIG_MWINCHIP2 is not set
# CONFIG_MWINCHIP3D is not set
# CONFIG_MCYRIXIII is not set
# CONFIG_MVIAC3_2 is not set
CONFIG_X86_GENERIC=y
CONFIG_X86_CMPXCHG=y
CONFIG_X86_XADD=y
CONFIG_X86_L1_CACHE_SHIFT=7
CONFIG_GENERIC_CALIBRATE_DELAY=y
CONFIG_X86_WP_WORKS_OK=y
CONFIG_X86_INVLPG=y
CONFIG_X86_BSWAP=y
CONFIG_X86_POPAD_OK=y
CONFIG_X86_GOOD_APIC=y
CONFIG_X86_INTEL_USERCOPY=y
CONFIG_X86_USE_PPRO_CHECKSUM=y
# CONFIG_HPET_TIMER is not set
CONFIG_SMP=y
CONFIG_NR_CPUS=8
CONFIG_SCHED_SMT=y
# CONFIG_PREEMPT_NONE is not set
# CONFIG_PREEMPT_VOLUNTARY is not set
# CONFIG_PREEMPT_DESKTOP is not set
CONFIG_PREEMPT_RT=y
CONFIG_PREEMPT=y
CONFIG_PREEMPT_SOFTIRQS=y
CONFIG_PREEMPT_HARDIRQS=y
CONFIG_PREEMPT_BKL=y
CONFIG_X86_LOCAL_APIC=y
CONFIG_X86_IO_APIC=y
CONFIG_X86_TSC=y
CONFIG_X86_MCE=y
# CONFIG_X86_MCE_NONFATAL is not set
# CONFIG_X86_MCE_P4THERMAL is not set
# CONFIG_TOSHIBA is not set
# CONFIG_I8K is not set
CONFIG_MICROCODE=m
CONFIG_X86_MSR=m
CONFIG_X86_CPUID=m
#
# Firmware Drivers
#
CONFIG_EDD=m
CONFIG_NOHIGHMEM=y
# CONFIG_HIGHMEM4G is not set
# CONFIG_HIGHMEM64G is not set
# CONFIG_MATH_EMULATION is not set
CONFIG_MTRR=y
CONFIG_IRQBALANCE=y
CONFIG_HAVE_DEC_LOCK=y
#
# Performance-monitoring counters support
#
# CONFIG_PERFCTR is not set
CONFIG_KERN_PHYS_OFFSET=1
# CONFIG_KEXEC is not set
#
# Power management options (ACPI, APM)
#
# CONFIG_PM is not set
#
# ACPI (Advanced Configuration and Power Interface) Support
#
# CONFIG_ACPI is not set
CONFIG_ACPI_BOOT=y
CONFIG_ACPI_BLACKLIST_YEAR=0
#
# CPU Frequency scaling
#
# CONFIG_CPU_FREQ is not set
#
# Bus options (PCI, PCMCIA, EISA, MCA, ISA)
#
CONFIG_PCI=y
# CONFIG_PCI_GOBIOS is not set
# CONFIG_PCI_GOMMCONFIG is not set
# CONFIG_PCI_GODIRECT is not set
CONFIG_PCI_GOANY=y
CONFIG_PCI_BIOS=y
CONFIG_PCI_DIRECT=y
# CONFIG_PCI_MSI is not set
# CONFIG_PCI_LEGACY_PROC is not set
CONFIG_PCI_NAMES=y
CONFIG_ISA=y
# CONFIG_EISA is not set
# CONFIG_MCA is not set
# CONFIG_SCx200 is not set
# CONFIG_HOTPLUG_CPU is not set
#
# PCCARD (PCMCIA/CardBus) support
#
# CONFIG_PCCARD is not set
#
# PC-card bridges
#
CONFIG_PCMCIA_PROBE=y
#
# PCI Hotplug Support
#
# CONFIG_HOTPLUG_PCI is not set
#
# Executable file formats
#
CONFIG_BINFMT_ELF=y
CONFIG_BINFMT_AOUT=m
CONFIG_BINFMT_MISC=m
#
# Device Drivers
#
#
# Generic Driver Options
#
CONFIG_STANDALONE=y
CONFIG_PREVENT_FIRMWARE_BUILD=y
CONFIG_FW_LOADER=m
# CONFIG_DEBUG_DRIVER is not set
#
# Memory Technology Devices (MTD)
#
# CONFIG_MTD is not set
#
# Parallel port support
#
# CONFIG_PARPORT is not set
#
# Plug and Play support
#
CONFIG_PNP=y
# CONFIG_PNP_DEBUG is not set
#
# Protocols
#
CONFIG_ISAPNP=y
# CONFIG_PNPBIOS is not set
#
# Block devices
#
CONFIG_BLK_DEV_FD=m
# CONFIG_BLK_DEV_XD is not set
CONFIG_BLK_CPQ_DA=m
CONFIG_BLK_CPQ_CISS_DA=m
CONFIG_CISS_SCSI_TAPE=y
# CONFIG_BLK_DEV_DAC960 is not set
# CONFIG_BLK_DEV_UMEM is not set
CONFIG_BLK_DEV_LOOP=m
# CONFIG_BLK_DEV_CRYPTOLOOP is not set
CONFIG_BLK_DEV_NBD=m
# CONFIG_BLK_DEV_SX8 is not set
# CONFIG_BLK_DEV_UB is not set
CONFIG_BLK_DEV_RAM=y
CONFIG_BLK_DEV_RAM_SIZE=8192
CONFIG_BLK_DEV_INITRD=y
CONFIG_INITRAMFS_SOURCE=""
# CONFIG_LBD is not set
# CONFIG_CDROM_PKTCDVD is not set
#
# IO Schedulers
#
CONFIG_IOSCHED_NOOP=y
CONFIG_IOSCHED_AS=y
CONFIG_IOSCHED_DEADLINE=y
CONFIG_IOSCHED_CFQ=y
#
# ATA/ATAPI/MFM/RLL support
#
CONFIG_IDE=y
CONFIG_BLK_DEV_IDE=y
#
# Please see Documentation/ide.txt for help/info on IDE drives
#
# CONFIG_BLK_DEV_IDE_SATA is not set
# CONFIG_BLK_DEV_HD_IDE is not set
CONFIG_BLK_DEV_IDEDISK=y
CONFIG_IDEDISK_MULTI_MODE=y
CONFIG_BLK_DEV_IDECD=m
# CONFIG_BLK_DEV_IDETAPE is not set
# CONFIG_BLK_DEV_IDEFLOPPY is not set
# CONFIG_BLK_DEV_IDESCSI is not set
# CONFIG_IDE_TASK_IOCTL is not set
#
# IDE chipset support/bugfixes
#
CONFIG_IDE_GENERIC=y
CONFIG_BLK_DEV_CMD640=y
# CONFIG_BLK_DEV_CMD640_ENHANCED is not set
# CONFIG_BLK_DEV_IDEPNP is not set
CONFIG_BLK_DEV_IDEPCI=y
CONFIG_IDEPCI_SHARE_IRQ=y
# CONFIG_BLK_DEV_OFFBOARD is not set
CONFIG_BLK_DEV_GENERIC=y
# CONFIG_BLK_DEV_OPTI621 is not set
CONFIG_BLK_DEV_RZ1000=y
CONFIG_BLK_DEV_IDEDMA_PCI=y
# CONFIG_BLK_DEV_IDEDMA_FORCED is not set
CONFIG_IDEDMA_PCI_AUTO=y
# CONFIG_IDEDMA_ONLYDISK is not set
CONFIG_BLK_DEV_AEC62XX=y
CONFIG_BLK_DEV_ALI15X3=y
# CONFIG_WDC_ALI15X3 is not set
CONFIG_BLK_DEV_AMD74XX=y
# CONFIG_BLK_DEV_ATIIXP is not set
CONFIG_BLK_DEV_CMD64X=y
CONFIG_BLK_DEV_TRIFLEX=y
CONFIG_BLK_DEV_CY82C693=y
# CONFIG_BLK_DEV_CS5520 is not set
CONFIG_BLK_DEV_CS5530=y
CONFIG_BLK_DEV_HPT34X=y
# CONFIG_HPT34X_AUTODMA is not set
CONFIG_BLK_DEV_HPT366=y
# CONFIG_BLK_DEV_SC1200 is not set
CONFIG_BLK_DEV_PIIX=y
# CONFIG_BLK_DEV_NS87415 is not set
CONFIG_BLK_DEV_PDC202XX_OLD=y
# CONFIG_PDC202XX_BURST is not set
CONFIG_BLK_DEV_PDC202XX_NEW=y
CONFIG_BLK_DEV_SVWKS=y
CONFIG_BLK_DEV_SIIMAGE=y
CONFIG_BLK_DEV_SIS5513=y
CONFIG_BLK_DEV_SLC90E66=y
# CONFIG_BLK_DEV_TRM290 is not set
CONFIG_BLK_DEV_VIA82CXXX=y
# CONFIG_IDE_ARM is not set
# CONFIG_IDE_CHIPSETS is not set
CONFIG_BLK_DEV_IDEDMA=y
# CONFIG_IDEDMA_IVB is not set
CONFIG_IDEDMA_AUTO=y
# CONFIG_BLK_DEV_HD is not set
#
# SCSI device support
#
CONFIG_SCSI=m
CONFIG_SCSI_PROC_FS=y
#
# SCSI support type (disk, tape, CD-ROM)
#
CONFIG_BLK_DEV_SD=m
CONFIG_CHR_DEV_ST=m
CONFIG_CHR_DEV_OSST=m
CONFIG_BLK_DEV_SR=m
CONFIG_BLK_DEV_SR_VENDOR=y
CONFIG_CHR_DEV_SG=m
#
# Some SCSI devices (e.g. CD jukebox) support multiple LUNs
#
# CONFIG_SCSI_MULTI_LUN is not set
CONFIG_SCSI_CONSTANTS=y
CONFIG_SCSI_LOGGING=y
#
# SCSI Transport Attributes
#
CONFIG_SCSI_SPI_ATTRS=m
# CONFIG_SCSI_FC_ATTRS is not set
# CONFIG_SCSI_ISCSI_ATTRS is not set
#
# SCSI low-level drivers
#
CONFIG_BLK_DEV_3W_XXXX_RAID=m
# CONFIG_SCSI_3W_9XXX is not set
CONFIG_SCSI_7000FASST=m
CONFIG_SCSI_ACARD=m
CONFIG_SCSI_AHA152X=m
CONFIG_SCSI_AHA1542=m
CONFIG_SCSI_AACRAID=m
CONFIG_SCSI_AIC7XXX=m
CONFIG_AIC7XXX_CMDS_PER_DEVICE=32
CONFIG_AIC7XXX_RESET_DELAY_MS=15000
# CONFIG_AIC7XXX_DEBUG_ENABLE is not set
CONFIG_AIC7XXX_DEBUG_MASK=0
# CONFIG_AIC7XXX_REG_PRETTY_PRINT is not set
CONFIG_SCSI_AIC7XXX_OLD=m
CONFIG_SCSI_AIC79XX=m
CONFIG_AIC79XX_CMDS_PER_DEVICE=32
CONFIG_AIC79XX_RESET_DELAY_MS=15000
# CONFIG_AIC79XX_ENABLE_RD_STRM is not set
# CONFIG_AIC79XX_DEBUG_ENABLE is not set
CONFIG_AIC79XX_DEBUG_MASK=0
# CONFIG_AIC79XX_REG_PRETTY_PRINT is not set
# CONFIG_SCSI_DPT_I2O is not set
CONFIG_SCSI_IN2000=m
# CONFIG_MEGARAID_NEWGEN is not set
# CONFIG_MEGARAID_LEGACY is not set
CONFIG_SCSI_SATA=y
# CONFIG_SCSI_SATA_AHCI is not set
CONFIG_SCSI_SATA_SVW=m
CONFIG_SCSI_ATA_PIIX=m
# CONFIG_SCSI_SATA_NV is not set
CONFIG_SCSI_SATA_PROMISE=m
# CONFIG_SCSI_SATA_SX4 is not set
CONFIG_SCSI_SATA_SIL=m
# CONFIG_SCSI_SATA_SIS is not set
# CONFIG_SCSI_SATA_ULI is not set
CONFIG_SCSI_SATA_VIA=m
# CONFIG_SCSI_SATA_VITESSE is not set
CONFIG_SCSI_BUSLOGIC=m
# CONFIG_SCSI_OMIT_FLASHPOINT is not set
CONFIG_SCSI_DMX3191D=m
CONFIG_SCSI_DTC3280=m
CONFIG_SCSI_EATA=m
CONFIG_SCSI_EATA_TAGGED_QUEUE=y
# CONFIG_SCSI_EATA_LINKED_COMMANDS is not set
CONFIG_SCSI_EATA_MAX_TAGS=16
CONFIG_SCSI_EATA_PIO=m
CONFIG_SCSI_FUTURE_DOMAIN=m
CONFIG_SCSI_GDTH=m
CONFIG_SCSI_GENERIC_NCR5380=m
# CONFIG_SCSI_GENERIC_NCR5380_MMIO is not set
# CONFIG_SCSI_GENERIC_NCR53C400 is not set
CONFIG_SCSI_IPS=m
# CONFIG_SCSI_INITIO is not set
CONFIG_SCSI_INIA100=m
CONFIG_SCSI_NCR53C406A=m
CONFIG_SCSI_SYM53C8XX_2=m
CONFIG_SCSI_SYM53C8XX_DMA_ADDRESSING_MODE=1
CONFIG_SCSI_SYM53C8XX_DEFAULT_TAGS=16
CONFIG_SCSI_SYM53C8XX_MAX_TAGS=64
# CONFIG_SCSI_SYM53C8XX_IOMAPPED is not set
# CONFIG_SCSI_IPR is not set
CONFIG_SCSI_PAS16=m
CONFIG_SCSI_PSI240I=m
CONFIG_SCSI_QLOGIC_FAS=m
CONFIG_SCSI_QLOGIC_ISP=m
CONFIG_SCSI_QLOGIC_FC=m
# CONFIG_SCSI_QLOGIC_FC_FIRMWARE is not set
CONFIG_SCSI_QLOGIC_1280=m
# CONFIG_SCSI_QLOGIC_1280_1040 is not set
CONFIG_SCSI_QLA2XXX=m
# CONFIG_SCSI_QLA21XX is not set
# CONFIG_SCSI_QLA22XX is not set
# CONFIG_SCSI_QLA2300 is not set
# CONFIG_SCSI_QLA2322 is not set
# CONFIG_SCSI_QLA6312 is not set
# CONFIG_SCSI_QLA6322 is not set
CONFIG_SCSI_SYM53C416=m
# CONFIG_SCSI_DC395x is not set
CONFIG_SCSI_DC390T=m
CONFIG_SCSI_T128=m
CONFIG_SCSI_U14_34F=m
# CONFIG_SCSI_U14_34F_TAGGED_QUEUE is not set
# CONFIG_SCSI_U14_34F_LINKED_COMMANDS is not set
CONFIG_SCSI_U14_34F_MAX_TAGS=8
CONFIG_SCSI_ULTRASTOR=m
CONFIG_SCSI_NSP32=m
CONFIG_SCSI_DEBUG=m
#
# Old CD-ROM drivers (not SCSI, not IDE)
#
# CONFIG_CD_NO_IDESCSI is not set
#
# Multi-device support (RAID and LVM)
#
CONFIG_MD=y
CONFIG_BLK_DEV_MD=y
CONFIG_MD_LINEAR=m
CONFIG_MD_RAID0=m
CONFIG_MD_RAID1=m
# CONFIG_MD_RAID10 is not set
CONFIG_MD_RAID5=m
# CONFIG_MD_RAID6 is not set
CONFIG_MD_MULTIPATH=m
# CONFIG_MD_FAULTY is not set
CONFIG_BLK_DEV_DM=m
# CONFIG_DM_CRYPT is not set
# CONFIG_DM_SNAPSHOT is not set
# CONFIG_DM_MIRROR is not set
# CONFIG_DM_ZERO is not set
#
# Fusion MPT device support
#
# CONFIG_FUSION is not set
#
# IEEE 1394 (FireWire) support
#
# CONFIG_IEEE1394 is not set
#
# I2O device support
#
# CONFIG_I2O is not set
#
# Networking support
#
CONFIG_NET=y
#
# Networking options
#
CONFIG_PACKET=y
CONFIG_PACKET_MMAP=y
CONFIG_NETLINK_DEV=y
CONFIG_UNIX=y
# CONFIG_IPMI_SOCKET is not set
CONFIG_NET_KEY=m
CONFIG_INET=y
CONFIG_IP_MULTICAST=y
CONFIG_IP_ADVANCED_ROUTER=y
CONFIG_IP_MULTIPLE_TABLES=y
CONFIG_IP_ROUTE_FWMARK=y
CONFIG_IP_ROUTE_MULTIPATH=y
CONFIG_IP_ROUTE_VERBOSE=y
# CONFIG_IP_PNP is not set
CONFIG_NET_IPIP=m
CONFIG_NET_IPGRE=m
CONFIG_NET_IPGRE_BROADCAST=y
CONFIG_IP_MROUTE=y
CONFIG_IP_PIMSM_V1=y
CONFIG_IP_PIMSM_V2=y
# CONFIG_ARPD is not set
CONFIG_SYN_COOKIES=y
CONFIG_INET_AH=m
CONFIG_INET_ESP=m
CONFIG_INET_IPCOMP=m
CONFIG_INET_TUNNEL=m
CONFIG_IP_TCPDIAG=y
# CONFIG_IP_TCPDIAG_IPV6 is not set
#
# IP: Virtual Server Configuration
#
CONFIG_IP_VS=m
# CONFIG_IP_VS_DEBUG is not set
CONFIG_IP_VS_TAB_BITS=16
#
# IPVS transport protocol load balancing support
#
# CONFIG_IP_VS_PROTO_TCP is not set
# CONFIG_IP_VS_PROTO_UDP is not set
# CONFIG_IP_VS_PROTO_ESP is not set
# CONFIG_IP_VS_PROTO_AH is not set
#
# IPVS scheduler
#
CONFIG_IP_VS_RR=m
CONFIG_IP_VS_WRR=m
CONFIG_IP_VS_LC=m
CONFIG_IP_VS_WLC=m
CONFIG_IP_VS_LBLC=m
CONFIG_IP_VS_LBLCR=m
CONFIG_IP_VS_DH=m
CONFIG_IP_VS_SH=m
# CONFIG_IP_VS_SED is not set
# CONFIG_IP_VS_NQ is not set
#
# IPVS application helper
#
# CONFIG_IPV6 is not set
CONFIG_NETFILTER=y
# CONFIG_NETFILTER_DEBUG is not set
CONFIG_BRIDGE_NETFILTER=y
#
# IP: Netfilter Configuration
#
CONFIG_IP_NF_CONNTRACK=m
# CONFIG_IP_NF_CT_ACCT is not set
# CONFIG_IP_NF_CONNTRACK_MARK is not set
# CONFIG_IP_NF_CT_PROTO_SCTP is not set
CONFIG_IP_NF_FTP=m
CONFIG_IP_NF_IRC=m
CONFIG_IP_NF_TFTP=m
CONFIG_IP_NF_AMANDA=m
CONFIG_IP_NF_QUEUE=m
CONFIG_IP_NF_IPTABLES=m
CONFIG_IP_NF_MATCH_LIMIT=m
# CONFIG_IP_NF_MATCH_IPRANGE is not set
CONFIG_IP_NF_MATCH_MAC=m
CONFIG_IP_NF_MATCH_PKTTYPE=m
CONFIG_IP_NF_MATCH_MARK=m
CONFIG_IP_NF_MATCH_MULTIPORT=m
CONFIG_IP_NF_MATCH_TOS=m
CONFIG_IP_NF_MATCH_RECENT=m
CONFIG_IP_NF_MATCH_ECN=m
CONFIG_IP_NF_MATCH_DSCP=m
CONFIG_IP_NF_MATCH_AH_ESP=m
CONFIG_IP_NF_MATCH_LENGTH=m
CONFIG_IP_NF_MATCH_TTL=m
CONFIG_IP_NF_MATCH_TCPMSS=m
CONFIG_IP_NF_MATCH_HELPER=m
CONFIG_IP_NF_MATCH_STATE=m
CONFIG_IP_NF_MATCH_CONNTRACK=m
CONFIG_IP_NF_MATCH_OWNER=m
# CONFIG_IP_NF_MATCH_PHYSDEV is not set
# CONFIG_IP_NF_MATCH_ADDRTYPE is not set
# CONFIG_IP_NF_MATCH_REALM is not set
# CONFIG_IP_NF_MATCH_SCTP is not set
# CONFIG_IP_NF_MATCH_COMMENT is not set
# CONFIG_IP_NF_MATCH_HASHLIMIT is not set
CONFIG_IP_NF_FILTER=m
CONFIG_IP_NF_TARGET_REJECT=m
CONFIG_IP_NF_TARGET_LOG=m
CONFIG_IP_NF_TARGET_ULOG=m
CONFIG_IP_NF_TARGET_TCPMSS=m
CONFIG_IP_NF_NAT=m
CONFIG_IP_NF_NAT_NEEDED=y
CONFIG_IP_NF_TARGET_MASQUERADE=m
CONFIG_IP_NF_TARGET_REDIRECT=m
# CONFIG_IP_NF_TARGET_NETMAP is not set
# CONFIG_IP_NF_TARGET_SAME is not set
# CONFIG_IP_NF_NAT_LOCAL is not set
CONFIG_IP_NF_NAT_SNMP_BASIC=m
CONFIG_IP_NF_NAT_IRC=m
CONFIG_IP_NF_NAT_FTP=m
CONFIG_IP_NF_NAT_TFTP=m
CONFIG_IP_NF_NAT_AMANDA=m
CONFIG_IP_NF_MANGLE=m
CONFIG_IP_NF_TARGET_TOS=m
CONFIG_IP_NF_TARGET_ECN=m
CONFIG_IP_NF_TARGET_DSCP=m
CONFIG_IP_NF_TARGET_MARK=m
# CONFIG_IP_NF_TARGET_CLASSIFY is not set
# CONFIG_IP_NF_RAW is not set
CONFIG_IP_NF_ARPTABLES=m
CONFIG_IP_NF_ARPFILTER=m
CONFIG_IP_NF_ARP_MANGLE=m
CONFIG_IP_NF_COMPAT_IPCHAINS=m
CONFIG_IP_NF_COMPAT_IPFWADM=m
#
# Bridge: Netfilter Configuration
#
# CONFIG_BRIDGE_NF_EBTABLES is not set
CONFIG_XFRM=y
CONFIG_XFRM_USER=m
#
# SCTP Configuration (EXPERIMENTAL)
#
# CONFIG_IP_SCTP is not set
CONFIG_ATM=y
CONFIG_ATM_CLIP=y
# CONFIG_ATM_CLIP_NO_ICMP is not set
CONFIG_ATM_LANE=m
CONFIG_ATM_MPOA=m
CONFIG_ATM_BR2684=m
CONFIG_ATM_BR2684_IPFILTER=y
CONFIG_BRIDGE=m
CONFIG_VLAN_8021Q=m
# CONFIG_DECNET is not set
CONFIG_LLC=y
# CONFIG_LLC2 is not set
CONFIG_IPX=m
# CONFIG_IPX_INTERN is not set
CONFIG_ATALK=m
CONFIG_DEV_APPLETALK=y
CONFIG_LTPC=m
CONFIG_COPS=m
CONFIG_COPS_DAYNA=y
CONFIG_COPS_TANGENT=y
CONFIG_IPDDP=m
CONFIG_IPDDP_ENCAP=y
CONFIG_IPDDP_DECAP=y
# CONFIG_X25 is not set
# CONFIG_LAPB is not set
CONFIG_NET_DIVERT=y
# CONFIG_ECONET is not set
CONFIG_WAN_ROUTER=m
#
# QoS and/or fair queueing
#
CONFIG_NET_SCHED=y
CONFIG_NET_SCH_CLK_JIFFIES=y
# CONFIG_NET_SCH_CLK_GETTIMEOFDAY is not set
# CONFIG_NET_SCH_CLK_CPU is not set
CONFIG_NET_SCH_CBQ=m
CONFIG_NET_SCH_HTB=m
# CONFIG_NET_SCH_HFSC is not set
# CONFIG_NET_SCH_ATM is not set
CONFIG_NET_SCH_PRIO=m
CONFIG_NET_SCH_RED=m
CONFIG_NET_SCH_SFQ=m
CONFIG_NET_SCH_TEQL=m
CONFIG_NET_SCH_TBF=m
CONFIG_NET_SCH_GRED=m
CONFIG_NET_SCH_DSMARK=m
# CONFIG_NET_SCH_NETEM is not set
CONFIG_NET_SCH_INGRESS=m
CONFIG_NET_QOS=y
CONFIG_NET_ESTIMATOR=y
CONFIG_NET_CLS=y
CONFIG_NET_CLS_TCINDEX=m
CONFIG_NET_CLS_ROUTE4=m
CONFIG_NET_CLS_ROUTE=y
CONFIG_NET_CLS_FW=m
CONFIG_NET_CLS_U32=m
# CONFIG_CLS_U32_PERF is not set
# CONFIG_NET_CLS_IND is not set
CONFIG_NET_CLS_RSVP=m
CONFIG_NET_CLS_RSVP6=m
# CONFIG_NET_CLS_ACT is not set
CONFIG_NET_CLS_POLICE=y
#
# Network testing
#
# CONFIG_NET_PKTGEN is not set
# CONFIG_KGDBOE is not set
CONFIG_NETPOLL=y
# CONFIG_NETPOLL_RX is not set
# CONFIG_NETPOLL_TRAP is not set
CONFIG_NET_POLL_CONTROLLER=y
# CONFIG_HAMRADIO is not set
# CONFIG_IRDA is not set
# CONFIG_BT is not set
CONFIG_NETDEVICES=y
CONFIG_DUMMY=m
# CONFIG_BONDING is not set
# CONFIG_EQUALIZER is not set
CONFIG_TUN=m
CONFIG_ETHERTAP=m
# CONFIG_NET_SB1000 is not set
#
# ARCnet devices
#
# CONFIG_ARCNET is not set
#
# Ethernet (10 or 100Mbit)
#
CONFIG_NET_ETHERNET=y
CONFIG_MII=m
CONFIG_HAPPYMEAL=m
CONFIG_SUNGEM=m
CONFIG_NET_VENDOR_3COM=y
CONFIG_EL1=m
CONFIG_EL2=m
CONFIG_ELPLUS=m
CONFIG_EL16=m
CONFIG_EL3=m
CONFIG_3C515=m
CONFIG_VORTEX=m
CONFIG_TYPHOON=m
CONFIG_LANCE=m
CONFIG_NET_VENDOR_SMC=y
CONFIG_WD80x3=m
CONFIG_ULTRA=m
CONFIG_SMC9194=m
CONFIG_NET_VENDOR_RACAL=y
CONFIG_NI52=m
CONFIG_NI65=m
#
# Tulip family network device support
#
# CONFIG_NET_TULIP is not set
CONFIG_AT1700=m
CONFIG_DEPCA=m
CONFIG_HP100=m
CONFIG_NET_ISA=y
CONFIG_E2100=m
CONFIG_EWRK3=m
CONFIG_EEXPRESS=m
CONFIG_EEXPRESS_PRO=m
CONFIG_HPLAN_PLUS=m
CONFIG_HPLAN=m
CONFIG_LP486E=m
CONFIG_ETH16I=m
CONFIG_NE2000=m
# CONFIG_ZNET is not set
# CONFIG_SEEQ8005 is not set
CONFIG_NET_PCI=y
CONFIG_PCNET32=m
CONFIG_AMD8111_ETH=m
# CONFIG_AMD8111E_NAPI is not set
CONFIG_ADAPTEC_STARFIRE=m
# CONFIG_ADAPTEC_STARFIRE_NAPI is not set
CONFIG_AC3200=m
CONFIG_APRICOT=m
CONFIG_B44=m
# CONFIG_FORCEDETH is not set
CONFIG_CS89x0=m
CONFIG_DGRS=m
CONFIG_EEPRO100=m
CONFIG_E100=m
# CONFIG_E100_NAPI is not set
CONFIG_FEALNX=m
CONFIG_NATSEMI=m
CONFIG_NE2K_PCI=m
CONFIG_8139CP=m
CONFIG_8139TOO=m
CONFIG_8139TOO_PIO=y
# CONFIG_8139TOO_TUNE_TWISTER is not set
CONFIG_8139TOO_8129=y
# CONFIG_8139_OLD_RX_RESET is not set
CONFIG_SIS900=m
CONFIG_EPIC100=m
CONFIG_SUNDANCE=m
# CONFIG_SUNDANCE_MMIO is not set
CONFIG_TLAN=m
CONFIG_VIA_RHINE=m
# CONFIG_VIA_RHINE_MMIO is not set
CONFIG_NET_POCKET=y
CONFIG_ATP=m
CONFIG_DE600=m
CONFIG_DE620=m
#
# Ethernet (1000 Mbit)
#
CONFIG_ACENIC=m
# CONFIG_ACENIC_OMIT_TIGON_I is not set
CONFIG_DL2K=m
CONFIG_E1000=m
CONFIG_E1000_NAPI=y
CONFIG_NS83820=m
CONFIG_HAMACHI=m
CONFIG_YELLOWFIN=m
CONFIG_R8169=m
# CONFIG_R8169_NAPI is not set
# CONFIG_R8169_VLAN is not set
CONFIG_SK98LIN=m
# CONFIG_VIA_VELOCITY is not set
CONFIG_TIGON3=m
#
# Ethernet (10000 Mbit)
#
# CONFIG_IXGB is not set
# CONFIG_S2IO is not set
#
# Token Ring devices
#
CONFIG_TR=y
CONFIG_IBMTR=m
CONFIG_IBMOL=m
CONFIG_IBMLS=m
CONFIG_3C359=m
CONFIG_TMS380TR=m
CONFIG_TMSPCI=m
# CONFIG_SKISA is not set
# CONFIG_PROTEON is not set
CONFIG_ABYSS=m
CONFIG_SMCTR=m
#
# Wireless LAN (non-hamradio)
#
# CONFIG_NET_RADIO is not set
#
# Wan interfaces
#
# CONFIG_WAN is not set
#
# ATM drivers
#
# CONFIG_ATM_TCP is not set
# CONFIG_ATM_LANAI is not set
# CONFIG_ATM_ENI is not set
# CONFIG_ATM_FIRESTREAM is not set
# CONFIG_ATM_ZATM is not set
# CONFIG_ATM_NICSTAR is not set
# CONFIG_ATM_IDT77252 is not set
# CONFIG_ATM_AMBASSADOR is not set
# CONFIG_ATM_HORIZON is not set
# CONFIG_ATM_IA is not set
# CONFIG_ATM_FORE200E_MAYBE is not set
# CONFIG_ATM_HE is not set
# CONFIG_FDDI is not set
# CONFIG_HIPPI is not set
CONFIG_PPP=m
CONFIG_PPP_MULTILINK=y
CONFIG_PPP_FILTER=y
CONFIG_PPP_ASYNC=m
CONFIG_PPP_SYNC_TTY=m
CONFIG_PPP_DEFLATE=m
# CONFIG_PPP_BSDCOMP is not set
CONFIG_PPPOE=m
CONFIG_PPPOATM=m
CONFIG_SLIP=m
CONFIG_SLIP_COMPRESSED=y
CONFIG_SLIP_SMART=y
CONFIG_SLIP_MODE_SLIP6=y
CONFIG_NET_FC=y
CONFIG_SHAPER=m
CONFIG_NETCONSOLE=m
#
# ISDN subsystem
#
# CONFIG_ISDN is not set
#
# Telephony Support
#
# CONFIG_PHONE is not set
#
# Input device support
#
CONFIG_INPUT=y
#
# Userland interfaces
#
CONFIG_INPUT_MOUSEDEV=y
# CONFIG_INPUT_MOUSEDEV_PSAUX is not set
CONFIG_INPUT_MOUSEDEV_SCREEN_X=1024
CONFIG_INPUT_MOUSEDEV_SCREEN_Y=768
CONFIG_INPUT_JOYDEV=m
# CONFIG_INPUT_TSDEV is not set
CONFIG_INPUT_EVDEV=m
# CONFIG_INPUT_EVBUG is not set
#
# Input I/O drivers
#
# CONFIG_GAMEPORT is not set
CONFIG_SOUND_GAMEPORT=y
CONFIG_SERIO=y
CONFIG_SERIO_I8042=y
CONFIG_SERIO_SERPORT=y
# CONFIG_SERIO_CT82C710 is not set
# CONFIG_SERIO_PCIPS2 is not set
CONFIG_SERIO_LIBPS2=y
# CONFIG_SERIO_RAW is not set
#
# Input Device Drivers
#
CONFIG_INPUT_KEYBOARD=y
CONFIG_KEYBOARD_ATKBD=y
# CONFIG_KEYBOARD_SUNKBD is not set
# CONFIG_KEYBOARD_LKKBD is not set
# CONFIG_KEYBOARD_XTKBD is not set
# CONFIG_KEYBOARD_NEWTON is not set
CONFIG_INPUT_MOUSE=y
CONFIG_MOUSE_PS2=y
# CONFIG_MOUSE_SERIAL is not set
# CONFIG_MOUSE_INPORT is not set
# CONFIG_MOUSE_LOGIBM is not set
# CONFIG_MOUSE_PC110PAD is not set
# CONFIG_MOUSE_VSXXXAA is not set
# CONFIG_INPUT_JOYSTICK is not set
# CONFIG_INPUT_TOUCHSCREEN is not set
CONFIG_INPUT_MISC=y
# CONFIG_INPUT_PCSPKR is not set
# CONFIG_INPUT_UINPUT is not set
#
# Character devices
#
CONFIG_VT=y
CONFIG_VT_CONSOLE=y
CONFIG_HW_CONSOLE=y
CONFIG_SERIAL_NONSTANDARD=y
CONFIG_ROCKETPORT=m
CONFIG_CYCLADES=m
# CONFIG_CYZ_INTR is not set
CONFIG_SYNCLINK=m
# CONFIG_SYNCLINKMP is not set
CONFIG_N_HDLC=m
CONFIG_STALDRV=y
#
# Serial drivers
#
CONFIG_SERIAL_8250=m
CONFIG_SERIAL_8250_NR_UARTS=4
# CONFIG_SERIAL_8250_EXTENDED is not set
#
# Non-8250 serial port support
#
CONFIG_SERIAL_CORE=m
CONFIG_UNIX98_PTYS=y
CONFIG_LEGACY_PTYS=y
CONFIG_LEGACY_PTY_COUNT=256
#
# IPMI
#
CONFIG_IPMI_HANDLER=m
# CONFIG_IPMI_PANIC_EVENT is not set
CONFIG_IPMI_DEVICE_INTERFACE=m
# CONFIG_IPMI_SI is not set
CONFIG_IPMI_WATCHDOG=m
# CONFIG_IPMI_POWEROFF is not set
#
# Watchdog Cards
#
CONFIG_WATCHDOG=y
# CONFIG_WATCHDOG_NOWAYOUT is not set
#
# Watchdog Device Drivers
#
CONFIG_SOFT_WATCHDOG=m
CONFIG_ACQUIRE_WDT=m
CONFIG_ADVANTECH_WDT=m
CONFIG_ALIM1535_WDT=m
CONFIG_ALIM7101_WDT=m
CONFIG_SC520_WDT=m
CONFIG_EUROTECH_WDT=m
CONFIG_IB700_WDT=m
CONFIG_WAFER_WDT=m
# CONFIG_I8XX_TCO is not set
CONFIG_SC1200_WDT=m
# CONFIG_SCx200_WDT is not set
# CONFIG_60XX_WDT is not set
# CONFIG_CPU5_WDT is not set
# CONFIG_W83627HF_WDT is not set
CONFIG_W83877F_WDT=m
CONFIG_MACHZ_WDT=m
#
# ISA-based Watchdog Cards
#
CONFIG_PCWATCHDOG=m
# CONFIG_MIXCOMWD is not set
CONFIG_WDT=m
# CONFIG_WDT_501 is not set
#
# PCI-based Watchdog Cards
#
# CONFIG_PCIPCWATCHDOG is not set
CONFIG_WDTPCI=m
# CONFIG_WDT_501_PCI is not set
#
# USB-based Watchdog Cards
#
# CONFIG_USBPCWATCHDOG is not set
# CONFIG_HW_RANDOM is not set
CONFIG_NVRAM=m
CONFIG_RTC=y
CONFIG_RTC_HISTOGRAM=y
CONFIG_DTLK=m
CONFIG_R3964=m
# CONFIG_APPLICOM is not set
CONFIG_SONYPI=m
#
# Ftape, the floppy tape device driver
#
CONFIG_AGP=m
CONFIG_AGP_ALI=m
CONFIG_AGP_ATI=m
CONFIG_AGP_AMD=m
# CONFIG_AGP_AMD64 is not set
CONFIG_AGP_INTEL=m
# CONFIG_AGP_INTEL_MCH is not set
CONFIG_AGP_NVIDIA=m
CONFIG_AGP_SIS=m
CONFIG_AGP_SWORKS=m
CONFIG_AGP_VIA=m
# CONFIG_AGP_EFFICEON is not set
# CONFIG_DRM is not set
CONFIG_MWAVE=m
# CONFIG_RAW_DRIVER is not set
# CONFIG_HANGCHECK_TIMER is not set
#
# I2C support
#
CONFIG_I2C=m
CONFIG_I2C_CHARDEV=m
#
# I2C Algorithms
#
CONFIG_I2C_ALGOBIT=m
CONFIG_I2C_ALGOPCF=m
# CONFIG_I2C_ALGOPCA is not set
#
# I2C Hardware Bus support
#
CONFIG_I2C_ALI1535=m
# CONFIG_I2C_ALI1563 is not set
CONFIG_I2C_ALI15X3=m
CONFIG_I2C_AMD756=m
# CONFIG_I2C_AMD756_S4882 is not set
# CONFIG_I2C_AMD8111 is not set
CONFIG_I2C_I801=m
CONFIG_I2C_I810=m
CONFIG_I2C_ISA=m
# CONFIG_I2C_NFORCE2 is not set
# CONFIG_I2C_PARPORT_LIGHT is not set
CONFIG_I2C_PIIX4=m
# CONFIG_I2C_PROSAVAGE is not set
# CONFIG_I2C_SAVAGE4 is not set
# CONFIG_SCx200_ACB is not set
CONFIG_I2C_SIS5595=m
# CONFIG_I2C_SIS630 is not set
# CONFIG_I2C_SIS96X is not set
# CONFIG_I2C_STUB is not set
CONFIG_I2C_VIA=m
CONFIG_I2C_VIAPRO=m
CONFIG_I2C_VOODOO3=m
# CONFIG_I2C_PCA_ISA is not set
#
# Hardware Sensors Chip support
#
CONFIG_I2C_SENSOR=m
CONFIG_SENSORS_ADM1021=m
# CONFIG_SENSORS_ADM1025 is not set
# CONFIG_SENSORS_ADM1031 is not set
# CONFIG_SENSORS_ASB100 is not set
CONFIG_SENSORS_DS1621=m
# CONFIG_SENSORS_FSCHER is not set
CONFIG_SENSORS_GL518SM=m
CONFIG_SENSORS_IT87=m
# CONFIG_SENSORS_LM63 is not set
CONFIG_SENSORS_LM75=m
# CONFIG_SENSORS_LM77 is not set
CONFIG_SENSORS_LM78=m
CONFIG_SENSORS_LM80=m
# CONFIG_SENSORS_LM83 is not set
# CONFIG_SENSORS_LM85 is not set
# CONFIG_SENSORS_LM87 is not set
# CONFIG_SENSORS_LM90 is not set
# CONFIG_SENSORS_MAX1619 is not set
# CONFIG_SENSORS_PC87360 is not set
# CONFIG_SENSORS_SMSC47M1 is not set
CONFIG_SENSORS_VIA686A=m
CONFIG_SENSORS_W83781D=m
# CONFIG_SENSORS_W83L785TS is not set
# CONFIG_SENSORS_W83627HF is not set
#
# Other I2C Chip support
#
CONFIG_SENSORS_EEPROM=m
CONFIG_SENSORS_PCF8574=m
CONFIG_SENSORS_PCF8591=m
# CONFIG_SENSORS_RTC8564 is not set
# CONFIG_I2C_DEBUG_CORE is not set
# CONFIG_I2C_DEBUG_ALGO is not set
# CONFIG_I2C_DEBUG_BUS is not set
# CONFIG_I2C_DEBUG_CHIP is not set
#
# Dallas's 1-wire bus
#
# CONFIG_W1 is not set
#
# Misc devices
#
# CONFIG_IBM_ASM is not set
#
# Multimedia devices
#
# CONFIG_VIDEO_DEV is not set
#
# Digital Video Broadcasting Devices
#
# CONFIG_DVB is not set
#
# Graphics support
#
CONFIG_FB=y
CONFIG_FB_MODE_HELPERS=y
# CONFIG_FB_TILEBLITTING is not set
# CONFIG_FB_CIRRUS is not set
CONFIG_FB_PM2=m
# CONFIG_FB_PM2_FIFO_DISCONNECT is not set
# CONFIG_FB_CYBER2000 is not set
# CONFIG_FB_ASILIANT is not set
# CONFIG_FB_IMSTT is not set
CONFIG_FB_VGA16=m
CONFIG_FB_VESA=y
CONFIG_VIDEO_SELECT=y
CONFIG_FB_HGA=m
# CONFIG_FB_HGA_ACCEL is not set
CONFIG_FB_RIVA=m
# CONFIG_FB_RIVA_I2C is not set
# CONFIG_FB_RIVA_DEBUG is not set
# CONFIG_FB_I810 is not set
# CONFIG_FB_INTEL is not set
CONFIG_FB_MATROX=m
CONFIG_FB_MATROX_MILLENIUM=y
CONFIG_FB_MATROX_MYSTIQUE=y
CONFIG_FB_MATROX_G450=y
CONFIG_FB_MATROX_G100=y
CONFIG_FB_MATROX_I2C=m
CONFIG_FB_MATROX_MAVEN=m
CONFIG_FB_MATROX_MULTIHEAD=y
# CONFIG_FB_RADEON_OLD is not set
CONFIG_FB_RADEON=m
CONFIG_FB_RADEON_I2C=y
# CONFIG_FB_RADEON_DEBUG is not set
CONFIG_FB_ATY128=m
CONFIG_FB_ATY=m
CONFIG_FB_ATY_CT=y
# CONFIG_FB_ATY_GENERIC_LCD is not set
# CONFIG_FB_ATY_XL_INIT is not set
CONFIG_FB_ATY_GX=y
# CONFIG_FB_SAVAGE is not set
CONFIG_FB_SIS=m
CONFIG_FB_SIS_300=y
CONFIG_FB_SIS_315=y
CONFIG_FB_NEOMAGIC=m
# CONFIG_FB_KYRO is not set
CONFIG_FB_3DFX=m
# CONFIG_FB_3DFX_ACCEL is not set
CONFIG_FB_VOODOO1=m
# CONFIG_FB_TRIDENT is not set
# CONFIG_FB_VIRTUAL is not set
#
# Console display driver support
#
CONFIG_VGA_CONSOLE=y
CONFIG_MDA_CONSOLE=m
CONFIG_DUMMY_CONSOLE=y
# CONFIG_FRAMEBUFFER_CONSOLE is not set
#
# Logo configuration
#
# CONFIG_LOGO is not set
#
# Sound
#
CONFIG_SOUND=m
#
# Advanced Linux Sound Architecture
#
# CONFIG_SND is not set
#
# Open Sound System
#
# CONFIG_SOUND_PRIME is not set
#
# USB support
#
CONFIG_USB=m
# CONFIG_USB_DEBUG is not set
#
# Miscellaneous USB options
#
CONFIG_USB_DEVICEFS=y
# CONFIG_USB_BANDWIDTH is not set
# CONFIG_USB_DYNAMIC_MINORS is not set
# CONFIG_USB_OTG is not set
CONFIG_USB_ARCH_HAS_HCD=y
CONFIG_USB_ARCH_HAS_OHCI=y
#
# USB Host Controller Drivers
#
# CONFIG_USB_EHCI_HCD is not set
# CONFIG_USB_OHCI_HCD is not set
CONFIG_USB_UHCI_HCD=m
#
# USB Device Class drivers
#
CONFIG_USB_AUDIO=m
# CONFIG_USB_BLUETOOTH_TTY is not set
CONFIG_USB_MIDI=m
CONFIG_USB_ACM=m
CONFIG_USB_PRINTER=m
CONFIG_USB_STORAGE=m
# CONFIG_USB_STORAGE_DEBUG is not set
# CONFIG_USB_STORAGE_RW_DETECT is not set
CONFIG_USB_STORAGE_DATAFAB=y
CONFIG_USB_STORAGE_FREECOM=y
CONFIG_USB_STORAGE_ISD200=y
CONFIG_USB_STORAGE_DPCM=y
CONFIG_USB_STORAGE_HP8200e=y
CONFIG_USB_STORAGE_SDDR09=y
CONFIG_USB_STORAGE_SDDR55=y
CONFIG_USB_STORAGE_JUMPSHOT=y
#
# USB Input Devices
#
CONFIG_USB_HID=m
CONFIG_USB_HIDINPUT=y
# CONFIG_HID_FF is not set
# CONFIG_USB_HIDDEV is not set
#
# USB HID Boot Protocol drivers
#
# CONFIG_USB_KBD is not set
# CONFIG_USB_MOUSE is not set
CONFIG_USB_AIPTEK=m
CONFIG_USB_WACOM=m
CONFIG_USB_KBTAB=m
CONFIG_USB_POWERMATE=m
# CONFIG_USB_MTOUCH is not set
# CONFIG_USB_EGALAX is not set
# CONFIG_USB_XPAD is not set
# CONFIG_USB_ATI_REMOTE is not set
#
# USB Imaging devices
#
CONFIG_USB_MDC800=m
CONFIG_USB_MICROTEK=m
CONFIG_USB_HPUSBSCSI=m
#
# USB Multimedia devices
#
CONFIG_USB_DABUSB=m
#
# Video4Linux support is needed for USB Multimedia device support
#
#
# USB Network Adapters
#
CONFIG_USB_CATC=m
CONFIG_USB_KAWETH=m
CONFIG_USB_PEGASUS=m
CONFIG_USB_RTL8150=m
CONFIG_USB_USBNET=m
#
# USB Host-to-Host Cables
#
CONFIG_USB_ALI_M5632=y
CONFIG_USB_AN2720=y
CONFIG_USB_BELKIN=y
CONFIG_USB_GENESYS=y
CONFIG_USB_NET1080=y
CONFIG_USB_PL2301=y
CONFIG_USB_KC2190=y
#
# Intelligent USB Devices/Gadgets
#
CONFIG_USB_ARMLINUX=y
CONFIG_USB_EPSON2888=y
CONFIG_USB_ZAURUS=y
CONFIG_USB_CDCETHER=y
#
# USB Network Adapters
#
CONFIG_USB_AX8817X=y
#
# USB port drivers
#
#
# USB Serial Converter support
#
CONFIG_USB_SERIAL=m
CONFIG_USB_SERIAL_GENERIC=y
CONFIG_USB_SERIAL_BELKIN=m
CONFIG_USB_SERIAL_DIGI_ACCELEPORT=m
# CONFIG_USB_SERIAL_CYPRESS_M8 is not set
CONFIG_USB_SERIAL_EMPEG=m
CONFIG_USB_SERIAL_FTDI_SIO=m
CONFIG_USB_SERIAL_VISOR=m
CONFIG_USB_SERIAL_IPAQ=m
CONFIG_USB_SERIAL_IR=m
CONFIG_USB_SERIAL_EDGEPORT=m
CONFIG_USB_SERIAL_EDGEPORT_TI=m
# CONFIG_USB_SERIAL_IPW is not set
CONFIG_USB_SERIAL_KEYSPAN_PDA=m
CONFIG_USB_SERIAL_KEYSPAN=m
# CONFIG_USB_SERIAL_KEYSPAN_MPR is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA28 is not set
CONFIG_USB_SERIAL_KEYSPAN_USA28X=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XA=y
CONFIG_USB_SERIAL_KEYSPAN_USA28XB=y
# CONFIG_USB_SERIAL_KEYSPAN_USA19 is not set
# CONFIG_USB_SERIAL_KEYSPAN_USA18X is not set
CONFIG_USB_SERIAL_KEYSPAN_USA19W=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QW=y
CONFIG_USB_SERIAL_KEYSPAN_USA19QI=y
CONFIG_USB_SERIAL_KEYSPAN_USA49W=y
CONFIG_USB_SERIAL_KEYSPAN_USA49WLC=y
CONFIG_USB_SERIAL_KLSI=m
CONFIG_USB_SERIAL_KOBIL_SCT=m
CONFIG_USB_SERIAL_MCT_U232=m
CONFIG_USB_SERIAL_PL2303=m
# CONFIG_USB_SERIAL_SAFE is not set
CONFIG_USB_SERIAL_CYBERJACK=m
CONFIG_USB_SERIAL_XIRCOM=m
CONFIG_USB_SERIAL_OMNINET=m
CONFIG_USB_EZUSB=y
#
# USB Miscellaneous drivers
#
# CONFIG_USB_EMI62 is not set
# CONFIG_USB_EMI26 is not set
CONFIG_USB_TIGL=m
CONFIG_USB_AUERSWALD=m
CONFIG_USB_RIO500=m
# CONFIG_USB_LEGOTOWER is not set
CONFIG_USB_LCD=m
# CONFIG_USB_LED is not set
# CONFIG_USB_CYTHERM is not set
# CONFIG_USB_PHIDGETKIT is not set
# CONFIG_USB_PHIDGETSERVO is not set
# CONFIG_USB_TEST is not set
#
# USB ATM/DSL drivers
#
CONFIG_USB_ATM=m
CONFIG_USB_SPEEDTOUCH=m
#
# USB Gadget Support
#
# CONFIG_USB_GADGET is not set
#
# File systems
#
CONFIG_EXT2_FS=y
# CONFIG_EXT2_FS_XATTR is not set
CONFIG_EXT3_FS=m
CONFIG_EXT3_FS_XATTR=y
CONFIG_EXT3_FS_POSIX_ACL=y
# CONFIG_EXT3_FS_SECURITY is not set
CONFIG_JBD=m
# CONFIG_JBD_DEBUG is not set
CONFIG_FS_MBCACHE=y
# CONFIG_REISER4_FS is not set
CONFIG_REISERFS_FS=m
# CONFIG_REISERFS_CHECK is not set
CONFIG_REISERFS_PROC_INFO=y
# CONFIG_REISERFS_FS_XATTR is not set
CONFIG_JFS_FS=m
# CONFIG_JFS_POSIX_ACL is not set
CONFIG_JFS_DEBUG=y
# CONFIG_JFS_STATISTICS is not set
CONFIG_FS_POSIX_ACL=y
# CONFIG_XFS_FS is not set
CONFIG_MINIX_FS=m
CONFIG_ROMFS_FS=m
CONFIG_QUOTA=y
# CONFIG_QFMT_V1 is not set
CONFIG_QFMT_V2=y
CONFIG_QUOTACTL=y
CONFIG_DNOTIFY=y
CONFIG_AUTOFS_FS=m
CONFIG_AUTOFS4_FS=m
#
# Caches
#
# CONFIG_FSCACHE is not set
#
# CD-ROM/DVD Filesystems
#
CONFIG_ISO9660_FS=y
CONFIG_JOLIET=y
CONFIG_ZISOFS=y
CONFIG_ZISOFS_FS=y
CONFIG_UDF_FS=m
CONFIG_UDF_NLS=y
#
# DOS/FAT/NT Filesystems
#
CONFIG_FAT_FS=m
CONFIG_MSDOS_FS=m
CONFIG_VFAT_FS=m
CONFIG_FAT_DEFAULT_CODEPAGE=437
CONFIG_FAT_DEFAULT_IOCHARSET="iso8859-1"
# CONFIG_NTFS_FS is not set
#
# Pseudo filesystems
#
CONFIG_PROC_FS=y
CONFIG_PROC_KCORE=y
CONFIG_SYSFS=y
# CONFIG_DEVFS_FS is not set
# CONFIG_DEVPTS_FS_XATTR is not set
CONFIG_TMPFS=y
# CONFIG_TMPFS_XATTR is not set
# CONFIG_HUGETLBFS is not set
# CONFIG_HUGETLB_PAGE is not set
CONFIG_RAMFS=y
#
# Miscellaneous filesystems
#
# CONFIG_ADFS_FS is not set
# CONFIG_AFFS_FS is not set
# CONFIG_HFS_FS is not set
# CONFIG_HFSPLUS_FS is not set
# CONFIG_BEFS_FS is not set
# CONFIG_BFS_FS is not set
# CONFIG_EFS_FS is not set
CONFIG_CRAMFS=m
# CONFIG_VXFS_FS is not set
# CONFIG_HPFS_FS is not set
# CONFIG_QNX4FS_FS is not set
CONFIG_SYSV_FS=m
CONFIG_UFS_FS=m
# CONFIG_UFS_FS_WRITE is not set
#
# Network File Systems
#
CONFIG_NFS_FS=m
CONFIG_NFS_V3=y
# CONFIG_NFS_V4 is not set
CONFIG_NFS_DIRECTIO=y
CONFIG_NFSD=m
CONFIG_NFSD_V3=y
# CONFIG_NFSD_V4 is not set
CONFIG_NFSD_TCP=y
CONFIG_LOCKD=m
CONFIG_LOCKD_V4=y
CONFIG_EXPORTFS=m
CONFIG_SUNRPC=m
# CONFIG_RPCSEC_GSS_KRB5 is not set
# CONFIG_RPCSEC_GSS_SPKM3 is not set
CONFIG_SMB_FS=m
# CONFIG_SMB_NLS_DEFAULT is not set
# CONFIG_CIFS is not set
CONFIG_NCP_FS=m
CONFIG_NCPFS_PACKET_SIGNING=y
CONFIG_NCPFS_IOCTL_LOCKING=y
CONFIG_NCPFS_STRONG=y
CONFIG_NCPFS_NFS_NS=y
CONFIG_NCPFS_OS2_NS=y
CONFIG_NCPFS_SMALLDOS=y
CONFIG_NCPFS_NLS=y
CONFIG_NCPFS_EXTRAS=y
CONFIG_CODA_FS=m
# CONFIG_CODA_FS_OLD_API is not set
CONFIG_AFS_FS=m
CONFIG_RXRPC=m
#
# Partition Types
#
CONFIG_PARTITION_ADVANCED=y
# CONFIG_ACORN_PARTITION is not set
CONFIG_OSF_PARTITION=y
# CONFIG_AMIGA_PARTITION is not set
# CONFIG_ATARI_PARTITION is not set
# CONFIG_MAC_PARTITION is not set
CONFIG_MSDOS_PARTITION=y
CONFIG_BSD_DISKLABEL=y
CONFIG_MINIX_SUBPARTITION=y
CONFIG_SOLARIS_X86_PARTITION=y
CONFIG_UNIXWARE_DISKLABEL=y
# CONFIG_LDM_PARTITION is not set
CONFIG_SGI_PARTITION=y
# CONFIG_ULTRIX_PARTITION is not set
CONFIG_SUN_PARTITION=y
# CONFIG_EFI_PARTITION is not set
#
# Native Language Support
#
CONFIG_NLS=y
CONFIG_NLS_DEFAULT="iso8859-1"
CONFIG_NLS_CODEPAGE_437=m
CONFIG_NLS_CODEPAGE_737=m
CONFIG_NLS_CODEPAGE_775=m
CONFIG_NLS_CODEPAGE_850=m
CONFIG_NLS_CODEPAGE_852=m
CONFIG_NLS_CODEPAGE_855=m
CONFIG_NLS_CODEPAGE_857=m
CONFIG_NLS_CODEPAGE_860=m
CONFIG_NLS_CODEPAGE_861=m
CONFIG_NLS_CODEPAGE_862=m
CONFIG_NLS_CODEPAGE_863=m
CONFIG_NLS_CODEPAGE_864=m
CONFIG_NLS_CODEPAGE_865=m
CONFIG_NLS_CODEPAGE_866=m
CONFIG_NLS_CODEPAGE_869=m
CONFIG_NLS_CODEPAGE_936=m
CONFIG_NLS_CODEPAGE_950=m
CONFIG_NLS_CODEPAGE_932=m
CONFIG_NLS_CODEPAGE_949=m
CONFIG_NLS_CODEPAGE_874=m
CONFIG_NLS_ISO8859_8=m
CONFIG_NLS_CODEPAGE_1250=m
CONFIG_NLS_CODEPAGE_1251=m
# CONFIG_NLS_ASCII is not set
CONFIG_NLS_ISO8859_1=m
CONFIG_NLS_ISO8859_2=m
CONFIG_NLS_ISO8859_3=m
CONFIG_NLS_ISO8859_4=m
CONFIG_NLS_ISO8859_5=m
CONFIG_NLS_ISO8859_6=m
CONFIG_NLS_ISO8859_7=m
CONFIG_NLS_ISO8859_9=m
CONFIG_NLS_ISO8859_13=m
CONFIG_NLS_ISO8859_14=m
CONFIG_NLS_ISO8859_15=m
CONFIG_NLS_KOI8_R=m
CONFIG_NLS_KOI8_U=m
CONFIG_NLS_UTF8=m
#
# Profiling support
#
CONFIG_PROFILING=y
CONFIG_OPROFILE=m
#
# Kernel hacking
#
CONFIG_DEBUG_KERNEL=y
CONFIG_MAGIC_SYSRQ=y
# CONFIG_SCHEDSTATS is not set
# CONFIG_DEBUG_SLAB is not set
CONFIG_DEBUG_PREEMPT=y
CONFIG_WAKEUP_TIMING=y
CONFIG_PREEMPT_TRACE=y
CONFIG_CRITICAL_PREEMPT_TIMING=y
CONFIG_CRITICAL_IRQSOFF_TIMING=y
CONFIG_CRITICAL_TIMING=y
CONFIG_LATENCY_TIMING=y
CONFIG_LATENCY_TRACE=y
CONFIG_MCOUNT=y
CONFIG_RT_DEADLOCK_DETECT=y
# CONFIG_DEBUG_KOBJECT is not set
CONFIG_DEBUG_BUGVERBOSE=y
# CONFIG_DEBUG_INFO is not set
CONFIG_FRAME_POINTER=y
CONFIG_EARLY_PRINTK=y
# CONFIG_DEBUG_STACKOVERFLOW is not set
# CONFIG_KPROBES is not set
# CONFIG_DEBUG_STACK_USAGE is not set
# CONFIG_DEBUG_PAGEALLOC is not set
# CONFIG_4KSTACKS is not set
CONFIG_X86_FIND_SMP_CONFIG=y
CONFIG_X86_MPPARSE=y
# CONFIG_KGDB is not set
#
# Security options
#
# CONFIG_KEYS is not set
CONFIG_SECURITY=y
CONFIG_SECURITY_NETWORK=y
CONFIG_SECURITY_CAPABILITIES=y
# CONFIG_SECURITY_ROOTPLUG is not set
# CONFIG_SECURITY_SECLVL is not set
# CONFIG_SECURITY_SELINUX is not set
#
# Cryptographic options
#
CONFIG_CRYPTO=y
CONFIG_CRYPTO_HMAC=y
CONFIG_CRYPTO_NULL=m
CONFIG_CRYPTO_MD4=m
CONFIG_CRYPTO_MD5=m
CONFIG_CRYPTO_SHA1=m
CONFIG_CRYPTO_SHA256=m
CONFIG_CRYPTO_SHA512=m
# CONFIG_CRYPTO_WP512 is not set
CONFIG_CRYPTO_DES=m
CONFIG_CRYPTO_BLOWFISH=m
# CONFIG_CRYPTO_TWOFISH is not set
CONFIG_CRYPTO_SERPENT=m
# CONFIG_CRYPTO_AES_586 is not set
CONFIG_CRYPTO_CAST5=m
# CONFIG_CRYPTO_CAST6 is not set
# CONFIG_CRYPTO_TEA is not set
# CONFIG_CRYPTO_ARC4 is not set
# CONFIG_CRYPTO_KHAZAD is not set
# CONFIG_CRYPTO_ANUBIS is not set
CONFIG_CRYPTO_DEFLATE=m
# CONFIG_CRYPTO_MICHAEL_MIC is not set
# CONFIG_CRYPTO_CRC32C is not set
# CONFIG_CRYPTO_TEST is not set
#
# Library routines
#
CONFIG_CRC_CCITT=m
CONFIG_CRC32=y
# CONFIG_LIBCRC32C is not set
CONFIG_ZLIB_INFLATE=y
CONFIG_ZLIB_DEFLATE=m
CONFIG_GENERIC_HARDIRQS=y
CONFIG_GENERIC_IRQ_PROBE=y
CONFIG_X86_SMP=y
CONFIG_X86_HT=y
CONFIG_X86_BIOS_REBOOT=y
CONFIG_X86_TRAMPOLINE=y
CONFIG_PC=y
>i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
>downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
Just did a build with -V0.7.30-2 and was about to start testing when
the system locked up (no keyboard response, display frozen, etc.). No
response to Alt-SysRq keys. No messages on the serial console (other than
those showing a normal boot / telinit 5). Kernel was PREEMPT_RT plus a
patch to profile on NMI, not timer (been using this latter one for some
time). Basically same .config as previously provided but can send if
needed. Boot parameters included serial console, profile=2, nmi_watchdog.
Will retry shortly, but the steps leading to the failure were:
- boot single user
- telinit 5
- su'd 3 times
- created directories to log data / moved some files around
- set IRQ threads, ksoftirqd/[01], events/[01] to RT fifo 99 priority
- started two monitoring scripts (looking at latency & profile data)
- cat /proc/sys/kernel/preempt_wakeup_timing (was 1)
- echo 0 > /proc/sys/kernel/preempt_wakeup_timing [entered, but display
was frozen at this point and did not see newline nor any further output]
--Mark H Johnson
<mailto:[email protected]>
On Mon, 22 Nov 2004, wrote:
> >i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> >downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
I'm seeing something very odd. It's against 29-0. I also seem to recall
seeing something similar reported earlier.
I'm seeing pauses on my system. Not certain what is causing it. Hitting a
key on the keyboard unsticks it.
I don't know if everything stops. I do know that the runtime displays (remote
xterm displaying continuous data, wmnet) stop updating, as does a simple shell
script loop (while date; do sleep 1; done).
Last Saturday night, I started a large download. When I got back to check on
it Sunday morning, the clock on the machine said 1:15am, while it was actually 11am.
I am using default RT priorities on the irq threads.
gradall:/home.local/adam# cat /etc/sysctl.conf
kernel/trace_verbose=1
kernel/preempt_max_latency=0
config is attached.
Ingo Molnar wrote:
> * Eran Mann <[email protected]> wrote:
>
>>Right on.
>>After hdparm -d0 I see maximum latency of 35 us after a full kernel
>>build with a few GUI apps in the background. I'll try to find a
>>reasonable compromise.
>
> it might make sense to report this to the hw vendor as well, as these
> latencies dont occur at _every_ IDE DMA, it might be some sort of
> chipset (or BIOS) bug they might want to see resolved as well (if this
> isnt a ship-and-forget vendor). 2 msec stalls are not nice to a fair
> number of applications.
>
> Ingo
It's a rather old white-box machine with a noname VIA-based motherboard,
so I don't really have anyone to report the problem to. On the other hand,
with udma2 I see latencies < 170 us, which seems reasonable.
Thanks for the advice.
Eran.
* Ingo Molnar <[email protected]> wrote:
> * [email protected] <[email protected]> wrote:
>
> > - echo 0 > /proc/sys/kernel/preempt_wakeup_timing [entered, but
> > display was frozen at this point and did not see newline nor any
> > further output]
>
> managed to reproduce this on an SMP box but not on an UP box, so i
> think this is SMP related. It definitely happens almost immediately
> after preempt_wakeup_timing is reset - or after preempt_max_timing is
> reset. (Perhaps a dump_stack() from the wrong place, or something
> like that.)
The lockup was caused by the mutex wakeup being done under the PI lock,
and if a new critical-section latency is reported within try_to_wake_up
then the trylock done there deadlocked. The NMI watchdog triggered but
the printks done there deadlocked as well.
I fixed both the deadlock scenario, and made the NMI printout path more
robust to get the messages out to the console in even such a case.
The fixes are in the latest (-30-4) patch which can be found at the
usual place:
http://redhat.com/~mingo/realtime-preempt/
Ingo
* Adam Heath <[email protected]> wrote:
> > >i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> > >downloaded from the usual place:
> > >
> > > http://redhat.com/~mingo/realtime-preempt/
>
> I'm seeing something very odd. It's against 29-0. I also seem to
> recall seeing something similar reported earlier.
>
> I'm seeing pauses on my system. Not certain what is causing it.
> Hitting a key on the keyboard unsticks it.
at first sight this looks like a scheduling/wakeup anomaly. Please
re-report this if it happens with the current (30-4) kernel too. Also,
could you test the vanilla -mm tree, it has a few scheduler updates too.
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> OK. I tried 14 instances of jack_test. I even modded Florian's
> original source code, to let each client instance have 4 ins and 4
> outs, and to make things a little bit heavier, all 4 inputs are mixed
> into each of the 4 outputs.
>
> Saw at least a couple of XRUNs in a 20 (4*5) minute test-run. CPU load
> doesn't get above 30% on my laptop (P4/UP 2.533Ghz).
the max CPU load i get here is 46% (your laptop is faster), but no
xruns. The result of a 5-minute run is:
************* SUMMARY RESULT ****************
Timeout Count . . . . . . . . :( 0)
XRUN Count . . . . . . . . . : 0
Delay Count (>spare time) . . : 0
Delay Count (>1000 usecs) . . : 0
Delay Maximum . . . . . . . . : 0 usecs
Cycle Maximum . . . . . . . . : 1016 usecs
Average DSP Load. . . . . . . : 46.4 %
Average CPU System Load . . . : 40.5 %
Average CPU User Load . . . . : 2.3 %
Average CPU Nice Load . . . . : 0.0 %
Average CPU I/O Wait Load . . : 0.0 %
Average CPU IRQ Load . . . . : 0.0 %
Average CPU Soft-IRQ Load . . : 0.0 %
Average Interrupt Rate . . . : 2374.1 /sec
Average Context-Switch Rate . : 19172.8 /sec
i suspect i need to activate some option/define in jackd to get some of
the more advanced stats such as delay-maximum?
the kernel i used was -30-6 and i used the snd-via82xx driver. (I had to
do -n3 instead of -n2 when starting up jackd - otherwise i'd get an
endless stream of very small xruns, apparently a via82xx driver bug?)
Ingo
* Ingo Molnar <[email protected]> wrote:
> the max CPU load i get here is 46% (your laptop is faster), but no
> xruns. The result of a 5-minute run is:
>
> ************* SUMMARY RESULT ****************
> Timeout Count . . . . . . . . :( 0)
> XRUN Count . . . . . . . . . : 0
> Delay Count (>spare time) . . : 0
> Delay Count (>1000 usecs) . . : 0
> Delay Maximum . . . . . . . . : 0 usecs
> Cycle Maximum . . . . . . . . : 1016 usecs
> Average DSP Load. . . . . . . : 46.4 %
> Average CPU System Load . . . : 40.5 %
> Average CPU User Load . . . . : 2.3 %
> Average CPU Nice Load . . . . : 0.0 %
> Average CPU I/O Wait Load . . : 0.0 %
> Average CPU IRQ Load . . . . : 0.0 %
> Average CPU Soft-IRQ Load . . : 0.0 %
> Average Interrupt Rate . . . : 2374.1 /sec
> Average Context-Switch Rate . : 19172.8 /sec
here are the settings i used:
JACKD_PRIO=20
JACKD_RATE=44100
JACKD_PERIOD=64
JACKD_SECONDS=300
CLIENT_MAX=14
jackd 0.99.0
Ingo
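For scale, the settings above imply a fixed time budget per processing cycle: 64 frames at 44100 Hz is roughly 1451 usecs, so the 1016 usec Cycle Maximum in the summary still fits inside one period. A minimal sketch of that arithmetic follows. The function names are illustrative and not taken from jackd, and reading Cycle Maximum as the longest single processing cycle is an assumption.

```c
#include <assert.h>
#include <stdio.h>

/* Length of one period in microseconds: frames / rate, scaled to usecs. */
double period_usecs(unsigned frames, unsigned rate_hz)
{
    assert(rate_hz > 0);
    return (double)frames * 1e6 / (double)rate_hz;
}

/* Total buffering in microseconds when the driver keeps nperiods periods. */
double buffer_usecs(unsigned frames, unsigned rate_hz, unsigned nperiods)
{
    return period_usecs(frames, rate_hz) * (double)nperiods;
}

/* Print the budget for JACKD_PERIOD=64, JACKD_RATE=44100 with -n2 and -n3. */
void print_budget(void)
{
    printf("period: %.1f usecs\n", period_usecs(64, 44100));
    printf("-n2:    %.1f usecs\n", buffer_usecs(64, 44100, 2));
    printf("-n3:    %.1f usecs\n", buffer_usecs(64, 44100, 3));
}
```

Calling print_budget() from a small main() shows how little slack a 64-frame period leaves: a single scheduling delay of a few hundred usecs is already a large share of it.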
> Saw at least a couple of XRUNs in a 20 (4*5) minute test-run. CPU load
> doesn't get above 30% on my laptop (P4/UP 2.533Ghz).
i'm wondering, do you get any xruns (or other bad behavior) if you use
the dummy ALSA driver for the latency test?
Ingo
another test on the same system, this time completely idle:
************* SUMMARY RESULT ****************
Timeout Count . . . . . . . . :( 0)
XRUN Count . . . . . . . . . : 0
Delay Count (>spare time) . . : 0
Delay Count (>1000 usecs) . . : 0
Delay Maximum . . . . . . . . : 0 usecs
Cycle Maximum . . . . . . . . : 724 usecs
Average DSP Load. . . . . . . : 32.1 %
Average CPU System Load . . . : 30.8 %
Average CPU User Load . . . . : 1.6 %
Average CPU Nice Load . . . . : 0.0 %
Average CPU I/O Wait Load . . : 0.0 %
Average CPU IRQ Load . . . . : 0.0 %
Average CPU Soft-IRQ Load . . : 0.0 %
Average Interrupt Rate . . . : 1669.0 /sec
Average Context-Switch Rate . : 16975.4 /sec
*********************************************
Ingo
* Ingo Molnar <[email protected]> wrote:
> another test on the same system, this time completely idle:
ah ... used the jack_test.cc client instead of your new jack_test.cpp.
Ingo
* Ingo Molnar <[email protected]> wrote:
>
> * Ingo Molnar <[email protected]> wrote:
>
> > another test on the same system, this time completely idle:
>
> ah ... used the jack_test.cc client instead of your new jack_test.cpp.
no big difference in the results though:
************* SUMMARY RESULT ****************
Timeout Count . . . . . . . . :( 0)
XRUN Count . . . . . . . . . : 0
Delay Count (>spare time) . . : 0
Delay Count (>1000 usecs) . . : 0
Delay Maximum . . . . . . . . : 0 usecs
Cycle Maximum . . . . . . . . : 582 usecs
Average DSP Load. . . . . . . : 30.7 %
Average CPU System Load . . . : 22.1 %
Average CPU User Load . . . . : 8.7 %
Average CPU Nice Load . . . . : 0.0 %
Average CPU I/O Wait Load . . : 0.1 %
Average CPU IRQ Load . . . . : 0.0 %
Average CPU Soft-IRQ Load . . : 0.0 %
Average Interrupt Rate . . . : 1676.5 /sec
Average Context-Switch Rate . : 17020.5 /sec
*********************************************
(i turned off some debugging options in the kernel, hence the small
latency improvement.)
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> OK. I tried 14 instances of jack_test. I even modded Florian's
> original source code, to let each client instance have 4 ins and 4
> outs, and to make things a little bit heavier, all 4 inputs are mixed
> into each of the 4 outputs.
i tried your new test-client, and i've got a generic question: should a
jack client be able to generate an xrun by any means other than
overloading jackd? E.g. i'm wondering about the following behavior: if i
start up jackd in the usual way:
jackd -R -P20 -dalsa -dhw:0 -r44100 -p64 -n2 -S -P
and then i start a single test-client (jack_test.cpp):
# ./jack_test
seconds to run: 60
client_new: jack_test-4215
port_register
set_process_callback
activate
running
and then if i now Ctrl-Z the Jack client, i get an immediate xrun
message from jackd:
**** alsa_pcm: xrun of at least 2.119 msecs
and when i 'fg' the client again then jackd sees a big delay:
**** alsa_pcm: xrun of at least 742.064 msecs
corresponding to the amount of time i waited between the Ctrl-Z and the
'fg'.
since the client runs as SCHED_OTHER, doesnt this mean that simple
delays between SCHED_OTHER tasks could cause xruns in jackd too? A
SCHED_OTHER task can be delayed indefinitely at any stage. So shouldnt
the test-clients have RT priority as well, to guarantee xrun-less jackd?
Ingo
> since the client runs as SCHED_OTHER, doesnt this mean that simple
> delays between SCHED_OTHER tasks could cause xruns in jackd too? A
> SCHED_OTHER task can be delayed indefinitely at any stage. So shouldnt
> the test-clients have RT priority as well, to guarantee xrun-less
> jackd?
if SCHED_OTHER is the goal, then the xruns you are seeing in the
fluidsynth test are likely the result of random fluctuations in
scheduling of SCHED_OTHER tasks.
The way to find out whether that's the main source of xruns would be the
following test: start each fluidsynth instance as "nice -19 fluidsynth
..." to renice it to +19. Nice +19 gives the smoothest timeslices to
CPU-bound tasks (such as fluidsynth), and should give the smallest
global fluctuations. (The system must be idle during the test of course,
a nice +19 task is easily preempted.)
Ingo
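The same renicing can be done from a small launcher instead of the nice command. The sketch below only covers the renice step; the helper name renice_self() and its -100 error value are made up for illustration, and the command to run afterwards is left to the caller.

```c
#include <assert.h>
#include <errno.h>
#include <sys/resource.h>
#include <sys/time.h>

/*
 * Set the calling process's nice value and return the value now in effect,
 * or -100 on error (valid nice values range from -20 to 19).
 */
int renice_self(int niceval)
{
    assert(niceval >= -20 && niceval <= 19);
    if (setpriority(PRIO_PROCESS, 0, niceval) != 0)
        return -100;
    errno = 0;                 /* getpriority() may legitimately return -1 */
    int now = getpriority(PRIO_PROCESS, 0);
    if (now == -1 && errno != 0)
        return -100;
    return now;
}
```

A launcher would call renice_self(19) and then exec the synth; the nice value is kept across exec, so the started program runs at +19 just as with "nice -19".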
On Tue, 23 Nov 2004 15:46:22 +0100
Ingo Molnar <[email protected]> wrote:
> i tried your new test-client, and i've got a generic question: should a
> jack client be able to generate an xrun by any means other than
> overloading jackd? E.g. i'm wondering about the following behavior: if i
> start up jackd in the usual way:
The process() callback in a jackd client is run in a thread created by
libjack. This thread is run with SCHED_FIFO and at the same priority (or
one lower it seems) as the jackd server. Thus a client can only cause an
xrun when it takes too long to return from its process() callback.
~$ ps -C jackd -cmL
PID LWP CLS PRI TTY TIME CMD
975 - - - ? 00:00:00 jackd
- 975 TS 19 - 00:00:00 -
- 976 TS 23 - 00:00:00 -
- 977 FF 110 - 00:00:00 -
- 978 FF 100 - 00:00:00 -
~$ ps -C jack_test -cmL
PID LWP CLS PRI TTY TIME CMD
988 - - - pts/1 00:00:00 jack_test
- 988 TS 20 - 00:00:00 -
- 989 FF 99 - 00:00:00 -
So when you ctrl-z out of jack_test you cause its process() thread to be
suspended, too, thus jackd cannot finish processing its graph.
flo
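The CLS column in the ps output above can also be read programmatically. The following is a rough sketch, not code from jackd or ps: it maps the three classic policies to the abbreviations ps prints, and the helper names cls_name() and cls_of() are invented for this example.

```c
#include <assert.h>
#include <sched.h>
#include <stddef.h>
#include <sys/types.h>

/* Map a scheduling policy to the abbreviation ps prints in its CLS column. */
const char *cls_name(int policy)
{
    switch (policy) {
    case SCHED_OTHER: return "TS";  /* ordinary time-sharing */
    case SCHED_FIFO:  return "FF";  /* real-time, first in first out */
    case SCHED_RR:    return "RR";  /* real-time, round robin */
    default:          return "?";   /* anything this sketch does not cover */
    }
}

/*
 * CLS abbreviation for the thread with the given ID (0 means the caller),
 * or NULL if the policy cannot be queried.
 */
const char *cls_of(pid_t tid)
{
    assert(tid >= 0);
    int policy = sched_getscheduler(tid);
    return policy < 0 ? NULL : cls_name(policy);
}
```

On Linux the LWP numbers shown by ps can be passed as tid, so cls_of(989) would be expected to give "FF" for the process() thread listed above and cls_of(988) "TS" for the main thread.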
Ingo Molnar wrote:
>
> the max CPU load i get here is 46% (your laptop is faster), but no
> xruns. The result of a 5-minute run is:
>
> ************* SUMMARY RESULT ****************
> Timeout Count . . . . . . . . :( 0)
> XRUN Count . . . . . . . . . : 0
> Delay Count (>spare time) . . : 0
> Delay Count (>1000 usecs) . . : 0
> Delay Maximum . . . . . . . . : 0 usecs
> Cycle Maximum . . . . . . . . : 1016 usecs
> Average DSP Load. . . . . . . : 46.4 %
> Average CPU System Load . . . : 40.5 %
> Average CPU User Load . . . . : 2.3 %
> Average CPU Nice Load . . . . : 0.0 %
> Average CPU I/O Wait Load . . : 0.0 %
> Average CPU IRQ Load . . . . : 0.0 %
> Average CPU Soft-IRQ Load . . : 0.0 %
> Average Interrupt Rate . . . : 2374.1 /sec
> Average Context-Switch Rate . : 19172.8 /sec
>
> i suspect i need to activate some option/define in jackd to get some of
> the more advanced stats such as delay-maximum?
>
Yes, there's an unofficial patch to jackd from Lee Revell. Without that
you don't get to read the maximum delay from jackd. Sorry. But if you have
the patience to rebuild jack, the minimal patch for just that is
attached.
Seeya.
--
rncbc aka Rui Nuno Capela
[email protected]
* Florian Schmidt <[email protected]> wrote:
> So when you ctrl-z out of jack_test you cause its process() thread to
> be suspended, too, thus jackd cannot finish processing its graph.
ok, that makes sense. So under what priority does most of the
CPU-intense processing run in the test client? SCHED_OTHER or
SCHED_FIFO?
Ingo
* Florian Schmidt <[email protected]> wrote:
> ~$ ps -C jack_test -cmL
> PID LWP CLS PRI TTY TIME CMD
> 988 - - - pts/1 00:00:00 jack_test
> - 988 TS 20 - 00:00:00 -
> - 989 FF 99 - 00:00:00 -
>
> So when you ctrl-z out of jack_test you cause its process() thread to
> be suspended, too, thus jackd cannot finish processing its graph.
so in theory any scheduling delay of PID 988 in the above setup (the
SCHED_OTHER task) should not be able to negatively influence jackd,
correct? In fact, does PID 988 do anything substantial in this
particular jack_test case?
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> Yes, there's an unofficial patch to jackd from Lee Revell. Without
> that you don't get to read the maximum delay from jackd. Sorry. But if
> you have the patience to rebuild jack, the minimal patch for just that
> is attached.
thx, it applied cleanly to jack-cvs. Here's the 5-minute idle-system
test again:
************* SUMMARY RESULT ****************
Timeout Count . . . . . . . . :( 0)
XRUN Count . . . . . . . . . : 0
Delay Count (>spare time) . . : 0
Delay Count (>1000 usecs) . . : 0
Delay Maximum . . . . . . . . : 28 usecs
Cycle Maximum . . . . . . . . : 414 usecs
Average DSP Load. . . . . . . : 20.6 %
Average CPU System Load . . . : 11.6 %
Average CPU User Load . . . . : 8.6 %
Average CPU Nice Load . . . . : 0.0 %
Average CPU I/O Wait Load . . : 0.0 %
Average CPU IRQ Load . . . . : 0.0 %
Average CPU Soft-IRQ Load . . : 0.0 %
Average Interrupt Rate . . . : 1671.7 /sec
Average Context-Switch Rate . : 17003.1 /sec
*********************************************
but i can reproduce xruns on another, much slower box, using just 3-4
jack_test clients. The xruns dont seem to be justified, they happen at
30-40% CPU utilization already.
Ingo
On Tue, 23 Nov 2004 16:21:26 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > ~$ ps -C jack_test -cmL
> > PID LWP CLS PRI TTY TIME CMD
> > 988 - - - pts/1 00:00:00 jack_test
> > - 988 TS 20 - 00:00:00 -
> > - 989 FF 99 - 00:00:00 -
> >
> > So when you ctrl-z out of jack_test you cause its process() thread to
> > be suspended, too, thus jackd cannot finish processing its graph.
>
> so in theory any scheduling delay of PID 988 in the above setup (the
> SCHED_OTHER task) should not be able to negatively influence jackd,
> correct?
correct
> In fact, does PID 988 do anything substantial in this particular
> jack_test case?
Well, it registers the client with jackd, sets up the ports, registers
the process() callback and then simply goes to sleep() for the desired
runtime of the program. All these are non RT ops and should never be
able to cause any xruns.
All the work is done by the process() callback which is called by
libjack in a SCHED_FIFO thread. The process() callback is called once
for each buffer that jackd processes.
I can't explain the detailed mechanism of how jackd wakes its clients
and communicates with them too well myself, so I'll leave this to Paul
Davis (CC'ed). Care to elaborate, Paul?
flo
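As a rough illustration of a thread set up the way Florian describes, the sketch below prepares pthread attributes for a SCHED_FIFO thread. It is not libjack's actual code: make_fifo_attr() is a hypothetical helper, and the priority passed in is whatever the caller chooses.

```c
#include <assert.h>
#include <errno.h>
#include <pthread.h>
#include <sched.h>

/*
 * Fill *attr so that a thread created with it runs SCHED_FIFO at `prio`.
 * Returns 0 on success or an errno value. Creating the thread itself
 * normally requires real-time scheduling privileges.
 */
int make_fifo_attr(pthread_attr_t *attr, int prio)
{
    struct sched_param sp = { .sched_priority = prio };
    int err;

    assert(attr != NULL);
    if (prio < sched_get_priority_min(SCHED_FIFO) ||
        prio > sched_get_priority_max(SCHED_FIFO))
        return EINVAL;

    if ((err = pthread_attr_init(attr)) != 0)
        return err;
    /* Without EXPLICIT_SCHED the new thread inherits the creator's policy. */
    if ((err = pthread_attr_setinheritsched(attr, PTHREAD_EXPLICIT_SCHED)) != 0 ||
        (err = pthread_attr_setschedpolicy(attr, SCHED_FIFO)) != 0 ||
        (err = pthread_attr_setschedparam(attr, &sp)) != 0) {
        pthread_attr_destroy(attr);
        return err;
    }
    return 0;
}
```

The resulting attr would be handed to pthread_create() for the callback thread, while the main thread stays SCHED_OTHER and can sleep without affecting the audio path.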
On Tue, 23 Nov 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > > >i have released the -V0.7.30-2 Real-Time Preemption patch, which can be
> > > >downloaded from the usual place:
> > > >
> > > > http://redhat.com/~mingo/realtime-preempt/
> >
> > I'm seeing something very odd. It's against 29-0. I also seem to
> > recall seeing something similar reported earlier.
> >
> > I'm seeing pauses on my system. Not certain what is causing it.
> > Hitting a key on the keyboard unsticks it.
>
> at first sight this looks like a scheduling/wakeup anomaly. Please
> re-report this if it happens with the current (30-4) kernel too. Also,
> could you test the vanilla -mm tree, it has a few scheduler updates too.
2.6.10-rc1-mm3 doesn't have the same problem. Didn't have a more recent mm
kernel available last night. Will compile one, and always keep it available.
No tracing using signals:
rtc latency histogram of {amlat/7215, 535020 samples}:
4 1
9 18916
10 125448
11 29422
12 41496
13 22222
14 15344
15 12256
16 10935
17 11157
18 11192
19 11456
20 12994
21 13832
22 14704
23 14330
24 12625
25 11837
26 11839
27 11206
28 10307
29 10057
30 10183
31 13115
32 11438
33 11897
34 11559
35 7817
36 5823
37 4660
38 3944
39 3106
40 2763
41 2424
42 1961
43 1729
44 1348
45 1154
46 984
47 801
48 682
49 537
50 467
51 345
52 288
53 214
54 182
55 162
56 131
57 117
58 135
59 118
60 111
61 120
62 118
63 89
64 103
65 95
66 92
67 85
68 62
69 45
70 61
71 39
72 29
73 40
74 28
75 25
76 25
77 18
78 18
79 11
80 13
81 9
82 13
83 10
84 11
85 5
86 2
87 4
88 2
89 3
90 2
91 5
92 2
93 1
94 2
95 4
96 3
97 2
98 1
99 3
103 1
104 1
108 2
118 1
7860 1
9999 43
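A histogram like the one above is easier to judge once the tail is counted separately from the bulk. The sketch below does that for a list of (latency, count) rows. The struct and function names are made up for this example, the unit is assumed to be microseconds, and the 9999 row looks like a catch-all overflow bucket although the report does not say so.

```c
#include <assert.h>
#include <stddef.h>

/* One histogram row: a latency value and how many samples hit it. */
struct bin {
    unsigned long latency;
    unsigned long count;
};

/* Total number of samples across all rows. */
unsigned long total_samples(const struct bin *h, size_t n)
{
    unsigned long sum = 0;

    assert(h != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        sum += h[i].count;
    return sum;
}

/* Number of samples whose latency is at or above `threshold`. */
unsigned long samples_at_or_above(const struct bin *h, size_t n,
                                  unsigned long threshold)
{
    unsigned long sum = 0;

    assert(h != NULL || n == 0);
    for (size_t i = 0; i < n; i++)
        if (h[i].latency >= threshold)
            sum += h[i].count;
    return sum;
}

/* The last six rows of the table above, copied verbatim. */
const struct bin tail_rows[] = {
    { 103, 1 }, { 104, 1 }, { 108, 2 }, { 118, 1 }, { 7860, 1 }, { 9999, 43 },
};
```

Applied to tail_rows with a threshold of 1000, samples_at_or_above() counts 44 samples, which are the outliers of interest next to the several hundred thousand samples in the 9 to 35 range.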
On Tue, 23 Nov 2004, Adam Heath wrote:
> 2.6.10-rc1-mm3 doesn't have the same problem. Didn't have a more recent mm
> kernel available last night. Will compile one, and always keep it available.
Running 30-9. I'll report any issues that come up.
Ingo Molnar wrote:
> i have released the -V0.7.30-9 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
A couple of observations that I would like to share:
One other thing regarding my last email. The numbers were generated
using -V0.7.30-3, not -9. Sorry.
kr
Just curious as to why you scale the interrupts from 49 down to 25. What
would be wrong with keeping all of them at 49 (or whatever)? Being
FIFO, no interrupt would preempt another. Why would you want the first
IRQs registered to have higher priority than (and thus preempt)
IRQs registered later?
--
Steven Rostedt
Senior Engineer
Kihon Technologies
* Rui Nuno Capela <[email protected]> wrote:
> Now, with the default workload (14 clients * 4 * 4 ports) I'm reaching
> 60% of CPU, and a "fair" number of XRUNs on my P4 2.533GHz laptop, against
> the on-board alsa driver (snd-ali5451), while under RT-V0.7.30-2.
it would be very interesting to see how the new -30-9 kernel performs
using your workload (both fluidsynth and jackd_test), whether your xruns
are impacted by the fifo fix, and/or whether there are any other large
xrun sources left.
Ingo
On Tue, 2004-11-23 at 16:22 -0500, Steven Rostedt wrote:
> Just curious as to why you scale the interrupts from 49 down to 25. What
> would be wrong with keeping all of them at 49 (or whatever)? Being
> FIFO, no interrupt would preempt another. Why would you want the first
> IRQs registered to have higher priority than (and thus preempt)
> IRQs registered later?
I raised this issue before. I agree that all interrupts should get the
same RT prio by default. Otherwise the default behavior is arbitrary.
Lee
On Tue, 2004-11-23 at 16:47 -0500, Lee Revell wrote:
> On Tue, 2004-11-23 at 16:22 -0500, Steven Rostedt wrote:
> > Just curious as to why you scale the interrupts from 49 down to 25. What
> > would be wrong with keeping all of them at 49 (or whatever)? Being
> > FIFO, no interrupt would preempt another. Why would you want the first
> > IRQs registered to have higher priority than (and thus preempt)
> > IRQs registered later?
>
> I raised this issue before. I agree that all interrupts should get the
> same RT prio by default. Otherwise the default behavior is arbitrary.
>
> Lee
I'll even add that the default behavior slows the system down with extra
scheduling switches. If IRQ 10 is preempted by IRQ 2 then there's an
extra switch to get back and finish IRQ 10. For every IRQ that comes
during a "lower" priority IRQ there's an extra switch needed. If the
IRQs really don't have an order of priority, then they should be the
same. In some cases you need to set IRQs at different priorities, but that
should be done by the user, rather than by the kernel giving the first IRQ
preference.
--
Steven Rostedt
Senior Engineer
Kihon Technologies
i have released the -V0.7.30-9 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes-only release.
most importantly it includes a JACK related latency fix. With Florian
Schmidt's great detective work we honed in on a big latency source
within JACK: the use of named pipes (fifos) on journalled filesystems.
This issue has been empirically identified before (and is mentioned in
the JACK howto) but has never been given high enough prominence. It
turns out that the atime updates done while read()ing or write()ing
named pipes cause the delays - they may under certain circumstances call
out into the journalling code. They may block even on non-journalled
filesystems.
To work around this issue, when PREEMPT_RT is enabled the -30-9 kernel
skips atime updates on named-fifos. (it's pretty pointless anyway.)
Alternative userspace workarounds are to put the fifos on tmpfs/ramfs,
or to mark the filesystem noatime,nodiratime.
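For the tmpfs/ramfs workaround it helps to confirm where a fifo directory actually lives. The sketch below checks the filesystem type of a directory with statfs(). The helper name on_memory_fs() is hypothetical, and which directory jackd uses for its fifos is not stated here, so the caller has to supply it.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/vfs.h>

/* Filesystem magic numbers as listed in <linux/magic.h>. */
#define MAGIC_TMPFS 0x01021994UL
#define MAGIC_RAMFS 0x858458f6UL

/*
 * Returns 1 if `dir` lives on tmpfs or ramfs, 0 if it lives on some other
 * filesystem, and -1 if statfs() fails (for example, the path is missing).
 */
int on_memory_fs(const char *dir)
{
    struct statfs sfs;

    assert(dir != NULL);
    if (statfs(dir, &sfs) != 0)
        return -1;
    unsigned long type = (unsigned long)sfs.f_type;
    return type == MAGIC_TMPFS || type == MAGIC_RAMFS;
}
```

A return value of 0 for the fifo directory means the fifos sit on a disk-backed filesystem, where the noatime,nodiratime mount options are the remaining userspace workaround.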
those experiencing xruns under JACK should definitely try the -30-9
kernel.
Changes since -V0.7.30-2:
- named fifo reads/writes are now atomic, whenever possible
- fixed pi_lock related SMP & CRITICAL_IRQSOFF_TIMING lockups, this
could resolve the lockups reported by Mark H. Johnson.
- fixed one more PI buglet: wake up the new owner _after_ restoring
the priority of the old owner.
- made the NMI oopser more robust - it should print out some message
in pretty much any locking scenario.
- added the blocker device used by Esben Nielsen's pi_test suite.
- added user-triggerable ALSA xrun tracing to the patch: if a
sound IO channel has xrun_debug enabled in /proc then
user_trace_stop() will be called before printing the xrun message,
and the current trace will be saved to /proc/latency_trace. This is a
'one-shot' tracing method for now. It can be activated via:
echo 1 > /proc/asound/card0/pcm0p/xrun_debug
echo 1 > /proc/sys/kernel/trace_user_triggered
echo 1 > /proc/sys/kernel/trace_freerunning
echo 0 > /proc/sys/kernel/preempt_max_latency
echo 0 > /proc/sys/kernel/preempt_thresh
echo 0 > /proc/sys/kernel/preempt_wakeup_timing
./gettimeofday 0 1
gettimeofday.c is attached below. The JACK fifo xrun source was found
via this tracing facility.
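Since the tracing is one-shot, it is probably worth copying the trace out of /proc before re-arming it. A small sketch, where the helper name and the default output file name are assumptions and only /proc/latency_trace comes from the description above:

```shell
# Copy the current trace to a regular file and print that file's name.
# $1 = trace source (defaults to /proc/latency_trace, as named above)
# $2 = output file (defaults to a timestamped name; an assumption)
save_latency_trace() {
    src=${1:-/proc/latency_trace}
    out=${2:-latency_trace.$(date +%Y%m%d-%H%M%S).txt}
    cat "$src" > "$out" && echo "$out"
}

# Example, after an xrun has been reported:
#   save_latency_trace
```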
to create a -V0.7.30-9 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm2/2.6.10-rc2-mm2.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm2-V0.7.30-9
Ingo
-- gettimeofday.c:
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <signal.h>
#include <sys/wait.h>
#include <linux/unistd.h>
int main (int argc, char **argv)
{
	if (argc != 3) {
		printf("usage: gettimeofday <val1> <val2>\n");
		exit(0);
	}
	gettimeofday(atol(argv[1]), atol(argv[2]));
	return 0;
}
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> Yes, there's a non-official patch to jackd from Lee Revell's. Without
>> that you don't get to read the maximum delay from jackd. Sorry. But if
>> you have the patience to rebuild jack, here comes attached the minimal
>> patch for just that.
>
> thx, it applied cleanly to jack-cvs. Here's the 5-minute idle-system test
> again:
>
> ************* SUMMARY RESULT ****************
> Timeout Count . . . . . . . . :( 0)
> XRUN Count . . . . . . . . . : 0
> Delay Count (>spare time) . . : 0
> Delay Count (>1000 usecs) . . : 0
> Delay Maximum . . . . . . . . : 28 usecs
> Cycle Maximum . . . . . . . . : 414 usecs
> Average DSP Load. . . . . . . : 20.6 %
> Average CPU System Load . . . : 11.6 %
> Average CPU User Load . . . . : 8.6 %
> Average CPU Nice Load . . . . : 0.0 %
> Average CPU I/O Wait Load . . : 0.0 %
> Average CPU IRQ Load . . . . : 0.0 %
> Average CPU Soft-IRQ Load . . : 0.0 %
> Average Interrupt Rate . . . : 1671.7 /sec
> Average Context-Switch Rate . : 17003.1 /sec
> *********************************************
>
> but i can reproduce xruns on another, much slower box, using just 3-4
> jack_test clients. The xruns dont seem to be justified, they happen at
> 30-40% CPU utilization already.
>
Now that you're liking it, here goes a more contained jackd test-suite (see
attached tarball, jackd_test3.1.tar.gz).
Just launch the provided shell script, from the command line as:
./jack_test3_run.sh [secs] [clients] [ports]
where:
secs - number of seconds to run jackd workload (default = 300)
clients - number of test-clients to run (default = 14)
ports - number of interface ports per client (default = 4)
As before, each client (jack_test3_client) registers the same number of
input and output ports (default is 4ins x 4outs), where each output is the
audio mix of all inputs.
Note that you can break the 14-client barrier, as that limit seems to be
related to the maximum number of client ports jackd can handle by default
(128). The jack_test3_run.sh script sets this jackd maximum port limit
as it sees fit, so any number of clients greater than 14 is
allowed, provided there's enough CPU and/or RAM ;)
Now, with the default workload (14 clients * 4 * 4 ports) I'm reaching 60%
of CPU, and a "fair" number of XRUNs on my [email protected] laptop, against the
on-board alsa driver (snd-ali5451), while under RT-V0.7.30-2.
Each test run produces a kernel-timestamped log filename with the complete
captured stdout/err. Consolidated results can be produced by feeding
several of those logfiles into the jack_test3_consolidated.awk script,
just like this:
cat *.log | awk -f jack_test3_consolidated.awk
Enjoy.
--
rncbc aka Rui Nuno Capela
[email protected]
On Tue, 23 Nov 2004 18:58:23 +0100
Ingo Molnar <[email protected]> wrote:
>
> - added user-triggerable ALSA xrun tracing to the patch: if a
> sound IO channel has xrun_debug enabled in /proc then
> user_trace_stop() will be called before printing the xrun message,
> and the current trace will be saved to /proc/latency_trace. This is a
> 'one-shot' tracing method for now. It can be activated via:
>
> echo 1 > /proc/asound/card0/pcm0p/xrun_debug
>
> echo 1 > /proc/sys/kernel/trace_user_triggered
> echo 1 > /proc/sys/kernel/trace_freerunning
> echo 0 > /proc/sys/kernel/preempt_max_latency
> echo 0 > /proc/sys/kernel/preempt_thresh
> echo 0 > /proc/sys/kernel/preempt_wakeup_timing
>
> ./gettimeofday 0 1
>
> gettimeofday.c is attached below. The JACK fifo xrun source was found
> via this tracing facility.
Hi, i have some problem with unresolved symbols loading my alsa sound
card driver with this kernel version. At first i suspected an unclean
build, but then i did make clean bzImage modules and the unresolved
symbols persist (i have wakeup/nonpreemptible/interrupts-off tracing
enabled (see .config)):
snd_pcm: Unknown symbol user_trace_stop
snd_ac97_codec: Unknown symbol snd_interval_refine
snd_ac97_codec: Unknown symbol snd_pcm_hw_rule_add
snd_cs46xx: Unknown symbol snd_ac97_write_cache
snd_cs46xx: Unknown symbol snd_pcm_new
snd_cs46xx: Unknown symbol snd_pcm_lib_preallocate_pages_for_all
snd_cs46xx: Unknown symbol snd_pcm_format_unsigned
snd_cs46xx: Unknown symbol snd_pcm_format_physical_width
snd_cs46xx: Unknown symbol snd_ac97_mixer
snd_cs46xx: Unknown symbol snd_ac97_bus
snd_cs46xx: Unknown symbol snd_pcm_lib_malloc_pages
snd_cs46xx: Unknown symbol snd_pcm_lib_ioctl
snd_cs46xx: Unknown symbol snd_pcm_lib_free_pages
snd_cs46xx: Unknown symbol snd_pcm_set_ops
snd_cs46xx: Unknown symbol snd_pcm_hw_constraint_list
snd_cs46xx: Unknown symbol snd_pcm_format_big_endian
snd_cs46xx: Unknown symbol snd_pcm_lib_preallocate_free_for_all
snd_cs46xx: Unknown symbol snd_pcm_period_elapsed
snd_cs46xx: Unknown symbol snd_ac97_write
snd_cs46xx: Unknown symbol snd_ac97_read
snd_cs46xx: Unknown symbol snd_pcm_format_width
.config attached
flo
On Tue, 2004-11-23 at 18:58 +0100, Ingo Molnar wrote:
> i have released the -V0.7.30-9 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
I have noticed some weird interactivity issues with this. These are
also present in T3.
The symptom is that CPU bound tasks like kernel compiles will starve I/O
bound tasks like evolution for a _long_ time. If I have a kernel build
and external modules building at the same time and Evolution goes to
"Update message list...", it can sit and spin with a blank message pane
for a minute or two. If I suspend the builds, the message list renders
immediately.
It seems like the build process is constantly preempting the Evolution
process, preventing the latter from making much progress. The build on
the other hand progresses fine.
AIUI I/O bound, interactive tasks like a mail client should get
scheduled in preference to CPU bound tasks like builds. The scheduler
has heuristics to distinguish the two types of tasks and boosts the
dynamic priority of the former, right? It seems like exactly the
opposite is happening.
Another possibility is that Evolution really DOES use so many cycles to
generate the message list that it looks like a CPU bound process to the
kernel. Unfortunately I think this seems most likely. For example,
evolution still consumes a hell of a lot of CPU at a low nice value. It
just makes other tasks stall this way.
Do I need to just find a better mail client?
Lee
* Florian Schmidt <[email protected]> wrote:
> Hi, i have some problem with unresolved symbols loading my alsa sound
> card driver with this kernel version. At first i suspected an unclean
> build, but then i did make clean bzImage modules and the unresolved
> symbols persist (i have wakeup/nonpreemptible/interrupts-off tracing
> enabled (see .config)):
>
> snd_pcm: Unknown symbol user_trace_stop
does adding this line to kernel/latency.c resolve it?:
EXPORT_SYMBOL(user_trace_stop);
Ingo
* Lee Revell <[email protected]> wrote:
> On Tue, 2004-11-23 at 16:22 -0500, Steven Rostedt wrote:
> > Just curious to why you scale the interrupts from 49 down to 25. What
> > would be wrong with keeping all of them at 49 (or whatever). Being a
> > FIFO, no interrupt would preempt another. Why would you want the first
> > IRQs to be registered have higher priority than (and thus will preempt)
> > irqs registered later.
>
> I raised this issue before. I agree that all interrupts should get
> the same RT prio by default. Otherwise the default behavior is
> arbitrary.
i agree that it's arbitrary. There are two reasons for the ordering:
1) _usually_ the IRQs that get registered first are the 'more important'
ones. E.g. timer and keyboard interrupts will preempt the IDE
interrupt. This is in no way a generic thing though.
2) testing: if all IRQs are at the same priority level then a lot less
inter-IRQ preemption occurs, and testing coverage is lower. With all
irqs on different levels the bugs will trigger sooner.
To solve this cleanly some userspace policy code is needed that would
take some settings (e.g. sound_highprio) through which the priority
setup could be configured. It's not a simple task as that code would
have to discover the type of devices that are in the system and their
irqs - possibly a component of udev could do this?
Ingo
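Until such policy code exists, the priorities can be evened out by hand from userspace. A minimal sketch, assuming the IRQ threads show up in ps with a command name starting with "IRQ" (an assumption about thread naming, check your own ps output) and that the chrt utility is installed. The helper name is made up, and it only prints the chrt commands so they can be reviewed before being run as root:

```shell
# Read "PID COMMAND" lines on stdin and print one chrt command per IRQ
# thread, setting SCHED_FIFO at the given priority.
# Assumption: IRQ thread command names start with "IRQ".
irq_prio_cmds() {
    prio=$1
    awk -v prio="$prio" '$2 ~ /^IRQ/ { printf "chrt -f -p %s %s\n", prio, $1 }'
}

# Example: put every IRQ thread at SCHED_FIFO priority 49.
#   ps -eo pid,comm | irq_prio_cmds 49
```

Printing instead of executing is deliberate: a wrong match on a thread name would otherwise hand RT priority to an unrelated task.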
* Lee Revell <[email protected]> wrote:
> On Tue, 2004-11-23 at 18:58 +0100, Ingo Molnar wrote:
> > i have released the -V0.7.30-9 Real-Time Preemption patch, which can be
> > downloaded from the usual place:
> >
> > http://redhat.com/~mingo/realtime-preempt/
>
> I have noticed some weird interactivity issues with this. These are
> also present in T3.
>
> The symptom is that CPU bound tasks like kernel compiles will starve
> I/O bound tasks like evolution for a _long_ time. If I have a kernel
> build and external modules building at the same time and Evolution
> goes to "Update message list...", it can sit and spin with a blank
> message pane for a minute or two. If I suspend the builds, the
> message list renders immediately.
could you try the vanilla -rc2-mm2 kernel (with PREEMPT enabled), does
it behave in such a way too? At first sight this could be a property of
the upstream scheduler, but maybe it's special to PREEMPT_RT.
Ingo
On Wed, 24 Nov 2004 04:19:08 +0100
Ingo Molnar <[email protected]> wrote:
>
> * Florian Schmidt <[email protected]> wrote:
>
> > Hi, i have some problem with unresolved symbols loading my alsa sound
> > card driver with this kernel version. At first i suspected an unclean
> > build, but then i did make clean bzImage modules and the unresolved
> > symbols persist (i have wakeup/nonpreemptible/interrupts-off tracing
> > enabled (see .config)):
> >
> > snd_pcm: Unknown symbol user_trace_stop
>
> does adding this line to kernel/latency.c resolve it?:
>
> EXPORT_SYMBOL(user_trace_stop);
yes, modules load fine now. thanks.
flo
* Adam Heath <[email protected]> wrote:
> > > I'm seeing something very odd. It's against 29-0. I also seem to
> > > recall seeing something similar reported earlier.
> > >
> > > I'm seeing pauses on my system. Not certain what is causing it.
> > > Hitting a key on the keyboard unsticks it.
> >
> > at first sight this looks like a scheduling/wakeup anomaly. Please
> > re-report this if it happens with the current (30-4) kernel too. Also,
> > could you test the vanilla -mm tree, it has a few scheduler updates too.
>
> 2.6.10-rc1-mm3 doesn't have the same problem. Didn't have a more
> recent mm kernel available last night. Will compile one, and always
> keep it available.
-rc2-mm2 would be nice to test - there are a number of new interactivity
fixes from Con being test-driven in -mm right now. In particular, these
patches were added in -rc1-mm4. These are the patches in question:
sched-adjust_timeslice_granularity.patch
requeue_granularity.patch
sched-remove_interactive_credit.patch
you can download them individually from:
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm2/broken-out/
so if these symptoms still occur with vanilla -rc2-mm2, could you try to
unapply them, in reverse order? (there might be rejects when you try
that, due to patch dependencies - let me know if it doesnt work out and
i'll do an undo patch.)
Ingo
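The reverse-order unapply suggested above can be sketched as follows. This assumes the three patches were applied in the order they are listed (an assumption; the series file of the -mm release is the authority), that they sit in the top of the kernel tree, and that they apply with -p1. The helper name is made up and it only prints the commands:

```shell
# Print "patch -R" commands for the given patches in reverse order,
# so the patch applied last is reverted first.
reverse_unapply_cmds() {
    cmds=""
    for p in "$@"; do
        # Prepend, so the last patch given ends up first in the output.
        cmds="patch -R -p1 < $p
$cmds"
    done
    printf '%s' "$cmds"
}

reverse_unapply_cmds \
    sched-adjust_timeslice_granularity.patch \
    requeue_granularity.patch \
    sched-remove_interactive_credit.patch
```

Running each printed command with GNU patch's --dry-run option first shows any rejects without touching the tree.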
i have released the -V0.7.30-10 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a fixes-only release.
the most important fixes are the ones to the priority inheritance logic
(affecting the latency of RT tasks), discovered and reported by Esben
Nielsen. I also found two more PI bugs running the new pi_test2 code
from Esben.
Changes since -V0.7.30-9:
- PI fixes:
- the waiter->prio field caused wrong priority settings upon unlock,
resulting in PI bugs in the nested-locking case.
- use rt_task() when determining PI tasks, not p->policy.
- in the blocking-on-blocked-task nesting case both promote now-RT
tasks to the pi_waiters list and queue them to the head of the wait
list, and demote now-non-RT tasks from the pi_waiters list and
queue them to the tail of the wait list.
- PI-debugging blocker device update from Esben Nielsen
- module build fix: export user_trace_stop symbol, this fixes the error
reported by Florian Schmidt
- tracer fix: in the default !freerunning tracing mode, if the trace
buffer overflows (this is relatively rare, but can happen) then the
tracer overwrote kernel memory, leading to lockups/kernel crashes.
Maybe this bug was also the source of the truncated trace bug
reported by Mark H. Johnson?
- reduce tracing overhead within schedule() when !tracing_enabled.
to create a -V0.7.30-10 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm2/2.6.10-rc2-mm2.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm2-V0.7.30-10
Ingo
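The patching order above can be turned into commands along these lines. This is a sketch under assumptions: the four files have already been downloaded into the current directory, the tarball unpacks into a directory named after it, and every patch applies with -p1. The helper name is made up and it only prints the commands, oldest patch first:

```shell
# $1 = kernel tarball, remaining arguments = patches in apply order.
# Compressed patches are piped through bzcat, plain ones are read directly.
patch_order_cmds() {
    tarball=$1
    shift
    echo "tar xjf $tarball"
    echo "cd ${tarball%.tar.bz2}"
    for p in "$@"; do
        case $p in
            *.bz2) echo "bzcat ../$p | patch -p1" ;;
            *)     echo "patch -p1 < ../$p" ;;
        esac
    done
}

patch_order_cmds linux-2.6.9.tar.bz2 \
    patch-2.6.10-rc2.bz2 \
    2.6.10-rc2-mm2.bz2 \
    realtime-preempt-2.6.10-rc2-mm2-V0.7.30-10
```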
i have released the -V0.7.31-0 Real-Time Preemption patch, which can be
downloaded from the usual place:
http://redhat.com/~mingo/realtime-preempt/
this is a merge of the -30-10 patch to 2.6.10-rc2-mm3. There are no
other changes.
to create a -V0.7.31-0 tree from scratch, the patching order is:
http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm3/2.6.10-rc2-mm3.bz2
http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm3-V0.7.31-0
Ingo
* Lee Revell <[email protected]> wrote:
> > i only tried the !PREEMPT version though - does that one work for you?
>
> Yup, !PREEMPT works fine. Testing PREEMPT next. So far only
> PREEMPT_VOLUNTARY fails to boot.
found a bug that causes !PREEMPT boot failures. The seqlock type was
wrong for the !RT case, resulting in a subtle bug:
write_seqlock_irqsave() didnt actually disable interrupts. This results
in a deadlock scenario where the timer interrupt interrupts
update_times (which runs in softirq context). I've uploaded the -31-1
patch with this fix included to the usual place:
http://redhat.com/~mingo/realtime-preempt/
i'm cycling through the various options, but it's looking good so far,
PREEMPT_NONE, PREEMPT_VOLUNTARY and PREEMPT_DESKTOP all booted up fine.
Ingo
* Ingo Molnar <[email protected]> wrote:
> > Yup, !PREEMPT works fine. Testing PREEMPT next. So far only
> > PREEMPT_VOLUNTARY fails to boot.
>
> found a bug that causes !PREEMPT boot failures. The seqlock type was
> wrong for the !RT case, resulting in a subtle bug:
> write_seqlock_irqsave() didnt actually disable interrupts. This
> results in a deadlock scenario where the timer interrupt interrupts
> update_times (which runs in softirq context). I've uploaded the -31-1
> patch with this fix included to the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> i'm cycling through the various options, but it's looking good so far,
> PREEMPT_NONE, PREEMPT_VOLUNTARY and PREEMPT_DESKTOP all booted up
> fine.
all variations i tried booted up fine. While fixing the non-RT cases a
bug slipped into the PREEMPT_RT branch of seqlock.h though, i fixed that
in the -31-2 patch i just uploaded.
Ingo
On Wed, 2004-11-24 at 04:45 +0100, Ingo Molnar wrote:
> * Lee Revell <[email protected]> wrote:
> >
> > The symptom is that CPU bound tasks like kernel compiles will starve
> > I/O bound tasks like evolution for a _long_ time. If I have a kernel
> > build and external modules building at the same time and Evolution
> > goes to "Update message list...", it can sit and spin with a blank
> > message pane for a minute or two. If I suspend the builds, the
> > message list renders immediately.
>
> could you try the vanilla -rc2-mm2 kernel (with PREEMPT enabled), does
> it behave in such a way too? At first sight this could be a property of
> the upstream scheduler, but maybe it's special to PREEMPT_RT.
>
Have you noticed this behavior with other interactive (I/O) tasks, such
as bash? Evolution is quite a big utility, and might be doing something
in the background. If you see the same behavior with bash then there is
no doubt that the compile is slowing down an I/O intensive task.
Another variable can be memory. Are you running this on something with
adequate memory, or is your hard drive churning like mad and you're
constantly thrashing the swap space?
-- Steve
On Wed, 2004-11-24 at 08:33 -0500, Steven Rostedt wrote:
> On Wed, 2004-11-24 at 04:45 +0100, Ingo Molnar wrote:
> > * Lee Revell <[email protected]> wrote:
>
> > >
> > > The symptom is that CPU bound tasks like kernel compiles will starve
> > > I/O bound tasks like evolution for a _long_ time. If I have a kernel
> > > build and external modules building at the same time and Evolution
> > > goes to "Update message list...", it can sit and spin with a blank
> > > message pane for a minute or two. If I suspend the builds, the
> > > message list renders immediately.
> >
> > could you try the vanilla -rc2-mm2 kernel (with PREEMPT enabled), does
> > it behave in such a way too? At first sight this could be a property of
> > the upstream scheduler, but maybe it's special to PREEMPT_RT.
> >
>
> Have you noticed this behavior with other interactive (I/O) tasks, such
> as bash? Evolution is quite a big utility, and might be doing something
> in the background. If you see the same behavior with bash then there is
> no doubt that the compile is slowing down an I/O intensive task.
>
No. Only evolution (2.0) exhibits the problem. But, it looks like
evolution uses a comparable amount of CPU to a kernel build just
updating the message list. All stracing it shows me is that it spends a
hell of a lot of time polling(). I think this might be a bloat issue.
> Another variable can be memory. Are you running this on something with
> adequate memory, or is your hard drive churning like mad and you're
> constantly thrashing the swap space?
>
No, I have plenty of RAM (512M). I am using a 600MHz C3 so the system
is probably CPU bound. But, it seems like evolution should make a
little more progress. I often find myself having to background all
build processes for a few seconds to let the message list render. Once
the list renders, and I resume the builds, evolution is more or less
usable. Running gtk-gnutella in the background will also make evolution
horribly slow.
Running the offending, CPU bound processes at a high nice value solves
the problem. But now I am wasting half my fscking cycles "Updating
message list...". Grr.
Lee
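The nice-value workaround mentioned above can be applied when the CPU-bound job is started. A tiny sketch; the helper name is made up and 19 is simply the highest (least favourable) nice value:

```shell
# Run a command at nice 19 so interactive tasks are preferred over it.
run_niced() {
    nice -n 19 "$@"
}

# Example:
#   run_niced make bzImage modules
# The nice value of an already running process can be raised with
# renice, e.g.:
#   renice -n 19 -p <pid>
```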
On Wednesday 24 November 2004 06:27, Ingo Molnar wrote:
>i have released the -V0.7.31-0 Real-Time Preemption patch, which can
> be downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
>this is a merge of the -30-10 patch to 2.6.10-rc2-mm3. There are no
>other changes.
>
>to create a -V0.7.31-0 tree from scratch, the patching order is:
Except by the time I got there, it was 31-3.
> http://kernel.org/pub/linux/kernel/v2.6/linux-2.6.9.tar.bz2
> http://kernel.org/pub/linux/kernel/v2.6/testing/patch-2.6.10-rc2.bz2
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm3/2.6.10-rc2-mm3.bz2
> http://redhat.com/~mingo/realtime-preempt/realtime-preempt-2.6.10-rc2-mm3-V0.7.31-0
>
> Ingo
It built relatively uneventfully, set at 4 for the preempt behaviour,
and with a couple of minor additions to the dmesg, booted about the
same.
The keyboard actually feels snappier in kmail as it tends to hang
pretty bad while spamassassin is running. So far, everything I've
tried has just worked, which is a long ways from the last time I
tried it. Ingo, congratulations & a tip of the hat to you. It's come a
heck of a long ways in the last 3 weeks.
Now, since this is based on the rc2-mm3 tree, I note that I've got all
the irq sharing back in place that was fixed for the most part in
rc2-plain with the kernel argument 'acpi_skip_timer_override', which
did not seem to be effective for later than 2.6.10-rc2 builds.
I'm also running the cfq scheduler as that also seems to level the cpu
hogs quite a bit here, although it did slow down the amanda run last
night (not on this kernel, but -rc2-bk8) by an hour or so since gzip
was pretty badly starved by cfq. That doesn't hurt my feelings as
the machine now remains usable while amanda is running. Running the
anticipatory scheduler, amanda made this 2800XP a real sickly dog.
What else can I report on that you would like to see? Take a look at
my attached .config and shoot me a list.
Gawd this is smooth!
--
Cheers, Gene
"There are four boxes to be used in defense of liberty:
soap, ballot, jury, and ammo. Please use in that order."
-Ed Howdershelt (Author)
99.29% setiathome rank, not too shabby for a WV hillbilly
Yahoo.com attorneys please note, additions to this message
by Gene Heskett are:
Copyright 2004 by Maurice Eugene Heskett, all rights reserved.
On Tue, 2004-11-23 at 09:58, Ingo Molnar wrote:
> i have released the -V0.7.30-9 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a fixes-only release.
Using PREEMPT_DESKTOP I see an irq related problem with my network
interface:
cat /proc/interrupts
CPU0
0: 134209 IO-APIC-edge timer 0/34148
1: 221 IO-APIC-edge i8042 0/221
8: 1 IO-APIC-edge rtc 0/1
9: 0 IO-APIC-level acpi 0/0
11: 0 IO-APIC-edge radeon@PCI:1:0:0 0/0
12: 2215 IO-APIC-edge i8042 0/2215
14: 650 IO-APIC-edge ide0 2/648
16: 100000 IO-APIC-level eth0 0/0
17: 59 IO-APIC-level libata, libata 0/59
18: 0 IO-APIC-level ICE1712 0/0
20: 15160 IO-APIC-level libata 0/15160
21: 0 IO-APIC-level ehci_hcd, uhci_hcd, uhci_hcd, uhci_hcd, uhci_hcd 0/0
NMI: 0
LOC: 134034
ERR: 0
MIS: 0
relevant portion of dmesg:
ip_tables: (C) 2000-2002 Netfilter core team
ip_conntrack version 2.1 (8192 buckets, 65536 max) - 304 bytes per conntrack
r8169 Gigabit Ethernet driver 1.6LK loaded
ACPI: PCI interrupt 0000:00:0b.0[A] -> GSI 16 (level, low) -> IRQ 16
divert: allocating divert_blk for eth0
eth0: Identified chip type is 'RTL8169s/8110s'.
eth0: RTL8169 at 0xf88b6f00, 00:0c:76:b3:c2:43, IRQ 16
IRQ#16 thread RT prio: 40.
hm: ioapic cache empty for irq 16 (e:00000000/d:00010000) 0001a9c1
r8169: eth0: link up
Bluetooth: Core ver 2.7
NET: Registered protocol family 31
Bluetooth: HCI device and connection manager initialized
Bluetooth: HCI socket layer initialized
Bluetooth: L2CAP ver 2.5
Bluetooth: L2CAP socket layer initialized
Bluetooth: RFCOMM ver 1.3
Bluetooth: RFCOMM socket layer initialized
Bluetooth: RFCOMM TTY layer initialized
IRQ#4 thread RT prio: 39.
parport_pc: Ignoring new-style parameters in presence of obsolete ones
parport0: PC-style at 0x378 (0x778) [PCSPP,TRISTATE,EPP]
parport0: irq 7 detected
lp0: using parport0 (polling).
lp0: console ready
NET: Registered protocol family 10
Disabled Privacy Extensions on device c03d7e80(lo)
IPv6 over IPv4 tunneling driver
divert: not allocating divert_blk for non-ethernet device sit0
eth0: no IPv6 routers present
[drm] Initialized radeon 1.11.0 20020828 on minor 0:
ACPI: PCI interrupt 0000:01:00.0[A] -> GSI 16 (level, low) -> IRQ 16
agpgart: Found an AGP 3.0 compliant device at 0000:00:00.0.
agpgart: Putting AGP V3 device at 0000:00:00.0 into 4x mode
agpgart: Putting AGP V3 device at 0000:01:00.0 into 4x mode
[drm] Loading R200 Microcode
IRQ#11 thread RT prio: 38.
irq 16: nobody cared!
[<c0104173>] dump_stack+0x23/0x30 (20)
[<c0147970>] __report_bad_irq+0x30/0xa0 (24)
[<c0147a80>] note_interrupt+0x70/0xb0 (32)
[<c01477dc>] do_hardirq+0x13c/0x150 (40)
[<c0147889>] do_irqd+0x99/0xd0 (32)
[<c0139fda>] kthread+0xaa/0xb0 (48)
[<c0101335>] kernel_thread_helper+0x5/0x10 (153083924)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c014777a>] .... do_hardirq+0xda/0x150
.....[<c0147889>] .. ( <= do_irqd+0x99/0xd0)
.. [<c013c98d>] .... print_traces+0x1d/0x60
.....[<c0104173>] .. ( <= dump_stack+0x23/0x30)
handlers:
[<f892ce60>] (rtl8169_interrupt+0x0/0x140 [r8169])
Disabling IRQ #16
ACPI: PCI interrupt 0000:00:07.0[A] -> GSI 18 (level, low) -> IRQ 18
IRQ#18 thread RT prio: 37.
hm: ioapic cache empty for irq 18 (e:00000000/d:00010000) 0001a9c9
output of lspci -v for the card:
00:0b.0 Ethernet controller: Realtek Semiconductor Co., Ltd. RTL-8169 Gigabit Ethernet (rev 10)
Subsystem: Micro-Star International Co., Ltd.: Unknown device 702c
Flags: bus master, 66Mhz, medium devsel, latency 32, IRQ 16
I/O ports at d000 [size=cffa0000]
Memory at cfffef00 (32-bit, non-prefetchable) [size=256]
Expansion ROM at 00020000 [disabled]
Capabilities: [dc] Power Management version 2
-- Fernando
* Fernando Lopez-Lezcano <[email protected]> wrote:
> Using PREEMPT_DESKTOP I see an irq related problem with my network
> interface:
> IRQ#11 thread RT prio: 38.
> irq 16: nobody cared!
> [<c0104173>] dump_stack+0x23/0x30 (20)
> [<c0147970>] __report_bad_irq+0x30/0xa0 (24)
> [<c0147a80>] note_interrupt+0x70/0xb0 (32)
> [<c01477dc>] do_hardirq+0x13c/0x150 (40)
> [<c0147889>] do_irqd+0x99/0xd0 (32)
> [<c0139fda>] kthread+0xaa/0xb0 (48)
> [<c0101335>] kernel_thread_helper+0x5/0x10 (153083924)
does it otherwise get detected and does it work fine afterwards?
Ingo
On Wed, 2004-11-24 at 14:17, Ingo Molnar wrote:
> * Fernando Lopez-Lezcano <[email protected]> wrote:
>
> > Using PREEMPT_DESKTOP I see an irq related problem with my network
> > interface:
(I don't think this is only with PREEMPT_DESKTOP, on a previous kernel
_RT was giving the same error)
> > IRQ#11 thread RT prio: 38.
> > irq 16: nobody cared!
> > [<c0104173>] dump_stack+0x23/0x30 (20)
> > [<c0147970>] __report_bad_irq+0x30/0xa0 (24)
> > [<c0147a80>] note_interrupt+0x70/0xb0 (32)
> > [<c01477dc>] do_hardirq+0x13c/0x150 (40)
> > [<c0147889>] do_irqd+0x99/0xd0 (32)
> > [<c0139fda>] kthread+0xaa/0xb0 (48)
> > [<c0101335>] kernel_thread_helper+0x5/0x10 (153083924)
>
> does it otherwise get detected and does it work fine afterwards?
Any network-related activity hangs (for example, trying to ping).
Otherwise the machine seems to work.
Same kernel on my laptop loads the network driver with no problems
(different driver, e100), but has problems with the usb uhci-hcd driver.
This is when the driver is loaded:
USB Universal Host Controller Interface driver v2.2
ACPI: PCI interrupt 0000:00:1d.0[A] -> GSI 9 (level, low) -> IRQ 9
uhci_hcd 0000:00:1d.0: UHCI Host Controller
PCI: Setting latency timer of device 0000:00:1d.0 to 64
uhci_hcd 0000:00:1d.0: irq 9, io base 0x1800
uhci_hcd 0000:00:1d.0: new USB bus registered, assigned bus number 1
hub 1-0:1.0: USB hub found
hub 1-0:1.0: 2 ports detected
ACPI: PCI Interrupt Link [LNKD] enabled at IRQ 9
ACPI: PCI interrupt 0000:00:1d.1[B] -> GSI 9 (level, low) -> IRQ 9
uhci_hcd 0000:00:1d.1: UHCI Host Controller
PCI: Setting latency timer of device 0000:00:1d.1 to 64
uhci_hcd 0000:00:1d.1: irq 9, io base 0x1820
uhci_hcd 0000:00:1d.1: new USB bus registered, assigned bus number 2
hub 2-0:1.0: USB hub found
hub 2-0:1.0: 2 ports detected
ACPI: PCI Interrupt Link [LNKC] enabled at IRQ 9
ACPI: PCI interrupt 0000:00:1d.2[C] -> GSI 9 (level, low) -> IRQ 9
uhci_hcd 0000:00:1d.2: UHCI Host Controller
PCI: Setting latency timer of device 0000:00:1d.2 to 64
uhci_hcd 0000:00:1d.2: irq 9, io base 0x1840
uhci_hcd 0000:00:1d.2: new USB bus registered, assigned bus number 3
hub 3-0:1.0: USB hub found
hub 3-0:1.0: 2 ports detected
usb 3-1: new full speed USB device using uhci_hcd and address 2
Initializing USB Mass Storage driver...
scsi0 : SCSI emulation for USB Mass Storage devices
usbcore: registered new driver usb-storage
USB Mass Storage support registered.
usb-storage: device found at 2
usb-storage: waiting for device to settle before scanning
inserting floppy driver for 2.6.9-1.520.1rV0.7.30_9.ll.rhfc2.ccrma
inserting floppy driver for 2.6.9-1.520.1rV0.7.30_9.ll.rhfc2.ccrma
Vendor: Sony Model: MSC-U03 Rev: 1.00
Type: Direct-Access ANSI SCSI revision: 00
Attached scsi generic sg0 at scsi0, channel 0, id 0, lun 0, type 0
usb-storage: device scan complete
Attached scsi removable disk sda at scsi0, channel 0, id 0, lun 0
uhci_hcd 0000:00:1d.0: remove, state 1
usb usb1: USB disconnect, address 1
Device not ready. Make sure there is a disc in the drive.
Device not ready. Make sure there is a disc in the drive.
inserting floppy driver for 2.6.9-1.520.1rV0.7.30_9.ll.rhfc2.ccrma
uhci_hcd 0000:00:1d.0: USB bus 1 deregistered
uhci_hcd 0000:00:1d.1: remove, state 1
usb usb2: USB disconnect, address 1
Device not ready. Make sure there is a disc in the drive.
Device not ready. Make sure there is a disc in the drive.
inserting floppy driver for 2.6.9-1.520.1rV0.7.30_9.ll.rhfc2.ccrma
Also, after the driver is loaded I get a "subdevfs not supported in
kernel" message when I try to:
mount -t usbdevfs usbdevfs /proc/bus/usb
But I do have (I think) the right kernel options for this to work...
And this is when I try to remove it:
uhci_hcd 0000:00:1d.1: USB bus 2 deregistered
uhci_hcd 0000:00:1d.2: remove, state 1
usb usb3: USB disconnect, address 1
usb 3-1: USB disconnect, address 2
rmmod/3999: BUG in do_drain at mm/slab.c:1522
[<c0104173>] dump_stack+0x23/0x30 (20)
[<c0151fc9>] do_drain+0xb9/0xc0 (44)
[<c0151ece>] smp_call_function_all_cpus+0x2e/0x70 (28)
[<c0151ff1>] drain_cpu_caches+0x21/0x90 (24)
[<c0152079>] __cache_shrink+0x19/0x160 (36)
[<c01522cf>] kmem_cache_destroy+0xaf/0x1c0 (28)
[<e094657b>] scsi_destroy_command_freelist+0x6b/0xa0 [scsi_mod] (28)
[<e0947917>] scsi_host_dev_release+0x37/0xa0 [scsi_mod] (36)
[<c0271725>] device_release+0x85/0x90 (32)
[<c01ef505>] kobject_cleanup+0x95/0xa0 (28)
[<c01f00f5>] kref_put+0x45/0x100 (40)
[<c01ef556>] kobject_put+0x26/0x30 (16)
[<e0adfd18>] usb_stor_release_resources+0x78/0x150 [usb_storage] (24)
[<e0ae0294>] storage_disconnect+0xa4/0xb1 [usb_storage] (20)
[<c02ba207>] usb_unbind_interface+0x87/0x90 (28)
[<c0272e46>] device_release_driver+0x86/0x90 (28)
[<c0273109>] bus_remove_device+0x89/0xd0 (28)
[<c0271c14>] device_del+0x74/0xb0 (28)
[<c02c1fd8>] usb_disable_device+0xb8/0x100 (28)
[<c02bcb76>] usb_disconnect+0xc6/0x1a0 (40)
[<c02bcc18>] usb_disconnect+0x168/0x1a0 (40)
[<c02c50d5>] usb_hcd_pci_remove+0x85/0x1c0 (36)
[<c01fbec6>] pci_device_remove+0x46/0x50 (16)
[<c0272e46>] device_release_driver+0x86/0x90 (28)
[<c0272e7b>] driver_detach+0x2b/0x40 (20)
[<c02733c1>] bus_remove_driver+0x71/0xc0 (28)
[<c02739e9>] driver_unregister+0x19/0x30 (16)
[<c01fc15c>] pci_unregister_driver+0x1c/0x30 (16)
[<e0ac2147>] uhci_hcd_cleanup+0x17/0x68 [uhci_hcd] (16)
[<c013eec6>] sys_delete_module+0x146/0x190 (96)
[<c01032a1>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c0151ec3>] .... smp_call_function_all_cpus+0x23/0x70
.....[<c0151ff1>] .. ( <= drain_cpu_caches+0x21/0x90)
.. [<c013c98d>] .... print_traces+0x1d/0x60
.....[<c0104173>] .. ( <= dump_stack+0x23/0x30)
inserting floppy driver for 2.6.9-1.520.1rV0.7.30_9.ll.rhfc2.ccrma
inserting floppy driver for 2.6.9-1.520.1rV0.7.30_9.ll.rhfc2.ccrma
uhci_hcd 0000:00:1d.2: USB bus 3 deregistered
rmmod/3999: BUG in do_drain at mm/slab.c:1522
[<c0104173>] dump_stack+0x23/0x30 (20)
[<c0151fc9>] do_drain+0xb9/0xc0 (44)
[<c0151ece>] smp_call_function_all_cpus+0x2e/0x70 (28)
[<c0151ff1>] drain_cpu_caches+0x21/0x90 (24)
[<c0152079>] __cache_shrink+0x19/0x160 (36)
[<c01522cf>] kmem_cache_destroy+0xaf/0x1c0 (28)
[<e0ac2154>] uhci_hcd_cleanup+0x24/0x68 [uhci_hcd] (16)
[<c013eec6>] sys_delete_module+0x146/0x190 (96)
[<c01032a1>] sysenter_past_esp+0x52/0x71 (-8124)
---------------------------
| preempt count: 00000002 ]
| 2-level deep critical section nesting:
----------------------------------------
.. [<c0151ec3>] .... smp_call_function_all_cpus+0x23/0x70
.....[<c0151ff1>] .. ( <= drain_cpu_caches+0x21/0x90)
.. [<c013c98d>] .... print_traces+0x1d/0x60
.....[<c0104173>] .. ( <= dump_stack+0x23/0x30)
-- Fernando
On Wed, 24 Nov 2004, Ingo Molnar wrote:
>
> * Adam Heath <[email protected]> wrote:
>
> > > > I'm seeing something very odd. It's against 29-0. I also seem to
> > > > recall seeing something similar reported earlier.
> > > >
> > > > I'm seeing pauses on my system. Not certain what is causing it.
> > > > Hitting a key on the keyboard unsticks it.
> > >
> > > at first sight this looks like a scheduling/wakeup anomaly. Please
> > > re-report this if it happens with the current (30-4) kernel too. Also,
> > > could you test the vanilla -mm tree, it has a few scheduler updates too.
> >
> > 2.6.10-rc1-mm3 doesn't have the same problem. Didn't have a more
> > recent mm kernel available last night. Will compile one, and always
> > keep it available.
>
> -rc2-mm2 would be nice to test - there are a number of new interactivity
> fixes from Con being test-driven in -mm right now. In particular, these
> patches were added in -rc1-mm4. These are the patches in question:
>
> sched-adjust_timeslice_granularity.patch
> requeue_granularity.patch
> sched-remove_interactive_credit.patch
>
> you can download them individually from:
>
> http://kernel.org/pub/linux/kernel/people/akpm/patches/2.6/2.6.10-rc2/2.6.10-rc2-mm2/broken-out/
>
> so if these symptoms still occur with vanilla -rc2-mm2, could you try to
> unapply them, in reverse order? (there might be rejects when you try
> that, due to patch dependencies - let me know if it doesnt work out and
> i'll do an undo patch.)
The symptoms still occur with 30-9. I'll be trying rc2-mm2 over the holiday.
Came in this morning, and after hitting a key, my machine said it was 2:38am,
when it was actually 11:10am. All internet connections had died (obviously).
But the machine started working fine once I hit that key. No messages in
dmesg.
On Wed, 24 Nov 2004, Adam Heath wrote:
> The symptoms still occur with 30-9. I'll be trying rc2-mm2 over the holiday.
Been running rc2-mm3 all day. No issues yet.
* Rui Nuno Capela <[email protected]> wrote:
> last thing, at the moment, that "reliably" locks up the machine is
> accessing the floppy-disk (/dev/fd0). Yes, I still have one here, and
> it was just yesterday that I've tried to mount on it and bang!
> power-off and a cold-boot follows. Reproducibility? ALWAYS is often
> enough. Nothing shows up via serial console.
will take a look.
> [...] Jackd XRUN rates are pretty low and on the same level (e.g. less
> than 5 per hour with the default jack_test3.1 test), [...]
could you post the jack_test summary outputs?
> Oh well. But let's get back to reality :) How can I help on fixing
> this floppy showstopper? I've tried with almost every debug option set
> and nothing is dumped either on syslog or serial console. The only
> visible thing is that, once the floppy starts spinning (LED is on) the
> machine freezes. Weird.
how hard of a freeze is it? I.e. if you log in over the text console,
and do:
chrt -f 99 -p `pidof 'IRQ 1'`
chrt -f 99 -p $$
can you access the sysrq keys after the freeze happens? If not, can you
access them if you do:
echo 1 > /proc/sys/kernel/debug_direct_keyboard
? And finally, if the above experiments suggest that it's a hard lockup,
do you have a working NMI watchdog? (i.e. do the NMI counts in
/proc/interrupts increase on all CPUs?)
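
The NMI check asked for above can be done by sampling the NMI line of /proc/interrupts twice and comparing the per-CPU columns. A minimal userspace sketch, assuming the line starts with "NMI:" followed by one decimal counter per CPU; the function names and the MAX_CPUS limit are made up for this illustration and are not part of any kernel or tool:

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

#define MAX_CPUS 64	/* arbitrary limit for this sketch */

/*
 * Parse the numeric columns of an "NMI:" line.
 * Returns the number of per-CPU counters found, or -1 if the
 * line is not an NMI line.
 */
int parse_nmi_line(const char *line, unsigned long *counts, int max)
{
	const char *p = strstr(line, "NMI:");
	int n = 0;

	if (!p)
		return -1;
	p += 4;
	while (n < max) {
		char *end;

		while (isspace((unsigned char)*p))
			p++;
		if (!isdigit((unsigned char)*p))
			break;		/* end of line or trailing text */
		counts[n++] = strtoul(p, &end, 10);
		p = end;
	}
	return n;
}

/* Returns 1 if every CPU's NMI count grew between two samples, else 0. */
int nmi_counts_all_increase(const char *before, const char *after)
{
	unsigned long a[MAX_CPUS], b[MAX_CPUS];
	int na = parse_nmi_line(before, a, MAX_CPUS);
	int nb = parse_nmi_line(after, b, MAX_CPUS);
	int i;

	if (na <= 0 || na != nb)
		return 0;
	for (i = 0; i < na; i++)
		if (b[i] <= a[i])
			return 0;	/* this CPU's watchdog is not ticking */
	return 1;
}
```

Feed it the NMI line read a few seconds apart; a result of 0 means at least one CPU's counter did not move, so the watchdog cannot be relied on to catch a hard lockup on that CPU.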
Ingo
* Ingo Molnar <[email protected]> wrote:
> (this does not solve the irq threading related SMP lockup though, i'm
> attacking that problem next - now that my fd0 gets detected fine ;-) )
in fact, i cannot reproduce the SMP lockup anymore and the floppy works
just fine. Maybe the stale irq, even if detection went fine,
mis-programmed the controller, which ended up totally locking up upon
the first attempted IO?
Ingo
> + current->state = TASK_UNINTERRUPTIBLE;
> + schedule_timeout(HZ/100 + 1);
should use msleep() of course:
Signed-off-by: Ingo Molnar <[email protected]>
--- linux/drivers/block/floppy.c.orig
+++ linux/drivers/block/floppy.c
@@ -4504,6 +4504,12 @@ int __init floppy_init(void)
floppy_track_buffer = NULL;
max_buffer_sectors = 0;
}
+ /*
+ * Small 10 msec delay to let through any interrupt that
+ * initialization might have triggered, to not
+ * confuse detection:
+ */
+ msleep(10);
for (i = 0; i < N_FDC; i++) {
fdc = i;
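
For reference, the open-coded delay and the msleep(10) form differ mainly in where the tick arithmetic lives. The sketch below is a userspace illustration, not kernel code: it evaluates HZ/100 + 1 with integer division for a given HZ and converts the result back to milliseconds. The helper names are made up for this example, and the HZ values mentioned afterwards are assumptions rather than settings taken from this thread.

```c
/* Jiffies slept by the open-coded form: schedule_timeout(HZ/100 + 1). */
unsigned long open_coded_jiffies(unsigned long hz)
{
	return hz / 100 + 1;	/* integer division, as in the patch */
}

/* Milliseconds covered by a whole number of jiffies at a given HZ. */
unsigned long jiffies_to_ms(unsigned long jiffies, unsigned long hz)
{
	return jiffies * 1000 / hz;
}

/* Round a millisecond delay up to whole jiffies (illustrative helper). */
unsigned long ms_to_jiffies_roundup(unsigned long ms, unsigned long hz)
{
	return (ms * hz + 999) / 1000;
}
```

With HZ=1000 the open-coded form comes to 11 jiffies (11 ms); with HZ=100 it comes to 2 jiffies (20 ms). Stating the delay in milliseconds keeps that HZ dependence out of the driver.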
Ingo Molnar wrote:
>
> i have released the -V0.7.31-0 Real-Time Preemption patch, which can be
> downloaded from the usual place:
>
> http://redhat.com/~mingo/realtime-preempt/
>
> this is a merge of the -30-10 patch to 2.6.10-rc2-mm3. There are no
> other changes.
>
I have a problem. Better said, one half-of-a-problem :)
I've been testing the RT patches on both of my personal machines, one
laptop (P4/UP) and a desktop (P4/SMT). That you probably already know.
On the P4/UP side everything has evolved smoothly, with the only major
quirk now being the loopback device hanging during mkinitrd. It's not a
system lockup; only the "mkinitrd" and "mount -o loop" processes get
stuck (distro is Mandrake 10.1c). OTOH, audio performance with jackd at
low latency has reached levels never dreamt of before. To seal my
confidence in RT, I've committed to it as the primary kernel that boots
by default and runs in production. I'm happy here, so let's get to the topic.
On the P4/SMP/HT side, history has told quite a different tale. It was
already late in the VP era before this machine could even boot to the
login prompt. Then it suffered from all sorts of lockups and starvations
once it was able to start jackd. Then, happily, all that was ironed out. One
last thing, at the moment, that "reliably" locks up the machine is
accessing the floppy-disk (/dev/fd0). Yes, I still have one here, and it
was just yesterday that I've tried to mount on it and bang! power-off and
a cold-boot follows. Reproducibility? ALWAYS is often enough. Nothing
shows up via serial console.
OTOH, my confidence goes down the drain when I compare the jackd
low-latency performance between the latest RT-V0.7.31-3 kernel and the one
supplied with SUSE 9.2 Pro (2.6.8-24). I have been checking and
double-checking this far too many times, even with stressful workloads:
SUSE's non-RT kernel has an edge over the latest RT ones. Jackd XRUN rates
are pretty low and on the same level (e.g. less than 5 per hour with the
default jack_test3.1 test), but SUSE's 2.6.8-24 is consistently on par with
RT-V0.7.31-3, and even better if the RT kernel is built with some
preempt-debugging options set.
Is this black-magic or what? :)
Oh well. But let's get back to reality :) How can I help on fixing this
floppy showstopper? I've tried with almost every debug option set and
nothing is dumped either on syslog or serial console. The only visible
thing is that, once the floppy starts spinning (LED is on) the machine
freezes. Weird.
Nuff said.
Cheers.
--
rncbc aka Rui Nuno Capela
[email protected]
Hi Ingo,
>
> * Rui Nuno Capela wrote:
>
>> last thing, at the moment, that "reliably" locks up the machine is
>> accessing the floppy-disk (/dev/fd0). Yes, I still have one here, and
>> it was just yesterday that I've tried to mount on it and bang!
>> power-off and a cold-boot follows. Reproducibility? ALWAYS is often
>> enough. Nothing shows up via serial console.
>
> will take a look.
>
Thanks, as always ;)
>> [...] Jackd XRUN rates are pretty low and on the same level (e.g. less
>> than 5 per hour with the default jack_test3.1 test), [...]
>
> could you post the jack_test summary outputs?
>
Of course, but only later tonight (12 hours from now?). Sorry.
>> Oh well. But let's get back to reality :) How can I help on fixing
>> this floppy showstopper? I've tried with almost every debug option set
>> and nothing is dumped either on syslog or serial console. The only
>> visible thing is that, once the floppy starts spinning (LED is on) the
>> machine freezes. Weird.
>
> how hard of a freeze is it? I.e. if you log in over the text console,
> and do:
>
> chrt -f 99 -p `pidof 'IRQ 1'`
> chrt -f 99 -p $$
>
> can you access the sysrq keys after the freeze happens?
The lockup is pretty hard indeed. Complete lockup. No sysrq, not even any
output thru serial console. The only action that has some visible effect
is turning the power/reset switch off :)
> If not, can you access them if you do:
>
> echo 1 > /proc/sys/kernel/debug_direct_keyboard
>
> ? And finally, if the above experiments suggest that it's a hard lockup,
> do you have a working NMI watchdog? (i.e. do the NMI counts in
> /proc/interrupts increase on all CPUs?)
>
Yes, nmi_watchdog=1 was set, but I have to double-check whether the NMI
counts really increase in /proc/interrupts. Will retry and check again later.
Bye.
--
rncbc aka Rui Nuno Capela
[email protected]
* Adam Heath <[email protected]> wrote:
> > The symptoms still occur with 30-9. I'll be trying rc2-mm2 over the
> > holiday.
>
> Been running rc2-mm3 all day. No issues yet.
thanks, this very much looks like an -RT related scheduling bug. I've
fixed a handful of scheduling problems in recent kernels (latest is
-31-7), you might want to try it. As far as i can tell, none of the bugs
fixed should cause the symptoms you are seeing, but maybe i'm wrong.
Ingo
* Ingo Molnar <[email protected]> wrote:
> > [...] Jackd XRUN rates are pretty low and on the same level (e.g. less
> > than 5 per hour with the default jack_test3.1 test), [...]
>
> could you post the jack_test summary outputs?
also, could you change JACKD_PRIO from 20 to 50? Otherwise normal IRQ
handlers (IDE, etc.) will preempt jackd.
(the test-clients inherit this prio 50 setting, right?)
Ingo
On Thu, Nov 25, 2004 at 03:33:37PM +0100, Ingo Molnar wrote:
> --- linux/drivers/block/floppy.c.orig
> +++ linux/drivers/block/floppy.c
> @@ -4504,6 +4578,13 @@ int __init floppy_init(void)
> floppy_track_buffer = NULL;
> max_buffer_sectors = 0;
> }
> + /*
> + * Small 10 msec delay to let through any interrupt that
> + * initialization might have triggered, to not
> + * confuse detection:
> + */
> + current->state = TASK_UNINTERRUPTIBLE;
> + schedule_timeout(HZ/100 + 1);
how about using msleep() ?
On Wed, 24 Nov 2004, Adam Heath wrote:
> On Wed, 24 Nov 2004, Adam Heath wrote:
>
> > The symptoms still occur with 30-9. I'll be trying rc2-mm2 over the holiday.
>
> Been running rc2-mm3 all day. No issues yet.
Been almost a day now. Still no odd pause issues.
* Ingo Molnar <[email protected]> wrote:
> the -RT patchset doesnt properly detect my fd0 device, so there's
> definitely something broken in that area. The unpatched -rc2-mm3
> kernel detects it fine. Might be an effect of IRQ threading - the
> floppy hardware/driver is a fragile beast.
found the bug that causes the fd detection failure. It's a generic race
in the upstream floppy driver, which happens to work by chance in the
vanilla kernel but breaks when IRQ and softirq threading is enabled:
when the FDC hardware is initialized, it sometimes generates a floppy
interrupt right away - without being told to. This interrupt can hit the
detection code that executes right after the initialization code, in
particular it can get intermixed with user_reset_fdc() that the
detection code uses. The fd driver is fundamentally single-threaded when
it comes to handling events: an unexpected irq that arrives at the wrong
moment can confuse the reset_fdc() code, which, with softirq and hardirq
threading on, executes in keventd.
in the stock kernel this stale irq doesnt seem to hit the detection code
at the wrong moment, but i think under certain circumstances it may
still happen. One of the typical incarnations of the race was the
following message:
reset set in interrupt, calling c0258400
and googling for "reset set in interrupt, calling" does turn up a fair
number of bootlogs (most of them 2.4 ones) that show such a detection
failure, so i think upstream wants to have the fix too.
the fix is simple: delay a bit after initialization, to make sure the
stale irq does not interfere with the detection code. It will be safely
ignored, since do_floppy is still NULL. It might look sloppy that i went
for a delay, but i think a delay is better than waiting for the irq to
occur, because i dont think there's a guarantee that fdc initialization
triggers an interrupt, so waiting for it could hang the boot process. A
delay OTOH is totally harmless.
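
The effect of the delay can be pictured with a small userspace model. This is a simplified sketch, not the driver's code: pending_handler stands in for do_floppy, and all function names here are hypothetical. It only assumes what is described above, namely that an interrupt arriving while no handler is set is dropped, while one arriving after detection has armed its handler is taken for the reset's completion.

```c
#include <stddef.h>

typedef void (*irq_handler_fn)(void);

static irq_handler_fn pending_handler;	/* plays the role of do_floppy */
static int ignored_irqs;		/* interrupts that found no handler */
static int reset_completions;		/* times the reset handler ran */

static void reset_done(void)
{
	reset_completions++;
}

/* Model of the interrupt entry point: consume the handler if one is armed. */
void model_floppy_interrupt(void)
{
	irq_handler_fn handler = pending_handler;

	pending_handler = NULL;
	if (!handler) {
		ignored_irqs++;		/* stale irq, nothing was waiting */
		return;
	}
	handler();
}

/* Model of the detection code arming a handler before issuing a reset. */
void model_arm_reset(void)
{
	pending_handler = reset_done;
}

/* Returns 1 if the stale irq was mistaken for the reset's completion. */
int model_run(int delay_before_detection)
{
	ignored_irqs = 0;
	reset_completions = 0;
	pending_handler = NULL;

	if (delay_before_detection) {
		model_floppy_interrupt();	/* stale irq lands during the delay */
		model_arm_reset();
	} else {
		model_arm_reset();
		model_floppy_interrupt();	/* stale irq lands after arming */
	}
	return reset_completions > 0;
}
```

In this model, model_run(1) leaves the stale interrupt counted in ignored_irqs with the reset handler still armed, while model_run(0) lets the stale interrupt consume the handler before any real reset has finished.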
The attached patch implements this fix, which resolves the detection
problem on my testbox.
here again is what a failure looks like:
Floppy drive(s): fd0 is 1.44M
reset set in interrupt, calling c0258400
floppy0: no floppy controllers found
and this is how it works with the fix:
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
i've tested this on vanilla 2.6.10-rc2-mm3 too (to make sure this doesnt
break the floppy driver), and it should work fine in -BK too.
(this does not solve the irq threading related SMP lockup though, i'm
attacking that problem next - now that my fd0 gets detected fine ;-) )
Ingo
Signed-off-by: Ingo Molnar <[email protected]>
--- linux/drivers/block/floppy.c.orig
+++ linux/drivers/block/floppy.c
@@ -4504,6 +4578,13 @@ int __init floppy_init(void)
floppy_track_buffer = NULL;
max_buffer_sectors = 0;
}
+ /*
+ * Small 10 msec delay to let through any interrupt that
+ * initialization might have triggered, to not
+ * confuse detection:
+ */
+ current->state = TASK_UNINTERRUPTIBLE;
+ schedule_timeout(HZ/100 + 1);
for (i = 0; i < N_FDC; i++) {
fdc = i;
the -RT patchset doesnt properly detect my fd0 device, so there's
definitely something broken in that area. The unpatched -rc2-mm3 kernel
detects it fine. Might be an effect of IRQ threading - the floppy
hardware/driver is a fragile beast.
Ingo
Ingo Molnar wrote:
>
>
> the -RT patchset doesnt properly detect my fd0 device, so there's
> definitely something broken in that area. The unpatched -rc2-mm3 kernel
> detects it fine. Might be an effect of IRQ threading - the floppy
> hardware/driver is a fragile beast.
>
But it works flawlessly on my laptop (P4/UP). Could it be SMP/HT related?
--
rncbc aka Rui Nuno Capela
[email protected]
* Ingo Molnar <[email protected]> wrote:
> > (this does not solve the irq threading related SMP lockup though, i'm
> > attacking that problem next - now that my fd0 gets detected fine ;-) )
>
> in fact, i cannot reproduce the SMP lockup anymore and the floppy works
> just fine. [...]
it turns out that this was the side-effect of me doing idle=poll.
The reason for the lockup was a bug in the -RT patch, it broke
disable_hlt() which resulted in all CPUs spinning in idle with
interrupts disabled - with an obvious hard lockup as a result.
Fixed it in my tree.
Ingo
* Rui Nuno Capela <[email protected]> wrote:
> Ingo Molnar wrote:
> >
> > the -RT patchset doesnt properly detect my fd0 device, so there's
> > definitely something broken in that area. The unpatched -rc2-mm3 kernel
> > detects it fine. Might be an effect of IRQ threading - the floppy
> > hardware/driver is a fragile beast.
> >
>
> But it works flawlessly on my laptop (P4/UP). Could it be SMP/HT
> related?
yeah, could be, the failure i'm seeing is on SMP. On SMP if you have
interrupt threads then concurrency is higher. fd0 gets detected fine
when i turn IRQ threading off.
Ingo
Ingo Molnar wrote:
>
> * Rui Nuno Capela wrote:
>
>> > how hard of a freeze is it? I.e. if you log in over the text console,
>> > and do:
>> >
>> > chrt -f 99 -p `pidof 'IRQ 1'`
>> > chrt -f 99 -p $$
>> >
>> > can you access the sysrq keys after the freeze happens?
>>
>> The lockup is pretty hard indeed. Complete lockup. No sysrq, not even
>> any output thru serial console. The only action that has some visible
>> effect is turning the power/reset switch off :)
>
> note that unless you try the above, or the debug_direct_keyboard switch,
> 'soft' lockups will have the same symptoms: no sysrq, no serial console,
> an apparently hung system. So unless you've done the equivalent already,
> please try my suggestions.
>
> Ingo
>
Yes Master :)
--
rncbc aka Rui Nuno Capela
[email protected]
* Rui Nuno Capela <[email protected]> wrote:
> > how hard of a freeze is it? I.e. if you log in over the text console,
> > and do:
> >
> > chrt -f 99 -p `pidof 'IRQ 1'`
> > chrt -f 99 -p $$
> >
> > can you access the sysrq keys after the freeze happens?
>
> The lockup is pretty hard indeed. Complete lockup. No sysrq, not even
> any output thru serial console. The only action that has some visible
> effect is turning the power/reset switch off :)
note that unless you try the above, or the debug_direct_keyboard switch,
'soft' lockups will have the same symptoms: no sysrq, no serial console,
an apparently hung system. So unless you've done the equivalent already,
please try my suggestions.
Ingo