Hi all
Please CC: followups to me direct since I don't always get to read the list
every day.
I have encountered an oops under 2.4.20 while burning CDs (the ksymoops
output is included later in this email). The writer is an ATAPI device
currently connected as the master on the secondary IDE interface. The
complete IDE setup in use is:
Primary IDE (ide0), master = boot HDD
Primary IDE (ide0), slave = second HDD
Secondary IDE (ide1), master = CD writer
Secondary IDE (ide1), slave = third HDD
IDE from ESS1868 sound card (ide3), master = CD reader
Both CDs (the writer and the reader) are handled by ide-scsi. Other stats
on the box:
CPU: Pentium-MMX 233
RAM: 96MB SDRAM
Video card: Trio64V2 PCI
Other cards: Tulip-based NIC, ieee1394 card, Ensoniq AudioPCI soundcard,
ESS1868 soundcard (ISA), ALS007 soundcard (ISA)
cdrecord version: 1.11a24
Up until a system upgrade on 29 November 2002 many CDs had been successfully
burnt using this setup under 2.2.19. Since then some CDs have been burnt
successfully using cdrdao.
The CD data does NOT come from /dev/hdd (in all current tests, partitions on
/dev/hda were the source).
The kernel is compiled with ISAPnP in the kernel and IDE has PnP-IDE
enabled. The target machine has been set as 586.
Following the oops, the caps-lock and scroll-lock keyboard LEDs blink at a
rate of 1Hz. The num-lock LED remains off. I don't know if this is
significant.
I have also tried 2.4.21pre3 and 2.4.18 and the opps occurs there as well.
The oops occurs after writing an apparently random number of bytes: it has
occurred after 25, 27, 57 and 580MB for instance.
Under 2.4.20 the oops usually occurs very soon after writing commences -
normally between 7MB and 30MB are written before disaster strikes. 2.4.18
seems less prone to the trouble - it tends to oops only after several
hundred MB have been written.
Other changes to the system which do NOT prevent the oops:
* making the CDwriter the slave on its IDE channel, with the third HDD
being the master
* compiling the kernel WITHOUT IDE-PnP enabled
* removing the ESS1868 from the system (both with and without IDE-PnP
enabled)
* have ide-scsi handle only the CD writer
* removing/never loading USB and IEEE1394 driver modules
* removing ESS1868 and ALS007 soundcards
The following is the output of ksymoops run on the 2.4.20 oops.
ksymoops 2.4.5 on i586 2.4.20. Options used
-V (default)
-k /proc/ksyms (default)
-l /proc/modules (default)
-o /lib/modules/2.4.20/ (default)
-m /boot/System.map (specified)
Unable to handle kernel NULL pointer dereference at virtual address 00000018
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<ccbc6805>] Not tainted
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010286
eax: c9adbc40 ebx: c0310328 ecx: c9711f20 edx: c9711f20
esi: 00000000 edi: c0210328 ebp: 00000000 esp: c02cfe8c
ds: 0018 es: 0018 ss: 0018
Process swapper (pid: 0, stackpage=c02cf000)
Stack: c0310328 c0310328 c0310328 00000000 00000000 c0310328 c0310328 c9adbc40
ccbc699f c0310328 00000000 00000000 c01ccc56 c0310328 c9711f20 00000000
c9711f20 c03102e4 c0310328 00000282 00000000 c03102e4 c0310328 c01ccfb7
Call trace: [<ccbc699f>] [<c01ccc56>] [<c01ccfb7>] [<c01cd281>] [<c01cd0e0>]
[<c011c91c>] [<c01199c6>] [<c0119906>] [<c011971a>] [<c0109ced>] [<c0106e10>]
[<c010be78>] [<c0106e10>] [<c0106e33>] [<c0106e97>] [<c0105000>] [<c0105027>]
Code: 8b 56 18 c7 44 24 18 00 00 00 00 89 70 04 8b 46 1c 8b 7e 0c
>>EIP; ccbc6805 <[ide-scsi]idescsi_issue_pc+19/198> <=====
>>eax; c9adbc40 <_end+97c61bc/c86c57c>
>>ebx; c0310328 <ide_hwifs+3a8/21e8>
>>ecx; c9711f20 <_end+93fc49c/c86c57c>
>>edx; c9711f20 <_end+93fc49c/c86c57c>
>>edi; c0210328 <rtnetlink_event+0/3c>
>>esp; c02cfe8c <init_task_union+1e8c/2000>
Trace; ccbc699f <[ide-scsi]idescsi_do_request+1b/4c>
Trace; c01ccc56 <start_request+1ae/228>
Trace; c01ccfb7 <ide_do_request+28b/2d8>
Trace; c01cd281 <ide_timer_expiry+1a1/1b0>
Trace; c01cd0e0 <ide_timer_expiry+0/1b0>
Trace; c011c91c <timer_bh+274/390>
Trace; c01199c6 <bh_action+1a/48>
Trace; c0119906 <tasklet_hi_action+4a/70>
Trace; c011971a <do_softirq+5a/ac>
Trace; c0109ced <do_IRQ+a1/b4>
Trace; c0106e10 <default_idle+0/28>
Trace; c010be78 <call_do_IRQ+5/d>
Trace; c0106e10 <default_idle+0/28>
Trace; c0106e33 <default_idle+23/28>
Trace; c0106e97 <cpu_idle+3f/54>
Trace; c0105000 <_stext+0/0>
Trace; c0105027 <rest_init+27/28>
Code; ccbc6805 <[ide-scsi]idescsi_issue_pc+19/198>
00000000 <_EIP>:
Code; ccbc6805 <[ide-scsi]idescsi_issue_pc+19/198> <=====
0: 8b 56 18 mov 0x18(%esi),%edx <=====
Code; ccbc6808 <[ide-scsi]idescsi_issue_pc+1c/198>
3: c7 44 24 18 00 00 00 movl $0x0,0x18(%esp,1)
Code; ccbc680f <[ide-scsi]idescsi_issue_pc+23/198>
a: 00
Code; ccbc6810 <[ide-scsi]idescsi_issue_pc+24/198>
b: 89 70 04 mov %esi,0x4(%eax)
Code; ccbc6813 <[ide-scsi]idescsi_issue_pc+27/198>
e: 8b 46 1c mov 0x1c(%esi),%eax
Code; ccbc6816 <[ide-scsi]idescsi_issue_pc+2a/198>
11: 8b 7e 0c mov 0xc(%esi),%edi
<0>Kernel panic: Aiee, killing interrupt handler!
Excluding the value in ax, the oops is always the same.
All other functionality on the box is rock solid. memtest86 reports no RAM
errors.
If more detail is needed I'm happy to provide it - just email me.
Best regards
jonathan
--
* Jonathan Woithe [email protected] *
* http://www.physics.adelaide.edu.au/~jwoithe *
***-----------------------------------------------------------------------***
** "Time is an illusion; lunchtime doubly so" **
* "...you wouldn't recognize a subtle plan if it painted itself purple and *
* danced naked on a harpsichord singing 'subtle plans are here again'" *
Hi all
This is a followup to my previous message regarding a consistant
reproducable oops under 2.4.20 when burning CDs using cdrecord and ide-scsi.
Last night I tried disabling DMA on the CD drive (it was enabled by
default):
hdparm -d0 -c1 /dev/hdc
Following this, two dummy writes (700MB and 680MB respectively) proceeded
normally and completed without errors. Given that under 2.4.20 the dummy
writes usually failed very quickly (after around 7-30MB), it would appear
that in the first instance at least, DMA must be disabled for ide-scsi CD
writers under 2.4.20 in order for burning to be reliable.
Obviously this isn't the intended behaviour though; is anyone aware of this
issue and/or is it being worked on for 2.4/2.5 as appropriate?
Best regards
jonathan
--
* Jonathan Woithe [email protected] *
* http://www.physics.adelaide.edu.au/~jwoithe *
***-----------------------------------------------------------------------***
** "Time is an illusion; lunchtime doubly so" **
* "...you wouldn't recognize a subtle plan if it painted itself purple and *
* danced naked on a harpsichord singing 'subtle plans are here again'" *
2.4.20 is kind of the nailign together of the old IDE as best as
possible. 2.4.21 is the IDE updates. It isnt perfect yet but its a lot
lot better.