2001-02-27 02:11:00

by Michal Jaegermann

Subject: 2.4 kernels - "attempt to access beyond end of device"

I currently have a system with a PDC20265 controller, not used
as "raid", and it is giving me a hard time. It looks like after some
number of megabytes has been copied to a disk, where "number" seems to
be somewhere between 100 and 150, something in the kernel's internal
structures gets overwritten and the whole thing just blows up. After an
oops almost anything will end up with errors, so even a clean reboot
will likely not be possible.

In particular this prevents me from installing the recent Red Hat public
beta with its kernel based on 2.4.1. I also tried some other variants
of 2.4 kernels and so far the results are the same. If anything is
left in the logs, I see messages of this sort (21:01 is /dev/hde1):

21:01: rw=0, want=536992869, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992870, limit=4506201
attempt to access beyond end of device
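
For readers decoding these messages: the "21:01" prefix is the device number printed as hexadecimal major:minor. A minimal Python sketch of the mapping follows, assuming the conventional Linux IDE major numbers (3, 22, 33, 34 for ide0 through ide3) and 64 minor numbers per drive. The helper name decode_ide_dev is made up for illustration.

```python
# Hypothetical helper: map a kernel "MM:mm" hex device pair to an IDE device name.
# Assumes the conventional IDE majors and 64 minor numbers per drive.
IDE_MAJORS = {3: 0, 22: 1, 33: 2, 34: 3}  # major -> interface index (ide0..ide3)

def decode_ide_dev(pair: str) -> str:
    major_hex, minor_hex = pair.split(":")
    major, minor = int(major_hex, 16), int(minor_hex, 16)
    if major not in IDE_MAJORS:
        raise ValueError(f"major {major} is not one of the IDE majors assumed here")
    unit, partition = divmod(minor, 64)  # two drives per interface, 64 minors each
    if unit > 1:
        raise ValueError(f"minor {minor} is outside the two drives of one interface")
    letter = chr(ord("a") + 2 * IDE_MAJORS[major] + unit)
    return f"/dev/hd{letter}" + (str(partition) if partition else "")

print(decode_ide_dev("21:01"))  # /dev/hde1
print(decode_ide_dev("22:02"))  # /dev/hdg2
```

The second call decodes the "dev 22:02" that appears in the filesystem panic message in the log below.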

The following log file is for 2.4.2-ac5. It has less extraneous
stuff like LVM, and RAID, and USB support, and whatever...
These were the effects of an attempt to copy from one vfat file system
to another. The decoded oops is also below.


Linux version 2.4.2-ac5 ([email protected]) (gcc version 2.96 20000731 (Red Hat Linux 7.0)) #4 Mon Feb 26 18:11:13 MST 2001
BIOS-provided physical RAM map:
BIOS-e820: 000000000009dc00 @ 0000000000000000 (usable)
BIOS-e820: 0000000000002400 @ 000000000009dc00 (reserved)
BIOS-e820: 0000000000010000 @ 00000000000f0000 (reserved)
BIOS-e820: 000000001feec000 @ 0000000000100000 (usable)
BIOS-e820: 0000000000003000 @ 000000001ffec000 (ACPI data)
BIOS-e820: 0000000000010000 @ 000000001ffef000 (reserved)
BIOS-e820: 0000000000001000 @ 000000001ffff000 (ACPI NVS)
BIOS-e820: 0000000000010000 @ 00000000ffff0000 (reserved)
On node 0 totalpages: 131052
zone(0): 4096 pages.
zone(1): 126956 pages.
zone(2): 0 pages.
Kernel command line: initrd=initrd.img root=/dev/hdg3 BOOT_IMAGE=vmlinuz auto
Initializing CPU#0
Detected 1109.899 MHz processor.
Console: colour VGA+ 80x25
Calibrating delay loop... 2215.11 BogoMIPS
Memory: 513140k/524208k available (920k kernel code, 10672k reserved, 351k data, 176k init, 0k highmem)
Dentry-cache hash table entries: 65536 (order: 7, 524288 bytes)
Buffer-cache hash table entries: 32768 (order: 5, 131072 bytes)
Page-cache hash table entries: 131072 (order: 7, 524288 bytes)
Inode-cache hash table entries: 32768 (order: 6, 262144 bytes)
CPU: Before vendor init, caps: 0183f9ff c1c7f9ff 00000000, vendor = 2
CPU: L1 I Cache: 64K (64 bytes/line), D cache 64K (64 bytes/line)
CPU: L2 Cache: 256K (64 bytes/line)
CPU: After vendor init, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: After generic, caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: Common caps: 0183f9ff c1c7f9ff 00000000 00000000
CPU: AMD Athlon(tm) Processor stepping 02
Enabling fast FPU save and restore... done.
Checking 'hlt' instruction... OK.
POSIX conformance testing by UNIFIX
mtrr: v1.37 (20001109) Richard Gooch ([email protected])
mtrr: detected mtrr type: Intel
PCI: PCI BIOS revision 2.10 entry at 0xf1150, last bus=1
PCI: Using configuration type 1
PCI: Probing PCI hardware
PCI: Using IRQ router VIA [1106/0686] at 00:04.0
PCI: Found IRQ 9 for device 00:09.0
PCI: The same IRQ used for device 00:04.2
PCI: The same IRQ used for device 00:04.3
PCI: The same IRQ used for device 00:0d.0
PCI: Found IRQ 11 for device 00:0c.0
Linux NET4.0 for Linux 2.4
Based upon Swansea University Computer Society NET3.039
Starting kswapd v1.8
pty: 256 Unix98 ptys configured
block: queued sectors max/low 341080kB/210008kB, 1024 slots per queue
RAMDISK driver initialized: 16 RAM disks of 8192K size 1024 blocksize
Uniform Multi-Platform E-IDE driver Revision: 6.31
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: IDE controller on PCI bus 00 dev 21
VP_IDE: chipset revision 16
VP_IDE: not 100% native mode: will probe irqs later
ide: Assuming 33MHz system bus speed for PIO modes; override with idebus=xx
VP_IDE: VIA vt82c686a (rev 22) IDE UDMA66 controller on pci00:04.1
ide0: BM-DMA at 0xb800-0xb807, BIOS settings: hda:DMA, hdb:pio
ide1: BM-DMA at 0xb808-0xb80f, BIOS settings: hdc:pio, hdd:pio
PDC20265: IDE controller on PCI bus 00 dev 88
PCI: Found IRQ 10 for device 00:11.0
PDC20265: chipset revision 2
PDC20265: not 100% native mode: will probe irqs later
PDC20265: (U)DMA Burst Bit ENABLED Primary PCI Mode Secondary PCI Mode.
ide2: BM-DMA at 0x6800-0x6807, BIOS settings: hde:DMA, hdf:pio
ide3: BM-DMA at 0x6808-0x680f, BIOS settings: hdg:DMA, hdh:pio
hda: CREATIVE CD5233E, ATAPI CD/DVD-ROM drive
hde: IBM-DTLA-307045, ATA DISK drive
hdg: IBM-DTLA-307045, ATA DISK drive
ide0 at 0x1f0-0x1f7,0x3f6 on irq 14
ide2 at 0x8000-0x8007,0x7802 on irq 10
ide3 at 0x7400-0x7407,0x7002 on irq 10
hde: 90069840 sectors (46116 MB) w/1916KiB Cache, CHS=89355/16/63, UDMA(100)
hdg: 90069840 sectors (46116 MB) w/1916KiB Cache, CHS=89355/16/63, UDMA(100)
hda: ATAPI 52X CD-ROM drive, 128kB Cache, DMA
Uniform CD-ROM driver Revision: 3.12
Partition check:
hde: [PTBL] [5606/255/63] hde1
hdg: [PTBL] [5606/255/63] hdg1 hdg2 hdg3 hdg4 < hdg5 hdg6 hdg7 hdg8 hdg9 hdg10 hdg11 >
Floppy drive(s): fd0 is 1.44M
FDC 0 is a post-1991 82077
RAMDISK: Compressed image found at block 0
Freeing initrd memory: 279k freed
loop: loaded (max 8 devices)
SCSI subsystem driver Revision: 1.00
request_module[scsi_hostadapter]: Root fs not mounted
request_module[scsi_hostadapter]: Root fs not mounted
request_module[scsi_hostadapter]: Root fs not mounted
NET4: Linux TCP/IP 1.0 for NET4.0
IP Protocols: ICMP, UDP, TCP
IP: routing cache hash table of 4096 buckets, 32Kbytes
TCP: Hash tables configured (established 32768 bind 32768)
NET4: Unix domain sockets 1.0/SMP for Linux NET4.0.
VFS: Mounted root (ext2 filesystem).
ncr53c8xx: at PCI bus 0, device 10, function 0
ncr53c8xx: 53c810 detected
ncr53c810-0: rev 0x2 on pci bus 0 device 10 function 0 irq 5
ncr53c810-0: ID 7, Fast-10, Parity Checking
ncr53c810-0: restart (scsi reset).
scsi0 : ncr53c8xx - version 3.3b
Vendor: FUJITSU Model: M2952S-512 Rev: 0122
Type: Direct-Access ANSI SCSI revision: 02
Vendor: DEC Model: DSP3210S Rev: X442
Type: Direct-Access ANSI SCSI revision: 02
ncr53c810-0-<2,0>: tagged command queue depth set to 8
ncr53c810-0-<4,0>: tagged command queue depth set to 8
Attached scsi disk sda at scsi0, channel 0, id 2, lun 0
Attached scsi disk sdb at scsi0, channel 0, id 4, lun 0
ncr53c810-0-<2,0>: sync_msgout: 1-3-1-19-8.
ncr53c810-0-<2,0>: sync msgin: 1-3-1-19-8.
ncr53c810-0-<2,0>: sync: per=25 scntl3=0x10 ofs=8 fak=0 chg=0.
ncr53c810-0-<2,*>: FAST-10 SCSI 10.0 MB/s (100 ns, offset 8)
SCSI device sda: 4693462 512-byte hdwr sectors (2403 MB)
sda: sda1 sda2 sda3 sda4 < sda5 sda6 >
ncr53c810-0-<4,0>: sync_msgout: 1-3-1-19-8.
ncr53c810-0-<4,0>: sync msgin: 1-3-1-19-8.
ncr53c810-0-<4,0>: sync: per=25 scntl3=0x10 ofs=8 fak=0 chg=0.
ncr53c810-0-<4,*>: FAST-10 SCSI 10.0 MB/s (100 ns, offset 8)
SCSI device sdb: 4197520 512-byte hdwr sectors (2149 MB)
sdb: sdb1 sdb2 sdb3 sdb4 < sdb5 sdb6 >
via-rhine.c:v1.08b-LK1.1.7 8/9/2000 Written by Donald Becker
http://www.scyld.com/network/via-rhine.html
PCI: Enabling device 00:09.0 (0094 -> 0097)
PCI: Found IRQ 9 for device 00:09.0
PCI: The same IRQ used for device 00:04.2
PCI: The same IRQ used for device 00:04.3
PCI: The same IRQ used for device 00:0d.0
eth0: VIA VT3043 Rhine at 0x9400, 00:50:ba:c1:64:d9, IRQ 9.
eth0: MII PHY found at address 8, status 0x7809 advertising 05e1 Link 0000.
PCI: Enabling device 00:0c.0 (0094 -> 0097)
PCI: Found IRQ 11 for device 00:0c.0
eth1: VIA VT3043 Rhine at 0x8800, 00:50:ba:ab:60:64, IRQ 11.
eth1: MII PHY found at address 8, status 0x782d advertising 05e1 Link 0000.
VFS: Mounted root (ext2 filesystem) readonly.
change_root: old root has d_count=3
Trying to unmount old root ... okay
Freeing unused kernel memory: 176k freed
Adding Swap: 136512k swap-space (priority -1)
Filesystem panic (dev 22:02).
fat_free: deleting beyond EOF
File system has been set read-only
attempt to access beyond end of device
21:01: rw=0, want=536992869, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992870, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992870, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992871, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992871, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992872, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992872, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992873, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992869, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992870, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992870, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992871, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992871, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992872, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992872, limit=4506201
attempt to access beyond end of device
21:01: rw=0, want=536992873, limit=4506201
Unable to handle kernel paging request at virtual address 08000004
printing eip:
c0130fbe
*pde = 0b120067
*pte = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0130fbe>]
EFLAGS: 00010206
eax: c18a0000 ebx: 00006b93 ecx: 00000003 edx: 08000000
esi: 00007fff edi: 0000000f ebp: 00007562 esp: debe9e14
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 586, stackpage=debe9000)
Stack: 2202cd48 00000000 00000200 c13ecd48 00000000 c0131e36 00002202 00007562
00000200 c13ecd48 00000000 c0132177 ceeb6640 00000000 c0132198 ceed7de0
ceed7de0 cec6e000 debe9e6c 00000200 00000008 ceed7de0 00000002 00011ead
Call Trace: [<c0131e36>] [<c0132177>] [<c0132198>] [<c012a85e>] [<c013179f>] [<c0132860>] [<c015a4f0>]
[<c015bf45>] [<c015a4f0>] [<c0124ddc>] [<c015a641>] [<c015a619>] [<c012fbd6>] [<c0108fb3>]

Code: 39 6a 04 75 f5 66 8b 42 08 25 ff ff 00 00 3b 44 24 20 75 e6

And here are results of ksymoops:

Error (pclose_local): find_objects pclose failed 0x100
Unable to handle kernel paging request at virtual address 08000004
c0130fbe
*pde = 0b120067
Oops: 0000
CPU: 0
EIP: 0010:[<c0130fbe>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: c18a0000 ebx: 00006b93 ecx: 00000003 edx: 08000000
esi: 00007fff edi: 0000000f ebp: 00007562 esp: debe9e14
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 586, stackpage=debe9000)
Stack: 2202cd48 00000000 00000200 c13ecd48 00000000 c0131e36 00002202 00007562
00000200 c13ecd48 00000000 c0132177 ceeb6640 00000000 c0132198 ceed7de0
ceed7de0 cec6e000 debe9e6c 00000200 00000008 ceed7de0 00000002 00011ead
Call Trace: [<c0131e36>] [<c0132177>] [<c0132198>] [<c012a85e>] [<c013179f>] [<c0132860>] [<c015a4f0>]
[<c015bf45>] [<c015a4f0>] [<c0124ddc>] [<c015a641>] [<c015a619>] [<c012fbd6>] [<c0108fb3>]
Code: 39 6a 04 75 f5 66 8b 42 08 25 ff ff 00 00 3b 44 24 20 75 e6

>>EIP; c0130fbe <get_hash_table+7e/b0> <=====
Trace; c0131e36 <unmap_underlying_metadata+26/70>
Trace; c0132177 <__block_prepare_write+e7/2c0>
Trace; c0132198 <__block_prepare_write+108/2c0>
Trace; c012a85e <nr_free_buffer_pages+e/60>
Trace; c013179f <balance_dirty_state+f/50>
Trace; c0132860 <cont_prepare_write+1f0/2b0>
Trace; c015a4f0 <fat_get_block+0/100>
Trace; c015bf45 <fat_prepare_write+25/30>
Trace; c015a4f0 <fat_get_block+0/100>
Trace; c0124ddc <generic_file_write+3ac/5d0>
Trace; c015a641 <default_fat_file_write+21/60>
Trace; c015a619 <fat_file_write+29/30>
Trace; c012fbd6 <sys_write+96/d0>
Trace; c0108fb3 <system_call+33/40>
Code; c0130fbe <get_hash_table+7e/b0>
00000000 <_EIP>:
Code; c0130fbe <get_hash_table+7e/b0> <=====
0: 39 6a 04 cmp %ebp,0x4(%edx) <=====
Code; c0130fc1 <get_hash_table+81/b0>
3: 75 f5 jne fffffffa <_EIP+0xfffffffa> c0130fb8 <get_hash_table+78/b0>
Code; c0130fc3 <get_hash_table+83/b0>
5: 66 8b 42 08 mov 0x8(%edx),%ax
Code; c0130fc7 <get_hash_table+87/b0>
9: 25 ff ff 00 00 and $0xffff,%eax
Code; c0130fcc <get_hash_table+8c/b0>
e: 3b 44 24 20 cmp 0x20(%esp,1),%eax
Code; c0130fd0 <get_hash_table+90/b0>
12: 75 e6 jne fffffffa <_EIP+0xfffffffa> c0130fb8 <get_hash_table+78/b0>


3 errors issued. Results may not be reliable.

Here is a similar one, from another kernel (2.4.1-0.1.9 Red Hat from
"Wolverine"), with what looks like cdrom code inside. As far as I can
tell no VFAT stuff was involved this time.

Unable to handle kernel paging request at virtual address 08000004
c0134336
*pde = 1d3a7067
Oops: 0000
CPU: 0
EIP: 0010:[<c0134336>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: c18a0000 ebx: 00000009 ecx: 000014e0 edx: 08000000
esi: 00040008 edi: 00002202 ebp: 0000000f esp: dd311e14
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 791, stackpage=dd311000)
Stack: 000014e0 00000000 00000e00 cb2b0da0 00000c00 c013517b 00002202 00040008
00000200 cb2b0da0 00000c00 c0135517 cbd54420 00000000 c0135538 cb2b0d40
cb2b0f80 cb2ac000 dd311e6c 00000200 000016ae cb2b0d40 06b60600 00001000
Call Trace: [<c013517b>] [<c0135517>] [<c0135538>] [<c0134b0c>] [<c0134b5c>] [<e086d5f0>] [<c0135cfa>]
[<e086d5f0>] [<e086efd5>] [<e086d5f0>] [<c0127ad9>] [<e086d731>] [<e086d709>] [<c0132fe6>] [<c0109007>]
Code: 39 72 04 75 f5 0f b7 42 08 3b 44 24 20 75 eb 66 39 7a 0c 75

>>EIP; c0134336 <get_hash_table+66/90> <=====
Trace; c013517b <unmap_underlying_metadata+1b/60>
Trace; c0135517 <__block_prepare_write+117/300>
Trace; c0135538 <__block_prepare_write+138/300>
Trace; c0134b0c <balance_dirty_state+c/50>
Trace; c0134b5c <balance_dirty+c/40>
Trace; e086d5f0 <[cdrom]cdrom_ioctl+ab0/e20>
Trace; c0135cfa <cont_prepare_write+22a/370>
Trace; e086d5f0 <[cdrom]cdrom_ioctl+ab0/e20>
Trace; e086efd5 <[cdrom]cdrom_sysctl_info+5a5/5d0>
Trace; e086d5f0 <[cdrom]cdrom_ioctl+ab0/e20>
Trace; c0127ad9 <generic_file_write+3a9/5f0>
Trace; e086d731 <[cdrom]cdrom_ioctl+bf1/e20>
Trace; e086d709 <[cdrom]cdrom_ioctl+bc9/e20>
Trace; c0132fe6 <sys_write+96/d0>
Trace; c0109007 <system_call+33/38>
Code; c0134336 <get_hash_table+66/90>
00000000 <_EIP>:
Code; c0134336 <get_hash_table+66/90> <=====
0: 39 72 04 cmp %esi,0x4(%edx) <=====
Code; c0134339 <get_hash_table+69/90>
3: 75 f5 jne fffffffa <_EIP+0xfffffffa> c0134330 <get_hash_table+60/90>
Code; c013433b <get_hash_table+6b/90>
5: 0f b7 42 08 movzwl 0x8(%edx),%eax
Code; c013433f <get_hash_table+6f/90>
9: 3b 44 24 20 cmp 0x20(%esp,1),%eax
Code; c0134343 <get_hash_table+73/90>
d: 75 eb jne fffffffa <_EIP+0xfffffffa> c0134330 <get_hash_table+60/90>
Code; c0134345 <get_hash_table+75/90>
f: 66 39 7a 0c cmp %di,0xc(%edx)
Code; c0134349 <get_hash_table+79/90>
13: 75 00 jne 15 <_EIP+0x15> c013434b <get_hash_table+7b/90>

Anybody with an insight to offer?

Michal
[email protected]


2001-02-27 23:37:07

by Michal Jaegermann

Subject: Re: 2.4 kernels - "attempt to access beyond end of device"


To add to my report about troubles with disk activity on a system with
a PDC20265 IDE controller (this is on an Asus AV7 mobo, BTW): I tried
the same experiments with 2.2.19pre14 patched with the ide patches to
get support for Promise.

I got similar results, i.e. problems after some 130-150 megabytes
were copied. On different occasions I got things like this:

file_cluster badly computed!!! 0 <> 536870911
file_cluster badly computed!!! 1 <> 0

practically immediately, followed by a period of lively disk
activity and a crash.

Whoops: end_buffer_io_async: b_count != 1 on async io.

after which the 'cp' process hung in a "D" state.

attempt to access beyond end of device
21:01: rw=0, want=537238629, limit=4506201
dev 21:01 blksize=512 blocknr=1074477258 sector=1074477258 size=512 count=1
...
(and more of these), terminated by the oops decoded below.
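
A quick arithmetic check on the numbers in these messages, offered as an observation rather than a diagnosis: each out-of-range "want" value falls below "limit" once its single highest bit (bit 29) is cleared, and the "blocknr" shown above is exactly twice its "want" value. The Python below only redoes that arithmetic on figures copied from this thread. The helper name is hypothetical.

```python
# Re-check of figures quoted in the log lines of this thread (pure arithmetic).
def highest_bit(value: int) -> int:
    """Index of the most significant set bit."""
    return value.bit_length() - 1

samples = [
    # (want, limit) pairs copied from the "attempt to access beyond end" messages
    (536992869, 4506201),
    (537238629, 4506201),
    (537530884, 1574338),
]

for want, limit in samples:
    bit = highest_bit(want)
    cleared = want & ~(1 << bit)
    print(f"want={want:#x} top_bit={bit} cleared={cleared} in_range={cleared < limit}")

# blocknr from the 2.2.19pre14 message compared with its want value
print(1074477258 == 2 * 537238629)
```

Whether the stray high bit comes from memory, the bus, or the controller cannot be told from the arithmetic alone.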

To take 'vfat' out of the picture I also tried 'cp' from ext2 partitions
(I had to collect a number of things as I do not have enough data on
this system yet) to an ext2 partition while using 2.4.2-ac5. This resulted
in:

EXT2-fs error (device ide3(34,9)): ext2_readdir: bad entry in
directory #16584: inode out of bounds - offset=0, inode=134234312,
rec_len=12, name_len=1
EXT2-fs error (device ide3(34,9)): ext2_readdir: bad entry in
directory #131542: inode out of bounds - offset=0, inode=134349270,
rec_len=12, name_len=1
EXT2-fs error (device ide3(34,9)): ext2_readdir: bad entry in
directory #82294: inode out of bounds - offset=0, inode=134300022,
rec_len=12, name_len=1
EXT2-fs error (device ide3(34,9)): ext2_readdir: bad entry in
directory #164456: inode out of bounds - offset=0, inode=134382184,
rec_len=12, name_len=1
EXT2-fs error (device ide3(34,9)): ext2_readdir: bad entry in
directory #98872: inode out of bounds - offset=0, inode=134316600,
rec_len=12, name_len=1
22:09: rw=0, want=537530884, limit=1574338
attempt to access beyond end of device
22:09: rw=0, want=537530884, limit=1574338
.....
punctuated by oops.
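
The ext2 errors above share a pattern. The entry at offset=0 of a directory is normally ".", whose inode number equals the directory's own inode number. In every message here the reported inode differs from the directory number by the same constant. A short Python check of that arithmetic, using only values copied from the log:

```python
# (directory inode, inode reported in the bad entry), copied from the messages above
entries = [
    (16584, 134234312),
    (131542, 134349270),
    (82294, 134300022),
    (164456, 134382184),
    (98872, 134316600),
]

differences = {reported - directory for directory, reported in entries}
print({hex(d) for d in differences})  # {'0x8000000'}
```

The same value, 08000000, shows up in edx in the get_hash_table oops under 2.4.2-ac5, which may or may not be a coincidence.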

Here is a decoded oops from 2.2.19pre14

Unable to handle kernel paging request at virtual address 08000000
current->tss.cr3 = 1f052000, %cr3 = 1f052000
*pde = 1f67b067
Oops: 0000
CPU: 0
EIP: 0010:[<c01277a8>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010206
eax: 08000000 ebx: 00000007 ecx: 00053d24 edx: 08000000
esi: 0000000d edi: 00002202 ebp: 0004906a esp: de955e7c
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 573, process nr: 19, stackpage=de955000)
Stack: 0004906a 00002202 00053d24 c01277e8 00002202 0004906a 00000200 00000000
c0127b9a 00002202 0004906a 00000200 00000000 0004906a 00000000 ded6a200
ffffffff c0145398 00002202 0004906a 00000200 c014a0db ded6a200 0004906a
Call Trace: [<c01277e8>] [<c0127b9a>] [<c0145398>] [<c014a0db>] [<c0145b6c>] [<c014a26f>] [<c014784e>]
[<c01262d9>] [<c01476d0>] [<c0109534>]
Code: 8b 00 39 6a 04 75 15 8b 4c 24 20 39 4a 08 75 0c 66 39 7a 0c

>>EIP; c01277a8 <find_buffer+68/90> <=====
Trace; c01277e8 <get_hash_table+18/48>
Trace; c0127b9a <getblk+1e/144>
Trace; c0145398 <fat_getblk+3c/4c>
Trace; c014a0db <fat_add_cluster1+243/3cc>
Trace; c0145b6c <fat_get_cluster+58/98>
Trace; c014a26f <fat_add_cluster+b/2c>
Trace; c014784e <fat_file_write+17e/4ac>
Trace; c01262d9 <sys_write+e5/118>
Trace; c01476d0 <fat_file_write+0/4ac>
Trace; c0109534 <system_call+34/38>
Code; c01277a8 <find_buffer+68/90>
00000000 <_EIP>:
Code; c01277a8 <find_buffer+68/90> <=====
0: 8b 00 mov (%eax),%eax <=====
Code; c01277aa <find_buffer+6a/90>
2: 39 6a 04 cmp %ebp,0x4(%edx)
Code; c01277ad <find_buffer+6d/90>
5: 75 15 jne 1c <_EIP+0x1c> c01277c4 <find_buffer+84/90>
Code; c01277af <find_buffer+6f/90>
7: 8b 4c 24 20 mov 0x20(%esp,1),%ecx
Code; c01277b3 <find_buffer+73/90>
b: 39 4a 08 cmp %ecx,0x8(%edx)
Code; c01277b6 <find_buffer+76/90>
e: 75 0c jne 1c <_EIP+0x1c> c01277c4 <find_buffer+84/90>
Code; c01277b8 <find_buffer+78/90>
10: 66 39 7a 0c cmp %di,0xc(%edx)


And here is the one from ext2 to ext2 copy under 2.4.2-ac5

Unable to handle kernel paging request at virtual address ea096084
c0128edf
*pde = 00000000
Oops: 0000
CPU: 0
EIP: 0010:[<c0128edf>]
Using defaults from ksymoops -t elf32-i386 -a i386
EFLAGS: 00010003
eax: 0800001b ebx: 0800001b ecx: 00000282 edx: ca096000
esi: dffd7cdc edi: 00000000 ebp: 00001000 esp: dec6de38
ds: 0018 es: 0018 ss: 0018
Process cp (pid: 543, stackpage=dec6d000)
Stack: 0000220a 0001220a 00000286 00000010 c17cb2e0 00000000 c0133314 dffd7cdc
00000003 c17cb2e0 00000000 c01333d2 00000001 0007ccfc 00000000 c0121be9
220a0000 c17cb2e0 c17cb2e0 0000220a 00000000 c0133687 c17cb2e0 00001000
Call Trace: [<c0133314>] [<c01333d2>] [<c0121be9>] [<c0133687>] [<c01339ab>] [<c0123e8c>] [<c013423d>]
[<c014ffb0>] [<c0126886>] [<c014ffb0>] [<c0124a70>] [<c0131468>] [<c01090a3>]
Code: 8b 44 82 18 0f af 5e 0c 89 42 14 03 5a 0c 40 75 05 8b 02 89

>>EIP; c0128edf <kmem_cache_alloc+1f/60> <=====
Trace; c0133314 <get_unused_buffer_head+34/90>
Trace; c01333d2 <create_buffers+22/1c0>
Trace; c0121be9 <vmtruncate+d9/1a0>
Trace; c0133687 <create_empty_buffers+17/70>
Trace; c01339ab <__block_prepare_write+4b/2b0>
Trace; c0123e8c <add_to_page_cache_unique+ac/c0>
Trace; c013423d <block_prepare_write+1d/40>
Trace; c014ffb0 <ext2_get_block+0/4c0>
Trace; c0126886 <generic_file_write+3b6/5d0>
Trace; c014ffb0 <ext2_get_block+0/4c0>
Trace; c0124a70 <file_read_actor+0/60>
Trace; c0131468 <sys_write+98/d0>
Trace; c01090a3 <system_call+33/40>
Code; c0128edf <kmem_cache_alloc+1f/60>
00000000 <_EIP>:
Code; c0128edf <kmem_cache_alloc+1f/60> <=====
0: 8b 44 82 18 mov 0x18(%edx,%eax,4),%eax <=====
Code; c0128ee3 <kmem_cache_alloc+23/60>
4: 0f af 5e 0c imul 0xc(%esi),%ebx
Code; c0128ee7 <kmem_cache_alloc+27/60>
8: 89 42 14 mov %eax,0x14(%edx)
Code; c0128eea <kmem_cache_alloc+2a/60>
b: 03 5a 0c add 0xc(%edx),%ebx
Code; c0128eed <kmem_cache_alloc+2d/60>
e: 40 inc %eax
Code; c0128eee <kmem_cache_alloc+2e/60>
f: 75 05 jne 16 <_EIP+0x16> c0128ef5 <kmem_cache_alloc+35/60>
Code; c0128ef0 <kmem_cache_alloc+30/60>
11: 8b 02 mov (%edx),%eax
Code; c0128ef2 <kmem_cache_alloc+32/60>
13: 89 00 mov %eax,(%eax)
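
The faulting addresses in these oopses can be reproduced from the register dumps and the instruction marked with <=====, which is a useful sanity check that the decode lines up with the dump. A minimal sketch in Python, assuming plain 32-bit x86 base + index*scale + displacement addressing. The function name effective_address is illustrative only.

```python
MASK32 = 0xFFFFFFFF

def effective_address(base: int, disp: int = 0, index: int = 0, scale: int = 1) -> int:
    """32-bit x86 effective address: base + index*scale + disp, wrapped to 32 bits."""
    return (base + index * scale + disp) & MASK32

# cmp %ebp,0x4(%edx) with edx=08000000 (vfat copy oops, 2.4.2-ac5)
print(hex(effective_address(0x08000000, disp=0x4)))  # 0x8000004

# mov (%eax),%eax with eax=08000000 (2.2.19pre14 oops)
print(hex(effective_address(0x08000000)))  # 0x8000000

# mov 0x18(%edx,%eax,4),%eax with edx=ca096000, eax=0800001b (ext2 copy oops)
print(hex(effective_address(0xCA096000, disp=0x18, index=0x0800001B, scale=4)))  # 0xea096084
```

All three results match the addresses in the corresponding "Unable to handle kernel paging request" lines.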


Does this ring a bell with anybody? I cannot exclude faulty hardware
here (though it is not overclocked in any way) or the BIOS (Award ACPI
BIOS 1005C - but this should not matter once I booted - right?).

Michal

2001-02-28 20:46:42

by Michal Jaegermann

Subject: Re: 2.4 kernels - "attempt to access beyond end of device"

I think that I found what gives me hell with this box, and it
looks like this is not Linux at all. Once again, this is Athlon
K6 on Asus AV7 mobo and "Award Advanced ACPI BIOS" version 1005C.
I have more checks to make before I will be fully satisfied but
this looks like it.

In this BIOS setup there are two "advanced" options:

System Performance Setting [Optimal, Normal]
USB Legacy Support [Auto, Enabled, Disabled]

If the first one is set to "Normal" and the second one to "Disabled"
then the whole system becomes stable. I copied around 1.2 GB of files
from various file systems to a directory on ext2 without any ill
effects and successfully ran 'diff -r' between two directories of
475 MB each. If the BIOS options are set any other way then one should
expect spectacular blowups with corrupted file systems and other nasty
effects after the first oops. The system survives up to somewhere between
130 and 150 MB of data moved, no matter which kernel, and that is it.
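
The test described above (copy a large tree, then compare source and copy) can be scripted. The sketch below is one way to do the comparison step in Python with SHA-256 digests. It illustrates the idea and is not the procedure actually used in this report; the function names are invented here.

```python
import hashlib
import os

def file_digest(path: str, chunk_size: int = 1 << 20) -> str:
    """SHA-256 of a file, read in chunks so large files do not fill memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def tree_digests(root: str) -> dict:
    """Map each file's path relative to root to its digest."""
    result = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            result[os.path.relpath(full, root)] = file_digest(full)
    return result

def compare_trees(source: str, copy: str) -> list:
    """Relative paths that are missing on one side or differ in content."""
    left, right = tree_digests(source), tree_digests(copy)
    return sorted(p for p in left.keys() | right.keys() if left.get(p) != right.get(p))
```

An empty list from compare_trees(source_dir, copy_dir) corresponds to a silent 'diff -r'. To exercise the failure described here, the tree would need to be well over 150 MB.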

It is difficult to know what "System Performance Setting" is actually
set to, as it always shows "Optimal" regardless of its state at the last
save. But system behaviour depends on how it was set, so the setting
seems to change even if the display, on the next visit, does not. How
"USB Legacy Support" comes into the picture I cannot even imagine.

I did try with 2.2.19pre and 2.4 kernels and the picture does not
change. Any rational explanation beyond that BIOS is doing something
really nasty?

Cheers,
Michal

2001-02-28 21:56:36

by Petr Vandrovec

Subject: Re: 2.4 kernels - "attempt to access beyond end of device"

On 28 Feb 01 at 13:46, Michal Jaegermann wrote:
> I think that I found what gives me hell with this box, and it
> looks like this is not Linux at all. Once again, this is Athlon
> K6 on Asus AV7 mobo and "Award Advanced ACPI BIOS" version 1005C.

K7 on A7V, I believe...

> I have more checks to make before I will be fully satisfied but
> this looks like it.
...
> System Performance Setting [Optimal, Normal]
...

Try BIOS 1006. AFAIK 1005D changed some VIA values for 'optimal'.
And 1006 contains newer Promise BIOS - but I did not notice any difference:
Windows98 still does not boot if I connect a harddisk to /dev/hdh :-(
But Linux works fine...
Best regards,
Petr Vandrovec
[email protected]

2001-02-28 22:47:45

by Michal Jaegermann

Subject: Re: 2.4 kernels - "attempt to access beyond end of device"

On Wed, Feb 28, 2001 at 10:54:15PM +0000, Petr Vandrovec wrote:
> On 28 Feb 01 at 13:46, Michal Jaegermann wrote:
> > I think that I found what gives me hell with this box, and it
> > looks like this is not Linux at all. Once again, this is Athlon
> > K6 on Asus AV7 mobo and "Award Advanced ACPI BIOS" version 1005C.
>
> K7 on A7V, I believe...

Maybe. 'cat /proc/cpuinfo' says "cpu family: 6" and "model :4"
(and "stepping: 2"). I possibly misinterpreted that. Do not believe
me when I am talking about x86 chips. :-)
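
For reference, the family number is what separates the two generations on AMD parts: family 5 covers K5/K6 and family 6 is the K7 (Athlon/Duron) line, which agrees with the "AMD Athlon(tm) Processor" line in the boot log. A small Python sketch of that lookup follows. The mapping table is an assumption stated here, limited to AMD CPUs, and the function name is hypothetical.

```python
# Assumed mapping for AMD parts only: family 5 covers K5/K6, family 6 is the K7 line.
AMD_FAMILIES = {5: "K5/K6", 6: "K7 (Athlon/Duron)"}

def amd_generation(cpuinfo_text: str) -> str:
    """Pick the 'cpu family' field out of /proc/cpuinfo-style text and map it."""
    fields = {}
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep:
            fields[key.strip()] = value.strip()
    family = int(fields["cpu family"])
    return AMD_FAMILIES.get(family, f"unknown family {family}")

# Values as reported in this message
sample = "cpu family\t: 6\nmodel\t\t: 4\nstepping\t: 2\n"
print(amd_generation(sample))  # K7 (Athlon/Duron)
```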

> > I have more checks to make before I will be fully satisfied but
> > this looks like it.
> ...
> > System Performance Setting [Optimal, Normal]
> ...
>
> Try BIOS 1006. AFAIK 1005D changed some VIA values for 'optimal'.

Is that important here? IDE drives in question were not connected to
on-board controller but the Promise one. Results seem to indicate
that this 'optimal' was important here anyway.

> And 1006 contains newer Promise BIOS - but I did not notice any difference:
> Windows98 still does not boot if I connect a harddisk to /dev/hdh :-(

There is at this moment Windows98 installation on /dev/hde1 and it boots
so far. It got installed and it was booting regardless with these
"other" BIOS settings.

> But Linux works fine...

Hope so....

Michal
[email protected]

2001-02-28 23:11:25

by Petr Vandrovec

Subject: Re: 2.4 kernels - "attempt to access beyond end of device"

On 28 Feb 01 at 15:47, Michal Jaegermann wrote:
> > > I have more checks to make before I will be fully satisfied but
> > > this looks like it.
> > ...
> > > System Performance Setting [Optimal, Normal]
> > ...
> >
> > Try BIOS 1006. AFAIK 1005D changed some VIA values for 'optimal'.
>
> Is that important here? IDE drives in question were not connected to
> on-board controller but the Promise one. Results seem to indicate
> that this 'optimal' was important here anyway.

VIA host-bridge, not VIA-IDE... It is important even if you use Promise
only - look back through archives, there must be something really wrong
with this motherboard.

> > And 1006 contains newer Promise BIOS - but I did not notice any difference:
> > Windows98 still does not boot if I connect a harddisk to /dev/hdh :-(
>
> There is at this moment Windows98 installation on /dev/hde1 and it boots
> so far. It got installed and it was booting regardless with these
> "other" BIOS settings.

Connect UDMA2 CDROM to hda and UDMA2 IDE to hdg, and then watch how Win98
locks up after it prints 'Starting Win98...'. But that's off-topic for
linux-kernel.
Best regards,
Petr Vandrovec
[email protected]

2001-03-01 12:22:29

by Thomas Molina

Subject: Re: 2.4 kernels - "attempt to access beyond end of device"

On Thu, 1 Mar 2001, Petr Vandrovec wrote:

> On 28 Feb 01 at 15:47, Michal Jaegermann wrote:
> > > > I have more checks to make before I will be fully satisfied but
> > > > this looks like it.
> > > ...
> > > > System Performance Setting [Optimal, Normal]
> > > ...
> > >
> > > Try BIOS 1006. AFAIK 1005D changed some VIA values for 'optimal'.
> >
> > Is that important here? IDE drives in question were not connected to
> > on-board controller but the Promise one. Results seem to indicate
> > that this 'optimal' was important here anyway.
>
> VIA host-bridge, not VIA-IDE... It is important even if you use Promise
> only - look back through archives, there must be something really wrong
> with this motherboard.

I'm beginning to believe it may be BIOS revision related. I haven't
tried the Promise controller since I don't have an ATA-100 drive, but I
don't seem to have any of the data corruption or other problems that
people have mentioned. I guess I'll hold off updating the BIOS for now
though. I bought the motherboard not two weeks ago, together with an
Athlon 900MHz processor and it has BIOS version 1004D. I have seen
problems even with Windows using the 1005D version though. My shop has
been selling a LOT of this board; the problems we've seen come back seem
to be specifically related to 1005D. Reflashing to 1004D has cured any
problems I've seen. I've not seen any new hardware come through from
the factory with 1006 though.