2012-02-13 07:45:46

by Meelis Roos

Subject: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

(Resend with proper To-s for OF people)

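This is my first post-3.2 test on a 2-CPU Sun Enterprise 3500 (PCI+SBus
I/O). The kernel oopses early in boot with a NULL pointer dereference in
of_find_node_by_path(), called from of_alias_scan(), so the problem looks
OF-related. The boot log is below, followed by the prtconf output.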

[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
[ 0.000000] PROMLIB: Root node compatible:
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88 (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #64 SMP Sun Feb 12 22:26:40 EET 2012
[ 0.000000] debug: ignoring loglevel setting.
[ 0.000000] bootconsole [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 08:00:20:b6:ee:e2
[ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] Unable to handle kernel NULL pointer dereference
[ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0
[ 0.000000] \|/ ____ \|/
[ 0.000000] "@'/ .. \`@"
[ 0.000000] /_| \__/ |_\
[ 0.000000] \__U_/
[ 0.000000] swapper(0): Oops [#1]
[ 0.000000] TSTATE: 0000000080e01607 TPC: 00000000006459a0 TNPC: 0000000000645964 Y: 00000037 Not tainted
[ 0.000000] TPC: <of_find_node_by_path+0x60/0x80>
[ 0.000000] g0: 0000000000000000 g1: 0000000000000001 g2: 00000000000000ff g3: 00000000000000f0
[ 0.000000] g4: 0000000000853fd0 g5: 0000000000000000 g6: 0000000000834000 g7: 0000000000000050
[ 0.000000] o0: 0000000000000001 o1: fffff8007fced7c0 o2: 0000000001010101 o3: 0000000080808080
[ 0.000000] o4: fffff8007fcc0a4d o5: 00000000000199b5 sp: 0000000000837231 ret_pc: 0000000000645970
[ 0.000000] RPC: <of_find_node_by_path+0x30/0x80>
[ 0.000000] l0: 00000000008ab400 l1: fffff8007fcc1f40 l2: 000000000085c5ec l3: 0000000000000025
[ 0.000000] l4: 00000000005c0400 l5: 00000000008fa5e6 l6: 0000000000000006 l7: 0028280000000000
[ 0.000000] i0: fffff8007fced7c0 i1: 0000000000808fd8 i2: 0000000001010101 i3: 0000000080808080
[ 0.000000] i4: 0000000000876c00 i5: 0000000000000050 i6: 00000000008372e1 i7: 000000000064684c
[ 0.000000] I7: <of_alias_scan+0xcc/0x1c0>
[ 0.000000] Call Trace:
[ 0.000000] [000000000064684c] of_alias_scan+0xcc/0x1c0
[ 0.000000] [00000000008a0350] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [000000000088c540] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000008904d4] paging_init+0x59c/0x6bc
[ 0.000000] [000000000088bebc] setup_arch+0xf8/0x110
[ 0.000000] [000000000088a51c] start_kernel+0x8c/0x34c
[ 0.000000] [00000000006fbf28] tlb_fixup_done+0xa0/0xa8
[ 0.000000] [0000000000000000] (null)
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Caller[000000000064684c]: of_alias_scan+0xcc/0x1c0
[ 0.000000] Caller[00000000008a0350]: of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] Caller[000000000088c540]: prom_build_devicetree+0x10/0x3c
[ 0.000000] Caller[00000000008904d4]: paging_init+0x59c/0x6bc
[ 0.000000] Caller[000000000088bebc]: setup_arch+0xf8/0x110
[ 0.000000] Caller[000000000088a51c]: start_kernel+0x8c/0x34c
[ 0.000000] Caller[00000000006fbf28]: tlb_fixup_done+0xa0/0xa8
[ 0.000000] Caller[0000000000000000]: (null)
[ 0.000000] Instruction DUMP: 01000000 fa5f6050 2aff7ff2 <c25f6018> 901720f0 40034b86 b010001d 81cfe008 01000000
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Press Stop-A (L1-A) to return to the boot prom


System Configuration: Sun Microsystems sun4u
Memory size: 2048 Megabytes
System Peripherals (PROM Nodes):

Node 0xf0029c88
.node: f0029c88
clock-frequency: 05f5e100
previous-reset-reason: 'S-POR'
banner-name: '5-slot Sun Enterprise E3500'
idprom: 01800800.20b6eee2.00000000.b6eee2a9.00000000.00000000.00000000.00000000
reset-reason: 'S-POR'
fatal-reset-info: 00006000
breakpoint-trap: 0000007f
#size-cells: 00000002
name: 'SUNW,Ultra-Enterprise'

Node 0xf002cf50
.node: f002cf50
name: 'packages'

Node 0xf00365c0
.node: f00365c0
iso6429-1983-colors:
name: 'terminal-emulator'

Node 0xf003932c
.node: f003932c
disk-write-fix:
name: 'deblocker'

Node 0xf0039a08
.node: f0039a08
name: 'obp-tftp'

Node 0xf00447cc
.node: f00447cc
name: 'disk-label'

Node 0xf002cfc0
.node: f002cfc0
stdout: ffdc1428
stdin: ffdc1658
eeprom: f005dd0c
mmu: fffe9f70
memory: fffea170
bootargs: 00
bootpath: '/pci@f,4000/SUNW,isptwo@3/sd@2,0:a'
stdout-#lines: ffffffff
name: 'chosen'

Node 0xf002d02c
.node: f002d02c
add-brd-supported-types: '014'
version: 'OBP 3.2.30 2002/10/25 14:03'
model: 'SUNW,3.2'
decode-complete:
aligned-allocator:
relative-addressing:
name: 'openprom'

Node 0xf002d0bc
.node: f002d0bc
name: 'client-services'

Node 0xf002d164
.node: f002d164
disabled-memory-list:
disabled-board-list:
memory-interleave: 'max'
configuration-policy: 'component'
scsi-initiator-id: '7'
keyboard-click?: 'false'
keymap:
ttyb-rts-dtr-off: 'false'
ttyb-ignore-cd: 'true'
ttya-rts-dtr-off: 'false'
ttya-ignore-cd: 'true'
ttyb-mode: '9600,8,n,1,-'
ttya-mode: '9600,8,n,1,-'
sbus-specific-probe:
sbus-probe-default: 'd3120'
mfg-mode: 'off '
diag-level: 'min'
powerfail-time: '0'
#power-cycles: '52'
fcode-debug?: 'false'
output-device: 'ttya'
input-device: 'ttya'
load-base: '16384'
boot-command: 'boot'
auto-boot?: 'false'
watchdog-reboot?: 'false'
diag-file:
diag-device: 'mydisk'
boot-file:
boot-device: 'mydisk'
local-mac-address?: 'false'
ansi-terminal?: 'true'
screen-#columns: '80'
screen-#rows: '34'
silent-mode?: 'false'
use-nvramrc?: 'true'
nvramrc: 64657661.6c696173.206d7964.69736b20.2f706369.40662c34.3030302f.53554e57.2c697370.74776f40.332f7364.40322c30.0a
security-mode: 'none'
security-password:
security-#badlogins: '0'
oem-logo:
oem-logo?: 'false'
oem-banner:
oem-banner?: 'false'
hardware-revision:
last-hardware-update: '0'
diag-switch?: 'false'
name: 'options'

Node 0xf002d1d4
.node: f002d1d4
mydisk: '/pci@f,4000/SUNW,isptwo@3/sd@2,0'
disk: '/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0'
disksocal: '/sbus@2,0/SUNW,socal@d,10000/sf@0,0/ssd@0,0'
diskbrd: '/sbus@3,0/SUNW,fas@3,8800000/sd@a,0'
diskisp: '/sbus@3,0/QLGC,isp@0,10000/sd@0,0'
net: '/sbus@3,0/SUNW,hme@3,8c00000'
cdrom: '/sbus@3,0/SUNW,fas@3,8800000/sd@6,0:f'
tape: '/sbus@3,0/SUNW,fas@3,8800000/st@4,0'
scsi: '/sbus@3,0/SUNW,fas@3,8800000'
disk0: '/sbus@3,0/SUNW,fas@3,8800000/sd@0,0'
disk1: '/sbus@3,0/SUNW,fas@3,8800000/sd@1,0'
disk2: '/sbus@3,0/SUNW,fas@3,8800000/sd@2,0'
disk3: '/sbus@3,0/SUNW,fas@3,8800000/sd@3,0'
disk4: '/sbus@3,0/SUNW,fas@3,8800000/sd@4,0'
disk5: '/sbus@3,0/SUNW,fas@3,8800000/sd@5,0'
tape0: '/sbus@3,0/SUNW,fas@3,8800000/st@4,0'
tape1: '/sbus@3,0/SUNW,fas@3,8800000/st@5,0'
ttya: '/central/fhc/zs@0,902000:a'
ttyb: '/central/fhc/zs@0,902000:b'
keyboard: '/central/fhc/zs@0,904000'
keyboard!: '/central/fhc/zs@0,904000:forcemode'
name: 'aliases'
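
Several properties in this dump (nvramrc, and the compatible and version properties of some nodes) are printed as dot-separated hex words rather than as quoted strings. As a reading aid, here is a minimal Python sketch that turns such a value back into text. The helper name decode_prtconf_hex is hypothetical, and the sketch assumes the value is plain ASCII with NUL bytes separating multiple strings, which holds for the nvramrc value above but is not guaranteed for every property.

```python
from typing import List


def decode_prtconf_hex(value: str) -> List[str]:
    """Decode a dot-separated hex dump into its NUL-separated ASCII strings.

    Hypothetical helper for reading the dump; assumes ASCII content.
    """
    # Drop the dots that group the bytes into 32-bit words.
    raw = bytes.fromhex(value.replace(".", ""))
    # Split on NUL terminators and skip the empty trailing piece, if any.
    return [part.decode("ascii") for part in raw.split(b"\x00") if part]


# nvramrc value copied from the 'options' node above.
nvramrc = (
    "64657661.6c696173.206d7964.69736b20.2f706369.40662c34."
    "3030302f.53554e57.2c697370.74776f40.332f7364.40322c30.0a"
)

if __name__ == "__main__":
    for line in decode_prtconf_hex(nvramrc):
        print(repr(line))
```

For the nvramrc value this yields a single line, 'devalias mydisk /pci@f,4000/SUNW,isptwo@3/sd@2,0\n', which matches the mydisk entry in the aliases node.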

Node 0xf004efb4
.node: f004efb4
reg: 00000000.00000000.00000000.40000000.00000000.40000000.00000000.40000000
available: 00000000.7fce2000.00000000.00014000.00000000.7fc00000.00000000.000d2000.00000000.00000000.00000000.7f7de000
name: 'memory'

Node 0xf004f594
.node: f004f594
translations: 00000000.fffd0000.00000000.00020000.80000000.7ff600b6.00000000.fff70000.00000000.00060000.80000000.7fef80b6.00000000.fff6e000.00000000.00002000.80000000.7fbfe0b6.00000000.fff6c000.00000000.00002000.80000000.7fef60b6.00000000.fff66000.00000000.00002000.800001ff.f890208e.00000000.fff64000.00000000.00002000.800001ff.f890808e.00000000.fff62000.00000000.00002000.800001ff.f890808e.00000000.fff60000.00000000.00002000.800001c4.f830008e.00000000.fff5e000.00000000.00002000.800001d4.f830008e.00000000.fff5c000.00000000.00002000.800001dc.f830008e.00000000.ffdd8000.00000000.00184000.80000000.7fd720b6.00000000.ffdcc000.00000000.0000c000.800001cc.f880408e.00000000.ffdca000.00000000.00002000.80000000.7fd700b6.00000000.ffdc8000.00000000.00002000.800001ff.f820608e.00000000.ffdc4000.00000000.00004000.80000000.7fd3c0b6.00000000.ffdc2000.00000000.00002000.800001ff.f890408e.00000000.ffdc0000.00000000.00002000.80000000.7fd3a0b6.00000000.ffdb2000.00000000.00006000.800001c4.0000
008e.00000000.ffdac000.00000000.00006000.800001c4.0000008e.00000000.ffda4000.00000000.00008000.80000000.7fd680b6.00000000.ffd98000.00000000.0000c000.800001c4.f880408e.00000000.ffd96000.00000000.00002000.800001c4.f830008e.00000000.ffd94000.00000000.00002000.800001c4.0000208e.00000000.ffd92000.00000000.00002000.800001c4.0000208e.00000000.ffd90000.00000000.00002000.800001c4.0000208e.00000000.ffd8a000.00000000.00006000.800001c6.0000008e.00000000.ffd84000.00000000.00006000.800001c6.0000008e.00000000.ffd7c000.00000000.00008000.80000000.7fd600b6.00000000.ffd7a000.00000000.00002000.800001c6.0000208e.00000000.ffd78000.00000000.00002000.800001c6.0000208e.00000000.ffd76000.00000000.00002000.800001c6.0000208e.00000000.ffd70000.00000000.00006000.800001d4.0000008e.00000000.ffd6a000.00000000.00006000.800001d4.0000008e.00000000.ffd62000.00000000.00008000.80000000.7fd580b6.00000000.ffd56000.00000000.0000c000.800001d4.f880408e.00000000.ffd54000.00000000.00002000.800001d4.f830008e.00000000.ffd
52000.00000000.00002000.800001d4.0000208e.00000000.ffd50000.00000000.00002000.800001d4.0000208e.00000000.ffd4e000.00000000.00002000.800001d4.0000208e.00000000.ffd48000.00000000.00006000.800001d6.0000008e.00000000.ffd42000.00000000.00006000.800001d6.0000008e.00000000.ffd3a000.00000000.00008000.80000000.7fd500b6.00000000.ffd38000.00000000.00002000.800001d6.0000208e.00000000.ffd36000.00000000.00002000.800001d6.0000208e.00000000.ffd34000.00000000.00002000.800001d6.0000208e.00000000.ffd32000.00000000.00002000.800001dc.0000408e.00000000.ffd30000.00000000.00002000.880001dc.0100008e.00000000.ffd22000.00000000.0000e000.800001dc.0000008e.00000000.ffd1a000.00000000.00008000.80000000.7fd480b6.00000000.ffd18000.00000000.00002000.800001dc.0000208e.00000000.ffd16000.00000000.00002000.880001dc.0180008e.00000000.ffd08000.00000000.0000e000.800001dc.0000008e.00000000.ffcfc000.00000000.0000c000.800001dc.f880408e.00000000.ffcfa000.00000000.00002000.800001dc.f830008e.00000000.ffcf8000.00000000.00
002000.800001dc.0000008e.00000000.ffcf6000.00000000.00002000.800001dc.0000008e.00000000.ffcf4000.00000000.00002000.800001dc.0000008e.00000000.ffcf2000.00000000.00002000.800001de.0000408e.00000000.ffcf0000.00000000.00002000.880001de.0100008e.00000000.ffce2000.00000000.0000e000.800001de.0000008e.00000000.ffcda000.00000000.00008000.80000000.7fd400b6.00000000.ffcd8000.00000000.00002000.800001de.0000208e.00000000.ffcd6000.00000000.00002000.880001de.0180008e.00000000.ffcc8000.00000000.0000e000.800001de.0000008e.00000000.ffcc6000.00000000.00002000.800001de.0000008e.00000000.ffcc4000.00000000.00002000.800001de.0000008e.00000000.ffcc2000.00000000.00002000.800001de.0000008e.00000000.ffac2000.00000000.00200000.80000000.7f7de0b6.00000000.f07fe000.00000000.00002000.800001ff.f004208e.00000000.f02a0000.00000000.00040000.80000000.7fcfa0b6.00000000.f0080000.00000000.00220000.80000000.7f9de0b6.00000000.f0000000.00000000.00080000.80000000.7ff800b6.00000000.4162a000.00000000.029d6000.80000000.0
1a2a036.00000000.40000000.00000000.00c00000.80000000.00400036.00000000.00002000.00000000.00bfe000.80000000.00002036
existing: 00000000.00000000.00000800.00000000.fffff800.00000000.00000800.00000000
available: fffff800.00000000.000007fc.00000000.00000001.00000000.000007ff.00000000.00000000.ffff0000.00000000.0000e000.00000000.00000000.00000000.f0000000.00000000.ffdb8000.00000000.00008000.00000000.f0800000.00000000.0f2c2000
page-size: 00002000
name: 'virtual-memory'

Node 0xf005da70
.node: f005da70
ranges: 00000000.f8000000.000001ff.f8000000.08000000
reg: 000001ff.00000000.00000000.08000000
name: 'central'

Node 0xf005db8c
.node: f005db8c
board-model: 'SUNW,501-2511'
ranges: 00000000.00000000.00000000.f8000000.08000000
reg: 00000000.f8800000.00000110.00000000.f8802000.00000010.00000000.f8804000.00000020.00000000.f8806000.00000020.00000000.f8808000.00000020.00000000.f880a000.00000020
name: 'fhc'

Node 0xf005dd0c
.node: f005dd0c
address: fff62000
watchdog-enable:
interrupts: 0000003a
reg: 00000000.00908000.00002000
model: 'mk48t59'
name: 'eeprom'

Node 0xf005de3c
.node: f005de3c
port-b-ignore-cd:
port-a-ignore-cd:
address: fff66000
interrupts: 00000039
device_type: 'serial'
reg: 00000000.00902000.00000008
name: 'zs'

Node 0xf005df14
.node: f005df14
address: ffdc2000
port-b-ignore-cd:
port-a-ignore-cd:
keyboard:
interrupts: 00000039
device_type: 'serial'
reg: 00000000.00904000.00000008
name: 'zs'

Node 0xf005e05c
.node: f005e05c
reg: 00000000.00900000.00000008.00000000.00906000.00000060.00000000.0090c000.00000001
interrupts: 00000038
name: 'clock-board'

Node 0xf00df7bc
.node: f00df7bc
board-type: 'cpu'
board-model: 'SUNW,501-2557'
ranges: 00000000.00000000.000001cc.f8000000.08000000
central-space:
board#: 00000003
reg: 000001cc.f8800000.00000000.00000110.000001cc.f8802000.00000000.00000010.000001cc.f8804000.00000000.00000020.000001cc.f8806000.00000000.00000020.000001cc.f8808000.00000000.00000020.000001cc.f880a000.00000000.00000020
manfid#: 0000003e
version#: 00000001
model: 'SUNW,fhc0FA0'
name: 'fhc'

Node 0xf00dfa08
.node: f00dfa08
reg: 00000000.01000000.00008000.00000000.02000000.01000000
bank-0-status: 'ok'
bank-1-status: 'ok'
manfid#: 0000003e
version#: 00000005
model: 'SUNW,ac0F9E'
device_type: 'memory-controller'
name: 'ac'

Node 0xf00dfb90
.node: f00dfb90
reg: 00000000.00600000.00000010
name: 'simm-status'

Node 0xf00dfc28
.node: f00dfc28
interrupts: 0000003b
reg: 00000000.00400000.00000010
name: 'environment'

Node 0xf00dfce4
.node: f00dfce4
reg: 00000000.00200000.00008000.00000000.00280000.00008000
name: 'sram'

Node 0xf00dfd80
.node: f00dfd80
version: 4f425020.2020332e.322e3330.20323030.322f3130.2f323520.31343a30.3300504f.53542020.332e392e.33302032.3030322f.31302f32.35203134.3a303400
model: 'SUNW,525-1431'
reg: 00000000.00000000.00080000
name: 'flashprom'

Node 0xf00dfec4
.node: f00dfec4
manufacturer#: 00000017
implementation#: 00000011
mask#: 000000a0
sparc-version: 00000009
ecache-associativity: 00000001
ecache-line-size: 00000040
ecache-size: 00800000
#dtlb-entries: 00000040
dcache-associativity: 00000001
dcache-line-size: 00000020
dcache-size: 00004000
#itlb-entries: 00000040
icache-associativity: 00000002
icache-line-size: 00000020
icache-size: 00004000
upa-portid: 00000006
clock-frequency: 17d78400
rated-frequency: 17d78400
reg: 000001cc.00000000.00000000.00000008
board#: 00000003
device_type: 'cpu'
name: 'SUNW,UltraSPARC-II'

Node 0xf00e0284
.node: f00e0284
manufacturer#: 00000017
implementation#: 00000011
mask#: 000000a0
sparc-version: 00000009
ecache-associativity: 00000001
ecache-line-size: 00000040
ecache-size: 00800000
#dtlb-entries: 00000040
dcache-associativity: 00000001
dcache-line-size: 00000020
dcache-size: 00004000
#itlb-entries: 00000040
icache-associativity: 00000002
icache-line-size: 00000020
icache-size: 00004000
upa-portid: 00000007
clock-frequency: 17d78400
rated-frequency: 17d78400
reg: 000001ce.00000000.00000000.00000008
board#: 00000003
device_type: 'cpu'
name: 'SUNW,UltraSPARC-II'

Node 0xf006f7bc
.node: f006f7bc
ranges: 00000001.00000000.000001c5.10000000.10000000.00000002.00000000.000001c5.20000000.10000000.0000000d.00000000.000001c5.d0000000.10000000
interrupts: 000000b4.000000b5.000000b6.000000a5.000000aa.000000b7
version#: 00000001
implementation#: 00000000
bus-parity-generated:
address: ffdb2000
scsi-initiator-id: 00000007
model: 'SUNW,sysio'
reg: 000001c4.00000000.00000000.00006000
slot-address-bits: 0000001c
up-burst-sizes: 0078007f
burst-sizes: 00f8007f
device_type: 'sbus'
name: 'sbus'
upa-portid: 00000002
clock-frequency: 017d7840
board#: 00000001

Node 0xf0075084
.node: f0075084
wwn: 20040800.20b6eee2
intr: 00000003.00000000
interrupts: 00000022
ranges: 00000000.00000000.0000000d.00010240.00000018.00000001.00000000.0000000d.00010258.00000018.00000010.00000000.0000000d.00010300.00000008.00000011.00000000.0000000d.00010308.00000008
reg: 0000000d.00010000.00010018
device_type: 'socal'
version: '@(#) FCode 1.12 99/07/30'
manufacturer: 'SUNW'
model: '501-3060'
name: 'SUNW,socal'

Node 0xf007c8ec
.node: f007c8ec
port-wwn: 20050800.20b6eee2
reg: 00000000.00000000.00000018.00000010.00000000.00000008
port#: 00000000
#address-cells: 00000004
device_type: 'scsi-3'
name: 'sf'

Node 0xf007e704
.node: f007e704
device_type: 'block'
name: 'ssd'

Node 0xf007efd4
.node: f007efd4
port-wwn: 20060800.20b6eee2
reg: 00000001.00000000.00000018.00000011.00000000.00000008
port#: 00000001
#address-cells: 00000004
device_type: 'scsi-3'
name: 'sf'

Node 0xf007f670
.node: f007f670
device_type: 'block'
name: 'ssd'

Node 0xf0080050
.node: f0080050
local-mac-address: 080020ee.2248
gem-rev: 00000000
burst-sizes: 0078007f
shared-pins: 'serdes'
board-rev: 00000005
interrupts: 00000004
compatible: 'SUNW,sbus-gem'
model: 'SUNW,sbus-gem'
has-fcode: ' '
version: '1.7'
device_type: 'network'
address-bits: 00000030
max-frame-size: 00004000
reg: 00000001.00100000.00000014.00000001.00200000.00009060
name: 'network'

Node 0xf0086420
.node: f0086420
scsi-initiator-id: 00000007
isp-fcode: '1.21 95/05/18'
device_type: 'scsi'
intr: 00000003.00000000
interrupts: 00000003
wide: 00
clock-frequency: 02625a00
reg: 00000002.00010000.00000450
64-bit-clean: 00
model: 'QLGC,ISP1000'
name: 'QLGC,isp'

Node 0xf008bc8c
.node: f008bc8c
device_type: 'block'
name: 'sd'

Node 0xf008c4a0
.node: f008c4a0
device_type: 'byte'
name: 'st'

Node 0xf0071c1c
.node: f0071c1c
board-type: 'dual-sbus-soc+'
manfid#: 0000003e
version#: 00000001
ranges: 00000000.00000000.000001c4.f8000000.08000000
reg: 000001c4.f8800000.00000000.00000110.000001c4.f8802000.00000000.00000010.000001c4.f8804000.00000000.00000020.000001c4.f8806000.00000000.00000020.000001c4.f8808000.00000000.00000020.000001c4.f880a000.00000000.00000020
board-model: 'SUNW,501-2558'
model: 'SUNW,fhc0FA0'
board#: 00000001
name: 'fhc'

Node 0xf00720cc
.node: f00720cc
manfid#: 0000003e
version#: 00000005
device_type: 'memory-controller'
reg: 00000000.01000000.00008000.00000000.02000000.01000000
model: 'SUNW,ac0F9E'
name: 'ac'

Node 0xf0072204
.node: f0072204
interrupts: 0000003b
reg: 00000000.00400000.00000010
name: 'environment'

Node 0xf00722c0
.node: f00722c0
version: 46434f44.4520312e.382e3330.20323030.322f3130.2f323520.31343a30.32006950.4f535420.332e342e.33302032.3030322f.31302f32.35203134.3a303300
model: 'SUNW,525-1757'
reg: 00000000.00000000.00080000
name: 'flashprom'

Node 0xf00726f8
.node: f00726f8
address: ffd96000
interrupts: 0000003a
reg: 00000000.00300000.00002000
model: 'mk48t59'
name: 'eeprom'

Node 0xf00727f0
.node: f00727f0
reg: 00000000.00500000.00000010
name: 'sbus-speed'

Node 0xf00728e4
.node: f00728e4
address: ffd95c00.ffd91860.ffd93060
interrupts: 000000b0.000000b1
reg: 000001c4.00003c00.00000000.00000020.000001c4.00003860.00000000.00000010.000001c4.00003060.00000000.00000010
board#: 00000001
name: 'counter-timer'

Node 0xf0072ad4
.node: f0072ad4
ranges: 00000000.00000000.000001c7.00000000.10000000.00000003.00000000.000001c7.30000000.10000000
interrupts: 000000f4.000000f5.000000f6.000000e5.000000ea.000000f7
version#: 00000001
implementation#: 00000000
bus-parity-generated:
address: ffd8a000
scsi-initiator-id: 00000007
model: 'SUNW,sysio'
reg: 000001c6.00000000.00000000.00006000
slot-address-bits: 0000001c
up-burst-sizes: 0078007f
burst-sizes: 00f8007f
device_type: 'sbus'
name: 'sbus'
upa-portid: 00000003
clock-frequency: 017d7840
board#: 00000001

Node 0xf008d070
.node: f008d070
hm-rev: 00000022
device_type: 'network'
intr: 00000004.00000000
interrupts: 00000004
address-bits: 00000030
max-frame-size: 00004000
reg: 00000003.08c00000.00000108.00000003.08c02000.00002000.00000003.08c04000.00002000.00000003.08c06000.00002000.00000003.08c07000.00000020
name: 'SUNW,hme'

Node 0xf0093c14
.node: f0093c14
hm-rev: 00000022
device_type: 'scsi'
clock-frequency: 02625a00
intr: 00000003.00000000
interrupts: 00000003
reg: 00000003.08800000.00000010.00000003.08810000.00000040
name: 'SUNW,fas'

Node 0xf009864c
.node: f009864c
device_type: 'block'
name: 'sd'

Node 0xf0098f08
.node: f0098f08
device_type: 'byte'
name: 'st'

Node 0xf0099bf4
.node: f0099bf4
local-mac-address: 08002093.7994
hm-rev: 00000022
device_type: 'network'
intr: 00000004.00000000
interrupts: 00000004
address-bits: 00000030
max-frame-size: 00004000
reg: 00000000.08c00000.00000108.00000000.08c02000.00002000.00000000.08c04000.00002000.00000000.08c06000.00002000.00000000.08c07000.00000020
model: 'SUNW,sbus-qfe'
version: '1.11'
name: 'SUNW,qfe'

Node 0xf009fba8
.node: f009fba8
local-mac-address: 08002093.7995
hm-rev: 00000022
device_type: 'network'
intr: 00000004.00000000
interrupts: 00000004
address-bits: 00000030
max-frame-size: 00004000
reg: 00000000.08c10000.00000108.00000000.08c12000.00002000.00000000.08c14000.00002000.00000000.08c16000.00002000.00000000.08c17000.00000020
model: 'SUNW,sbus-qfe'
version: '1.11'
name: 'SUNW,qfe'

Node 0xf00a5a84
.node: f00a5a84
local-mac-address: 08002093.7996
hm-rev: 00000022
device_type: 'network'
intr: 00000004.00000000
interrupts: 00000004
address-bits: 00000030
max-frame-size: 00004000
reg: 00000000.08c20000.00000108.00000000.08c22000.00002000.00000000.08c24000.00002000.00000000.08c26000.00002000.00000000.08c27000.00000020
model: 'SUNW,sbus-qfe'
version: '1.11'
name: 'SUNW,qfe'

Node 0xf00ab960
.node: f00ab960
local-mac-address: 08002093.7997
hm-rev: 00000022
device_type: 'network'
intr: 00000004.00000000
interrupts: 00000004
address-bits: 00000030
max-frame-size: 00004000
reg: 00000000.08c30000.00000108.00000000.08c32000.00002000.00000000.08c34000.00002000.00000000.08c36000.00002000.00000000.08c37000.00000020
model: 'SUNW,sbus-qfe'
version: '1.11'
name: 'SUNW,qfe'

Node 0xf0074e94
.node: f0074e94
address: ffd7bc00.ffd77860.ffd79060
interrupts: 000000f0.000000f1
reg: 000001c6.00003c00.00000000.00000020.000001c6.00003860.00000000.00000010.000001c6.00003060.00000000.00000010
board#: 00000001
name: 'counter-timer'

Node 0xf014f7bc
.node: f014f7bc
ranges: 00000001.00000000.000001d5.10000000.10000000.00000002.00000000.000001d5.20000000.10000000.0000000d.00000000.000001d5.d0000000.10000000
interrupts: 000002b4.000002b5.000002b6.000002a5.000002aa.000002b7
version#: 00000001
implementation#: 00000000
bus-parity-generated:
address: ffd70000
scsi-initiator-id: 00000007
model: 'SUNW,sysio'
reg: 000001d4.00000000.00000000.00006000
slot-address-bits: 0000001c
up-burst-sizes: 0078007f
burst-sizes: 00f8007f
device_type: 'sbus'
name: 'sbus'
upa-portid: 0000000a
clock-frequency: 017d7840
board#: 00000005

Node 0xf0155084
.node: f0155084
wwn: 20140800.20b6eee2
intr: 00000003.00000000
interrupts: 00000022
ranges: 00000000.00000000.0000000d.00010240.00000018.00000001.00000000.0000000d.00010258.00000018.00000010.00000000.0000000d.00010300.00000008.00000011.00000000.0000000d.00010308.00000008
reg: 0000000d.00010000.00010018
device_type: 'socal'
version: '@(#) FCode 1.12 99/07/30'
manufacturer: 'SUNW'
model: '501-3060'
name: 'SUNW,socal'

Node 0xf015c8ec
.node: f015c8ec
port-wwn: 20150800.20b6eee2
reg: 00000000.00000000.00000018.00000010.00000000.00000008
port#: 00000000
#address-cells: 00000004
device_type: 'scsi-3'
name: 'sf'

Node 0xf015e704
.node: f015e704
device_type: 'block'
name: 'ssd'

Node 0xf015efd4
.node: f015efd4
port-wwn: 20160800.20b6eee2
reg: 00000001.00000000.00000018.00000011.00000000.00000008
port#: 00000001
#address-cells: 00000004
device_type: 'scsi-3'
name: 'sf'

Node 0xf015f670
.node: f015f670
device_type: 'block'
name: 'ssd'

Node 0xf0160050
.node: f0160050
scsi-initiator-id: 00000007
clock-frequency: 03938700
differential: 00
isp-fcode: '1.28 99/11/08'
device_type: 'scsi'
intr: 00000003.00000000
interrupts: 00000003
wide: 00
fast-20: 00
reg: 00000001.00010000.00000450
64-bit-clean: 00
model: 'QLGC,ISP1000U'
name: 'QLGC,isp'

Node 0xf0165dc8
.node: f0165dc8
device_type: 'block'
name: 'sd'

Node 0xf01665b8
.node: f01665b8
device_type: 'byte'
name: 'st'

Node 0xf016713c
.node: f016713c
cache-linesize: 00000010
cache-size: 00008000
intr: 00000002.00000000
interrupts: 00000002
reg: 00000002.00010000.00000080.00000002.00020000.00000068.00000002.00030000.0000000c
model: 'SUNW,501-1763-01'
name: 'SUNW,SunPC'

Node 0xf0151c1c
.node: f0151c1c
board-type: 'dual-sbus-soc+'
manfid#: 0000003e
version#: 00000001
ranges: 00000000.00000000.000001d4.f8000000.08000000
reg: 000001d4.f8800000.00000000.00000110.000001d4.f8802000.00000000.00000010.000001d4.f8804000.00000000.00000020.000001d4.f8806000.00000000.00000020.000001d4.f8808000.00000000.00000020.000001d4.f880a000.00000000.00000020
board-model: 'SUNW,501-2558'
model: 'SUNW,fhc0FA0'
board#: 00000005
name: 'fhc'

Node 0xf01520cc
.node: f01520cc
manfid#: 0000003e
version#: 00000005
device_type: 'memory-controller'
reg: 00000000.01000000.00008000.00000000.02000000.01000000
model: 'SUNW,ac0F9E'
name: 'ac'

Node 0xf0152204
.node: f0152204
interrupts: 0000003b
reg: 00000000.00400000.00000010
name: 'environment'

Node 0xf01522c0
.node: f01522c0
version: 46434f44.4520312e.382e3330.20323030.322f3130.2f323520.31343a30.32006950.4f535420.332e342e.33302032.3030322f.31302f32.35203134.3a303300
model: 'SUNW,525-1757'
reg: 00000000.00000000.00080000
name: 'flashprom'

Node 0xf01526f8
.node: f01526f8
address: ffd54000
interrupts: 0000003a
reg: 00000000.00300000.00002000
model: 'mk48t59'
name: 'eeprom'

Node 0xf01527f0
.node: f01527f0
reg: 00000000.00500000.00000010
name: 'sbus-speed'

Node 0xf01528e4
.node: f01528e4
address: ffd53c00.ffd4f860.ffd51060
interrupts: 000002b0.000002b1
reg: 000001d4.00003c00.00000000.00000020.000001d4.00003860.00000000.00000010.000001d4.00003060.00000000.00000010
board#: 00000005
name: 'counter-timer'

Node 0xf0152ad4
.node: f0152ad4
ranges: 00000000.00000000.000001d7.00000000.10000000.00000003.00000000.000001d7.30000000.10000000
interrupts: 000002f4.000002f5.000002f6.000002e5.000002ea.000002f7
version#: 00000001
implementation#: 00000000
bus-parity-generated:
address: ffd48000
scsi-initiator-id: 00000007
model: 'SUNW,sysio'
reg: 000001d6.00000000.00000000.00006000
slot-address-bits: 0000001c
up-burst-sizes: 0078007f
burst-sizes: 00f8007f
device_type: 'sbus'
name: 'sbus'
upa-portid: 0000000b
clock-frequency: 017d7840
board#: 00000005

Node 0xf01673a8
.node: f01673a8
hm-rev: 00000022
device_type: 'network'
intr: 00000004.00000000
interrupts: 00000004
address-bits: 00000030
max-frame-size: 00004000
reg: 00000003.08c00000.00000108.00000003.08c02000.00002000.00000003.08c04000.00002000.00000003.08c06000.00002000.00000003.08c07000.00000020
name: 'SUNW,hme'

Node 0xf016df4c
.node: f016df4c
hm-rev: 00000022
device_type: 'scsi'
clock-frequency: 02625a00
intr: 00000003.00000000
interrupts: 00000003
reg: 00000003.08800000.00000010.00000003.08810000.00000040
name: 'SUNW,fas'

Node 0xf0172984
.node: f0172984
device_type: 'block'
name: 'sd'

Node 0xf0173240
.node: f0173240
device_type: 'byte'
name: 'st'

Node 0xf0173f2c
.node: f0173f2c
scsi-initiator-id: 00000007
clock-frequency: 03938700
differential: 00
isp-fcode: '1.28 99/11/08'
device_type: 'scsi'
intr: 00000003.00000000
interrupts: 00000003
wide: 00
fast-20: 00
reg: 00000000.00010000.00000450
64-bit-clean: 00
model: 'QLGC,ISP1000U'
name: 'QLGC,isp'

Node 0xf0179ca4
.node: f0179ca4
device_type: 'block'
name: 'sd'

Node 0xf017a494
.node: f017a494
device_type: 'byte'
name: 'st'

Node 0xf0154e94
.node: f0154e94
address: ffd39c00.ffd35860.ffd37060
interrupts: 000002f0.000002f1
reg: 000001d6.00003c00.00000000.00000020.000001d6.00003860.00000000.00000010.000001d6.00003060.00000000.00000010
board#: 00000005
name: 'counter-timer'

Node 0xf01bf7bc
.node: f01bf7bc
available: 82000000.00000000.02808000.00000000.7d7f8000.81000000.00000000.00000400.00000000.0000fc00
bus-range: 00000000.00000000
version#: 00000004
implementation#: 00000000
clock-frequency: 01f78a40
upa-portid: 0000000e
interrupts: 000003b1.000003ae.000003af.000003a5.000003a8.000003b2
ranges: 00000000.00000000.00000000.000001dc.01000000.00000000.00800000.01000000.00000000.00000000.000001dc.02010000.00000000.00010000.02000000.00000000.00000000.000001dd.80000000.00000000.80000000.03000000.00000000.00000000.000001dd.80000000.00000000.80000000
address: ffd32000.ffd30000.ffd22000
reg: 000001dc.00004000.00000000.00002000.000001dc.01000000.00000000.00000100.000001dc.00000000.00000000.0000d000
board#: 00000007
model: 'SUNW,psycho'
compatible: 'pci108e,8000'
bus-parity-generated:
#size-cells: 00000002
#address-cells: 00000003
device_type: 'pci'
name: 'pci'

Node 0xf01d3d84
.node: f01d3d84
assigned-addresses: 82000810.00000000.01000000.00000000.01000000.82000814.00000000.02000000.00000000.00800000
power-consumption: 00000000.00e4e1c0
reg: 00000800.00000000.00000000.00000000.00000000.02000810.00000000.00000000.00000000.01000000.02000814.00000000.00000000.00000000.00800000
compatible: 70636931.3038652c.31303030.00706369.636c6173.732c3036.38303030.00
name: 'pci108e,1000'
66mhz-capable: 00000000
udf-supported: 00000000
fast-back-to-back: 00000001
devsel-speed: 00000001
class-code: 00068000
interrupts: 00000001
max-latency: 00000019
min-grant: 0000000a
revision-id: 00000001
device-id: 00001000
vendor-id: 0000108e

Node 0xf01d4058
.node: f01d4058
assigned-addresses: 82000910.00000000.02800000.00000000.00007030
compatible: 'pci108e,1001'
version: '1.17'
device_type: 'network'
hm-rev: 000000c1
address-bits: 00000030
max-frame-size: 00004000
reg: 00000900.00000000.00000000.00000000.00000000.02000910.00000000.00000000.00000000.00007030
model: 'SUNW,cheerio'
name: 'SUNW,hme'
66mhz-capable: 00000000
udf-supported: 00000000
fast-back-to-back: 00000001
devsel-speed: 00000001
class-code: 00020000
interrupts: 000003a1
max-latency: 00000005
min-grant: 0000000a
revision-id: 00000001
device-id: 00001001
vendor-id: 0000108e

Node 0xf01c88e0
.node: f01c88e0
available: 82800000.00000000.00002100.00000000.7fffdf00.81800000.00000000.00000440.00000000.0000fbc0
bus-range: 00000080.00000080
version#: 00000004
implementation#: 00000000
clock-frequency: 01f78a40
slot-names: 00000004.7063692d.736c6f74.203000
upa-portid: 0000000e
66mhz-capable:
interrupts: 000003b0.000003ae.000003af.000003a5.000003a8.000003b2
ranges: 00800000.00000000.00000000.000001dc.01000000.00000000.00800000.01000000.00000000.00000000.000001dc.02000000.00000000.00010000.02000000.00000000.00000000.000001dd.00000000.00000000.80000000.03000000.00000000.00000000.000001dd.00000000.00000000.80000000
address: ffd18000.ffd16000.ffd08000
reg: 000001dc.00002000.00000000.00002000.000001dc.01800000.00000000.00000100.000001dc.00000000.00000000.0000d000
board#: 00000007
model: 'SUNW,psycho'
compatible: 'pci108e,8000'
bus-parity-generated:
#size-cells: 00000002
#address-cells: 00000003
device_type: 'pci'
name: 'pci'

Node 0xf01e7cb8
.node: f01e7cb8
assigned-addresses: 81801020.00000000.00000400.00000000.00000020
power-consumption: 00000000.00e4e1c0
reg: 00801000.00000000.00000000.00000000.00000000.01801020.00000000.00000000.00000000.00000020
compatible: 70636939.32352c31.32333400.70636931.3130362c.33303338.00706369.636c6173.732c3063.30333030.00757362.00
name: 'usb'
66mhz-capable: 00000000
udf-supported: 00000000
fast-back-to-back: 00000000
devsel-speed: 00000001
class-code: 000c0300
interrupts: 00000001
subsystem-vendor-id: 00000925
subsystem-id: 00001234
max-latency: 00000000
min-grant: 00000000
revision-id: 00000050
device-id: 00003038
vendor-id: 00001106

Node 0xf01e7fd4
.node: f01e7fd4
assigned-addresses: 81801120.00000000.00000420.00000000.00000020
reg: 00801100.00000000.00000000.00000000.00000000.01801120.00000000.00000000.00000000.00000020
compatible: 70636939.32352c31.32333400.70636931.3130362c.33303338.00706369.636c6173.732c3063.30333030.00757362.00
name: 'usb'
66mhz-capable: 00000000
udf-supported: 00000000
fast-back-to-back: 00000000
devsel-speed: 00000001
class-code: 000c0300
interrupts: 00000002
subsystem-vendor-id: 00000925
subsystem-id: 00001234
max-latency: 00000000
min-grant: 00000000
revision-id: 00000050
device-id: 00003038
vendor-id: 00001106

Node 0xf01e82c0
.node: f01e82c0
assigned-addresses: 82801210.00000000.00002000.00000000.00000100
reg: 00801200.00000000.00000000.00000000.00000000.02801210.00000000.00000000.00000000.00000100
compatible: 70636939.32352c31.32333400.70636931.3130362c.33313034.00706369.636c6173.732c3063.30333230.00757362.00
name: 'usb'
66mhz-capable: 00000000
udf-supported: 00000000
fast-back-to-back: 00000000
devsel-speed: 00000001
class-code: 000c0320
interrupts: 00000003
subsystem-vendor-id: 00000925
subsystem-id: 00001234
max-latency: 00000000
min-grant: 00000000
revision-id: 00000051
device-id: 00003104
vendor-id: 00001106

Node 0xf01c923c
.node: f01c923c
board-type: 'dual-pci'
manfid#: 0000003e
version#: 00000001
ranges: 00000000.00000000.000001dc.f8000000.08000000
reg: 000001dc.f8800000.00000000.00000110.000001dc.f8802000.00000000.00000010.000001dc.f8804000.00000000.00000020.000001dc.f8806000.00000000.00000020.000001dc.f8808000.00000000.00000020.000001dc.f880a000.00000000.00000020
board-model: 'SUNW,501-3023'
model: 'SUNW,fhc0FA0'
board#: 00000007
name: 'fhc'

Node 0xf01c9718
.node: f01c9718
manfid#: 0000003e
version#: 00000005
device_type: 'memory-controller'
reg: 00000000.01000000.00008000.00000000.02000000.01000000
model: 'SUNW,ac0F9E'
name: 'ac'

Node 0xf01c9850
.node: f01c9850
interrupts: 0000003b
reg: 00000000.00400000.00000010
name: 'environment'

Node 0xf01c990c
.node: f01c990c
version: 46434f44.4520312e.382e3330.20323030.322f3130.2f323520.31343a30.32006950.4f535420.332e302e.33302032.3030322f.31302f32.35203134.3a303300
model: 'SUNW,525-1680'
reg: 00000000.00000000.00080000
name: 'flashprom'

Node 0xf01c9d44
.node: f01c9d44
address: ffcfa000
interrupts: 0000003a
reg: 00000000.00300000.00002000
model: 'mk48t59'
name: 'eeprom'

Node 0xf01c9e3c
.node: f01c9e3c
reg: 00000000.00500000.00000010
name: 'sbus-speed'

Node 0xf01c9f28
.node: f01c9f28
address: ffcf9c00.ffcf5860.ffcf7060
interrupts: 000003ac.000003ad
reg: 000001dc.00001c00.00000000.00000020.000001dc.00001860.00000000.00000010.000001dc.00001060.00000000.00000010
board#: 00000007
name: 'counter-timer'

Node 0xf01ca118
.node: f01ca118
available: 82000000.00000000.00020000.00000000.7ffe0000.81000000.00000000.00000500.00000000.0000fb00
bus-range: 00000000.00000000
version#: 00000004
implementation#: 00000000
clock-frequency: 01f78a40
upa-portid: 0000000f
interrupts: 000003f1.000003ee.000003ef.000003e5.000003e8.000003f2
ranges: 00000000.00000000.00000000.000001de.01000000.00000000.00800000.01000000.00000000.00000000.000001de.02010000.00000000.00010000.02000000.00000000.00000000.000001df.80000000.00000000.80000000.03000000.00000000.00000000.000001df.80000000.00000000.80000000
address: ffcf2000.ffcf0000.ffce2000
reg: 000001de.00004000.00000000.00002000.000001de.01000000.00000000.00000100.000001de.00000000.00000000.0000d000
board#: 00000007
model: 'SUNW,psycho'
compatible: 'pci108e,8000'
bus-parity-generated:
#size-cells: 00000002
#address-cells: 00000003
device_type: 'pci'
name: 'pci'

Node 0xf01dc3c0
.node: f01dc3c0
assigned-addresses: 81001810.00000000.00000400.00000000.00000100.82001814.00000000.00002000.00000000.00001000.82001830.00000000.00010000.00000000.00010000
model: 'QLGC,ISP1040B'
scsi-initiator-id: 00000007
clock-frequency: 03938700
alternate-reg: 00000000.00000000.00000000.00000000.00000000.02001814.00000000.00000000.00000000.00000100.01001810.00000000.00000000.00000000.00000100
reg: 00001800.00000000.00000000.00000000.00000000.01001810.00000000.00000000.00000000.00000100.02001814.00000000.00000000.00000000.00001000.02001830.00000000.00000000.00000000.00010000
power-consumption: 00000000.00000000.00895440.00895440
manufacturer: 'QLGC'
device_type: 'scsi'
name: 'SUNW,isptwo'
66mhz-capable: 00000000
udf-supported: 00000000
fast-back-to-back: 00000000
devsel-speed: 00000001
class-code: 00010000
interrupts: 000003e0
max-latency: 00000000
min-grant: 00000000
revision-id: 00000002
device-id: 00001020
vendor-id: 00001077

Node 0xf01e6534
.node: f01e6534
device_type: 'block'
name: 'sd'

Node 0xf01e7010
.node: f01e7010
device_type: 'byte'
name: 'st'

Node 0xf01d320c
.node: f01d320c
available: 82800000.00000000.00004000.00000000.7fffc000.81800000.00000000.00000900.00000000.0000f700
bus-range: 00000080.00000080
version#: 00000004
implementation#: 00000000
clock-frequency: 03ef1480
slot-names: 00000004.7063692d.736c6f74.203100
upa-portid: 0000000f
66mhz-capable:
interrupts: 000003f0.000003ee.000003ef.000003e5.000003e8.000003f2
ranges: 00800000.00000000.00000000.000001de.01000000.00000000.00800000.01000000.00000000.00000000.000001de.02000000.00000000.00010000.02000000.00000000.00000000.000001df.00000000.00000000.80000000.03000000.00000000.00000000.000001df.00000000.00000000.80000000
address: ffcd8000.ffcd6000.ffcc8000
reg: 000001de.00002000.00000000.00002000.000001de.01800000.00000000.00000100.000001de.00000000.00000000.0000d000
board#: 00000007
model: 'SUNW,psycho'
compatible: 'pci108e,8000'
bus-parity-generated:
#size-cells: 00000002
#address-cells: 00000003
device_type: 'pci'
name: 'pci'

Node 0xf01e86d0
.node: f01e86d0
assigned-addresses: 81801010.00000000.00000400.00000000.00000100.83801014.00000000.00002000.00000000.00002000.8180101c.00000000.00000800.00000000.00000100
power-consumption: 00000000.00e4e1c0
reg: 00801000.00000000.00000000.00000000.00000000.01801010.00000000.00000000.00000000.00000100.03801014.00000000.00000000.00000000.00002000.0180101c.00000000.00000000.00000000.00000100
compatible: 70636939.3030352c.34340070.63693930.30352c38.30313700.70636963.6c617373.2c303130.30303000.73637369.00
name: 'scsi'
66mhz-capable: 00000001
udf-supported: 00000000
fast-back-to-back: 00000000
devsel-speed: 00000002
class-code: 00010000
interrupts: 00000001
subsystem-vendor-id: 00009005
subsystem-id: 00000044
max-latency: 00000019
min-grant: 00000028
revision-id: 00000010
device-id: 00008017
vendor-id: 00009005

Node 0xf01d3b68
.node: f01d3b68
address: ffcc7c00.ffcc3860.ffcc5060
interrupts: 000003ec.000003ed
reg: 000001de.00001c00.00000000.00000020.000001de.00001860.00000000.00000010.000001de.00001060.00000000.00000010
board#: 00000007
name: 'counter-timer'


--
Meelis Roos ([email protected])


2012-02-13 08:06:22

by Grant Likely

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Mon, Feb 13, 2012 at 09:45:40AM +0200, Meelis Roos wrote:
> (Resend with proper To-s for OF people)
>
> This is my first post-3.2 test on 2-CPU Sun Enterprise 3500 (PCI+SBus
> IO). prtconf is also below. Something OF-related seems to be happening
> here.
>
> [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
> [ 0.000000] PROMLIB: Root node compatible:
> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88 (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #64 SMP Sun Feb 12 22:26:40 EET 2012
> [ 0.000000] debug: ignoring loglevel setting.
> [ 0.000000] bootconsole [earlyprom0] enabled
> [ 0.000000] ARCH: SUN4U
> [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2
> [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
> [ 0.000000] Remapping the kernel... done.
> [ 0.000000] Unable to handle kernel NULL pointer dereference
> [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
> [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0
> [ 0.000000] \|/ ____ \|/
> [ 0.000000] "@'/ .. \`@"
> [ 0.000000] /_| \__/ |_\
> [ 0.000000] \__U_/
> [ 0.000000] swapper(0): Oops [#1]
> [ 0.000000] TSTATE: 0000000080e01607 TPC: 00000000006459a0 TNPC: 0000000000645964 Y: 00000037 Not tainted
> [ 0.000000] TPC: <of_find_node_by_path+0x60/0x80>
> [ 0.000000] g0: 0000000000000000 g1: 0000000000000001 g2: 00000000000000ff g3: 00000000000000f0
> [ 0.000000] g4: 0000000000853fd0 g5: 0000000000000000 g6: 0000000000834000 g7: 0000000000000050
> [ 0.000000] o0: 0000000000000001 o1: fffff8007fced7c0 o2: 0000000001010101 o3: 0000000080808080
> [ 0.000000] o4: fffff8007fcc0a4d o5: 00000000000199b5 sp: 0000000000837231 ret_pc: 0000000000645970
> [ 0.000000] RPC: <of_find_node_by_path+0x30/0x80>
> [ 0.000000] l0: 00000000008ab400 l1: fffff8007fcc1f40 l2: 000000000085c5ec l3: 0000000000000025
> [ 0.000000] l4: 00000000005c0400 l5: 00000000008fa5e6 l6: 0000000000000006 l7: 0028280000000000
> [ 0.000000] i0: fffff8007fced7c0 i1: 0000000000808fd8 i2: 0000000001010101 i3: 0000000080808080
> [ 0.000000] i4: 0000000000876c00 i5: 0000000000000050 i6: 00000000008372e1 i7: 000000000064684c
> [ 0.000000] I7: <of_alias_scan+0xcc/0x1c0>
> [ 0.000000] Call Trace:
> [ 0.000000] [000000000064684c] of_alias_scan+0xcc/0x1c0
> [ 0.000000] [00000000008a0350] of_pdt_build_devicetree+0x90/0xa0
> [ 0.000000] [000000000088c540] prom_build_devicetree+0x10/0x3c
> [ 0.000000] [00000000008904d4] paging_init+0x59c/0x6bc
> [ 0.000000] [000000000088bebc] setup_arch+0xf8/0x110
> [ 0.000000] [000000000088a51c] start_kernel+0x8c/0x34c

Try the following patch. I suspect the new of_alias_scan() isn't careful
enough about which properties it dereferences:

---

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 133908a..9188caa 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -1174,6 +1174,10 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align))
 		    !strcmp(pp->name, "linux,phandle"))
 			continue;
 
+		/* Check for null value or non-strings (no null termination) */
+		if (!pp->value || strnlen(pp->value, pp->length) == pp->length)
+			continue;
+
 		np = of_find_node_by_path(pp->value);
 		if (!np)
 			continue;

2012-02-13 09:20:42

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Try the following patch. I suspect the new of_alias_scan() isn't careful
> enough about which properties it dereferences:
>
> ---
>
> diff --git a/drivers/of/base.c b/drivers/of/base.c
> index 133908a..9188caa 100644
> --- a/drivers/of/base.c
> +++ b/drivers/of/base.c
> @@ -1174,6 +1174,10 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align))
>  		    !strcmp(pp->name, "linux,phandle"))
>  			continue;
>
> +		/* Check for null value or non-strings (no null termination) */
> +		if (!pp->value || strnlen(pp->value, pp->length) == pp->length)
> +			continue;
> +
>  		np = of_find_node_by_path(pp->value);
>  		if (!np)
>  			continue;
>

Yes, it probably gets past this problem but oopses in a different place:

[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
[ 0.000000] PROMLIB: Root node compatible:
[ 0.000000] Initializing cgroup subsys cpu
[ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 42
[ 0.000000] debug: ignoring loglevel setting.
[ 0.000000] bootconsole [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 08:00:20:b6:ee:e2
[ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] Unable to handle kernel NULL pointer dereference
[ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0
[ 0.000000] \|/ ____ \|/
[ 0.000000] "@'/ .. \`@"
[ 0.000000] /_| \__/ |_\
[ 0.000000] \__U_/
[ 0.000000] swapper(0): Oops [#1]
[ 0.000000] TSTATE: 0000000080e01606 TPC: 0000000000645810 TNPC: 0000000000645814 Y: 00000037 Not d
[ 0.000000] TPC: <of_find_node_by_phandle+0x30/0x60>
[ 0.000000] g0: 0000000000837b88 g1: 00000000fffff800 g2: 0000000000000000 g3: 0000000000000002
[ 0.000000] g4: 0000000000853fd0 g5: 0000000000000000 g6: 0000000000834000 g7: 0000000000000050
[ 0.000000] o0: 0000000000876cf0 o1: fffff8007fcc0900 o2: 0000000001010101 o3: 0000000080808080
[ 0.000000] o4: 000000000000000e o5: 000000000086c000 sp: 0000000000837301 ret_pc: 00000000006457e8
[ 0.000000] RPC: <of_find_node_by_phandle+0x8/0x60>
[ 0.000000] l0: 0000000000808fd8 l1: 0000000000876d28 l2: 000000000072a800 l3: 0000000000000080
[ 0.000000] l4: 0000000000000013 l5: 0000000000000013 l6: 0000000000000000 l7: 0000000000000281
[ 0.000000] i0: 00000000f005de3c i1: ffffffffffdc1428 i2: 0000000000000100 i3: 0000000000000004
[ 0.000000] i4: 0000000000000050 i5: 0000000000876c00 i6: 00000000008373b1 i7: 000000000088cd10
[ 0.000000] I7: <of_console_init+0xa4/0x144>
[ 0.000000] Call Trace:
[ 0.000000] [000000000088cd10] of_console_init+0xa4/0x144
[ 0.000000] [000000000088c548] prom_build_devicetree+0x18/0x3c
[ 0.000000] [00000000008904d4] paging_init+0x59c/0x6bc
[ 0.000000] [000000000088bebc] setup_arch+0xf8/0x110
[ 0.000000] [000000000088a51c] start_kernel+0x8c/0x34c
[ 0.000000] [00000000006fbf28] tlb_fixup_done+0xa0/0xa8
[ 0.000000] [0000000000000000] (null)
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Caller[000000000088cd10]: of_console_init+0xa4/0x144
[ 0.000000] Caller[000000000088c548]: prom_build_devicetree+0x18/0x3c
[ 0.000000] Caller[00000000008904d4]: paging_init+0x59c/0x6bc
[ 0.000000] Caller[000000000088bebc]: setup_arch+0xf8/0x110
[ 0.000000] Caller[000000000088a51c]: start_kernel+0x8c/0x34c
[ 0.000000] Caller[00000000006fbf28]: tlb_fixup_done+0xa0/0xa8
[ 0.000000] Caller[0000000000000000]: (null)
[ 0.000000] Instruction DUMP: 901760f0 02c70007 901760f0 <c2072010> 80a04018 324ffffc f85f2050 9
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Press Stop-A (L1-A) to return to the boot prom

--
Meelis Roos ([email protected])

2012-02-13 09:51:00

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88


Another variation of the crash, without the patch, but backtrace is
slightly different (strlen) - maybe fixed by the patch, maybe not.

[ 0.000000] Unable to handle kernel NULL pointer dereference
[ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[ 0.000000] tsk->{mm,active_mm}->pgd = fffff800604ea3a8
[ 0.000000] \|/ ____ \|/
[ 0.000000] "@'/ .. \`@"
[ 0.000000] /_| \__/ |_\
[ 0.000000] \__U_/
[ 0.000000] swapper(0): Oops [#1]
[ 0.000000] TSTATE: 0000004480e01606 TPC: 00000000005be460 TNPC: 00000000005be464 Y: 00000037 Not d
[ 0.000000] TPC: <strlen+0x60/0xd4>
[ 0.000000] g0: 000000000000002f g1: 0000000000000001 g2: 0000000000000000 g3: 000000000073a700
[ 0.000000] g4: 000000000085ea50 g5: 0000000000000000 g6: 0000000000854000 g7: 0030a80000000000
[ 0.000000] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000001010101 o3: 0000000080808080
[ 0.000000] o4: 0000000001010000 o5: fffff8006feae140 sp: 00000000008572c1 ret_pc: 0000000000655108
[ 0.000000] RPC: <of_alias_scan+0x68/0x200>
[ 0.000000] l0: 00000000008a4380 l1: fffff8006feae6b5 l2: fffff8006feae140 l3: fffff8006fe98e00
[ 0.000000] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: 00000000008678d0
[ 0.000000] i0: 00000000008c3f24 i1: 0000000000896ca0 i2: 00000000008268c0 i3: 00000000008268b8
[ 0.000000] i4: 00000000008038c8 i5: fffff8006feae5c0 i6: 0000000000857381 i7: 00000000008c4314
[ 0.000000] I7: <of_pdt_build_devicetree+0x90/0xa0>
[ 0.000000] Call Trace:
[ 0.000000] [00000000008c4314] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [00000000008b0330] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000008b3bb8] paging_init+0xa3c/0xde8
[ 0.000000] [00000000008af978] setup_arch+0x324/0x688
[ 0.000000] [00000000008ae4ec] start_kernel+0x80/0x338
[ 0.000000] [0000000000715b30] tlb_fixup_done+0x88/0x90
[ 0.000000] [0000000000000000] (null)
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Caller[00000000008c4314]: of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] Caller[00000000008b0330]: prom_build_devicetree+0x10/0x3c
[ 0.000000] Caller[00000000008b3bb8]: paging_init+0xa3c/0xde8
[ 0.000000] Caller[00000000008af978]: setup_arch+0x324/0x688
[ 0.000000] Caller[00000000008ae4ec]: start_kernel+0x80/0x338
[ 0.000000] Caller[0000000000715b30]: tlb_fixup_done+0x88/0x90
[ 0.000000] Caller[0000000000000000]: (null)
[ 0.000000] Instruction DUMP: 96132080 19004040 94132101 <da020000> 9823400a 808b000b 024ffffd 9

--
Meelis Roos ([email protected])

2012-02-13 10:22:06

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Another variation of the crash, without the patch, but backtrace is
> slightly different (strlen) - maybe fixed by the patch, maybe not.

This variation is from a different machine - sorry for the
confusion.

--
Meelis Roos ([email protected]) http://www.cs.ut.ee/~mroos/

2012-02-13 10:35:55

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Another variation of the crash, without the patch, but backtrace is
> slightly different (strlen) - maybe fixed by the patch, maybe not.

Tried this machine with the patch too, same backtrace to strlen.
prtconf below.

> [ 0.000000] Unable to handle kernel NULL pointer dereference
> [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
> [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800604ea3a8
> [ 0.000000] \|/ ____ \|/
> [ 0.000000] "@'/ .. \`@"
> [ 0.000000] /_| \__/ |_\
> [ 0.000000] \__U_/
> [ 0.000000] swapper(0): Oops [#1]
> [ 0.000000] TSTATE: 0000004480e01606 TPC: 00000000005be460 TNPC: 00000000005be464 Y: 00000037 Not d
> [ 0.000000] TPC: <strlen+0x60/0xd4>
> [ 0.000000] g0: 000000000000002f g1: 0000000000000001 g2: 0000000000000000 g3: 000000000073a700
> [ 0.000000] g4: 000000000085ea50 g5: 0000000000000000 g6: 0000000000854000 g7: 0030a80000000000
> [ 0.000000] o0: 0000000000000000 o1: 0000000000000000 o2: 0000000001010101 o3: 0000000080808080
> [ 0.000000] o4: 0000000001010000 o5: fffff8006feae140 sp: 00000000008572c1 ret_pc: 0000000000655108
> [ 0.000000] RPC: <of_alias_scan+0x68/0x200>
> [ 0.000000] l0: 00000000008a4380 l1: fffff8006feae6b5 l2: fffff8006feae140 l3: fffff8006fe98e00
> [ 0.000000] l4: 0000000000000000 l5: 0000000000000000 l6: 0000000000000000 l7: 00000000008678d0
> [ 0.000000] i0: 00000000008c3f24 i1: 0000000000896ca0 i2: 00000000008268c0 i3: 00000000008268b8
> [ 0.000000] i4: 00000000008038c8 i5: fffff8006feae5c0 i6: 0000000000857381 i7: 00000000008c4314
> [ 0.000000] I7: <of_pdt_build_devicetree+0x90/0xa0>
> [ 0.000000] Call Trace:
> [ 0.000000] [00000000008c4314] of_pdt_build_devicetree+0x90/0xa0
> [ 0.000000] [00000000008b0330] prom_build_devicetree+0x10/0x3c
> [ 0.000000] [00000000008b3bb8] paging_init+0xa3c/0xde8
> [ 0.000000] [00000000008af978] setup_arch+0x324/0x688
> [ 0.000000] [00000000008ae4ec] start_kernel+0x80/0x338
> [ 0.000000] [0000000000715b30] tlb_fixup_done+0x88/0x90
> [ 0.000000] [0000000000000000] (null)
> [ 0.000000] Disabling lock debugging due to kernel taint
> [ 0.000000] Caller[00000000008c4314]: of_pdt_build_devicetree+0x90/0xa0
> [ 0.000000] Caller[00000000008b0330]: prom_build_devicetree+0x10/0x3c
> [ 0.000000] Caller[00000000008b3bb8]: paging_init+0xa3c/0xde8
> [ 0.000000] Caller[00000000008af978]: setup_arch+0x324/0x688
> [ 0.000000] Caller[00000000008ae4ec]: start_kernel+0x80/0x338
> [ 0.000000] Caller[0000000000715b30]: tlb_fixup_done+0x88/0x90
> [ 0.000000] Caller[0000000000000000]: (null)
> [ 0.000000] Instruction DUMP: 96132080 19004040 94132101 <da020000> 9823400a 808b000b 024ffffd 9

System Configuration: Sun Microsystems sun4u
Memory size: 1024 Megabytes
System Peripherals (PROM Nodes):

Node 0xf002a678
.node: f002a678
idprom: 01830003.ba11b371.000003ba.11b37182.00000000.00000000.00000000.00000000
scsi-initiator-id: 00000007
reset-reason: 'S-POR'
breakpoint-trap: 0000007f
#size-cells: 00000002
model: 'SUNW,375-3015'
name: 'SUNW,UltraAX-i2'
clock-frequency: 05f5e100
banner-name: 'Sun Fire V100 (UltraSPARC-IIe 500MHz)'
compatible: 'sun4u'
device_type: 'upa'
stick-frequency: 0054c563

Node 0xf002d908
.node: f002d908
name: 'packages'

Node 0xf0035e4c
.node: f0035e4c
iso6429-1983-colors:
name: 'terminal-emulator'

Node 0xf0038e7c
.node: f0038e7c
disk-write-fix:
name: 'deblocker'

Node 0xf00395c4
.node: f00395c4
name: 'obp-tftp'

Node 0xf0044b08
.node: f0044b08
name: 'disk-label'

Node 0xf0059f74
.node: f0059f74
name: 'SUNW,builtin-drivers'

Node 0xf0062644
.node: f0062644
source: '/pci@1f,0/isa@7/flashprom@1f,0:'
name: 'dropins'

Node 0xf00730e0
.node: f00730e0
name: 'kbd-translator'

Node 0xf002d978
.node: f002d978
mmu: fffe7ae0
memory: fffe7ce0
bootargs: 00
bootpath: '/pci@1f,0/ide@d/disk@2,0:a'
stdout: fffbd7b8
stdin: fffbda00
stdout-#lines: ffffffff
name: 'chosen'

Node 0xf002d9e4
.node: f002d9e4
version: 'OBP 4.0.18 2002/05/23 18:22'
model: 'SUNW,4.0'
aligned-allocator:
relative-addressing:
name: 'openprom'

Node 0xf002da74
.node: f002da74
name: 'client-services'

Node 0xf002db1c
.node: f002db1c
ras-shutdown-enabled?: 'false'
shutdown-temp: '75'
warning-temp: '70'
env-monitor: 'enabled'
diag-passes: '1'
diag-continue?: '0'
diag-targets: '0'
diag-verbosity: '0'
keyboard-click?: 'false'
keymap:
scsi-initiator-id: '7'
#power-cycles: '100'
system-board-serial#:
system-board-date:
ttyb-rts-dtr-off: 'false'
ttyb-ignore-cd: 'true'
ttya-rts-dtr-off: 'false'
ttya-ignore-cd: 'true'
ttyb-mode: '9600,8,n,1,-'
ttya-mode: '9600,8,n,1,-'
pci-probe-list: '7,3,c,5,a,d'
mfg-mode: 'off'
diag-level: 'max'
fcode-debug?: 'false'
output-device: 'ttya'
input-device: 'ttya'
load-base: '16384'
auto-boot-retry?: 'false'
boot-command: 'boot'
auto-boot?: 'true'
watchdog-reboot?: 'true'
diag-file:
diag-device: 'disk'
boot-file:
boot-device: 'disk net'
local-mac-address?: 'false'
net-timeout: '0'
ansi-terminal?: 'true'
screen-#columns: '80'
screen-#rows: '34'
silent-mode?: 'false'
use-nvramrc?: 'false'
nvramrc:
security-mode: 'none'
security-password:
security-#badlogins: '0'
oem-logo:
oem-logo?: 'false'
oem-banner:
oem-banner?: 'false'
hardware-revision:
last-hardware-update:
diag-switch?: 'true'
name: 'options'

Node 0xf002db8c
.node: f002db8c
disk: '/pci@1f,0/ide@d/disk@2,0'
rtc: '/pci@1f,0/isa@7/rtc@0,70'
usb: '/pci@1f,0/usb@a'
flash: '/pci@1f,0/isa@7/flashprom@1f,0'
lom: '/pci@1f,0/isa@7/SUNW,lomh@0,8010'
i2c-nvram: '/pci@1f,0/pmu@3/i2c@0,0/i2c-nvram@0,aa'
net1: '/pci@1f,0/ethernet@5'
dload1: '/pci@1f,0/ethernet@5:,'
dload: '/pci@1f,0/ethernet@c:,'
net0: '/pci@1f,0/ethernet@c'
net: '/pci@1f,0/ethernet@c'
cdrom: '/pci@1f,0/ide@d/cdrom@3,0:f'
disk3: '/pci@1f,0/ide@d/disk@3,0'
disk2: '/pci@1f,0/ide@d/disk@2,0'
disk1: '/pci@1f,0/ide@d/disk@1,0'
disk0: '/pci@1f,0/ide@d/disk@0,0'
ide: '/pci@1f,0/ide@d'
floppy: '/pci@1f,0/isa@7/dma/floppy'
ttyb: '/pci@1f,0/isa@7/serial@0,2e8'
ttya: '/pci@1f,0/isa@7/serial@0,3f8'
name: 'aliases'

Node 0xf0050050
.node: f0050050
reg: 00000000.00000000.00000000.10000000.00000000.20000000.00000000.10000000.00000000.40000000.00000000.10000000.00000000.60000000.00000000.10000000
available: 00000000.6fec0000.00000000.00006000.00000000.6fe80000.00000000.00030000.00000000.6f000000.00000000.00e00000.00000000.60000000.00000000.0effe000.00000000.40000000.00000000.10000000.00000000.20000000.00000000.10000000.00000000.00000000.00000000.10000000
name: 'memory'

Node 0xf0050634
.node: f0050634
translations: 00000000.fffe0000.00000000.00010000.80000000.6fef00b6.00000000.fffdc000.00000000.00004000.80000000.6fee40b6.00000000.fffd4000.00000000.00004000.80000000.6fede0b6.00000000.fffd2000.00000000.00002000.800001fe.0200808e.00000000.fffd0000.00000000.00002000.80000000.6fed60b6.00000000.fffce000.00000000.00002000.800001fe.0200008e.00000000.fffcc000.00000000.00002000.800001fe.0200208e.00000000.fffca000.00000000.00002000.800001fe.0200408e.00000000.fffc8000.00000000.00002000.80000000.6effe0b6.00000000.fffc6000.00000000.00002000.80000000.6fed20b6.00000000.fffc4000.00000000.00002000.80000000.6fedc0b6.00000000.fffc2000.00000000.00002000.800001fe.0200008e.00000000.fffbc000.00000000.00004000.80000000.6fec80b6.00000000.fff82000.00000000.00010000.800001fe.0000008e.00000000.fff7e000.00000000.00004000.80000000.6fed80b6.00000000.f0000000.00000000.00100000.80000000.6ff000b6.00000000.40000000.00000000.04000000.80000000.60000036.00000000.00400000.00000000.01000000.80000000.60000036.00000000.00002000.00000000.003fe000.80000000.00002036
existing: 00000000.00000000.00000800.00000000.fffff800.00000000.00000800.00000000
available: fffff800.00000000.000007fc.00000000.00000001.00000000.000007ff.00000000.00000000.ffff0000.00000000.0000e000.00000000.00000000.00000000.f0000000.00000000.fffc0000.00000000.00002000.00000000.fff92000.00000000.0002a000.00000000.fff00000.00000000.0007e000.00000000.f0f80000.00000000.0e080000.00000000.f0800000.00000000.00700000
page-size: 00002000
name: 'virtual-memory'

Node 0xf0069d48
.node: f0069d48
available: 81000000.00000000.00010230.00000000.00bffdd0.82000000.00000000.00004000.00000000.0003c000.82000000.00000000.000c0000.00000000.00f40000.82000000.00000000.02000000.00000000.5e000000.82000000.00000000.80000000.00000000.40000000.82000000.00000000.e0000000.00000000.10000000
bus-range: 00000000.00000000
interrupt-map: 00006800.00000000.00000000.00000001.f0069d48.0000000c.00005000.00000000.00000000.00000001.f0069d48.00000024.00006000.00000000.00000000.00000001.f0069d48.00000006.00002800.00000000.00000000.00000001.f0069d48.0000001c.00003800.00000000.00000000.00000004.f0069d48.0000002b.00003800.00000000.00000000.00000005.f0069d48.00000023.00003800.00000000.00000000.00000001.f0069d48.0000002a.00001800.00000000.00000000.00000001.f0069d48.00000022
interrupt-map-mask: 00fff800.00000000.00000000.00000007
#interrupt-cells: 00000001
virtual-dma: 60000000.20000000
reg: 000001fe.00000000.00000000.00010000.000001fe.01000000.00000000.00000100
ranges: 00000000.00000000.00000000.000001fe.01000000.00000000.01000000.01000000.00000000.00000000.000001fe.02000000.00000000.01000000.02000000.00000000.00000000.000001ff.00000000.00000001.00000000.03000000.00000000.00000000.000001ff.00000000.00000001.00000000
#virtual-dma-size-cells: 00000001
#virtual-dma-addr-cells: 00000001
clock-frequency: 03ef1480
latency-timer:
button-interrupt:
no-streaming-cache:
66mhz-capable:
interrupts: 00000030.0000002e.0000002f.00000025
upa-portid: 0000001f
bus-parity-generated:
compatible: 'pci108e,a001'
model: 'SUNW,sabre'
name: 'pci'
device_type: 'pci'
#address-cells: 00000003
#size-cells: 00000002

Node 0xf0073e2c
.node: f0073e2c
cache-line-size: 00000000
latency-timer: 00000000
#size-cells: 00000001
#address-cells: 00000002
name: 'isa'
ranges: 00000000.00000000.81003810.00000000.00000000.00010000.0000001f.00000000.82003814.00000000.f0000000.00080000
reg: 00003800.00000000.00000000.00000000.00000000.81003810.00000000.00000000.00000000.00010000.82003814.00000000.00000000.00000000.00100000
devsel-speed: 00000001
class-code: 00060100
max-latency: 00000000
min-grant: 00000000
subsystem-id: 00001533
subsystem-vendor-id: 000010b9
revision-id: 00000000
device-id: 00001533
vendor-id: 000010b9

Node 0xf00749f4
.node: f00749f4
reg: 00000000.00000000.00010000
interrupts: 00000001
compatible: 'isadma'
name: 'dma'

Node 0xf0074ccc
.node: f0074ccc
address: fffce070
reg: 00000000.00000070.00000002
compatible: 'm5819'
model: 'm5819'
name: 'rtc'

Node 0xf009cac4
.node: f009cac4
device_type: 'tod'
name: 'todm5819'

Node 0xf007583c
.node: f007583c
compatible: 'acpi-power'
button:
interrupts: 00000005
reg: 00000000.00002000.00000008
name: 'power'

Node 0xf00759d0
.node: f00759d0
reg: 00000000.00008010.00000002
interrupts: 00000001
device_type: 'block'
name: 'SUNW,lomh'

Node 0xf0076e0c
.node: f0076e0c
port-a-ignore-cd:
nohupcl: 00
interrupt-priorities: 0000000c.0000000c
reg: 00000000.000003f8.00000008
compatible: 73753136.35353000.737500
device_type: 'serial'
name: 'serial'
interrupts: 00000004

Node 0xf0078af8
.node: f0078af8
port-b-ignore-cd:
nohupcl: 00
interrupt-priorities: 0000000c.0000000c
reg: 00000000.000002e8.00000008
compatible: 73753136.35353000.737500
device_type: 'serial'
name: 'serial'
interrupts: 00000004

Node 0xf007ac10
.node: f007ac10
model: 'SUNW,258-7883'
version: 'CORE 1.0.18 2002/05/23 18:22'
name: 'flashprom'
reg: 0000001f.00000000.00080000

Node 0xf007b6bc
.node: f007b6bc
name: 'pmu'
ranges: 00000000.00000000.00001800.00000000.00000000.00000100.00000001.00000000.81001810.00000000.00004000.00000100.00000002.00000000.81001814.00000000.00000000.00000100
reg: 00001800.00000000.00000000.00000000.00000000.81001810.00000000.00004000.00000000.00000010
compatible: 70636931.3062392c.37313031.00706369.636c6173.732c3030.30303030.00
#address-cells: 00000002
#size-cells: 00000001
devsel-speed: 00000001
class-code: 00000000
max-latency: 00000000
min-grant: 00000000
revision-id: 00000000
device-id: 00007101
vendor-id: 000010b9

Node 0xf007be84
.node: f007be84
reg: 00000000.00000000.00000100.00000001.00000000.00000100
#address-cells: 00000002
#size-cells: 00000000
interrupts: 00000001
compatible: 'i2c-smbus'
name: 'i2c'

Node 0xf007d31c
.node: f007d31c
compatible: 'i2c-max1617'
name: 'temperature'
reg: 00000000.00000030

Node 0xf007d48c
.node: f007d48c
compatible: 'i2c-at34c02'
name: 'dimm'
reg: 00000000.000000a8

Node 0xf007d544
.node: f007d544
compatible: 'i2c-at34c02'
name: 'dimm'
reg: 00000000.000000aa

Node 0xf007d5fc
.node: f007d5fc
compatible: 'i2c-at34c02'
name: 'dimm'
reg: 00000000.000000ac

Node 0xf007d6b4
.node: f007d6b4
compatible: 'i2c-at34c02'
name: 'dimm'
reg: 00000000.000000ae

Node 0xf007d76c
.node: f007d76c
reg: 00000000.000000a0
#address-cells: 00000001
compatible: 'i2c-at24c64'
device_type: 'nvram'
name: 'i2c-nvram'

Node 0xf007e284
.node: f007e284
reg: 00001fd8.00000028
device_type: 'idprom'
name: 'idprom'

Node 0xf007e538
.node: f007e538
reg: 00000000.000000a2
#address-cells: 00000001
compatible: 'i2c-at24c64'
name: 'motherboard-fru'

Node 0xf007f0d0
.node: f007f0d0
compatible: 'SUNW,smbus-ppm'
name: 'ppm'
register-mask: 00000000.00000001
reg: 00000000.000000b3.00000001.80000000.000000ba.00000001.00000000.000000bb.00000001

Node 0xf007f344
.node: f007f344
compatible: 'SUNW,smbus-beep'
name: 'beep'
reg: 00000000.000000b2.00000001.00000000.000000d3.00000001.00000002.00000042.00000002.00000002.00000061.00000001

Node 0xf007f45c
.node: f007f45c
compatible: 'SUNW,smbus-fan-control'
name: 'fan-control'
register-mask: 00000000.00000002
reg: 00000000.000000c8.00000004.80000000.000000ba.00000001

Node 0xf007f660
.node: f007f660
name: 'lomp'
reg: 00001800.00000000.00000000.00000000.00000000.81001810.00004000.00000000.00000000.00000010

Node 0xf007fae8
.node: f007fae8
local-mac-address: 0003ba11.b371
assigned-addresses: 81006010.00000000.00010000.00000000.00000100.82006014.00000000.00000000.00000000.00002000.82006030.00000000.00040000.00000000.00040000
version: '1.0'
compatible: 70636934.3535342c.34333465.00706369.31323868.2c393130.32007063.69313238.322c3931.30320070.6369636c.6173732c.30323030.303000
device_type: 'network'
subsystem-id: 0000434e
subsystem-vendor-id: 00004554
reg: 00006000.00000000.00000000.00000000.00000000.01006010.00000000.00000000.00000000.00000100.02006014.00000000.00000000.00000000.00000100
name: 'ethernet'
devsel-speed: 00000001
class-code: 00020000
interrupts: 00000001
max-latency: 00000028
min-grant: 00000014
revision-id: 00000031
device-id: 00009102
vendor-id: 00001282

Node 0xf0089634
.node: f0089634
local-mac-address: 0003ba11.b372
assigned-addresses: 81002810.00000000.00010100.00000000.00000100.82002814.00000000.00002000.00000000.00002000.82002830.00000000.00080000.00000000.00040000
version: '1.0'
compatible: 70636934.3535342c.34333465.00706369.31323868.2c393130.32007063.69313238.322c3931.30320070.6369636c.6173732c.30323030.303000
device_type: 'network'
subsystem-id: 0000434e
subsystem-vendor-id: 00004554
reg: 00002800.00000000.00000000.00000000.00000000.01002810.00000000.00000000.00000000.00000100.02002814.00000000.00000000.00000000.00000100
name: 'ethernet'
devsel-speed: 00000001
class-code: 00020000
interrupts: 00000001
max-latency: 00000028
min-grant: 00000014
revision-id: 00000031
device-id: 00009102
vendor-id: 00001282

Node 0xf0093180
.node: f0093180
assigned-addresses: 82005010.00000000.01000000.00000000.01000000
sunw,find-fcode: f009838c
maximum-frame#: 0000ffff
reg: 00005000.00000000.00000000.00000000.00000000.02005010.00000000.00000000.00000000.01000000
#size-cells: 00000000
#address-cells: 00000001
compatible: 70636931.3062392c.35323337.2e330070.63693130.62392c35.32333700.70636963.6c617373.2c306330.33313000.70636963.6c617373.2c306330.3300
name: 'usb'
fast-back-to-back:
devsel-speed: 00000001
class-code: 000c0310
interrupts: 00000001
max-latency: 00000050
min-grant: 00000000
revision-id: 00000003
device-id: 00005237
vendor-id: 000010b9

Node 0xf0098ff8
.node: f0098ff8
assigned-addresses: 81006810.00000000.00010200.00000000.00000008.81006814.00000000.00010218.00000000.00000008.81006818.00000000.00010210.00000000.00000008.8100681c.00000000.00010208.00000000.00000008.81006820.00000000.00010220.00000000.00000010
reg: 00006800.00000000.00000000.00000000.00000000.01006810.00000000.00000000.00000000.00000008.01006814.00000000.00000000.00000000.00000004.01006818.00000000.00000000.00000000.00000008.0100681c.00000000.00000000.00000000.00000004.01006820.00000000.00000000.00000000.00000010
compatible: 70636931.3062392c.35323239.00706369.636c6173.732c3031.30316666.00
#address-cells: 00000002
device_type: 'ide'
name: 'ide'
fast-back-to-back:
devsel-speed: 00000001
class-code: 000101ff
interrupts: 00000001
max-latency: 00000004
min-grant: 00000002
revision-id: 000000c3
device-id: 00005229
vendor-id: 000010b9

Node 0xf009b86c
.node: f009b86c
device_type: 'block'
name: 'disk'
compatible: 'ide-disk'

Node 0xf009bf18
.node: f009bf18
device_type: 'block'
name: 'cdrom'
compatible: 'ide-cdrom'

Node 0xf0072d50
.node: f0072d50
manufacturer#: 00000017
implementation#: 00000013
mask#: 00000014
ecache-size: 00040000
clock-frequency: 1dcd6500
name: 'SUNW,UltraSPARC-IIe'
sparc-version: 00000009
ecache-associativity: 00000001
ecache-line-size: 00000040
#dtlb-entries: 00000040
dcache-associativity: 00000001
dcache-line-size: 00000020
dcache-size: 00004000
#itlb-entries: 00000040
icache-associativity: 00000002
icache-line-size: 00000020
icache-size: 00004000
upa-portid: 00000000
reg: 000001c0.00000000.00000000.00000008
device_type: 'cpu'


--
Meelis Roos ([email protected]) http://www.cs.ut.ee/~mroos/

2012-02-13 21:46:26

by Grant Likely

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Mon, Feb 13, 2012 at 11:20:36AM +0200, Meelis Roos wrote:
> > Try the following patch. I suspect the new of_alias_scan() isn't careful
> > enough about which properties it dereferences:
> >
> > ---
> >
> > diff --git a/drivers/of/base.c b/drivers/of/base.c
> > index 133908a..9188caa 100644
> > --- a/drivers/of/base.c
> > +++ b/drivers/of/base.c
> > @@ -1174,6 +1174,10 @@ void of_alias_scan(void * (*dt_alloc)(u64 size, u64 align))
> > !strcmp(pp->name, "linux,phandle"))
> > continue;
> >
> > + /* Check for null value or non-strings (no null termination) */
> > + if (!pp->value || strnlen(pp->value, pp->length) == pp->length)
> > + continue;
> > +
> > np = of_find_node_by_path(pp->value);
> > if (!np)
> > continue;
> >
>
> Yes, it probably gets past this problem but oopses in a different place:
>
> [ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
> [ 0.000000] PROMLIB: Root node compatible:
> [ 0.000000] Initializing cgroup subsys cpu
> [ 0.000000] Linux version 3.3.0-rc3-00188-g3ec1e88-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 42
> [ 0.000000] debug: ignoring loglevel setting.
> [ 0.000000] bootconsole [earlyprom0] enabled
> [ 0.000000] ARCH: SUN4U
> [ 0.000000] Ethernet address: 08:00:20:b6:ee:e2
> [ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
> [ 0.000000] Remapping the kernel... done.
> [ 0.000000] Unable to handle kernel NULL pointer dereference
> [ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
> [ 0.000000] tsk->{mm,active_mm}->pgd = fffff800008c77d0
> [ 0.000000] \|/ ____ \|/
> [ 0.000000] "@'/ .. \`@"
> [ 0.000000] /_| \__/ |_\
> [ 0.000000] \__U_/
> [ 0.000000] swapper(0): Oops [#1]
> [ 0.000000] TSTATE: 0000000080e01606 TPC: 0000000000645810 TNPC: 0000000000645814 Y: 00000037 Not d
> [ 0.000000] TPC: <of_find_node_by_phandle+0x30/0x60>

Ugh; that looks bad. If it failed there, then the global device node list
is corrupted. I hate to ask you this, but would you be able to git bisect to
narrow down the commit that causes the problem?
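
As an aside on the guard in the test patch quoted above: it skips any
property whose value is NULL or has no NUL byte within its declared length.
A minimal userspace sketch of that test follows; `struct prop` and
`prop_is_string()` are hypothetical stand-ins for illustration, not kernel
code.

```c
#define _POSIX_C_SOURCE 200809L  /* for strnlen() */
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for the kernel's struct property; only the
 * two fields the check looks at are reproduced here. */
struct prop {
    const void *value;  /* raw property bytes, may be NULL */
    size_t length;      /* number of bytes in value */
};

/* Return 1 if the property holds a NUL-terminated string that fits
 * inside its declared length, 0 otherwise. */
int prop_is_string(const struct prop *pp)
{
    if (!pp->value)
        return 0;
    /* strnlen() returns the limit itself when no NUL byte is found
     * within the first 'length' bytes. */
    return strnlen(pp->value, pp->length) != pp->length;
}
```

Only properties for which this returns 1 would be safe to pass on to a
path lookup such as of_find_node_by_path().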

g.

2012-02-14 00:59:18

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Grant Likely <[email protected]>
Date: Mon, 13 Feb 2012 14:46:23 -0700

> Ugh; that looks bad. If it failed there, then the global device node list
> is corrupted. I hate to ask you this, but would you be able to git bisect to
> narrow down the commit that causes the problem?

Wild guess on all of these bugs, bad OF node reference counting and a
OF node is free'd up prematurely.

If you look at the sparc code that has been subsumed into the generic
drivers/of/ stuff over the past few years, you'll see that we never
consistently did any of the reference counting bits on the sparc side.

I never did it, because I don't anticipate ever having hot-plug
support for OF nodes.

Anyways, if you now start to mix the drivers/of/ stuff which
religiously does the reference counting with of_node_{get,put}()
with the remaining scraps of sparc code that doesn't... it might
not be pretty.

In the crash dump after your test patch, we are in
of_find_node_by_phandle() with a 'np' pointer in the allnodes list
equal to 0x50.

The signature in the original crash dump is identical, except
that time we were in of_find_node_by_path(), but again the 'np'
pointer was 0x50.

Something else that might be suspicious were the memblock changes
that happened this release cycle, so I wouldn't be surprised if
a bisect turned up something in there.

FWIW I've been running current kernels on my niagara boxes without
incident for several weeks.

2012-02-14 02:30:25

by Grant Likely

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Mon, Feb 13, 2012 at 5:58 PM, David Miller <[email protected]> wrote:
> From: Grant Likely <[email protected]>
> Date: Mon, 13 Feb 2012 14:46:23 -0700
>
>> Ugh; that looks bad. If it failed there, then the global device node list
>> is corrupted. I hate to ask you this, but would you be able to git bisect to
>> narrow down the commit that causes the problem?
>
> Wild guess on all of these bugs, bad OF node reference counting and a
> OF node is free'd up prematurely.
>
> If you look at the sparc code that has been subsumed into the generic
> drivers/of/ stuff over the past few years, you'll see that we never
> consistently did any of the reference counting bits on the sparc side.

Hmmm.... The of_node_put() code path shouldn't exist on sparc. You'll
see that it is #ifdef'd out in include/linux/of.h. Plus, only
'OF_DETACHED' nodes are allowed to be released, and there are only 3
code paths (all calling of_detach_node()) specific to powerpc that can
detach a node.

> I never did it, because I don't anticipate ever having hot-plug
> support for OF nodes.
>
> Anyways, if you now start to mix the drivers/of/ stuff which
> religiously does the reference counting with of_node_{get,put}()
> with the remaining scraps of sparc code that doesn't... it might
> not be pretty.
>
> In the crash dump after your test patch, we are in
> of_find_node_by_phandle() with a 'np' pointer in the allnodes list
> equal to 0x50.

Definitely not right! It would be interesting to add a printk() to
of_find_node_by_phandle() or of_find_node_by_path() to blast out the
node names as it traverses the tree. That could help track down
corruption.

>
> The signature in the original crash dump is identical, except
> that time we were in of_find_node_by_path(), but again the 'np'
> pointer was 0x50.
>
> Something else that might be suspicious were the memblock changes
> that happened this release cycle, so I wouldn't be surprised if
> a bisect turned up something in there.
>
> FWIW I've been running current kernels on my niagara boxes without
> incident for several weeks.



--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.

2012-02-14 02:42:38

by Grant Likely

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Mon, Feb 13, 2012 at 7:30 PM, Grant Likely <[email protected]> wrote:
> On Mon, Feb 13, 2012 at 5:58 PM, David Miller <[email protected]> wrote:
>> From: Grant Likely <[email protected]>
>> Date: Mon, 13 Feb 2012 14:46:23 -0700
>>
>>> Ugh; that looks bad. If it failed there, then the global device node list
>>> is corrupted. I hate to ask you this, but would you be able to git bisect to
>>> narrow down the commit that causes the problem?
>>
>> Wild guess on all of these bugs, bad OF node reference counting and a
>> OF node is free'd up prematurely.
>>
>> If you look at the sparc code that has been subsumed into the generic
>> drivers/of/ stuff over the past few years, you'll see that we never
>> consistently did any of the reference counting bits on the sparc side.
>
> Hmmm.... The of_node_put() code path shouldn't exist on sparc. You'll
> see that it is #ifdef'd out in include/linux/of.h. Plus, only
> 'OF_DETACHED' nodes are allowed to be released, and there are only 3
> code paths (all calling of_detach_node()) specific to powerpc that can
> detach a node.

In fact, I should disable those paths always when CONFIG_OF_DYNAMIC is
disabled. I'll look into doing so for v3.4.

g.

2012-02-14 05:54:25

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> FWIW I've been running current kernels on my niagara boxes without
> incident for several weeks.

It runs for me on Ultra 1, Ultra 5 IDE, Ultra 10 SCSI and Blade 100.
Fails on E3500, V100 and Netra X1 so it's probably dependent on
something in the device tree.

I will try bisecting and the suggested printk's but it takes time since
I will be away from computers most of today.

--
Meelis Roos ([email protected])

2012-02-16 19:53:18

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Ugh; that looks bad. If it failed there, then the global device node list
> is corrupted. I hate to ask you this, but would you be able to git bisect to
> narrow down the commit that causes the problem?

Finished bisecting on E3500 (the original machine where I found the
problem). Bisecting leads to
[0ee332c1451869963626bf9cac88f165a90990e1] memblock: Kill early_node_map[]
So yes, it looks like memblock.

--
Meelis Roos ([email protected])

2012-02-16 21:08:09

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Definitely not right! It would be interesting to add a printk() to
> of_find_node_by_phandle() or of_find_node_by_path() to blast out the
> node names as it traverses the tree. That could help track down
> corruption.

[ 0.000000] of_find_node_by_path: /chosen
[ 0.000000] of_find_node_by_path: /aliases ¥_6䥷~ê7\eý+õï*¢ꢏñ?¿sM ý{
aliases000000] ò7find_node_by_path: ðÑÔ_Bÿ
[ 0.000000] Unable to handle kernel NULL pointer dereference

--
Meelis Roos ([email protected])

2012-02-16 21:23:56

by Sam Ravnborg

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Thu, Feb 16, 2012 at 09:53:14PM +0200, Meelis Roos wrote:
> > Ugh; that looks bad. If it failed there, then the global device node list
> > is corrupted. I hate to ask you this, but would you be able to git bisect to
> > narrow down the commit that causes the problem?
>
> Finished bisecting on E3500 (the original machine where I found the
> problem). Bisecting leads to
> [0ee332c1451869963626bf9cac88f165a90990e1] memblock: Kill early_node_map[]
> So yes, it looks like memblock.

Added Tejun.

Sam

2012-02-20 09:11:11

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> So yes, it looks like memblock.

Finished bisecting on the other machine too (Sun Fire V100 where strlen
crashes):

7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 is the first bad commit
commit 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077
Author: Tejun Heo <[email protected]>
Date: Thu Dec 8 10:22:09 2011 -0800

memblock: Reimplement memblock allocation using reverse free area iterator

Now that all early memory information is in memblock when enabled, we
can implement reverse free area iterator and use it to implement NUMA
aware allocator which is then wrapped for simpler variants instead of
the confusing and inefficient mending of information in separate NUMA
aware allocator.

Implement for_each_free_mem_range_reverse(), use it to reimplement
memblock_find_in_range_node() which in turn is used by all allocators.

The visible allocator interface is inconsistent and can probably use
some cleanup too.

Signed-off-by: Tejun Heo <[email protected]>
Cc: Benjamin Herrenschmidt <[email protected]>
Cc: Yinghai Lu <[email protected]>

:040000 040000 f74f55a80162a0a1a45c135ca62a51b9af824d53 a2dc2bccf4a30ee516709d0fdcb33faae11059ff M include
:040000 040000 e4c4292fe66c4d8d6aa89710ce9f538fbf550ae8 5677586fad018ae9978d53084ba5d617fe231a3d M mm

--
Meelis Roos ([email protected])

2012-02-20 17:06:10

by Tejun Heo

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

Hello, Meelis, Sam.

Sorry about the delay. I've been pretty swamped lately.

On Mon, Feb 20, 2012 at 11:11:05AM +0200, Meelis Roos wrote:
> Finished bisecting on the other machine too (Sun Fire V100 where strlen
> crashes):
>
> 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 is the first bad commit
> commit 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077
> Author: Tejun Heo <[email protected]>
> Date: Thu Dec 8 10:22:09 2011 -0800
>
> memblock: Reimplement memblock allocation using reverse free area iterator
>
> Now that all early memory information is in memblock when enabled, we
> can implement reverse free area iterator and use it to implement NUMA
> aware allocator which is then wrapped for simpler variants instead of
> the confusing and inefficient mending of information in separate NUMA
> aware allocator.
>
> Implement for_each_free_mem_range_reverse(), use it to reimplement
> memblock_find_in_range_node() which in turn is used by all allocators.
>
> The visible allocator interface is inconsistent and can probably use
> some cleanup too.
>
> Signed-off-by: Tejun Heo <[email protected]>
> Cc: Benjamin Herrenschmidt <[email protected]>
> Cc: Yinghai Lu <[email protected]>

Hmmm.... So, different bisection results from two machines? That's a
bit weird. I *think* this bisection result makes more sense. Can you
please verify the bisection result on E3500 once more?

Thanks.

--
tejun

2012-02-20 20:04:13

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Hmmm.... So, different bisection results from two machines? That's a
> bit weird. I *think* this bisection result makes more sense. Can you
> please verify the bisection result on E3500 once more?

Will do.

--
Meelis Roos ([email protected])

2012-02-20 21:01:49

by Tejun Heo

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

Hello,

On Mon, Feb 20, 2012 at 10:04:10PM +0200, Meelis Roos wrote:
> > Hmmm.... So, different bisection results from two machines? That's a
> > bit weird. I *think* this bisection result makes more sense. Can you
> > please verify the bisection result on E3500 once more?
>
> Will do.

Thanks a lot. I'm *suspecting* that somehow memory used to back the
device tree is not fully reserved and the change in allocation logic
is giving it out as part of an allocation. I'll look through the change
more and see if I can spot a bug in the new code but I guess we'll
probably have to print out some pointer values to find out the
offending address.

Thanks.

--
tejun

2012-02-20 22:32:13

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> On Mon, Feb 20, 2012 at 11:11:05AM +0200, Meelis Roos wrote:
> > Finished bisecting on the other machine too (Sun Fire V100 where strlen
> > crashes):
> >
> > 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077 is the first bad commit
> > commit 7bd0b0f0da3b1ec11cbcc798eb0ef747a1184077
> > Author: Tejun Heo <[email protected]>
> > Date: Thu Dec 8 10:22:09 2011 -0800
> >
> > memblock: Reimplement memblock allocation using reverse free area iterator
> >
> > Now that all early memory information is in memblock when enabled, we
> > can implement reverse free area iterator and use it to implement NUMA
> > aware allocator which is then wrapped for simpler variants instead of
> > the confusing and inefficient mending of information in separate NUMA
> > aware allocator.
> >
> > Implement for_each_free_mem_range_reverse(), use it to reimplement
> > memblock_find_in_range_node() which in turn is used by all allocators.
> >
> > The visible allocator interface is inconsistent and can probably use
> > some cleanup too.
> >
> > Signed-off-by: Tejun Heo <[email protected]>
> > Cc: Benjamin Herrenschmidt <[email protected]>
> > Cc: Yinghai Lu <[email protected]>
>
> Hmmm.... So, different bisection results from two machines? That's a
> bit weird. I *think* this bisection result makes more sense. Can you
> please verify the bisection result on E3500 once more?

You were right. The first machine now bisects down to the same commit -
I was confused by "0 revisions to test" and did not run the last step
when first bisecting.

--
Meelis Roos ([email protected])

2012-02-21 01:05:47

by Tejun Heo

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

Hello,

Meelis, can you please apply the following patch before & after the
offending commit, boot with "memblock=debug" added as kernel param and
post the boot log? The patch will generate some offset warnings after
the commit but should work fine.

Sam, David, as I'm not familiar with the code base, is it possible to
tell which address is corrupted (zeroed, it seems)? ie. can we add
"if (XXX == NULL) printk("%p is corrupted\n"...);" somewhere?

Thanks.

diff --git a/mm/memblock.c b/mm/memblock.c
index 1adbef0..dccfced 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -179,9 +179,15 @@ int __init_memblock memblock_reserve_reserved_regions(void)

static void __init_memblock memblock_remove_region(struct memblock_type *type, unsigned long r)
{
- type->total_size -= type->regions[r].size;
- memmove(&type->regions[r], &type->regions[r + 1],
- (type->cnt - (r + 1)) * sizeof(type->regions[r]));
+ struct memblock_region *rgn = &type->regions[r];
+
+ memblock_dbg(" memblock %s: rm [%#016llx-%#016llx] node %d\n",
+ memblock_type_name(type),
+ (unsigned long long)rgn->base,
+ (unsigned long long)rgn->base + rgn->size, rgn->nid);
+
+ type->total_size -= rgn->size;
+ memmove(rgn, rgn + 1, (type->cnt - (r + 1)) * sizeof(*rgn));
type->cnt--;

/* Special case for empty arrays */
@@ -317,6 +323,9 @@ static void __init_memblock memblock_insert_region(struct memblock_type *type,
memblock_set_region_node(rgn, nid);
type->cnt++;
type->total_size += size;
+ memblock_dbg(" memblock %s: add [%#016llx-%016llx] node %d @%d\n",
+ memblock_type_name(type), (unsigned long long)base,
+ (unsigned long long)base + size, nid, idx);
}

/**
@@ -342,6 +351,10 @@ static int __init_memblock memblock_add_region(struct memblock_type *type,
phys_addr_t end = base + memblock_cap_size(base, &size);
int i, nr_new;

+ memblock_dbg(" memblock %s: ADD [%#016llx-%#016llx] node %d\n",
+ memblock_type_name(type), (unsigned long long)base,
+ (unsigned long long)base + size, nid);
+
/* special case for empty array */
if (type->regions[0].size == 0) {
WARN_ON(type->cnt != 1 || type->total_size);
@@ -349,6 +362,8 @@ static int __init_memblock memblock_add_region(struct memblock_type *type,
type->regions[0].size = size;
memblock_set_region_node(&type->regions[0], nid);
type->total_size = size;
+ memblock_dbg(" memblock %s: add first entry\n",
+ memblock_type_name(type));
return 0;
}
repeat:
@@ -494,6 +509,10 @@ static int __init_memblock __memblock_remove(struct memblock_type *type,
int start_rgn, end_rgn;
int i, ret;

+ memblock_dbg(" memblock %s: RM [%#016llx-%016llx]\n",
+ memblock_type_name(type), (unsigned long long)base,
+ (unsigned long long)base + size);
+
ret = memblock_isolate_range(type, base, size, &start_rgn, &end_rgn);
if (ret)
return ret;

2012-02-22 00:36:18

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Meelis, can you please apply the following patch before & after the
> offending commit, boot with "memblock=debug" added as kernel param and
> post the boot log? The patch will generate some offset warnings after
> the commit but should work fine.

Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)

In addition, a third type of sparc machines breaks in a third way - V210
and V240 just hang after telling

console [tty0] enabled, bootconsole disabled

and before calibrating the delay loop. Bisect has led to the same commit.

--
Meelis Roos ([email protected])


Attachments:
memblock1.gz (48.77 kB)
memblock2.gz (38.59 kB)

2012-02-22 17:03:35

by Sam Ravnborg

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Mon, Feb 20, 2012 at 05:05:37PM -0800, Tejun Heo wrote:
> Hello,
>
> Meelis, can you please apply the following patch before & after the
> offending commit, boot with "memblock=debug" added as kernel param and
> post the boot log? The patch will generate some offset warnings after
> the commit but should work fine.
>
> Sam, David, as I'm not familiar with the code base, is it possible to
> tell which address is corrupted (zeroed, it seems)? ie. can we add
> "if (XXX == NULL) printk("%p is corrupted\n"...);" somewhere?

No idea - sorry. I spend most of the time with sparc32 - which I
do not even feel familiar with yet :-(

One thing I noticed while working with memblock for sparc32 (*) is that allocations
are done top-down. So we may end up allocating memory with a considerably higher
address than we are used to.
This is obviously just a wild guess...
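
To make "top-down" concrete, here is a minimal sketch of picking the highest
aligned address inside a single free range. It is only an illustration under
assumptions: `find_top_down()` and `round_down_pow2()` are hypothetical
helpers, and the real memblock code iterates over many free ranges rather
than one.

```c
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Round x down to a multiple of align; align must be a power of two. */
phys_addr_t round_down_pow2(phys_addr_t x, phys_addr_t align)
{
    return x & ~(align - 1);
}

/* Search one free range [start, end) from the top: return the highest
 * aligned base where 'size' bytes still fit, or 0 if nothing fits. */
phys_addr_t find_top_down(phys_addr_t start, phys_addr_t end,
                          phys_addr_t size, phys_addr_t align)
{
    phys_addr_t cand;

    if (end < size)          /* subtraction below would wrap around */
        return 0;
    cand = round_down_pow2(end - size, align);
    return cand >= start ? cand : 0;
}
```

With a 2G range, a 0x90-byte request lands just below the 2G mark instead of
near the bottom of memory, which is the behaviour change described above.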

Meelis - do the affected boxes have any special memory configurations?
Could you try to boot with a sensible mem=xxx value to see if limiting the memory
helps.

(*) I have re-done the original patch-set and I have a quite good feeling about it.
HIGHMEM support is outstanding - I got a bit confused when I looked at x86.

But my ss5 crashes the first time I try to use the allocated memory -
so I assume I have some silly issue somewhere. Nothing points at memblock
in this case.

Sam

2012-02-22 17:12:14

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Meelis - do the affected boxes have any special memory configurations?

Nothing special to me. E3500 has 2G, V100 has 1G, V210 and V240 have 2G
and 1.5G.

> Could you try to boot with a sensible mem=xxx value to see if limiting the memory
> helps.

Like mem=256M? Will try.

--
Meelis Roos ([email protected]) http://www.cs.ut.ee/~mroos/

2012-02-22 17:22:01

by Sam Ravnborg

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Wed, Feb 22, 2012 at 07:12:06PM +0200, Meelis Roos wrote:
> > Meelis - do the affected boxes have any special memory configurations?
>
> Nothing special to me. E3500 has 2G, V100 has 1G, V210 and V240 have 2G
> and 1.5G.
>
> > Could you try to boot with a sensible mem=xxx value to see if limiting the memory
> > helps.
>
> Like mem=256M? Will try.
Think just a little more - I do not think this will help.
I confused myself with some of the sparc32 issues I have hit.

I have looked a little at the log files you included.
The only thing that looked different was that the faulty version
had a number after "@" which is higher than 1 - where the OK always have 1.

This is "idx" in memblock_insert_region() - but I did not look closer.

Sam

2012-02-22 17:41:40

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> > > Could you try to boot with a sensible mem=xxx value to see if limiting the memory
> > > helps.
> >
> > Like mem=256M? Will try.
> Think just a little more - I do not think this will help.

Tried it on the 2G V210. It changes the picture. With 2G RAM, it
just hangs.

With mem=256M it produces a crash in strlen and of_alias_scan like in
V100 with 1G.

mem=512M results in the same strlen error.

mem=1G results in a stranger error:

[ 0.000000] Kernel panic - not syncing: ERROR: Failed to allocate 0x90 bytes below 0x0.
[ 0.000000]
[ 0.000000] Call Trace:
[ 0.000000] [00000000007a6a28] memblock_alloc_base+0x28/0x38
[ 0.000000] [000000000079ca50] prom_early_alloc+0xc/0x60
[ 0.000000] [00000000007ae090] of_pdt_create_node.part.0+0x4/0xe0
[ 0.000000] [00000000007ae250] of_pdt_build_devicetree+0x30/0xa0
[ 0.000000] [000000000079c4a8] prom_build_devicetree+0x18/0x38
[ 0.000000] [00000000007a03c0] paging_init+0x59c/0x6bc
[ 0.000000] [000000000079be50] setup_arch+0xf8/0x108
[ 0.000000] [000000000079a4e8] start_kernel+0x78/0x30c
[ 0.000000] [00000000006a3e80] tlb_fixup_done+0x98/0xa0
[ 0.000000] [0000000000000000] (null)

The working machines have 512M RAM, 834M RAM and 2G RAM so it's not just
the amount of RAM.

--
Meelis Roos ([email protected])

2012-02-22 17:48:33

by Tejun Heo

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Wed, Feb 22, 2012 at 02:36:13AM +0200, Meelis Roos wrote:
> > Meelis, can you please apply the following patch before & after the
> > offending commit, boot with "memblock=debug" added as kernel param and
> > post the boot log? The patch will generate some offset warnings after
> > the commit but should work fine.
>
> Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
> After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)

Can you please try the following patch? If it still fails to boot,
please attach the failing log. Thank you.

diff --git a/mm/memblock.c b/mm/memblock.c
index 77b5f22..99f2855 100644
--- a/mm/memblock.c
+++ b/mm/memblock.c
@@ -99,9 +99,6 @@ phys_addr_t __init_memblock memblock_find_in_range_node(phys_addr_t start,
phys_addr_t this_start, this_end, cand;
u64 i;

- /* align @size to avoid excessive fragmentation on reserved array */
- size = round_up(size, align);
-
/* pump up @end */
if (end == MEMBLOCK_ALLOC_ACCESSIBLE)
end = memblock.current_limit;
@@ -731,6 +728,9 @@ static phys_addr_t __init memblock_alloc_base_nid(phys_addr_t size,
{
phys_addr_t found;

+ /* align @size to avoid excessive fragmentation on reserved array */
+ size = round_up(size, align);
+
found = memblock_find_in_range_node(0, max_addr, size, align, nid);
if (found && !memblock_reserve(found, size))
return found;
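
To illustrate what the patch changes: before it, @size was rounded up only
inside memblock_find_in_range_node(), so the search covered
round_up(size, align) bytes while memblock_reserve() recorded only the
original size. A small sketch of that arithmetic follows. The helper names
are hypothetical, and the 0x40 alignment used in the note below is an
assumption, since the thread does not state the alignment the callers pass.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t phys_addr_t;

/* Round x up to a multiple of align; align must be a power of two. */
phys_addr_t round_up_pow2(phys_addr_t x, phys_addr_t align)
{
    assert(align && (align & (align - 1)) == 0);
    return (x + align - 1) & ~(align - 1);
}

/* Bytes that the search accounted for but the reservation did not,
 * when rounding happens only in the find step. */
phys_addr_t unreserved_tail(phys_addr_t size, phys_addr_t align)
{
    return round_up_pow2(size, align) - size;
}
```

For a 0x90-byte request with an assumed 0x40 alignment, the search spans 0xc0
bytes and 0x30 of them stay unreserved. Moving the round_up() into
memblock_alloc_base_nid() makes the find and the reserve use the same size.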

2012-02-22 18:25:36

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Can you please try the following patch? If it still fails to boot,
> please attach the failing log. Thank you.

It works on E3500! Will try other machines tomorrow.

--
Meelis Roos ([email protected])

2012-02-22 18:30:11

by Richard Mortimer

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On 22/02/2012 00:36, Meelis Roos wrote:
>> Meelis, can you please apply the following patch before & after the
>> offending commit, boot with "memblock=debug" added as kernel param and
>> post the boot log? The patch will generate some offset warnings after
>> the commit but should work fine.
>
> Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
> After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)
>
It's a long time since I regularly had to worry about SPARC boxes (not)
booting so may be the difference between virtual & physical addresses
but I notice that some of the addresses in the register dump have
non-zero values in the upper 32 bits but the memblock values have zero
in the upper half.


memblock reserved: ADD [0x0000007fcc0a40-0x0000007fcc0a4e] node 1
memblock reserved: add [0x0000007fcc0a40-000000007fcc0a4e] node 1 @767

But a similar address in the registers has fffff800 in there.

o4: fffff8007fcc0a4d

I know that there are a number of explanations why things would be
different (32 bit accesses etc) but it could explain things plus we would
be talking 64 bit addresses in the kernel.

Just a thought.

Richard


> In addition, a third type of sparc machines breaks in a third way - V210
> and V240 just hang after telling
>
> console [tty0] enabled, bootconsole disabled
>
> and before calibrating the delay loop. Bisect has led to the same commit.
>

2012-02-22 20:27:48

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Richard Mortimer <[email protected]>
Date: Wed, 22 Feb 2012 18:22:36 +0000

> memblock reserved: ADD [0x0000007fcc0a40-0x0000007fcc0a4e] node 1
> memblock reserved: add [0x0000007fcc0a40-000000007fcc0a4e] node 1 @767

These are physical addresses.

> But a similar address in the registers has fffff800 in there.
>
> o4: fffff8007fcc0a4d

All of physical memory is mapped linearly starting at 0xfffff80000000000
and this is such a virtual address.
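
The relationship can be checked with a few lines of C. This is only a sketch
of the arithmetic: the constant comes from the statement above, and the
helper names are made up for illustration rather than taken from the sparc64
headers.

```c
#include <stdint.h>

/* Base of the linear mapping of physical memory, as stated above. */
#define LINEAR_MAP_BASE 0xfffff80000000000ULL

/* Hypothetical helpers: translate between a physical address and its
 * address in the linear mapping. */
uint64_t phys_to_virt_linear(uint64_t paddr)
{
    return LINEAR_MAP_BASE + paddr;
}

uint64_t virt_to_phys_linear(uint64_t vaddr)
{
    return vaddr - LINEAR_MAP_BASE;
}
```

Applied to the register value o4 = fffff8007fcc0a4d, this gives the physical
address 0x7fcc0a4d, which lies inside the reserved range
[0x7fcc0a40-0x7fcc0a4e] from the memblock log.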

2012-02-22 20:44:29

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Tejun Heo <[email protected]>
Date: Wed, 22 Feb 2012 09:48:25 -0800

> On Wed, Feb 22, 2012 at 02:36:13AM +0200, Meelis Roos wrote:
>> > Meelis, can you please apply the following patch before & after the
>> > offending commit, boot with "memblock=debug" added as kernel param and
>> > post the boot log? The patch will generate some offset warnings after
>> > the commit but should work fine.
>>
>> Before the commit (v3.2-rc3-75-g0ee332c): memblock1.gz (attached)
>> After the commit (v3.2-rc3-76-g7bd0b0f): memblock2.gz (attached)
>
> Can you please try the following patch? If it still fails to boot,
> please attach the failing log. Thank you.

Interesting, but two things strike me.

First, this seems like it would only cause problems if the caller
specified a too small size parameter, and then wrote past the 'size'
bytes of the buffer. And if so, this means we have an improperly
sized allocation somewhere, probably in the OF tree fetching code.

For example, maybe we mis-calculate the size of an OF device node
property before we fetch it from the firmware, therefore allocate
too small a buffer, and the property fetch operation splats all
over the end of the buffer. Another possibility is that the
property length reported by the firmware is wrong and too small.

BTW, this kind of bug would be easy to catch, simply put a magic
number signature into all unallocated memblock memory then at
allocation time check that signature. If we signal an error when we
don't see the proper signature and turn on the OF tree building
logging, we can see exactly which operation writes past the end of a
buffer.
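
A userspace sketch of the signature idea, to show the shape of the check
rather than a kernel implementation. The poison value and both helper names
are arbitrary assumptions.

```c
#include <stddef.h>
#include <string.h>

#define POISON_BYTE 0x5a  /* arbitrary signature value */

/* Stamp the signature over memory that has not been handed out yet. */
void poison_fill(unsigned char *buf, size_t len)
{
    memset(buf, POISON_BYTE, len);
}

/* Return the offset of the first byte that no longer carries the
 * signature, or len if the whole region is still intact. */
size_t poison_check(const unsigned char *buf, size_t len)
{
    size_t i;

    for (i = 0; i < len; i++)
        if (buf[i] != POISON_BYTE)
            return i;
    return len;
}
```

An allocator would call poison_check() on a region just before handing it
out. Any result other than len means an earlier user wrote past the end of
its own buffer.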

Second, you'd need similar handling in other call chains such as
memblock_double_array()'s invocation of memblock_find_in_range().
It seems a bad idea to hide how size is modified, so probably it's
best to pass the address of the size parameter and modify the
caller's value in that way so that the size used in the reserve
matches up.

2012-02-22 21:01:05

by Tejun Heo

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

Hello, David.

On Wed, Feb 22, 2012 at 03:44:17PM -0500, David Miller wrote:
> > Can you please try the following patch? If it still fails to boot,
> > please attach the failing log. Thank you.
>
> Interesting, but two things strike me.
>
> First, this seems like it would only cause problems if the caller
> specified a too small size parameter, and then wrote past the 'size'
> bytes of the buffer. And if so, this means we have an improperly
> sized allocation somewhere, probably in the OF tree fetching code.

There's another, less likely, possibility. It made the allocation
table much larger and the lowest address used ended up lower.
0x0000007fc8fa40 vs 0x0000007fc94000. Not too much of a difference and
just allocating some more memory should rule out or confirm it.

> For example, maybe we mis-calculate the size of an OF device node
> property before we fetch it from the firmware, therefore allocate
> too small a buffer, and the property fetch operation splats all
> over the end of the buffer. Another possibility is that the
> property length reported by the firmware is wrong and too small.
>
> BTW, this kind of bug would be easy to catch, simply put a magic
> number signature into all unallocated memblock memory then at
> allocation time check that signature. If we signal an error when we
> don't see the proper signature and turn on the OF tree building
> logging, we can see exactly which operation writes past the end of a
> buffer.

Yeah, redzoning can definitely help, but I'm not sure whether we want
to go for full-on allocation debugging in the early allocator. The
thing doesn't even support freeing.

> Second, you'd need similar handling in other call chains such as
> memblock_double_array()'s invocation of memblock_find_in_range().
> It seems a bad idea to hide how size is modified, so probably it's
> best to pass the address of the size parameter and modify the
> caller's value in that way so that the size used in the reserve
> matches up.

I suspect the size modification was added later to avoid expanding the
allocation table early during boot, and we can do that only for
memblock_alloc*() calls as they don't have a matching free interface.
If we modify explicit reservations, we have to propagate the modified
size to each user and so on. Given that the allocation table is
discarded after boot completion and there aren't too many explicit
reservations, I don't think we need to extend size aligning to all
find_in_range users. I guess it all depends on how complete an allocator
we want for early boot.

Thanks.

--
tejun

2012-02-23 18:55:16

by Tejun Heo

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

Hello,

On Wed, Feb 22, 2012 at 08:25:32PM +0200, Meelis Roos wrote:
> > Can you please try the following patch? If it still fails to boot,
> > please attach the failing log. Thank you.
>
> It works on E3500! Will try other machines tomorrow.

Once confirmed, I'll push the patch through tip. It just hides the
underlying problem but we should be in no worse shape than before,
it's a two-line change so reproducing the problem again for proper
diagnosing isn't difficult, and we're getting a bit late in release
cycle already.

Thanks.

--
tejun

2012-02-23 23:31:59

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Tejun Heo <[email protected]>
Date: Thu, 23 Feb 2012 10:55:03 -0800

> Hello,
>
> On Wed, Feb 22, 2012 at 08:25:32PM +0200, Meelis Roos wrote:
>> > Can you please try the following patch? If it still fails to boot,
>> > please attach the failing log. Thank you.
>>
>> It works on E3500! Will try other machines tomorrow.
>
> Once confirmed, I'll push the patch through tip. It just hides the
> underlying problem but we should be in no worse shape than before,
> it's a two-line change so reproducing the problem again for proper
> diagnosing isn't difficult, and we're getting a bit late in release
> cycle already.

Ok.

2012-02-24 09:20:59

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> > > Can you please try the following patch? If it still fails to boot,
> > > please attach the failing log. Thank you.
> >
> > It works on E3500! Will try other machines tomorrow.
>
> Once confirmed, I'll push the patch through tip. It just hides the
> underlying problem but we should be in no worse shape than before,
> it's a two-line change so reproducing the problem again for proper
> diagnosing isn't difficult, and we're getting a bit late in release
> cycle already.

It cured the V210 too, but I could not test the V100 since it's offline
until Monday.

--
Meelis Roos ([email protected])

2012-02-27 17:17:51

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> > > > Can you please try the following patch? If it still fails to boot,
> > > > please attach the failing log. Thank you.
> > >
> > > It works on E3500! Will try other machines tomorrow.
> >
> > Once confirmed, I'll push the patch through tip. It just hides the
> > underlying problem but we should be in no worse shape than before,
> > it's a two-line change so reproducing the problem again for proper
> > diagnosing isn't difficult, and we're getting a bit late in release
> > cycle already.
>
> It cured the V210 too but I could not test V100 since it's offline until
> Monday.

Tested V100 too, success!

--
Meelis Roos ([email protected])

2012-02-27 19:43:47

by Sam Ravnborg

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

On Mon, Feb 27, 2012 at 07:17:42PM +0200, Meelis Roos wrote:
> > > > > Can you please try the following patch? If it still fails to boot,
> > > > > please attach the failing log. Thank you.
> > > >
> > > > It works on E3500! Will try other machines tomorrow.
> > >
> > > Once confirmed, I'll push the patch through tip. It just hides the
> > > underlying problem but we should be in no worse shape than before,
> > > it's a two-line change so reproducing the problem again for proper
> > > diagnosing isn't difficult, and we're getting a bit late in release
> > > cycle already.
> >
> > It cured the V210 too but I could not test V100 since it's offline until
> > Monday.
>
> Tested V100 too, success!

Hi Meelis.

I have tried to cook up a small patch that verifies the length of what
we read against the length originally reported.

Could you try to give this a quick spin and see if something
turns up. If you have time it would be good to try on a box
that worked before and one that was fixed by the patch from Tejun.

I have not looked much at the OF stuff, but this looked like the right place to start.

I have no possibility to try it out myself...

Sam

diff --git a/drivers/of/pdt.c b/drivers/of/pdt.c
index 07cc1d6..826204a 100644
--- a/drivers/of/pdt.c
+++ b/drivers/of/pdt.c
@@ -128,6 +128,10 @@ static struct property * __init of_pdt_build_one_prop(phandle node, char *prev,
 			p->value = prom_early_alloc(p->length + 1);
 			len = of_pdt_prom_ops->getproperty(node, p->name,
 					p->value, p->length);
+
+			if (len != p->length)
+				pr_err("prop: %s %d => %d\n", p->name, p->length, len);
+
 			if (len <= 0)
 				p->length = 0;
 			((unsigned char *)p->value)[p->length] = '\0';
@@ -161,8 +165,13 @@ static char * __init of_pdt_get_one_property(phandle node, const char *name)

 	len = of_pdt_prom_ops->getproplen(node, name);
 	if (len > 0) {
+		int proplen;
 		buf = prom_early_alloc(len);
-		len = of_pdt_prom_ops->getproperty(node, name, buf, len);
+		proplen = of_pdt_prom_ops->getproperty(node, name, buf, len);
+
+		if (proplen != len)
+			pr_err("prop: %s %d => %d\n", name, len, proplen);
+
 	}

 	return buf;

2012-02-27 21:25:18

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Could you try to give this a quick spin and see if something
> turns up. If you have time it would be good to try on a box
> that worked before and one that was fixed by the patch from Tejun.

Neither of the machines (the already working one and the one "fixed
with the rounding patch") emits any prop: messages.

--
Meelis Roos ([email protected])

2012-02-27 21:31:00

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Meelis Roos <[email protected]>
Date: Mon, 27 Feb 2012 23:25:11 +0200 (EET)

>> Could you try to give this a quick spin and see if something
>> turns up. If you have time it would be good to try on a box
>> that worked before and one that was fixed by the patch from Tejun.
>
> Neither of the machines (the already working one and the one "fixed
> with the rounding patch") emits any prop: messages.

I think the issue is that OF writes past the end of the buffer even
though the length it reports is smaller than what it writes.

That's why we really need to fill the memblock memory with magic
numbers and scan every allocation for free memory with corrupted
magic values.
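
The magic-number idea can be illustrated with a toy bump allocator in user-space C. This is a sketch under stated assumptions, not memblock: the pool, POISON_BYTE, pool_init() and pool_alloc() are all hypothetical names. Free memory is filled with a signature up front, and each allocation first checks that the bytes it is about to hand out still carry it.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define POOL_SIZE     256
#define POISON_BYTE   0xa5		/* arbitrary magic value (assumption) */
#define NO_CORRUPTION ((size_t)-1)

unsigned char pool[POOL_SIZE];
size_t pool_used;			/* everything at or above this is free */

/* Fill the whole pool with the magic value before any allocation. */
void pool_init(void)
{
	memset(pool, POISON_BYTE, sizeof(pool));
	pool_used = 0;
}

/*
 * Bump-allocate size bytes.  Before handing the memory out, verify that
 * it still carries the signature; a mismatch means something wrote to
 * memory that was never allocated.  On corruption, NULL is returned and
 * *bad_off holds the offset of the first damaged byte.  If the pool is
 * merely full, NULL is returned with *bad_off == NO_CORRUPTION.
 */
void *pool_alloc(size_t size, size_t *bad_off)
{
	size_t i;

	assert(bad_off != NULL);
	*bad_off = NO_CORRUPTION;
	if (size > POOL_SIZE - pool_used)
		return NULL;
	for (i = pool_used; i < pool_used + size; i++) {
		if (pool[i] != POISON_BYTE) {
			*bad_off = i;
			return NULL;
		}
	}
	pool_used += size;
	return &pool[pool_used - size];
}
```

If a caller allocates 16 bytes and then writes 20, the next pool_alloc() fails and reports offset 16 as the first corrupted byte. In a real early allocator, logging at that point together with OF tree building logs would narrow down which operation did the write.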

2012-02-28 21:10:35

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: David Miller <[email protected]>
Date: Mon, 27 Feb 2012 16:30:44 -0500 (EST)

> I think the issue is that OF writes past the end of the buffer even
> though the length it reports is smaller than what it writes.

Meelis, can you get your tree back into a state where the crash happens
and then add the following debugging patch and see what happens?

Thanks!

diff --git a/drivers/of/pdt.c b/drivers/of/pdt.c
index 07cc1d6..367ef33 100644
--- a/drivers/of/pdt.c
+++ b/drivers/of/pdt.c
@@ -125,12 +125,31 @@ static struct property * __init of_pdt_build_one_prop(phandle node, char *prev,
 		} else {
 			int len;

+#if 1
+			int i;
+			p->value = prom_early_alloc(p->length + 1 + 64);
+			for (i = p->length + 1; i < p->length + 1 + 64; i++)
+				((unsigned char *)p->value)[i] = 0xff;
+#else
 			p->value = prom_early_alloc(p->length + 1);
+#endif
 			len = of_pdt_prom_ops->getproperty(node, p->name,
 					p->value, p->length);
-			if (len <= 0)
+			if (len <= 0) {
+				pr_info("OF BUG: getproperty(%s, %d) returns %d\n",
+					p->name, p->length, len);
 				p->length = 0;
+			}
 			((unsigned char *)p->value)[p->length] = '\0';
+#if 1
+			for (i = p->length + 1; i < p->length + 1 + 64; i++) {
+				if (((unsigned char *)p->value)[i] != 0xff) {
+					pr_info("OF BUG: Write past end of property buffer\n");
+					pr_info("OF BUG: Property name [%s] length [%d] getprop len [%d]\n",
+						p->name, p->length, len);
+				}
+			}
+#endif
 		}
 	}
 	return p;
@@ -161,7 +180,11 @@ static char * __init of_pdt_get_one_property(phandle node, const char *name)

 	len = of_pdt_prom_ops->getproplen(node, name);
 	if (len > 0) {
+#if 1
+		buf = prom_early_alloc(len + 64);
+#else
 		buf = prom_early_alloc(len);
+#endif
 		len = of_pdt_prom_ops->getproperty(node, name, buf, len);
 	}

2012-02-28 21:36:14

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> Meelis, can you get your tree back into a state where the crash happens
> and then add the following debugging patch and see what happens?

Tried it, no obvious results in dmesg, except the crash is in a slightly
different location.

[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
[ 0.000000] PROMLIB: Root node compatible:
[ 0.000000] Linux version 3.2.0-rc3-00076-g7bd0b0f-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #84 SMP Tue Feb 28 23:28:49 EET 2012
[ 0.000000] debug: ignoring loglevel setting.
[ 0.000000] bootconsole [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 08:00:20:b6:ee:e2
[ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] Unable to handle kernel paging request at virtual address 000000007fcf2000
[ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[ 0.000000] tsk->{mm,active_mm}->pgd = fffff800007db7d0
[ 0.000000] \|/ ____ \|/
[ 0.000000] "@'/ .. \`@"
[ 0.000000] /_| \__/ |_\
[ 0.000000] \__U_/
[ 0.000000] swapper(0): Oops [#1]
[ 0.000000] TSTATE: 0000008880e01600 TPC: 000000000057b4c8 TNPC: 000000000057b4cc Y: 00000037 Not tainted
[ 0.000000] TPC: <strcmp+0x8/0x60>
[ 0.000000] g0: 000000000077f7f0 g1: 0000000000000000 g2: 000000000000002f g3: 00000000000000f0
[ 0.000000] g4: 000000000077f350 g5: 0000000000000000 g6: 0000000000760000 g7: 0000000000000050
[ 0.000000] o0: 000000000079dbc8 o1: 0000000000000000 o2: 0000000000000000 o3: 0000000000000002
[ 0.000000] o4: 0000000000000002 o5: 0000000000000000 sp: 0000000000763181 ret_pc: 00000000006a9984
[ 0.000000] RPC: <_raw_read_lock+0x24/0x40>
[ 0.000000] l0: 0000000001028000 l1: fffff8007fcbc380 l2: 8000000000000000 l3: 0800000000000000
[ 0.000000] l4: 0000000000000080 l5: 0000000000000002 l6: 0000000000000000 l7: 0020280000000000
[ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000001010101 i3: 0000000080808080
[ 0.000000] i4: fffff8007fcb8ccd i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606250
[ 0.000000] I7: <of_find_node_by_path+0x30/0x80>
[ 0.000000] Call Trace:
[ 0.000000] [0000000000606250] of_find_node_by_path+0x30/0x80
[ 0.000000] [0000000000606e0c] of_alias_scan+0xcc/0x1c0
[ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc
[ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110
[ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c
[ 0.000000] [00000000006918c8] tlb_fixup_done+0xa0/0xa8
[ 0.000000] [0000000000000000] (null)
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Caller[0000000000606250]: of_find_node_by_path+0x30/0x80
[ 0.000000] Caller[0000000000606e0c]: of_alias_scan+0xcc/0x1c0
[ 0.000000] Caller[00000000007c328c]: of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] Caller[00000000007b0680]: prom_build_devicetree+0x10/0x3c
[ 0.000000] Caller[00000000007b4614]: paging_init+0x59c/0x6bc
[ 0.000000] Caller[00000000007afffc]: setup_arch+0xf8/0x110
[ 0.000000] Caller[00000000007ae514]: start_kernel+0x84/0x32c
[ 0.000000] Caller[00000000006918c8]: tlb_fixup_done+0xa0/0xa8
[ 0.000000] Caller[0000000000000000]: (null)
[ 0.000000] Instruction DUMP: 01000000 9de3bf50 82102000 <c40e0001> c60e4001 80a08003 12400008 82006001 80a0a000
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Call Trace:
[ 0.000000] [000000000069c7fc] panic+0x68/0x1e4
[ 0.000000] [0000000000461a30] do_exit+0x230/0x2c0
[ 0.000000] [00000000004292c0] die_if_kernel+0x180/0x260
[ 0.000000] [000000000069c224] unhandled_fault+0x8c/0x98
[ 0.000000] [0000000000445778] do_kernel_fault+0xd8/0x100
[ 0.000000] [000000000044584c] do_sparc64_fault+0xac/0x540
[ 0.000000] [0000000000407948] sparc64_realfault_common+0x10/0x20
[ 0.000000] [000000000057b4c8] strcmp+0x8/0x60
[ 0.000000] [0000000000606250] of_find_node_by_path+0x30/0x80
[ 0.000000] [0000000000606e0c] of_alias_scan+0xcc/0x1c0
[ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc
[ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110
[ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c
[ 0.000000] [00000000006918c8] tlb_fixup_done+0xa0/0xa8
[ 0.000000] Press Stop-A (L1-A) to return to the boot prom

--
Meelis Roos ([email protected])

2012-02-28 22:57:14

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Meelis Roos <[email protected]>
Date: Tue, 28 Feb 2012 23:36:07 +0200 (EET)

>> Meelis, can you get your tree back into a state where the crash happens
>> and then add the following debugging patch and see what happens?
>
> Tried it, no obvious results in dmesg, except the crash is in a slightly
> different location.

Interesting, the corruption is a little bit different this time, yet similar
to the ones we saw previously:

> [ 0.000000] TPC: <strcmp+0x8/0x60>
...
> [ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000001010101 i3: 0000000080808080
> [ 0.000000] i4: fffff8007fcb8ccd i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606250

This is strcmp(0x000000007fcf3c80, 0xfffff8007fcec480). The first arg is
a bad pointer; somehow the top virtual address bits have been zeroed out.

It comes from dp->full_name, so something walked all over the beginning
of a device_node object.

Let's see if we can figure out anything else about the nature of the
corruption, please add this patch on top.

diff --git a/drivers/of/base.c b/drivers/of/base.c
index 133908a..7c0f7f4 100644
--- a/drivers/of/base.c
+++ b/drivers/of/base.c
@@ -376,6 +376,18 @@ struct device_node *of_find_node_by_path(const char *path)

 	read_lock(&devtree_lock);
 	for (; np; np = np->allnext) {
+		if (!np->full_name)
+			continue;
+
+		if ((unsigned long)np->full_name < 0xfffff80000000000) {
+			pr_info("OF BUG: Bogus full_name pointer [%p]\n",
+				np->full_name);
+			pr_info("OF BUG: np[%p] np->name[%p] np->type[%p] np->phandle[0x%08x]\n",
+				np, np->name, np->type, (unsigned int) np->phandle);
+			pr_info("OF BUG: np->name(%s) np->type(%s)\n",
+				np->name, np->type);
+		}
+
 		if (np->full_name && (of_node_cmp(np->full_name, path) == 0)
 		    && of_node_get(np))
 			break;
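
The pointer test in this patch can be checked against the two strcmp() arguments from the oops with a few lines of user-space C. This is a sketch only: LINEAR_BASE, looks_bogus() and with_high_bits() are hypothetical names, and reading the patch's constant as the lowest address a valid early allocation can have on this machine is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* Lower bound used in the patch (its meaning here is an assumption). */
#define LINEAR_BASE 0xfffff80000000000ULL

/* Same comparison as the patch: anything below the bound is suspect. */
int looks_bogus(uint64_t ptr)
{
	return ptr < LINEAR_BASE;
}

/* Put the high bits back, to see which address the value may have been. */
uint64_t with_high_bits(uint64_t ptr)
{
	assert(looks_bogus(ptr));
	return ptr | LINEAR_BASE;
}
```

Applied to the register dump, looks_bogus() flags 0x000000007fcf3c80 and accepts 0xfffff8007fcec480, and with_high_bits() maps the bad value to 0xfffff8007fcf3c80. Values that point into the kernel image, like the text addresses in the call trace, also fall below the bound, so a hit is a hint and not proof of corruption.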

2012-02-29 06:15:11

by Meelis Roos

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

> > Tried it, no obvious results in dmesg, except the crash is in a slightly
> > different location.
>
> Interesting, the corruption is a little bit different this time, yet similar
> to the ones we saw previously:
>
> > [ 0.000000] TPC: <strcmp+0x8/0x60>
> ...
> > [ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000001010101 i3: 0000000080808080
> > [ 0.000000] i4: fffff8007fcb8ccd i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606250
>
> This is strcmp(0x000000007fcf3c80, 0xfffff8007fcec480). The first arg is
> a bad pointer; somehow the top virtual address bits have been zeroed out.
>
> It comes from dp->full_name, so something walked all over the beginning
> of a device_node object.
>
> Let's see if we can figure out anything else about the nature of the
> corruption, please add this patch on top.

Here it is - triggers this time:

[ 0.000000] PROMLIB: Sun IEEE Boot Prom 'OBP 3.2.30 2002/10/25 14:03'
[ 0.000000] PROMLIB: Root node compatible:
[ 0.000000] Linux version 3.2.0-rc3-00076-g7bd0b0f-dirty (mroos@korvits) (gcc version 4.6.2 (Debian 4.6.2-14) ) #85 SMP Wed Feb 29 08:06:38 EET 2012
[ 0.000000] debug: ignoring loglevel setting.
[ 0.000000] bootconsole [earlyprom0] enabled
[ 0.000000] ARCH: SUN4U
[ 0.000000] Ethernet address: 08:00:20:b6:ee:e2
[ 0.000000] Kernel: Using 4 locked TLB entries for main kernel image.
[ 0.000000] Remapping the kernel... done.
[ 0.000000] OF BUG: Bogus full_name pointer [0000000000730e08]
[ 0.000000] OF BUG: np[fffff8007fcf3f40] np->name[fffff8007fcf3ec0] np->type[0000000000756bf8] np->phandle[0xf0029c88]
[ 0.000000] OF BUG: np->name(SUNW,Ultra-Enterprise) np->type(<NULL>)
[ 0.000000] OF BUG: Bogus full_name pointer [0000000000730e08]
[ 0.000000] OF BUG: np[fffff8007fcf3f40] np->name[fffff8007fcf3ec0] np->type[0000000000756bf8] np->phandle[0xf0029c88]
[ 0.000000] OF BUG: np->name(SUNW,Ultra-Enterprise) np->type(<NULL>)
[ 0.000000] OF BUG: Bogus full_name pointer [0000000000730e08]
[ 0.000000] OF BUG: np[fffff8007fcf3f40] np->name[fffff8007fcf3ec0] np->type[0000000000756bf8] np->phandle[0xf0029c88]
[ 0.000000] OF BUG: np->name(SUNW,Ultra-Enterprise) np->type(<NULL>)
[ 0.000000] OF BUG: Bogus full_name pointer [000000007fcf3c80]
[ 0.000000] OF BUG: np[fffff8007fceacc0] np->name[ (null)] np->type[ (null)] np->phandle[0x00000001]
[ 0.000000] OF BUG: np->name((null)) np->type((null))
[ 0.000000] Unable to handle kernel paging request at virtual address 000000007fcf2000
[ 0.000000] tsk->{mm,active_mm}->context = 0000000000000000
[ 0.000000] tsk->{mm,active_mm}->pgd = fffff800007db7d0
[ 0.000000] \|/ ____ \|/
[ 0.000000] "@'/ .. \`@"
[ 0.000000] /_| \__/ |_\
[ 0.000000] \__U_/
[ 0.000000] swapper(0): Oops [#1]
[ 0.000000] TSTATE: 0000004480e01600 TPC: 000000000057b4c8 TNPC: 000000000057b4cc Y: 00000037 Not tainted
[ 0.000000] TPC: <strcmp+0x8/0x60>
[ 0.000000] g0: 000000000077f7f0 g1: 0000000000000000 g2: 0000000000000000 g3: 0000000000787950
[ 0.000000] g4: 000000000077f350 g5: 0000000000000000 g6: 0000000000760000 g7: 0000000000000040
[ 0.000000] o0: 000000000000003f o1: 0000000000763930 o2: 0000000000000003 o3: 00000000007879e4
[ 0.000000] o4: 000000000080ee45 o5: 000000000080ee1b sp: 0000000000763181 ret_pc: 000000000069cad0
[ 0.000000] RPC: <printk+0x24/0x38>
[ 0.000000] l0: 0000000001028000 l1: fffff8007fcbc380 l2: 8000000000000000 l3: 0800000000000000
[ 0.000000] l4: 0000000000000080 l5: 0000000000000002 l6: 0000000000000000 l7: 0020280000000000
[ 0.000000] i0: 000000007fcf3c80 i1: fffff8007fcec480 i2: 0000000000000000 i3: 0000000000000000
[ 0.000000] i4: 0000000000000001 i5: 0000000000028337 i6: 0000000000763231 i7: 0000000000606278
[ 0.000000] I7: <of_find_node_by_path+0x58/0xe0>
[ 0.000000] Call Trace:
[ 0.000000] [0000000000606278] of_find_node_by_path+0x58/0xe0
[ 0.000000] [0000000000606e6c] of_alias_scan+0xcc/0x1c0
[ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc
[ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110
[ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c
[ 0.000000] [0000000000691928] tlb_fixup_done+0xa0/0xa8
[ 0.000000] [0000000000000000] (null)
[ 0.000000] Disabling lock debugging due to kernel taint
[ 0.000000] Caller[0000000000606278]: of_find_node_by_path+0x58/0xe0
[ 0.000000] Caller[0000000000606e6c]: of_alias_scan+0xcc/0x1c0
[ 0.000000] Caller[00000000007c328c]: of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] Caller[00000000007b0680]: prom_build_devicetree+0x10/0x3c
[ 0.000000] Caller[00000000007b4614]: paging_init+0x59c/0x6bc
[ 0.000000] Caller[00000000007afffc]: setup_arch+0xf8/0x110
[ 0.000000] Caller[00000000007ae514]: start_kernel+0x84/0x32c
[ 0.000000] Caller[0000000000691928]: tlb_fixup_done+0xa0/0xa8
[ 0.000000] Caller[0000000000000000]: (null)
[ 0.000000] Instruction DUMP: 01000000 9de3bf50 82102000 <c40e0001> c60e4001 80a08003 12400008 82006001 80a0a000
[ 0.000000] Kernel panic - not syncing: Attempted to kill the idle task!
[ 0.000000] Call Trace:
[ 0.000000] [000000000069c85c] panic+0x68/0x1e4
[ 0.000000] [0000000000461a30] do_exit+0x230/0x2c0
[ 0.000000] [00000000004292c0] die_if_kernel+0x180/0x260
[ 0.000000] [000000000069c284] unhandled_fault+0x8c/0x98
[ 0.000000] [0000000000445778] do_kernel_fault+0xd8/0x100
[ 0.000000] [000000000044584c] do_sparc64_fault+0xac/0x540
[ 0.000000] [0000000000407948] sparc64_realfault_common+0x10/0x20
[ 0.000000] [000000000057b4c8] strcmp+0x8/0x60
[ 0.000000] [0000000000606278] of_find_node_by_path+0x58/0xe0
[ 0.000000] [0000000000606e6c] of_alias_scan+0xcc/0x1c0
[ 0.000000] [00000000007c328c] of_pdt_build_devicetree+0x90/0xa0
[ 0.000000] [00000000007b0680] prom_build_devicetree+0x10/0x3c
[ 0.000000] [00000000007b4614] paging_init+0x59c/0x6bc
[ 0.000000] [00000000007afffc] setup_arch+0xf8/0x110
[ 0.000000] [00000000007ae514] start_kernel+0x84/0x32c
[ 0.000000] [0000000000691928] tlb_fixup_done+0xa0/0xa8
[ 0.000000] Press Stop-A (L1-A) to return to the boot prom

--
Meelis Roos ([email protected])

2012-02-29 06:28:22

by David Miller

Subject: Re: OF-related boot crash in 3.3.0-rc3-00188-g3ec1e88

From: Meelis Roos <[email protected]>
Date: Wed, 29 Feb 2012 08:15:06 +0200 (EET)

> Here it is - triggers this time:

Thanks a lot.

I need to add some more diagnostics to further narrow it down,
I'll give you a patch for that when I get a chance.