Mario Holbe wrote:
> on 2.6.37-rc7 the b43 driver crashes in hwrng_register(). This makes the
> system virtually unusable since it appears to block networking syscalls.
> This leads to, for example, ifconfig never returning.
> This issue does also exist in 2.6.37-rc5.
> This issue does not exist in 2.6.36.2.
>
> The hardware in question is:
> 02:00.0 Network controller [0280]: Broadcom Corporation BCM4312 802.11b/g LP-PHY [14e4:4315] (rev 01)
> on a Lenovo Ideapad S12 with VIA Nano.
> dmesg excerpt:
> [ 2.056847] b43-pci-bridge 0000:02:00.0: PCI INT A -> GSI 28 (level, low) -> IRQ 28
> [ 2.056864] b43-pci-bridge 0000:02:00.0: setting latency timer to 64
...
> [ 8.643695] b43-phy0: Broadcom 4312 WLAN found (core revision 15)
> [ 9.047514] ieee80211 phy0: Selected rate control algorithm 'minstrel_ht'
> [ 9.048441] Registered led device: b43-phy0::tx
> [ 9.048479] Registered led device: b43-phy0::rx
> [ 9.048518] Registered led device: b43-phy0::radio
> [ 9.048542] Broadcom 43xx driver loaded [ Features: PMLS, Firmware-ID: FW13 ]
...
> [ 24.312100] b43-phy0: Loading firmware version 410.2160 (2007-05-26 15:32:10)
...
> [ 29.848400] b43-pci-bridge 0000:02:00.0: PCI: Disallowing DAC for device
> [ 29.848407] b43-phy0: DMA mask fallback from 64-bit to 32-bit
> [ 29.868632] BUG: unable to handle kernel paging request at 907cde0c
> [ 29.868640] IP: [<f8d543cc>] hwrng_register+0x4c/0x139 [rng_core]
> [ 29.868655] *pde = 00000000
> [ 29.868659] Oops: 0000 [#1] SMP
> [ 29.868664] last sysfs file: /sys/bus/pci/drivers/parport_pc/uevent
> [ 29.868670] Modules linked in: parport_pc ppdev lp parport sbs sbshc
power_meter pci_slot hed fan container acpi_cpufreq mperf cpufreq_conservative
cpufreq_userspace cpufreq_stats cpufreq_powersave dm_crypt fuse loop eeprom
via_cputemp i2c_dev nvram padlock_aes aes_i586 aes_generic padlock_sha
sha256_generic sha1_generic via_rng msr cpuid snd_hda_codec_realtek
snd_hda_intel snd_hda_codec arc4 snd_hwdep ecb snd_pcm_oss snd_mixer_oss snd_pcm
snd_seq_midi b43 snd_rawmidi uvcvideo snd_seq_midi_event joydev videodev btusb
snd_seq rng_core video ac battery tpm_tis v4l1_compat tpm tpm_bios output
power_supply i2c_viapro snd_timer ideapad_laptop snd_seq_device serio_raw wmi
mac80211 cfg80211 processor snd pcspkr i2c_core psmouse button bluetooth evdev
shpchp soundcore snd_page_alloc rfkill pci_hotplug ext3 jbd mbcache raid10
raid456 async_raid6_recov async_pq raid6_pq async_xor xor async_memcpy async_tx
raid1 raid0 multipath linear md_mod dm_mirror dm_region_hash dm_log dm_mod btrfs
zlib_deflate crc32c libcrc32c sd_mod crc_t10dif ata_generic uhci_hcd pata_via
libata ssb ehci_hcd tg3 scsi_mod usbcore pcmcia via_sdmmc mmc_core pcmcia_core
libphy thermal thermal_sys nls_base [last unloaded: scsi_wait_scan]
> [ 29.868810]
> [ 29.868816] Pid: 1781, comm: NetworkManager Not tainted 2.6.37-rc7-686 #1 MoutCook/20021,2959
> [ 29.868822] EIP: 0060:[<f8d543cc>] EFLAGS: 00010286 CPU: 0
> [ 29.868829] EIP is at hwrng_register+0x4c/0x139 [rng_core]
> [ 29.868834] EAX: 00000001 EBX: f4b17010 ECX: f6e5db6c EDX: f4b17035
> [ 29.868839] ESI: 907cddf0 EDI: 00000000 EBP: 00000036 ESP: f6e5db54
> [ 29.868844] DS: 007b ES: 007b FS: 00d8 GS: 00e0 SS: 0068
> [ 29.868850] Process NetworkManager (pid: 1781, ti=f6e5c000 task=f6eb6080 task.ti=f6e5c000)
> [ 29.868854] Stack:
> [ 29.868856] f4b16fc0 f4b17035 f8e5a870 f4b17035 0000001f f8e70095 f8e6f9ca f4b71e70
> [ 29.868866] 0000000f f6c95000 f6c95000 f6e97400 f4b162c0 f4b10240 f4b16fc8 f8e5ad67
> [ 29.868875] f89e43da f4b162c0 f6cab400 f8b80e44 f6cab000 f8b70889 f8b6fe7a 00000000
> [ 29.868884] Call Trace:
> [ 29.868909] [<f8e5a870>] ? b43_wireless_core_init+0xd0c/0xdd6 [b43]
> [ 29.868925] [<f8e5ad67>] ? b43_op_start+0xf8/0x142 [b43]
> [ 29.868947] [<f89e43da>] ? cfg80211_netdev_notifier_call+0x342/0x355 [cfg80211]
> [ 29.868984] [<f8b70889>] ? ieee80211_do_open+0xed/0x45f [mac80211]
> [ 29.869002] [<f8b6fe7a>] ? ieee80211_check_concurrent_iface+0x1c/0x135 [mac80211]
> [ 29.869015] [<c11edcba>] ? __dev_open+0x7d/0xa7
> [ 29.869022] [<c11ec683>] ? __dev_change_flags+0x9a/0x10d
> [ 29.869028] [<c11edc12>] ? dev_change_flags+0x10/0x3b
> [ 29.869036] [<c11f7c77>] ? do_setlink+0x23e/0x532
> [ 29.869044] [<c11f803b>] ? rtnl_setlink+0xd0/0xe1
> [ 29.869058] [<c1145b00>] ? __strncpy_from_user+0x1d/0x2b
> [ 29.869064] [<c11f7f6b>] ? rtnl_setlink+0x0/0xe1
> [ 29.869069] [<c11f77a2>] ? rtnetlink_rcv_msg+0x186/0x19c
> [ 29.869075] [<c11f761c>] ? rtnetlink_rcv_msg+0x0/0x19c
> [ 29.869082] [<c1206818>] ? netlink_rcv_skb+0x2d/0x72
> [ 29.869088] [<c11f7616>] ? rtnetlink_rcv+0x18/0x1e
> [ 29.869093] [<c120666c>] ? netlink_unicast+0xba/0x10e
> [ 29.869099] [<c1207170>] ? netlink_sendmsg+0x23d/0x256
> [ 29.869111] [<c11dfe26>] ? __sock_sendmsg+0x48/0x4e
> [ 29.869117] [<c11e008f>] ? sock_sendmsg+0x78/0x8f
> [ 29.869123] [<c11e008f>] ? sock_sendmsg+0x78/0x8f
> [ 29.869131] [<c10c6785>] ? d_kill+0x38/0x3d
> [ 29.869141] [<c11e7f0c>] ? verify_iovec+0x3d/0x79
> [ 29.869147] [<c11e088d>] ? sys_sendmsg+0x15f/0x1c1
> [ 29.869153] [<c11e04c4>] ? sockfd_lookup_light+0x13/0x3f
> [ 29.869160] [<c11e0b25>] ? sys_sendto+0xfd/0x121
> [ 29.869166] [<c11e43eb>] ? sk_prot_alloc+0x62/0xd6
> [ 29.869174] [<c1001e6e>] ? __switch_to+0x6f/0xe2
> [ 29.869183] [<c12860de>] ? schedule+0x579/0x5b6
> [ 29.869190] [<c11e0723>] ? sys_recvmsg+0x3c/0x47
> [ 29.869196] [<c11e1afd>] ? sys_socketcall+0x17f/0x1cb
> [ 29.869202] [<c1002f9f>] ? sysenter_do_call+0x12/0x28
> [ 29.869206] Code: f8 e8 46 25 53 c8 8b 35 ec 45 d5 f8 eb 1a 8b 13 8b 06 e8 17
11 3f c8 85 c0 75 0a be ef ff ff ff e9 d3 00 00 00 8b 76 1c 83 ee 1c <8b> 46 1c
0f 18 00 90 81 fe d0 45 d5 f8 75 d4 83 3d ec 47 d5 f8
> [ 29.869249] EIP: [<f8d543cc>] hwrng_register+0x4c/0x139 [rng_core] SS:ESP 0068:f6e5db54
> [ 29.869259] CR2: 00000000907cde0c
> [ 29.869264] ---[ end trace 6719399ed79e8cc1 ]---
I almost missed this posting. Please post wireless problems with
[email protected] for better visibility.
I have a BCM4312 (14e4:4315) on a netbook that does not have this problem, thus
I will have to rely on your debugging. An additional difficulty is that the only
changes to b43 between 2.6.36 and 2.6.37 are adding an additional PCI ID, some
fixes to the SDIO driver, and some code for an 802.11n device. None of these
should affect your 802.11 b/g unit.
Is it possible for you to bisect between 2.6.36 and 2.6.37-rc5? I wish I could
suggest some way to minimize the number of commits and builds, but the problem
could be anywhere.
Larry
On 12/30/2010 08:34 AM, Mario 'BitKoenig' Holbe wrote:
> On Wed, Dec 29, 2010 at 08:37:10PM -0600, Larry Finger wrote:
>> No, don't bother. I do have a different request. The byte counts for my 32-bit
>> system do not match yours. Could you please use the following command to find
>> the instructions that are failing?
>>
>> objdump -l -d drivers/char/hw_random/core.o | less
>>
>> Use the search to find the start of hwrng_register, then add 0x4c to the
>> starting address. Once I see the instruction that is failing, I should be able
>> to find where the failure occurs.
>
> Alright, here we go...
>
> [ 30.012695] BUG: unable to handle kernel paging request at 4b28f458
> [ 30.012708] IP: [<f90703cc>] hwrng_register+0x4c/0x139 [rng_core]
>
> 00000380 <hwrng_register>:
> hwrng_register():
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:299
> 380: 56 push %esi
> 381: 53 push %ebx
> ...
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:312
> 3c6: 8b 76 1c mov 0x1c(%esi),%esi
> 3c9: 83 ee 1c sub $0x1c,%esi
> prefetch():
> /tmp/1/linux-source-2.6.37-rc7/arch/x86/include/asm/processor.h:837
> 3cc: 8b 46 1c mov 0x1c(%esi),%eax
> 3cf: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
> hwrng_register():
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:312
> 3d3: 81 fe f8 ff ff ff cmp $0xfffffff8,%esi
> 3d9: 75 d4 jne 3af <hwrng_register+0x2f>
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:319
>
> 312 list_for_each_entry(tmp, &rng_list, list) {
> 313 if (strcmp(tmp->name, rng->name) == 0)
> 314 goto out_unlock;
> 315 }
>
> This is btw. the same data that is accessed in the cat rng_available
> crash via hwrng_attr_available_show():
>
> [ 389.303538] BUG: unable to handle kernel paging request at 288dcb5b
> [ 389.303553] IP: [<f8dda34c>] hwrng_attr_available_show+0x5c/0x90 [rng_core]
>
> 000002f0 <hwrng_attr_available_show>:
> hwrng_attr_available_show():
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:236
> 2f0: 55 push %ebp
> ...
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:245
> 346: 8b 5b 1c mov 0x1c(%ebx),%ebx
> 349: 83 eb 1c sub $0x1c,%ebx
> prefetch():
> /tmp/1/linux-source-2.6.37-rc7/arch/x86/include/asm/processor.h:837
> 34c: 8b 43 1c mov 0x1c(%ebx),%eax
> 34f: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
> hwrng_attr_available_show():
> /tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:245
>
> 245 list_for_each_entry(rng, &rng_list, list) {
> 246 strncat(buf, rng->name, PAGE_SIZE - ret - 1);
> 247 ret += strlen(rng->name);
> 248 strncat(buf, " ", PAGE_SIZE - ret - 1);
> 249 ret++;
> 250 }
The head of the rng_list is damaged. It is initialized at compile time and
should be OK. To help discover the order in which hwrng_register() is called,
apply the attached patch. Run it once with commit 84c164a34ffe67908a installed,
and once with it reverted.
Thanks,
Larry
On 12/29/2010 01:54 PM, Mario 'BitKoenig' Holbe wrote:
> Hello Larry,
>
> On Tue, Dec 28, 2010 at 06:34:08PM -0600, Larry Finger wrote:
>> Mario Holbe wrote:
>>> on 2.6.37-rc7 the b43 driver crashes in hwrng_register(). This makes the
> ...
>>> This issue does also exist in 2.6.37-rc5.
>>> This issue does not exist in 2.6.36.2.
> ...
>>> [ 29.868632] BUG: unable to handle kernel paging request at 907cde0c
>>> [ 29.868640] IP: [<f8d543cc>] hwrng_register+0x4c/0x139 [rng_core]
> ...
>>> [ 29.868884] Call Trace:
>>> [ 29.868909] [<f8e5a870>] ? b43_wireless_core_init+0xd0c/0xdd6 [b43]
>>
>> I almost missed this posting.
>
> You're welcome :)
>
>> Please post wireless problems with
>> [email protected] for better visibility.
>
> Sorry and thanks for completing the CC: list.
>
>> I have a BCM4312 (14e4:4315) on a netbook that does not have this problem, thus
>> I will have to rely on your debugging. An additional difficulty is that the only
>> changes to b43 between 2.6.36 and 2.6.37 are adding an additional PCI ID, some
>> fixes to the SDIO driver, and some code for an 802.11n device. None of these
>> should affect your 802.11 b/g unit.
>>
>> Is it possible for you to bisect between 2.6.36 and 2.6.37-rc5? I wish I could
>> suggest some way to minimize the number of commits and builds, but the problem
>> could be anywhere.
>
> To be honest, I never bisected such a huge number of commits before and
> I'm somewhat afraid of doing it.
>
> However, I think I'm able to nail the issue down to:
> commit 84c164a34ffe67908a932a2d641ec1a80c2d5435 which went to 2.6.37-rc1.
> Author: John W. Linville <[email protected]>
> Date: Fri Aug 6 15:31:45 2010 -0400
>
> b43: move hwrng registration driver to wireless core initialization
>
> Message-ID: <[email protected]>
> http://marc.info/?l=linux-wireless&m=128112658829379&w=2
>
> I did 2 things:
> 1. I (manually) reverted 84c164a34ffe67908a932a2d641ec1a80c2d5435 from
> 2.6.37-rc7: The crash disappears, b43 is useable.
> 2. I added 84c164a34ffe67908a932a2d641ec1a80c2d5435 to 2.6.36.2: The
> crash shows up as with vanilla 2.6.37-rc7.
>
> I'm not sure why this is not reproducible for you, probably it has
> something to do with the VIA Nano having a second HW-RNG driven by
> via-rng. I experienced crashes in the past with earlier kernels when I
> tried to move RNGs around via /sys/devices/virtual/misc/hw_random, but
> never took the time to trace them down since I just got it working :)
>
> Oh, I'm still able to trigger a crash with
> $ cat /sys/devices/virtual/misc/hw_random/rng_available
> on 2.6.37-rc7 without 84c164a34ffe67908a932a2d641ec1a80c2d5435 as well
> as on vanilla 2.6.36.2. Probably this is (better) reproducible for you?
>
> I suspect both (the 84c164a34ffe67908a932a2d641ec1a80c2d5435 crash as
> well as the cat rng_available crash) have something to do with a
> partially uninitialized rng-struct, or better: parts of the rng-struct
> that are free()d too early (i.e. within its lifetime).
Thanks for finding the problem. Obviously, I did not go back far enough in the
record to find the commit that you implicate.
Please show the output of "egrep "B43|RNG|RANDOM" .config".
It should not matter, but please try the attached patch.
Larry
On Wed, Dec 29, 2010 at 06:30:40PM -0600, Larry Finger wrote:
> On 12/29/2010 01:54 PM, Mario 'BitKoenig' Holbe wrote:
> > I did 2 things:
> > 1. I (manually) reverted 84c164a34ffe67908a932a2d641ec1a80c2d5435 from
> > 2.6.37-rc7: The crash disappears, b43 is useable.
> > 2. I added 84c164a34ffe67908a932a2d641ec1a80c2d5435 to 2.6.36.2: The
> > crash shows up as with vanilla 2.6.37-rc7.
>
> Please show the output of "egrep "B43|RNG|RANDOM" .config".
CONFIG_B43=m
CONFIG_B43_PCI_AUTOSELECT=y
CONFIG_B43_PCICORE_AUTOSELECT=y
CONFIG_B43_PCMCIA=y
CONFIG_B43_SDIO=y
CONFIG_B43_PIO=y
CONFIG_B43_PHY_LP=y
CONFIG_B43_LEDS=y
CONFIG_B43_HWRNG=y
# CONFIG_B43_DEBUG is not set
CONFIG_B43LEGACY=m
CONFIG_B43LEGACY_PCI_AUTOSELECT=y
CONFIG_B43LEGACY_PCICORE_AUTOSELECT=y
CONFIG_B43LEGACY_LEDS=y
CONFIG_B43LEGACY_HWRNG=y
CONFIG_B43LEGACY_DEBUG=y
CONFIG_B43LEGACY_DMA=y
CONFIG_B43LEGACY_PIO=y
CONFIG_B43LEGACY_DMA_AND_PIO_MODE=y
# CONFIG_B43LEGACY_DMA_MODE is not set
# CONFIG_B43LEGACY_PIO_MODE is not set
CONFIG_HW_RANDOM=m
CONFIG_HW_RANDOM_TIMERIOMEM=m
CONFIG_HW_RANDOM_INTEL=m
CONFIG_HW_RANDOM_AMD=m
CONFIG_HW_RANDOM_GEODE=m
CONFIG_HW_RANDOM_VIA=m
CONFIG_HW_RANDOM_VIRTIO=m
CONFIG_SSB_B43_PCI_BRIDGE=y
CONFIG_CRYPTO_RNG=m
CONFIG_CRYPTO_RNG2=y
CONFIG_CRYPTO_ANSI_CPRNG=m
CONFIG_CRYPTO_DEV_HIFN_795X_RNG=y
> It should not matter, but please try the attached patch.
It will surely not matter: if CONFIG_B43_HWRNG had not been
defined, hwrng_register() would not have been reached in the dump from
my first mail.
If you really like me to try that patch, I'll do so when I'm awake again
and will then answer you that nothing has changed :)
Mario
--
It is a capital mistake to theorize before one has data.
Insensibly one begins to twist facts to suit theories instead of theories
to suit facts. -- Sherlock Holmes by Arthur Conan Doyle
Hello Larry,
On Tue, Dec 28, 2010 at 06:34:08PM -0600, Larry Finger wrote:
> Mario Holbe wrote:
> > on 2.6.37-rc7 the b43 driver crashes in hwrng_register(). This makes the
...
> > This issue does also exist in 2.6.37-rc5.
> > This issue does not exist in 2.6.36.2.
...
> > [ 29.868632] BUG: unable to handle kernel paging request at 907cde0c
> > [ 29.868640] IP: [<f8d543cc>] hwrng_register+0x4c/0x139 [rng_core]
...
> > [ 29.868884] Call Trace:
> > [ 29.868909] [<f8e5a870>] ? b43_wireless_core_init+0xd0c/0xdd6 [b43]
>
> I almost missed this posting.
You're welcome :)
> Please post wireless problems with
> [email protected] for better visibility.
Sorry and thanks for completing the CC: list.
> I have a BCM4312 (14e4:4315) on a netbook that does not have this problem, thus
> I will have to rely on your debugging. An additional difficulty is that the only
> changes to b43 between 2.6.36 and 2.6.37 are adding an additional PCI ID, some
> fixes to the SDIO driver, and some code for an 802.11n device. None of these
> should affect your 802.11 b/g unit.
>
> Is it possible for you to bisect between 2.6.36 and 2.6.37-rc5? I wish I could
> suggest some way to minimize the number of commits and builds, but the problem
> could be anywhere.
To be honest, I never bisected such a huge number of commits before and
I'm somewhat afraid of doing it.
However, I think I'm able to nail the issue down to:
commit 84c164a34ffe67908a932a2d641ec1a80c2d5435 which went to 2.6.37-rc1.
Author: John W. Linville <[email protected]>
Date: Fri Aug 6 15:31:45 2010 -0400
b43: move hwrng registration driver to wireless core initialization
Message-ID: <[email protected]>
http://marc.info/?l=linux-wireless&m=128112658829379&w=2
I did 2 things:
1. I (manually) reverted 84c164a34ffe67908a932a2d641ec1a80c2d5435 from
2.6.37-rc7: The crash disappears, b43 is useable.
2. I added 84c164a34ffe67908a932a2d641ec1a80c2d5435 to 2.6.36.2: The
crash shows up as with vanilla 2.6.37-rc7.
I'm not sure why this is not reproducible for you, probably it has
something to do with the VIA Nano having a second HW-RNG driven by
via-rng. I experienced crashes in the past with earlier kernels when I
tried to move RNGs around via /sys/devices/virtual/misc/hw_random, but
never took the time to trace them down since I just got it working :)
Oh, I'm still able to trigger a crash with
$ cat /sys/devices/virtual/misc/hw_random/rng_available
on 2.6.37-rc7 without 84c164a34ffe67908a932a2d641ec1a80c2d5435 as well
as on vanilla 2.6.36.2. Probably this is (better) reproducible for you?
I suspect both (the 84c164a34ffe67908a932a2d641ec1a80c2d5435 crash as
well as the cat rng_available crash) have something to do with a
partially uninitialized rng-struct, or better: parts of the rng-struct
that are free()d too early (i.e. within its lifetime).
regards
Mario
--
Doing it right is no excuse for not meeting the schedule.
-- Plant Manager, Delphi Corporation
On Wed, Dec 29, 2010 at 08:37:10PM -0600, Larry Finger wrote:
> No, don't bother. I do have a different request. The byte counts for my 32-bit
> system do not match yours. Could you please use the following command to find
> the instructions that are failing?
>
> objdump -l -d drivers/char/hw_random/core.o | less
>
> Use the search to find the start of hwrng_register, then add 0x4c to the
> starting address. Once I see the instruction that is failing, I should be able
> to find where the failure occurs.
Alright, here we go...
[ 30.012695] BUG: unable to handle kernel paging request at 4b28f458
[ 30.012708] IP: [<f90703cc>] hwrng_register+0x4c/0x139 [rng_core]
00000380 <hwrng_register>:
hwrng_register():
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:299
380: 56 push %esi
381: 53 push %ebx
...
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:312
3c6: 8b 76 1c mov 0x1c(%esi),%esi
3c9: 83 ee 1c sub $0x1c,%esi
prefetch():
/tmp/1/linux-source-2.6.37-rc7/arch/x86/include/asm/processor.h:837
3cc: 8b 46 1c mov 0x1c(%esi),%eax
3cf: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
hwrng_register():
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:312
3d3: 81 fe f8 ff ff ff cmp $0xfffffff8,%esi
3d9: 75 d4 jne 3af <hwrng_register+0x2f>
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:319
312 list_for_each_entry(tmp, &rng_list, list) {
313 if (strcmp(tmp->name, rng->name) == 0)
314 goto out_unlock;
315 }
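Read against the listing above, the "sub $0x1c" is the container_of() adjustment inside list_for_each_entry(), and the faulting "mov 0x1c(%esi),%eax" loads the entry's list.next for the prefetch. The faulting address is therefore the bogus next pointer itself (in the first oops, ESI 907cddf0 + 0x1c = 907cde0c, the reported address). Below is a minimal user-space sketch of that traversal, not the kernel code: struct demo_rng, demo_entry(), demo_register() and the -EEXIST return value are hypothetical stand-ins chosen for illustration.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <string.h>

/* Minimal user-space stand-in for the kernel's doubly linked list node. */
struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical entry type; only name and list matter for this sketch. */
struct demo_rng {
    const char *name;
    unsigned long priv;
    struct list_head list;
};

/* Step back from the embedded list node to the enclosing entry. */
#define demo_entry(ptr) \
    ((struct demo_rng *)((char *)(ptr) - offsetof(struct demo_rng, list)))

static void demo_list_init(struct list_head *head)
{
    head->next = head;
    head->prev = head;
}

static void demo_list_add_tail(struct list_head *node, struct list_head *head)
{
    node->prev = head->prev;
    node->next = head;
    head->prev->next = node;
    head->prev = node;
}

/* Mirrors the duplicate-name check quoted from core.c lines 312-315. */
static int demo_register(struct list_head *head, struct demo_rng *rng)
{
    struct list_head *pos;

    assert(rng->name != NULL);
    /* The loop follows whatever each next pointer holds; a clobbered
     * pointer sends it to an arbitrary address. */
    for (pos = head->next; pos != head; pos = pos->next) {
        if (strcmp(demo_entry(pos)->name, rng->name) == 0)
            return -EEXIST; /* assumed error code for a duplicate name */
    }
    demo_list_add_tail(&rng->list, head);
    return 0;
}
```

The point of the sketch is that the walk has no way to validate a next pointer: once one list node is overwritten, every later traversal (registration or the rng_available read) follows the garbage value.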
This is btw. the same data that is accessed in the cat rng_available
crash via hwrng_attr_available_show():
[ 389.303538] BUG: unable to handle kernel paging request at 288dcb5b
[ 389.303553] IP: [<f8dda34c>] hwrng_attr_available_show+0x5c/0x90 [rng_core]
000002f0 <hwrng_attr_available_show>:
hwrng_attr_available_show():
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:236
2f0: 55 push %ebp
...
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:245
346: 8b 5b 1c mov 0x1c(%ebx),%ebx
349: 83 eb 1c sub $0x1c,%ebx
prefetch():
/tmp/1/linux-source-2.6.37-rc7/arch/x86/include/asm/processor.h:837
34c: 8b 43 1c mov 0x1c(%ebx),%eax
34f: 8d 74 26 00 lea 0x0(%esi,%eiz,1),%esi
hwrng_attr_available_show():
/tmp/1/linux-source-2.6.37-rc7/drivers/char/hw_random/core.c:245
245 list_for_each_entry(rng, &rng_list, list) {
246 strncat(buf, rng->name, PAGE_SIZE - ret - 1);
247 ret += strlen(rng->name);
248 strncat(buf, " ", PAGE_SIZE - ret - 1);
249 ret++;
250 }
regards
Mario
--
The problem in the world today is communication. Too much communication.
-- Homer J. Simpson
On Thu, 2010-12-30 at 21:45 +0100, Mario 'BitKoenig' Holbe wrote:
> On Thu, Dec 30, 2010 at 12:37:21PM -0600, Larry Finger wrote:
> > The head of the rng_list is damaged. It is initialized at compile time and
> > should be OK. To help discover the order in which hwrng_register() is called,
> > apply the attached patch. Run it once with commit 84c164a34ffe67908a installed,
> > and once with it reverted.
>
> All right, 3 dmesg excerpts attached...
> 2.6.37-rc7-vanilla.dmesg:
> 2.6.37-rc7 vanilla (i.e. with 84c164a34ffe67908a), crashing
> via-rng is registered first, b43-rng second
> 2.6.37-rc7-without.dmesg:
> 2.6.37-rc7 with 84c164a34ffe67908a reverted, not crashing
> b43-rng is registered first, via-rng second
> 2.6.37-rc7-without+modprobe.dmesg:
> 2.6.37-rc7 with 84c164a34ffe67908a reverted, b43 blacklisted and
> manually modprobed after via-rng, crashing
> via-rng is registered first, b43-rng second
>
> Seems like the crash shows up when b43-rng is registered second, but not
> when via-rng is registered second.
> Btw.: `cat rng_available' does also not crash when via-rng is registered
> second.
I suspect that some "hw_random.h" header version mixup is going
on here. The layout of struct hwrng was changed recently.
Your crash seems to happen on the list head embedded in struct hwrng.
Please make sure that your build environment is clean and you're not
using any external stuff such as compat-wireless. All of hwrng-core,
rng-via and b43 must be compiled against the same hw_random.h.
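To make the header-mixup hypothesis concrete, here is a small sketch of how adding one member shifts the offset of an embedded list head. Both struct definitions are hypothetical (they are not copies of any real hw_random.h), as is the helper list_offset_skew(); the only claim is the general one that two modules disagreeing on the layout would look for the list head at different offsets.

```c
#include <assert.h>
#include <stddef.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical older layout. */
struct demo_hwrng_v1 {
    const char *name;
    int (*init)(void *ctx);
    unsigned long priv;
    struct list_head list;
};

/* Hypothetical newer layout: one callback added before priv. */
struct demo_hwrng_v2 {
    const char *name;
    int (*init)(void *ctx);
    int (*read)(void *ctx, void *buf, size_t max);
    unsigned long priv;
    struct list_head list;
};

/* Number of bytes by which the two layouts disagree about where
 * the list head lives. Non-zero means mixed builds corrupt the list. */
static long list_offset_skew(void)
{
    return (long)offsetof(struct demo_hwrng_v2, list) -
           (long)offsetof(struct demo_hwrng_v1, list);
}
```

A core built against the second layout and a driver built against the first would each read and write list.next at their own offset, which is why all three modules need the same header.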
--
Greetings Michael.
On 12/29/2010 07:20 PM, Mario 'BitKoenig' Holbe wrote:
>
> It will surely not matter: if CONFIG_B43_HWRNG had not been
> defined, hwrng_register() would not have been reached in the dump from
> my first mail.
>
> If you really like me to try that patch, I'll do so when I'm awake again
> and will then answer you that nothing has changed :)
No, don't bother. I do have a different request. The byte counts for my 32-bit
system do not match yours. Could you please use the following command to find
the instructions that are failing?
objdump -l -d drivers/char/hw_random/core.o | less
Use the search to find the start of hwrng_register, then add 0x4c to the
starting address. Once I see the instruction that is failing, I should be able
to find where the failure occurs.
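The arithmetic can be sketched as follows, assuming the "symbol+0xOFFSET/0xSIZE" form that the oops prints. parse_oops_location() and listing_address() are hypothetical helper names; 0x380 is the start address of hwrng_register in the objdump listing posted in this thread.

```c
#include <assert.h>
#include <stdio.h>

/* Split "symbol+0xOFF/0xSIZE" into its parts. Returns 0 on success,
 * -1 if the text does not match or the offset lies outside the function. */
static int parse_oops_location(const char *text, char sym[64],
                               unsigned long *off, unsigned long *size)
{
    if (sscanf(text, "%63[^+]+%lx/%lx", sym, off, size) != 3)
        return -1;
    if (*off >= *size)
        return -1;
    return 0;
}

/* Address to look for in the objdump listing of the object file. */
static unsigned long listing_address(unsigned long sym_start, unsigned long off)
{
    assert(sym_start + off >= sym_start); /* no wrap-around */
    return sym_start + off;
}
```

For "hwrng_register+0x4c/0x139" and a start address of 0x380, this yields 0x3cc, the address of the faulting "mov 0x1c(%esi),%eax" in the listing.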
The order in which things are registered should not cause an error, but who knows?
Larry
On Thu, Dec 30, 2010 at 12:37:21PM -0600, Larry Finger wrote:
> The head of the rng_list is damaged. It is initialized at compile time and
> should be OK. To help discover the order in which hwrng_register() is called,
> apply the attached patch. Run it once with commit 84c164a34ffe67908a installed,
> and once with it reverted.
All right, 3 dmesg excerpts attached...
2.6.37-rc7-vanilla.dmesg:
2.6.37-rc7 vanilla (i.e. with 84c164a34ffe67908a), crashing
via-rng is registered first, b43-rng second
2.6.37-rc7-without.dmesg:
2.6.37-rc7 with 84c164a34ffe67908a reverted, not crashing
b43-rng is registered first, via-rng second
2.6.37-rc7-without+modprobe.dmesg:
2.6.37-rc7 with 84c164a34ffe67908a reverted, b43 blacklisted and
manually modprobed after via-rng, crashing
via-rng is registered first, b43-rng second
Seems like the crash shows up when b43-rng is registered second, but not
when via-rng is registered second.
Btw.: `cat rng_available' does also not crash when via-rng is registered
second.
regards
Mario
--
> As Luke Leighton said once on samba-ntdom, "now, what was that about
> rebooting? that was so long ago, i had to look it up with man -k."
On 12/30/2010 07:57 PM, Michael Büsch wrote:
> On Thu, 2010-12-30 at 21:45 +0100, Mario 'BitKoenig' Holbe wrote:
>> On Thu, Dec 30, 2010 at 12:37:21PM -0600, Larry Finger wrote:
>>> The head of the rng_list is damaged. It is initialized at compile time and
>>> should be OK. To help discover the order in which hwrng_register() is called,
>>> apply the attached patch. Run it once with commit 84c164a34ffe67908a installed,
>>> and once with it reverted.
>>
>> All right, 3 dmesg excerpts attached...
>> 2.6.37-rc7-vanilla.dmesg:
>> 2.6.37-rc7 vanilla (i.e. with 84c164a34ffe67908a), crashing
>> via-rng is registered first, b43-rng second
>> 2.6.37-rc7-without.dmesg:
>> 2.6.37-rc7 with 84c164a34ffe67908a reverted, not crashing
>> b43-rng is registered first, via-rng second
>> 2.6.37-rc7-without+modprobe.dmesg:
>> 2.6.37-rc7 with 84c164a34ffe67908a reverted, b43 blacklisted and
>> manually modprobed after via-rng, crashing
>> via-rng is registered first, b43-rng second
>>
>> Seems like the crash shows up when b43-rng is registered second, but not
>> when via-rng is registered second.
>> Btw.: `cat rng_available' does also not crash when via-rng is registered
>> second.
>
>
> I suspect that some "hw_random.h" header version mixup is going
> on here. The layout of struct hwrng was changed recently.
>
> Your crash seems to happen on the list head embedded in struct hwrng.
>
> Please make sure that your build environment is clean and you're not
> using any external stuff such as compat-wireless. All of hwrng-core,
> rng-via and b43 must be compiled against the same hw_random.h.
AFAIK, he is building with the mainline 2.6.37-rc7/8 tree from Linus, thus the
build should be clean, but thanks for the heads-up.
In an email from Herbert Xu that did not go to the wireless or b43 lists, it is
suspected that the xstore instruction on a VIA CPU might generate more than 4 bytes
of output and clobber the list header. We now also know that a second copy of
via-rng will also fail, thus b43 is cleared.
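A rough user-space illustration of that suspicion, assuming the data slot sits directly in front of the list head: a store that is longer than the slot runs into list.next. struct demo_rng, demo_store() and demo_list_intact() are hypothetical; the slot is pointer-sized here only so the layout has no padding on 64-bit hosts, whereas the case discussed above is a 4-byte slot on a 32-bit kernel.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

struct list_head {
    struct list_head *next, *prev;
};

/* Hypothetical layout: a data slot immediately followed by the list head. */
struct demo_rng {
    uintptr_t datum;
    struct list_head list;
};

static void demo_rng_init(struct demo_rng *rng)
{
    rng->datum = 0;
    rng->list.next = &rng->list;
    rng->list.prev = &rng->list;
}

/* Simulated device store: writes n bytes of 0xAA starting at the slot. */
static void demo_store(struct demo_rng *rng, size_t n)
{
    unsigned char *base = (unsigned char *)rng;
    size_t start = offsetof(struct demo_rng, datum);

    assert(rng != NULL);
    /* Clamp to the object so the demo itself stays well defined. */
    if (n > sizeof(*rng) - start)
        n = sizeof(*rng) - start;
    memset(base + start, 0xAA, n);
}

/* Byte-wise check that the list head still points at itself. */
static int demo_list_intact(struct demo_rng *rng)
{
    struct list_head expect;

    expect.next = &rng->list;
    expect.prev = &rng->list;
    return memcmp(&rng->list, &expect, sizeof(expect)) == 0;
}
```

A store of exactly sizeof(rng->datum) bytes leaves the list intact; a store of twice that size replaces list.next with a garbage pattern, which is the kind of damage the oops addresses suggest.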
Larry