LinuxLists.cc - e1000e: nic does not work properly after cold power on

2013-03-04 10:46:39

Subject: e1000e: nic does not work properly after cold power on

Hi,

I apologize in advance for posting on two lists at once and for not
even being subscribed to e1000-devel.

Ever since upgrading to 3.8.x, I'm unable to use my wired connection
(e1000e - Intel Corporation 82579LM Gigabit) immediately after power
on. I have to boot into an older kernel (3.2.0 I happen to have around)
and _then_ boot into 3.8.1. There are no errors in dmesg that I can
see. The only one behaving strangely is NetworkManager who simply says
the device is managed and skips it. Also, there's no 'link becomes
ready' in dmesg either (and I swear the cable is plugged in).

Has anyone else encountered this?

Thanks,

--
Mihai Donțu

Attachments:

(No filename) (652.00 B)
dmesg-3.8.1-e1000e.txt.gz (17.37 kB)

2013-03-04 15:21:37

by Jiri Slaby

Subject: Re: e1000e: nic does not work properly after cold power on

On 03/04/2013 11:46 AM, Mihai Donțu wrote:
> Hi,
>
> I apologize in advance for posting on two lists at once and for not
> even being subscribed to e1000-devel.
>
> Ever since upgrading to 3.8.x, I'm unable to use my wired connection
> (e1000e - Intel Corporation 82579LM Gigabit) immediately after power
> on. I have to boot into an older kernel (3.2.0 I happen to have around)
> and _then_ boot into 3.8.1. There are no errors in dmesg that I can
> see. The only one behaving strangely is NetworkManager who simply says
> the device is managed and skips it. Also, there's no 'link becomes
> ready' in dmesg either (and I swear the cable is plugged in).
>
> Has anyone else encountered this?

Hi, I think so:
http://lists.opensuse.org/opensuse-factory/2013-03/msg00099.html

(I have no idea what the issue is, just adding another report.)

--
js
suse labs

2013-03-04 18:21:14

by Morten Stevens

Subject: Re: e1000e: nic does not work properly after cold power on

On 04.03.2013 11:46, Mihai Donțu wrote:
> Hi,
>
> I apologize in advance for posting on two lists at once and for not
> even being subscribed to e1000-devel.
>
> Ever since upgrading to 3.8.x, I'm unable to use my wired connection
> (e1000e - Intel Corporation 82579LM Gigabit) immediately after power
> on. I have to boot into an older kernel (3.2.0 I happen to have around)
> and _then_ boot into 3.8.1. There are no errors in dmesg that I can
> see. The only one behaving strangely is NetworkManager who simply says
> the device is managed and skips it. Also, there's no 'link becomes
> ready' in dmesg either (and I swear the cable is plugged in).

Hi,

Can you reproduce this with linux 3.9-rc1? 3.9-rc1 has the latest
upstream driver (e1000e 2.2.14) which contains many bugfixes.

Best regards,

Morten

2013-03-04 21:48:49

by Borislav Petkov

Subject: e1000e 3.9-rc1 suspend failure (was: Re: e1000e: nic does not work properly after cold power on)

On Mon, Mar 04, 2013 at 07:15:07PM +0100, Morten Stevens wrote:
> Can you reproduce this with linux 3.9-rc1? 3.9-rc1 has the latest
> upstream driver (e1000e 2.2.14) which contains many bugfixes.

This e1000e thing gets more b0rked by the minute. This is what happens
when I try to suspend with 3.9-rc1:

[ 83.502908] PM: Syncing filesystems ... done.
[ 83.509886] Freezing user space processes ... (elapsed 0.01 seconds) done.
[ 83.523352] PM: Preallocating image memory... done (allocated 95652 pages)
[ 83.675083] PM: Allocated 382608 kbytes in 0.15 seconds (2550.72 MB/s)
[ 83.675782] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
[ 83.688524] Suspending console(s) (use no_console_suspend to debug)
[ 84.251024] e1000e 0000:00:19.0 eth0: Hardware Error
[ 84.458866] ------------[ cut here ]------------
[ 84.458871] WARNING: at kernel/irq/manage.c:1249 __free_irq+0xa3/0x1e0()
[ 84.458872] Hardware name: 2320CTO
[ 84.458872] Trying to free already-free IRQ 20
[ 84.458898] Modules linked in: cpufreq_powersave cpufreq_userspace cpufreq_conservative cpufreq_stats uinput loop hid_generic usbhid hid coretemp kvm_intel arc4 kvm crc32_pclmul iwldvm crc32c_intel ghash_clmulni_intel mac80211 aesni_intel xts ipv6 aes_x86_64 lrw gf128mul ablk_helper cryptd iTCO_wdt iTCO_vendor_support iwlwifi sdhci_pci sdhci cfg80211 snd_hda_codec_hdmi snd_hda_codec_realtek mmc_core microcode e1000e thinkpad_acpi pcspkr lpc_ich i2c_i801 mfd_core nvram snd_hda_intel rfkill snd_hda_codec battery ac snd_hwdep led_class snd_pcm snd_page_alloc snd_timer snd acpi_cpufreq soundcore mperf ptp wmi pps_core xhci_hcd ehci_pci ehci_hcd processor thermal
[ 84.458900] Pid: 3353, comm: kworker/u:35 Tainted: G W 3.9.0-rc1 #1
[ 84.458901] Call Trace:
[ 84.458905] [<ffffffff8103ef7f>] warn_slowpath_common+0x7f/0xc0
[ 84.458907] [<ffffffff8103f076>] warn_slowpath_fmt+0x46/0x50
[ 84.458910] [<ffffffff81537bfe>] ? _raw_spin_lock_irqsave+0x4e/0x60
[ 84.458911] [<ffffffff810bc8d5>] ? __free_irq+0x55/0x1e0
[ 84.458913] [<ffffffff810bc923>] __free_irq+0xa3/0x1e0
[ 84.458914] [<ffffffff810bcab4>] free_irq+0x54/0xc0
[ 84.458919] [<ffffffffa017745d>] e1000_free_irq+0x7d/0x90 [e1000e]
[ 84.458922] [<ffffffffa01834af>] __e1000_shutdown+0x8f/0x8a0 [e1000e]
[ 84.458924] [<ffffffff813c92a7>] ? __device_suspend+0xb7/0x200
[ 84.458927] [<ffffffff81073b71>] ? get_parent_ip+0x11/0x50
[ 84.458931] [<ffffffffa0183d33>] e1000_suspend+0x23/0x50 [e1000e]
[ 84.458932] [<ffffffff813c92a7>] ? __device_suspend+0xb7/0x200
[ 84.458933] [<ffffffff8153c049>] ? sub_preempt_count+0x79/0xd0
[ 84.458936] [<ffffffff812a2ff5>] pci_pm_freeze+0x55/0xc0
[ 84.458937] [<ffffffff812a2fa0>] ? pci_pm_resume_noirq+0xd0/0xd0
[ 84.458938] [<ffffffff813c8b45>] dpm_run_callback.isra.5+0x25/0x50
[ 84.458939] [<ffffffff813c92d3>] __device_suspend+0xe3/0x200
[ 84.458941] [<ffffffff813c940f>] async_suspend+0x1f/0xa0
[ 84.458942] [<ffffffff8106bcfb>] async_run_entry_fn+0x3b/0x140
[ 84.458944] [<ffffffff8105d00d>] process_one_work+0x1ed/0x510
[ 84.458946] [<ffffffff8105cfab>] ? process_one_work+0x18b/0x510
[ 84.458948] [<ffffffff8105e7b5>] worker_thread+0x115/0x390
[ 84.458949] [<ffffffff8105e6a0>] ? manage_workers+0x300/0x300
[ 84.458951] [<ffffffff81064e2a>] kthread+0xea/0xf0
[ 84.458953] [<ffffffff81064d40>] ? kthread_create_on_node+0x160/0x160
[ 84.458954] [<ffffffff8153ff9c>] ret_from_fork+0x7c/0xb0
[ 84.458955] [<ffffffff81064d40>] ? kthread_create_on_node+0x160/0x160
[ 84.458956] ---[ end trace 3114e23ce50d2357 ]---
[ 85.082276] pci_pm_freeze(): e1000_suspend+0x0/0x50 [e1000e] returns -2
[ 85.082278] dpm_run_callback(): pci_pm_freeze+0x0/0xc0 returns -2
[ 85.082281] PM: Device 0000:00:19.0 failed to freeze async: error -2
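
For reference, the -2 returned by e1000_suspend in the log above is a negative errno value, and errno 2 is ENOENT on Linux. A minimal Python sketch for decoding such "returns -N" lines from a saved log follows. The helper name decode_returns and the regular expression are illustrative assumptions, not part of any kernel tooling, and the lookup relies on Python's errno table matching the kernel's numbering, which should hold when run on a Linux host.

```python
import errno
import os
import re

# Matches PM debug lines such as:
#   pci_pm_freeze(): e1000_suspend+0x0/0x50 [e1000e] returns -2
RETURNS_RE = re.compile(r"(\S+)\(\): .* returns (-\d+)")


def decode_returns(log_text):
    """Yield (caller, code, errno_name, message) for each 'returns -N' line."""
    for line in log_text.splitlines():
        m = RETURNS_RE.search(line)
        if m is None:
            continue
        code = int(m.group(2))
        # Kernel callbacks return negated errno values, so flip the sign.
        name = errno.errorcode.get(-code, "UNKNOWN")
        yield m.group(1), code, name, os.strerror(-code)


sample = """\
[ 85.082276] pci_pm_freeze(): e1000_suspend+0x0/0x50 [e1000e] returns -2
[ 85.082278] dpm_run_callback(): pci_pm_freeze+0x0/0xc0 returns -2
"""

for caller, code, name, msg in decode_returns(sample):
    print(f"{caller}: {code} ({name}: {msg})")
```

This only names the error code; it says nothing about why the driver returned it.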

Let's add more folks to CC.

--
Regards/Gruss,
Boris.

Sent from a fat crate under my desk. Formatting is fine.
--

2013-03-04 23:06:31

by Mihai Donțu

Subject: Re: e1000e 3.9-rc1 suspend failure (was: Re: e1000e: nic does not work properly after cold power on)

On Mon, 4 Mar 2013 22:48:30 +0100 Borislav Petkov wrote:
> On Mon, Mar 04, 2013 at 07:15:07PM +0100, Morten Stevens wrote:
> > Can you reproduce this with linux 3.9-rc1? 3.9-rc1 has the latest
> > upstream driver (e1000e 2.2.14) which contains many bugfixes.
>

On my system (ThinkPad T420) I get:

[ 10.694743] e1000e: Intel(R) PRO/1000 Network Driver - 2.2.14-k
[ 10.694746] e1000e: Copyright(c) 1999 - 2013 Intel Corporation.
[ 10.694852] e1000e 0000:00:19.0: setting latency timer to 64
[ 10.694911] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
[ 10.694949] e1000e 0000:00:19.0: irq 47 for MSI/MSI-X
[ 10.975086] e1000e 0000:00:19.0 eth0: registered PHC clock
[ 10.975091] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:21:cc:70:17:a0
[ 10.975093] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
[ 10.975127] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 1000FF-0FF
[ 89.716695] e1000e 0000:00:19.0 eth0: Hardware Error
[ 90.025403] e1000e 0000:00:19.0 eth0: Timesync Tx Control register not set as expected
[ 90.349197] e1000e 0000:00:19.0: irq 47 for MSI/MSI-X
[ 90.449760] e1000e 0000:00:19.0: irq 47 for MSI/MSI-X

The 'hardware error' line caught my attention.
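
To pick lines like that out of a long dmesg capture, something along these lines may help. It is only a sketch: the function name find_e1000e_events is made up for illustration, and it assumes the default bracketed-timestamp dmesg layout shown above.

```python
import re

# Assumes the "[ seconds.micros] message" dmesg layout seen in this thread.
LINE_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+(e1000e\b.*)$")


def find_e1000e_events(dmesg_text, needle="Hardware Error"):
    """Return (timestamp, message) pairs for e1000e lines containing needle."""
    events = []
    for line in dmesg_text.splitlines():
        m = LINE_RE.match(line.strip())
        if m and needle in m.group(2):
            events.append((float(m.group(1)), m.group(2)))
    return events


sample = """\
[ 10.975127] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 1000FF-0FF
[ 89.716695] e1000e 0000:00:19.0 eth0: Hardware Error
[ 90.025403] e1000e 0000:00:19.0 eth0: Timesync Tx Control register not set as expected
"""

for ts, msg in find_e1000e_events(sample):
    print(f"{ts:.6f}  {msg}")
```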

> This e1000e thing gets more b0rked by the minute. This is what happens
> when I try to suspend with 3.9-rc1:
>
> [ 83.502908] PM: Syncing filesystems ... done.
> [ 83.509886] Freezing user space processes ... (elapsed 0.01 seconds) done.
> [ 83.523352] PM: Preallocating image memory... done (allocated 95652 pages)
> [ 83.675083] PM: Allocated 382608 kbytes in 0.15 seconds (2550.72 MB/s)
> [ 83.675782] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> [ 83.688524] Suspending console(s) (use no_console_suspend to debug)
> [ 84.251024] e1000e 0000:00:19.0 eth0: Hardware Error
> [ 84.458866] ------------[ cut here ]------------
> [ 84.458871] WARNING: at kernel/irq/manage.c:1249 __free_irq+0xa3/0x1e0()
> [ 84.458872] Hardware name: 2320CTO
> [ 84.458872] Trying to free already-free IRQ 20
> [ 84.458898] Modules linked in: cpufreq_powersave cpufreq_userspace
> cpufreq_conservative cpufreq_stats uinput loop hid_generic usbhid hid
> coretemp kvm_intel arc4 kvm crc32_pclmul iwldvm crc32c_intel
> ghash_clmulni_intel mac80211 aesni_intel xts ipv6 aes_x86_64 lrw gf128mul
> ablk_helper cryptd iTCO_wdt iTCO_vendor_support iwlwifi sdhci_pci sdhci
> cfg80211 snd_hda_codec_hdmi snd_hda_codec_realtek mmc_core microcode e1000e
> thinkpad_acpi pcspkr lpc_ich i2c_i801 mfd_core nvram snd_hda_intel rfkill
> snd_hda_codec battery ac snd_hwdep led_class snd_pcm snd_page_alloc
> snd_timer snd acpi_cpufreq soundcore mperf ptp wmi pps_core xhci_hcd
> ehci_pci ehci_hcd processor thermal
> [ 84.458900] Pid: 3353, comm: kworker/u:35 Tainted: G W 3.9.0-rc1 #1
> [ 84.458901] Call Trace:
> [ 84.458905] [<ffffffff8103ef7f>] warn_slowpath_common+0x7f/0xc0
> [ 84.458907] [<ffffffff8103f076>] warn_slowpath_fmt+0x46/0x50
> [ 84.458910] [<ffffffff81537bfe>] ? _raw_spin_lock_irqsave+0x4e/0x60
> [ 84.458911] [<ffffffff810bc8d5>] ? __free_irq+0x55/0x1e0
> [ 84.458913] [<ffffffff810bc923>] __free_irq+0xa3/0x1e0
> [ 84.458914] [<ffffffff810bcab4>] free_irq+0x54/0xc0
> [ 84.458919] [<ffffffffa017745d>] e1000_free_irq+0x7d/0x90 [e1000e]
> [ 84.458922] [<ffffffffa01834af>] __e1000_shutdown+0x8f/0x8a0 [e1000e]
> [ 84.458924] [<ffffffff813c92a7>] ? __device_suspend+0xb7/0x200
> [ 84.458927] [<ffffffff81073b71>] ? get_parent_ip+0x11/0x50
> [ 84.458931] [<ffffffffa0183d33>] e1000_suspend+0x23/0x50 [e1000e]
> [ 84.458932] [<ffffffff813c92a7>] ? __device_suspend+0xb7/0x200
> [ 84.458933] [<ffffffff8153c049>] ? sub_preempt_count+0x79/0xd0
> [ 84.458936] [<ffffffff812a2ff5>] pci_pm_freeze+0x55/0xc0
> [ 84.458937] [<ffffffff812a2fa0>] ? pci_pm_resume_noirq+0xd0/0xd0
> [ 84.458938] [<ffffffff813c8b45>] dpm_run_callback.isra.5+0x25/0x50
> [ 84.458939] [<ffffffff813c92d3>] __device_suspend+0xe3/0x200
> [ 84.458941] [<ffffffff813c940f>] async_suspend+0x1f/0xa0
> [ 84.458942] [<ffffffff8106bcfb>] async_run_entry_fn+0x3b/0x140
> [ 84.458944] [<ffffffff8105d00d>] process_one_work+0x1ed/0x510
> [ 84.458946] [<ffffffff8105cfab>] ? process_one_work+0x18b/0x510
> [ 84.458948] [<ffffffff8105e7b5>] worker_thread+0x115/0x390
> [ 84.458949] [<ffffffff8105e6a0>] ? manage_workers+0x300/0x300
> [ 84.458951] [<ffffffff81064e2a>] kthread+0xea/0xf0
> [ 84.458953] [<ffffffff81064d40>] ? kthread_create_on_node+0x160/0x160
> [ 84.458954] [<ffffffff8153ff9c>] ret_from_fork+0x7c/0xb0
> [ 84.458955] [<ffffffff81064d40>] ? kthread_create_on_node+0x160/0x160
> [ 84.458956] ---[ end trace 3114e23ce50d2357 ]---
> [ 85.082276] pci_pm_freeze(): e1000_suspend+0x0/0x50 [e1000e] returns -2
> [ 85.082278] dpm_run_callback(): pci_pm_freeze+0x0/0xc0 returns -2
> [ 85.082281] PM: Device 0000:00:19.0 failed to freeze async: error -2
>
> Let's add more folks to CC.
>

--
Mihai Donțu

2013-03-05 02:12:00

by Allan, Bruce W

Subject: RE: [E1000-devel] e1000e 3.9-rc1 suspend failure (was: Re: e1000e: nic does not work properly after cold power on)

> -----Original Message-----
> From: Mihai Donțu [mailto:[email protected]]
> Sent: Monday, March 04, 2013 2:59 PM
> To: Morten Stevens
> Cc: [email protected]; [email protected]; linux-
> [email protected]; Rafael J. Wysocki; Borislav Petkov; Jiri Slaby
> Subject: Re: [E1000-devel] e1000e 3.9-rc1 suspend failure (was: Re: e1000e:
> nic does not work properly after cold power on)
>
> On Mon, 4 Mar 2013 22:48:30 +0100 Borislav Petkov wrote:
> > On Mon, Mar 04, 2013 at 07:15:07PM +0100, Morten Stevens wrote:
> > > Can you reproduce this with linux 3.9-rc1? 3.9-rc1 has the latest
> > > upstream driver (e1000e 2.2.14) which contains many bugfixes.
> >
>
> On my system (ThinkPad T420) I get:
>
> [ 10.694743] e1000e: Intel(R) PRO/1000 Network Driver - 2.2.14-k
> [ 10.694746] e1000e: Copyright(c) 1999 - 2013 Intel Corporation.
> [ 10.694852] e1000e 0000:00:19.0: setting latency timer to 64
> [ 10.694911] e1000e 0000:00:19.0: Interrupt Throttling Rate (ints/sec) set to dynamic conservative mode
> [ 10.694949] e1000e 0000:00:19.0: irq 47 for MSI/MSI-X
> [ 10.975086] e1000e 0000:00:19.0 eth0: registered PHC clock
> [ 10.975091] e1000e 0000:00:19.0 eth0: (PCI Express:2.5GT/s:Width x1) 00:21:cc:70:17:a0
> [ 10.975093] e1000e 0000:00:19.0 eth0: Intel(R) PRO/1000 Network Connection
> [ 10.975127] e1000e 0000:00:19.0 eth0: MAC: 10, PHY: 11, PBA No: 1000FF-0FF
> [ 89.716695] e1000e 0000:00:19.0 eth0: Hardware Error
> [ 90.025403] e1000e 0000:00:19.0 eth0: Timesync Tx Control register not set as expected
> [ 90.349197] e1000e 0000:00:19.0: irq 47 for MSI/MSI-X
> [ 90.449760] e1000e 0000:00:19.0: irq 47 for MSI/MSI-X
>
> The 'hardware error' line caught my attention.
>
> > This e1000e thing gets more b0rked by the minute. This is what happens
> > when I try to suspend with 3.9-rc1:
> >
> > [ 83.502908] PM: Syncing filesystems ... done.
> > [ 83.509886] Freezing user space processes ... (elapsed 0.01 seconds) done.
> > [ 83.523352] PM: Preallocating image memory... done (allocated 95652 pages)
> > [ 83.675083] PM: Allocated 382608 kbytes in 0.15 seconds (2550.72 MB/s)
> > [ 83.675782] Freezing remaining freezable tasks ... (elapsed 0.01 seconds) done.
> > [ 83.688524] Suspending console(s) (use no_console_suspend to debug)
> > [ 84.251024] e1000e 0000:00:19.0 eth0: Hardware Error
> > [ 84.458866] ------------[ cut here ]------------
> > [ 84.458871] WARNING: at kernel/irq/manage.c:1249 __free_irq+0xa3/0x1e0()
> > [ 84.458872] Hardware name: 2320CTO
> > [ 84.458872] Trying to free already-free IRQ 20
> > [ 84.458898] Modules linked in: cpufreq_powersave cpufreq_userspace
> > cpufreq_conservative cpufreq_stats uinput loop hid_generic usbhid hid
> > coretemp kvm_intel arc4 kvm crc32_pclmul iwldvm crc32c_intel
> > ghash_clmulni_intel mac80211 aesni_intel xts ipv6 aes_x86_64 lrw gf128mul
> > ablk_helper cryptd iTCO_wdt iTCO_vendor_support iwlwifi sdhci_pci sdhci
> > cfg80211 snd_hda_codec_hdmi snd_hda_codec_realtek mmc_core microcode e1000e
> > thinkpad_acpi pcspkr lpc_ich i2c_i801 mfd_core nvram snd_hda_intel rfkill
> > snd_hda_codec battery ac snd_hwdep led_class snd_pcm snd_page_alloc
> > snd_timer snd acpi_cpufreq soundcore mperf ptp wmi pps_core xhci_hcd
> > ehci_pci ehci_hcd processor thermal
> > [ 84.458900] Pid: 3353, comm: kworker/u:35 Tainted: G W 3.9.0-rc1 #1
> > [ 84.458901] Call Trace:
> > [ 84.458905] [<ffffffff8103ef7f>] warn_slowpath_common+0x7f/0xc0
> > [ 84.458907] [<ffffffff8103f076>] warn_slowpath_fmt+0x46/0x50
> > [ 84.458910] [<ffffffff81537bfe>] ? _raw_spin_lock_irqsave+0x4e/0x60
> > [ 84.458911] [<ffffffff810bc8d5>] ? __free_irq+0x55/0x1e0
> > [ 84.458913] [<ffffffff810bc923>] __free_irq+0xa3/0x1e0
> > [ 84.458914] [<ffffffff810bcab4>] free_irq+0x54/0xc0
> > [ 84.458919] [<ffffffffa017745d>] e1000_free_irq+0x7d/0x90 [e1000e]
> > [ 84.458922] [<ffffffffa01834af>] __e1000_shutdown+0x8f/0x8a0 [e1000e]
> > [ 84.458924] [<ffffffff813c92a7>] ? __device_suspend+0xb7/0x200
> > [ 84.458927] [<ffffffff81073b71>] ? get_parent_ip+0x11/0x50
> > [ 84.458931] [<ffffffffa0183d33>] e1000_suspend+0x23/0x50 [e1000e]
> > [ 84.458932] [<ffffffff813c92a7>] ? __device_suspend+0xb7/0x200
> > [ 84.458933] [<ffffffff8153c049>] ? sub_preempt_count+0x79/0xd0
> > [ 84.458936] [<ffffffff812a2ff5>] pci_pm_freeze+0x55/0xc0
> > [ 84.458937] [<ffffffff812a2fa0>] ? pci_pm_resume_noirq+0xd0/0xd0
> > [ 84.458938] [<ffffffff813c8b45>] dpm_run_callback.isra.5+0x25/0x50
> > [ 84.458939] [<ffffffff813c92d3>] __device_suspend+0xe3/0x200
> > [ 84.458941] [<ffffffff813c940f>] async_suspend+0x1f/0xa0
> > [ 84.458942] [<ffffffff8106bcfb>] async_run_entry_fn+0x3b/0x140
> > [ 84.458944] [<ffffffff8105d00d>] process_one_work+0x1ed/0x510
> > [ 84.458946] [<ffffffff8105cfab>] ? process_one_work+0x18b/0x510
> > [ 84.458948] [<ffffffff8105e7b5>] worker_thread+0x115/0x390
> > [ 84.458949] [<ffffffff8105e6a0>] ? manage_workers+0x300/0x300
> > [ 84.458951] [<ffffffff81064e2a>] kthread+0xea/0xf0
> > [ 84.458953] [<ffffffff81064d40>] ? kthread_create_on_node+0x160/0x160
> > [ 84.458954] [<ffffffff8153ff9c>] ret_from_fork+0x7c/0xb0
> > [ 84.458955] [<ffffffff81064d40>] ? kthread_create_on_node+0x160/0x160
> > [ 84.458956] ---[ end trace 3114e23ce50d2357 ]---
> > [ 85.082276] pci_pm_freeze(): e1000_suspend+0x0/0x50 [e1000e] returns -2
> > [ 85.082278] dpm_run_callback(): pci_pm_freeze+0x0/0xc0 returns -2
> > [ 85.082281] PM: Device 0000:00:19.0 failed to freeze async: error -2
> >
> > Let's add more folks to CC.

This may be related to some runtime power management issues for which there are a
number of patches currently in test.
