2013-04-07 12:50:16

by David R

Subject: AMD Vi error and lost networking with r8169

I've been seeing some problems with my newish AMD motherboard/processor
combo and networking (r8169). I see the following page fault :-

Apr 7 12:25:14 david kernel: [156421.436545] AMD-Vi: Event logged
[IO_PAGE_FAULT device=02:00.0 domain=0x0015 address=0x0000000000003000
flags=0x0050]

This is followed by the transmit queue timing out (trace below). The
problem hits at random, sometimes only after a day or so. A hard reset
is the only way to recover networking.

(Userspace is Ubuntu 10.04, kernel 3.9.0-rc5+)

Cheers
David

Apr 7 12:26:09 david kernel: [156475.568257] ------------[ cut here
]------------
Apr 7 12:26:09 david kernel: [156475.568273] WARNING: at
net/sched/sch_generic.c:255 dev_watchdog+0x250/0x260()
Apr 7 12:26:09 david kernel: [156475.568278] Hardware name: To be
filled by O.E.M.
Apr 7 12:26:09 david kernel: [156475.568282] NETDEV WATCHDOG: eth2
(r8169): transmit queue 0 timed out
Apr 7 12:26:09 david kernel: [156475.568285] Modules linked in: xfs
exportfs libcrc32c nls_iso8859_1 nls_cp437 vfat fat ecryptfs
encrypted_keys nfsv3 nfs_acl nfs lockd sunrpc binfmt_misc ppdev dm_crypt
hid_logitech uvcvideo ff_memless videobuf2_core videodev snd_usb_audio
videobuf2_vmalloc snd_usbmidi_lib videobuf2_memops usbhid
snd_hda_codec_hdmi snd_hda_codec_realtek snd_hda_intel snd_hda_codec
snd_seq_dummy snd_hwdep snd_seq_oss fbcon ttm tileblit font bitblit
snd_pcm_oss softcursor snd_seq_midi drm_kms_helper snd_rawmidi
snd_mixer_oss drm snd_seq_midi_event crc32_pclmul serio_raw snd_pcm
snd_seq i2c_piix4 r8169 snd_timer snd_seq_device mii snd i2c_algo_bit
soundcore snd_page_alloc lp parport
Apr 7 12:26:09 david kernel: [156475.568348] Pid: 0, comm: swapper/0
Not tainted 3.9.0-rc5+ #17
Apr 7 12:26:09 david kernel: [156475.568351] Call Trace:
Apr 7 12:26:09 david kernel: [156475.568354] <IRQ>
[<ffffffff81039c2a>] warn_slowpath_common+0x7a/0xc0
Apr 7 12:26:09 david kernel: [156475.568366] [<ffffffff81039d11>]
warn_slowpath_fmt+0x41/0x50
Apr 7 12:26:09 david kernel: [156475.568374] [<ffffffff81544630>]
dev_watchdog+0x250/0x260
Apr 7 12:26:09 david kernel: [156475.568380] [<ffffffff815443e0>] ?
__netdev_watchdog_up+0x80/0x80
Apr 7 12:26:09 david kernel: [156475.568386] [<ffffffff810497a4>]
call_timer_fn+0x44/0x120
Apr 7 12:26:09 david kernel: [156475.568391] [<ffffffff815443e0>] ?
__netdev_watchdog_up+0x80/0x80
Apr 7 12:26:09 david kernel: [156475.568396] [<ffffffff81049d83>]
run_timer_softirq+0x213/0x280
Apr 7 12:26:09 david kernel: [156475.568402] [<ffffffff81041dbf>]
__do_softirq+0xdf/0x260
Apr 7 12:26:09 david kernel: [156475.568408] [<ffffffff81042025>]
irq_exit+0xb5/0xc0
Apr 7 12:26:09 david kernel: [156475.568413] [<ffffffff81023fc9>]
smp_apic_timer_interrupt+0x69/0xa0
Apr 7 12:26:09 david kernel: [156475.568418] [<ffffffff81638e8a>]
apic_timer_interrupt+0x6a/0x70
Apr 7 12:26:09 david kernel: [156475.568420] <EOI>
[<ffffffff8106e9a5>] ? sched_clock_cpu+0xc5/0x100
Apr 7 12:26:09 david kernel: [156475.568432] [<ffffffff814e5542>] ?
cpuidle_wrap_enter+0x42/0x80
Apr 7 12:26:09 david kernel: [156475.568437] [<ffffffff814e553e>] ?
cpuidle_wrap_enter+0x3e/0x80
Apr 7 12:26:09 david kernel: [156475.568443] [<ffffffff814e5590>]
cpuidle_enter_tk+0x10/0x20
Apr 7 12:26:09 david kernel: [156475.568448] [<ffffffff814e4fc2>]
cpuidle_enter_state+0x12/0x50
Apr 7 12:26:09 david kernel: [156475.568453] [<ffffffff814e57d2>]
cpuidle_idle_call+0xa2/0x100
Apr 7 12:26:09 david kernel: [156475.568459] [<ffffffff8100a3b7>]
cpu_idle+0xc7/0x120
Apr 7 12:26:09 david kernel: [156475.568463] [<ffffffff8161fddd>]
rest_init+0x6d/0x80
Apr 7 12:26:09 david kernel: [156475.568470] [<ffffffff81cc7fe3>]
start_kernel+0x3b6/0x3c3
Apr 7 12:26:09 david kernel: [156475.568475] [<ffffffff81cc7a4d>] ?
repair_env_string+0x5b/0x5b
Apr 7 12:26:09 david kernel: [156475.568481] [<ffffffff81cc75a1>]
x86_64_start_reservations+0x2a/0x2c
Apr 7 12:26:09 david kernel: [156475.568486] [<ffffffff81cc76cc>]
x86_64_start_kernel+0x129/0x130
Apr 7 12:26:09 david kernel: [156475.568489] ---[ end trace
31688db2ca49b077 ]---
Apr 7 12:26:09 david kernel: [156475.720834] r8169 0000:02:00.0 eth2:
link up
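
The AMD-Vi event at the top of this report and the r8169 "link up" line
at the end both name PCI device 02:00.0. As a rough illustration of how
the two can be correlated when scanning a long log, here is a minimal
sketch in C. The struct and function names (io_page_fault,
parse_io_page_fault, fault_is_from) are made up for this example and are
not kernel APIs, and the parser assumes the event has been unwrapped onto
a single line in the format shown above.

```c
#include <stdio.h>
#include <string.h>

/* Fields of one AMD-Vi IO_PAGE_FAULT event, as printed in the log above.
 * Hypothetical type for this sketch only. */
struct io_page_fault {
    unsigned int bus, dev, fn;   /* PCI bus:device.function of the faulting device */
    unsigned int domain;         /* IOMMU domain id */
    unsigned long long address;  /* I/O address the device tried to access */
    unsigned int flags;          /* raw event flags */
};

/* Returns 0 on success, -1 if the line holds no parsable IO_PAGE_FAULT event. */
static int parse_io_page_fault(const char *line, struct io_page_fault *ev)
{
    const char *p = strstr(line, "IO_PAGE_FAULT");
    if (!p)
        return -1;
    /* %x accepts an optional 0x prefix, so it covers both "02" and "0x0015". */
    int n = sscanf(p,
                   "IO_PAGE_FAULT device=%x:%x.%x domain=%x address=%llx flags=%x",
                   &ev->bus, &ev->dev, &ev->fn,
                   &ev->domain, &ev->address, &ev->flags);
    return n == 6 ? 0 : -1;
}

/* Returns 1 if the event came from the device at pci_addr, given in the
 * "0000:02:00.0" form the driver prints. The leading segment is ignored
 * because the AMD-Vi event line above does not include one. */
static int fault_is_from(const struct io_page_fault *ev, const char *pci_addr)
{
    unsigned int seg, bus, dev, fn;
    if (sscanf(pci_addr, "%x:%x:%x.%x", &seg, &bus, &dev, &fn) != 4)
        return 0;
    return bus == ev->bus && dev == ev->dev && fn == ev->fn;
}
```

Feeding the unwrapped 12:25:14 line to parse_io_page_fault() and then
calling fault_is_from(&ev, "0000:02:00.0") ties the page fault to the
r8169 NIC whose transmit queue times out about a minute later.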


Attachments:
dmesg.log.bz2 (13.87 kB)
config.bz2 (22.27 kB)

2013-04-07 21:53:18

by Francois Romieu

Subject: Re: AMD Vi error and lost networking with r8169

David R <[email protected]> :
> I've been seeing some problems with my newish AMD motherboard/processor
> combo and networking (r8169). I see the following page fault :-
>
> Apr 7 12:25:14 david kernel: [156421.436545] AMD-Vi: Event logged
> [IO_PAGE_FAULT device=02:00.0 domain=0x0015 address=0x0000000000003000
> flags=0x0050]

Can you give the hack below a try ?

diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
index 28fb50a..ed8625d 100644
--- a/drivers/net/ethernet/realtek/r8169.c
+++ b/drivers/net/ethernet/realtek/r8169.c
@@ -4125,6 +4125,8 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
 	case RTL_GIGA_MAC_VER_23:
 	case RTL_GIGA_MAC_VER_24:
 	case RTL_GIGA_MAC_VER_34:
+	case RTL_GIGA_MAC_VER_35:
+	case RTL_GIGA_MAC_VER_36:
 		RTL_W32(RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST);
 		break;
 	default:
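
For readers less used to switch fall-through, the effect of the hunk can
be sketched in plain C: the two added case labels move MAC versions 35
and 36 out of the default: branch and into the group that reaches the
RTL_W32(RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST) statement.
The enum and the function name below are invented for this illustration,
the enum values are arbitrary, and the body of default: is not modelled
because the hunk does not show it.

```c
#include <stdbool.h>

/* Only the chip revisions named in the hunk, plus a stand-in for any
 * revision with no case label in the part of the switch shown there.
 * Hypothetical names; not the driver's own enum. */
enum mac_version {
    MAC_VER_23,
    MAC_VER_24,
    MAC_VER_34,
    MAC_VER_35,
    MAC_VER_36,
    MAC_VER_OTHER
};

/* Returns true if revision v reaches the
 * RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST write from the hunk.
 * 'patched' selects between the code before and after the patch. */
static bool takes_multi_en_branch(enum mac_version v, bool patched)
{
    switch (v) {
    case MAC_VER_23:
    case MAC_VER_24:
    case MAC_VER_34:
        return true;          /* context lines: in this group either way */
    case MAC_VER_35:
    case MAC_VER_36:
        return patched;       /* the two labels the patch adds */
    default:
        return false;         /* ends up at default:, body not shown in the hunk */
    }
}
```

Under this model, takes_multi_en_branch(MAC_VER_35, false) is false and
takes_multi_en_branch(MAC_VER_35, true) is true, while versions 23, 24
and 34 are unaffected by the patch.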

2013-04-08 06:15:00

by David R

Subject: Re: AMD Vi error and lost networking with r8169

Sure. Will apply this evening. It may take several days before I can
report back due to the intermittent nature of the thing.

Thanks
David


Quoting Francois Romieu <[email protected]>:

> David R <[email protected]> :
>> I've been seeing some problems with my newish AMD motherboard/processor
>> combo and networking (r8169). I see the following page fault :-
>>
>> Apr 7 12:25:14 david kernel: [156421.436545] AMD-Vi: Event logged
>> [IO_PAGE_FAULT device=02:00.0 domain=0x0015 address=0x0000000000003000
>> flags=0x0050]
>
> Can you give the hack below a try ?
>
> diff --git a/drivers/net/ethernet/realtek/r8169.c
> b/drivers/net/ethernet/realtek/r8169.c
> index 28fb50a..ed8625d 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -4125,6 +4125,8 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
> case RTL_GIGA_MAC_VER_23:
> case RTL_GIGA_MAC_VER_24:
> case RTL_GIGA_MAC_VER_34:
> + case RTL_GIGA_MAC_VER_35:
> + case RTL_GIGA_MAC_VER_36:
> RTL_W32(RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST);
> break;
> default:
>


2013-04-10 19:57:10

by David R

Subject: Re: AMD Vi error and lost networking with r8169

This is working fine so far - no further hangs, and networking seems
much faster into the bargain. Will report back if it happens again.

Thanks
David

On 07/04/13 22:53, Francois Romieu wrote:
> David R <[email protected]> :
>> I've been seeing some problems with my newish AMD motherboard/processor
>> combo and networking (r8169). I see the following page fault :-
>>
>> Apr 7 12:25:14 david kernel: [156421.436545] AMD-Vi: Event logged
>> [IO_PAGE_FAULT device=02:00.0 domain=0x0015 address=0x0000000000003000
>> flags=0x0050]
> Can you give the hack below a try ?
>
> diff --git a/drivers/net/ethernet/realtek/r8169.c b/drivers/net/ethernet/realtek/r8169.c
> index 28fb50a..ed8625d 100644
> --- a/drivers/net/ethernet/realtek/r8169.c
> +++ b/drivers/net/ethernet/realtek/r8169.c
> @@ -4125,6 +4125,8 @@ static void rtl_init_rxcfg(struct rtl8169_private *tp)
> case RTL_GIGA_MAC_VER_23:
> case RTL_GIGA_MAC_VER_24:
> case RTL_GIGA_MAC_VER_34:
> + case RTL_GIGA_MAC_VER_35:
> + case RTL_GIGA_MAC_VER_36:
> RTL_W32(RxConfig, RX128_INT_EN | RX_MULTI_EN | RX_DMA_BURST);
> break;
> default: