2014-01-26 10:55:24

by Sander Eikelenboom

Subject: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

Hi,

I am seeing a regression with a 3.14 merge-window kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035).
It looks like it's related to the rtl8169 (r8169 driver):

--
Sander

Jan 26 11:36:26 serveerstertje kernel: [ 89.105537] ------------[ cut here ]------------
Jan 26 11:36:26 serveerstertje kernel: [ 89.116779] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x103/0x130()
Jan 26 11:36:26 serveerstertje kernel: [ 89.128148] DMA-API: exceeded 7 overlapping mappings of pfn 55ebe
Jan 26 11:36:26 serveerstertje kernel: [ 89.139397] Modules linked in:
Jan 26 11:36:26 serveerstertje kernel: [ 89.150535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-20140125-mw-pcireset+ #1
Jan 26 11:36:26 serveerstertje kernel: [ 89.161784] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
Jan 26 11:36:26 serveerstertje kernel: [ 89.172965] 0000000000000009 ffff88005f603838 ffffffff81acbcfa ffffffff822134e0
Jan 26 11:36:26 serveerstertje kernel: [ 89.184156] ffff88005f603888 ffff88005f603878 ffffffff810bdf62 ffff880000000000
Jan 26 11:36:26 serveerstertje kernel: [ 89.195186] 0000000000055ebe 00000000ffffffef 0000000000000200 ffff8800592ea098
Jan 26 11:36:26 serveerstertje kernel: [ 89.206227] Call Trace:
Jan 26 11:36:26 serveerstertje kernel: [ 89.217027] <IRQ> [<ffffffff81acbcfa>] dump_stack+0x46/0x58
Jan 26 11:36:26 serveerstertje kernel: [ 89.227907] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
Jan 26 11:36:26 serveerstertje kernel: [ 89.238678] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
Jan 26 11:36:26 serveerstertje kernel: [ 89.249336] [<ffffffff81471c5a>] ? active_pfn_read_overlap+0x3a/0x70
Jan 26 11:36:26 serveerstertje kernel: [ 89.259904] [<ffffffff814729e3>] add_dma_entry+0x103/0x130
Jan 26 11:36:26 serveerstertje kernel: [ 89.270416] [<ffffffff81472de6>] debug_dma_map_page+0x126/0x150
Jan 26 11:36:26 serveerstertje kernel: [ 89.280840] [<ffffffff81714686>] rtl8169_start_xmit+0x216/0xa20
Jan 26 11:36:26 serveerstertje kernel: [ 89.291073] [<ffffffff8194aaaa>] ? __kfree_skb+0x3a/0xb0
Jan 26 11:36:26 serveerstertje kernel: [ 89.301252] [<ffffffff81955a3f>] ? dev_queue_xmit_nit+0x1ef/0x260
Jan 26 11:36:26 serveerstertje kernel: [ 89.311392] [<ffffffff81955850>] ? dev_loopback_xmit+0x1e0/0x1e0
Jan 26 11:36:26 serveerstertje kernel: [ 89.321418] [<ffffffff81959b96>] dev_hard_start_xmit+0x2e6/0x4a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.331236] [<ffffffff819778fe>] sch_direct_xmit+0xfe/0x280
Jan 26 11:36:26 serveerstertje kernel: [ 89.341013] [<ffffffff81959f8c>] __dev_queue_xmit+0x23c/0x630
Jan 26 11:36:26 serveerstertje kernel: [ 89.350668] [<ffffffff81959d50>] ? dev_hard_start_xmit+0x4a0/0x4a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.360264] [<ffffffff81a00ce4>] ? ip_output+0x54/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.369698] [<ffffffff8195a39b>] dev_queue_xmit+0xb/0x10
Jan 26 11:36:26 serveerstertje kernel: [ 89.379034] [<ffffffff819ff2bb>] ip_finish_output+0x2cb/0x670
Jan 26 11:36:26 serveerstertje kernel: [ 89.388373] [<ffffffff81a00ce4>] ? ip_output+0x54/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.397498] [<ffffffff81a00ce4>] ip_output+0x54/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.406584] [<ffffffff819fc141>] ip_forward_finish+0x71/0x1a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.415534] [<ffffffff819fc413>] ip_forward+0x1a3/0x440
Jan 26 11:36:26 serveerstertje kernel: [ 89.424400] [<ffffffff819f9f80>] ip_rcv_finish+0x150/0x650
Jan 26 11:36:26 serveerstertje kernel: [ 89.433108] [<ffffffff819faa1b>] ip_rcv+0x22b/0x370
Jan 26 11:36:26 serveerstertje kernel: [ 89.441737] [<ffffffff81a57322>] ? packet_rcv_spkt+0x42/0x190
Jan 26 11:36:26 serveerstertje kernel: [ 89.450226] [<ffffffff81957382>] __netif_receive_skb_core+0x6d2/0x8a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.458687] [<ffffffff81956dc4>] ? __netif_receive_skb_core+0x114/0x8a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.467109] [<ffffffff81008f50>] ? xen_clocksource_read+0x20/0x30
Jan 26 11:36:26 serveerstertje kernel: [ 89.475362] [<ffffffff81116e09>] ? getnstimeofday+0x9/0x30
Jan 26 11:36:26 serveerstertje kernel: [ 89.483548] [<ffffffff8195756c>] __netif_receive_skb+0x1c/0x70
Jan 26 11:36:26 serveerstertje kernel: [ 89.491608] [<ffffffff819575de>] netif_receive_skb_internal+0x1e/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.499596] [<ffffffff81958ac0>] napi_gro_receive+0x70/0xa0
Jan 26 11:36:26 serveerstertje kernel: [ 89.507486] [<ffffffff81711673>] rtl8169_poll+0x2d3/0x680
Jan 26 11:36:26 serveerstertje kernel: [ 89.515222] [<ffffffff81957a81>] net_rx_action+0x161/0x260
Jan 26 11:36:26 serveerstertje kernel: [ 89.523097] [<ffffffff810c28dd>] __do_softirq+0x11d/0x250
Jan 26 11:36:26 serveerstertje kernel: [ 89.530973] [<ffffffff810c2d72>] irq_exit+0xa2/0xd0
Jan 26 11:36:26 serveerstertje kernel: [ 89.538915] [<ffffffff814f94bf>] xen_evtchn_do_upcall+0x2f/0x40
Jan 26 11:36:26 serveerstertje kernel: [ 89.546876] [<ffffffff81ad83de>] xen_do_hypervisor_callback+0x1e/0x30
Jan 26 11:36:26 serveerstertje kernel: [ 89.554591] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.562139] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.569503] [<ffffffff81008c70>] ? xen_safe_halt+0x10/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.576788] [<ffffffff81018748>] ? default_idle+0x18/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.583863] [<ffffffff81018f5e>] ? arch_cpu_idle+0x2e/0x40
Jan 26 11:36:26 serveerstertje kernel: [ 89.590627] [<ffffffff8110b511>] ? cpu_startup_entry+0x91/0x1e0
Jan 26 11:36:26 serveerstertje kernel: [ 89.597184] [<ffffffff81ac0497>] ? rest_init+0xb7/0xc0
Jan 26 11:36:26 serveerstertje kernel: [ 89.603507] [<ffffffff81ac03e0>] ? csum_partial_copy_generic+0x170/0x170
Jan 26 11:36:26 serveerstertje kernel: [ 89.609631] [<ffffffff8230ef1c>] ? start_kernel+0x409/0x416
Jan 26 11:36:26 serveerstertje kernel: [ 89.615490] [<ffffffff8230e912>] ? repair_env_string+0x5e/0x5e
Jan 26 11:36:26 serveerstertje kernel: [ 89.621197] [<ffffffff8230e5f8>] ? x86_64_start_reservations+0x2a/0x2c
Jan 26 11:36:26 serveerstertje kernel: [ 89.626592] [<ffffffff82311e26>] ? xen_start_kernel+0x584/0x586
Jan 26 11:36:26 serveerstertje kernel: [ 89.631933] ---[ end trace 206b59d1fe29b5a7 ]---


2014-01-27 00:03:18

by Francois Romieu

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

Sander Eikelenboom <[email protected]> :
[...]
> I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
> It looks like it's related to the rtl8169 ...
>
> --
> Sander
>
> Jan 26 11:36:26 serveerstertje kernel: [ 89.105537] ------------[ cut here ]------------
> Jan 26 11:36:26 serveerstertje kernel: [ 89.116779] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x103/0x130()
> Jan 26 11:36:26 serveerstertje kernel: [ 89.128148] DMA-API: exceeded 7 overlapping mappings of pfn 55ebe
> Jan 26 11:36:26 serveerstertje kernel: [ 89.139397] Modules linked in:
> Jan 26 11:36:26 serveerstertje kernel: [ 89.150535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-20140125-mw-pcireset+ #1
> Jan 26 11:36:26 serveerstertje kernel: [ 89.161784] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
> Jan 26 11:36:26 serveerstertje kernel: [ 89.172965] 0000000000000009 ffff88005f603838 ffffffff81acbcfa ffffffff822134e0
> Jan 26 11:36:26 serveerstertje kernel: [ 89.184156] ffff88005f603888 ffff88005f603878 ffffffff810bdf62 ffff880000000000
> Jan 26 11:36:26 serveerstertje kernel: [ 89.195186] 0000000000055ebe 00000000ffffffef 0000000000000200 ffff8800592ea098
> Jan 26 11:36:26 serveerstertje kernel: [ 89.206227] Call Trace:
> Jan 26 11:36:26 serveerstertje kernel: [ 89.217027] <IRQ> [<ffffffff81acbcfa>] dump_stack+0x46/0x58
> Jan 26 11:36:26 serveerstertje kernel: [ 89.227907] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
> Jan 26 11:36:26 serveerstertje kernel: [ 89.238678] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
> Jan 26 11:36:26 serveerstertje kernel: [ 89.249336] [<ffffffff81471c5a>] ? active_pfn_read_overlap+0x3a/0x70
> Jan 26 11:36:26 serveerstertje kernel: [ 89.259904] [<ffffffff814729e3>] add_dma_entry+0x103/0x130
> Jan 26 11:36:26 serveerstertje kernel: [ 89.270416] [<ffffffff81472de6>] debug_dma_map_page+0x126/0x150
> Jan 26 11:36:26 serveerstertje kernel: [ 89.280840] [<ffffffff81714686>] rtl8169_start_xmit+0x216/0xa20
[r8169 and xen stuff]

Dan, I can't find the part of the debug code that reports where the
mappings were previously set up.

--
Ueimor

2014-01-29 03:06:26

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Sun, Jan 26, 2014 at 4:03 PM, Francois Romieu <[email protected]> wrote:
> Sander Eikelenboom <[email protected]> :
> [...]
>> I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
>> It looks like it's related to the rtl8169 ...
>>
>> --
>> Sander
>>
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.105537] ------------[ cut here ]------------
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.116779] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x103/0x130()
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.128148] DMA-API: exceeded 7 overlapping mappings of pfn 55ebe
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.139397] Modules linked in:
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.150535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-20140125-mw-pcireset+ #1
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.161784] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.172965] 0000000000000009 ffff88005f603838 ffffffff81acbcfa ffffffff822134e0
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.184156] ffff88005f603888 ffff88005f603878 ffffffff810bdf62 ffff880000000000
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.195186] 0000000000055ebe 00000000ffffffef 0000000000000200 ffff8800592ea098
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.206227] Call Trace:
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.217027] <IRQ> [<ffffffff81acbcfa>] dump_stack+0x46/0x58
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.227907] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.238678] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.249336] [<ffffffff81471c5a>] ? active_pfn_read_overlap+0x3a/0x70
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.259904] [<ffffffff814729e3>] add_dma_entry+0x103/0x130
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.270416] [<ffffffff81472de6>] debug_dma_map_page+0x126/0x150
>> Jan 26 11:36:26 serveerstertje kernel: [ 89.280840] [<ffffffff81714686>] rtl8169_start_xmit+0x216/0xa20
> [r8169 and xen stuff]
>
> Dan, I miss the part of the debug code that tells where the mappings were
> previously set.

In this case it was a facepalm mistake on my part. The mappings were
not being properly accounted for in the last revision of the patch I sent.
I copied you on the fix [1].

--
Dan

[1]: http://marc.info/?l=linux-netdev&m=139096447627032&w=2
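
Editorial sketch: the warning text ("exceeded 7 overlapping mappings of pfn ...") suggests that dma-debug counts how many DMA mappings of the same page frame are outstanding at once and complains when that count passes a cap. The userspace C model below only illustrates that kind of bookkeeping. It is not the kernel's code: the names `pfn_table`, `track_map`, `track_unmap`, and `MAX_OVERLAP` are hypothetical, and the cap of 7 is simply taken from the log message.

```c
#include <stdbool.h>
#include <stddef.h>

/* Assumption: cap copied from the warning text, not from kernel source. */
#define MAX_OVERLAP 7
#define TABLE_SIZE  64

/* One slot per page frame that currently has at least one live mapping. */
struct pfn_entry {
    unsigned long pfn;
    int active;   /* number of mappings currently outstanding */
    bool used;
};

struct pfn_table {
    struct pfn_entry slots[TABLE_SIZE];
};

/* Find the slot for pfn; optionally claim a free slot if none exists. */
struct pfn_entry *pfn_lookup(struct pfn_table *t, unsigned long pfn, bool create)
{
    struct pfn_entry *free_slot = NULL;

    for (size_t i = 0; i < TABLE_SIZE; i++) {
        struct pfn_entry *e = &t->slots[i];

        if (e->used && e->pfn == pfn)
            return e;
        if (!e->used && free_slot == NULL)
            free_slot = e;
    }
    if (!create || free_slot == NULL)
        return NULL;

    free_slot->used = true;
    free_slot->pfn = pfn;
    free_slot->active = 0;
    return free_slot;
}

/*
 * Record a new mapping of pfn. Returns false when the mapping would push
 * the page frame past MAX_OVERLAP, which is where a checker would warn.
 */
bool track_map(struct pfn_table *t, unsigned long pfn)
{
    struct pfn_entry *e = pfn_lookup(t, pfn, true);

    if (e == NULL)
        return true;            /* table full: give up tracking silently */
    if (e->active >= MAX_OVERLAP)
        return false;           /* an eighth concurrent mapping */
    e->active++;
    return true;
}

/* Record an unmap; the slot is released once no mappings remain. */
void track_unmap(struct pfn_table *t, unsigned long pfn)
{
    struct pfn_entry *e = pfn_lookup(t, pfn, false);

    if (e != NULL && e->active > 0 && --e->active == 0)
        e->used = false;
}
```

In this model, seven calls to `track_map()` for the same pfn succeed and the eighth returns false until a matching `track_unmap()` runs. It also shows why missed accounting on the unmap side would produce a false report: if unmaps are not recorded, the count only ever grows.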

2014-02-06 11:36:41

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

Hi Dan / Francois,

I didn't have time to test it earlier, but the patch doesn't seem to help.
I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe" warning,
but I now see that I forgot to mention that I use r8169.use_dac=1 ...

Not using it seems to prevent the warning, but before 3.14 I had never seen this (with r8169.use_dac=1).

--
Sander


2014-02-06 13:09:25

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

Hmm, OK, my last message was wrong, sorry about that: it did happen again without r8169.use_dac=1, it just doesn't seem to happen every time...

Konrad / Wei, do you happen to know of any Xen-related change that went into the 3.14 merge window and relates to DMA / Xen networking?

--
Sander

complete stacktrace:

[ 342.710738] ------------[ cut here ]------------
[ 342.726890] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x105/0x130()
[ 342.743210] DMA-API: exceeded 7 overlapping mappings of pfn 40b00
[ 342.759510] Modules linked in:
[ 342.775557] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc1-20140206-pcireset-net-btrevert+ #1
[ 342.791706] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 342.807627] 0000000000000009 ffff88005f603828 ffffffff81ad29fc ffffffff822134e0
[ 342.823430] ffff88005f603878 ffff88005f603868 ffffffff810bdf62 ffff880000000000
[ 342.839081] 0000000000040b00 00000000ffffffef ffffffff822102e0 ffff8800592b9098
[ 342.854572] Call Trace:
[ 342.869748] <IRQ> [<ffffffff81ad29fc>] dump_stack+0x46/0x58
[ 342.884915] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
[ 342.899710] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
[ 342.914395] [<ffffffff8147853a>] ? active_pfn_read_overlap+0x3a/0x70
[ 342.929166] [<ffffffff814792c5>] add_dma_entry+0x105/0x130
[ 342.943733] [<ffffffff814796c6>] debug_dma_map_page+0x126/0x150
[ 342.957988] [<ffffffff8171c8b6>] rtl8169_start_xmit+0x216/0xa20
[ 342.972306] [<ffffffff8195f08f>] ? dev_queue_xmit_nit+0x1ef/0x260
[ 342.986523] [<ffffffff8195eea0>] ? dev_loopback_xmit+0x1e0/0x1e0
[ 343.000689] [<ffffffff819631e6>] dev_hard_start_xmit+0x2e6/0x4a0
[ 343.014466] [<ffffffff81980f3e>] sch_direct_xmit+0xfe/0x280
[ 343.028052] [<ffffffff819635dc>] __dev_queue_xmit+0x23c/0x630
[ 343.041338] [<ffffffff819633a0>] ? dev_hard_start_xmit+0x4a0/0x4a0
[ 343.054483] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
[ 343.067659] [<ffffffff819639eb>] dev_queue_xmit+0xb/0x10
[ 343.080804] [<ffffffff81a0890b>] ip_finish_output+0x2cb/0x670
[ 343.093746] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
[ 343.106391] [<ffffffff81a0a334>] ip_output+0x54/0xf0
[ 343.118683] [<ffffffff81a05791>] ip_forward_finish+0x71/0x1a0
[ 343.130901] [<ffffffff81a05a63>] ip_forward+0x1a3/0x440
[ 343.142829] [<ffffffff810ffebb>] ? lock_is_held+0x8b/0xb0
[ 343.154346] [<ffffffff81a035c0>] ip_rcv_finish+0x150/0x660
[ 343.165748] [<ffffffff81a0406b>] ip_rcv+0x22b/0x370
[ 343.176838] [<ffffffff81a60972>] ? packet_rcv_spkt+0x42/0x190
[ 343.187659] [<ffffffff819609d2>] __netif_receive_skb_core+0x6d2/0x8a0
[ 343.198209] [<ffffffff81960414>] ? __netif_receive_skb_core+0x114/0x8a0
[ 343.208819] [<ffffffff81009010>] ? xen_clocksource_read+0x20/0x30
[ 343.219471] [<ffffffff81116e49>] ? getnstimeofday+0x9/0x30
[ 343.229862] [<ffffffff81960bbc>] __netif_receive_skb+0x1c/0x70
[ 343.239953] [<ffffffff81960c2e>] netif_receive_skb_internal+0x1e/0xf0
[ 343.249908] [<ffffffff81962110>] napi_gro_receive+0x70/0xa0
[ 343.259509] [<ffffffff817198a3>] rtl8169_poll+0x2d3/0x680
[ 343.268982] [<ffffffff81adcd2b>] ? _raw_spin_unlock_irq+0x2b/0x50
[ 343.278091] [<ffffffff819610d1>] net_rx_action+0x161/0x260
[ 343.287056] [<ffffffff810c28ec>] __do_softirq+0x12c/0x280
[ 343.295756] [<ffffffff810c2da2>] irq_exit+0xa2/0xd0
[ 343.304235] [<ffffffff814ffd5f>] xen_evtchn_do_upcall+0x2f/0x40
[ 343.312387] [<ffffffff81adf15e>] xen_do_hypervisor_callback+0x1e/0x30
[ 343.320389] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 343.328171] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 343.335738] [<ffffffff81008c70>] ? xen_safe_halt+0x10/0x20
[ 343.343142] [<ffffffff81018748>] ? default_idle+0x18/0x20
[ 343.350202] [<ffffffff81018f5e>] ? arch_cpu_idle+0x2e/0x40
[ 343.356994] [<ffffffff8110b551>] ? cpu_startup_entry+0x91/0x1e0
[ 343.363658] [<ffffffff81ac7d87>] ? rest_init+0xb7/0xc0
[ 343.369924] [<ffffffff81ac7cd0>] ? csum_partial_copy_generic+0x170/0x170
[ 343.376057] [<ffffffff8230ff1c>] ? start_kernel+0x409/0x416
[ 343.381972] [<ffffffff8230f912>] ? repair_env_string+0x5e/0x5e
[ 343.387573] [<ffffffff8230f5f8>] ? x86_64_start_reservations+0x2a/0x2c
[ 343.393152] [<ffffffff82312e28>] ? xen_start_kernel+0x586/0x588
[ 343.398628] ---[ end trace 8379b598fb7ef5ee ]---





--
Best regards,
Sander mailto:[email protected]

2014-02-06 14:26:13

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Thu, Feb 6, 2014 at 5:09 AM, Sander Eikelenboom <[email protected]> wrote:
> Hmm ok that last message was false .. sorry for that .. it did happen again without r8169.use_dac=1, it just doesn't seem to happen all the time...
>
> Konrad / Wei, do you happen to know of any xen related change that went into 3.14 merge window that relates to dma / xen networking ?
>
> --
> Sander
>
> complete stacktrace:
>
> [ 342.710738] ------------[ cut here ]------------
> [ 342.726890] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x105/0x130()
> [ 342.743210] DMA-API: exceeded 7 overlapping mappings of pfn 40b00
> [ 342.759510] Modules linked in:
> [ 342.775557] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc1-20140206-pcireset-net-btrevert+ #1
> [ 342.791706] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
> [ 342.807627] 0000000000000009 ffff88005f603828 ffffffff81ad29fc ffffffff822134e0
> [ 342.823430] ffff88005f603878 ffff88005f603868 ffffffff810bdf62 ffff880000000000
> [ 342.839081] 0000000000040b00 00000000ffffffef ffffffff822102e0 ffff8800592b9098
> [ 342.854572] Call Trace:
> [ 342.869748] <IRQ> [<ffffffff81ad29fc>] dump_stack+0x46/0x58
> [ 342.884915] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
> [ 342.899710] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
> [ 342.914395] [<ffffffff8147853a>] ? active_pfn_read_overlap+0x3a/0x70
> [ 342.929166] [<ffffffff814792c5>] add_dma_entry+0x105/0x130
> [ 342.943733] [<ffffffff814796c6>] debug_dma_map_page+0x126/0x150
> [ 342.957988] [<ffffffff8171c8b6>] rtl8169_start_xmit+0x216/0xa20
> [ 342.972306] [<ffffffff8195f08f>] ? dev_queue_xmit_nit+0x1ef/0x260
> [ 342.986523] [<ffffffff8195eea0>] ? dev_loopback_xmit+0x1e0/0x1e0
> [ 343.000689] [<ffffffff819631e6>] dev_hard_start_xmit+0x2e6/0x4a0
> [ 343.014466] [<ffffffff81980f3e>] sch_direct_xmit+0xfe/0x280
> [ 343.028052] [<ffffffff819635dc>] __dev_queue_xmit+0x23c/0x630
> [ 343.041338] [<ffffffff819633a0>] ? dev_hard_start_xmit+0x4a0/0x4a0
> [ 343.054483] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
> [ 343.067659] [<ffffffff819639eb>] dev_queue_xmit+0xb/0x10
> [ 343.080804] [<ffffffff81a0890b>] ip_finish_output+0x2cb/0x670
> [ 343.093746] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
> [ 343.106391] [<ffffffff81a0a334>] ip_output+0x54/0xf0
> [ 343.118683] [<ffffffff81a05791>] ip_forward_finish+0x71/0x1a0
> [ 343.130901] [<ffffffff81a05a63>] ip_forward+0x1a3/0x440
> [ 343.142829] [<ffffffff810ffebb>] ? lock_is_held+0x8b/0xb0
> [ 343.154346] [<ffffffff81a035c0>] ip_rcv_finish+0x150/0x660
> [ 343.165748] [<ffffffff81a0406b>] ip_rcv+0x22b/0x370
> [ 343.176838] [<ffffffff81a60972>] ? packet_rcv_spkt+0x42/0x190
> [ 343.187659] [<ffffffff819609d2>] __netif_receive_skb_core+0x6d2/0x8a0
> [ 343.198209] [<ffffffff81960414>] ? __netif_receive_skb_core+0x114/0x8a0
> [ 343.208819] [<ffffffff81009010>] ? xen_clocksource_read+0x20/0x30
> [ 343.219471] [<ffffffff81116e49>] ? getnstimeofday+0x9/0x30
> [ 343.229862] [<ffffffff81960bbc>] __netif_receive_skb+0x1c/0x70
> [ 343.239953] [<ffffffff81960c2e>] netif_receive_skb_internal+0x1e/0xf0
> [ 343.249908] [<ffffffff81962110>] napi_gro_receive+0x70/0xa0
> [ 343.259509] [<ffffffff817198a3>] rtl8169_poll+0x2d3/0x680
> [ 343.268982] [<ffffffff81adcd2b>] ? _raw_spin_unlock_irq+0x2b/0x50
> [ 343.278091] [<ffffffff819610d1>] net_rx_action+0x161/0x260
> [ 343.287056] [<ffffffff810c28ec>] __do_softirq+0x12c/0x280
> [ 343.295756] [<ffffffff810c2da2>] irq_exit+0xa2/0xd0
> [ 343.304235] [<ffffffff814ffd5f>] xen_evtchn_do_upcall+0x2f/0x40
> [ 343.312387] [<ffffffff81adf15e>] xen_do_hypervisor_callback+0x1e/0x30
> [ 343.320389] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
> [ 343.328171] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
> [ 343.335738] [<ffffffff81008c70>] ? xen_safe_halt+0x10/0x20
> [ 343.343142] [<ffffffff81018748>] ? default_idle+0x18/0x20
> [ 343.350202] [<ffffffff81018f5e>] ? arch_cpu_idle+0x2e/0x40
> [ 343.356994] [<ffffffff8110b551>] ? cpu_startup_entry+0x91/0x1e0
> [ 343.363658] [<ffffffff81ac7d87>] ? rest_init+0xb7/0xc0
> [ 343.369924] [<ffffffff81ac7cd0>] ? csum_partial_copy_generic+0x170/0x170
> [ 343.376057] [<ffffffff8230ff1c>] ? start_kernel+0x409/0x416
> [ 343.381972] [<ffffffff8230f912>] ? repair_env_string+0x5e/0x5e
> [ 343.387573] [<ffffffff8230f5f8>] ? x86_64_start_reservations+0x2a/0x2c
> [ 343.393152] [<ffffffff82312e28>] ? xen_start_kernel+0x586/0x588
> [ 343.398628] ---[ end trace 8379b598fb7ef5ee ]---
>
>
>
>
>
> Thursday, February 6, 2014, 12:36:31 PM, you wrote:
>
>> Hi Dan / Francois,
>
>> Didn't have time to test it before, but the patch doesn't seem to help.
>> I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe",
>> but i see now i forgot to mention i use r8169.use_dac=1 ...
>
>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)

If you are still hitting this with the patch:

59f2e7df574c dma-debug: fix overlap detection

...then I'm more inclined to think it is a true positive report.

If you don't mind I'll send some debug patches to narrow this down.

2014-02-06 14:27:29

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe


Thursday, February 6, 2014, 3:26:09 PM, you wrote:

> On Thu, Feb 6, 2014 at 5:09 AM, Sander Eikelenboom <[email protected]> wrote:
>> Hmm ok that last message was false .. sorry for that .. it did happen again without r8169.use_dac=1, it just doesn't seem to happen all the time...
>>
>> Konrad / Wei, do you happen to know of any xen related change that went into 3.14 merge window that relates to dma / xen networking ?
>>
>> --
>> Sander
>>
>> complete stacktrace:
>>
>> [ 342.710738] ------------[ cut here ]------------
>> [ 342.726890] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x105/0x130()
>> [ 342.743210] DMA-API: exceeded 7 overlapping mappings of pfn 40b00
>> [ 342.759510] Modules linked in:
>> [ 342.775557] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc1-20140206-pcireset-net-btrevert+ #1
>> [ 342.791706] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
>> [ 342.807627] 0000000000000009 ffff88005f603828 ffffffff81ad29fc ffffffff822134e0
>> [ 342.823430] ffff88005f603878 ffff88005f603868 ffffffff810bdf62 ffff880000000000
>> [ 342.839081] 0000000000040b00 00000000ffffffef ffffffff822102e0 ffff8800592b9098
>> [ 342.854572] Call Trace:
>> [ 342.869748] <IRQ> [<ffffffff81ad29fc>] dump_stack+0x46/0x58
>> [ 342.884915] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
>> [ 342.899710] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
>> [ 342.914395] [<ffffffff8147853a>] ? active_pfn_read_overlap+0x3a/0x70
>> [ 342.929166] [<ffffffff814792c5>] add_dma_entry+0x105/0x130
>> [ 342.943733] [<ffffffff814796c6>] debug_dma_map_page+0x126/0x150
>> [ 342.957988] [<ffffffff8171c8b6>] rtl8169_start_xmit+0x216/0xa20
>> [ 342.972306] [<ffffffff8195f08f>] ? dev_queue_xmit_nit+0x1ef/0x260
>> [ 342.986523] [<ffffffff8195eea0>] ? dev_loopback_xmit+0x1e0/0x1e0
>> [ 343.000689] [<ffffffff819631e6>] dev_hard_start_xmit+0x2e6/0x4a0
>> [ 343.014466] [<ffffffff81980f3e>] sch_direct_xmit+0xfe/0x280
>> [ 343.028052] [<ffffffff819635dc>] __dev_queue_xmit+0x23c/0x630
>> [ 343.041338] [<ffffffff819633a0>] ? dev_hard_start_xmit+0x4a0/0x4a0
>> [ 343.054483] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
>> [ 343.067659] [<ffffffff819639eb>] dev_queue_xmit+0xb/0x10
>> [ 343.080804] [<ffffffff81a0890b>] ip_finish_output+0x2cb/0x670
>> [ 343.093746] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
>> [ 343.106391] [<ffffffff81a0a334>] ip_output+0x54/0xf0
>> [ 343.118683] [<ffffffff81a05791>] ip_forward_finish+0x71/0x1a0
>> [ 343.130901] [<ffffffff81a05a63>] ip_forward+0x1a3/0x440
>> [ 343.142829] [<ffffffff810ffebb>] ? lock_is_held+0x8b/0xb0
>> [ 343.154346] [<ffffffff81a035c0>] ip_rcv_finish+0x150/0x660
>> [ 343.165748] [<ffffffff81a0406b>] ip_rcv+0x22b/0x370
>> [ 343.176838] [<ffffffff81a60972>] ? packet_rcv_spkt+0x42/0x190
>> [ 343.187659] [<ffffffff819609d2>] __netif_receive_skb_core+0x6d2/0x8a0
>> [ 343.198209] [<ffffffff81960414>] ? __netif_receive_skb_core+0x114/0x8a0
>> [ 343.208819] [<ffffffff81009010>] ? xen_clocksource_read+0x20/0x30
>> [ 343.219471] [<ffffffff81116e49>] ? getnstimeofday+0x9/0x30
>> [ 343.229862] [<ffffffff81960bbc>] __netif_receive_skb+0x1c/0x70
>> [ 343.239953] [<ffffffff81960c2e>] netif_receive_skb_internal+0x1e/0xf0
>> [ 343.249908] [<ffffffff81962110>] napi_gro_receive+0x70/0xa0
>> [ 343.259509] [<ffffffff817198a3>] rtl8169_poll+0x2d3/0x680
>> [ 343.268982] [<ffffffff81adcd2b>] ? _raw_spin_unlock_irq+0x2b/0x50
>> [ 343.278091] [<ffffffff819610d1>] net_rx_action+0x161/0x260
>> [ 343.287056] [<ffffffff810c28ec>] __do_softirq+0x12c/0x280
>> [ 343.295756] [<ffffffff810c2da2>] irq_exit+0xa2/0xd0
>> [ 343.304235] [<ffffffff814ffd5f>] xen_evtchn_do_upcall+0x2f/0x40
>> [ 343.312387] [<ffffffff81adf15e>] xen_do_hypervisor_callback+0x1e/0x30
>> [ 343.320389] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
>> [ 343.328171] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
>> [ 343.335738] [<ffffffff81008c70>] ? xen_safe_halt+0x10/0x20
>> [ 343.343142] [<ffffffff81018748>] ? default_idle+0x18/0x20
>> [ 343.350202] [<ffffffff81018f5e>] ? arch_cpu_idle+0x2e/0x40
>> [ 343.356994] [<ffffffff8110b551>] ? cpu_startup_entry+0x91/0x1e0
>> [ 343.363658] [<ffffffff81ac7d87>] ? rest_init+0xb7/0xc0
>> [ 343.369924] [<ffffffff81ac7cd0>] ? csum_partial_copy_generic+0x170/0x170
>> [ 343.376057] [<ffffffff8230ff1c>] ? start_kernel+0x409/0x416
>> [ 343.381972] [<ffffffff8230f912>] ? repair_env_string+0x5e/0x5e
>> [ 343.387573] [<ffffffff8230f5f8>] ? x86_64_start_reservations+0x2a/0x2c
>> [ 343.393152] [<ffffffff82312e28>] ? xen_start_kernel+0x586/0x588
>> [ 343.398628] ---[ end trace 8379b598fb7ef5ee ]---
>>
>>
>>
>>
>>
>> Thursday, February 6, 2014, 12:36:31 PM, you wrote:
>>
>>> Hi Dan / Francois,
>>
>>> Didn't have time to test it before, but the patch doesn't seem to help.
>>> I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe",
>>> but i see now i forgot to mention i use r8169.use_dac=1 ...
>>
>>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)

> If you are still hitting this with the patch:

> 59f2e7df574c dma-debug: fix overlap detection

> ...then I'm more inclined to think it is an actual positive report.

> If you don't mind I'll send some debug patches to narrow this down.

Please do .. sounds better than bisecting :-)

2014-02-06 19:12:26

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Thu, Feb 6, 2014 at 6:27 AM, Sander Eikelenboom <[email protected]> wrote:
>>>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
>
>> If you are still hitting this with the patch:
>
>> 59f2e7df574c dma-debug: fix overlap detection
>
>> ...then I'm more inclined to think it is an actual positive report.
>
>> If you don't mind I'll send some debug patches to narrow this down.
>
> Please do .. sounds better than bisecting :-)
>

Hi, attached is a patch that should give some insight whether the
driver is triggering many overlapping mappings. Try it on top of
3.14-rc1.

Thank you for the debug help!


Attachments:
debug-overlap (1.92 kB)

2014-02-07 10:21:44

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe


Thursday, February 6, 2014, 8:12:15 PM, you wrote:

> On Thu, Feb 6, 2014 at 6:27 AM, Sander Eikelenboom <[email protected]> wrote:
>>>>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
>>
>>> If you are still hitting this with the patch:
>>
>>> 59f2e7df574c dma-debug: fix overlap detection
>>
>>> ...then I'm more inclined to think it is an actual positive report.
>>
>>> If you don't mind I'll send some debug patches to narrow this down.
>>
>> Please do .. sounds better than bisecting :-)
>>

> Hi, attached is a patch that should give some insight whether the
> driver is triggering many overlapping mappings. Try it on top of
> 3.14-rc1.

> Thank you for the debug help!

Hi Dan,

Nifty feature, the trace_printk .. however, is there a way to limit the list it's spitting out
to what you are interested in?

At present the machine chokes while trying to spit out everything in one go, and:
- probably not all of it is logged to disk, because of all the RCU stalls and other problems it causes.
- the list on the console at least looked a lot longer (and in the logs I don't see the original WARN_ON, which should
be just before the dump).

However .. attached is what i have got ...

--
Sander



Attachments:
overlap_log.txt (327.83 kB)

2014-02-11 19:56:29

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

Hi Dan,

FYI, I just tested with Xen out of the equation (booting bare metal) and the warning still persists.

I tried something else .. don't know if it gives you any more insight, but it's worth a try:

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 2defd13..0fe5b75 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
return overlap;
}

-static void active_pfn_inc_overlap(unsigned long pfn)
+static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
{
- int overlap = active_pfn_read_overlap(pfn);
+ int overlap = active_pfn_read_overlap(ent->pfn);

- overlap = active_pfn_set_overlap(pfn, ++overlap);
+ overlap = active_pfn_set_overlap(ent->pfn, ++overlap);

/* If we overflowed the overlap counter then we're potentially
* leaking dma-mappings. Otherwise, if maps and unmaps are
@@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
* debug_dma_assert_idle() as the pfn may be marked idle
* prematurely.
*/
+
WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
"DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
- ACTIVE_PFN_MAX_OVERLAP, pfn);
+ ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
+
+ if(overlap > ACTIVE_PFN_MAX_OVERLAP){
+
+ dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
+ int idx;
+
+ for (idx = 0; idx < HASH_SIZE; idx++) {
+ struct hash_bucket *bucket = &dma_entry_hash[idx];
+ struct dma_debug_entry *entry;
+ unsigned long flags;
+
+ list_for_each_entry(entry, &bucket->list, list) {
+ if (entry->pfn == ent->pfn) {
+ dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
+ type2name[entry->type], idx,
+ phys_addr(entry), entry->pfn,
+ entry->dev_addr, entry->size,
+ dir2name[entry->direction],
+ maperr2str[entry->map_err_type]);
+ }
+ }
+ }
+ dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
+ }
}


@@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)

spin_lock_irqsave(&radix_lock, flags);
rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
- if (rc == -EEXIST)
- active_pfn_inc_overlap(entry->pfn);
+ if (rc == -EEXIST){
+ active_pfn_inc_overlap(entry);
+ }
spin_unlock_irqrestore(&radix_lock, flags);
-
return rc;
}


This results in:
[ 27.708678] r8169 0000:0a:00.0 eth1: link down
[ 27.712102] r8169 0000:0a:00.0 eth1: link down
[ 28.015340] r8169 0000:0b:00.0 eth0: link down
[ 28.015368] r8169 0000:0b:00.0 eth0: link down
[ 29.654844] r8169 0000:0b:00.0 eth0: link up
[ 30.278542] r8169 0000:0a:00.0 eth1: link up
[ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 218.073407] ------------[ cut here ]------------
[ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
[ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
[ 218.095988] Modules linked in:
[ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
[ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
[ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
[ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
[ 218.140154] Call Trace:
[ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
[ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
[ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
[ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
[ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
[ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
[ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
[ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
[ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
[ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
[ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
[ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
[ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
[ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
[ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
[ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
[ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
[ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
[ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
[ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
[ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
[ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
[ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
[ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
[ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
[ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
[ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
[ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
[ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
[ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
[ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
[ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
[ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
[ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
[ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
[ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
[ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
[ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
[ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
[ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
[ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
[ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
[ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
[ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
[ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
[ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
[ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
[ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
[ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
[ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
[ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
[ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
[ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
[ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked
[ 218.424677] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. end of dump
[ 218.429050] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. start dump
[ 218.432225] r8169 0000:0b:00.0: single idx 571 P=3c423040 N=3c423 D=c76040 L=36 DMA_TO_DEVICE dma map error checked
[ 218.435408] r8169 0000:0b:00.0: single idx 571 P=3c423200 N=3c423 D=c77200 L=36 DMA_TO_DEVICE dma map error checked
[ 218.438578] r8169 0000:0b:00.0: single idx 572 P=3c4233c0 N=3c423 D=c783c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.441695] r8169 0000:0b:00.0: single idx 572 P=3c423580 N=3c423 D=c79580 L=7b DMA_TO_DEVICE dma map error checked
[ 218.444783] r8169 0000:0b:00.0: single idx 573 P=3c423780 N=3c423 D=c7a780 L=9b DMA_TO_DEVICE dma map error checked
[ 218.447825] r8169 0000:0b:00.0: single idx 573 P=3c4239c0 N=3c423 D=c7b9c0 L=6b DMA_TO_DEVICE dma map error checked
[ 218.450844] r8169 0000:0b:00.0: single idx 574 P=3c423bc0 N=3c423 D=c7cbc0 L=7b DMA_TO_DEVICE dma map error checked
[ 218.453814] r8169 0000:0b:00.0: single idx 574 P=3c423dc0 N=3c423 D=c7ddc0 L=7b DMA_TO_DEVICE dma map error checked
[ 218.456793] r8169 0000:0b:00.0: single idx 575 P=3c423fc0 N=3c423 D=c7efc0 L=7b DMA_TO_DEVICE dma map error not checked
[ 218.459772] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. end of dump
[ 218.473504] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. start dump
[ 218.475662] r8169 0000:0b:00.0: single idx 586 P=3c7160c0 N=3c716 D=c940c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.477874] r8169 0000:0b:00.0: single idx 586 P=3c716280 N=3c716 D=c95280 L=36 DMA_TO_DEVICE dma map error checked
[ 218.480075] r8169 0000:0b:00.0: single idx 587 P=3c716440 N=3c716 D=c96440 L=36 DMA_TO_DEVICE dma map error checked
[ 218.482245] r8169 0000:0b:00.0: single idx 587 P=3c716600 N=3c716 D=c97600 L=36 DMA_TO_DEVICE dma map error checked
[ 218.484390] r8169 0000:0b:00.0: single idx 588 P=3c7167c0 N=3c716 D=c987c0 L=42 DMA_TO_DEVICE dma map error checked
[ 218.486510] r8169 0000:0b:00.0: single idx 588 P=3c7169c0 N=3c716 D=c999c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.488603] r8169 0000:0b:00.0: single idx 589 P=3c716b80 N=3c716 D=c9ab80 L=42 DMA_TO_DEVICE dma map error checked
[ 218.490682] r8169 0000:0b:00.0: single idx 589 P=3c716d80 N=3c716 D=c9bd80 L=42 DMA_TO_DEVICE dma map error checked
[ 218.492735] r8169 0000:0b:00.0: single idx 590 P=3c716f80 N=3c716 D=c9cf80 L=42 DMA_TO_DEVICE dma map error not checked
[ 218.494788] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. end of dump

--
Sander






2014-02-11 21:28:56

by Eric Dumazet

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, 2014-02-11 at 20:56 +0100, Sander Eikelenboom wrote:
> Hi Dan,
>
> FYI just tested and put Xen out of the equation (booting baremetal) and it still persists.
>
> I tried something else .. don't know if it gives you anymore insights, but it's worth the try:
>
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 2defd13..0fe5b75 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
> return overlap;
> }
>
> -static void active_pfn_inc_overlap(unsigned long pfn)
> +static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
> {
> - int overlap = active_pfn_read_overlap(pfn);
> + int overlap = active_pfn_read_overlap(ent->pfn);
>
> - overlap = active_pfn_set_overlap(pfn, ++overlap);
> + overlap = active_pfn_set_overlap(ent->pfn, ++overlap);
>
> /* If we overflowed the overlap counter then we're potentially
> * leaking dma-mappings. Otherwise, if maps and unmaps are
> @@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
> * debug_dma_assert_idle() as the pfn may be marked idle
> * prematurely.
> */
> +
> WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
> "DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
> - ACTIVE_PFN_MAX_OVERLAP, pfn);
> + ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> +
> + if(overlap > ACTIVE_PFN_MAX_OVERLAP){
> +
> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> + int idx;
> +
> + for (idx = 0; idx < HASH_SIZE; idx++) {
> + struct hash_bucket *bucket = &dma_entry_hash[idx];
> + struct dma_debug_entry *entry;
> + unsigned long flags;
> +
> + list_for_each_entry(entry, &bucket->list, list) {
> + if (entry->pfn == ent->pfn) {
> + dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
> + type2name[entry->type], idx,
> + phys_addr(entry), entry->pfn,
> + entry->dev_addr, entry->size,
> + dir2name[entry->direction],
> + maperr2str[entry->map_err_type]);
> + }
> + }
> + }
> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> + }
> }
>
>
> @@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)
>
> spin_lock_irqsave(&radix_lock, flags);
> rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
> - if (rc == -EEXIST)
> - active_pfn_inc_overlap(entry->pfn);
> + if (rc == -EEXIST){
> + active_pfn_inc_overlap(entry);
> + }
> spin_unlock_irqrestore(&radix_lock, flags);
> -
> return rc;
> }
>
>
> This results in:
> [ 27.708678] r8169 0000:0a:00.0 eth1: link down
> [ 27.712102] r8169 0000:0a:00.0 eth1: link down
> [ 28.015340] r8169 0000:0b:00.0 eth0: link down
> [ 28.015368] r8169 0000:0b:00.0 eth0: link down
> [ 29.654844] r8169 0000:0b:00.0 eth0: link up
> [ 30.278542] r8169 0000:0a:00.0 eth1: link up
> [ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 218.073407] ------------[ cut here ]------------
> [ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
> [ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
> [ 218.095988] Modules linked in:
> [ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
> [ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
> [ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
> [ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
> [ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
> [ 218.140154] Call Trace:
> [ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
> [ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
> [ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
> [ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
> [ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
> [ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
> [ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
> [ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
> [ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
> [ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
> [ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
> [ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
> [ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
> [ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
> [ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
> [ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
> [ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
> [ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
> [ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
> [ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
> [ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
> [ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
> [ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
> [ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
> [ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
> [ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
> [ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
> [ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
> [ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
> [ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
> [ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
> [ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
> [ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
> [ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
> [ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
> [ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
> [ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
> [ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
> [ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
> [ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
> [ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
> [ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
> [ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
> [ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
> [ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
> [ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
> [ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
> [ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
> [ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
> [ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked
> [ 218.424677] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. end of dump
> [ 218.429050] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. start dump
> [ 218.432225] r8169 0000:0b:00.0: single idx 571 P=3c423040 N=3c423 D=c76040 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.435408] r8169 0000:0b:00.0: single idx 571 P=3c423200 N=3c423 D=c77200 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.438578] r8169 0000:0b:00.0: single idx 572 P=3c4233c0 N=3c423 D=c783c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.441695] r8169 0000:0b:00.0: single idx 572 P=3c423580 N=3c423 D=c79580 L=7b DMA_TO_DEVICE dma map error checked
> [ 218.444783] r8169 0000:0b:00.0: single idx 573 P=3c423780 N=3c423 D=c7a780 L=9b DMA_TO_DEVICE dma map error checked
> [ 218.447825] r8169 0000:0b:00.0: single idx 573 P=3c4239c0 N=3c423 D=c7b9c0 L=6b DMA_TO_DEVICE dma map error checked
> [ 218.450844] r8169 0000:0b:00.0: single idx 574 P=3c423bc0 N=3c423 D=c7cbc0 L=7b DMA_TO_DEVICE dma map error checked
> [ 218.453814] r8169 0000:0b:00.0: single idx 574 P=3c423dc0 N=3c423 D=c7ddc0 L=7b DMA_TO_DEVICE dma map error checked
> [ 218.456793] r8169 0000:0b:00.0: single idx 575 P=3c423fc0 N=3c423 D=c7efc0 L=7b DMA_TO_DEVICE dma map error not checked
> [ 218.459772] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. end of dump
> [ 218.473504] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. start dump
> [ 218.475662] r8169 0000:0b:00.0: single idx 586 P=3c7160c0 N=3c716 D=c940c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.477874] r8169 0000:0b:00.0: single idx 586 P=3c716280 N=3c716 D=c95280 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.480075] r8169 0000:0b:00.0: single idx 587 P=3c716440 N=3c716 D=c96440 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.482245] r8169 0000:0b:00.0: single idx 587 P=3c716600 N=3c716 D=c97600 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.484390] r8169 0000:0b:00.0: single idx 588 P=3c7167c0 N=3c716 D=c987c0 L=42 DMA_TO_DEVICE dma map error checked
> [ 218.486510] r8169 0000:0b:00.0: single idx 588 P=3c7169c0 N=3c716 D=c999c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.488603] r8169 0000:0b:00.0: single idx 589 P=3c716b80 N=3c716 D=c9ab80 L=42 DMA_TO_DEVICE dma map error checked
> [ 218.490682] r8169 0000:0b:00.0: single idx 589 P=3c716d80 N=3c716 D=c9bd80 L=42 DMA_TO_DEVICE dma map error checked
> [ 218.492735] r8169 0000:0b:00.0: single idx 590 P=3c716f80 N=3c716 D=c9cf80 L=42 DMA_TO_DEVICE dma map error not checked
> [ 218.494788] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. end of dump
>
> --
> Sander
>


Incoming frames might be taken out of order-3 pages.

With regular Ethernet frames, this is 21 frames per order-3 page.

ACTIVE_PFN_MAX_OVERLAP seems too small.

An alternative would be to use only order-0 pages if CONFIG_DMA_API_DEBUG
is set. Not sure if it works if PAGE_SIZE=65536 ....

diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f589c9af8cbf..1b9995adfd29 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1924,7 +1924,11 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
kfree_skb(skb);
}

+#if defined(CONFIG_DMA_API_DEBUG)
+#define NETDEV_FRAG_PAGE_MAX_ORDER 0
+#else
#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
+#endif
#define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
#define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE


2014-02-11 22:53:13

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe


Tuesday, February 11, 2014, 10:28:52 PM, you wrote:

> On Tue, 2014-02-11 at 20:56 +0100, Sander Eikelenboom wrote:
>> Hi Dan,
>>
>> FYI just tested and put Xen out of the equation (booting bare metal) and it still persists.
>>
>> I tried something else .. don't know if it gives you any more insights, but it's worth a try:
>>
>> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
>> index 2defd13..0fe5b75 100644
>> --- a/lib/dma-debug.c
>> +++ b/lib/dma-debug.c
>> @@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
>> return overlap;
>> }
>>
>> -static void active_pfn_inc_overlap(unsigned long pfn)
>> +static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
>> {
>> - int overlap = active_pfn_read_overlap(pfn);
>> + int overlap = active_pfn_read_overlap(ent->pfn);
>>
>> - overlap = active_pfn_set_overlap(pfn, ++overlap);
>> + overlap = active_pfn_set_overlap(ent->pfn, ++overlap);
>>
>> /* If we overflowed the overlap counter then we're potentially
>> * leaking dma-mappings. Otherwise, if maps and unmaps are
>> @@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
>> * debug_dma_assert_idle() as the pfn may be marked idle
>> * prematurely.
>> */
>> +
>> WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
>> "DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
>> - ACTIVE_PFN_MAX_OVERLAP, pfn);
>> + ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
>> +
>> + if(overlap > ACTIVE_PFN_MAX_OVERLAP){
>> +
>> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
>> + int idx;
>> +
>> + for (idx = 0; idx < HASH_SIZE; idx++) {
>> + struct hash_bucket *bucket = &dma_entry_hash[idx];
>> + struct dma_debug_entry *entry;
>> + unsigned long flags;
>> +
>> + list_for_each_entry(entry, &bucket->list, list) {
>> + if (entry->pfn == ent->pfn) {
>> + dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
>> + type2name[entry->type], idx,
>> + phys_addr(entry), entry->pfn,
>> + entry->dev_addr, entry->size,
>> + dir2name[entry->direction],
>> + maperr2str[entry->map_err_type]);
>> + }
>> + }
>> + }
>> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
>> + }
>> }
>>
>>
>> @@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)
>>
>> spin_lock_irqsave(&radix_lock, flags);
>> rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
>> - if (rc == -EEXIST)
>> - active_pfn_inc_overlap(entry->pfn);
>> + if (rc == -EEXIST){
>> + active_pfn_inc_overlap(entry);
>> + }
>> spin_unlock_irqrestore(&radix_lock, flags);
>> -
>> return rc;
>> }
>>
>>
>> This results in:
>> [ 27.708678] r8169 0000:0a:00.0 eth1: link down
>> [ 27.712102] r8169 0000:0a:00.0 eth1: link down
>> [ 28.015340] r8169 0000:0b:00.0 eth0: link down
>> [ 28.015368] r8169 0000:0b:00.0 eth0: link down
>> [ 29.654844] r8169 0000:0b:00.0 eth0: link up
>> [ 30.278542] r8169 0000:0a:00.0 eth1: link up
>> [ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 218.073407] ------------[ cut here ]------------
>> [ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
>> [ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
>> [ 218.095988] Modules linked in:
>> [ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
>> [ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
>> [ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
>> [ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
>> [ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
>> [ 218.140154] Call Trace:
>> [ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
>> [ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
>> [ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
>> [ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
>> [ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
>> [ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
>> [ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
>> [ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
>> [ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
>> [ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
>> [ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
>> [ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
>> [ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
>> [ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
>> [ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
>> [ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
>> [ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
>> [ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
>> [ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
>> [ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
>> [ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
>> [ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
>> [ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
>> [ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
>> [ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
>> [ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
>> [ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
>> [ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
>> [ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
>> [ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
>> [ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
>> [ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
>> [ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
>> [ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
>> [ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
>> [ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
>> [ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
>> [ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
>> [ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
>> [ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
>> [ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
>> [ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
>> [ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
>> [ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
>> [ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
>> [ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
>> [ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
>> [ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
>> [ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
>> [ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked
>> [ 218.424677] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. end of dump
>> [ 218.429050] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. start dump
>> [ 218.432225] r8169 0000:0b:00.0: single idx 571 P=3c423040 N=3c423 D=c76040 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.435408] r8169 0000:0b:00.0: single idx 571 P=3c423200 N=3c423 D=c77200 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.438578] r8169 0000:0b:00.0: single idx 572 P=3c4233c0 N=3c423 D=c783c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.441695] r8169 0000:0b:00.0: single idx 572 P=3c423580 N=3c423 D=c79580 L=7b DMA_TO_DEVICE dma map error checked
>> [ 218.444783] r8169 0000:0b:00.0: single idx 573 P=3c423780 N=3c423 D=c7a780 L=9b DMA_TO_DEVICE dma map error checked
>> [ 218.447825] r8169 0000:0b:00.0: single idx 573 P=3c4239c0 N=3c423 D=c7b9c0 L=6b DMA_TO_DEVICE dma map error checked
>> [ 218.450844] r8169 0000:0b:00.0: single idx 574 P=3c423bc0 N=3c423 D=c7cbc0 L=7b DMA_TO_DEVICE dma map error checked
>> [ 218.453814] r8169 0000:0b:00.0: single idx 574 P=3c423dc0 N=3c423 D=c7ddc0 L=7b DMA_TO_DEVICE dma map error checked
>> [ 218.456793] r8169 0000:0b:00.0: single idx 575 P=3c423fc0 N=3c423 D=c7efc0 L=7b DMA_TO_DEVICE dma map error not checked
>> [ 218.459772] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. end of dump
>> [ 218.473504] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. start dump
>> [ 218.475662] r8169 0000:0b:00.0: single idx 586 P=3c7160c0 N=3c716 D=c940c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.477874] r8169 0000:0b:00.0: single idx 586 P=3c716280 N=3c716 D=c95280 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.480075] r8169 0000:0b:00.0: single idx 587 P=3c716440 N=3c716 D=c96440 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.482245] r8169 0000:0b:00.0: single idx 587 P=3c716600 N=3c716 D=c97600 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.484390] r8169 0000:0b:00.0: single idx 588 P=3c7167c0 N=3c716 D=c987c0 L=42 DMA_TO_DEVICE dma map error checked
>> [ 218.486510] r8169 0000:0b:00.0: single idx 588 P=3c7169c0 N=3c716 D=c999c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.488603] r8169 0000:0b:00.0: single idx 589 P=3c716b80 N=3c716 D=c9ab80 L=42 DMA_TO_DEVICE dma map error checked
>> [ 218.490682] r8169 0000:0b:00.0: single idx 589 P=3c716d80 N=3c716 D=c9bd80 L=42 DMA_TO_DEVICE dma map error checked
>> [ 218.492735] r8169 0000:0b:00.0: single idx 590 P=3c716f80 N=3c716 D=c9cf80 L=42 DMA_TO_DEVICE dma map error not checked
>> [ 218.494788] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. end of dump
>>
>> --
>> Sander
>>


> Incoming frames might be taken out of order-3 pages.

> With regular Ethernet frames, this is 21 frames per order-3 page.

> ACTIVE_PFN_MAX_OVERLAP seems too small.

> An alternative would be to use only order-0 pages if CONFIG_DMA_API_DEBUG
> is set. Not sure if it works if PAGE_SIZE=65536 ....

> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f589c9af8cbf..1b9995adfd29 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1924,7 +1924,11 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
> kfree_skb(skb);
> }
>
> +#if defined(CONFIG_DMA_API_DEBUG)
> +#define NETDEV_FRAG_PAGE_MAX_ORDER 0
> +#else
> #define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> +#endif
> #define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
> #define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE
>


Hi Eric,

Just tested your patch .. but the warning still persists.

[ 193.004554] ------------[ cut here ]------------
[ 193.034237] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
[ 193.069895] DMA-API: exceeded 7 overlapping mappings of pfn 4da0f
[ 193.100538] Modules linked in:
[ 193.121839] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug7+ #1
[ 193.166335] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 193.202382] 0000000000000009 ffff88005f6037d8 ffffffff81b80984 ffffffff822134e0
[ 193.236534] ffff88005f603828 ffff88005f603818 ffffffff810c985c 0000000000000000
[ 193.270616] 00000000ffffffef 0000000000000036 ffff880057ade240 ffffffff822102e0
[ 193.304533] Call Trace:
[ 193.323492] <IRQ> [<ffffffff81b80984>] dump_stack+0x46/0x58
[ 193.352157] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
[ 193.381448] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
[ 193.409801] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
[ 193.440265] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
[ 193.467674] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
[ 193.496441] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
[ 193.524986] [<ffffffff81a01837>] ? dev_queue_xmit_nit+0x1d7/0x260
[ 193.553937] [<ffffffff81a0184f>] ? dev_queue_xmit_nit+0x1ef/0x260
[ 193.582610] [<ffffffff81a01665>] ? dev_queue_xmit_nit+0x5/0x260
[ 193.610487] [<ffffffff81a065df>] dev_hard_start_xmit+0x37f/0x590
[ 193.638573] [<ffffffff81a26c6e>] sch_direct_xmit+0xfe/0x280
[ 193.665292] [<ffffffff81a06a3f>] __dev_queue_xmit+0x24f/0x660
[ 193.692467] [<ffffffff81a067f5>] ? __dev_queue_xmit+0x5/0x660
[ 193.719507] [<ffffffff81ab2179>] ? ip_output+0x59/0xf0
[ 193.744469] [<ffffffff81a06e70>] dev_queue_xmit+0x10/0x20
[ 193.769895] [<ffffffff81ab072b>] ip_finish_output+0x2cb/0x670
[ 193.796220] [<ffffffff81ab2179>] ? ip_output+0x59/0xf0
[ 193.820722] [<ffffffff81ab2179>] ip_output+0x59/0xf0
[ 193.844674] [<ffffffff81aad556>] ip_forward_finish+0x76/0x1a0
[ 193.870977] [<ffffffff81aad82b>] ip_forward+0x1ab/0x440
[ 193.895737] [<ffffffff81114b3b>] ? lock_is_held+0x8b/0xb0
[ 193.920781] [<ffffffff81aab340>] ip_rcv_finish+0x150/0x660
[ 193.945803] [<ffffffff81aabdfb>] ip_rcv+0x22b/0x370
[ 193.968865] [<ffffffff81b09b87>] ? packet_rcv_spkt+0x47/0x190
[ 193.994340] [<ffffffff81a03232>] __netif_receive_skb_core+0x722/0x8f0
[ 194.021716] [<ffffffff81a02c35>] ? __netif_receive_skb_core+0x125/0x8f0
[ 194.049498] [<ffffffff8100b0c0>] ? xen_clocksource_read+0x20/0x30
[ 194.075755] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
[ 194.100131] [<ffffffff81a03421>] __netif_receive_skb+0x21/0x70
[ 194.125592] [<ffffffff81a03493>] netif_receive_skb_internal+0x23/0xf0
[ 194.152650] [<ffffffff81a04ced>] napi_gro_receive+0x8d/0x100
[ 194.177127] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
[ 194.200779] [<ffffffff81a039ca>] net_rx_action+0x18a/0x2c0
[ 194.224573] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
[ 194.248255] [<ffffffff810ce767>] __do_softirq+0x137/0x300
[ 194.271722] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
[ 194.293407] [<ffffffff8157e4b5>] xen_evtchn_do_upcall+0x35/0x50
[ 194.318007] [<ffffffff81b8dd1e>] xen_do_hypervisor_callback+0x1e/0x30
[ 194.343990] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 194.370744] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 194.395710] [<ffffffff8100ad20>] ? xen_safe_halt+0x10/0x20
[ 194.418397] [<ffffffff81020632>] ? default_idle+0x32/0xd0
[ 194.440557] [<ffffffff81020f86>] ? arch_cpu_idle+0x36/0x40
[ 194.462799] [<ffffffff81120a01>] ? cpu_startup_entry+0xa1/0x2a0
[ 194.486276] [<ffffffff81b7561c>] ? rest_init+0xbc/0xd0
[ 194.507451] [<ffffffff81b75565>] ? rest_init+0x5/0xd0
[ 194.528115] [<ffffffff82341f8e>] ? start_kernel+0x40e/0x41b
[ 194.550139] [<ffffffff8234197f>] ? repair_env_string+0x5e/0x5e
[ 194.572888] [<ffffffff823415f8>] ? x86_64_start_reservations+0x2a/0x2c
[ 194.597693] [<ffffffff82344ef2>] ? xen_start_kernel+0x586/0x588
[ 194.620610] ---[ end trace ecd65b3bd15959c4 ]---
[ 194.639349] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 4da0f .. start dump
[ 194.671379] r8169 0000:0b:00.0: single idx 500 P=4da0f040 N=4da0f D=53abe8040 L=36 DMA_TO_DEVICE dma map error checked
[ 194.708307] r8169 0000:0b:00.0: single idx 500 P=4da0f200 N=4da0f D=53abe8200 L=36 DMA_TO_DEVICE dma map error checked
[ 194.745122] r8169 0000:0b:00.0: single idx 500 P=4da0f3c0 N=4da0f D=53abe83c0 L=36 DMA_TO_DEVICE dma map error checked
[ 194.781859] r8169 0000:0b:00.0: single idx 500 P=4da0f580 N=4da0f D=53abe8580 L=36 DMA_TO_DEVICE dma map error checked
[ 194.818520] r8169 0000:0b:00.0: single idx 500 P=4da0f740 N=4da0f D=53abe8740 L=36 DMA_TO_DEVICE dma map error checked
[ 194.855038] r8169 0000:0b:00.0: single idx 500 P=4da0f900 N=4da0f D=53abe8900 L=36 DMA_TO_DEVICE dma map error checked
[ 194.891475] r8169 0000:0b:00.0: single idx 500 P=4da0fac0 N=4da0f D=53abe8ac0 L=36 DMA_TO_DEVICE dma map error checked
[ 194.927796] r8169 0000:0b:00.0: single idx 500 P=4da0fc80 N=4da0f D=53abe8c80 L=7b DMA_TO_DEVICE dma map error checked
[ 194.964115] r8169 0000:0b:00.0: single idx 500 P=4da0fe80 N=4da0f D=53abe8e80 L=36 DMA_TO_DEVICE dma map error not checked
[ 195.001427] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 4da0f .. end of dump

--
Sander

2014-02-12 02:07:15

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, Feb 11, 2014 at 11:56 AM, Sander Eikelenboom
<[email protected]> wrote:
> Hi Dan,
>
> FYI just tested and put Xen out of the equation (booting bare metal) and it still persists.
>
> I tried something else .. don't know if it gives you any more insights, but it's worth a try:

This is great! See below:

>
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 2defd13..0fe5b75 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
> return overlap;
> }
>
> -static void active_pfn_inc_overlap(unsigned long pfn)
> +static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
> {
> - int overlap = active_pfn_read_overlap(pfn);
> + int overlap = active_pfn_read_overlap(ent->pfn);
>
> - overlap = active_pfn_set_overlap(pfn, ++overlap);
> + overlap = active_pfn_set_overlap(ent->pfn, ++overlap);
>
> /* If we overflowed the overlap counter then we're potentially
> * leaking dma-mappings. Otherwise, if maps and unmaps are
> @@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
> * debug_dma_assert_idle() as the pfn may be marked idle
> * prematurely.
> */
> +
> WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
> "DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
> - ACTIVE_PFN_MAX_OVERLAP, pfn);
> + ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> +
> + if(overlap > ACTIVE_PFN_MAX_OVERLAP){
> +
> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> + int idx;
> +
> + for (idx = 0; idx < HASH_SIZE; idx++) {
> + struct hash_bucket *bucket = &dma_entry_hash[idx];
> + struct dma_debug_entry *entry;
> + unsigned long flags;
> +
> + list_for_each_entry(entry, &bucket->list, list) {
> + if (entry->pfn == ent->pfn) {
> + dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
> + type2name[entry->type], idx,
> + phys_addr(entry), entry->pfn,
> + entry->dev_addr, entry->size,
> + dir2name[entry->direction],
> + maperr2str[entry->map_err_type]);
> + }
> + }
> + }
> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> + }
> }
>
>
> @@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)
>
> spin_lock_irqsave(&radix_lock, flags);
> rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
> - if (rc == -EEXIST)
> - active_pfn_inc_overlap(entry->pfn);
> + if (rc == -EEXIST){
> + active_pfn_inc_overlap(entry);
> + }
> spin_unlock_irqrestore(&radix_lock, flags);
> -
> return rc;
> }
>
>
> This results in:
> [ 27.708678] r8169 0000:0a:00.0 eth1: link down
> [ 27.712102] r8169 0000:0a:00.0 eth1: link down
> [ 28.015340] r8169 0000:0b:00.0 eth0: link down
> [ 28.015368] r8169 0000:0b:00.0 eth0: link down
> [ 29.654844] r8169 0000:0b:00.0 eth0: link up
> [ 30.278542] r8169 0000:0a:00.0 eth1: link up
> [ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 218.073407] ------------[ cut here ]------------
> [ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
> [ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
> [ 218.095988] Modules linked in:
> [ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
> [ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
> [ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
> [ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
> [ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
> [ 218.140154] Call Trace:
> [ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
> [ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
> [ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
> [ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
> [ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
> [ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
> [ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
> [ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
> [ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
> [ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
> [ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
> [ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
> [ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
> [ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
> [ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
> [ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
> [ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
> [ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
> [ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
> [ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
> [ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
> [ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
> [ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
> [ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
> [ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
> [ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
> [ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
> [ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
> [ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
> [ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
> [ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
> [ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
> [ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
> [ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
> [ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
> [ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
> [ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
> [ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
> [ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
> [ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
> [ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
> [ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
> [ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
> [ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
> [ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
> [ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
> [ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
> [ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
> [ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
> [ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked

The overlap granularity is too large. Multiple dma_map_single
mappings are allowed to a given page as long as they don't collide on
the same cache line.
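
The granularity point can be checked against the dump for pfn 3c421 quoted above. The following is a user-space sketch, not the attached patch: `struct mapping` and the helper names are hypothetical, the P= and L= values are copied from the dump (both hexadecimal), and a 64-byte cache line (shift 6) and 4K page (shift 12) are assumptions about this machine.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct mapping {
	uint64_t phys;	/* P= value from the dump */
	uint64_t len;	/* L= value from the dump, in bytes */
};

/* The nine mappings reported for pfn 3c421. */
const struct mapping dump_pfn_3c421[9] = {
	{ 0x3c421100, 0x36 }, { 0x3c4212c0, 0x36 }, { 0x3c421480, 0x36 },
	{ 0x3c421640, 0x36 }, { 0x3c421800, 0x36 }, { 0x3c4219c0, 0x36 },
	{ 0x3c421b80, 0x9b }, { 0x3c421dc0, 0x36 }, { 0x3c421f80, 0x36 },
};

/* Count the mappings that start on page frame `pfn`. */
size_t mappings_on_pfn(const struct mapping *m, size_t n,
		       uint64_t pfn, unsigned int page_shift)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if ((m[i].phys >> page_shift) == pfn)
			count++;
	return count;
}

/*
 * Return 1 if any two mappings touch the same unit of 2^shift bytes,
 * 0 otherwise. shift = 6 checks cache lines, shift = 12 checks pages.
 */
int any_collision(const struct mapping *m, size_t n, unsigned int shift)
{
	size_t i, j;

	for (i = 0; i < n; i++) {
		uint64_t first_i = m[i].phys >> shift;
		uint64_t last_i = (m[i].phys + m[i].len - 1) >> shift;

		assert(m[i].len > 0);
		for (j = i + 1; j < n; j++) {
			uint64_t first_j = m[j].phys >> shift;
			uint64_t last_j = (m[j].phys + m[j].len - 1) >> shift;

			if (first_i <= last_j && first_j <= last_i)
				return 1;
		}
	}
	return 0;
}
```

With these inputs all nine mappings land on the same page frame, so a per-pfn counter sees nine overlaps, yet `any_collision(dump_pfn_3c421, 9, 6)` reports that no two of them share a 64-byte cache line.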


Please try the attached patch to see if it fixes this issue. Works ok for me.


Attachments:
fix-dma-debug-overlap.patch (9.61 kB)

2014-02-12 04:17:46

by Eric Dumazet

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:

> The overlap granularity is too large. Multiple dma_map_single
> mappings are allowed to a given page as long as they don't collide on
> the same cache line.
>

I am not sure why you try to track the number of mappings of a page.

Try launching 100 concurrent netperf -t TCP_SENDFILE

Same page might be mapped more than 100 times, more than 10000 times in
some cases.



2014-02-12 14:56:40

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, Feb 11, 2014 at 8:17 PM, Eric Dumazet <[email protected]> wrote:
> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>
>> The overlap granularity is too large. Multiple dma_map_single
>> mappings are allowed to a given page as long as they don't collide on
>> the same cache line.
>>
>
> I am not sure why you try to track the number of mappings of a page.

For this debug facility I am tracking whether dma has completed by
making sure there are no active dma_map entries in the address range
of a page being cow'd.

> Try launching 100 concurrent netperf -t TCP_SENDFILE
>
> Same page might be mapped more than 100 times, more than 10000 times in
> some cases.
>

Aren't these mappings serialized by the device to some extent?
Although multi-queue / multi-device would even defeat that...

Hmm, then I think at a minimum the activity tracking needs to be
constrained to overlapping DMA_FROM_DEVICE or DMA_BIDIRECTIONAL
mappings. However, I am still operating on the assumption that some
architectures (especially non-io-coherent or dmabounce architectures)
expect a dma mapping to reflect exclusive ownership of the buffer.
From the conversation I had with Russell, back in the day [1]:

"When we get to the second async_xor(), as we haven't started to run any
of these operations, the source and destination buffers are still mapped.
However, we ignore that and call dma_map_page() on them again - this is
illegal because the CPU does not own these buffers."

It might be the case that we can't have a general overlap detection
facility as it will flag stable use cases that nonetheless violate the
exclusivity expectation.

--
Dan

[1]: http://marc.info/?l=linux-arm-kernel&m=129389649101566&w=2

2014-02-12 22:53:05

by Ben Hutchings

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, 2014-02-11 at 13:28 -0800, Eric Dumazet wrote:
[...]
> Incoming frames might be taken out of order-3 pages.
>
> With regular Ethernet frames, this is 21 frames per order-3 page.
>
> ACTIVE_PFN_MAX_OVERLAP seems too small.
>
> An alternative would be to use only order-0 pages if CONFIG_DMA_API_DEBUG
> is set. Not sure if it works if PAGE_SIZE=65536 ....

Indeed, you can get a lot of packet buffers into a 64K page...

> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f589c9af8cbf..1b9995adfd29 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1924,7 +1924,11 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
> kfree_skb(skb);
> }
>
> +#if defined(CONFIG_DMA_API_DEBUG)
> +#define NETDEV_FRAG_PAGE_MAX_ORDER 0
> +#else
> #define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> +#endif
> #define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
> #define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE
>

That may be useful for debugging this particular problem, but please
don't make debugging options change behaviour like this.

Ben.

--
Ben Hutchings
If more than one person is responsible for a bug, no one is at fault.



2014-02-13 20:14:55

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>
> > The overlap granularity is too large. Multiple dma_map_single
> > mappings are allowed to a given page as long as they don't collide on
> > the same cache line.
> >
>
> I am not sure why you try to track the number of mappings of a page.
>
> Try launching 100 concurrent netperf -t TCP_SENDFILE
>
> Same page might be mapped more than 100 times, more than 10000 times in
> some cases.

Thanks for that test case.

I updated the fix patch with the following.

diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 42b12740940b..611010df1e9c 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
 	unsigned long flags;
 	int rc;

+	/* If the device is not writing memory then we don't have any
+	 * concerns about the cpu consuming stale data. This mitigates
+	 * legitimate usages of overlapping mappings.
+	 */
+	if (entry->direction == DMA_TO_DEVICE)
+		return 0;
+
 	spin_lock_irqsave(&radix_lock, flags);
 	rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
 	if (rc == -EEXIST)
@@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
 {
 	unsigned long flags;

+	/* ...mirror the insert case */
+	if (entry->direction == DMA_TO_DEVICE)
+		return;
+
 	spin_lock_irqsave(&radix_lock, flags);
 	/* since we are counting overlaps the final put of the
 	 * cacheline will occur when the overlap count is 0.


Sander, barring a negative test result from you I'll send the attached
patch to Andrew.
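
For anyone following along, here is a rough userspace sketch of the idea,
not the kernel code: overlaps are counted per cache line rather than per
page, and DMA_TO_DEVICE mappings are not tracked at all. Every name in it
(to_cln as written here, cln_insert, cln_remove, cln_active, the slot
table) is a hypothetical stand-in, and the 4 KiB page and 64-byte cache
line sizes are assumptions. Only the limit of 7 overlaps comes from the
warning text.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* All names below are hypothetical stand-ins, not kernel definitions. */
enum dir { TO_DEVICE, FROM_DEVICE, BIDIRECTIONAL };

#define PAGE_SHIFT_ASSUMED      12  /* assumed 4 KiB pages */
#define CACHELINE_SHIFT_ASSUMED 6   /* assumed 64-byte cache lines */
#define MAX_OVERLAP             7   /* limit named in the warning text */
#define NSLOTS                  64  /* table size, a limit of this sketch */

struct slot { uint64_t cln; int mappings; };   /* mappings == 0: free */
static struct slot slots[NSLOTS];

/* Cache line number: page frame number scaled up, plus line within page. */
static uint64_t to_cln(uint64_t pfn, uint32_t offset)
{
	return (pfn << (PAGE_SHIFT_ASSUMED - CACHELINE_SHIFT_ASSUMED)) +
	       (offset >> CACHELINE_SHIFT_ASSUMED);
}

static struct slot *find(uint64_t cln)
{
	for (size_t i = 0; i < NSLOTS; i++)
		if (slots[i].mappings > 0 && slots[i].cln == cln)
			return &slots[i];
	return NULL;
}

/* Number of tracked mappings on the cache line holding (pfn, offset). */
static int cln_active(uint64_t pfn, uint32_t offset)
{
	struct slot *s = find(to_cln(pfn, offset));
	return s ? s->mappings : 0;
}

/* Returns 0 when tracked or skipped, -1 when the overlap limit is hit. */
static int cln_insert(uint64_t pfn, uint32_t offset, enum dir d)
{
	uint64_t cln = to_cln(pfn, offset);
	struct slot *s;

	if (d == TO_DEVICE)	/* device only reads: nothing to track */
		return 0;

	s = find(cln);
	if (s) {
		if (s->mappings - 1 >= MAX_OVERLAP)
			return -1;	/* would exceed MAX_OVERLAP overlaps */
		s->mappings++;
		return 0;
	}
	for (size_t i = 0; i < NSLOTS; i++) {
		if (slots[i].mappings == 0) {
			slots[i].cln = cln;
			slots[i].mappings = 1;
			return 0;
		}
	}
	return -1;	/* table full (a limit of this sketch only) */
}

static void cln_remove(uint64_t pfn, uint32_t offset, enum dir d)
{
	struct slot *s;

	if (d == TO_DEVICE)	/* mirror the insert case */
		return;
	s = find(to_cln(pfn, offset));
	assert(s != NULL);	/* removing an untracked mapping is a bug */
	s->mappings--;
}
```

With this model, two FROM_DEVICE mappings at offsets 0 and 64 of the same
page land on different cache lines and never count against each other,
while any number of TO_DEVICE mappings of one line (the sendfile case
Eric describes) pass through untracked.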

--
Dan


Attachments:
fix-dma-debug-overlap-v2.patch (10.26 kB)

2014-02-13 20:16:47

by Dave Jones

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, Feb 11, 2014 at 06:07:10PM -0800, Dan Williams wrote:

> The overlap granularity is too large. Multiple dma_map_single
> mappings are allowed to a given page as long as they don't collide on
> the same cache line.
>
>
> Please try the attached patch to see if it fixes this issue. Works ok for me.

FWIW, since applying this, I haven't seen the 8169 warnings.

thanks,

Dave

2014-02-13 21:49:36

by Sander Eikelenboom

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe


Thursday, February 13, 2014, 9:14:47 PM, you wrote:

> On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
>> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>>
>> > The overlap granularity is too large. Multiple dma_map_single
>> > mappings are allowed to a given page as long as they don't collide on
>> > the same cache line.
>> >
>>
>> I am not sure why you try to count the number of mappings of a page.
>>
>> Try launching 100 concurrent netperf -t TCP_SENDFILE
>>
>> Same page might be mapped more than 100 times, more than 10000 times in
>> some cases.

> Thanks for that test case.

> I updated the fix patch with the following.

> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 42b12740940b..611010df1e9c 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
> unsigned long flags;
> int rc;
>
> + /* If the device is not writing memory then we don't have any
> + * concerns about the cpu consuming stale data. This mitigates
> + * legitimate usages of overlapping mappings.
> + */
> + if (entry->direction == DMA_TO_DEVICE)
> + return 0;
> +
> spin_lock_irqsave(&radix_lock, flags);
> rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
> if (rc == -EEXIST)
> @@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
> {
> unsigned long flags;
>
> + /* ...mirror the insert case */
> + if (entry->direction == DMA_TO_DEVICE)
> + return;
> +
> spin_lock_irqsave(&radix_lock, flags);
> /* since we are counting overlaps the final put of the
> * cacheline will occur when the overlap count is 0.


> Sander, barring a negative test result from you I'll send the attached
> patch to Andrew.

Hi Dan,

That seems to effectively suppress the warning, thanks and:

Tested-by: Sander Eikelenboom <[email protected]>

--
Sander

> --
> Dan

2014-02-25 17:45:53

by Josh Boyer

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Thu, Feb 13, 2014 at 4:49 PM, Sander Eikelenboom
<[email protected]> wrote:
>
> Thursday, February 13, 2014, 9:14:47 PM, you wrote:
>
>> On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
>>> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>>>
>>> > The overlap granularity is too large. Multiple dma_map_single
>>> > mappings are allowed to a given page as long as they don't collide on
>>> > the same cache line.
>>> >
>>>
>>> I am not sure why you try to count the number of mappings of a page.
>>>
>>> Try launching 100 concurrent netperf -t TCP_SENDFILE
>>>
>>> Same page might be mapped more than 100 times, more than 10000 times in
>>> some cases.
>
>> Thanks for that test case.
>
>> I updated the fix patch with the following.
>
>> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
>> index 42b12740940b..611010df1e9c 100644
>> --- a/lib/dma-debug.c
>> +++ b/lib/dma-debug.c
>> @@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
>> unsigned long flags;
>> int rc;
>>
>> + /* If the device is not writing memory then we don't have any
>> + * concerns about the cpu consuming stale data. This mitigates
>> + * legitimate usages of overlapping mappings.
>> + */
>> + if (entry->direction == DMA_TO_DEVICE)
>> + return 0;
>> +
>> spin_lock_irqsave(&radix_lock, flags);
>> rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
>> if (rc == -EEXIST)
>> @@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
>> {
>> unsigned long flags;
>>
>> + /* ...mirror the insert case */
>> + if (entry->direction == DMA_TO_DEVICE)
>> + return;
>> +
>> spin_lock_irqsave(&radix_lock, flags);
>> /* since we are counting overlaps the final put of the
>> * cacheline will occur when the overlap count is 0.
>
>
>> Sander, barring a negative test result from you I'll send the attached
>> patch to Andrew.
>
> Hi Dan,
>
> That seems to effectively suppress the warning, thanks and:
>
> Tested-by: Sander Eikelenboom <[email protected]>

Is there a reason this isn't in Linus' tree yet?

josh

2014-02-25 17:50:50

by Dan Williams

Subject: Re: 3.14-mw regression: rtl8169 WARNING: DMA-API: exceeded 7 overlapping mappings of pfn 55ebe

On Tue, Feb 25, 2014 at 9:45 AM, Josh Boyer <[email protected]> wrote:
> On Thu, Feb 13, 2014 at 4:49 PM, Sander Eikelenboom
> <[email protected]> wrote:
>>
>> Thursday, February 13, 2014, 9:14:47 PM, you wrote:
>>
>>> On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
>>>> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>>>>
>>>> > The overlap granularity is too large. Multiple dma_map_single
>>>> > mappings are allowed to a given page as long as they don't collide on
>>>> > the same cache line.
>>>> >
>>>>
>>>> I am not sure why you try to count the number of mappings of a page.
>>>>
>>>> Try launching 100 concurrent netperf -t TCP_SENDFILE
>>>>
>>>> Same page might be mapped more than 100 times, more than 10000 times in
>>>> some cases.
>>
>>> Thanks for that test case.
>>
>>> I updated the fix patch with the following.
>>
>>> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
>>> index 42b12740940b..611010df1e9c 100644
>>> --- a/lib/dma-debug.c
>>> +++ b/lib/dma-debug.c
>>> @@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
>>> unsigned long flags;
>>> int rc;
>>>
>>> + /* If the device is not writing memory then we don't have any
>>> + * concerns about the cpu consuming stale data. This mitigates
>>> + * legitimate usages of overlapping mappings.
>>> + */
>>> + if (entry->direction == DMA_TO_DEVICE)
>>> + return 0;
>>> +
>>> spin_lock_irqsave(&radix_lock, flags);
>>> rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
>>> if (rc == -EEXIST)
>>> @@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
>>> {
>>> unsigned long flags;
>>>
>>> + /* ...mirror the insert case */
>>> + if (entry->direction == DMA_TO_DEVICE)
>>> + return;
>>> +
>>> spin_lock_irqsave(&radix_lock, flags);
>>> /* since we are counting overlaps the final put of the
>>> * cacheline will occur when the overlap count is 0.
>>
>>
>>> Sander, barring a negative test result from you I'll send the attached
>>> patch to Andrew.
>>
>> Hi Dan,
>>
>> That seems to effectively suppress the warning, thanks and:
>>
>> Tested-by: Sander Eikelenboom <[email protected]>
>
> Is there a reason this isn't in Linus' tree yet?
>

It's in -mm and now -next, I expect it will go upstream with akpm's next sync.

--
Dan