Hi,
I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
It looks like it's related to the rtl8169 ...
--
Sander
Jan 26 11:36:26 serveerstertje kernel: [ 89.105537] ------------[ cut here ]------------
Jan 26 11:36:26 serveerstertje kernel: [ 89.116779] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x103/0x130()
Jan 26 11:36:26 serveerstertje kernel: [ 89.128148] DMA-API: exceeded 7 overlapping mappings of pfn 55ebe
Jan 26 11:36:26 serveerstertje kernel: [ 89.139397] Modules linked in:
Jan 26 11:36:26 serveerstertje kernel: [ 89.150535] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.13.0-20140125-mw-pcireset+ #1
Jan 26 11:36:26 serveerstertje kernel: [ 89.161784] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
Jan 26 11:36:26 serveerstertje kernel: [ 89.172965] 0000000000000009 ffff88005f603838 ffffffff81acbcfa ffffffff822134e0
Jan 26 11:36:26 serveerstertje kernel: [ 89.184156] ffff88005f603888 ffff88005f603878 ffffffff810bdf62 ffff880000000000
Jan 26 11:36:26 serveerstertje kernel: [ 89.195186] 0000000000055ebe 00000000ffffffef 0000000000000200 ffff8800592ea098
Jan 26 11:36:26 serveerstertje kernel: [ 89.206227] Call Trace:
Jan 26 11:36:26 serveerstertje kernel: [ 89.217027] <IRQ> [<ffffffff81acbcfa>] dump_stack+0x46/0x58
Jan 26 11:36:26 serveerstertje kernel: [ 89.227907] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
Jan 26 11:36:26 serveerstertje kernel: [ 89.238678] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
Jan 26 11:36:26 serveerstertje kernel: [ 89.249336] [<ffffffff81471c5a>] ? active_pfn_read_overlap+0x3a/0x70
Jan 26 11:36:26 serveerstertje kernel: [ 89.259904] [<ffffffff814729e3>] add_dma_entry+0x103/0x130
Jan 26 11:36:26 serveerstertje kernel: [ 89.270416] [<ffffffff81472de6>] debug_dma_map_page+0x126/0x150
Jan 26 11:36:26 serveerstertje kernel: [ 89.280840] [<ffffffff81714686>] rtl8169_start_xmit+0x216/0xa20
Jan 26 11:36:26 serveerstertje kernel: [ 89.291073] [<ffffffff8194aaaa>] ? __kfree_skb+0x3a/0xb0
Jan 26 11:36:26 serveerstertje kernel: [ 89.301252] [<ffffffff81955a3f>] ? dev_queue_xmit_nit+0x1ef/0x260
Jan 26 11:36:26 serveerstertje kernel: [ 89.311392] [<ffffffff81955850>] ? dev_loopback_xmit+0x1e0/0x1e0
Jan 26 11:36:26 serveerstertje kernel: [ 89.321418] [<ffffffff81959b96>] dev_hard_start_xmit+0x2e6/0x4a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.331236] [<ffffffff819778fe>] sch_direct_xmit+0xfe/0x280
Jan 26 11:36:26 serveerstertje kernel: [ 89.341013] [<ffffffff81959f8c>] __dev_queue_xmit+0x23c/0x630
Jan 26 11:36:26 serveerstertje kernel: [ 89.350668] [<ffffffff81959d50>] ? dev_hard_start_xmit+0x4a0/0x4a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.360264] [<ffffffff81a00ce4>] ? ip_output+0x54/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.369698] [<ffffffff8195a39b>] dev_queue_xmit+0xb/0x10
Jan 26 11:36:26 serveerstertje kernel: [ 89.379034] [<ffffffff819ff2bb>] ip_finish_output+0x2cb/0x670
Jan 26 11:36:26 serveerstertje kernel: [ 89.388373] [<ffffffff81a00ce4>] ? ip_output+0x54/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.397498] [<ffffffff81a00ce4>] ip_output+0x54/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.406584] [<ffffffff819fc141>] ip_forward_finish+0x71/0x1a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.415534] [<ffffffff819fc413>] ip_forward+0x1a3/0x440
Jan 26 11:36:26 serveerstertje kernel: [ 89.424400] [<ffffffff819f9f80>] ip_rcv_finish+0x150/0x650
Jan 26 11:36:26 serveerstertje kernel: [ 89.433108] [<ffffffff819faa1b>] ip_rcv+0x22b/0x370
Jan 26 11:36:26 serveerstertje kernel: [ 89.441737] [<ffffffff81a57322>] ? packet_rcv_spkt+0x42/0x190
Jan 26 11:36:26 serveerstertje kernel: [ 89.450226] [<ffffffff81957382>] __netif_receive_skb_core+0x6d2/0x8a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.458687] [<ffffffff81956dc4>] ? __netif_receive_skb_core+0x114/0x8a0
Jan 26 11:36:26 serveerstertje kernel: [ 89.467109] [<ffffffff81008f50>] ? xen_clocksource_read+0x20/0x30
Jan 26 11:36:26 serveerstertje kernel: [ 89.475362] [<ffffffff81116e09>] ? getnstimeofday+0x9/0x30
Jan 26 11:36:26 serveerstertje kernel: [ 89.483548] [<ffffffff8195756c>] __netif_receive_skb+0x1c/0x70
Jan 26 11:36:26 serveerstertje kernel: [ 89.491608] [<ffffffff819575de>] netif_receive_skb_internal+0x1e/0xf0
Jan 26 11:36:26 serveerstertje kernel: [ 89.499596] [<ffffffff81958ac0>] napi_gro_receive+0x70/0xa0
Jan 26 11:36:26 serveerstertje kernel: [ 89.507486] [<ffffffff81711673>] rtl8169_poll+0x2d3/0x680
Jan 26 11:36:26 serveerstertje kernel: [ 89.515222] [<ffffffff81957a81>] net_rx_action+0x161/0x260
Jan 26 11:36:26 serveerstertje kernel: [ 89.523097] [<ffffffff810c28dd>] __do_softirq+0x11d/0x250
Jan 26 11:36:26 serveerstertje kernel: [ 89.530973] [<ffffffff810c2d72>] irq_exit+0xa2/0xd0
Jan 26 11:36:26 serveerstertje kernel: [ 89.538915] [<ffffffff814f94bf>] xen_evtchn_do_upcall+0x2f/0x40
Jan 26 11:36:26 serveerstertje kernel: [ 89.546876] [<ffffffff81ad83de>] xen_do_hypervisor_callback+0x1e/0x30
Jan 26 11:36:26 serveerstertje kernel: [ 89.554591] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.562139] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.569503] [<ffffffff81008c70>] ? xen_safe_halt+0x10/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.576788] [<ffffffff81018748>] ? default_idle+0x18/0x20
Jan 26 11:36:26 serveerstertje kernel: [ 89.583863] [<ffffffff81018f5e>] ? arch_cpu_idle+0x2e/0x40
Jan 26 11:36:26 serveerstertje kernel: [ 89.590627] [<ffffffff8110b511>] ? cpu_startup_entry+0x91/0x1e0
Jan 26 11:36:26 serveerstertje kernel: [ 89.597184] [<ffffffff81ac0497>] ? rest_init+0xb7/0xc0
Jan 26 11:36:26 serveerstertje kernel: [ 89.603507] [<ffffffff81ac03e0>] ? csum_partial_copy_generic+0x170/0x170
Jan 26 11:36:26 serveerstertje kernel: [ 89.609631] [<ffffffff8230ef1c>] ? start_kernel+0x409/0x416
Jan 26 11:36:26 serveerstertje kernel: [ 89.615490] [<ffffffff8230e912>] ? repair_env_string+0x5e/0x5e
Jan 26 11:36:26 serveerstertje kernel: [ 89.621197] [<ffffffff8230e5f8>] ? x86_64_start_reservations+0x2a/0x2c
Jan 26 11:36:26 serveerstertje kernel: [ 89.626592] [<ffffffff82311e26>] ? xen_start_kernel+0x584/0x586
Jan 26 11:36:26 serveerstertje kernel: [ 89.631933] ---[ end trace 206b59d1fe29b5a7 ]---
Sander Eikelenboom <[email protected]> :
[...]
> I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
> It looks like it's related to the rtl8169 ...
>
> --
> Sander
>
[r8169 and xen stuff]
Dan, I am missing the part of the debug code that tells where the mappings were
previously set.
--
Ueimor
On Sun, Jan 26, 2014 at 4:03 PM, Francois Romieu <[email protected]> wrote:
> Sander Eikelenboom <[email protected]> :
> [...]
>> I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
>> It looks like it's related to the rtl8169 ...
>>
>> --
>> Sander
>>
> [r8169 and xen stuff]
>
> Dan, I miss the part of the debug code that tells where the mappings were
> previously set.
In this case it was a facepalm mistake on my part. The mappings were
not being properly accounted in the last revision of the patch I sent.
I copied you on the fix [1].
--
Dan
[1]: http://marc.info/?l=linux-netdev&m=139096447627032&w=2
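
To make the accounting issue above concrete, here is a minimal userspace sketch of per-pfn overlap counting. It is not the kernel's implementation: the names `model_map`, `model_unmap`, `MAX_OVERLAP` and the fixed-size table are hypothetical, and the cap of 7 is simply taken from the warning text in this thread. It only illustrates why a counter that is not decremented on unmap would eventually trip the warning even when a driver never holds more than one live mapping of a page.

```c
#include <assert.h>
#include <stdio.h>

/* Userspace model only: every name here is hypothetical, not a kernel API. */
#define MAX_OVERLAP 7    /* cap taken from the warning text in this thread */
#define TABLE_SIZE  64   /* arbitrary size for the illustration */

struct pfn_slot {
    unsigned long pfn;
    int count;           /* number of live mappings of this pfn */
    int used;
};

static struct pfn_slot table[TABLE_SIZE];
int model_warned;        /* set once the cap has been exceeded */

/* Find the slot tracking pfn; optionally claim a free slot for it. */
static struct pfn_slot *find_slot(unsigned long pfn, int create)
{
    struct pfn_slot *free_slot = NULL;

    for (int i = 0; i < TABLE_SIZE; i++) {
        if (table[i].used && table[i].pfn == pfn)
            return &table[i];
        if (!table[i].used && !free_slot)
            free_slot = &table[i];
    }
    if (create && free_slot) {
        free_slot->used = 1;
        free_slot->pfn = pfn;
        free_slot->count = 0;
        return free_slot;
    }
    return NULL;
}

/* Account one new mapping. Returns 0 on success, -1 if the cap is hit
 * (or the table is full). Warns only the first time the cap is hit. */
int model_map(unsigned long pfn)
{
    struct pfn_slot *s = find_slot(pfn, 1);

    if (!s)
        return -1;
    if (s->count >= MAX_OVERLAP) {
        if (!model_warned) {
            fprintf(stderr, "exceeded %d overlapping mappings of pfn %lx\n",
                    MAX_OVERLAP, pfn);
            model_warned = 1;
        }
        return -1;
    }
    s->count++;
    return 0;
}

/* Account one unmap. Skipping this call is the kind of accounting slip
 * that makes the counter creep upward across map/unmap cycles. */
void model_unmap(unsigned long pfn)
{
    struct pfn_slot *s = find_slot(pfn, 0);

    if (!s)
        return;
    assert(s->count > 0);
    if (--s->count == 0)
        s->used = 0;
}
```

With balanced `model_map`/`model_unmap` calls the counter for a pfn returns to zero after every cycle; eight maps of the same pfn without an unmap in between reach the cap and trigger the message.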
Hi Dan / Francois,
Didn't have time to test it before, but the patch doesn't seem to help.
I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe" warning,
but I see now that I forgot to mention I use r8169.use_dac=1 ...
Not using it seems to prevent the warning, but before 3.14 I had never seen this (with r8169.use_dac=1).
--
Sander
Wednesday, January 29, 2014, 4:06:24 AM, you wrote:
> On Sun, Jan 26, 2014 at 4:03 PM, Francois Romieu <[email protected]> wrote:
>> Sander Eikelenboom <[email protected]> :
>> [...]
>>> I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
>>> It looks like it's related to the rtl8169 ...
>>>
>>> --
>>> Sander
>>>
>> [r8169 and xen stuff]
>>
>> Dan, I miss the part of the debug code that tells where the mappings were
>> previously set.
> In this case it was a facepalm mistake on my part. The mappings were
> not being properly accounted in the last revision of the patch I sent.
> I copied you on the fix [1].
> --
> Dan
> [1]: http://marc.info/?l=linux-netdev&m=139096447627032&w=2
Hmm, OK, that last message was wrong, sorry for that. It did happen again without r8169.use_dac=1; it just doesn't seem to happen every time...
Konrad / Wei, do you happen to know of any Xen-related change that went into the 3.14 merge window and relates to DMA / Xen networking?
--
Sander
complete stacktrace:
[ 342.710738] ------------[ cut here ]------------
[ 342.726890] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:491 add_dma_entry+0x105/0x130()
[ 342.743210] DMA-API: exceeded 7 overlapping mappings of pfn 40b00
[ 342.759510] Modules linked in:
[ 342.775557] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc1-20140206-pcireset-net-btrevert+ #1
[ 342.791706] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 342.807627] 0000000000000009 ffff88005f603828 ffffffff81ad29fc ffffffff822134e0
[ 342.823430] ffff88005f603878 ffff88005f603868 ffffffff810bdf62 ffff880000000000
[ 342.839081] 0000000000040b00 00000000ffffffef ffffffff822102e0 ffff8800592b9098
[ 342.854572] Call Trace:
[ 342.869748] <IRQ> [<ffffffff81ad29fc>] dump_stack+0x46/0x58
[ 342.884915] [<ffffffff810bdf62>] warn_slowpath_common+0x82/0xb0
[ 342.899710] [<ffffffff810be031>] warn_slowpath_fmt+0x41/0x50
[ 342.914395] [<ffffffff8147853a>] ? active_pfn_read_overlap+0x3a/0x70
[ 342.929166] [<ffffffff814792c5>] add_dma_entry+0x105/0x130
[ 342.943733] [<ffffffff814796c6>] debug_dma_map_page+0x126/0x150
[ 342.957988] [<ffffffff8171c8b6>] rtl8169_start_xmit+0x216/0xa20
[ 342.972306] [<ffffffff8195f08f>] ? dev_queue_xmit_nit+0x1ef/0x260
[ 342.986523] [<ffffffff8195eea0>] ? dev_loopback_xmit+0x1e0/0x1e0
[ 343.000689] [<ffffffff819631e6>] dev_hard_start_xmit+0x2e6/0x4a0
[ 343.014466] [<ffffffff81980f3e>] sch_direct_xmit+0xfe/0x280
[ 343.028052] [<ffffffff819635dc>] __dev_queue_xmit+0x23c/0x630
[ 343.041338] [<ffffffff819633a0>] ? dev_hard_start_xmit+0x4a0/0x4a0
[ 343.054483] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
[ 343.067659] [<ffffffff819639eb>] dev_queue_xmit+0xb/0x10
[ 343.080804] [<ffffffff81a0890b>] ip_finish_output+0x2cb/0x670
[ 343.093746] [<ffffffff81a0a334>] ? ip_output+0x54/0xf0
[ 343.106391] [<ffffffff81a0a334>] ip_output+0x54/0xf0
[ 343.118683] [<ffffffff81a05791>] ip_forward_finish+0x71/0x1a0
[ 343.130901] [<ffffffff81a05a63>] ip_forward+0x1a3/0x440
[ 343.142829] [<ffffffff810ffebb>] ? lock_is_held+0x8b/0xb0
[ 343.154346] [<ffffffff81a035c0>] ip_rcv_finish+0x150/0x660
[ 343.165748] [<ffffffff81a0406b>] ip_rcv+0x22b/0x370
[ 343.176838] [<ffffffff81a60972>] ? packet_rcv_spkt+0x42/0x190
[ 343.187659] [<ffffffff819609d2>] __netif_receive_skb_core+0x6d2/0x8a0
[ 343.198209] [<ffffffff81960414>] ? __netif_receive_skb_core+0x114/0x8a0
[ 343.208819] [<ffffffff81009010>] ? xen_clocksource_read+0x20/0x30
[ 343.219471] [<ffffffff81116e49>] ? getnstimeofday+0x9/0x30
[ 343.229862] [<ffffffff81960bbc>] __netif_receive_skb+0x1c/0x70
[ 343.239953] [<ffffffff81960c2e>] netif_receive_skb_internal+0x1e/0xf0
[ 343.249908] [<ffffffff81962110>] napi_gro_receive+0x70/0xa0
[ 343.259509] [<ffffffff817198a3>] rtl8169_poll+0x2d3/0x680
[ 343.268982] [<ffffffff81adcd2b>] ? _raw_spin_unlock_irq+0x2b/0x50
[ 343.278091] [<ffffffff819610d1>] net_rx_action+0x161/0x260
[ 343.287056] [<ffffffff810c28ec>] __do_softirq+0x12c/0x280
[ 343.295756] [<ffffffff810c2da2>] irq_exit+0xa2/0xd0
[ 343.304235] [<ffffffff814ffd5f>] xen_evtchn_do_upcall+0x2f/0x40
[ 343.312387] [<ffffffff81adf15e>] xen_do_hypervisor_callback+0x1e/0x30
[ 343.320389] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 343.328171] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 343.335738] [<ffffffff81008c70>] ? xen_safe_halt+0x10/0x20
[ 343.343142] [<ffffffff81018748>] ? default_idle+0x18/0x20
[ 343.350202] [<ffffffff81018f5e>] ? arch_cpu_idle+0x2e/0x40
[ 343.356994] [<ffffffff8110b551>] ? cpu_startup_entry+0x91/0x1e0
[ 343.363658] [<ffffffff81ac7d87>] ? rest_init+0xb7/0xc0
[ 343.369924] [<ffffffff81ac7cd0>] ? csum_partial_copy_generic+0x170/0x170
[ 343.376057] [<ffffffff8230ff1c>] ? start_kernel+0x409/0x416
[ 343.381972] [<ffffffff8230f912>] ? repair_env_string+0x5e/0x5e
[ 343.387573] [<ffffffff8230f5f8>] ? x86_64_start_reservations+0x2a/0x2c
[ 343.393152] [<ffffffff82312e28>] ? xen_start_kernel+0x586/0x588
[ 343.398628] ---[ end trace 8379b598fb7ef5ee ]---
Thursday, February 6, 2014, 12:36:31 PM, you wrote:
> Hi Dan / Francois,
> Didn't have time to test it before, but the patch doesn't seem to help.
> I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe",
> but i see now i forgot to mention i use r8169.use_dac=1 ...
> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
> --
> Sander
> Wednesday, January 29, 2014, 4:06:24 AM, you wrote:
>> On Sun, Jan 26, 2014 at 4:03 PM, Francois Romieu <[email protected]> wrote:
>>> Sander Eikelenboom <[email protected]> :
>>> [...]
>>>> I have got a regression with a 3.14-mw kernel (last commit is 4ba9920e5e9c0e16b5ed24292d45322907bb9035):
>>>> It looks like it's related to the rtl8169 ...
>>>>
>>>> --
>>>> Sander
>>>>
>>> [r8169 and xen stuff]
>>>
>>> Dan, I miss the part of the debug code that tells where the mappings were
>>> previously set.
>> In this case it was a facepalm mistake on my part. The mappings were
>> not being properly accounted in the last revision of the patch I sent.
>> I copied you on the fix [1].
>> --
>> Dan
>> [1]: http://marc.info/?l=linux-netdev&m=139096447627032&w=2
--
Best regards,
Sander mailto:[email protected]
On Thu, Feb 6, 2014 at 5:09 AM, Sander Eikelenboom <[email protected]> wrote:
> Hmm ok that last message was false .. sorry for that .. it did happen again without r8169.use_dac=1, it just doesn't seem to happen all the time...
>
> Konrad / Wei, do you happen to know of any xen related change that went into 3.14 merge window that relates to dma / xen networking ?
>
> --
> Sander
>
> Thursday, February 6, 2014, 12:36:31 PM, you wrote:
>
>> Hi Dan / Francois,
>
>> Didn't have time to test it before, but the patch doesn't seem to help.
>> I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe",
>> but i see now i forgot to mention i use r8169.use_dac=1 ...
>
>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
If you are still hitting this with the patch:
59f2e7df574c dma-debug: fix overlap detection
...then I'm more inclined to think it is an actual positive report.
If you don't mind I'll send some debug patches to narrow this down.
Thursday, February 6, 2014, 3:26:09 PM, you wrote:
> On Thu, Feb 6, 2014 at 5:09 AM, Sander Eikelenboom <[email protected]> wrote:
>> Hmm ok that last message was false .. sorry for that .. it did happen again without r8169.use_dac=1, it just doesn't seem to happen all the time...
>>
>> Konrad / Wei, do you happen to know of any xen related change that went into 3.14 merge window that relates to dma / xen networking ?
>>
>> --
>> Sander
>>
>> Thursday, February 6, 2014, 12:36:31 PM, you wrote:
>>
>>> Hi Dan / Francois,
>>
>>> Didn't have time to test it before, but the patch doesn't seem to help.
>>> I'm still getting the "DMA-API: exceeded 7 overlapping mappings of pfn 55ebe",
>>> but i see now i forgot to mention i use r8169.use_dac=1 ...
>>
>>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
> If you are still hitting this with the patch:
> 59f2e7df574c dma-debug: fix overlap detection
> ...then I'm more inclined to think it is an actual positive report.
> If you don't mind I'll send some debug patches to narrow this down.
Please do .. sounds better than bisecting :-)
On Thu, Feb 6, 2014 at 6:27 AM, Sander Eikelenboom <[email protected]> wrote:
>>>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
>
>> If you are still hitting this with the patch:
>
>> 59f2e7df574c dma-debug: fix overlap detection
>
>> ...then I'm more inclined to think it is an actual positive report.
>
>> If you don't mind I'll send some debug patches to narrow this down.
>
> Please do .. sounds better than bisecting :-)
>
Hi, attached is a patch that should give some insight into whether the
driver is triggering many overlapping mappings. Try it on top of
3.14-rc1.
Thank you for the debug help!
Thursday, February 6, 2014, 8:12:15 PM, you wrote:
> On Thu, Feb 6, 2014 at 6:27 AM, Sander Eikelenboom <[email protected]> wrote:
>>>>> Not using it seems to prevent the warning, but before 3.14 i have never seen this (with r8169.use_dac=1)
>>
>>> If you are still hitting this with the patch:
>>
>>> 59f2e7df574c dma-debug: fix overlap detection
>>
>>> ...then I'm more inclined to think it is an actual positive report.
>>
>>> If you don't mind I'll send some debug patches to narrow this down.
>>
>> Please do .. sounds better than bisecting :-)
>>
> Hi, attached is a patch that should give some insight into whether the
> driver is triggering many overlapping mappings. Try it on top of
> 3.14-rc1.
> Thank you for the debug help!
Hi Dan,
Nifty feature, the trace_printk .. however, is there a way to limit the list it's spitting out
to what you are interested in?
At present the machine chokes while trying to spit out everything in one go, and:
- probably not all of it is logged to disk, because of all the RCU stalls and other problems it causes.
- the list on the console at least looked a lot longer (and in the logs I don't see the original WARN_ON, which should
be just before the dump).
However .. attached is what I have got ...
--
Sander
Hi Dan,
FYI, I just tested with Xen out of the equation (booting bare metal) and the problem still persists.
I tried something else .. don't know if it gives you any more insight, but it's worth a try:
diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 2defd13..0fe5b75 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
return overlap;
}
-static void active_pfn_inc_overlap(unsigned long pfn)
+static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
{
- int overlap = active_pfn_read_overlap(pfn);
+ int overlap = active_pfn_read_overlap(ent->pfn);
- overlap = active_pfn_set_overlap(pfn, ++overlap);
+ overlap = active_pfn_set_overlap(ent->pfn, ++overlap);
/* If we overflowed the overlap counter then we're potentially
* leaking dma-mappings. Otherwise, if maps and unmaps are
@@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
* debug_dma_assert_idle() as the pfn may be marked idle
* prematurely.
*/
+
WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
"DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
- ACTIVE_PFN_MAX_OVERLAP, pfn);
+ ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
+
+ if(overlap > ACTIVE_PFN_MAX_OVERLAP){
+
+ dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
+ int idx;
+
+ for (idx = 0; idx < HASH_SIZE; idx++) {
+ struct hash_bucket *bucket = &dma_entry_hash[idx];
+ struct dma_debug_entry *entry;
+ unsigned long flags;
+
+ list_for_each_entry(entry, &bucket->list, list) {
+ if (entry->pfn == ent->pfn) {
+ dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
+ type2name[entry->type], idx,
+ phys_addr(entry), entry->pfn,
+ entry->dev_addr, entry->size,
+ dir2name[entry->direction],
+ maperr2str[entry->map_err_type]);
+ }
+ }
+ }
+ dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
+ }
}
@@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)
spin_lock_irqsave(&radix_lock, flags);
rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
- if (rc == -EEXIST)
- active_pfn_inc_overlap(entry->pfn);
+ if (rc == -EEXIST){
+ active_pfn_inc_overlap(entry);
+ }
spin_unlock_irqrestore(&radix_lock, flags);
-
return rc;
}
This results in:
[ 27.708678] r8169 0000:0a:00.0 eth1: link down
[ 27.712102] r8169 0000:0a:00.0 eth1: link down
[ 28.015340] r8169 0000:0b:00.0 eth0: link down
[ 28.015368] r8169 0000:0b:00.0 eth0: link down
[ 29.654844] r8169 0000:0b:00.0 eth0: link up
[ 30.278542] r8169 0000:0a:00.0 eth1: link up
[ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
[ 218.073407] ------------[ cut here ]------------
[ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
[ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
[ 218.095988] Modules linked in:
[ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
[ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
[ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
[ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
[ 218.140154] Call Trace:
[ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
[ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
[ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
[ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
[ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
[ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
[ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
[ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
[ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
[ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
[ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
[ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
[ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
[ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
[ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
[ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
[ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
[ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
[ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
[ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
[ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
[ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
[ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
[ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
[ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
[ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
[ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
[ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
[ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
[ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
[ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
[ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
[ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
[ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
[ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
[ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
[ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
[ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
[ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
[ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
[ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
[ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
[ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
[ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
[ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
[ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
[ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
[ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
[ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
[ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
[ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
[ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
[ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
[ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked
[ 218.424677] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. end of dump
[ 218.429050] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. start dump
[ 218.432225] r8169 0000:0b:00.0: single idx 571 P=3c423040 N=3c423 D=c76040 L=36 DMA_TO_DEVICE dma map error checked
[ 218.435408] r8169 0000:0b:00.0: single idx 571 P=3c423200 N=3c423 D=c77200 L=36 DMA_TO_DEVICE dma map error checked
[ 218.438578] r8169 0000:0b:00.0: single idx 572 P=3c4233c0 N=3c423 D=c783c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.441695] r8169 0000:0b:00.0: single idx 572 P=3c423580 N=3c423 D=c79580 L=7b DMA_TO_DEVICE dma map error checked
[ 218.444783] r8169 0000:0b:00.0: single idx 573 P=3c423780 N=3c423 D=c7a780 L=9b DMA_TO_DEVICE dma map error checked
[ 218.447825] r8169 0000:0b:00.0: single idx 573 P=3c4239c0 N=3c423 D=c7b9c0 L=6b DMA_TO_DEVICE dma map error checked
[ 218.450844] r8169 0000:0b:00.0: single idx 574 P=3c423bc0 N=3c423 D=c7cbc0 L=7b DMA_TO_DEVICE dma map error checked
[ 218.453814] r8169 0000:0b:00.0: single idx 574 P=3c423dc0 N=3c423 D=c7ddc0 L=7b DMA_TO_DEVICE dma map error checked
[ 218.456793] r8169 0000:0b:00.0: single idx 575 P=3c423fc0 N=3c423 D=c7efc0 L=7b DMA_TO_DEVICE dma map error not checked
[ 218.459772] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. end of dump
[ 218.473504] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. start dump
[ 218.475662] r8169 0000:0b:00.0: single idx 586 P=3c7160c0 N=3c716 D=c940c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.477874] r8169 0000:0b:00.0: single idx 586 P=3c716280 N=3c716 D=c95280 L=36 DMA_TO_DEVICE dma map error checked
[ 218.480075] r8169 0000:0b:00.0: single idx 587 P=3c716440 N=3c716 D=c96440 L=36 DMA_TO_DEVICE dma map error checked
[ 218.482245] r8169 0000:0b:00.0: single idx 587 P=3c716600 N=3c716 D=c97600 L=36 DMA_TO_DEVICE dma map error checked
[ 218.484390] r8169 0000:0b:00.0: single idx 588 P=3c7167c0 N=3c716 D=c987c0 L=42 DMA_TO_DEVICE dma map error checked
[ 218.486510] r8169 0000:0b:00.0: single idx 588 P=3c7169c0 N=3c716 D=c999c0 L=36 DMA_TO_DEVICE dma map error checked
[ 218.488603] r8169 0000:0b:00.0: single idx 589 P=3c716b80 N=3c716 D=c9ab80 L=42 DMA_TO_DEVICE dma map error checked
[ 218.490682] r8169 0000:0b:00.0: single idx 589 P=3c716d80 N=3c716 D=c9bd80 L=42 DMA_TO_DEVICE dma map error checked
[ 218.492735] r8169 0000:0b:00.0: single idx 590 P=3c716f80 N=3c716 D=c9cf80 L=42 DMA_TO_DEVICE dma map error not checked
[ 218.494788] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. end of dump
--
Sander
On Tue, 2014-02-11 at 20:56 +0100, Sander Eikelenboom wrote:
> Hi Dan,
>
> FYI, I just tested with Xen out of the equation (booting bare metal) and the problem still persists.
>
> I tried something else .. don't know if it gives you any more insight, but it's worth a try:
> [...]
>
> --
> Sander
>
Incoming frames might be taken out of order-3 pages.
With regular Ethernet frames, this is 21 frames per order-3 page.
ACTIVE_PFN_MAX_OVERLAP seems too small.
An alternative would be to use only order-0 pages if CONFIG_DMA_API_DEBUG
is set. Not sure if it works if PAGE_SIZE=65536 ....
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index f589c9af8cbf..1b9995adfd29 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -1924,7 +1924,11 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
kfree_skb(skb);
}
+#if defined(CONFIG_DMA_API_DEBUG)
+#define NETDEV_FRAG_PAGE_MAX_ORDER 0
+#else
#define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
+#endif
#define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
#define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE
Tuesday, February 11, 2014, 10:28:52 PM, you wrote:
> On Tue, 2014-02-11 at 20:56 +0100, Sander Eikelenboom wrote:
>> Hi Dan,
>>
>> FYI, I just tested with Xen out of the equation (booting bare metal) and the problem still persists.
>>
>> I tried something else .. don't know if it gives you any more insight, but it's worth a try:
>>
>> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
>> index 2defd13..0fe5b75 100644
>> --- a/lib/dma-debug.c
>> +++ b/lib/dma-debug.c
>> @@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
>> return overlap;
>> }
>>
>> -static void active_pfn_inc_overlap(unsigned long pfn)
>> +static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
>> {
>> - int overlap = active_pfn_read_overlap(pfn);
>> + int overlap = active_pfn_read_overlap(ent->pfn);
>>
>> - overlap = active_pfn_set_overlap(pfn, ++overlap);
>> + overlap = active_pfn_set_overlap(ent->pfn, ++overlap);
>>
>> /* If we overflowed the overlap counter then we're potentially
>> * leaking dma-mappings. Otherwise, if maps and unmaps are
>> @@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
>> * debug_dma_assert_idle() as the pfn may be marked idle
>> * prematurely.
>> */
>> +
>> WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
>> "DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
>> - ACTIVE_PFN_MAX_OVERLAP, pfn);
>> + ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
>> +
>> + if(overlap > ACTIVE_PFN_MAX_OVERLAP){
>> +
>> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
>> + int idx;
>> +
>> + for (idx = 0; idx < HASH_SIZE; idx++) {
>> + struct hash_bucket *bucket = &dma_entry_hash[idx];
>> + struct dma_debug_entry *entry;
>> + unsigned long flags;
>> +
>> + list_for_each_entry(entry, &bucket->list, list) {
>> + if (entry->pfn == ent->pfn) {
>> + dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
>> + type2name[entry->type], idx,
>> + phys_addr(entry), entry->pfn,
>> + entry->dev_addr, entry->size,
>> + dir2name[entry->direction],
>> + maperr2str[entry->map_err_type]);
>> + }
>> + }
>> + }
>> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
>> + }
>> }
>>
>>
>> @@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)
>>
>> spin_lock_irqsave(&radix_lock, flags);
>> rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
>> - if (rc == -EEXIST)
>> - active_pfn_inc_overlap(entry->pfn);
>> + if (rc == -EEXIST){
>> + active_pfn_inc_overlap(entry);
>> + }
>> spin_unlock_irqrestore(&radix_lock, flags);
>> -
>> return rc;
>> }
>>
>>
>> This results in:
>> [ 27.708678] r8169 0000:0a:00.0 eth1: link down
>> [ 27.712102] r8169 0000:0a:00.0 eth1: link down
>> [ 28.015340] r8169 0000:0b:00.0 eth0: link down
>> [ 28.015368] r8169 0000:0b:00.0 eth0: link down
>> [ 29.654844] r8169 0000:0b:00.0 eth0: link up
>> [ 30.278542] r8169 0000:0a:00.0 eth1: link up
>> [ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
>> [ 218.073407] ------------[ cut here ]------------
>> [ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
>> [ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
>> [ 218.095988] Modules linked in:
>> [ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
>> [ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
>> [ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
>> [ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
>> [ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
>> [ 218.140154] Call Trace:
>> [ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
>> [ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
>> [ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
>> [ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
>> [ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
>> [ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
>> [ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
>> [ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
>> [ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
>> [ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
>> [ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
>> [ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
>> [ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
>> [ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
>> [ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
>> [ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
>> [ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
>> [ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
>> [ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
>> [ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
>> [ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
>> [ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
>> [ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
>> [ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
>> [ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
>> [ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
>> [ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
>> [ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
>> [ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
>> [ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
>> [ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
>> [ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
>> [ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
>> [ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
>> [ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
>> [ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
>> [ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
>> [ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
>> [ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
>> [ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
>> [ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
>> [ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
>> [ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
>> [ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
>> [ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
>> [ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
>> [ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
>> [ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
>> [ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
>> [ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked
>> [ 218.424677] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. end of dump
>> [ 218.429050] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. start dump
>> [ 218.432225] r8169 0000:0b:00.0: single idx 571 P=3c423040 N=3c423 D=c76040 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.435408] r8169 0000:0b:00.0: single idx 571 P=3c423200 N=3c423 D=c77200 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.438578] r8169 0000:0b:00.0: single idx 572 P=3c4233c0 N=3c423 D=c783c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.441695] r8169 0000:0b:00.0: single idx 572 P=3c423580 N=3c423 D=c79580 L=7b DMA_TO_DEVICE dma map error checked
>> [ 218.444783] r8169 0000:0b:00.0: single idx 573 P=3c423780 N=3c423 D=c7a780 L=9b DMA_TO_DEVICE dma map error checked
>> [ 218.447825] r8169 0000:0b:00.0: single idx 573 P=3c4239c0 N=3c423 D=c7b9c0 L=6b DMA_TO_DEVICE dma map error checked
>> [ 218.450844] r8169 0000:0b:00.0: single idx 574 P=3c423bc0 N=3c423 D=c7cbc0 L=7b DMA_TO_DEVICE dma map error checked
>> [ 218.453814] r8169 0000:0b:00.0: single idx 574 P=3c423dc0 N=3c423 D=c7ddc0 L=7b DMA_TO_DEVICE dma map error checked
>> [ 218.456793] r8169 0000:0b:00.0: single idx 575 P=3c423fc0 N=3c423 D=c7efc0 L=7b DMA_TO_DEVICE dma map error not checked
>> [ 218.459772] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c423 .. end of dump
>> [ 218.473504] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. start dump
>> [ 218.475662] r8169 0000:0b:00.0: single idx 586 P=3c7160c0 N=3c716 D=c940c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.477874] r8169 0000:0b:00.0: single idx 586 P=3c716280 N=3c716 D=c95280 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.480075] r8169 0000:0b:00.0: single idx 587 P=3c716440 N=3c716 D=c96440 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.482245] r8169 0000:0b:00.0: single idx 587 P=3c716600 N=3c716 D=c97600 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.484390] r8169 0000:0b:00.0: single idx 588 P=3c7167c0 N=3c716 D=c987c0 L=42 DMA_TO_DEVICE dma map error checked
>> [ 218.486510] r8169 0000:0b:00.0: single idx 588 P=3c7169c0 N=3c716 D=c999c0 L=36 DMA_TO_DEVICE dma map error checked
>> [ 218.488603] r8169 0000:0b:00.0: single idx 589 P=3c716b80 N=3c716 D=c9ab80 L=42 DMA_TO_DEVICE dma map error checked
>> [ 218.490682] r8169 0000:0b:00.0: single idx 589 P=3c716d80 N=3c716 D=c9bd80 L=42 DMA_TO_DEVICE dma map error checked
>> [ 218.492735] r8169 0000:0b:00.0: single idx 590 P=3c716f80 N=3c716 D=c9cf80 L=42 DMA_TO_DEVICE dma map error not checked
>> [ 218.494788] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c716 .. end of dump
>>
>> --
>> Sander
>>
> Incoming frames might be taken out of order-3 pages.
> With regular Ethernet frames, this is 21 frames per order-3 page.
> ACTIVE_PFN_MAX_OVERLAP seems too small.
> Alternative would be to use order-0 only pages if CONFIG_DMA_API_DEBUG
> is set. Not sure if it works if PAGE_SIZE=65536 ....
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f589c9af8cbf..1b9995adfd29 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1924,7 +1924,11 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
> kfree_skb(skb);
> }
>
> +#if defined(CONFIG_DMA_API_DEBUG)
> +#define NETDEV_FRAG_PAGE_MAX_ORDER 0
> +#else
> #define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> +#endif
> #define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
> #define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE
>
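As a rough check of the arithmetic quoted above, here is a small userspace sketch in C. It is not kernel code: `get_order_sketch()` is a hypothetical stand-in for the kernel's `get_order()`, and the 1536-byte per-frame buffer size is an assumption, picked because it reproduces the "21 frames" figure.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed per-frame buffer size; this value is not taken from the thread. */
#define FRAME_BUF_SIZE 1536u

/* Hypothetical userspace stand-in for the kernel's get_order():
 * smallest order such that (page_size << order) >= size.
 */
static unsigned int get_order_sketch(size_t size, size_t page_size)
{
	unsigned int order = 0;

	assert(page_size > 0);
	while ((page_size << order) < size)
		order++;
	return order;
}

/* Number of whole frame buffers that fit in one fragment page. */
static size_t frames_per_frag_page(size_t page_size, unsigned int order)
{
	return (page_size << order) / FRAME_BUF_SIZE;
}
```

Under these assumptions, `frames_per_frag_page(4096, get_order_sketch(32768, 4096))` gives 21 and order 0 with 4K pages gives 2. With 64K pages, order 0 still gives 42 buffers per page, well above the limit of 7 reported in the warning.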
Hi Eric,
Just tested your patch .. but the warning still persists.
[ 193.004554] ------------[ cut here ]------------
[ 193.034237] WARNING: CPU: 0 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
[ 193.069895] DMA-API: exceeded 7 overlapping mappings of pfn 4da0f
[ 193.100538] Modules linked in:
[ 193.121839] CPU: 0 PID: 0 Comm: swapper/0 Not tainted 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug7+ #1
[ 193.166335] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
[ 193.202382] 0000000000000009 ffff88005f6037d8 ffffffff81b80984 ffffffff822134e0
[ 193.236534] ffff88005f603828 ffff88005f603818 ffffffff810c985c 0000000000000000
[ 193.270616] 00000000ffffffef 0000000000000036 ffff880057ade240 ffffffff822102e0
[ 193.304533] Call Trace:
[ 193.323492] <IRQ> [<ffffffff81b80984>] dump_stack+0x46/0x58
[ 193.352157] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
[ 193.381448] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
[ 193.409801] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
[ 193.440265] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
[ 193.467674] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
[ 193.496441] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
[ 193.524986] [<ffffffff81a01837>] ? dev_queue_xmit_nit+0x1d7/0x260
[ 193.553937] [<ffffffff81a0184f>] ? dev_queue_xmit_nit+0x1ef/0x260
[ 193.582610] [<ffffffff81a01665>] ? dev_queue_xmit_nit+0x5/0x260
[ 193.610487] [<ffffffff81a065df>] dev_hard_start_xmit+0x37f/0x590
[ 193.638573] [<ffffffff81a26c6e>] sch_direct_xmit+0xfe/0x280
[ 193.665292] [<ffffffff81a06a3f>] __dev_queue_xmit+0x24f/0x660
[ 193.692467] [<ffffffff81a067f5>] ? __dev_queue_xmit+0x5/0x660
[ 193.719507] [<ffffffff81ab2179>] ? ip_output+0x59/0xf0
[ 193.744469] [<ffffffff81a06e70>] dev_queue_xmit+0x10/0x20
[ 193.769895] [<ffffffff81ab072b>] ip_finish_output+0x2cb/0x670
[ 193.796220] [<ffffffff81ab2179>] ? ip_output+0x59/0xf0
[ 193.820722] [<ffffffff81ab2179>] ip_output+0x59/0xf0
[ 193.844674] [<ffffffff81aad556>] ip_forward_finish+0x76/0x1a0
[ 193.870977] [<ffffffff81aad82b>] ip_forward+0x1ab/0x440
[ 193.895737] [<ffffffff81114b3b>] ? lock_is_held+0x8b/0xb0
[ 193.920781] [<ffffffff81aab340>] ip_rcv_finish+0x150/0x660
[ 193.945803] [<ffffffff81aabdfb>] ip_rcv+0x22b/0x370
[ 193.968865] [<ffffffff81b09b87>] ? packet_rcv_spkt+0x47/0x190
[ 193.994340] [<ffffffff81a03232>] __netif_receive_skb_core+0x722/0x8f0
[ 194.021716] [<ffffffff81a02c35>] ? __netif_receive_skb_core+0x125/0x8f0
[ 194.049498] [<ffffffff8100b0c0>] ? xen_clocksource_read+0x20/0x30
[ 194.075755] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
[ 194.100131] [<ffffffff81a03421>] __netif_receive_skb+0x21/0x70
[ 194.125592] [<ffffffff81a03493>] netif_receive_skb_internal+0x23/0xf0
[ 194.152650] [<ffffffff81a04ced>] napi_gro_receive+0x8d/0x100
[ 194.177127] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
[ 194.200779] [<ffffffff81a039ca>] net_rx_action+0x18a/0x2c0
[ 194.224573] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
[ 194.248255] [<ffffffff810ce767>] __do_softirq+0x137/0x300
[ 194.271722] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
[ 194.293407] [<ffffffff8157e4b5>] xen_evtchn_do_upcall+0x35/0x50
[ 194.318007] [<ffffffff81b8dd1e>] xen_do_hypervisor_callback+0x1e/0x30
[ 194.343990] <EOI> [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 194.370744] [<ffffffff810013aa>] ? xen_hypercall_sched_op+0xa/0x20
[ 194.395710] [<ffffffff8100ad20>] ? xen_safe_halt+0x10/0x20
[ 194.418397] [<ffffffff81020632>] ? default_idle+0x32/0xd0
[ 194.440557] [<ffffffff81020f86>] ? arch_cpu_idle+0x36/0x40
[ 194.462799] [<ffffffff81120a01>] ? cpu_startup_entry+0xa1/0x2a0
[ 194.486276] [<ffffffff81b7561c>] ? rest_init+0xbc/0xd0
[ 194.507451] [<ffffffff81b75565>] ? rest_init+0x5/0xd0
[ 194.528115] [<ffffffff82341f8e>] ? start_kernel+0x40e/0x41b
[ 194.550139] [<ffffffff8234197f>] ? repair_env_string+0x5e/0x5e
[ 194.572888] [<ffffffff823415f8>] ? x86_64_start_reservations+0x2a/0x2c
[ 194.597693] [<ffffffff82344ef2>] ? xen_start_kernel+0x586/0x588
[ 194.620610] ---[ end trace ecd65b3bd15959c4 ]---
[ 194.639349] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 4da0f .. start dump
[ 194.671379] r8169 0000:0b:00.0: single idx 500 P=4da0f040 N=4da0f D=53abe8040 L=36 DMA_TO_DEVICE dma map error checked
[ 194.708307] r8169 0000:0b:00.0: single idx 500 P=4da0f200 N=4da0f D=53abe8200 L=36 DMA_TO_DEVICE dma map error checked
[ 194.745122] r8169 0000:0b:00.0: single idx 500 P=4da0f3c0 N=4da0f D=53abe83c0 L=36 DMA_TO_DEVICE dma map error checked
[ 194.781859] r8169 0000:0b:00.0: single idx 500 P=4da0f580 N=4da0f D=53abe8580 L=36 DMA_TO_DEVICE dma map error checked
[ 194.818520] r8169 0000:0b:00.0: single idx 500 P=4da0f740 N=4da0f D=53abe8740 L=36 DMA_TO_DEVICE dma map error checked
[ 194.855038] r8169 0000:0b:00.0: single idx 500 P=4da0f900 N=4da0f D=53abe8900 L=36 DMA_TO_DEVICE dma map error checked
[ 194.891475] r8169 0000:0b:00.0: single idx 500 P=4da0fac0 N=4da0f D=53abe8ac0 L=36 DMA_TO_DEVICE dma map error checked
[ 194.927796] r8169 0000:0b:00.0: single idx 500 P=4da0fc80 N=4da0f D=53abe8c80 L=7b DMA_TO_DEVICE dma map error checked
[ 194.964115] r8169 0000:0b:00.0: single idx 500 P=4da0fe80 N=4da0f D=53abe8e80 L=36 DMA_TO_DEVICE dma map error not checked
[ 195.001427] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 4da0f .. end of dump
--
Sander
On Tue, Feb 11, 2014 at 11:56 AM, Sander Eikelenboom
<[email protected]> wrote:
> Hi Dan,
>
> FYI, I just tested with Xen out of the equation (booting bare metal) and the warning still persists.
>
> I tried something else. I don't know if it gives you any more insight, but it's worth a try:
This is great! See below:
>
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 2defd13..0fe5b75 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -474,11 +474,11 @@ static int active_pfn_set_overlap(unsigned long pfn, int overlap)
> return overlap;
> }
>
> -static void active_pfn_inc_overlap(unsigned long pfn)
> +static void active_pfn_inc_overlap(struct dma_debug_entry *ent)
> {
> - int overlap = active_pfn_read_overlap(pfn);
> + int overlap = active_pfn_read_overlap(ent->pfn);
>
> - overlap = active_pfn_set_overlap(pfn, ++overlap);
> + overlap = active_pfn_set_overlap(ent->pfn, ++overlap);
>
> /* If we overflowed the overlap counter then we're potentially
> * leaking dma-mappings. Otherwise, if maps and unmaps are
> @@ -486,15 +486,43 @@ static void active_pfn_inc_overlap(unsigned long pfn)
> * debug_dma_assert_idle() as the pfn may be marked idle
> * prematurely.
> */
> +
> WARN_ONCE(overlap > ACTIVE_PFN_MAX_OVERLAP,
> "DMA-API: exceeded %d overlapping mappings of pfn %lx\n",
> - ACTIVE_PFN_MAX_OVERLAP, pfn);
> + ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> +
> + if(overlap > ACTIVE_PFN_MAX_OVERLAP){
> +
> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. start dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> + int idx;
> +
> + for (idx = 0; idx < HASH_SIZE; idx++) {
> + struct hash_bucket *bucket = &dma_entry_hash[idx];
> + struct dma_debug_entry *entry;
> + unsigned long flags;
> +
> + list_for_each_entry(entry, &bucket->list, list) {
> + if (entry->pfn == ent->pfn) {
> + dev_info(entry->dev, "%s idx %d P=%Lx N=%lx D=%Lx L=%Lx %s %s\n",
> + type2name[entry->type], idx,
> + phys_addr(entry), entry->pfn,
> + entry->dev_addr, entry->size,
> + dir2name[entry->direction],
> + maperr2str[entry->map_err_type]);
> + }
> + }
> + }
> + dev_info(ent->dev, "DMA-API: exceeded %d overlapping mappings of pfn %lx .. end of dump\n", ACTIVE_PFN_MAX_OVERLAP, ent->pfn);
> + }
> }
>
>
> @@ -505,10 +533,10 @@ static int active_pfn_insert(struct dma_debug_entry *entry)
>
> spin_lock_irqsave(&radix_lock, flags);
> rc = radix_tree_insert(&dma_active_pfn, entry->pfn, entry);
> - if (rc == -EEXIST)
> - active_pfn_inc_overlap(entry->pfn);
> + if (rc == -EEXIST){
> + active_pfn_inc_overlap(entry);
> + }
> spin_unlock_irqrestore(&radix_lock, flags);
> -
> return rc;
> }
>
>
> This results in:
> [ 27.708678] r8169 0000:0a:00.0 eth1: link down
> [ 27.712102] r8169 0000:0a:00.0 eth1: link down
> [ 28.015340] r8169 0000:0b:00.0 eth0: link down
> [ 28.015368] r8169 0000:0b:00.0 eth0: link down
> [ 29.654844] r8169 0000:0b:00.0 eth0: link up
> [ 30.278542] r8169 0000:0a:00.0 eth1: link up
> [ 60.829503] EXT4-fs (dm-2): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 69.708979] EXT4-fs (dm-42): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 76.128678] EXT4-fs (dm-43): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 82.922836] EXT4-fs (dm-44): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 89.232889] EXT4-fs (dm-45): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 95.359859] EXT4-fs (dm-46): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 101.638559] EXT4-fs (sdb1): mounted filesystem with ordered data mode. Opts: barrier=1,errors=remount-ro
> [ 218.073407] ------------[ cut here ]------------
> [ 218.080983] WARNING: CPU: 5 PID: 0 at lib/dma-debug.c:492 add_dma_entry+0xf1/0x210()
> [ 218.088550] DMA-API: exceeded 7 overlapping mappings of pfn 3c421
> [ 218.095988] Modules linked in:
> [ 218.103270] CPU: 5 PID: 0 Comm: swapper/5 Tainted: G W 3.14.0-rc2-20140211-pcireset-net-btrevert-xenblock-dmadebug5+ #1
> [ 218.110712] Hardware name: MSI MS-7640/890FXA-GD70 (MS-7640) , BIOS V1.8B1 09/13/2010
> [ 218.118134] 0000000000000009 ffff88003fd437b8 ffffffff81b809c4 ffff88003e308000
> [ 218.125556] ffff88003fd43808 ffff88003fd437f8 ffffffff810c985c 0000000000000000
> [ 218.132917] 00000000ffffffef 0000000000000036 ffff88003d9d3c00 0000000000000282
> [ 218.140154] Call Trace:
> [ 218.147193] <IRQ> [<ffffffff81b809c4>] dump_stack+0x46/0x58
> [ 218.154271] [<ffffffff810c985c>] warn_slowpath_common+0x8c/0xc0
> [ 218.161293] [<ffffffff810c9946>] warn_slowpath_fmt+0x46/0x50
> [ 218.168227] [<ffffffff814f2cfa>] ? active_pfn_read_overlap+0x3a/0x70
> [ 218.175116] [<ffffffff814f41d1>] add_dma_entry+0xf1/0x210
> [ 218.181865] [<ffffffff814f4646>] debug_dma_map_page+0x126/0x150
> [ 218.188484] [<ffffffff817aabeb>] rtl8169_start_xmit+0x21b/0xa20
> [ 218.195042] [<ffffffff81a01877>] ? dev_queue_xmit_nit+0x1d7/0x260
> [ 218.201553] [<ffffffff81a0188f>] ? dev_queue_xmit_nit+0x1ef/0x260
> [ 218.207965] [<ffffffff81a016a5>] ? dev_queue_xmit_nit+0x5/0x260
> [ 218.214290] [<ffffffff81a0661f>] dev_hard_start_xmit+0x37f/0x590
> [ 218.220481] [<ffffffff81a26cae>] sch_direct_xmit+0xfe/0x280
> [ 218.226529] [<ffffffff81a06a7f>] __dev_queue_xmit+0x24f/0x660
> [ 218.232521] [<ffffffff81a06835>] ? __dev_queue_xmit+0x5/0x660
> [ 218.238439] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
> [ 218.244272] [<ffffffff81a06eb0>] dev_queue_xmit+0x10/0x20
> [ 218.250043] [<ffffffff81ab076b>] ip_finish_output+0x2cb/0x670
> [ 218.255682] [<ffffffff81ab21b9>] ? ip_output+0x59/0xf0
> [ 218.261168] [<ffffffff81ab21b9>] ip_output+0x59/0xf0
> [ 218.266559] [<ffffffff81aad596>] ip_forward_finish+0x76/0x1a0
> [ 218.271883] [<ffffffff81aad86b>] ip_forward+0x1ab/0x440
> [ 218.277148] [<ffffffff81aab380>] ip_rcv_finish+0x150/0x660
> [ 218.282373] [<ffffffff81aabe3b>] ip_rcv+0x22b/0x370
> [ 218.287436] [<ffffffff81b09bc7>] ? packet_rcv_spkt+0x47/0x190
> [ 218.292372] [<ffffffff81a03272>] __netif_receive_skb_core+0x722/0x8f0
> [ 218.297328] [<ffffffff81a02c75>] ? __netif_receive_skb_core+0x125/0x8f0
> [ 218.302304] [<ffffffff8112ce6e>] ? getnstimeofday+0xe/0x30
> [ 218.307296] [<ffffffff819f42c5>] ? __netdev_alloc_frag+0x175/0x1b0
> [ 218.312166] [<ffffffff81a03461>] __netif_receive_skb+0x21/0x70
> [ 218.316904] [<ffffffff81a034d3>] netif_receive_skb_internal+0x23/0xf0
> [ 218.321596] [<ffffffff81a04d2d>] napi_gro_receive+0x8d/0x100
> [ 218.326219] [<ffffffff817a7bc3>] rtl8169_poll+0x2d3/0x680
> [ 218.330754] [<ffffffff8112e366>] ? update_wall_time+0x356/0x690
> [ 218.335208] [<ffffffff81a03a0a>] net_rx_action+0x18a/0x2c0
> [ 218.339595] [<ffffffff810ce6f1>] ? __do_softirq+0xc1/0x300
> [ 218.343890] [<ffffffff810ce767>] __do_softirq+0x137/0x300
> [ 218.348085] [<ffffffff810cec9a>] irq_exit+0xaa/0xd0
> [ 218.352203] [<ffffffff81b8e5a7>] do_IRQ+0x67/0x110
> [ 218.356225] [<ffffffff81b8b772>] common_interrupt+0x72/0x72
> [ 218.360156] <EOI> [<ffffffff810536e6>] ? native_safe_halt+0x6/0x10
> [ 218.364087] [<ffffffff81113a7d>] ? trace_hardirqs_on+0xd/0x10
> [ 218.367935] [<ffffffff81020632>] default_idle+0x32/0xd0
> [ 218.371691] [<ffffffff8102071e>] amd_e400_idle+0x4e/0x140
> [ 218.375360] [<ffffffff81020f86>] arch_cpu_idle+0x36/0x40
> [ 218.378921] [<ffffffff81120a01>] cpu_startup_entry+0xa1/0x2a0
> [ 218.382508] [<ffffffff810473cf>] start_secondary+0x1af/0x210
> [ 218.386133] ---[ end trace 0e12f271209e2c18 ]---
> [ 218.389769] r8169 0000:0b:00.0: DMA-API: exceeded 7 overlapping mappings of pfn 3c421 .. start dump
> [ 218.393566] r8169 0000:0b:00.0: single idx 563 P=3c421100 N=3c421 D=c66100 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.397379] r8169 0000:0b:00.0: single idx 563 P=3c4212c0 N=3c421 D=c672c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.401094] r8169 0000:0b:00.0: single idx 564 P=3c421480 N=3c421 D=c68480 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.404730] r8169 0000:0b:00.0: single idx 564 P=3c421640 N=3c421 D=c69640 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.408310] r8169 0000:0b:00.0: single idx 565 P=3c421800 N=3c421 D=c6a800 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.411762] r8169 0000:0b:00.0: single idx 565 P=3c4219c0 N=3c421 D=c6b9c0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.415075] r8169 0000:0b:00.0: single idx 566 P=3c421b80 N=3c421 D=c6cb80 L=9b DMA_TO_DEVICE dma map error checked
> [ 218.418305] r8169 0000:0b:00.0: single idx 566 P=3c421dc0 N=3c421 D=c6ddc0 L=36 DMA_TO_DEVICE dma map error checked
> [ 218.421502] r8169 0000:0b:00.0: single idx 567 P=3c421f80 N=3c421 D=c6ef80 L=36 DMA_TO_DEVICE dma map error not checked
The overlap granularity is too large. Multiple dma_map_single
mappings are allowed to a given page as long as they don't collide on
the same cache line.
Please try the attached patch to see if it fixes this issue. Works ok for me.
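To illustrate the granularity point with the data from the dump above, here is a minimal userspace sketch in C. It is not the attached patch. It assumes 64-byte cache lines and 4K pages, and every helper name in it is hypothetical. The `P=` and `L=` values are copied from the nine entries dumped for pfn 3c421.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define CACHELINE_SHIFT 6u	/* assumption: 64-byte cache lines */
#define PAGE_SHIFT_4K   12u	/* assumption: 4K pages */

struct mapping {
	uint64_t phys;	/* P= field from the dump */
	uint64_t len;	/* L= field from the dump */
};

/* The nine active mappings dumped for pfn 3c421. */
static const struct mapping pfn_3c421[] = {
	{ 0x3c421100, 0x36 }, { 0x3c4212c0, 0x36 }, { 0x3c421480, 0x36 },
	{ 0x3c421640, 0x36 }, { 0x3c421800, 0x36 }, { 0x3c4219c0, 0x36 },
	{ 0x3c421b80, 0x9b }, { 0x3c421dc0, 0x36 }, { 0x3c421f80, 0x36 },
};

#define N_MAPPINGS (sizeof(pfn_3c421) / sizeof(pfn_3c421[0]))

static uint64_t first_cln(const struct mapping *m)
{
	return m->phys >> CACHELINE_SHIFT;
}

static uint64_t last_cln(const struct mapping *m)
{
	assert(m->len > 0);
	return (m->phys + m->len - 1) >> CACHELINE_SHIFT;
}

/* Count mappings that start on the given page frame. */
static size_t count_on_pfn(const struct mapping *m, size_t n, uint64_t pfn)
{
	size_t i, hits = 0;

	for (i = 0; i < n; i++)
		if ((m[i].phys >> PAGE_SHIFT_4K) == pfn)
			hits++;
	return hits;
}

/* Count pairs of mappings whose cache-line ranges intersect. */
static size_t count_cln_collisions(const struct mapping *m, size_t n)
{
	size_t i, j, hits = 0;

	for (i = 0; i < n; i++)
		for (j = i + 1; j < n; j++)
			if (first_cln(&m[i]) <= last_cln(&m[j]) &&
			    first_cln(&m[j]) <= last_cln(&m[i]))
				hits++;
	return hits;
}
```

With these inputs, `count_on_pfn(pfn_3c421, N_MAPPINGS, 0x3c421)` reports all nine mappings on the same page, while `count_cln_collisions(pfn_3c421, N_MAPPINGS)` finds no pair sharing a cache line. Per-page counting trips the limit of 7 here; per-cache-line counting would not.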
On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
> The overlap granularity is too large. Multiple dma_map_single
> mappings are allowed to a given page as long as they don't collide on
> the same cache line.
>
I am not sure why you try to track the number of mappings of a page.
Try launching 100 concurrent netperf -t TCP_SENDFILE
Same page might be mapped more than 100 times, more than 10000 times in
some cases.
On Tue, Feb 11, 2014 at 8:17 PM, Eric Dumazet <[email protected]> wrote:
> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>
>> The overlap granularity is too large. Multiple dma_map_single
>> mappings are allowed to a given page as long as they don't collide on
>> the same cache line.
>>
>
> I am not sure why you try to track the number of mappings of a page.
For this debug facility I am tracking whether dma has completed by
making sure there are no active dma_map entries in the address range
of a page being cow'd.
> Try launching 100 concurrent netperf -t TCP_SENDFILE
>
> Same page might be mapped more than 100 times, more than 10000 times in
> some cases.
>
Aren't these mappings serialized by the device to some extent?
Although multi-queue / multi-device would even defeat that...
Hmm, then I think at a minimum the activity tracking needs to be
constrained to overlapping DMA_FROM_DEVICE or DMA_BIDIRECTIONAL
mappings. However, I am still operating on the assumption that some
architectures (especially non-io-coherent or dmabounce architectures)
expect a dma mapping to reflect exclusive ownership of the buffer.
From the conversation I had with Russell, back in the day [1]:
"When we get to the second async_xor(), as we haven't started to run any
of these operations, the source and destination buffers are still mapped.
However, we ignore that and call dma_map_page() on them again - this is
illegal because the CPU does not own these buffers."
It might be the case that we can't have a general overlap detection
facility as it will flag stable use cases that nonetheless violate the
exclusivity expectation.
--
Dan
[1]: http://marc.info/?l=linux-arm-kernel&m=129389649101566&w=2
On Tue, 2014-02-11 at 13:28 -0800, Eric Dumazet wrote:
[...]
> Incoming frames might be taken out of order-3 pages.
>
> With regular Ethernet frames, this is 21 frames per order-3 page.
>
> ACTIVE_PFN_MAX_OVERLAP seems too small.
>
> Alternative would be to use order-0 only pages if CONFIG_DMA_API_DEBUG
> is set. Not sure if it works if PAGE_SIZE=65536 ....
Indeed, you can get a lot of packet buffers into a 64K page...
> diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
> index f589c9af8cbf..1b9995adfd29 100644
> --- a/include/linux/skbuff.h
> +++ b/include/linux/skbuff.h
> @@ -1924,7 +1924,11 @@ static inline void __skb_queue_purge(struct sk_buff_head *list)
> kfree_skb(skb);
> }
>
> +#if defined(CONFIG_DMA_API_DEBUG)
> +#define NETDEV_FRAG_PAGE_MAX_ORDER 0
> +#else
> #define NETDEV_FRAG_PAGE_MAX_ORDER get_order(32768)
> +#endif
> #define NETDEV_FRAG_PAGE_MAX_SIZE (PAGE_SIZE << NETDEV_FRAG_PAGE_MAX_ORDER)
> #define NETDEV_PAGECNT_MAX_BIAS NETDEV_FRAG_PAGE_MAX_SIZE
>
That may be useful for debugging this particular problem, but please
don't make debugging options change behaviour like this.
Ben.
--
Ben Hutchings
If more than one person is responsible for a bug, no one is at fault.
On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>
> > The overlap granularity is too large. Multiple dma_map_single
> > mappings are allowed to a given page as long as they don't collide on
> > the same cache line.
> >
>
> I am not sure why you try to track the number of mappings of a page.
>
> Try launching 100 concurrent netperf -t TCP_SENDFILE
>
> Same page might be mapped more than 100 times, more than 10000 times in
> some cases.
Thanks for that test case.
I updated the fix patch with the following.
diff --git a/lib/dma-debug.c b/lib/dma-debug.c
index 42b12740940b..611010df1e9c 100644
--- a/lib/dma-debug.c
+++ b/lib/dma-debug.c
@@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
unsigned long flags;
int rc;
+ /* If the device is not writing memory then we don't have any
+ * concerns about the cpu consuming stale data. This mitigates
+ * legitimate usages of overlapping mappings.
+ */
+ if (entry->direction == DMA_TO_DEVICE)
+ return 0;
+
spin_lock_irqsave(&radix_lock, flags);
rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
if (rc == -EEXIST)
@@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
{
unsigned long flags;
+ /* ...mirror the insert case */
+ if (entry->direction == DMA_TO_DEVICE)
+ return;
+
spin_lock_irqsave(&radix_lock, flags);
/* since we are counting overlaps the final put of the
* cacheline will occur when the overlap count is 0.
Sander, barring a negative test result from you I'll send the attached
patch to Andrew.
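For readers following along, the effect of the direction check can be sketched in userspace C. This is only a model of the idea, not the dma-debug code: the enum is a local copy of the kernel's direction names, and `tracks_overlap()` and `count_tracked()` are hypothetical helpers.

```c
#include <assert.h>
#include <stddef.h>

/* Local copy of the kernel's direction names, for this sketch only. */
enum dma_data_direction {
	DMA_BIDIRECTIONAL = 0,
	DMA_TO_DEVICE = 1,
	DMA_FROM_DEVICE = 2,
	DMA_NONE = 3,
};

/* Hypothetical helper: would the overlap tracker look at this mapping?
 * Mappings the device only reads from are skipped.
 */
static int tracks_overlap(enum dma_data_direction dir)
{
	return dir != DMA_TO_DEVICE;
}

/* Count how many of n mappings would enter the overlap tracker. */
static size_t count_tracked(const enum dma_data_direction *dirs, size_t n)
{
	size_t i, tracked = 0;

	assert(dirs != NULL || n == 0);
	for (i = 0; i < n; i++)
		if (tracks_overlap(dirs[i]))
			tracked++;
	return tracked;
}

/* Nine transmit mappings, matching the directions in the r8169 dump. */
static const enum dma_data_direction tx_only[9] = {
	DMA_TO_DEVICE, DMA_TO_DEVICE, DMA_TO_DEVICE,
	DMA_TO_DEVICE, DMA_TO_DEVICE, DMA_TO_DEVICE,
	DMA_TO_DEVICE, DMA_TO_DEVICE, DMA_TO_DEVICE,
};

/* A made-up mix of directions for comparison. */
static const enum dma_data_direction mixed[3] = {
	DMA_TO_DEVICE, DMA_FROM_DEVICE, DMA_BIDIRECTIONAL,
};
```

In this model none of the nine transmit mappings is counted, so the limit cannot be exceeded by forwarding traffic alone, while `DMA_FROM_DEVICE` and `DMA_BIDIRECTIONAL` mappings are still tracked.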
--
Dan
On Tue, Feb 11, 2014 at 06:07:10PM -0800, Dan Williams wrote:
> The overlap granularity is too large. Multiple dma_map_single
> mappings are allowed to a given page as long as they don't collide on
> the same cache line.
>
>
> Please try the attached patch to see if it fixes this issue. Works ok for me.
FWIW, since applying this, I haven't seen the 8169 warnings.
thanks,
Dave
Thursday, February 13, 2014, 9:14:47 PM, you wrote:
> On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
>> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>>
>> > The overlap granularity is too large. Multiple dma_map_single
>> > mappings are allowed to a given page as long as they don't collide on
>> > the same cache line.
>> >
>>
>> I am not sure why you try to track the number of mappings of a page.
>>
>> Try launching 100 concurrent netperf -t TCP_SENDFILE
>>
>> Same page might be mapped more than 100 times, more than 10000 times in
>> some cases.
> Thanks for that test case.
> I updated the fix patch with the following.
> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
> index 42b12740940b..611010df1e9c 100644
> --- a/lib/dma-debug.c
> +++ b/lib/dma-debug.c
> @@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
> unsigned long flags;
> int rc;
>
> + /* If the device is not writing memory then we don't have any
> + * concerns about the cpu consuming stale data. This mitigates
> + * legitimate usages of overlapping mappings.
> + */
> + if (entry->direction == DMA_TO_DEVICE)
> + return 0;
> +
> spin_lock_irqsave(&radix_lock, flags);
> rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
> if (rc == -EEXIST)
> @@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
> {
> unsigned long flags;
>
> + /* ...mirror the insert case */
> + if (entry->direction == DMA_TO_DEVICE)
> + return;
> +
> spin_lock_irqsave(&radix_lock, flags);
> /* since we are counting overlaps the final put of the
> * cacheline will occur when the overlap count is 0.
> Sander, barring a negative test result from you I'll send the attached
> patch to Andrew.
Hi Dan,
That seems to effectively suppress the warning, thanks and:
Tested-by: Sander Eikelenboom <[email protected]>
--
Sander
> --
> Dan
On Thu, Feb 13, 2014 at 4:49 PM, Sander Eikelenboom
<[email protected]> wrote:
>
> Thursday, February 13, 2014, 9:14:47 PM, you wrote:
>
>> On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
>>> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>>>
>>> > The overlap granularity is too large. Multiple dma_map_single
>>> > mappings are allowed to a given page as long as they don't collide on
>>> > the same cache line.
>>> >
>>>
>>> I am not sure why you try to limit the number of mappings of a page.
>>>
>>> Try launching 100 concurrent netperf -t TCP_SENDFILE
>>>
>>> Same page might be mapped more than 100 times, more than 10000 times in
>>> some cases.
>
>> Thanks for that test case.
>
>> I updated the fix patch with the following.
>
>> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
>> index 42b12740940b..611010df1e9c 100644
>> --- a/lib/dma-debug.c
>> +++ b/lib/dma-debug.c
>> @@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
>> unsigned long flags;
>> int rc;
>>
>> + /* If the device is not writing memory then we don't have any
>> + * concerns about the cpu consuming stale data. This mitigates
>> + * legitimate usages of overlapping mappings.
>> + */
>> + if (entry->direction == DMA_TO_DEVICE)
>> + return 0;
>> +
>> spin_lock_irqsave(&radix_lock, flags);
>> rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
>> if (rc == -EEXIST)
>> @@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
>> {
>> unsigned long flags;
>>
>> + /* ...mirror the insert case */
>> + if (entry->direction == DMA_TO_DEVICE)
>> + return;
>> +
>> spin_lock_irqsave(&radix_lock, flags);
>> /* since we are counting overlaps the final put of the
>> * cacheline will occur when the overlap count is 0.
>
>
>> Sander, barring a negative test result from you I'll send the attached
>> patch to Andrew.
>
> Hi Dan,
>
> That seems to effectively suppress the warning, thanks and:
>
> Tested-by: Sander Eikelenboom <[email protected]>
Is there a reason this isn't in Linus' tree yet?
josh
On Tue, Feb 25, 2014 at 9:45 AM, Josh Boyer <[email protected]> wrote:
> On Thu, Feb 13, 2014 at 4:49 PM, Sander Eikelenboom
> <[email protected]> wrote:
>>
>> Thursday, February 13, 2014, 9:14:47 PM, you wrote:
>>
>>> On Tue, 2014-02-11 at 20:17 -0800, Eric Dumazet wrote:
>>>> On Tue, 2014-02-11 at 18:07 -0800, Dan Williams wrote:
>>>>
>>>> > The overlap granularity is too large. Multiple dma_map_single
>>>> > mappings are allowed to a given page as long as they don't collide on
>>>> > the same cache line.
>>>> >
>>>>
>>>> I am not sure why you try to limit the number of mappings of a page.
>>>>
>>>> Try launching 100 concurrent netperf -t TCP_SENDFILE
>>>>
>>>> Same page might be mapped more than 100 times, more than 10000 times in
>>>> some cases.
>>
>>> Thanks for that test case.
>>
>>> I updated the fix patch with the following.
>>
>>> diff --git a/lib/dma-debug.c b/lib/dma-debug.c
>>> index 42b12740940b..611010df1e9c 100644
>>> --- a/lib/dma-debug.c
>>> +++ b/lib/dma-debug.c
>>> @@ -513,6 +513,13 @@ static int active_cln_insert(struct dma_debug_entry *entry)
>>> unsigned long flags;
>>> int rc;
>>>
>>> + /* If the device is not writing memory then we don't have any
>>> + * concerns about the cpu consuming stale data. This mitigates
>>> + * legitimate usages of overlapping mappings.
>>> + */
>>> + if (entry->direction == DMA_TO_DEVICE)
>>> + return 0;
>>> +
>>> spin_lock_irqsave(&radix_lock, flags);
>>> rc = radix_tree_insert(&dma_active_cacheline, to_cln(entry), entry);
>>> if (rc == -EEXIST)
>>> @@ -526,6 +533,10 @@ static void active_cln_remove(struct dma_debug_entry *entry)
>>> {
>>> unsigned long flags;
>>>
>>> + /* ...mirror the insert case */
>>> + if (entry->direction == DMA_TO_DEVICE)
>>> + return;
>>> +
>>> spin_lock_irqsave(&radix_lock, flags);
>>> /* since we are counting overlaps the final put of the
>>> * cacheline will occur when the overlap count is 0.
>>
>>
>>> Sander, barring a negative test result from you I'll send the attached
>>> patch to Andrew.
>>
>> Hi Dan,
>>
>> That seems to effectively suppress the warning, thanks and:
>>
>> Tested-by: Sander Eikelenboom <[email protected]>
>
> Is there a reason this isn't in Linus' tree yet?
>
It's in -mm and now -next, I expect it will go upstream with akpm's next sync.
--
Dan
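For readers following the granularity point quoted above (mappings within one page are fine as long as they do not collide on the same cache line), the following is a rough sketch of how a cache-line index could be derived from a page frame number and an offset. It is an illustration under stated assumptions, not the to_cln() helper from the patch: the function name model_to_cln, the 4 KiB page size and the 64-byte cache line size are all assumed here.

```c
#include <assert.h>
#include <stdint.h>

#define MODEL_PAGE_SHIFT      12	/* assumed 4 KiB pages */
#define MODEL_CACHELINE_SHIFT 6		/* assumed 64-byte cache lines */

/*
 * Hypothetical helper: index of the cache line holding the byte at
 * `offset` within page frame `pfn`.
 */
uint64_t model_to_cln(uint64_t pfn, uint32_t offset)
{
	/* The offset must lie inside the page. */
	assert(offset < (1u << MODEL_PAGE_SHIFT));

	return (pfn << (MODEL_PAGE_SHIFT - MODEL_CACHELINE_SHIFT)) +
	       (offset >> MODEL_CACHELINE_SHIFT);
}
```

Under these assumptions, two buffers at offsets 0 and 64 of pfn 0x55ebe land on different cache-line indices and would not count as overlapping, whereas offsets 0 and 63 share one index. Tracking by pfn alone would treat all of them as overlaps.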