Return-path: Received: from magic.merlins.org ([209.81.13.136]:34503 "EHLO mail1.merlins.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752372Ab2DJFLb (ORCPT ); Tue, 10 Apr 2012 01:11:31 -0400 Date: Mon, 9 Apr 2012 22:11:27 -0700 From: Marc MERLIN To: Eric Dumazet Cc: David Miller , Larry.Finger@lwfinger.net, bhutchings@solarflare.com, linux-wireless@vger.kernel.org, netdev@vger.kernel.org Subject: Re: 3.2.8/amd64 full interrupt hangs and deadlocks under big network copies (page allocation failure) Message-ID: <20120410051127.GA32048@merlins.org> (sfid-20120410_071206_551460_F708EDF6) References: <20120409.143710.879746943062854492.davem@davemloft.net> <4F83316F.20504@lwfinger.net> <1333998672.3007.245.camel@edumazet-glaptop> <20120409.153452.1284163346306246866.davem@davemloft.net> <1334030180.13293.98.camel@edumazet-glaptop> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii In-Reply-To: <1334030180.13293.98.camel@edumazet-glaptop> Sender: linux-wireless-owner@vger.kernel.org List-ID: On Tue, Apr 10, 2012 at 05:56:20AM +0200, Eric Dumazet wrote: > > What wireless device are we dealing with again? > > Problem seems related to tailroom needed by mac80211 > (IEEE80211_ENCRYPT_TAILROOM = 18 bytes) > > So we must reallocate skb->head, thats impressive nobody cares. > > [ 3007.249687] ieee80211_skb_resize(skb=ffff8802329846e8) cloned=1 head_need=0 tail_need=18 skb->len=1494 ksize=4096 tailroom=0 headroom=2282 > [ 3007.249693] ieee80211_skb_resize(skb=ffff8802329846e8) cloned=0 head_need=0 tail_need=0 skb->len=1526 ksize=8192 tailroom=64 headroom=2250 > > Ouch... skb_tailroom() seems wrong ... it seems pskb_expand_head() is really suboptimal. > > It appears tcp_sendmsg() tries to fill skb completely, with no available tailroom : > > if (skb_tailroom(skb) > 0) { > /* We have some space in skb head. Superb! */ > if (copy > skb_tailroom(skb)) > copy = skb_tailroom(skb); > err = skb_add_data_nocache(sk, skb, from, copy); > if (err) > goto do_fault; > } else { > > Shouldnt we take into account dev->needed_tailroom ? > > I'll submit a pskb_expand_head() fix asap. Thanks for finding this. To answer an earlier question, I tried the non wireless case too. The problem is harder to reproduce over e1000e though, I just got two short hangs where my mouse cursor was hung for 5-10 seconds, but nothing in syslog/dmesg this time. I'm pretty sure this older log below did happen on e1000e with wireless disabled though (but it had a taint 'O'): If that helps, my earlier message had the traces below. I can report back when you have a patch you'd like me to try out. Thanks again, Marc > [28451.191115] WorkerPool/1248 D ffff88013bc93580 0 12483 3740 0x00000080 > [28451.191115] ffff8801189ba100 0000000000000082 0000000000000000 ffff880134f2e180 > [28451.191115] 0000000000013580 ffff88001614bfd8 ffff88001614bfd8 ffff8801189ba100 > [28451.191115] ffffffff811b4b62 000000010164525a 0000000000000046 ffffffff8165a250 > [28451.191115] Call Trace: > [28451.191115] [] ? sha_transform+0x395/0x1209 > [28451.191115] [] ? __mutex_lock_common.isra.6+0x13d/0x219 > [28451.191115] [] ? extract_buf+0x86/0xf2 > [28451.191115] [] ? mutex_lock+0xf/0x1f > [28451.191115] [] ? rtnetlink_rcv+0xe/0x28 > [28451.191115] [] ? netlink_unicast+0xe6/0x14e > [28451.191115] [] ? netlink_sendmsg+0x1fc/0x237 > [28451.191115] [] ? sock_sendmsg+0xc1/0xde > [28451.191115] [] ? __cache_free.isra.40+0x19/0x1a7 > [28451.191115] [] ? nl_pid_hash_rehash+0xc8/0xef > [28451.191115] [] ? get_parent_ip+0x9/0x1b > [28451.191115] [] ? get_parent_ip+0x9/0x1b > [28451.191115] [] ? sub_preempt_count+0x83/0x94 > [28451.191115] [] ? fget_light+0x85/0x8d > [28451.191115] [] ? sys_sendto+0xf7/0x137 > [28451.191115] [] ? get_parent_ip+0x9/0x1b > [28451.191115] [] ? sub_preempt_count+0x83/0x94 > [28451.191115] [] ? _raw_spin_unlock+0x24/0x30 > [28451.191115] [] ? audit_syscall_entry+0x105/0x130 > [28451.191115] [] ? system_call_fastpath+0x16/0x1b > > > > Below are lines I got in syslog during the copy. > Highlight is: > [ 4437.367046] kworker/1:1: page allocation failure: order:1, mode:0x20 > and then: > [ 8640.516177] INFO: task flush-0:37:7122 blocked for more than 120 seconds. > and then 120,000 lines(!) of: > [ 9654.042164] ieee80211 phy0: failed to reallocate TX buffer > > unedited lines below. > > So, any idea of what I can try next? > > Thanks, > Marc > > > [ 4437.367046] kworker/1:1: page allocation failure: order:1, mode:0x20 > [ 4437.367053] Pid: 8067, comm: kworker/1:1 Tainted: G O 3.2.8-amd64-volpreempt-noide-20120208 #1 > [ 4437.367056] Call Trace: > [ 4437.367058] [] ? warn_alloc_failed+0x11f/0x132 > [ 4437.367074] [] ? __alloc_pages_nodemask+0x6b1/0x72f > [ 4437.367081] [] ? kmem_getpages+0x4c/0xd9 > [ 4437.367086] [] ? kmem_getpages+0x4c/0xd9 > [ 4437.367090] [] ? fallback_alloc+0x123/0x1c2 > [ 4437.367096] [] ? pskb_expand_head+0xe0/0x24a > [ 4437.367101] [] ? __kmalloc+0xb2/0x10a > [ 4437.367105] [] ? pskb_expand_head+0xe0/0x24a > [ 4437.367139] [] ? ieee80211_skb_resize+0x64/0x9d [mac80211] > [ 4437.367154] [] ? ieee80211_subif_start_xmit+0x705/0x883 [mac80211] > [ 4437.367175] [] ? dev_hard_start_xmit+0x40b/0x552 > [ 4437.367179] [] ? sch_direct_xmit+0x63/0x13a > [ 4437.367182] [] ? dev_queue_xmit+0x2e0/0x4b5 > [ 4437.367185] [] ? ip_finish_output2+0x1c7/0x218 > [ 4437.367188] [] ? __ip_flush_pending_frames.isra.29+0x69/0x69 > [ 4437.367191] [] ? ip_queue_xmit+0x2cd/0x30d > [ 4437.367195] [] ? getnstimeofday+0x4a/0x7b > [ 4437.367198] [] ? tcp_transmit_skb+0x6d7/0x70a > [ 4437.367201] [] ? tcp_write_xmit+0x698/0x7a1 > [ 4437.367204] [] ? tcp_ack+0x14e3/0x1658 > [ 4437.367207] [] ? tcp_established_options+0x2b/0x9e > [ 4437.367210] [] ? __tcp_push_pending_frames+0x18/0x44 > [ 4437.367213] [] ? tcp_data_snd_check+0x2c/0xfd > [ 4437.367216] [] ? tcp_rcv_established+0x4f0/0x549 > [ 4437.367220] [] ? select_task_rq_fair+0x67b/0x690 > [ 4437.367223] [] ? tcp_v4_do_rcv+0x166/0x323 > [ 4437.367226] [] ? tcp_v4_rcv+0x404/0x65d > [ 4437.367230] [] ? ip_local_deliver_finish+0x148/0x1ba > [ 4437.367233] [] ? __netif_receive_skb+0x3f2/0x43f > [ 4437.367236] [] ? netif_receive_skb+0x7e/0x84 > [ 4437.367239] [] ? napi_gro_receive+0x1c/0x29 > [ 4437.367241] [] ? napi_skb_finish+0x1c/0x31 > [ 4437.367253] [] ? e1000_clean_rx_irq+0x1f3/0x290 [e1000e] > [ 4437.367261] [] ? e1000_clean+0x69/0x208 [e1000e] > [ 4437.367264] [] ? net_rx_action+0xa4/0x1c0 > [ 4437.367268] [] ? __do_softirq+0xc0/0x188 > [ 4437.367272] [] ? call_softirq+0x1c/0x30 > [ 4437.367276] [] ? do_softirq+0x3c/0x7b > [ 4437.367278] [] ? irq_exit+0x3d/0xa7 > [ 4437.367281] [] ? do_IRQ+0x81/0x97 > [ 4437.367285] [] ? common_interrupt+0x6e/0x6e > [ 4437.367287] [] ? dec128+0x434/0x80c [aes_x86_64] > [ 4437.367307] [] ? crypt+0xae/0x101 [xts] > [ 4437.367313] [] ? aes_decrypt+0xe/0xe [aes_x86_64] > [ 4437.367320] [] ? dec128+0x80c/0x80c [aes_x86_64] > [ 4437.367327] [] ? decrypt+0x3f/0x44 [xts] > [ 4437.367331] [] ? async_decrypt+0x37/0x3c > [ 4437.367338] [] ? crypt_convert+0x22f/0x2c4 [dm_crypt] > [ 4437.367342] [] ? load_TLS+0x7/0xa > [ 4437.367348] [] ? kcryptd_crypt+0x56/0x342 [dm_crypt] > [ 4437.367352] [] ? finish_task_switch+0x86/0xb7 > [ 4437.367355] [] ? get_parent_ip+0x9/0x1b > [ 4437.367358] [] ? sub_preempt_count+0x83/0x94 > [ 4437.367361] [] ? need_resched+0x1a/0x23 > [ 4437.367368] [] ? crypt_convert_init.isra.14+0x4f/0x4f [dm_crypt] > [ 4437.367372] [] ? process_one_work+0x16d/0x298 > [ 4437.367375] [] ? worker_thread+0xc2/0x145 > [ 4437.367378] [] ? manage_workers.isra.23+0x15b/0x15b > [ 4437.367381] [] ? kthread+0x76/0x7e > [ 4437.367384] [] ? kernel_thread_helper+0x4/0x10 > [ 4437.367387] [] ? kthread_worker_fn+0x139/0x139 > [ 4437.367390] [] ? gs_change+0x13/0x13 > [ 4437.367392] Mem-Info: > [ 4437.367393] Node 0 DMA per-cpu: > [ 4437.367396] CPU 0: hi: 0, btch: 1 usd: 0 > [ 4437.367397] CPU 1: hi: 0, btch: 1 usd: 0 > [ 4437.367399] Node 0 DMA32 per-cpu: > [ 4437.367401] CPU 0: hi: 186, btch: 31 usd: 164 > [ 4437.367403] CPU 1: hi: 186, btch: 31 usd: 111 > [ 4437.367405] Node 0 Normal per-cpu: > [ 4437.367407] CPU 0: hi: 186, btch: 31 usd: 114 > [ 4437.367409] CPU 1: hi: 186, btch: 31 usd: 158 > [ 4437.367413] active_anon:391300 inactive_anon:132951 isolated_anon:0 > [ 4437.367414] active_file:136666 inactive_file:140710 isolated_file:31 > [ 4437.367415] unevictable:1 dirty:3402 writeback:26688 unstable:7844 > [ 4437.367416] free:36509 slab_reclaimable:85289 slab_unreclaimable:35524 > [ 4437.367417] mapped:18088 shmem:35934 pagetables:9300 bounce:0 > [ 4437.367419] Node 0 DMA free:15712kB min:260kB low:324kB high:388kB active_anon:0kB inactive_anon:0kB active_file:0kB inactive_file:36kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:15684kB mlocked:0kB dirty:0kB writeback:36kB mapped:0kB shmem:0kB slab_reclaimable:0kB slab_unreclaimable:160kB kernel_stack:0kB pagetables:0kB unstable:0kB bounce:0kB writeback_tmp:0kB pages_scanned:40833 all_unreclaimable? yes > [ 4437.367428] lowmem_reserve[]: 0 2960 3907 3907 > [ 4437.367432] Node 0 DMA32 free:110732kB min:51004kB low:63752kB high:76504kB active_anon:1380396kB inactive_anon:345140kB active_file:422008kB inactive_file:437440kB unevictable:4kB isolated(anon):0kB isolated(file):124kB present:3031688kB mlocked:4kB dirty:7148kB writeback:72004kB mapped:39424kB shmem:64836kB slab_reclaimable:212408kB slab_unreclaimable:80516kB kernel_stack:1720kB pagetables:19252kB unstable:23964kB bounce:0kB writeback_tmp:0kB pages_scanned:63 all_unreclaimable? no > [ 4437.367442] lowmem_reserve[]: 0 0 946 946 > [ 4437.367445] Node 0 Normal free:19592kB min:16312kB low:20388kB high:24468kB active_anon:184804kB inactive_anon:186664kB active_file:124656kB inactive_file:125364kB unevictable:0kB isolated(anon):0kB isolated(file):0kB present:969600kB mlocked:0kB dirty:6460kB writeback:34712kB mapped:32928kB shmem:78900kB slab_reclaimable:128748kB slab_unreclaimable:61420kB kernel_stack:2792kB pagetables:17948kB unstable:7412kB bounce:0kB writeback_tmp:0kB pages_scanned:89 all_unreclaimable? no > [ 4437.367455] lowmem_reserve[]: 0 0 0 0 > [ 4437.367458] Node 0 DMA: 2*4kB 1*8kB 1*16kB 0*32kB 1*64kB 2*128kB 2*256kB 1*512kB 2*1024kB 2*2048kB 2*4096kB = 15712kB > [ 4437.367467] Node 0 DMA32: 25961*4kB 73*8kB 8*16kB 1*32kB 0*64kB 0*128kB 0*256kB 0*512kB 0*1024kB 1*2048kB 1*4096kB = 110732kB > [ 4437.367475] Node 0 Normal: 4134*4kB 0*8kB 1*16kB 1*32kB 0*64kB 2*128kB 1*256kB 1*512kB 0*1024kB 1*2048kB 0*4096kB = 19656kB > [ 4437.367484] 317456 total pagecache pages > [ 4437.367485] 4042 pages in swap cache > [ 4437.367487] Swap cache stats: add 31786, delete 27744, find 10282/11070 > [ 4437.367489] Free swap = 4012560kB > [ 4437.367490] Total swap = 4106248kB > [ 4437.370978] 1032176 pages RAM > [ 4437.370978] 42834 pages reserved > [ 4437.370978] 390787 pages shared > [ 4437.370978] 750687 pages non-shared > > > [ 8640.516177] INFO: task flush-0:37:7122 blocked for more than 120 seconds. > [ 8640.516182] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 8640.516186] flush-0:37 D ffff88013bc93580 0 7122 2 0x00000080 > [ 8640.516192] ffff880072c28810 0000000000000046 ffff880100000000 ffff880134f2e180 > [ 8640.516199] 0000000000013580 ffff88006d491fd8 ffff88006d491fd8 ffff880072c28810 > [ 8640.516205] ffff88013bfd1c50 000000018134b58b ffff88010c3cc1b0 ffff88006d491d18 > [ 8640.516211] Call Trace: > [ 8640.516221] [] ? inode_owner_or_capable+0x36/0x36 > [ 8640.516226] [] ? inode_wait+0x6/0xa > [ 8640.516232] [] ? __wait_on_bit+0x3e/0x71 > [ 8640.516241] [] ? get_parent_ip+0x9/0x1b > [ 8640.516245] [] ? inode_wait_for_writeback+0xa2/0xc8 > [ 8640.516249] [] ? autoremove_wake_function+0x2a/0x2a > [ 8640.516252] [] ? wb_writeback+0x226/0x255 > [ 8640.516255] [] ? add_preempt_count+0x9a/0x9c > [ 8640.516258] [] ? wb_do_writeback+0x150/0x1b2 > [ 8640.516261] [] ? bdi_writeback_thread+0x8f/0x204 > [ 8640.516264] [] ? wb_do_writeback+0x1b2/0x1b2 > [ 8640.516266] [] ? kthread+0x76/0x7e > [ 8640.516270] [] ? kernel_thread_helper+0x4/0x10 > [ 8640.516273] [] ? kthread_worker_fn+0x139/0x139 > [ 8640.516275] [] ? gs_change+0x13/0x13 > [ 8640.516281] INFO: task cp:7568 blocked for more than 120 seconds. > [ 8640.516283] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. > [ 8640.516284] cp D ffff88013bc13580 0 7568 6744 0x00000080 > [ 8640.516288] ffff880123976750 0000000000000082 0000000000000000 ffffffff8160d020 > [ 8640.516292] 0000000000013580 ffff88001b3a9fd8 ffff88001b3a9fd8 ffff880123976750 > [ 8640.516295] 0000000000000001 0000000181066767 ffff880131463e50 ffff88013bc13e08 > [ 8640.516299] Call Trace: > [ 8640.516303] [] ? __lock_page+0x66/0x66 > [ 8640.516306] [] ? io_schedule+0x58/0x6f > [ 8640.516308] [] ? sleep_on_page+0x6/0xa > [ 8640.516311] [] ? __wait_on_bit+0x3e/0x71 > [ 8640.516313] [] ? wait_on_page_bit+0x6e/0x73 > [ 8640.516316] [] ? autoremove_wake_function+0x2a/0x2a > [ 8640.516319] [] ? filemap_fdatawait_range+0x74/0x139 > [ 8640.516327] [] ? writeback_single_inode+0x155/0x2f4 > [ 8640.516330] [] ? sync_inode+0x4a/0x6f > [ 8640.516343] [] ? nfs_wb_all+0x39/0x3e [nfs] > [ 8640.516351] [] ? nfs_setattr+0x8e/0xf6 [nfs] > [ 8640.516354] [] ? notify_change+0x177/0x24f > [ 8640.516357] [] ? utimes_common+0x10c/0x135 > [ 8640.516361] [] ? fget+0x50/0x57 > [ 8640.516364] [] ? do_utimes+0x8a/0xd6 > [ 8640.516367] [] ? vfs_read+0x9f/0xe6 > [ 8640.516369] [] ? sys_utimensat+0x64/0x6b > [ 8640.516372] [] ? system_call_fastpath+0x16/0x1b > > > [ 9654.042164] ieee80211 phy0: failed to reallocate TX buffer > [ 9654.042189] ieee80211 phy0: failed to reallocate TX buffer > (120,000 lines of this) -- "A mouse is a device used to point at the xterm you want to type in" - A.S.R. Microsoft is to operating systems .... .... what McDonalds is to gourmet cooking Home page: http://marc.merlins.org/