Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1030679Ab3DSOgN (ORCPT ); Fri, 19 Apr 2013 10:36:13 -0400 Received: from mail-pd0-f170.google.com ([209.85.192.170]:53415 "EHLO mail-pd0-f170.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1030552Ab3DSOgK (ORCPT ); Fri, 19 Apr 2013 10:36:10 -0400 Message-ID: <1366382166.3205.112.camel@edumazet-glaptop> Subject: Re: trinity locking troubles with ax25_recvmsg() From: Eric Dumazet To: Tommi Rantala Cc: netdev@vger.kernel.org, Joerg Reuter , Ralf Baechle , "David S. Miller" , LKML , Dave Jones Date: Fri, 19 Apr 2013 07:36:06 -0700 In-Reply-To: References: Content-Type: text/plain; charset="UTF-8" X-Mailer: Evolution 3.2.3-0ubuntu6 Content-Transfer-Encoding: 7bit Mime-Version: 1.0 Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 9033 Lines: 171 On Fri, 2013-04-19 at 14:30 +0300, Tommi Rantala wrote: > Hello, > > Noticed a huge load in one of my trinity test machines: > 14:07:10 up 6:11, 1 user, load average: 39,00, 39,01, 38,95 > > But it is just 39x trinity child processes sitting there for couple of > hours in D state, plus "trinity-child38" in S state: > > USER PID %CPU %MEM VSZ RSS TTY STAT START TIME COMMAND > ... > root 3716 0.0 0.0 186620 3284 ? Sl 07:57 0:00 polkitd > ttrantal 5576 0.0 0.1 22728 9124 ? DNL 12:00 0:00 trinity-child33 > ttrantal 5592 0.0 0.1 24636 8968 ? DNL 12:00 0:00 trinity-child14 > ttrantal 5593 0.0 0.1 26164 10684 ? DNL 12:00 0:00 trinity-child25 > ttrantal 5597 0.0 0.1 26620 10804 ? DNL 12:00 0:00 trinity-child4 > ttrantal 5599 0.0 0.1 31044 9268 ? DNL 12:00 0:00 trinity-child17 > ttrantal 5613 0.0 0.1 38804 8716 ? DNL 12:00 0:00 trinity-child24 > ttrantal 5621 0.0 0.2 38536 20528 ? DNL 12:00 0:00 trinity-child27 > ttrantal 5632 0.0 0.1 37508 11596 ? SNL 12:00 0:00 trinity-child38 > ttrantal 5663 0.0 0.2 31304 17656 ? DNL 12:01 0:00 trinity-child35 > ttrantal 5666 0.0 0.1 43656 13880 ? DNL 12:01 0:01 trinity-child2 > ttrantal 5689 0.0 0.3 54872 28748 ? DNL 12:01 0:00 trinity-child9 > ttrantal 5700 0.0 0.1 36600 10528 ? DN 12:01 0:00 trinity-child23 > ... > > [20091.121189] trinity-child38 S 0000000000000000 0 5632 1 0x00000000 > [20091.121189] ffff8802036adab8 0000000000000046 ffff8802036adfd8 > ffff88020354a3e0 > [20091.121189] ffff8802036adfd8 ffff8802036adfd8 ffff88022afe47c0 > ffff88020354a3e0 > [20091.121189] ffff88020354aab8 ffff8802036adc64 ffff88021b154b80 > ffff88021b154a40 > [20091.121189] Call Trace: > [20091.121189] [] schedule+0x55/0x60 > [20091.121189] [] schedule_timeout+0x36/0x310 > [20091.121189] [] ? trace_hardirqs_on_caller+0x155/0x1f0 > [20091.121189] [] ? trace_hardirqs_on+0xd/0x10 > [20091.121189] [] ? prepare_to_wait_exclusive+0x77/0x90 > [20091.121189] [] __skb_recv_datagram+0x2e3/0x3d0 > [20091.121189] [] ? lock_sock_nested+0x8d/0xa0 > [20091.121189] [] ? > csum_partial_copy_fromiovecend+0x220/0x220 > [20091.121189] [] skb_recv_datagram+0x2d/0x30 > [20091.121189] [] ax25_recvmsg+0x78/0x260 > [20091.121189] [] sock_aio_read+0x132/0x170 > [20091.121189] [] do_sync_read+0x98/0xf0 > [20091.121189] [] vfs_read+0xbd/0x180 > [20091.121189] [] sys_read+0x50/0xa0 > [20091.121189] [] system_call_fastpath+0x16/0x1b > [20091.121189] trinity-child35 D 0000000000000000 0 5663 1 0x00000004 > [20091.121189] ffff880210a2fe68 0000000000000046 ffff880210a2ffd8 > ffff88022afe0000 > [20091.121189] ffff880210a2ffd8 ffff880210a2ffd8 ffff88022aa823e0 > ffff88022afe0000 > [20091.121189] ffff880210a2fe58 ffff88021b154a90 ffff88021b154a40 > ffff88021b154ae0 > [20091.121189] Call Trace: > [20091.121189] [] ? _stext+0xe48/0xe48 > [20091.121189] [] schedule+0x55/0x60 > [20091.121189] [] __lock_sock+0x6e/0xa0 > [20091.121189] [] ? abort_exclusive_wait+0xb0/0xb0 > [20091.121189] [] ? lock_sock_nested+0x3e/0xa0 > [20091.121189] [] lock_sock_nested+0x52/0xa0 > [20091.121189] [] ? _stext+0xe48/0xe48 > [20091.121189] [] ax25_listen+0x2b/0x70 > [20091.121189] [] sys_listen+0x57/0x80 > [20091.121189] [] system_call_fastpath+0x16/0x1b > [20091.121189] [] ? _stext+0xe48/0xe48 > [20091.121189] trinity-child15 D 0000000000000000 0 5781 1 0x00000004 > [20091.121189] ffff8801f992fd58 0000000000000046 ffff8801f992ffd8 > ffff8801f9c723e0 > [20091.121189] ffff8801f992ffd8 ffff8801f992ffd8 ffff88020354c7c0 > ffff8801f9c723e0 > [20091.121189] ffff8801f992fd48 ffff88021b154a90 ffff88021b154a40 > ffff88021b154ae0 > [20091.121189] Call Trace: > [20091.121189] [] schedule+0x55/0x60 > [20091.121189] [] __lock_sock+0x6e/0xa0 > [20091.121189] [] ? abort_exclusive_wait+0xb0/0xb0 > [20091.121189] [] ? lock_sock_nested+0x3e/0xa0 > [20091.121189] [] lock_sock_nested+0x52/0xa0 > [20091.121189] [] ? get_empty_filp+0x13b/0x1e0 > [20091.121189] [] ax25_accept+0x78/0x280 > [20091.121189] [] ? abort_exclusive_wait+0xb0/0xb0 > [20091.121189] [] sys_accept4+0x128/0x200 > [20091.121189] [] ? trace_hardirqs_on+0xd/0x10 > [20091.121189] [] ? _raw_spin_unlock_irq+0x2b/0x50 > [20091.121189] [] ? sysret_check+0x22/0x5d > [20091.121189] [] ? trace_hardirqs_on_caller+0x155/0x1f0 > [20091.121189] [] ? trace_hardirqs_on_thunk+0x3a/0x3f > [20091.121189] [] sys_accept+0xb/0x10 > [20091.121189] [] system_call_fastpath+0x16/0x1b > [20091.121189] trinity-child37 D 0000000000000000 0 5788 1 0x00000004 > [20091.121189] ffff8801fccabb78 0000000000000046 ffff8801fccabfd8 > ffff88022aa70000 > [20091.121189] ffff8801fccabfd8 ffff8801fccabfd8 ffff8801f9840000 > ffff88022aa70000 > [20091.121189] ffff8801fccabb68 ffff88021b154a90 ffff88021b154a40 > ffff88021b154ae0 > [20091.121189] Call Trace: > [20091.121189] [] schedule+0x55/0x60 > [20091.121189] [] __lock_sock+0x6e/0xa0 > [20091.121189] [] ? abort_exclusive_wait+0xb0/0xb0 > [20091.121189] [] ? lock_sock_nested+0x3e/0xa0 > [20091.121189] [] lock_sock_nested+0x52/0xa0 > [20091.121189] [] ? sock_has_perm+0x180/0x1e0 > [20091.121189] [] ? selinux_setprocattr+0x790/0x790 > [20091.121189] [] ax25_recvmsg+0x3d/0x260 > [20091.121189] [] sock_recvmsg+0xbc/0xf0 > [20091.121189] [] ? native_sched_clock+0x35/0x80 > [20091.121189] [] ? sched_clock+0x9/0x10 > [20091.121189] [] sys_recvfrom+0xde/0x150 > [20091.121189] [] ? trace_hardirqs_on+0xd/0x10 > [20091.121189] [] ? _raw_spin_unlock_irq+0x2b/0x50 > [20091.121189] [] ? sysret_check+0x22/0x5d > [20091.121189] [] ? trace_hardirqs_on_caller+0x155/0x1f0 > [20091.121189] [] ? trace_hardirqs_on_thunk+0x3a/0x3f > [20091.121189] [] system_call_fastpath+0x16/0x1b > [...] > > > [20091.121189] Showing all locks held in the system: > [20091.121189] 1 lock held by getty/3568: > [20091.121189] #0: (&ldata->atomic_read_lock){+.+...}, at: > [] n_tty_read+0x2b8/0x980 > [20091.121189] 1 lock held by getty/3574: > [20091.121189] #0: (&ldata->atomic_read_lock){+.+...}, at: > [] n_tty_read+0x2b8/0x980 > [20091.121189] 1 lock held by getty/3581: > [20091.121189] #0: (&ldata->atomic_read_lock){+.+...}, at: > [] n_tty_read+0x2b8/0x980 > [20091.121189] 1 lock held by getty/3585: > [20091.121189] #0: (&ldata->atomic_read_lock){+.+...}, at: > [] n_tty_read+0x2b8/0x980 > [20091.121189] 1 lock held by getty/3588: > [20091.121189] #0: (&ldata->atomic_read_lock){+.+...}, at: > [] n_tty_read+0x2b8/0x980 > [20091.121189] 1 lock held by getty/3637: > [20091.121189] #0: (&ldata->atomic_read_lock){+.+...}, at: > [] n_tty_read+0x2b8/0x980 > [20091.121189] 1 lock held by trinity-child38/5632: > [20091.121189] #0: (sk_lock-AF_AX25){+.+.+.}, at: > [] ax25_recvmsg+0x3d/0x260 > [20091.121189] 2 locks held by bash/6823: > [20091.121189] #0: (sysrq_key_table_lock){......}, at: > [] __handle_sysrq+0x2e/0x1b0 > [20091.121189] #1: (tasklist_lock){.+.+..}, at: [] > debug_show_all_locks+0x5c/0x260 > [20091.121189] > > > Looking at e.g. net/ax25/af_ax25.c and net/x25/af_x25.c side-by-side, > one glaring difference is that in x25 we release_sock() just before > skb_recv_datagram(), but not in ax25. This code is probably not used anymore, and if used, its used for a few sockets. There is no need trying to make it faster. Lets hope there is no bug in x25 because of the release_sock(). -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/