Subject: Re: [PATCH net v2] tipc: call tipc_rcv() only if bearer is up in tipc_udp_recv()
From: Ying Xue
To: Tommi Rantala, Jon Maloy, "David S. Miller"
Date: Thu, 30 Nov 2017 18:57:22 +0800
Message-ID: <213f64e0-d5c3-511f-b83a-c468c399abc1@windriver.com>
In-Reply-To: <20171129104842.30781-1-tommi.t.rantala@nokia.com>
References: <20171129104842.30781-1-tommi.t.rantala@nokia.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/29/2017 06:48 PM, Tommi Rantala wrote:
> Remove the second tipc_rcv() call in tipc_udp_recv(). We have just
> checked that the bearer is not up, and calling tipc_rcv() with a bearer
> that is not up leads to a TIPC div-by-zero crash in
> tipc_node_calculate_timer(). The crash is rare in practice, but can
> happen like this:

In my opinion, the real root cause of the issue is that we assign a
not-yet-initialized bearer instance to ub->bearer too early, through
rcu_assign_pointer(ub->bearer, b) in tipc_udp_enable(). If we instead
assign the bearer pointer at the end of tipc_udp_enable(), once the
bearer has been fully initialized, the issue would be avoided.

Thanks,
Ying

>
> We're enabling a bearer, but it's not yet up and fully initialized.
> At the same time we receive a discovery packet, and in tipc_udp_recv()
> we end up calling tipc_rcv() with the not-yet-initialized bearer,
> causing later the div-by-zero crash in tipc_node_calculate_timer().
>
> Jon Maloy explains the impact of removing the second tipc_rcv() call:
> "link setup in the worst case will be delayed until the next arriving
> discovery messages, 1 sec later, and this is an acceptable delay."
>
> As the tipc_rcv() call is removed, just leave the function via the
> rcu_out label, so that we will kfree_skb().
>
> [ 12.590450] Own node address <1.1.1>, network identity 1
> [ 12.668088] divide error: 0000 [#1] SMP
> [ 12.676952] CPU: 2 PID: 0 Comm: swapper/2 Not tainted 4.14.2-dirty #1
> [ 12.679225] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.10.2-2.fc27 04/01/2014
> [ 12.682095] task: ffff8c2a761edb80 task.stack: ffffa41cc0cac000
> [ 12.684087] RIP: 0010:tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc]
> [ 12.686486] RSP: 0018:ffff8c2a7fc838a0 EFLAGS: 00010246
> [ 12.688451] RAX: 0000000000000000 RBX: ffff8c2a5b382600 RCX: 0000000000000000
> [ 12.691197] RDX: 0000000000000000 RSI: ffff8c2a5b382600 RDI: ffff8c2a5b382600
> [ 12.693945] RBP: ffff8c2a7fc838b0 R08: 0000000000000001 R09: 0000000000000001
> [ 12.696632] R10: 0000000000000000 R11: 0000000000000000 R12: ffff8c2a5d8949d8
> [ 12.699491] R13: ffffffff95ede400 R14: 0000000000000000 R15: ffff8c2a5d894800
> [ 12.702338] FS: 0000000000000000(0000) GS:ffff8c2a7fc80000(0000) knlGS:0000000000000000
> [ 12.705099] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 12.706776] CR2: 0000000001bb9440 CR3: 00000000bd009001 CR4: 00000000003606e0
> [ 12.708847] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [ 12.711016] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [ 12.712627] Call Trace:
> [ 12.713390]
> [ 12.714011] tipc_node_check_dest+0x2e8/0x350 [tipc]
> [ 12.715286] tipc_disc_rcv+0x14d/0x1d0 [tipc]
> [ 12.716370] tipc_rcv+0x8b0/0xd40 [tipc]
> [ 12.717396] ? minmax_running_min+0x2f/0x60
> [ 12.718248] ? dst_alloc+0x4c/0xa0
> [ 12.718964] ? tcp_ack+0xaf1/0x10b0
> [ 12.719658] ? tipc_udp_is_known_peer+0xa0/0xa0 [tipc]
> [ 12.720634] tipc_udp_recv+0x71/0x1d0 [tipc]
> [ 12.721459] ? dst_alloc+0x4c/0xa0
> [ 12.722130] udp_queue_rcv_skb+0x264/0x490
> [ 12.722924] __udp4_lib_rcv+0x21e/0x990
> [ 12.723670] ? ip_route_input_rcu+0x2dd/0xbf0
> [ 12.724442] ? tcp_v4_rcv+0x958/0xa40
> [ 12.725039] udp_rcv+0x1a/0x20
> [ 12.725587] ip_local_deliver_finish+0x97/0x1d0
> [ 12.726323] ip_local_deliver+0xaf/0xc0
> [ 12.726959] ? ip_route_input_noref+0x19/0x20
> [ 12.727689] ip_rcv_finish+0xdd/0x3b0
> [ 12.728307] ip_rcv+0x2ac/0x360
> [ 12.728839] __netif_receive_skb_core+0x6fb/0xa90
> [ 12.729580] ? udp4_gro_receive+0x1a7/0x2c0
> [ 12.730274] __netif_receive_skb+0x1d/0x60
> [ 12.730953] ? __netif_receive_skb+0x1d/0x60
> [ 12.731637] netif_receive_skb_internal+0x37/0xd0
> [ 12.732371] napi_gro_receive+0xc7/0xf0
> [ 12.732920] receive_buf+0x3c3/0xd40
> [ 12.733441] virtnet_poll+0xb1/0x250
> [ 12.733944] net_rx_action+0x23e/0x370
> [ 12.734476] __do_softirq+0xc5/0x2f8
> [ 12.734922] irq_exit+0xfa/0x100
> [ 12.735315] do_IRQ+0x4f/0xd0
> [ 12.735680] common_interrupt+0xa2/0xa2
> [ 12.736126]
> [ 12.736416] RIP: 0010:native_safe_halt+0x6/0x10
> [ 12.736925] RSP: 0018:ffffa41cc0cafe90 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff4d
> [ 12.737756] RAX: 0000000000000000 RBX: ffff8c2a761edb80 RCX: 0000000000000000
> [ 12.738504] RDX: 0000000000000000 RSI: 0000000000000000 RDI: 0000000000000000
> [ 12.739258] RBP: ffffa41cc0cafe90 R08: 0000014b5b9795e5 R09: ffffa41cc12c7e88
> [ 12.740118] R10: 0000000000000000 R11: 0000000000000000 R12: 0000000000000002
> [ 12.740964] R13: ffff8c2a761edb80 R14: 0000000000000000 R15: 0000000000000000
> [ 12.741831] default_idle+0x2a/0x100
> [ 12.742323] arch_cpu_idle+0xf/0x20
> [ 12.742796] default_idle_call+0x28/0x40
> [ 12.743312] do_idle+0x179/0x1f0
> [ 12.743761] cpu_startup_entry+0x1d/0x20
> [ 12.744291] start_secondary+0x112/0x120
> [ 12.744816] secondary_startup_64+0xa5/0xa5
> [ 12.745367] Code: b9 f4 01 00 00 48 89 c2 48 c1 ea 02 48 3d d3 07 00
> 00 48 0f 47 d1 49 8b 0c 24 48 39 d1 76 07 49 89 14 24 48 89 d1 31 d2 48
> 89 df <48> f7 f1 89 c6 e8 81 6e ff ff 5b 41 5c 5d c3 66 90 66 2e 0f 1f
> [ 12.747527] RIP: tipc_node_calculate_timer.isra.12+0x45/0x60 [tipc] RSP: ffff8c2a7fc838a0
> [ 12.748555] ---[ end trace 1399ab83390650fd ]---
> [ 12.749296] Kernel panic - not syncing: Fatal exception in interrupt
> [ 12.750123] Kernel Offset: 0x13200000 from 0xffffffff82000000
> (relocation range: 0xffffffff80000000-0xffffffffbfffffff)
> [ 12.751215] Rebooting in 60 seconds..
>
> Fixes: c9b64d492b1f ("tipc: add replicast peer discovery")
> Signed-off-by: Tommi Rantala
> Cc: Jon Maloy
> ---
>
> v2: Resorted to a minimal fix, as per feedback from David M.
>
> net/tipc/udp_media.c | 4 ----
> 1 file changed, 4 deletions(-)
>
> diff --git a/net/tipc/udp_media.c b/net/tipc/udp_media.c
> index ecca64fc6a6f..3deabcab4882 100644
> --- a/net/tipc/udp_media.c
> +++ b/net/tipc/udp_media.c
> @@ -371,10 +371,6 @@ static int tipc_udp_recv(struct sock *sk, struct sk_buff *skb)
>  		goto rcu_out;
>  	}
>  
> -	tipc_rcv(sock_net(sk), skb, b);
> -	rcu_read_unlock();
> -	return 0;
> -
>  rcu_out:
>  	rcu_read_unlock();
>  out:
>
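
The ordering suggested in the reply (initialize the bearer fully, then publish the pointer as the last step) can be sketched in user space roughly as below. This is only an illustration, not the kernel code: struct bearer, struct udp_bearer, recv_path() and bearer_enable() are hypothetical stand-ins, C11 release/acquire atomics stand in for rcu_assign_pointer()/rcu_dereference(), and the tolerance arithmetic is an assumed simplification of a timer calculation whose divisor is zero for an uninitialized bearer.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stddef.h>

/* Hypothetical stand-in for a bearer; tolerance is 0 until initialized. */
struct bearer {
	unsigned long tolerance;
};

/* Hypothetical stand-in for the object holding the published pointer. */
struct udp_bearer {
	_Atomic(struct bearer *) bearer;
};

/*
 * Receive path: drop the packet (-1) while no bearer is published,
 * otherwise derive a value by dividing by an interval computed from
 * the bearer's tolerance (assumed, simplified arithmetic).
 */
static long recv_path(struct udp_bearer *ub)
{
	struct bearer *b = atomic_load_explicit(&ub->bearer,
						memory_order_acquire);
	unsigned long intv;

	if (b == NULL)
		return -1;

	intv = b->tolerance / 4;
	if (intv > 500)
		intv = 500;

	/* Holds only because the pointer is published after init. */
	assert(intv != 0);
	return (long)(b->tolerance / intv);
}

/* Enable path: finish all initialization, then publish as the last step. */
static void bearer_enable(struct udp_bearer *ub, struct bearer *b)
{
	b->tolerance = 1500;	/* illustrative value */
	atomic_store_explicit(&ub->bearer, b, memory_order_release);
}
```

With this ordering, a packet that races with bearer_enable() either sees NULL and is dropped, or sees a bearer whose fields are already set, so the division never sees a zero divisor. If the store were moved above the tolerance assignment, a concurrent recv_path() could observe tolerance == 0 and trip the assertion, which mirrors the divide error in the trace.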