From: Jason Xing
Date: Sat, 8 May 2021 11:34:18 +0800
Subject: Re: soft lockup in __inet_lookup_established() function
To: David Miller, Hideaki YOSHIFUJI, dsahern@kernel.org, kuba@kernel.org
Cc: netdev, LKML, liweishi
X-Mailing-List: linux-kernel@vger.kernel.org

Does anyone have any suggestions? This has been haunting me for a while.

thanks,
jason

On Thu, Apr 29, 2021 at 8:15 PM Jason Xing wrote:
>
> Hello,
>
> I've run into a serious issue that causes an infinite loop in
> __inet_lookup_established() until I reboot the machine manually. It
> happens randomly among thousands of machines running the 4.19 kernel.
> Once the soft lockup is triggered, I can no longer ping or ssh to the
> machine, whatever I try, until it is rebooted.
>
> Does anyone have any clue on how to dig into this part of the code? I
> strongly suspect that it has something to do with corruption of the
> nulls_list, so that the lookup of sk can never break out of the loop
> over hashinfo.
>
> The call traces are all identical; one is attached below:
> [1048271.465028] watchdog: BUG: soft lockup - CPU#20 stuck for 22s! [swapper/20:0]
> [1048271.473669] Modules linked in: vxlan ip6_udp_tunnel udp_tunnel
> udp_diag tcp_diag inet_diag nf_conntrack_netlink nfnetlink
> br_netfilter bridge stp llc xt_statistic xt_nat ipt_MASQUERADE
> ipt_REJECT nf_reject_ipv4 xt_mark xt_addrtype xt_comment xt_conntrack
> ...
> [1048271.553597] RIP: 0010:__inet_lookup_established+0x5a/0x190
> ...
> [1048271.660309] Call Trace:
> [1048271.663135]
> [1048271.665432] tcp_v4_early_demux+0xaa/0x150
> [1048271.669812] ip_rcv_finish+0x171/0x410
> [1048271.673941] ip_rcv+0x273/0x362
> [1048271.677360] ? inet_add_protocol.cold.1+0x1e/0x1e
> [1048271.682354] __netif_receive_skb_core+0xac2/0xbb0
> [1048271.687351] ? inet_gro_receive+0x22a/0x2d0
> [1048271.692001] ? ktime_get_with_offset+0x4d/0xc0
> [1048271.696725] netif_receive_skb_internal+0x42/0xf0
> [1048271.701717] napi_gro_receive+0xba/0xe0
> [1048271.705839] receive_buf+0x165/0xa50 [virtio_net]
> [1048271.710839] ? receive_buf+0x165/0xa50 [virtio_net]
> [1048271.716053] ? vring_unmap_one+0x16/0x80
> [1048271.720308] ? detach_buf+0x69/0x110
> [1048271.724218] virtnet_poll+0xc0/0x2ea [virtio_net]
> [1048271.729202] net_rx_action+0x149/0x3b0
> [1048271.733234] __do_softirq+0xe3/0x30a
> [1048271.737095] irq_exit+0x100/0x110
> [1048271.740882] do_IRQ+0x85/0xd0
> [1048271.744143] common_interrupt+0xf/0xf
> [1048271.748104]
>
>
> thanks,
> jason
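
To make the nulls_list suspicion in the quoted report concrete, here is a
rough user-space sketch of how a "nulls"-terminated hash chain is walked.
It is a simplified model, not the 4.19 kernel code: struct node,
make_nulls(), lookup(), nulls_demo() and the max_steps guard are all
hypothetical names invented for this illustration. The idea it models is
that the end of a chain is an odd-valued marker encoding the bucket
number, and the walk only stops when it reaches such a marker. If a
corrupted next pointer turns the chain into a cycle, the marker is never
reached and an unguarded walk spins forever, which would be consistent
with the soft lockup described above.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Simplified model of one hash chain entry (hypothetical, not struct sock). */
struct node {
    struct node *next;   /* real pointer, or an odd-valued end marker */
    unsigned int key;
};

/* Build an end marker: low bit set, bucket number in the remaining bits. */
struct node *make_nulls(unsigned long slot)
{
    return (struct node *)(uintptr_t)((slot << 1) | 1UL);
}

int is_nulls(const struct node *p)
{
    return (int)((uintptr_t)p & 1UL);
}

unsigned long nulls_value(const struct node *p)
{
    return (unsigned long)((uintptr_t)p >> 1);
}

/*
 * Walk one bucket looking for key. The walk ends only at an end marker;
 * if the marker belongs to another bucket, the walk restarts.
 * max_steps is a demo-only guard (an assumption of this sketch) so that
 * a cyclic chain is reported through *aborted instead of hanging.
 */
struct node *lookup(struct node *head, unsigned long slot, unsigned int key,
                    unsigned long max_steps, int *aborted)
{
    struct node *p;
    unsigned long steps = 0;

    *aborted = 0;
restart:
    for (p = head; !is_nulls(p); p = p->next) {
        if (++steps > max_steps) {
            *aborted = 1;
            return NULL;
        }
        if (p->key == key)
            return p;
    }
    if (nulls_value(p) != slot) {
        /* Ended on a foreign bucket's marker: try again from the head. */
        if (++steps > max_steps) {
            *aborted = 1;
            return NULL;
        }
        goto restart;
    }
    return NULL;
}

/* Returns 0 if the healthy and the corrupted chain behave as described. */
int nulls_demo(void)
{
    struct node a = { NULL, 10 }, b = { NULL, 20 };
    int aborted;

    a.next = &b;
    b.next = make_nulls(3);          /* healthy chain: a -> b -> end(3) */
    if (lookup(&a, 3, 99, 1000, &aborted) != NULL || aborted)
        return 1;
    printf("healthy chain: miss, walk terminated\n");

    b.next = &a;                     /* simulated corruption: a -> b -> a ... */
    if (lookup(&a, 3, 99, 1000, &aborted) != NULL || !aborted)
        return 2;
    printf("cyclic chain: walk never reached the end marker\n");
    return 0;
}
```

Calling nulls_demo() from a small main() exercises both cases. Only a
lookup for a key that is absent from the chain gets stuck in the cycle;
a key that is present still returns normally, which may be one reason
such a fault shows up only occasionally.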