From: Eric Dumazet
Date: Thu, 27 Aug 2020 04:41:18 -0700
Subject: Re: RFC: inet_timewait_sock->tw_timer list corruption
To: Wang Long
Cc: netdev, David Miller, Alexey Kuznetsov, Hideaki YOSHIFUJI, Jakub Kicinski, Eric Dumazet, "Eric W. Biederman", opurdila@ixiacom.com, vegard.nossum@gmail.com, LKML
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu, Aug 27, 2020 at 4:03 AM Wang Long wrote:
>
> Hi,
>
> We encountered the following kernel panic:
>
> [4394470.273792] general protection fault: 0000 [#1] SMP NOPTI
> [4394470.274038] CPU: 0 PID: 0 Comm: swapper/0 Kdump: loaded Tainted: G W --------- - - 4.18.0-80.el8.x86_64 #1
> [4394470.274477] Hardware name: Sugon I620-G30/60P24-US, BIOS MJGS1223 04/07/2020
> [4394470.274727] RIP: 0010:run_timer_softirq+0x34e/0x440
> [4394470.274957] Code: 84 3f ff ff ff 49 8b 04 24 48 85 c0 74 58 49 8b 1c 24 48 89 5d 08 0f 1f 44 00 00 48 8b 03 48 8b 53 08 48 85 c0 48 89 02 74 04 <48> 89 50 08 f6 43 22 20 48 c7 43 08 00 00 00 00 48 89 ef 4c 89 2b
> [4394470.275505] RSP: 0018:ffff88f000803ee0 EFLAGS: 00010086
> [4394470.275783] RAX: dead000000000200 RBX: ffff88e5e33ea078 RCX: 0000000000000100
> [4394470.276087] RDX: ffff88f000803ee8 RSI: 0000000000000000 RDI: ffff88f00081aa00
> [4394470.276391] RBP: ffff88f00081aa00 R08: 0000000000000001 R09: 0000000000000000
> [4394470.276697] R10: ffff88e5e33eb1f0 R11: 0000000000000000 R12: ffff88f000803ee8
> [4394470.277030] R13: dead000000000200 R14: ffff88f000803ee0 R15: 0000000000000000
> [4394470.277350] FS: 0000000000000000(0000) GS:ffff88f000800000(0000) knlGS:0000000000000000
> [4394470.277684] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [4394470.278020] CR2: 00007f200eddd160 CR3: 0000000e0b20a002 CR4: 00000000007606f0
> [4394470.278412] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
> [4394470.278799] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
> [4394470.279194] PKRU: 55555554
> [4394470.279543] Call Trace:
> [4394470.279889] <IRQ>
> [4394470.280237] ? __hrtimer_init+0xb0/0xb0
> [4394470.280618] ? sched_clock+0x5/0x10
> [4394470.281000] __do_softirq+0xe8/0x2ef
> [4394470.281397] irq_exit+0xf1/0x100
> [4394470.281761] smp_apic_timer_interrupt+0x74/0x130
> [4394470.282132] apic_timer_interrupt+0xf/0x20
> [4394470.282548] </IRQ>
> [4394470.282954] RIP: 0010:cpuidle_enter_state+0xa0/0x2b0
> [4394470.283341] Code: 8b 3d 6c fb 59 4c e8 0f ed a6 ff 48 89 c3 0f 1f 44 00 00 31 ff e8 80 00 a7 ff 45 84 f6 0f 85 c3 01 00 00 fb 66 0f 1f 44 00 00 <4c> 29 fb 48 ba cf f7 53 e3 a5 9b c4 20 48 89 d8 48 c1 fb 3f 48 f7
> [4394470.284219] RSP: 0018:ffffffffb4603e78 EFLAGS: 00000246 ORIG_RAX: ffffffffffffff13
> [4394470.284671] RAX: ffff88f000823080 RBX: 000f9cbf579e86c6 RCX: 000000000000001f
> [4394470.285129] RDX: 000f9cbf579e86c6 RSI: 0000000037a6f674 RDI: 0000000000000000
> [4394470.285623] RBP: 0000000000000002 R08: 00000000000000c4 R09: 0000000000000027
> [4394470.286088] R10: ffffffffb4603e58 R11: 000000000000004c R12: ffff88f00082df00
> [4394470.286566] R13: ffffffffb4724118 R14: 0000000000000000 R15: 000f9cbf579d44e0
> [4394470.287045] ? cpuidle_enter_state+0x90/0x2b0
> [4394470.287527] do_idle+0x200/0x280
> [4394470.288010] cpu_startup_entry+0x6f/0x80
> [4394470.288501] start_kernel+0x533/0x553
> [4394470.288994] secondary_startup_64+0xb7/0xc0
>
>
> After analysis, we found that the expiring timer has
> timer->entry.next == POISON2 (list corruption)!
>
> The crash scenario is the same as https://lkml.org/lkml/2017/3/21/732.
>
> I cannot reproduce this issue, but I found that the timer causing the
> crash is inet_timewait_sock->tw_timer (its callback function is
> tw_timer_handler), and the value of tcp_tw_reuse is 1.
>
> # cat /proc/sys/net/ipv4/tcp_tw_reuse
> 1
>
> In the production environment, we have encountered this problem many
> times, and every time it was a problem with inet_timewait_sock->tw_timer.
>
> Does anyone have any ideas about this issue? Thanks.
>

Nothing comes to mind; I am not aware of such crashes in stable Linux kernels.
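
For readers following the quoted analysis, the failure mode it describes (unlinking a timer whose entry.next already holds the list poison value) can be modeled in user space. The following is only a sketch in Python, not kernel code: it assumes LIST_POISON2 is 0xdead000000000200, which matches RAX and R13 in the trace, and the names Head, Node, add_head, detach and GeneralProtectionFault are hypothetical. It shows why a second unlink faults; it says nothing about how tw_timer got into that state.

```python
# User-space model of an hlist unlink that poisons the removed node.
# Assumption: LIST_POISON2 == 0xdead000000000200 (the value in RAX/R13 above).
LIST_POISON2 = 0xDEAD000000000200


class GeneralProtectionFault(Exception):
    """Stands in for the fault taken when a poisoned pointer is dereferenced."""


class Head:
    def __init__(self):
        self.next = None  # plays the role of hlist_head.first


class Node:
    def __init__(self, name):
        self.name = name
        self.next = None   # next Node, None, or LIST_POISON2
        self.pprev = None  # object whose .next points at this node


def add_head(head, node):
    """Insert node at the front of the list."""
    first = head.next
    node.next = first
    if first is not None:
        first.pprev = node
    head.next = node
    node.pprev = head


def detach(node, clear_pending=True):
    """Unlink node, then poison its next pointer."""
    nxt, pprev = node.next, node.pprev
    pprev.next = nxt  # *pprev = next
    if nxt is not None:
        if nxt == LIST_POISON2:
            # The real code would store through the poisoned address here.
            raise GeneralProtectionFault(
                f"{node.name}: next is LIST_POISON2 ({LIST_POISON2:#x})")
        nxt.pprev = pprev  # next->pprev = pprev
    if clear_pending:
        node.pprev = None
    node.next = LIST_POISON2


if __name__ == "__main__":
    head = Head()
    timer = Node("tw_timer")
    add_head(head, timer)
    detach(timer, clear_pending=False)      # first unlink poisons timer.next
    try:
        detach(timer, clear_pending=False)  # second unlink hits the poison
    except GeneralProtectionFault as exc:
        print("fault:", exc)
```

In this model the second detach() fails at the same logical step as the faulting store in the trace (the write through next), which is why a poisoned entry.next usually points to a timer being unlinked twice or reused after it was freed.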