Message-ID: <3534d781-7d01-b42a-8974-0b1c367946f0@molgen.mpg.de>
Date: Sat, 29 Jan 2022 17:52:12 +0100
From: Paul Menzel
To: Zhouyi Zhou
Cc: "Paul E. McKenney", Josh Triplett, rcu, LKML, "David S. Miller", Jakub Kicinski, netdev@vger.kernel.org
Subject: Re: BUG: Kernel NULL pointer dereference on write at 0x00000000 (rtmsg_ifinfo_build_skb)
References: <159db05f-539c-fe29-608b-91b036588033@molgen.mpg.de>
X-Mailing-List: linux-kernel@vger.kernel.org

Dear Zhouyi,

Thank you for taking the time.

On 29.01.22 at 03:23, Zhouyi Zhou wrote:

> I don't have an IBM machine, but I tried to analyze the problem using
> my x86_64 kvm virtual machine, I can't reproduce the bug using my
> x86_64 kvm virtual machine.

No idea whether it’s architecture-specific.

> I saw the panic is caused by registration of sit device (A sit device
> is a type of virtual network device that takes our IPv6 traffic,
> encapsulates/decapsulates it in IPv4 packets, and sends/receives it
> over the IPv4 Internet to another host)
>
> sit device is registered in function sit_init_net:
> 1895 static int __net_init sit_init_net(struct net *net)
> 1896 {
> 1897     struct sit_net *sitn = net_generic(net, sit_net_id);
> 1898     struct ip_tunnel *t;
> 1899     int err;
> 1900
> 1901     sitn->tunnels[0] = sitn->tunnels_wc;
> 1902     sitn->tunnels[1] = sitn->tunnels_l;
> 1903     sitn->tunnels[2] = sitn->tunnels_r;
> 1904     sitn->tunnels[3] = sitn->tunnels_r_l;
> 1905
> 1906     if (!net_has_fallback_tunnels(net))
> 1907         return 0;
> 1908
> 1909     sitn->fb_tunnel_dev = alloc_netdev(sizeof(struct ip_tunnel), "sit0",
> 1910                                        NET_NAME_UNKNOWN,
> 1911                                        ipip6_tunnel_setup);
> 1912     if (!sitn->fb_tunnel_dev) {
> 1913         err = -ENOMEM;
> 1914         goto err_alloc_dev;
> 1915     }
> 1916     dev_net_set(sitn->fb_tunnel_dev, net);
> 1917     sitn->fb_tunnel_dev->rtnl_link_ops = &sit_link_ops;
> 1918     /* FB netdevice is special: we have one, and only one per netns.
> 1919      * Allowing to move it to another netns is clearly unsafe.
> 1920      */
> 1921     sitn->fb_tunnel_dev->features |= NETIF_F_NETNS_LOCAL;
> 1922
> 1923     err = register_netdev(sitn->fb_tunnel_dev);
> register_netdev on line 1923 will call if_nlmsg_size indirectly.
>
> On the other hand, the function that calls the paniced strlen is if_nlmsg_size:
> (gdb) disassemble if_nlmsg_size
> Dump of assembler code for function if_nlmsg_size:
>    0xffffffff81a0dc20 <+0>:     nopl   0x0(%rax,%rax,1)
>    0xffffffff81a0dc25 <+5>:     push   %rbp
>    0xffffffff81a0dc26 <+6>:     push   %r15
>    0xffffffff81a0dd04 <+228>:   je     0xffffffff81a0de20
>    0xffffffff81a0dd0a <+234>:   mov    0x10(%rbp),%rdi
>    ...
> => 0xffffffff81a0dd0e <+238>:   callq  0xffffffff817532d0
>    0xffffffff81a0dd13 <+243>:   add    $0x10,%eax
>    0xffffffff81a0dd16 <+246>:   movslq %eax,%r12

Excuse my ignorance, but would that look the same on ppc64le?
Unfortunately, I didn’t save the problematic `vmlinuz` file, but on a
current build (without rcutorture) I get the line below, where strlen
shows up.

    (gdb) disassemble if_nlmsg_size
    […]
       0xc000000000f7f82c <+332>:   bl      0xc000000000a10e30
    […]

> and the C code for 0xffffffff81a0dd0e is following (line 524):
> 515 static size_t rtnl_link_get_size(const struct net_device *dev)
> 516 {
> 517     const struct rtnl_link_ops *ops = dev->rtnl_link_ops;
> 518     size_t size;
> 519
> 520     if (!ops)
> 521         return 0;
> 522
> 523     size = nla_total_size(sizeof(struct nlattr)) + /* IFLA_LINKINFO */
> 524            nla_total_size(strlen(ops->kind) + 1);  /* IFLA_INFO_KIND */

How do I connect the disassembly output with the corresponding source
line?

> But ops is assigned the value of sit_link_ops in function sit_init_net
> line 1917, so I guess something must happened between the calls.
>
> Do we have KASAN in IBM machine? would KASAN help us find out what
> happened in between?

Unfortunately, as far as I can see, KASAN is not supported on the Power
system I have. From `arch/powerpc/Kconfig`:

    select HAVE_ARCH_KASAN          if PPC32 && PPC_PAGE_SHIFT <= 14
    select HAVE_ARCH_KASAN_VMALLOC  if PPC32 && PPC_PAGE_SHIFT <= 14

> Hope I can be of more helpful.

Some distributions support multi-arch, so they make cross-compiling for
different architectures easy.


Kind regards,

Paul
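
PS: To make the quoted size computation concrete, below is a minimal
userspace sketch of the `rtnl_link_get_size` logic from lines 515-524.
The `*_model` names are hypothetical stand-ins and not kernel API, and
the 4-byte attribute header and 4-byte alignment are assumed here as the
netlink attribute layout. It only illustrates that strlen reads through
`ops->kind`, so a valid `ops` pointer by itself does not guarantee a
valid `kind` pointer. It says nothing about what actually went wrong on
the Power machine.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical userspace stand-ins; only the fields used here. */
struct rtnl_link_ops_model {
	const char *kind;
};

struct net_device_model {
	const struct rtnl_link_ops_model *rtnl_link_ops;
};

/* Assumed netlink attribute layout: 4-byte header, 4-byte alignment. */
#define MODEL_NLA_ALIGNTO 4u
#define MODEL_NLA_HDRLEN  4u

/* Header plus payload, rounded up to the next multiple of 4. */
size_t model_nla_total_size(size_t payload)
{
	size_t len = MODEL_NLA_HDRLEN + payload;

	return (len + MODEL_NLA_ALIGNTO - 1) & ~(size_t)(MODEL_NLA_ALIGNTO - 1);
}

/* Mirrors the quoted lines 515-524. */
size_t model_link_get_size(const struct net_device_model *dev)
{
	const struct rtnl_link_ops_model *ops;

	assert(dev != NULL);
	ops = dev->rtnl_link_ops;

	if (!ops)
		return 0;

	/* The nested attribute's payload is modelled as one 4-byte header. */
	return model_nla_total_size(MODEL_NLA_HDRLEN) +      /* IFLA_LINKINFO */
	       model_nla_total_size(strlen(ops->kind) + 1);  /* IFLA_INFO_KIND */
}
```

With `kind` pointing to a three-character string such as "sit", both
terms come to 8 bytes and `model_link_get_size` returns 16. With
`rtnl_link_ops` set to NULL it returns 0, as in the quoted check on
line 520.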