From: Henning Rogge
Date: Tue, 13 Nov 2018 08:01:47 +0100
Subject: [rtnetlink] Potential bug in Linux (rt)netlink code (repost from linux-netdev)?
X-Mailing-List: linux-kernel@vger.kernel.org

Hi,

I am working on a self-written routing agent (https://github.com/OLSR/OONF) and am stuck on a problem with netlink that I cannot explain as a userspace error.
I am using a netlink socket for setting routes (RTM_NEWROUTE/RTM_DELROUTE), querying the kernel for the current routes in the database (via an RTM_GETROUTE dump), and receiving multicast messages for ongoing routing changes.

After a few netlink messages I get to the point where the kernel just does not respond to an RTM_NEWROUTE: no error and no answer, despite the NLM_F_ACK flag being set. At some later point, when the program sends another route command (during shutdown of the routing agent; most often an RTM_DELROUTE), I get a single netlink packet with a "successful" response for both the "missing" RTM_NEWROUTE and the new RTM_DELROUTE sequence number.

I am testing two routing agents, each in a systemd-nspawn based container connected over a bridge on the host system, on a current Debian Testing (kernel 4.18.0-1-amd64). I am using the netlink sockets directly, without any other userspace library in between.

I have checked the hexdumps of a couple of netlink messages by hand (including the ones just before the bug happens) and they seem to be okay.

When I tried to add a "netlink listener" socket for further debugging (ip link add nlmon0 type nlmon), the problem vanished until I removed the listener socket again.

Any ideas how to debug this problem? Unfortunately I have no short example program to trigger the bug...

I have seen the problem rarely for years (once every couple of months), but until a few days ago I never managed to reproduce it.

I have asked on linux-netdev but got no reply except for a question about rate limiting.

Henning Rogge
-- 
Diplom-Informatiker Henning Rogge, Fraunhofer-Institut für
Kommunikation, Informationsverarbeitung und Ergonomie FKIE
Kommunikationssysteme (KOM)
Zanderstrasse 5, 53177 Bonn, Germany
Phone +49 228 50212-469
mailto:henning.rogge@fkie.fraunhofer.de http://www.fkie.fraunhofer.de
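
The acknowledgement behaviour described above can be made concrete with a small sketch. It is not OONF code and it is not a reproducer: it opens no socket, and the hypothetical helper build_ack only fabricates the NLMSG_ERROR replies that the kernel would normally send for requests carrying NLM_F_ACK. The sequence numbers 41 and 42 are made up. What it shows is how one received datagram can carry two acknowledgements, and how each one is matched to a pending request by sequence number.

```python
import struct

# Constants as defined in <linux/netlink.h> and <linux/rtnetlink.h>.
NLMSG_ERROR = 2
NLM_F_REQUEST = 0x01
NLM_F_ACK = 0x04
RTM_NEWROUTE = 24
RTM_DELROUTE = 25

# struct nlmsghdr: length, type, flags, sequence number, port id.
NLMSGHDR = struct.Struct("=IHHII")


def align4(n):
    """Round up to the 4-byte netlink alignment (NLMSG_ALIGN)."""
    return (n + 3) & ~3


def build_ack(req_type, seq, error=0):
    """Hypothetical helper: fake the NLMSG_ERROR reply for one request.

    The payload is a struct nlmsgerr: a signed error code followed by the
    header of the request being acknowledged (simplified here).
    """
    inner = NLMSGHDR.pack(NLMSGHDR.size, req_type,
                          NLM_F_REQUEST | NLM_F_ACK, seq, 0)
    payload = struct.pack("=i", error) + inner
    header = NLMSGHDR.pack(NLMSGHDR.size + len(payload),
                           NLMSG_ERROR, 0, seq, 0)
    return header + payload


def parse_acks(buf):
    """Yield (seq, errno) for every NLMSG_ERROR message in one datagram."""
    offset = 0
    while offset + NLMSGHDR.size <= len(buf):
        length, msg_type, _flags, seq, _pid = NLMSGHDR.unpack_from(buf, offset)
        if length < NLMSGHDR.size or offset + length > len(buf):
            break  # truncated or malformed message
        if msg_type == NLMSG_ERROR:
            (error,) = struct.unpack_from("=i", buf, offset + NLMSGHDR.size)
            yield seq, -error  # negative errno on failure, 0 means success
        offset += align4(length)


# Two requests are outstanding; both acknowledgements arrive in one datagram.
pending = {41: "RTM_NEWROUTE", 42: "RTM_DELROUTE"}
datagram = build_ack(RTM_NEWROUTE, 41) + build_ack(RTM_DELROUTE, 42)
for seq, err in parse_acks(datagram):
    name = pending.pop(seq, "unknown")
    print(seq, name, "ok" if err == 0 else "errno %d" % err)
print("still waiting for:", sorted(pending))
```

Keeping a table of pending sequence numbers like this makes it easy to log exactly which request is still unacknowledged at any moment, which may help narrow down when the delayed reply is finally delivered.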