From: Jakob Unterwurzacher
Subject: [bug, bisected] pfifo_fast causes packet reordering
To: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, John Fastabend, "David S. Miller"
Cc: "linux-can@vger.kernel.org", Martin Elshuber
Message-ID: <946dbe16-a2eb-eca8-8069-468859ccc78d@theobroma-systems.com>
Date: Tue, 13 Mar 2018 19:24:44 +0100

During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are delivered out-of-order.

We have tracked the problem down to the driver interface level, and it seems that the driver's net_device_ops.ndo_start_xmit() function gets the packets handed over in the wrong order.

This behavior was not observed on Linux v4.15 and I have bisected the problem down to this patch:

> commit c5ad119fb6c09b0297446be05bd66602fa564758
> Author: John Fastabend
> Date: Thu Dec 7 09:58:19 2017 -0800
>
> net: sched: pfifo_fast use skb_array
>
> This converts the pfifo_fast qdisc to use the skb_array data structure
> and set the lockless qdisc bit. pfifo_fast is the first qdisc to support
> the lockless bit that can be a child of a qdisc requiring locking. So
> we add logic to clear the lock bit on initialization in these cases when
> the qdisc graft operation occurs.
>
> This also removes the logic used to pick the next band to dequeue from
> and instead just checks a per priority array for packets from top priority
> to lowest. This might need to be a bit more clever but seems to work
> for now.
>
> Signed-off-by: John Fastabend
> Signed-off-by: David S. Miller

The patch does not revert cleanly, but moving to one commit earlier makes the problem go away. Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go away as well.

Is this an unintended side-effect of the patch or is there something the driver has to do to request in-order delivery?

Thanks,
Jakob
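
The report does not say how the out-of-order packets were detected. As a rough illustration only, the following sketch shows one way such a check could be done, assuming each test frame carries a monotonically increasing sequence number in its payload. The function name find_reordered and the sample data are hypothetical and are not taken from the ucan driver or its test setup.

```python
from typing import Iterable, List, Tuple


def find_reordered(seq_numbers: Iterable[int]) -> List[Tuple[int, int]]:
    """Return (arrival_index, seq) for every frame that arrived after a
    frame with a higher sequence number.

    Assumption: sequence numbers only increase on the sending side and do
    not wrap around within one test run.
    """
    reordered = []
    highest_seen = None
    for index, seq in enumerate(seq_numbers):
        if highest_seen is not None and seq < highest_seen:
            # A newer frame was already delivered, so this one is late.
            reordered.append((index, seq))
        else:
            highest_seen = seq
    return reordered


if __name__ == "__main__":
    # Hypothetical capture: frame 3 is delivered after frame 4.
    received = [0, 1, 2, 4, 3, 5, 6]
    late = find_reordered(received)
    print(f"{len(late)} of {len(received)} frames out of order: {late}")
```

Running the same check on the order in which frames reach the driver's transmit hook and on the order in which they arrive at the receiver would help separate reordering in the queueing layer from reordering in the device or on the bus.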