Subject: Re: [bug, bisected] pfifo_fast causes packet reordering
To: Dave Taht, Jakob Unterwurzacher
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, "David S. Miller", "linux-can@vger.kernel.org", Martin Elshuber
References: <946dbe16-a2eb-eca8-8069-468859ccc78d@theobroma-systems.com>
From: John Fastabend
Message-ID: <95844480-d020-9000-53ef-0da8b965ce6e@gmail.com>
Date: Tue, 13 Mar 2018 21:03:40 -0700
X-Mailing-List: linux-kernel@vger.kernel.org

On 03/13/2018 11:35 AM, Dave Taht wrote:
> On Tue, Mar 13, 2018 at 11:24 AM, Jakob Unterwurzacher wrote:
>> During stress-testing our "ucan" USB/CAN adapter SocketCAN driver on Linux
>> v4.16-rc4-383-ged58d66f60b3 we observed that a small fraction of packets are
>> delivered out-of-order.
>>

Is the stress-testing tool available somewhere? What type of packets
are being sent?

>> We have tracked the problem down to the driver interface level, and it seems
>> that the driver's net_device_ops.ndo_start_xmit() function gets the packets
>> handed over in the wrong order.
>>
>> This behavior was not observed on Linux v4.15 and I have bisected the
>> problem down to this patch:
>>
>>> commit c5ad119fb6c09b0297446be05bd66602fa564758
>>> Author: John Fastabend
>>> Date: Thu Dec 7 09:58:19 2017 -0800
>>>
>>> net: sched: pfifo_fast use skb_array
>>>
>>> This converts the pfifo_fast qdisc to use the skb_array data structure
>>> and set the lockless qdisc bit. pfifo_fast is the first qdisc to support
>>> the lockless bit that can be a child of a qdisc requiring locking. So
>>> we add logic to clear the lock bit on initialization in these cases when
>>> the qdisc graft operation occurs.
>>>
>>> This also removes the logic used to pick the next band to dequeue from
>>> and instead just checks a per priority array for packets from top priority
>>> to lowest. This might need to be a bit more clever but seems to work
>>> for now.
>>>
>>> Signed-off-by: John Fastabend
>>> Signed-off-by: David S. Miller
>>
>>
>> The patch does not revert cleanly, but moving to one commit earlier makes
>> the problem go away.
>>
>> Selecting the "fq" scheduler instead of "pfifo_fast" makes the problem go
>> away as well.
>

Is this a single queue device or a multiqueue device? Running
'tc -s qdisc show dev foo' would help some.

> I am of course, a fan of obsoleting pfifo_fast. There's no good reason
> for it anymore.
>
>>
>> Is this an unintended side-effect of the patch or is there something the
>> driver has to do to request in-order delivery?
>>

If we introduced an out-of-order (OOO) edge case somewhere, that was not
intended, so I'll take a look into it. But if you can provide a bit more
detail on how the stress testing is done to cause the issue, that would
help.

Thanks,
John

>> Thanks,
>> Jakob
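
For reference, the dequeue rule described in the quoted commit message
("checks a per priority array for packets from top priority to lowest")
can be sketched as a toy model. This is only an illustration in Python,
not the kernel code: the class and function names are made up, it is
single-threaded, and it ignores the lockless skb_array details entirely.
The one fact it takes from pfifo_fast is the three priority bands. In
this model packets within one band always leave in arrival order, so any
reordering seen by the driver would have to come from something the model
leaves out (for example concurrency).

```python
from collections import deque

NUM_BANDS = 3  # pfifo_fast uses three priority bands


class BandedFifo:
    """Toy, single-threaded model of a strict-priority queue of FIFOs."""

    def __init__(self, num_bands=NUM_BANDS):
        # One FIFO per band; index 0 is the highest priority.
        self.bands = [deque() for _ in range(num_bands)]

    def enqueue(self, band, packet):
        # Append to the tail of the chosen band.
        self.bands[band].append(packet)

    def dequeue(self):
        # Scan bands from highest to lowest priority and take the
        # oldest packet from the first non-empty band.
        for band in self.bands:
            if band:
                return band.popleft()
        return None  # all bands empty


def drain(queue):
    """Dequeue until empty and return the packets in dequeue order."""
    out = []
    while (pkt := queue.dequeue()) is not None:
        out.append(pkt)
    return out
```

Enqueueing "a" and "c" on band 1, "b" on band 0 and "d" on band 2, then
draining, yields b, a, c, d: higher bands go first, and "a" still precedes
"c" within band 1.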