Subject: Re: stmmac ethernet in kernel 4.4: coalescing related pauses?
To: Pavel Machek, kernel list
References: <20161123105125.GA26394@amd>
From: Lino Sanfilippo
Date: Mon, 28 Nov 2016 14:07:51 +0100
In-Reply-To: <20161123105125.GA26394@amd>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Pavel,

On 23.11.2016 11:51, Pavel Machek wrote:
> I'm debugging strange delays during transmit in stmmac driver. They
> seem to be present in 4.4 kernel (and older kernels, too). Workload is
> burst of udp packets being sent, pause, burst of udp packets, ...
>
> Test code is attached, I use these parameters for testing:
>
> ./udp-test raw 10.0.0.6 1234 1000 100 30
>
> The delays seem to be related to coalescing:
>
> drivers/net/ethernet/stmicro/stmmac/common.h
> #define STMMAC_COAL_TX_TIMER 40000
> #define STMMAC_MAX_COAL_TX_TICK 100000
> #define STMMAC_TX_MAX_FRAMES 256
>
> If I lower the parameters, delays are gone, but I get netdev watchdog
> backtrace followed by broken driver.
>
> Any ideas what is going on there?
>
> [I'm currently trying to get newer kernels working on affected
> hardware.]
>
> Best regards,
>
> Pavel

I once encountered a similar behaviour with a driver. The reason was
that the socket queue limit was temporarily exhausted because the irq
handler did not free the tx skbs fast enough (that driver also used
irq coalescing). Calling skb_orphan() in the xmit handler made this
issue disappear.

Regards,
Lino
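
To illustrate the mechanism described above, here is a rough user-space
model. It is not driver code and not taken from stmmac. All names
(sim_cfg, simulate_stall_us) and the numbers used with it are
hypothetical. It assumes each queued packet is charged to the socket
until the driver frees it, that frees only happen once coal_frames
packets are pending or the coalescing timer fires, and that the timer is
re-armed on every transmit. Real kernels charge the skb truesize rather
than the payload size, so treat the figures as qualitative only.

```c
#include <stdbool.h>

/* Hypothetical parameters for a toy model of tx coalescing. */
struct sim_cfg {
    int  sndbuf_bytes;    /* socket send queue limit */
    int  pkt_bytes;       /* bytes charged per packet (simplified) */
    int  burst;           /* packets the sender wants to send back to back */
    int  coal_frames;     /* tx completions are reaped every N frames */
    long coal_timer_us;   /* or when this timer expires */
    bool orphan_in_xmit;  /* model calling skb_orphan() in the xmit handler */
};

/*
 * Returns the total time in microseconds the sender spends blocked on a
 * full send queue, or -1 if a single packet can never fit.
 */
long simulate_stall_us(const struct sim_cfg *c)
{
    long now = 0, deadline = 0, stalled = 0;
    int charged = 0;   /* bytes currently charged to the socket */
    int pending = 0;   /* transmitted packets not yet freed */
    int sent = 0;

    while (sent < c->burst) {
        if (charged + c->pkt_bytes > c->sndbuf_bytes) {
            if (pending == 0)
                return -1;              /* nothing to reap, no progress */
            /* Sender blocks until the coalescing timer reaps the queue. */
            stalled += deadline - now;
            now = deadline;
            charged = 0;
            pending = 0;
            continue;
        }
        /* With the orphan model the charge is dropped at xmit time. */
        if (!c->orphan_in_xmit)
            charged += c->pkt_bytes;
        pending++;
        sent++;
        deadline = now + c->coal_timer_us;  /* timer re-armed per xmit */
        if (pending >= c->coal_frames) {    /* frame threshold reached */
            charged = 0;
            pending = 0;
        }
    }
    return stalled;
}
```

With a 65536 byte limit, 1000 byte packets, a burst of 100, 256 frames
and a 40000 us timer, this model stalls once for 40000 us. With
orphan_in_xmit set, or with coal_frames lowered to 32, it does not stall
at all. In an actual driver the corresponding change would be a
skb_orphan(skb) call in the xmit handler, with the trade-off that the
socket no longer sees backpressure from the tx ring.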