From: Chris Lesiak
To: Andy Duan
CC: "netdev@vger.kernel.org", "linux-kernel@vger.kernel.org", Jaccon Bastiaansen
Subject: Re: [PATCH] net: fec: Detect and recover receive queue hangs
Date: Fri, 18 Nov 2016 08:36:44 -0600
References: <1479417282-15540-1-git-send-email-chris.lesiak@licor.com>

On 11/18/2016 12:44 AM, Andy Duan wrote:
> From: Chris Lesiak Sent: Friday, November 18, 2016 5:15 AM
> >To: Andy Duan
> >Cc: netdev@vger.kernel.org; linux-kernel@vger.kernel.org; Jaccon
> >Bastiaansen; chris.lesiak@licor.com
> >Subject: [PATCH] net: fec: Detect and recover receive queue hangs
> >
> >This corrects a problem that appears to be similar to ERR006358.
> >But while ERR006358 is a race when the tx queue transitions from empty to not empty,
> >this problem is a race when the rx queue transitions from full to not full.
> >
> >The symptom is a receive queue that is stuck. The ENET_RDAR register will
> >read 0, indicating that there are no empty receive descriptors in the receive
> >ring. Since no additional frames can be queued, no RXF interrupts occur.
> >
> >This problem can be triggered with a 1 Gb link and about 400 Mbps of traffic.

I can cause the error by running the following on an imx6q:

    iperf -s -u

And sending packets from the other end of a 1 Gbps link:

    iperf -c $IPADDR -u -b40000pps

A few others have seen this problem.
See: https://community.nxp.com/thread/322882

> >
> >This patch detects this condition, sets the work_rx bit, and reschedules the
> >poll method.
> >
> >Signed-off-by: Chris Lesiak
> >---
> > drivers/net/ethernet/freescale/fec_main.c | 31
> >+++++++++++++++++++++++++++++++
> > 1 file changed, 31 insertions(+)
> >
> Firstly, how to reproduce the issue, pls list the reproduce steps. Thanks.
> Secondly, pls check below comments.
>
> >diff --git a/drivers/net/ethernet/freescale/fec_main.c
> >b/drivers/net/ethernet/freescale/fec_main.c
> >index fea0f33..8a87037 100644
> >--- a/drivers/net/ethernet/freescale/fec_main.c
> >+++ b/drivers/net/ethernet/freescale/fec_main.c
> >@@ -1588,6 +1588,34 @@ fec_enet_interrupt(int irq, void *dev_id)
> > 	return ret;
> > }
> >
> >+static inline bool
> >+fec_enet_recover_rxq(struct fec_enet_private *fep, u16 queue_id) {
> >+	int work_bit = (queue_id == 0) ? 2 : ((queue_id == 1) ? 0 : 1);
> >+
> >+	if (readl(fep->rx_queue[queue_id]->bd.reg_desc_active))
> If rx ring is really empty in slight throughput cases, rdar is always cleared, then there always do napi reschedule.

I think that you are concerned that if rdar is zero due to this hardware
problem, but the rx ring is actually empty, then fec_enet_rx_queue will
never do a write to rdar so that it can be non-zero.
That will cause napi to always be rescheduled.

I suppose that might be the case with zero rx traffic, and I was concerned
that it might be true even when there was rx traffic. I suspected that the
hardware, seeing that rdar is zero, would never queue another packet, even
if there were in fact empty descriptors. But it doesn't seem to be the case.
It does reschedule multiple times, but eventually sees some packets in the
rx ring and recovers.

I admit that I do not completely understand how that can happen. I did
confirm that fec_enet_active_rxring is not being called. Maybe someone with
a deeper understanding of the fec than I can provide an explanation.

>
> >+		return false;
> >+
> >+	dev_notice_once(&fep->pdev->dev, "Recovered rx queue\n");
> >+
> >+	fep->work_rx |= 1 << work_bit;
> >+
> >+	return true;
> >+}
> >+
> >+static inline bool fec_enet_recover_rxqs(struct fec_enet_private *fep)
> >+{
> >+	unsigned int q;
> >+	bool ret = false;
> >+
> >+	for (q = 0; q < fep->num_rx_queues; q++) {
> >+		if (fec_enet_recover_rxq(fep, q))
> >+			ret = true;
> >+	}
> >+
> >+	return ret;
> >+}
> >+
> > static int fec_enet_rx_napi(struct napi_struct *napi, int budget) {
> > 	struct net_device *ndev = napi->dev;
> >@@ -1601,6 +1629,9 @@ static int fec_enet_rx_napi(struct napi_struct *napi,
> >int budget)
> > 	if (pkts < budget) {
> > 		napi_complete(napi);
> > 		writel(FEC_DEFAULT_IMASK, fep->hwp + FEC_IMASK);
> >+
> >+		if (fec_enet_recover_rxqs(fep) && napi_reschedule(napi))
> >+			writel(FEC_NAPI_IMASK, fep->hwp + FEC_IMASK);
> > 	}
> > 	return pkts;
> > }
> >--
> >2.5.5
>

-- 
Chris Lesiak
Principal Design Engineer, Software
LI-COR Biosciences
chris.lesiak@licor.com

Any opinions expressed are those of the author and
do not necessarily represent those of his employer.
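
For anyone who wants to reason about the recovery check without hardware,
the following is a rough user-space model of the two helpers in the patch.
It is only a sketch: struct model_fep, the rdar[] array, NUM_RX_QUEUES and
the model_* function names are hypothetical stand-ins, not driver code. The
only parts taken from the patch are the queue-to-bit mapping and the rule
that a queue whose RDAR reads 0 gets its work_rx bit set.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define NUM_RX_QUEUES 3 /* assumption: three rx queues, as the mapping implies */

/* Hypothetical stand-in for the driver's private state. */
struct model_fep {
	uint32_t rdar[NUM_RX_QUEUES]; /* models the ENET_RDAR readback per queue */
	uint32_t work_rx;             /* models fep->work_rx */
};

/* Same mapping as the patch: queue 0 -> bit 2, queue 1 -> bit 0, queue 2 -> bit 1. */
static int work_bit_for_queue(unsigned int queue_id)
{
	return (queue_id == 0) ? 2 : ((queue_id == 1) ? 0 : 1);
}

/* Returns true if the queue looked stuck (RDAR read 0) and was flagged for work. */
static bool model_recover_rxq(struct model_fep *fep, unsigned int q)
{
	assert(q < NUM_RX_QUEUES);

	if (fep->rdar[q])
		return false; /* RDAR non-zero: queue is active, nothing to do */

	fep->work_rx |= 1u << work_bit_for_queue(q);
	return true;
}

/* Checks every queue; true means the caller should reschedule the poll. */
static bool model_recover_rxqs(struct model_fep *fep)
{
	bool ret = false;
	unsigned int q;

	for (q = 0; q < NUM_RX_QUEUES; q++) {
		if (model_recover_rxq(fep, q))
			ret = true;
	}

	return ret;
}
```

With rdar = { 0, 1, 1 }, model_recover_rxqs() returns true and work_rx
becomes 0x4 (bit 2, for queue 0). With every RDAR non-zero it returns false
and work_rx is left untouched. Note that the model, like the patch, cannot
tell a stuck queue from an idle one whose RDAR happens to read 0, which is
the case raised in the review comment.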