Subject: Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating
To: Eric Dumazet, Tariq Toukan, Jason Gunthorpe
Cc: junxiao.bi@oracle.com, netdev@vger.kernel.org, linux-rdma@vger.kernel.org, linux-kernel@vger.kernel.org, Saeed Mahameed
References: <1515728542-3060-1-git-send-email-jianchao.w.wang@oracle.com> <339a7156-9ef1-1f3c-30b8-3cc3558d124e@mellanox.com> <1516552998.3478.5.camel@gmail.com>
From: "jianchao.wang"
Message-ID: <460fca68-f8a8-e3c4-2e60-e90dc0e2f843@oracle.com>
Date: Mon, 22 Jan 2018 10:40:53 +0800
In-Reply-To: <1516552998.3478.5.camel@gmail.com>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi Eric

On 01/22/2018 12:43 AM, Eric Dumazet wrote:
> On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote:
>>
>> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
>>>
>>>
>>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
>>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
>>>>> Hi Tariq
>>>>>
>>>>> Very sad that the crash was reproduced again after applied the patch.
>>
>> Memory barriers vary for different Archs, can you please share more
>> details regarding arch and repro steps?
>
> Yeah, mlx4 NICs in Google fleet receive trillions of packets per
> second, and we never noticed an issue.
>
> Although we are using a slightly different driver, using order-0 pages
> and fast pages recycling.
>
>

The driver we use sets the page reference count to (size of pages)/stride.
The pages are freed by the networking stack when the reference count drops
to zero, and the order-3 pages may be allocated again soon after. This gives
the NIC a chance to corrupt pages that have since been allocated by others,
such as slab.

In the current version, with order-0 pages and page recycling, maybe the
corruption sometimes lands on inbound packets and just causes some bad,
invalid packets, which are then dropped.

Thanks
Jianchao
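
P.S. To make the reference-count arithmetic above concrete, here is a minimal userspace sketch. It is not mlx4 driver code: the struct, the function names, the 4 KiB base page size and the 2048-byte stride used in the example are all assumptions chosen for illustration. It only models the rule described above, where the count starts at (size of pages)/stride and the page is freed when the count reaches zero.

```c
#include <assert.h>

/* Hypothetical model for illustration only; none of these names exist
 * in the mlx4 driver. The base page size is an assumption. */
#define PAGE_SIZE_BYTES 4096u

struct rx_page {
    unsigned int refcount; /* fragments still held by the networking stack */
    int owned_by_driver;   /* 1 until the last fragment has been released */
};

/* Number of stride-sized fragments in one order-N allocation,
 * i.e. (size of pages) / stride. */
unsigned int frags_per_page(unsigned int order, unsigned int stride)
{
    assert(stride > 0);
    return (PAGE_SIZE_BYTES << order) / stride;
}

/* Start the reference count at one reference per fragment. */
void rx_page_init(struct rx_page *p, unsigned int order, unsigned int stride)
{
    p->refcount = frags_per_page(order, stride);
    p->owned_by_driver = 1;
}

/* The stack releases one fragment. Returns 1 if this call dropped the
 * last reference, which is the point where the page goes back to the
 * allocator and may be handed to another user such as slab. A device
 * write to the page after this point would corrupt that new owner. */
int rx_page_put(struct rx_page *p)
{
    assert(p->refcount > 0);
    if (--p->refcount == 0) {
        p->owned_by_driver = 0;
        return 1;
    }
    return 0;
}
```

With these assumed values, frags_per_page(3, 2048) gives 16, so the sixteenth rx_page_put() is the one that frees the order-3 page, and an order-0 page with the same stride holds only 2 fragments.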