Date: Mon, 22 Jan 2018 08:47:34 -0700
From: Jason Gunthorpe
To: "jianchao.wang"
Cc: Eric Dumazet, Tariq Toukan, junxiao.bi@oracle.com,
    netdev@vger.kernel.org, linux-rdma@vger.kernel.org,
    linux-kernel@vger.kernel.org, Saeed Mahameed
Subject: Re: [PATCH] net/mlx4_en: ensure rx_desc updating reaches HW before prod db updating
Message-ID: <20180122154734.GD14372@ziepe.ca>
References: <1515728542-3060-1-git-send-email-jianchao.w.wang@oracle.com>
    <339a7156-9ef1-1f3c-30b8-3cc3558d124e@mellanox.com>
    <1516552998.3478.5.camel@gmail.com>
    <460fca68-f8a8-e3c4-2e60-e90dc0e2f843@oracle.com>
In-Reply-To: <460fca68-f8a8-e3c4-2e60-e90dc0e2f843@oracle.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Jan 22, 2018 at 10:40:53AM +0800, jianchao.wang wrote:
> Hi Eric
>
> On 01/22/2018 12:43 AM, Eric Dumazet wrote:
> > On Sun, 2018-01-21 at 18:24 +0200, Tariq Toukan wrote:
> >>
> >> On 21/01/2018 11:31 AM, Tariq Toukan wrote:
> >>>
> >>>
> >>> On 19/01/2018 5:49 PM, Eric Dumazet wrote:
> >>>> On Fri, 2018-01-19 at 23:16 +0800, jianchao.wang wrote:
> >>>>> Hi Tariq
> >>>>>
> >>>>> Very sad that the crash was reproduced again after applying the patch.
> >>
> >> Memory barriers vary for different Archs, can you please share more
> >> details regarding arch and repro steps?
> >
> > Yeah, mlx4 NICs in Google fleet receive trillions of packets per
> > second, and we never noticed an issue.
> >
> > Although we are using a slightly different driver, using order-0 pages
> > and fast pages recycling.
> >
> >
> The driver we use will set the page reference count to (size of pages)/stride, the
> pages will be freed by networking stack when the reference count becomes zero, and the order-3
> pages may be allocated again soon, this gives the NIC device a chance to corrupt pages which have
> been allocated by others, such as slab.

But it looks like the wmb() is placed when stuffing new rx descriptors
into the device - how can it prevent corruption of pages where
ownership was transferred from device to the host? That sounds more like an
rmb() is missing someplace to me...

(Granted the missing wmb() is a bug, but it may not be fully solving this
issue??)

Jason
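
To illustrate the distinction between the two directions of handoff
discussed above, here is a minimal userspace sketch. It is not mlx4 code:
the struct layouts, the function names (post_rx_desc, poll_cqe,
fake_device_complete) and the owner-byte convention are all hypothetical,
and C11 fences are used only as stand-ins for the kernel's wmb()/rmb().
The point is that the write barrier orders the host -> device handoff
(descriptor before doorbell), while the read barrier orders the
device -> host handoff (ownership check before reading what the device
wrote).

```c
#include <stdatomic.h>
#include <stdint.h>

/* Hypothetical, simplified ring entries; NOT the real mlx4 layouts. */
struct rx_desc {
    uint64_t buf_addr;          /* buffer handed to the device */
};

struct cqe {
    uint32_t byte_cnt;          /* written by the device before 'owner' */
    _Atomic uint8_t owner;      /* flipped last by the device */
};

/* Host -> device handoff: descriptor first, barrier, then doorbell. */
static void post_rx_desc(struct rx_desc *desc, uint64_t buf_addr,
                         _Atomic uint32_t *doorbell, uint32_t prod)
{
    desc->buf_addr = buf_addr;
    /* Stand-in for wmb(): descriptor must be visible before the doorbell. */
    atomic_thread_fence(memory_order_release);
    atomic_store_explicit(doorbell, prod, memory_order_relaxed);
}

/* Device -> host handoff: owner check first, barrier, then the payload. */
static int poll_cqe(struct cqe *cqe, uint8_t host_owner, uint32_t *byte_cnt)
{
    if (atomic_load_explicit(&cqe->owner, memory_order_relaxed) != host_owner)
        return 0;   /* still owned by the device: do not touch the page */
    /* Stand-in for rmb(): no payload read may pass the owner check. */
    atomic_thread_fence(memory_order_acquire);
    *byte_cnt = cqe->byte_cnt;
    return 1;
}

/* Simulated device side, only so the sketch can be exercised in userspace. */
static void fake_device_complete(struct cqe *cqe, uint32_t len,
                                 uint8_t host_owner)
{
    cqe->byte_cnt = len;
    atomic_store_explicit(&cqe->owner, host_owner, memory_order_release);
}
```

In this sketch, dropping the fence in post_rx_desc() could let the device
see the doorbell before the descriptor, which is the bug the patch
addresses. Dropping the fence in poll_cqe() is a different failure: the
host could act on a completion (and free or recycle the page) based on
stale data, which is closer to the corruption described in the report.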