Subject: Re: Poll CQ syncing problem
To: Christoph Hellwig
References: <3ba1baab-e2ac-358d-3b3b-ff4a27405c93@mellanox.com> <20170301145124.GA12121@lst.de>
CC: Majd Dibbiny
From: Noa Osherovich
Message-ID: <67049755-56c9-d2ba-c7c1-4a1593a5706f@mellanox.com>
Date: Wed, 1 Mar 2017 17:28:41 +0200
In-Reply-To: <20170301145124.GA12121@lst.de>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 3/1/2017 4:51 PM, Christoph Hellwig wrote:
> On Wed, Mar 01, 2017 at 04:30:26PM +0200, Noa Osherovich wrote:
>> Analysis:
>> Since ib_comp_wq isn't single threaded, two works can run in parallel for the same CQ,
>> executing __ib_process_cq.
>
> They shouldn't.  Each CQ has a single work_struct, and any given work_struct
> should only be executing at once:
>
> "Note that the flag ``WQ_NON_REENTRANT`` no longer exists as all
> workqueues are now non-reentrant - any work item is guaranteed to be
> executed by at most one worker system-wide at any given time."
>
>> Since this function isn't thread safe and the wc array is shared, it causes a data corruption
>> which eventually crashes in the MAD layer due to a double list_del of the same element.
>
> This should not be the case.  What kernel version are you testing and does
> it contain any patches touching core kernel code?

Thanks Christoph for the quick response.
Currently we see this only in old kernels. I'll investigate this more and update.
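
For illustration only, here is a user-space sketch of the guarantee quoted
above: several workers compete to run one work item, yet its handler is
never executing in more than one of them at a time. This is not kernel code.
The names model_work, model_handler, model_worker and run_model are
hypothetical, and the per-item mutex is an assumption of this model, not a
description of how the kernel workqueue code enforces non-reentrancy.

```c
#include <assert.h>
#include <pthread.h>
#include <stdatomic.h>

/* Hypothetical user-space stand-in for a work item (not struct work_struct). */
struct model_work {
	pthread_mutex_t exec_lock;   /* held for the duration of one handler run */
	atomic_int in_flight;        /* handlers executing right now */
	atomic_int max_in_flight;    /* highest concurrency ever observed */
	atomic_int runs;             /* total handler executions */
};

struct worker_arg {
	struct model_work *w;
	int iterations;
};

/* Stands in for a poll routine that touches state shared per work item. */
static void model_handler(struct model_work *w)
{
	int now = atomic_fetch_add(&w->in_flight, 1) + 1;
	int seen = atomic_load(&w->max_in_flight);

	/* Record the peak; a failed CAS reloads 'seen' and retries. */
	while (now > seen &&
	       !atomic_compare_exchange_weak(&w->max_in_flight, &seen, now))
		;

	atomic_fetch_add(&w->runs, 1);
	atomic_fetch_sub(&w->in_flight, 1);
}

/* Each worker repeatedly tries to execute the same item. */
static void *model_worker(void *p)
{
	struct worker_arg *a = p;

	for (int i = 0; i < a->iterations; i++) {
		/* Model assumption: one executor per item at any time. */
		pthread_mutex_lock(&a->w->exec_lock);
		model_handler(a->w);
		pthread_mutex_unlock(&a->w->exec_lock);
	}
	return NULL;
}

/* Runs nworkers threads against one item; returns peak handler concurrency. */
static int run_model(int nworkers, int iterations)
{
	struct model_work w;
	struct worker_arg arg = { &w, iterations };
	pthread_t tid[16];

	assert(nworkers > 0 && nworkers <= 16);
	pthread_mutex_init(&w.exec_lock, NULL);
	atomic_init(&w.in_flight, 0);
	atomic_init(&w.max_in_flight, 0);
	atomic_init(&w.runs, 0);

	for (int i = 0; i < nworkers; i++) {
		int rc = pthread_create(&tid[i], NULL, model_worker, &arg);
		assert(rc == 0);
		(void)rc;
	}
	for (int i = 0; i < nworkers; i++)
		pthread_join(tid[i], NULL);

	pthread_mutex_destroy(&w.exec_lock);
	assert(atomic_load(&w.runs) == nworkers * iterations);
	return atomic_load(&w.max_in_flight);
}
```

Calling run_model(4, 1000) from a main() returns 1 in this model: the peak
concurrency of the handler stays at one no matter how many workers there
are. If the mutex were removed, the same counter could exceed one, which is
the kind of overlap the original analysis assumed for __ib_process_cq.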