From: Eric Dumazet
Date: Wed, 6 May 2020 07:33:41 -0700
Subject: Re: Re: Re: Re: Re: [PATCH net v2 0/2] Revert the 'socket_alloc' life cycle change
To: SeongJae Park
Cc: "Paul E. McKenney", Eric Dumazet, David Miller, Al Viro, Jakub Kicinski, Greg Kroah-Hartman, sj38.park@gmail.com, netdev, LKML, SeongJae Park, snu@amazon.com, amit@kernel.org, stable@vger.kernel.org
References: <20200505184955.GO2869@paulmck-ThinkPad-P72> <20200506125926.29844-1-sjpark@amazon.com>
In-Reply-To: <20200506125926.29844-1-sjpark@amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Wed, May 6, 2020 at 5:59 AM SeongJae Park wrote:
>
> TL;DR: It was not the kernel's fault, but the benchmark program's.
>
> So, the problem is reproducible using lebench[1] only. I carefully read
> its code again.
>
> Before running the "poll big" sub test, where the problem occurred, lebench
> executes the "context switch" sub test. For that test, it sets its own cpu
> affinity[2] and process priority[3] to '0' and '-20', respectively. However,
> it doesn't restore them to their original values even after the "context
> switch" sub test is finished.
> For that reason, the "select big" sub test also runs bound to CPU 0 with
> the lowest nice value. Therefore, it can disturb the RCU callback thread for
> CPU 0, which processes the deferred deallocations of the sockets, and as a
> result it triggers the OOM.
>
> We confirmed the problem disappears when the RCU callbacks are offloaded from
> CPU 0 using the rcu_nocbs=0 boot parameter, or when the affinity and/or
> priority are simply restored.
>
> Someone _might_ still argue that this is a kernel problem because the problem
> didn't occur on the old kernels prior to Al's patches. However, setting
> the affinity and priority was possible only because the program was granted
> the permission. Therefore, it would be reasonable to blame the system
> administrators rather than the kernel.
>
> So, please ignore this patchset, and apologies for the confusion. If you
> still have some doubts or need more tests, please let me know.
>
> [1] https://github.com/LinuxPerfStudy/LEBench
> [2] https://github.com/LinuxPerfStudy/LEBench/blob/master/TEST_DIR/OS_Eval.c#L820
> [3] https://github.com/LinuxPerfStudy/LEBench/blob/master/TEST_DIR/OS_Eval.c#L822
>
>
> Thanks,
> SeongJae Park

No harm done, thanks for running more tests and root-causing the issue!
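
For reference, the "restore the affinity and/or priority" remedy mentioned in the quoted message could look roughly like the sketch below. It is not LEBench's code: run_pinned() and example_workload() are hypothetical names made up for illustration, and the sketch assumes Linux with glibc. It saves the calling process's CPU mask and nice value, applies the requested pinning and priority on a best-effort basis, runs the workload, and then puts the saved values back so that later sub tests do not inherit them.

```c
#define _GNU_SOURCE
#include <assert.h>
#include <errno.h>
#include <sched.h>
#include <sys/resource.h>

/* Hypothetical stand-in for a benchmark sub test. */
volatile unsigned long spins;

void example_workload(void)
{
    for (int i = 0; i < 100000; i++)
        spins++;
}

/*
 * Hypothetical helper: run fn() pinned to `cpu` at nice value `nice_val`,
 * then restore the affinity mask and priority that were in effect before.
 * Pinning and renicing are best effort (they may fail without privileges
 * or when `cpu` is not in the allowed set); the restore is not.
 * Returns 0 if the original settings are back in place, -1 otherwise.
 */
int run_pinned(int cpu, int nice_val, void (*fn)(void))
{
    cpu_set_t saved_mask, pinned;
    int saved_prio;
    int rc = 0;

    assert(fn != NULL);

    /* Save the current settings before touching anything. */
    if (sched_getaffinity(0, sizeof(saved_mask), &saved_mask) != 0)
        return -1;
    errno = 0;                      /* -1 is a legal nice value */
    saved_prio = getpriority(PRIO_PROCESS, 0);
    if (saved_prio == -1 && errno != 0)
        return -1;

    CPU_ZERO(&pinned);
    CPU_SET(cpu, &pinned);
    (void)sched_setaffinity(0, sizeof(pinned), &pinned);
    (void)setpriority(PRIO_PROCESS, 0, nice_val);

    fn();

    /* The step the quoted analysis found missing: undo both changes. */
    if (sched_setaffinity(0, sizeof(saved_mask), &saved_mask) != 0)
        rc = -1;
    if (setpriority(PRIO_PROCESS, 0, saved_prio) != 0)
        rc = -1;
    return rc;
}
```

A caller would wrap each sub test that needs pinning, for example run_pinned(0, -20, example_workload), and treat a -1 return as a reason to stop rather than continue with leaked settings.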