From: Eric Dumazet
Date: Tue, 5 May 2020 07:53:39 -0700
Subject: Re: [PATCH net v2 0/2] Revert the 'socket_alloc' life cycle change
To: SeongJae Park
Cc: David Miller, Al Viro, Jakub Kicinski, Greg Kroah-Hartman,
    sj38.park@gmail.com, netdev, LKML, SeongJae Park, snu@amazon.com,
    amit@kernel.org, stable@vger.kernel.org
References: <20200505081035.7436-1-sjpark@amazon.com> <20200505115402.25768-1-sjpark@amazon.com>
In-Reply-To: <20200505115402.25768-1-sjpark@amazon.com>
X-Mailing-List: linux-kernel@vger.kernel.org

On Tue, May 5, 2020 at 4:54 AM SeongJae Park wrote:
>
> CC-ing stable@vger.kernel.org and adding some more explanations.
>
> On Tue, 5 May 2020 10:10:33 +0200 SeongJae Park wrote:
>
> > From: SeongJae Park
> >
> > The commit 6d7855c54e1e ("sockfs: switch to ->free_inode()") made the
> > deallocation of 'socket_alloc' asynchronous using RCU, the same as for
> > 'sock.wq'.  And the following commit 333f7909a857 ("coallocate
> > socket_wq with socket itself") gave the two the same life cycle.
> >
> > The changes made the code much simpler, but also made 'socket_alloc'
> > live longer than before.
> > For this reason, user programs intensively
> > repeating allocations and deallocations of sockets could cause memory
> > pressure on recent kernels.
>
> I found this problem on a production virtual machine with 4 GB of memory while
> running lebench[1].  The 'poll big' test of lebench opens 1000 sockets, polls
> them and closes them.  This test is repeated 10,000 times.  Therefore it should
> consume only 1000 'socket_alloc' objects at once.  As the size of 'socket_alloc'
> is about 800 bytes, that is only about 800 KB.  However, on recent kernels, it
> could consume up to 10,000,000 objects (about 8 GB).  On the test machine, I
> confirmed that it consumed about 4 GB of system memory and resulted in an OOM.
>
> [1] https://github.com/LinuxPerfStudy/LEBench

To be fair, I have not backported Al's patches to Google production kernels,
nor have I tried this benchmark.

Why do we have 10,000,000 objects around? Could this be because of
some RCU problem?

Once Al's patches are reverted, do you still have 10,000,000 'socket_alloc'
objects around?

Thanks.

>
> >
> > To avoid the problem, this commit reverts the changes.
>
> I also tried to write a fixup rather than reverts, but I couldn't easily find
> a simple one.  As the commits 6d7855c54e1e and 333f7909a857 were for code
> refactoring rather than performance optimization, I thought introducing a
> complex fixup for this problem would make no sense.  Meanwhile, the memory
> pressure regression could affect real machines.  To this end, I decided to
> quickly revert the commits first and consider better refactoring later.
>
>
> Thanks,
> SeongJae Park
>
> >
> > SeongJae Park (2):
> >   Revert "coallocate socket_wq with socket itself"
> >   Revert "sockfs: switch to ->free_inode()"
> >
> >  drivers/net/tap.c      |  5 +++--
> >  drivers/net/tun.c      |  8 +++++---
> >  include/linux/if_tap.h |  1 +
> >  include/linux/net.h    |  4 ++--
> >  include/net/sock.h     |  4 ++--
> >  net/core/sock.c        |  2 +-
> >  net/socket.c           | 23 ++++++++++++++++-------
> >  7 files changed, 30 insertions(+), 17 deletions(-)
> >
> > --
> > 2.17.1
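
For reference, the memory figures quoted in this thread (1000 sockets per
round, 10,000 rounds, about 800 bytes per 'socket_alloc') can be checked with
a short sketch. This is not LEBench code and not part of the patch series.
The names footprint(), poll_round() and SOCKET_ALLOC_SIZE are made up for
illustration, and Python is used only to keep the example short. The worst
case assumes that no object is reclaimed until every round has finished,
which is an upper bound and not a measured behaviour.

```python
import select
import socket

# Assumption: the "about 800 bytes" per 'socket_alloc' quoted in the thread.
SOCKET_ALLOC_SIZE = 800


def footprint(sockets_per_round, rounds, object_size=SOCKET_ALLOC_SIZE):
    """Return (expected_peak, worst_case) in bytes.

    expected_peak: objects are freed as soon as the sockets are closed,
    so only one round's worth of objects is live at any time.
    worst_case: no object is reclaimed until all rounds have finished.
    """
    expected_peak = sockets_per_round * object_size
    worst_case = sockets_per_round * rounds * object_size
    return expected_peak, worst_case


def poll_round(count):
    """Open `count` sockets, poll them once without blocking, close them all."""
    socks = [socket.socket(socket.AF_INET, socket.SOCK_STREAM)
             for _ in range(count)]
    try:
        poller = select.poll()
        for s in socks:
            poller.register(s, select.POLLIN)
        poller.poll(0)  # timeout of 0 ms: return immediately
    finally:
        for s in socks:
            s.close()
    return len(socks)


if __name__ == "__main__":
    peak, worst = footprint(1000, 10_000)
    print(f"expected peak: {peak / 1e3:.0f} kB, worst case: {worst / 1e9:.0f} GB")
    # A much smaller loop than the benchmark, to stay under default fd limits.
    for _ in range(10):
        poll_round(100)
```

Watching the sock_inode_cache line in /proc/slabinfo while such a loop runs,
with and without the reverts, is one way to approach the question of how many
objects are really outstanding.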