To: Tejun Heo, David Miller
CC: yangyingliang, Kefeng Wang, Linux Kernel Network Developers
From: Zefan Li
Subject: [PATCH] netprio_cgroup: Fix unlimited memory leak of v2 cgroup
Message-ID: <939566f5-abe3-3526-d4ff-ec6bf8e8c138@huawei.com>
Date: Sat, 9 May 2020 11:19:39 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

If systemd is configured to use hybrid mode, which enables the use of
both cgroup v1 and v2, systemd will create a new cgroup on both the
default root (v2) and the netprio_cgroup hierarchy (v1) for a new
session and attach the task
to the two cgroups. If the task does any networking, the v2 cgroup can
never be freed after the session exits. One of our machines ran into
OOM due to this memory leak.

In the scenario described above, when sk_alloc() is called,
cgroup_sk_alloc() thinks it is in v2 mode, so it stores the cgroup
pointer in sk->sk_cgrp_data and increments the cgroup refcnt. But then
sock_update_netprioidx() thinks it is in v1 mode, so it stores the
netprioidx value in sk->sk_cgrp_data, overwriting the pointer, and the
cgroup refcnt is never dropped.

Currently we do the mode switch when someone writes to the ifpriomap
cgroup control file. The easiest fix is to also do the switch when a
task is attached to a new cgroup.

Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
Reported-by: Yang Yingliang
Tested-by: Yang Yingliang
Signed-off-by: Zefan Li
---
 net/core/netprio_cgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index b905747..2397866 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -240,6 +240,8 @@ static void net_prio_attach(struct cgroup_taskset *tset)
 	struct task_struct *p;
 	struct cgroup_subsys_state *css;
 
+	cgroup_sk_alloc_disable();
+
 	cgroup_taskset_for_each(p, css, tset) {
 		void *v = (void *)(unsigned long)css->cgroup->id;
 
-- 
2.7.4
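
The leak described in the commit message can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not kernel
code: every name prefixed with model_ and the function simulate_session() are
hypothetical, the single 64-bit word is a simplified stand-in for
sk->sk_cgrp_data, and the low-bit tag and the free path are simplified
compared with the real implementation.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a v2 cgroup; only the refcount matters here. */
struct model_cgroup {
	int refcnt;
};

/*
 * Simplified stand-in for sk->sk_cgrp_data: one word that holds either
 * a cgroup pointer (v2 mode) or v1 data tagged by the low bit.
 */
struct model_sock {
	uint64_t cgrp_data;
};

#define MODEL_IS_DATA 1ULL

/* Models the global switch that cgroup_sk_alloc_disable() flips. */
static bool model_alloc_disabled;

/* v2 side: store the cgroup pointer and take a reference, unless disabled. */
static void model_sk_alloc(struct model_sock *sk, struct model_cgroup *cg)
{
	sk->cgrp_data = 0;
	if (model_alloc_disabled)
		return;
	cg->refcnt++;
	sk->cgrp_data = (uint64_t)(uintptr_t)cg;
}

/* v1 side: store the priority index, overwriting whatever was there. */
static void model_set_prioidx(struct model_sock *sk, uint16_t prioidx)
{
	sk->cgrp_data = ((uint64_t)prioidx << 16) | MODEL_IS_DATA;
}

/* Socket teardown: a reference is dropped only if a pointer is still stored. */
static void model_sk_free(struct model_sock *sk)
{
	struct model_cgroup *cg;

	if (sk->cgrp_data & MODEL_IS_DATA)
		return; /* v1 data: the pointer, if any, is already lost */
	if (!sk->cgrp_data)
		return;
	cg = (struct model_cgroup *)(uintptr_t)sk->cgrp_data;
	assert(cg->refcnt > 0);
	cg->refcnt--;
}

/*
 * Runs one attach/alloc/free cycle and returns the references left on
 * the cgroup. switch_on_attach models the call added by the patch.
 */
int simulate_session(bool switch_on_attach)
{
	struct model_cgroup cg = { .refcnt = 0 };
	struct model_sock sk;

	model_alloc_disabled = switch_on_attach;

	model_sk_alloc(&sk, &cg);   /* sk_alloc() -> cgroup_sk_alloc() */
	model_set_prioidx(&sk, 3);  /* sock_update_netprioidx() */
	model_sk_free(&sk);         /* socket released at session exit */

	return cg.refcnt;
}
```

In this model, simulate_session(false) returns 1, meaning one reference is
left behind and the cgroup could never be freed, while simulate_session(true)
returns 0 because no reference is taken once the switch happens at attach time.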