Subject: [PATCH v2] netprio_cgroup: Fix unlimited memory leak of v2 cgroups
From: Zefan Li
To: Tejun Heo, David Miller
CC: yangyingliang, Kefeng Wang, Linux Kernel Network Developers
References: <939566f5-abe3-3526-d4ff-ec6bf8e8c138@huawei.com>
In-Reply-To: <939566f5-abe3-3526-d4ff-ec6bf8e8c138@huawei.com>
Message-ID: <2fcd921d-8f42-9d33-951c-899d0bbdd92d@huawei.com>
Date: Sat, 9 May 2020 11:32:10 +0800
X-Mailing-List: linux-kernel@vger.kernel.org

If systemd is configured to use hybrid mode, which enables the use of both cgroup v1 and v2,
systemd creates a new cgroup in both the default root (v2) and the
netprio_cgroup hierarchy (v1) for each new session and attaches the task
to both cgroups. If the task then does any networking, the v2 cgroup can
never be freed after the session exits.

One of our machines ran into OOM due to this memory leak.

In the scenario described above, when sk_alloc() is called,
cgroup_sk_alloc() thinks it is in v2 mode, so it stores the cgroup
pointer in sk->sk_cgrp_data and increments the cgroup refcnt. Then
sock_update_netprioidx() thinks it is in v1 mode, so it stores the
netprioidx value in sk->sk_cgrp_data, overwriting the cgroup pointer.
The reference taken on the cgroup is therefore never dropped.

Currently we do the mode switch when someone writes to the ifpriomap
cgroup control file. The easiest fix is to also do the switch when a
task is attached to a new cgroup.

Fixes: bd1060a1d671 ("sock, cgroup: add sock->sk_cgroup")
Reported-by: Yang Yingliang
Tested-by: Yang Yingliang
Signed-off-by: Zefan Li
---
forgot to rebase to the latest kernel.
---
 net/core/netprio_cgroup.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/net/core/netprio_cgroup.c b/net/core/netprio_cgroup.c
index 8881dd9..9bd4cab 100644
--- a/net/core/netprio_cgroup.c
+++ b/net/core/netprio_cgroup.c
@@ -236,6 +236,8 @@ static void net_prio_attach(struct cgroup_taskset *tset)
 	struct task_struct *p;
 	struct cgroup_subsys_state *css;
 
+	cgroup_sk_alloc_disable();
+
 	cgroup_taskset_for_each(p, css, tset) {
 		void *v = (void *)(unsigned long)css->id;
 
-- 
2.7.4
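
For readers following the changelog, the leak can be illustrated with a small user-space model. This is only a sketch under stated assumptions, not kernel code: every model_* name is hypothetical, the socket's cgroup data is reduced to one word plus a flag, and the free path is assumed to drop a reference only when that word still holds a cgroup pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a v2 cgroup: only its reference count. */
struct model_cgroup {
	int refcnt;
};

/*
 * Hypothetical stand-in for sk->sk_cgrp_data: one word that holds either
 * a cgroup pointer (v2 mode) or a netprio index (v1 mode).
 */
struct model_sock {
	bool is_data;	/* true: val is a prioidx, false: val is a pointer */
	uintptr_t val;
};

/* Models the global switch that the patch flips at attach time. */
static bool model_sk_alloc_disabled;

static void model_sk_alloc_disable(void)
{
	model_sk_alloc_disabled = true;
}

/* Models cgroup_sk_alloc(): in v2 mode, take a ref and store the pointer. */
static void model_cgroup_sk_alloc(struct model_sock *sk,
				  struct model_cgroup *cgrp)
{
	sk->is_data = false;
	sk->val = 0;
	if (model_sk_alloc_disabled)
		return;
	cgrp->refcnt++;
	sk->val = (uintptr_t)cgrp;
}

/* Models sock_update_netprioidx(): overwrites the same word with an index. */
static void model_update_netprioidx(struct model_sock *sk, uintptr_t idx)
{
	sk->is_data = true;
	sk->val = idx;
}

/* Assumed free path: drop the ref only if a pointer is still stored. */
static void model_sk_free(struct model_sock *sk)
{
	struct model_cgroup *cgrp;

	if (sk->is_data || sk->val == 0)
		return;
	cgrp = (struct model_cgroup *)sk->val;
	assert(cgrp->refcnt > 0);
	cgrp->refcnt--;
}

/* Runs one socket lifetime and returns the refs left on the v2 cgroup. */
int model_leaked_refs(bool disable_on_attach)
{
	struct model_cgroup v2 = { .refcnt = 0 };
	struct model_sock sk;

	model_sk_alloc_disabled = false;
	if (disable_on_attach)
		model_sk_alloc_disable();	/* what the patch adds at attach */

	model_cgroup_sk_alloc(&sk, &v2);	/* believes it is in v2 mode */
	model_update_netprioidx(&sk, 3);	/* believes it is in v1 mode */
	model_sk_free(&sk);
	return v2.refcnt;
}
```

In this model, model_leaked_refs(false) leaves one reference behind, which corresponds to the v2 cgroup that can never be freed, while model_leaked_refs(true) leaves none because the switch happens before any socket takes a reference.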