Received: by 2002:a05:6a10:2726:0:0:0:0 with SMTP id ib38csp3461556pxb; Mon, 4 Apr 2022 17:44:09 -0700 (PDT) X-Google-Smtp-Source: ABdhPJwms0cWFQEOfjR+M8XCzkL8bV4bs3ucFauWVL6ZqZuuaisH8CLavMxBsIn7fLoGoF2Iy9Fp X-Received: by 2002:a17:902:da91:b0:153:cc84:f13f with SMTP id j17-20020a170902da9100b00153cc84f13fmr807167plx.38.1649119449464; Mon, 04 Apr 2022 17:44:09 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1649119449; cv=none; d=google.com; s=arc-20160816; b=Hd4OOHuDDpJy8urt5JdAzpMq4hSW+WkVhRyl2PMrVj+spRobvg5Nt3XiE2g0Po9ohk bfSIpwqVvwUre7xUMFsIUaJ/cgx4r1xd9Q0Z2r6CWOgEmNzpOOChkftLh97rW+EtqULo JjzXBADI89IfIBl8m/mBHibas3NZCJ5arTaoVsLlrjGtYv5oQLZgvUXXs8s3RyXbHc9p +Zz+NK/VH8AOIJsJH06xlB2TusuG6VcSP4THuZK4SOnKa5eLheiyLknzA4AXt2BnXjRz Y9dUCvkTxnDwSEeHrdCGUZ2SkFk9DHvmgdbumeKaj5Z+NSlKrVoV2v4fIJOWddwtjTVh A9CA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:user-agent:in-reply-to:content-disposition :mime-version:references:message-id:subject:cc:to:from:date; bh=uS0h8wDUOrTvwqX+X7Jmw3uo7NSPqnA6ZZMASd0hV1Y=; b=F0ATLYLz2Rn/fz6FJaQ4jR9rVH0kW7QZ8fP8Cun1Wrg1n8fdbIKE7KQk9RHNUI9hyc YCeoQxPr5BTBEtUZ1FxbFUtoQdbdQ9lFsmw+CQDmHMn5hXKaGpKftwEsvPGU6hl8VvpU SF6k2vVh1MGNrNz5/waG/DWj6th8JFZ+K2BPD/P0iRA4ykbMpKiEt/aHwflLfFooetEy nJx3wUT96P+3T4Z//n4dZhXN6ndG/zhcndKMPTTE2DFwfEwrDLnryti38Yr3JpNlscSI 6C/WtKiRBf0UKlbKpAzQqJrGqW+/SRgl1TfxZjKadz/NMKTZj+s5XnPbDqScOWkjVZfC v4pQ== ARC-Authentication-Results: i=1; mx.google.com; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from lindbergh.monkeyblade.net (lindbergh.monkeyblade.net. [23.128.96.19]) by mx.google.com with ESMTPS id b3-20020a637143000000b0039890725054si11562388pgn.711.2022.04.04.17.44.08 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 04 Apr 2022 17:44:09 -0700 (PDT) Received-SPF: softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) client-ip=23.128.96.19; Authentication-Results: mx.google.com; spf=softfail (google.com: domain of transitioning linux-kernel-owner@vger.kernel.org does not designate 23.128.96.19 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: from vger.kernel.org (vger.kernel.org [23.128.96.18]) by lindbergh.monkeyblade.net (Postfix) with ESMTP id 97A91A146A; Mon, 4 Apr 2022 16:56:44 -0700 (PDT) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1345728AbiDAMFi (ORCPT + 99 others); Fri, 1 Apr 2022 08:05:38 -0400 Received: from lindbergh.monkeyblade.net ([23.128.96.19]:41996 "EHLO lindbergh.monkeyblade.net" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1345187AbiDAMFg (ORCPT ); Fri, 1 Apr 2022 08:05:36 -0400 Received: from Chamillionaire.breakpoint.cc (Chamillionaire.breakpoint.cc [IPv6:2a0a:51c0:0:12e:520::1]) by lindbergh.monkeyblade.net (Postfix) with ESMTPS id 8ADAB5F68; Fri, 1 Apr 2022 05:03:45 -0700 (PDT) Received: from fw by Chamillionaire.breakpoint.cc with local (Exim 4.92) (envelope-from ) id 1naG0E-0007M7-Ag; Fri, 01 Apr 2022 14:03:42 +0200 Date: Fri, 1 Apr 2022 14:03:42 +0200 From: Florian Westphal To: Vasily Averin Cc: Pablo Neira Ayuso , kernel@openvz.org, Jozsef Kadlecsik , Florian Westphal , netfilter-devel@vger.kernel.org, linux-kernel@vger.kernel.org, Roman Gushchin Subject: Re: [PATCH nft] nft: memcg accounting for dynamically allocated objects Message-ID: <20220401120342.GC9545@breakpoint.cc> References: MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: User-Agent: Mutt/1.10.1 (2018-07-13) X-Spam-Status: No, score=-1.9 required=5.0 tests=BAYES_00, HEADER_FROM_DIFFERENT_DOMAINS,MAILING_LIST_MULTI,RDNS_NONE, SPF_HELO_NONE,T_SCC_BODY_TEXT_LINE autolearn=no autolearn_force=no version=3.4.6 X-Spam-Checker-Version: SpamAssassin 3.4.6 (2021-04-09) on lindbergh.monkeyblade.net Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Vasily Averin wrote: > nft_*.c files whose NFT_EXPR_STATEFUL flag is set on need to > use __GFP_ACCOUNT flag for objects that are dynamically > allocated from the packet path. > > Such objects are allocated inside .init() or .clone() callbacks > of struct nft_expr_ops executed in task context while processing > netlink messages. They can also be called from packet path. > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/nf_tables_api.c > index 04be94236a34..e01241151ef7 100644 > --- a/net/netfilter/nf_tables_api.c > +++ b/net/netfilter/nf_tables_api.c > @@ -5447,7 +5447,7 @@ int nft_set_elem_expr_clone(const struct nft_ctx *ctx, struct nft_set *set, > int err, i, k; > > for (i = 0; i < set->num_exprs; i++) { > - expr = kzalloc(set->exprs[i]->ops->size, GFP_KERNEL); > + expr = kzalloc(set->exprs[i]->ops->size, GFP_KERNEL_ACCOUNT); > if (!expr) > goto err_expr; This is ok. > diff --git a/net/netfilter/nft_connlimit.c b/net/netfilter/nft_connlimit.c > index 3362417ebfdb..9c2146aac59e 100644 > --- a/net/netfilter/nft_connlimit.c > +++ b/net/netfilter/nft_connlimit.c > @@ -77,7 +77,7 @@ static int nft_connlimit_do_init(const struct nft_ctx *ctx, > invert = true; > } > > - priv->list = kmalloc(sizeof(*priv->list), GFP_KERNEL); > + priv->list = kmalloc(sizeof(*priv->list), GFP_KERNEL_ACCOUNT); > if (!priv->list) > return -ENOMEM; Same. > @@ -214,7 +214,7 @@ static int nft_connlimit_clone(struct nft_expr *dst, const struct nft_expr *src) > struct nft_connlimit *priv_dst = nft_expr_priv(dst); > struct nft_connlimit *priv_src = nft_expr_priv(src); > > - priv_dst->list = kmalloc(sizeof(*priv_dst->list), GFP_ATOMIC); > + priv_dst->list = kmalloc(sizeof(*priv_dst->list), GFP_ATOMIC | __GFP_ACCOUNT); This can be called from packet path, via nft_dynset.c. nft_do_chain -> nft_dynset_eval -> nft_dynset_new -> nft_dynset_expr_setup -> nft_expr_clone -> src->ops->clone() > diff --git a/net/netfilter/nft_counter.c b/net/netfilter/nft_counter.c > index f179e8c3b0ca..040a697d96b3 100644 > --- a/net/netfilter/nft_counter.c > +++ b/net/netfilter/nft_counter.c > @@ -62,7 +62,7 @@ static int nft_counter_do_init(const struct nlattr * const tb[], > struct nft_counter __percpu *cpu_stats; > struct nft_counter *this_cpu; > > - cpu_stats = alloc_percpu(struct nft_counter); > + cpu_stats = alloc_percpu_gfp(struct nft_counter, GFP_KERNEL_ACCOUNT); This is ok. > @@ -235,7 +235,7 @@ static int nft_counter_clone(struct nft_expr *dst, const struct nft_expr *src) > > nft_counter_fetch(priv, &total); > > - cpu_stats = alloc_percpu_gfp(struct nft_counter, GFP_ATOMIC); > + cpu_stats = alloc_percpu_gfp(struct nft_counter, GFP_ATOMIC | __GFP_ACCOUNT); > if (cpu_stats == NULL) > return -ENOMEM; Same problem as connlimit, can be called from packet path. Basically all GFP_ATOMIC are suspicious. Not sure how to resolve this, similar mechanics in iptables world (e.g. connlimit or SET target) don't use memcg accounting. Perhaps for now resend with only the GFP_KERNEL parts converted? Those are safe. Insertion from packet path is limited by set->size (element count) only at this time.