Received: by 2002:ac0:a5a7:0:0:0:0:0 with SMTP id m36-v6csp5646245imm; Mon, 23 Jul 2018 03:36:23 -0700 (PDT) X-Google-Smtp-Source: AAOMgpfcfGTWBX4sKRq8V+DQDGeOYo2XuwnQ2BKtX4csMsjh2E5f2o7TQFVyoBq3zW3DpUgpEvOj X-Received: by 2002:a65:6104:: with SMTP id z4-v6mr11686181pgu.361.1532342183415; Mon, 23 Jul 2018 03:36:23 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1532342183; cv=none; d=google.com; s=arc-20160816; b=wi9P7ijH3UavuHj0qGmp9NZUj1Bv1ibm/wGYKoSJMhu0EGYOegqWRfXusDchOGIA6k ACkWwrMDgZ7SSqPCD9BFDgYZAdWXOXOHtuwrG4LY5UnT8eeXV45Vcjkde4aiDe11poeF az8/BAyCVcV8oW0wQIT4t9s9tvLGvXFlbLsQ/uMu1w5OfLrCsTITSO36V24VtDReEZJB 31FYyiRxTFQY2TwK0jbdqbNeAlUEnNoKT7/10a7lFv1KZ8s2f9z8BLNKBOmBcvVPkQNh wwSkowAz40JZb85fuXsDYAq8/hIgyS4wi/lHAHyqOpnGGnZz1unZUqgFU/3lgNHX1UT6 vDsA== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:cc:to:subject:message-id:date:from :references:in-reply-to:mime-version:dkim-signature :arc-authentication-results; bh=hRI847Eh0Vcz9gjMfP1LGwddkgt3rwkP+Hx/1MlDA2k=; b=MkEov3H61kyrY9moUdTFWtlhKH9fIoKIZqPm7/H54dH0a/5+oxoj5ZiDslSjmB5TC2 3Oc2N+YfLdPwhHAwtXxheBAcIaNatnJOu2ZQrajYFfdIbEN1bMyyubwWq9LyRVZjUA5U 2b18VGwqv8BBFLtfNIjZbOAjT7e7c443e0hBxYpMKEmKjWiHn/eBs9JlMdzLueIpwmqI g7czfYzwGof1yUky3uCgivOY20YIV4ohrMLge3zMLPEvIp55WAUmcIaFPgM25dx05sWL bQRCHhX1eDwg5Il/jwFaRdvR7+JU+x2QEjoWTBjjUIzyVLQAhcomiJzqpKB4MhhUPOGY Qxvw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=QPfLNdJl; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id 16-v6si8119449pfy.169.2018.07.23.03.36.08; Mon, 23 Jul 2018 03:36:23 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b=QPfLNdJl; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2388202AbeGWLfK (ORCPT + 99 others); Mon, 23 Jul 2018 07:35:10 -0400 Received: from mail-oi0-f66.google.com ([209.85.218.66]:37390 "EHLO mail-oi0-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2388065AbeGWLfK (ORCPT ); Mon, 23 Jul 2018 07:35:10 -0400 Received: by mail-oi0-f66.google.com with SMTP id k81-v6so241642oib.4; Mon, 23 Jul 2018 03:34:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=mime-version:in-reply-to:references:from:date:message-id:subject:to :cc; bh=hRI847Eh0Vcz9gjMfP1LGwddkgt3rwkP+Hx/1MlDA2k=; b=QPfLNdJl2cJIv766mCDNUsk0WFMdzPvddOAWzz8WgkxZ/ulhb5DiNlfC7QWBILd1Gm r2vkcN4z6IFVW1yXYu2qkExc7gWtTSgxqvL32I79g+t6deAAc2YSGfxkXSB7n7d5iYMn 5z9D6Qsga/xReWBXQpH+MgtObm0O9bRV0cT7RZd/B6Ixud7zjkkBeroBVZo8kVZRz4wU aiwW0h8yP4pFDyamvW2HLM5Oh+H3iNK7h22bzYLLqD0Z3VLvqaXTzRIDjdYp5qh9bOH3 3Nd+2oSaDpYf9u2q8rerdMShWZ7DpoPZm1lUpx339Vcu/tbfiF6quMPSjgsCdFKS+D9k 971w== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:mime-version:in-reply-to:references:from:date :message-id:subject:to:cc; bh=hRI847Eh0Vcz9gjMfP1LGwddkgt3rwkP+Hx/1MlDA2k=; b=QVhMc4+MB26bf5tgSD1Vdy/kJbiSF6kEGa6wVZNoiOPhP4X58wNnnMx7vj3t417SKD 9KAnYLGWBRT3+rfqG3I9vZWEuHE3t9Yne7YGeS2NL1/XbROtfCLkXhQyqEhcA+Vni0/b 3rmJHKXMaLlehogMd4FRnBwl4v2cpGj3DF2YoGuM9899uVxaa+L1TZE47VznRRRoIbcD 81vfgjghvnYhb9Y/qSjjqx3uuttKNvXKd9ly3mGH5PEmxzUEi0eEHspq42jwPJQkf1B3 8Go1Ml3SVqHtjeUO5gfVD9KiypD75j/zK8qGoyUod2VMtfgx72oZV7gpC6uswVUlJJyx gJSg== X-Gm-Message-State: AOUpUlFSWg1jkTWbAbp2ivNWSQqsp0YD7u0xcEoHOlgxru9EkZmOiscL iVwk7u921a77td7Ji0hP+ENCALG8dIm+FBj4WDI= X-Received: by 2002:aca:ea57:: with SMTP id i84-v6mr7383058oih.266.1532342076662; Mon, 23 Jul 2018 03:34:36 -0700 (PDT) MIME-Version: 1.0 Received: by 2002:ac9:ca5:0:0:0:0:0 with HTTP; Mon, 23 Jul 2018 03:34:36 -0700 (PDT) In-Reply-To: References: <20180722143354.23722-1-cscnull@gmail.com> <20180722163925.gdfkndldatsoae6x@breakpoint.cc> <20180722.100755.19840167505550163.davem@davemloft.net> <20180722180910.wcwhantwpm2nfxet@breakpoint.cc> <20180723091551.mwhltw4ujm4bylvj@salvia> <20180723092818.ztsfsnqzxgzrauim@breakpoint.cc> From: shaochun chen Date: Mon, 23 Jul 2018 18:34:36 +0800 Message-ID: Subject: Re: [PATCH] netlink: fix memory leak of dump To: Florian Westphal Cc: Pablo Neira Ayuso , David Miller , kadlec , "johannes.berg" , jason , ktkhai , "lucien.xin" , "xiyou.wangcong" , dsahern , netfilter-devel , tom , netdev , linux-kernel Content-Type: multipart/alternative; boundary="0000000000002769a70571a8310b" Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org --0000000000002769a70571a8310b Content-Type: text/plain; charset="UTF-8" I have a question: we will try_module_get in __netlink_dump_start(), but why we need to call try_module_get again in nft_netlink_dump_start ?? 2018-07-23 17:52 GMT+08:00 shaochun chen : > allocate memory in ->start(), it's not convenient for users. > if call ->done() isn't ok for clean memory when netlink_dump_start() fail, > maybe we should have another function ->clean() to clean memory. > > 2018-07-23 17:28 GMT+08:00 Florian Westphal : > >> Pablo Neira Ayuso wrote: >> > > diff --git a/net/netfilter/nf_tables_api.c >> b/net/netfilter/nf_tables_api.c >> > > --- a/net/netfilter/nf_tables_api.c >> > > +++ b/net/netfilter/nf_tables_api.c >> > > @@ -5010,6 +5013,22 @@ nft_obj_filter_alloc(const struct nlattr * >> const nla[]) >> > > return filter; >> > > } >> > > >> > > +static int nf_tables_dump_obj_start(struct netlink_callback *cb) >> > > +{ >> > > + const struct nlattr * const *nla = cb->data; >> >> On-Stack input. >> I can't see how its wrong, ->start() happens from same context as >> netlink_dump_start so its valid. >> >> > > + struct nft_obj_filter *filter = NULL; >> > > + >> > > + if (nla[NFTA_OBJ_TABLE] || >> > > + nla[NFTA_OBJ_TYPE]) { >> > > + filter = nft_obj_filter_alloc(nla); >> > > + if (IS_ERR(filter)) >> > > + return -ENOMEM; >> > > + } >> > > + >> > > + cb->data = filter; >> >> And this replaced the on-stack input with dynamically >> allocated one, which will be free'd via ->done(). >> >> > > /* called with rcu_read_lock held */ >> > > static int nf_tables_getobj(struct net *net, struct sock *nlsk, >> > > struct sk_buff *skb, const struct nlmsghdr >> *nlh, >> > > @@ -5028,21 +5047,13 @@ static int nf_tables_getobj(struct net *net, >> struct sock *nlsk, >> > > >> > > if (nlh->nlmsg_flags & NLM_F_DUMP) { >> > > struct netlink_dump_control c = { >> > > + .start = nf_tables_dump_obj_start, >> > > .dump = nf_tables_dump_obj, >> > > .done = nf_tables_dump_obj_done, >> > > .module = THIS_MODULE, >> > > + .data = (void *)nla, >> > >> > You cannot do this. >> > >> > nla is allocated in this stack. >> >> Yes. >> >> > the nla will not be available in the second recv(), it won't be there. >> >> Its replaced in ->start(). >> >> As David pointed out, once ->start() returns 0 we set cb_running, i.e. >> only after successful ->start() netlink core will call ->dump() again. >> >> So I see no problem setting ->data to onstack cookie and then >> duplicating it to heap via kmemdup in ->start(). >> >> As far as I can see netlink core offers all functionality already, >> so we only need to switch netfilter to make use of it. >> >> If you disagree please let me know, otherwise I will cook up >> a patch along this pattern for net/netfilter/*. >> >> Thanks. >> > > --0000000000002769a70571a8310b Content-Type: text/html; charset="UTF-8" Content-Transfer-Encoding: quoted-printable
I have a question: we will try_module_get in=C2=A0__netlink_dump_start(),=
but why we need t= o call=C2=A0try_module_get again in=20 =C2=A0nft_netlink_dump_start=C2=A0??

<= div class=3D"gmail_quote">2018-07-23 17:52 GMT+08:00 shaochun chen <cscnul= l@gmail.com>:
allocate memory in ->start(),=C2=A0 it's not convenient for user= s.
if call ->done() isn't ok for clean memory when netlink_dump_= start() fail,
maybe we should have another function ->clean() = to clean memory.

2018-07-23 17:28 GMT+08:= 00 Florian Westphal <fw@strlen.de>:
Pablo Neira Ayuso <pablo@netfilter.org> wrote:
> > diff --git a/net/netfilter/nf_tables_api.c b/net/netfilter/n= f_tables_api.c
> > --- a/net/netfilter/nf_tables_api.c
> > +++ b/net/netfilter/nf_tables_api.c
> > @@ -5010,6 +5013,22 @@ nft_obj_filter_alloc(const struct nlattr *= const nla[])
> >=C2=A0 =C2=A0 =C2=A0return filter;
> >=C2=A0 }
> >=C2=A0
> > +static int nf_tables_dump_obj_start(struct netlink_callback= *cb)
> > +{
> > +=C2=A0 =C2=A0const struct nlattr * const *nla =3D cb->data;
On-Stack input.
I can't see how its wrong, ->start() happens from same context as netlink_dump_start so its valid.

> > +=C2=A0 =C2=A0struct nft_obj_filter *filter =3D NULL;
> > +
> > +=C2=A0 =C2=A0if (nla[NFTA_OBJ_TABLE] ||
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0nla[NFTA_OBJ_TYPE]) {
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0filter =3D nft_obj_filt= er_alloc(nla);
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0if (IS_ERR(filter))
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0return -ENOMEM;
> > +=C2=A0 =C2=A0}
> > +
> > +=C2=A0 =C2=A0cb->data =3D filter;

And this replaced the on-stack input with dynamically
allocated one, which will be free'd via ->done().

> >=C2=A0 /* called with rcu_read_lock held */
> >=C2=A0 static int nf_tables_getobj(struct net *net, struct sock *n= lsk,
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0 =C2=A0 =C2=A0struct sk_buff *skb, const struct nlmsghdr *nlh, > > @@ -5028,21 +5047,13 @@ static int nf_tables_getobj(struct net *n= et, struct sock *nlsk,
> >=C2=A0
> >=C2=A0 =C2=A0 =C2=A0if (nlh->nlmsg_flags & NLM_F_DUMP) { > >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0struct netlink_dum= p_control c =3D {
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0.start =3D nf_tables_dump_obj_start,
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0.dump =3D nf_tables_dump_obj,
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0.done =3D nf_tables_dump_obj_done,
> >=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2= =A0 =C2=A0.module =3D THIS_MODULE,
> > +=C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 =C2=A0 = =C2=A0.data =3D (void *)nla,
>
> You cannot do this.
>
> nla is allocated in this stack.

Yes.

> the nla will not be available in the second recv(), it won't be th= ere.

Its replaced in ->start().

As David pointed out, once ->start() returns 0 we set cb_running, i.e. only after successful ->start() netlink core will call ->dump() again= .

So I see no problem setting ->data to onstack cookie and then
duplicating it to heap via kmemdup in ->start().

As far as I can see netlink core offers all functionality already,
so we only need to switch netfilter to make use of it.

If you disagree please let me know, otherwise I will cook up
a patch along this pattern for net/netfilter/*.

Thanks.


--0000000000002769a70571a8310b--