From: Michal Hocko <[email protected]>
eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc()
in xt_alloc_table_info()") has unintentionally fortified
xt_alloc_table_info allocation when __GFP_RETRY has been dropped from
the vmalloc fallback. Later on there was a syzbot report that this
can lead to OOM killer invocations when tables are too large and
0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
has been merged to restore the original behavior. Georgi Nikolov however
noticed that he is not able to install his iptables anymore so this can
be seen as a regression.
The primary argument for 0537250fdc6c was that this allocation path
shouldn't really trigger the OOM killer and kill innocent tasks. On the
other hand the interface requires root and as such should allow what the
admin asks for. Root inside a namespaces makes this more complicated
because those might be not trusted in general. If they are not then such
namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY
and replace it by __GFP_ACCOUNT to enfore memcg constrains on it.
Fixes: 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
Reported-by: Georgi Nikolov <[email protected]>
Suggested-by: Vlastimil Babka <[email protected]>
Acked-by: Florian Westphal <[email protected]>
Signed-off-by: Michal Hocko <[email protected]>
---
net/netfilter/x_tables.c | 7 +------
1 file changed, 1 insertion(+), 6 deletions(-)
diff --git a/net/netfilter/x_tables.c b/net/netfilter/x_tables.c
index d0d8397c9588..aecadd471e1d 100644
--- a/net/netfilter/x_tables.c
+++ b/net/netfilter/x_tables.c
@@ -1178,12 +1178,7 @@ struct xt_table_info *xt_alloc_table_info(unsigned int size)
if (sz < sizeof(*info) || sz >= XT_MAX_TABLE_SIZE)
return NULL;
- /* __GFP_NORETRY is not fully supported by kvmalloc but it should
- * work reasonably well if sz is too large and bail out rather
- * than shoot all processes down before realizing there is nothing
- * more to reclaim.
- */
- info = kvmalloc(sz, GFP_KERNEL | __GFP_NORETRY);
+ info = kvmalloc(sz, GFP_KERNEL_ACCOUNT);
if (!info)
return NULL;
--
2.18.0
On 08/07/2018 09:54 PM, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc()
> in xt_alloc_table_info()") has unintentionally fortified
> xt_alloc_table_info allocation when __GFP_RETRY has been dropped from
> the vmalloc fallback. Later on there was a syzbot report that this
> can lead to OOM killer invocations when tables are too large and
> 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
> has been merged to restore the original behavior. Georgi Nikolov however
> noticed that he is not able to install his iptables anymore so this can
> be seen as a regression.
>
> The primary argument for 0537250fdc6c was that this allocation path
> shouldn't really trigger the OOM killer and kill innocent tasks. On the
> other hand the interface requires root and as such should allow what the
> admin asks for. Root inside a namespaces makes this more complicated
> because those might be not trusted in general. If they are not then such
> namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY
> and replace it by __GFP_ACCOUNT to enfore memcg constrains on it.
>
> Fixes: 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
> Reported-by: Georgi Nikolov <[email protected]>
> Suggested-by: Vlastimil Babka <[email protected]>
> Acked-by: Florian Westphal <[email protected]>
> Signed-off-by: Michal Hocko <[email protected]>
Acked-by: Vlastimil Babka <[email protected]>
IIRC According to Florian there are more places like this in the
netfilter code?
On Wed 08-08-18 10:16:01, Vlastimil Babka wrote:
> On 08/07/2018 09:54 PM, Michal Hocko wrote:
> > From: Michal Hocko <[email protected]>
> >
> > eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc()
> > in xt_alloc_table_info()") has unintentionally fortified
> > xt_alloc_table_info allocation when __GFP_RETRY has been dropped from
> > the vmalloc fallback. Later on there was a syzbot report that this
> > can lead to OOM killer invocations when tables are too large and
> > 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
> > has been merged to restore the original behavior. Georgi Nikolov however
> > noticed that he is not able to install his iptables anymore so this can
> > be seen as a regression.
> >
> > The primary argument for 0537250fdc6c was that this allocation path
> > shouldn't really trigger the OOM killer and kill innocent tasks. On the
> > other hand the interface requires root and as such should allow what the
> > admin asks for. Root inside a namespaces makes this more complicated
> > because those might be not trusted in general. If they are not then such
> > namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY
> > and replace it by __GFP_ACCOUNT to enfore memcg constrains on it.
> >
> > Fixes: 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
> > Reported-by: Georgi Nikolov <[email protected]>
> > Suggested-by: Vlastimil Babka <[email protected]>
> > Acked-by: Florian Westphal <[email protected]>
> > Signed-off-by: Michal Hocko <[email protected]>
>
> Acked-by: Vlastimil Babka <[email protected]>
Thanks!
> IIRC According to Florian there are more places like this in the
> netfilter code?
Well, this is more for netfilter guys. I can only give a general
guidance that generally untrusted allocations triggered from userspace
should be a subject of kmem accounting.
--
Michal Hocko
SUSE Labs
On Tue, Aug 07, 2018 at 09:54:00PM +0200, Michal Hocko wrote:
> From: Michal Hocko <[email protected]>
>
> eacd86ca3b03 ("net/netfilter/x_tables.c: use kvmalloc()
> in xt_alloc_table_info()") has unintentionally fortified
> xt_alloc_table_info allocation when __GFP_RETRY has been dropped from
> the vmalloc fallback. Later on there was a syzbot report that this
> can lead to OOM killer invocations when tables are too large and
> 0537250fdc6c ("netfilter: x_tables: make allocation less aggressive")
> has been merged to restore the original behavior. Georgi Nikolov however
> noticed that he is not able to install his iptables anymore so this can
> be seen as a regression.
>
> The primary argument for 0537250fdc6c was that this allocation path
> shouldn't really trigger the OOM killer and kill innocent tasks. On the
> other hand the interface requires root and as such should allow what the
> admin asks for. Root inside a namespaces makes this more complicated
> because those might be not trusted in general. If they are not then such
> namespaces should be restricted anyway. Therefore drop the __GFP_NORETRY
> and replace it by __GFP_ACCOUNT to enfore memcg constrains on it.
Applied, thanks.