From: Pavel Skripkin <[email protected]>
Date: Sat, 27 Feb 2021 20:51:14 +0300
Hi,
> syzbot found WARNING in __alloc_pages_nodemask()[1] when order >= MAX_ORDER.
> It was caused by __netdev_alloc_skb(), which doesn't check len value after adding NET_SKB_PAD.
> Order will be >= MAX_ORDER and passed to __alloc_pages_nodemask() if size > KMALLOC_MAX_SIZE.
> Same happens in __napi_alloc_skb.
>
> static void *kmalloc_large_node(size_t size, gfp_t flags, int node)
> {
> struct page *page;
> void *ptr = NULL;
> unsigned int order = get_order(size);
> ...
> page = alloc_pages_node(node, flags, order);
> ...
>
> [1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014
> Call Trace:
> __alloc_pages include/linux/gfp.h:511 [inline]
> __alloc_pages_node include/linux/gfp.h:524 [inline]
> alloc_pages_node include/linux/gfp.h:538 [inline]
> kmalloc_large_node+0x60/0x110 mm/slub.c:3999
> __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496
> __kmalloc_reserve net/core/skbuff.c:150 [inline]
> __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210
> __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446
> netdev_alloc_skb include/linux/skbuff.h:2832 [inline]
> qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442
> qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98
> call_write_iter include/linux/fs.h:1901 [inline]
> new_sync_write+0x426/0x650 fs/read_write.c:518
> vfs_write+0x791/0xa30 fs/read_write.c:605
> ksys_write+0x12d/0x250 fs/read_write.c:658
> do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> entry_SYSCALL_64_after_hwframe+0x44/0xa9
Ah, by the way. Have you tried to seek for the root cause, why
a request for such insanely large (at least 4 Mib) skb happens
in QRTR? I don't believe it's intended to be like this.
Now I feel that silencing this error with early return isn't
really correct approach for this.
> Reported-by: [email protected]
> Signed-off-by: Pavel Skripkin <[email protected]>
>
> ---
> Changes from v3:
> * Removed Change-Id and extra tabs in net/core/skbuff.c
>
> Changes from v2:
> * Added length check to __napi_alloc_skb
> * Added unlikely() in checks
>
> Change from v1:
> * Added length check to __netdev_alloc_skb
> ---
> net/core/skbuff.c | 6 ++++++
> 1 file changed, 6 insertions(+)
>
> diff --git a/net/core/skbuff.c b/net/core/skbuff.c
> index 785daff48030..ec7ba8728b61 100644
> --- a/net/core/skbuff.c
> +++ b/net/core/skbuff.c
> @@ -443,6 +443,9 @@ struct sk_buff *__netdev_alloc_skb(struct net_device *dev, unsigned int len,
> if (len <= SKB_WITH_OVERHEAD(1024) ||
> len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
> (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> + if (unlikely(len > KMALLOC_MAX_SIZE))
> + return NULL;
> +
> skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
> if (!skb)
> goto skb_fail;
> @@ -517,6 +520,9 @@ struct sk_buff *__napi_alloc_skb(struct napi_struct *napi, unsigned int len,
> if (len <= SKB_WITH_OVERHEAD(1024) ||
> len > SKB_WITH_OVERHEAD(PAGE_SIZE) ||
> (gfp_mask & (__GFP_DIRECT_RECLAIM | GFP_DMA))) {
> + if (unlikely(len > KMALLOC_MAX_SIZE))
> + return NULL;
> +
> skb = __alloc_skb(len, gfp_mask, SKB_ALLOC_RX, NUMA_NO_NODE);
> if (!skb)
> goto skb_fail;
> --
> 2.25.1
Thanks,
Al
On Sun, 28 Feb 2021 18:14:46 +0000 Alexander Lobakin wrote:
> > [1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014
> > Call Trace:
> > __alloc_pages include/linux/gfp.h:511 [inline]
> > __alloc_pages_node include/linux/gfp.h:524 [inline]
> > alloc_pages_node include/linux/gfp.h:538 [inline]
> > kmalloc_large_node+0x60/0x110 mm/slub.c:3999
> > __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496
> > __kmalloc_reserve net/core/skbuff.c:150 [inline]
> > __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210
> > __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446
> > netdev_alloc_skb include/linux/skbuff.h:2832 [inline]
> > qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442
> > qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98
> > call_write_iter include/linux/fs.h:1901 [inline]
> > new_sync_write+0x426/0x650 fs/read_write.c:518
> > vfs_write+0x791/0xa30 fs/read_write.c:605
> > ksys_write+0x12d/0x250 fs/read_write.c:658
> > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> > entry_SYSCALL_64_after_hwframe+0x44/0xa9
>
> Ah, by the way. Have you tried to seek for the root cause, why
> a request for such insanely large (at least 4 Mib) skb happens
> in QRTR? I don't believe it's intended to be like this.
> Now I feel that silencing this error with early return isn't
> really correct approach for this.
Right, IIUC Eric suggested we limit the length of the allocation
to 64KB because that's the max reasonable skb length, and QRTR tun write
results in generating a single skb. That seems like a good approach.
From: Jakub Kicinski <[email protected]>
Date: Sun, 28 Feb 2021 10:55:52 -0800
> On Sun, 28 Feb 2021 18:14:46 +0000 Alexander Lobakin wrote:
> > > [1] WARNING in __alloc_pages_nodemask+0x5f8/0x730 mm/page_alloc.c:5014
> > > Call Trace:
> > > __alloc_pages include/linux/gfp.h:511 [inline]
> > > __alloc_pages_node include/linux/gfp.h:524 [inline]
> > > alloc_pages_node include/linux/gfp.h:538 [inline]
> > > kmalloc_large_node+0x60/0x110 mm/slub.c:3999
> > > __kmalloc_node_track_caller+0x319/0x3f0 mm/slub.c:4496
> > > __kmalloc_reserve net/core/skbuff.c:150 [inline]
> > > __alloc_skb+0x4e4/0x5a0 net/core/skbuff.c:210
> > > __netdev_alloc_skb+0x70/0x400 net/core/skbuff.c:446
> > > netdev_alloc_skb include/linux/skbuff.h:2832 [inline]
> > > qrtr_endpoint_post+0x84/0x11b0 net/qrtr/qrtr.c:442
> > > qrtr_tun_write_iter+0x11f/0x1a0 net/qrtr/tun.c:98
> > > call_write_iter include/linux/fs.h:1901 [inline]
> > > new_sync_write+0x426/0x650 fs/read_write.c:518
> > > vfs_write+0x791/0xa30 fs/read_write.c:605
> > > ksys_write+0x12d/0x250 fs/read_write.c:658
> > > do_syscall_64+0x2d/0x70 arch/x86/entry/common.c:46
> > > entry_SYSCALL_64_after_hwframe+0x44/0xa9
> >
> > Ah, by the way. Have you tried to seek for the root cause, why
> > a request for such insanely large (at least 4 Mib) skb happens
> > in QRTR? I don't believe it's intended to be like this.
> > Now I feel that silencing this error with early return isn't
> > really correct approach for this.
>
> Right, IIUC Eric suggested we limit the length of the allocation
> to 64KB because that's the max reasonable skb length, and QRTR tun write
> results in generating a single skb. That seems like a good approach.
+1 for explicit capping max skb length to 64K.
Al