2022-06-06 07:35:39

by Muchun Song

Subject: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

On our servers, high-order (>= 6) memory may be unavailable because we
reserve many HugeTLB pages at boot. The kmalloc_array() call then fails
and the system panics. Use kvmalloc_array() to allocate table_perturb
instead.

Signed-off-by: Muchun Song <[email protected]>
---
 net/ipv4/inet_hashtables.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/net/ipv4/inet_hashtables.c b/net/ipv4/inet_hashtables.c
index e8de5e699b3f..1ecbfdebc6bf 100644
--- a/net/ipv4/inet_hashtables.c
+++ b/net/ipv4/inet_hashtables.c
@@ -1026,8 +1026,8 @@ void __init inet_hashinfo2_init(struct inet_hashinfo *h, const char *name,
 	init_hashinfo_lhash2(h);

 	/* this one is used for source ports of outgoing connections */
-	table_perturb = kmalloc_array(INET_TABLE_PERTURB_SIZE,
-				      sizeof(*table_perturb), GFP_KERNEL);
+	table_perturb = kvmalloc_array(INET_TABLE_PERTURB_SIZE,
+				       sizeof(*table_perturb), GFP_KERNEL);
 	if (!table_perturb)
 		panic("TCP: failed to alloc table_perturb");
 }
--
2.11.0
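
The "order >= 6" figure in the commit message can be reproduced with a
small userspace calculation. The sketch below assumes 4 KiB base pages
and a table of 65536 four-byte entries; these values and every
EXAMPLE_/example_ name are illustrative assumptions, not taken from the
patch. The helper is a userspace stand-in for the kernel's get_order(),
not the kernel implementation.

```c
#include <assert.h>

/* Assumed values for illustration only; none of them appear in the patch. */
#define EXAMPLE_PAGE_SIZE     4096UL   /* assumed 4 KiB base page */
#define EXAMPLE_TABLE_ENTRIES 65536UL  /* assumed INET_TABLE_PERTURB_SIZE */
#define EXAMPLE_ENTRY_SIZE    4UL      /* assumed sizeof(*table_perturb) */

/*
 * Return the smallest order such that (page_size << order) >= size,
 * i.e. how many physically contiguous pages (as a power of two) a
 * kmalloc-style allocation of that size would need.
 */
static unsigned int example_order_for(unsigned long size,
				      unsigned long page_size)
{
	unsigned int order = 0;
	unsigned long span = page_size;

	assert(size > 0 && page_size > 0);
	while (span < size) {
		span <<= 1;
		order++;
	}
	return order;
}
```

Under these assumptions,
example_order_for(EXAMPLE_TABLE_ENTRIES * EXAMPLE_ENTRY_SIZE,
EXAMPLE_PAGE_SIZE) evaluates to 6: a 256 KiB table needs 64 contiguous
pages, which a fragmented system may not be able to provide. A
vmalloc-backed allocation only needs order-0 pages.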


2022-06-06 16:22:47

by Eric Dumazet

Subject: Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <[email protected]> wrote:
>
> In our server, there may be no high order (>= 6) memory since we reserve
> lots of HugeTLB pages when booting. Then the system panic. So use
> kvmalloc_array() to allocate table_perturb.
>
> Signed-off-by: Muchun Song <[email protected]>

Please add a Fixes: tag and CC the original author.

Thanks.

2022-06-06 16:31:29

by Eric Dumazet

Subject: Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <[email protected]> wrote:
>
> On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <[email protected]> wrote:
> >
> > In our server, there may be no high order (>= 6) memory since we reserve
> > lots of HugeTLB pages when booting. Then the system panic. So use
> > kvmalloc_array() to allocate table_perturb.
> >
> > Signed-off-by: Muchun Song <[email protected]>
>
> Please add a Fixes: tag and CC original author ?
>
> Thanks.

Also, using alloc_large_system_hash() might be a better option anyway,
since it spreads pages across multiple nodes on NUMA hosts.

2022-06-07 09:17:35

by Muchun Song

Subject: Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

On Tue, Jun 7, 2022 at 12:03 PM Eric Dumazet <[email protected]> wrote:
>
> On Mon, Jun 6, 2022 at 8:56 PM Muchun Song <[email protected]> wrote:
> >
> > On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <[email protected]> wrote:
> > >
> > > On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <[email protected]> wrote:
> > > >
> > > > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <[email protected]> wrote:
> > > > >
> > > > > In our server, there may be no high order (>= 6) memory since we reserve
> > > > > lots of HugeTLB pages when booting. Then the system panic. So use
> > > > > kvmalloc_array() to allocate table_perturb.
> > > > >
> > > > > Signed-off-by: Muchun Song <[email protected]>
> > > >
> > > > Please add a Fixes: tag and CC original author ?
> > > >
> >
> > Will do.
> >
> > > > Thanks.
> > >
> > > Also using alloc_large_system_hash() might be a better option anyway,
> > > spreading pages on multiple nodes on NUMA hosts.
> >
> > Using alloc_large_system_hash() LGTM, but
> > I didn't see where the memory is allocated on multi-node
> > in alloc_large_system_hash() or vmalloc_huge(), what I
> > missed here?
>
> This is done by default. You do not have to do anything special. Just
> call alloc_large_system_hash().
>
> For instance, on two socket system:
>
> # grep alloc_large_system_hash /proc/vmallocinfo
> 0x000000005536618c-0x00000000a4ae0198 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000003beddc38-0x0000000092b61b54 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x0000000092b61b54-0x000000005c33d7fb 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000004c0588af-0x0000000012cf548f 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
> 0x000000008d50035e-0x00000000f434e297 266240 alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
> 0x00000000fe631da3-0x00000000b60e95b8 268439552 alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages N0=32768 N1=32768
> 0x00000000b60e95b8-0x0000000062eb7a11 528384 alloc_large_system_hash+0x1df/0x2f0 pages=128 vmalloc N0=64 N1=64
> 0x0000000062eb7a11-0x000000005408af10 134221824 alloc_large_system_hash+0x1df/0x2f0 pages=32768 vmalloc vpages N0=16384 N1=16384
> 0x000000005408af10-0x0000000054fb99eb 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x0000000054fb99eb-0x00000000a130e604 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x00000000a130e604-0x00000000e6e62c85 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x00000000e6e62c85-0x000000005ca0ef7c 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
> 0x000000005ca0ef7c-0x000000003bfe757f 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
> 0x000000003bfe757f-0x00000000bf49fcbd 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
> 0x00000000bf49fcbd-0x00000000902de200 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
> 0x00000000902de200-0x00000000c3d2821a 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
> 0x00000000c3d2821a-0x000000002ddc68f6 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
>
> You can see N0=X and N1=X meaning pages are evenly spread among the two nodes.

Thanks a lot. Really helpful information.

2022-06-07 11:14:12

by Eric Dumazet

Subject: Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

On Mon, Jun 6, 2022 at 8:56 PM Muchun Song <[email protected]> wrote:
>
> On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <[email protected]> wrote:
> >
> > On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <[email protected]> wrote:
> > >
> > > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <[email protected]> wrote:
> > > >
> > > > In our server, there may be no high order (>= 6) memory since we reserve
> > > > lots of HugeTLB pages when booting. Then the system panic. So use
> > > > kvmalloc_array() to allocate table_perturb.
> > > >
> > > > Signed-off-by: Muchun Song <[email protected]>
> > >
> > > Please add a Fixes: tag and CC original author ?
> > >
>
> Will do.
>
> > > Thanks.
> >
> > Also using alloc_large_system_hash() might be a better option anyway,
> > spreading pages on multiple nodes on NUMA hosts.
>
> Using alloc_large_system_hash() LGTM, but
> I didn't see where the memory is allocated on multi-node
> in alloc_large_system_hash() or vmalloc_huge(), what I
> missed here?

This is done by default. You do not have to do anything special. Just
call alloc_large_system_hash().

For instance, on a two-socket system:

# grep alloc_large_system_hash /proc/vmallocinfo
0x000000005536618c-0x00000000a4ae0198 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000003beddc38-0x0000000092b61b54 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x0000000092b61b54-0x000000005c33d7fb 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000004c0588af-0x0000000012cf548f 12288 alloc_large_system_hash+0x1df/0x2f0 pages=2 vmalloc N0=1 N1=1
0x000000008d50035e-0x00000000f434e297 266240 alloc_large_system_hash+0x1df/0x2f0 pages=64 vmalloc N0=32 N1=32
0x00000000fe631da3-0x00000000b60e95b8 268439552 alloc_large_system_hash+0x1df/0x2f0 pages=65536 vmalloc vpages N0=32768 N1=32768
0x00000000b60e95b8-0x0000000062eb7a11 528384 alloc_large_system_hash+0x1df/0x2f0 pages=128 vmalloc N0=64 N1=64
0x0000000062eb7a11-0x000000005408af10 134221824 alloc_large_system_hash+0x1df/0x2f0 pages=32768 vmalloc vpages N0=16384 N1=16384
0x000000005408af10-0x0000000054fb99eb 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x0000000054fb99eb-0x00000000a130e604 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000a130e604-0x00000000e6e62c85 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000e6e62c85-0x000000005ca0ef7c 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
0x000000005ca0ef7c-0x000000003bfe757f 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
0x000000003bfe757f-0x00000000bf49fcbd 4198400 alloc_large_system_hash+0x1df/0x2f0 pages=1024 vmalloc vpages N0=512 N1=512
0x00000000bf49fcbd-0x00000000902de200 1052672 alloc_large_system_hash+0x1df/0x2f0 pages=256 vmalloc N0=128 N1=128
0x00000000902de200-0x00000000c3d2821a 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256
0x00000000c3d2821a-0x000000002ddc68f6 2101248 alloc_large_system_hash+0x1df/0x2f0 pages=512 vmalloc N0=256 N1=256

You can see N0=X and N1=X on every line, meaning the pages are spread
evenly across the two nodes.
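
To check the per-node spread programmatically rather than by eye, the
N<node>=<pages> fields can be pulled out of each /proc/vmallocinfo
line. The following is a minimal userspace sketch, assuming each entry
sits on a single line and node IDs stay below a small fixed bound;
example_parse_node_pages() and EXAMPLE_MAX_NODES are hypothetical names
introduced here, not part of any kernel or libc interface.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

#define EXAMPLE_MAX_NODES 8	/* assumed upper bound on NUMA node IDs */

/*
 * Scan one /proc/vmallocinfo line for "N<node>=<pages>" tokens.
 * Stores the page count for each node in counts[] (zeroed first) and
 * returns how many such tokens were found.
 */
static int example_parse_node_pages(const char *line,
				    unsigned long counts[EXAMPLE_MAX_NODES])
{
	char buf[512];
	char *tok;
	int found = 0;

	assert(line != NULL);
	memset(counts, 0, EXAMPLE_MAX_NODES * sizeof(counts[0]));
	/* strtok() modifies its input, so work on a bounded copy. */
	snprintf(buf, sizeof(buf), "%s", line);

	for (tok = strtok(buf, " \t\n"); tok; tok = strtok(NULL, " \t\n")) {
		unsigned int node;
		unsigned long pages;
		char extra;

		/* Exactly two conversions means the token is N<node>=<pages>. */
		if (sscanf(tok, "N%u=%lu%c", &node, &pages, &extra) == 2 &&
		    node < EXAMPLE_MAX_NODES) {
			counts[node] = pages;
			found++;
		}
	}
	return found;
}
```

Feeding it the pages=64 entry from the output above yields two node
tokens with counts[0] == counts[1] == 32, which is the even split
described in this message.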

2022-06-07 15:33:52

by Muchun Song

Subject: Re: [PATCH] tcp: use kvmalloc_array() to allocate table_perturb

On Tue, Jun 7, 2022 at 12:13 AM Eric Dumazet <[email protected]> wrote:
>
> On Mon, Jun 6, 2022 at 9:05 AM Eric Dumazet <[email protected]> wrote:
> >
> > On Mon, Jun 6, 2022 at 12:08 AM Muchun Song <[email protected]> wrote:
> > >
> > > In our server, there may be no high order (>= 6) memory since we reserve
> > > lots of HugeTLB pages when booting. Then the system panic. So use
> > > kvmalloc_array() to allocate table_perturb.
> > >
> > > Signed-off-by: Muchun Song <[email protected]>
> >
> > Please add a Fixes: tag and CC original author ?
> >

Will do.

> > Thanks.
>
> Also using alloc_large_system_hash() might be a better option anyway,
> spreading pages on multiple nodes on NUMA hosts.

Using alloc_large_system_hash() LGTM, but I don't see where
alloc_large_system_hash() or vmalloc_huge() allocates the memory
across multiple nodes. What am I missing here?

Thanks.