2023-12-06 03:39:44

by Judy Hsiao

[permalink] [raw]
Subject: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

We are seeing cases where neigh_cleanup_and_release() is called by
neigh_forced_gc() many times in a row with preemption turned off.
When running on a low powered CPU at a low CPU frequency, this has
been measured to keep preemption off for ~10 ms. That's not great on a
system with HZ=1000 which expects tasks to be able to schedule in
with ~1ms latency.

Suggested-by: Douglas Anderson <[email protected]>
Signed-off-by: Judy Hsiao <[email protected]>

---

Changes in v2:
- Use ktime_get_ns() for timeout calculation instead of jiffies.

net/core/neighbour.c | 9 ++++++++-
1 file changed, 8 insertions(+), 1 deletion(-)

diff --git a/net/core/neighbour.c b/net/core/neighbour.c
index df81c1f0a570..552719c3bbc3 100644
--- a/net/core/neighbour.c
+++ b/net/core/neighbour.c
@@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
{
int max_clean = atomic_read(&tbl->gc_entries) -
READ_ONCE(tbl->gc_thresh2);
+ u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
unsigned long tref = jiffies - 5 * HZ;
struct neighbour *n, *tmp;
int shrunk = 0;
+ int loop = 0;

NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);

@@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
shrunk++;
if (shrunk >= max_clean)
break;
+ if (++loop == 16) {
+ if (ktime_get_ns() > tmax)
+ goto unlock;
+ loop = 0;
+ }
}
}

WRITE_ONCE(tbl->last_flush, jiffies);
-
+unlock:
write_unlock_bh(&tbl->lock);

return shrunk;
--
2.43.0.rc2.451.g8631bc7472-goog


2023-12-06 03:43:56

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

On 12/5/23 8:38 PM, Judy Hsiao wrote:
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
>
> Suggested-by: Douglas Anderson <[email protected]>
> Signed-off-by: Judy Hsiao <[email protected]>
>
> ---
>
> Changes in v2:
> - Use ktime_get_ns() for timeout calculation instead of jiffies.
>
> net/core/neighbour.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)
>


Reviewed-by: David Ahern <[email protected]>

2023-12-06 07:34:51

by Eric Dumazet

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

On Wed, Dec 6, 2023 at 4:39 AM Judy Hsiao <[email protected]> wrote:
>
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
>
> Suggested-by: Douglas Anderson <[email protected]>
> Signed-off-by: Judy Hsiao <[email protected]>
>
> ---
>
> Changes in v2:
> - Use ktime_get_ns() for timeout calculation instead of jiffies.

SGTM, thanks.
Reviewed-by: Eric Dumazet <[email protected]>

2023-12-06 17:22:00

by Douglas Anderson

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

Hi,

On Tue, Dec 5, 2023 at 7:39 PM Judy Hsiao <[email protected]> wrote:
>
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
>
> Suggested-by: Douglas Anderson <[email protected]>
> Signed-off-by: Judy Hsiao <[email protected]>
>
> ---
>
> Changes in v2:
> - Use ktime_get_ns() for timeout calculation instead of jiffies.
>
> net/core/neighbour.c | 9 ++++++++-
> 1 file changed, 8 insertions(+), 1 deletion(-)

Though as evidenced by the discussion in v1 I'm not versed enough in
this code for it to mean much, the patch nonetheless looks reasonable
to me. I'm happy enough with:

Reviewed-by: Douglas Anderson <[email protected]>

2023-12-06 17:39:30

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

On Wed, 6 Dec 2023 03:38:33 +0000
Judy Hsiao <[email protected]> wrote:

> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> index df81c1f0a570..552719c3bbc3 100644
> --- a/net/core/neighbour.c
> +++ b/net/core/neighbour.c
> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> {
> int max_clean = atomic_read(&tbl->gc_entries) -
> READ_ONCE(tbl->gc_thresh2);
> + u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
> unsigned long tref = jiffies - 5 * HZ;
> struct neighbour *n, *tmp;
> int shrunk = 0;
> + int loop = 0;
>
> NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>
> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> shrunk++;
> if (shrunk >= max_clean)
> break;
> + if (++loop == 16) {

Overall looks good.
Minor comments:
- loop count should probably be unsigned
- the magic constant 16 should be a sysctl tuneable

2023-12-06 17:52:01

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

On 12/6/23 10:39 AM, Stephen Hemminger wrote:
> On Wed, 6 Dec 2023 03:38:33 +0000
> Judy Hsiao <[email protected]> wrote:
>
>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>> index df81c1f0a570..552719c3bbc3 100644
>> --- a/net/core/neighbour.c
>> +++ b/net/core/neighbour.c
>> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>> {
>> int max_clean = atomic_read(&tbl->gc_entries) -
>> READ_ONCE(tbl->gc_thresh2);
>> + u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
>> unsigned long tref = jiffies - 5 * HZ;
>> struct neighbour *n, *tmp;
>> int shrunk = 0;
>> + int loop = 0;
>>
>> NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>>
>> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>> shrunk++;
>> if (shrunk >= max_clean)
>> break;
>> + if (++loop == 16) {
>
> Overall looks good.
> Minor comments:
> - loop count should probably be unsigned
> - the magic constant 16 should be a sysctl tuneable

A tunable is needed here; the loop counter is just to keep the overhead
of the ktime_get_ns call in check.

2023-12-06 18:57:32

by Douglas Anderson

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

Hi,

On Wed, Dec 6, 2023 at 9:51 AM David Ahern <[email protected]> wrote:
>
> On 12/6/23 10:39 AM, Stephen Hemminger wrote:
> > On Wed, 6 Dec 2023 03:38:33 +0000
> > Judy Hsiao <[email protected]> wrote:
> >
> >> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
> >> index df81c1f0a570..552719c3bbc3 100644
> >> --- a/net/core/neighbour.c
> >> +++ b/net/core/neighbour.c
> >> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> >> {
> >> int max_clean = atomic_read(&tbl->gc_entries) -
> >> READ_ONCE(tbl->gc_thresh2);
> >> + u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
> >> unsigned long tref = jiffies - 5 * HZ;
> >> struct neighbour *n, *tmp;
> >> int shrunk = 0;
> >> + int loop = 0;
> >>
> >> NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
> >>
> >> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
> >> shrunk++;
> >> if (shrunk >= max_clean)
> >> break;
> >> + if (++loop == 16) {
> >
> > Overall looks good.
> > Minor comments:
> > - loop count should probably be unsigned
> > - the magic constant 16 should be a sysctl tuneable
>
> A tunable is needed here; the loop counter is just to keep the overhead
> of the ktime_get_ns call in check.

From context, I'm going to assume you meant a tunable is _NOT_ needed here. ;-)

-Doug

2023-12-06 19:16:10

by David Ahern

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

On 12/6/23 11:49 AM, Doug Anderson wrote:
> Hi,
>
> On Wed, Dec 6, 2023 at 9:51 AM David Ahern <[email protected]> wrote:
>>
>> On 12/6/23 10:39 AM, Stephen Hemminger wrote:
>>> On Wed, 6 Dec 2023 03:38:33 +0000
>>> Judy Hsiao <[email protected]> wrote:
>>>
>>>> diff --git a/net/core/neighbour.c b/net/core/neighbour.c
>>>> index df81c1f0a570..552719c3bbc3 100644
>>>> --- a/net/core/neighbour.c
>>>> +++ b/net/core/neighbour.c
>>>> @@ -253,9 +253,11 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>>>> {
>>>> int max_clean = atomic_read(&tbl->gc_entries) -
>>>> READ_ONCE(tbl->gc_thresh2);
>>>> + u64 tmax = ktime_get_ns() + NSEC_PER_MSEC;
>>>> unsigned long tref = jiffies - 5 * HZ;
>>>> struct neighbour *n, *tmp;
>>>> int shrunk = 0;
>>>> + int loop = 0;
>>>>
>>>> NEIGH_CACHE_STAT_INC(tbl, forced_gc_runs);
>>>>
>>>> @@ -278,11 +280,16 @@ static int neigh_forced_gc(struct neigh_table *tbl)
>>>> shrunk++;
>>>> if (shrunk >= max_clean)
>>>> break;
>>>> + if (++loop == 16) {
>>>
>>> Overall looks good.
>>> Minor comments:
>>> - loop count should probably be unsigned
>>> - the magic constant 16 should be a sysctl tuneable
>>
>> A tunable is needed here; the loop counter is just to keep the overhead
>> of the ktime_get_ns call in check.
>
> From context, I'm going to assume you meant a tunable is _NOT_ needed here. ;-)
>
> -Doug

yes, multitasking fail :-(

2023-12-08 10:50:36

by patchwork-bot+netdevbpf

[permalink] [raw]
Subject: Re: [PATCH v2] neighbour: Don't let neigh_forced_gc() disable preemption for long

Hello:

This patch was applied to netdev/net.git (main)
by David S. Miller <[email protected]>:

On Wed, 6 Dec 2023 03:38:33 +0000 you wrote:
> We are seeing cases where neigh_cleanup_and_release() is called by
> neigh_forced_gc() many times in a row with preemption turned off.
> When running on a low powered CPU at a low CPU frequency, this has
> been measured to keep preemption off for ~10 ms. That's not great on a
> system with HZ=1000 which expects tasks to be able to schedule in
> with ~1ms latency.
>
> [...]

Here is the summary with links:
- [v2] neighbour: Don't let neigh_forced_gc() disable preemption for long
https://git.kernel.org/netdev/net/c/e5dc5afff62f

You are awesome, thank you!
--
Deet-doot-dot, I am a bot.
https://korg.docs.kernel.org/patchwork/pwbot.html