Hello Neal.
It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on 32 bit systems.
Konstantin suggested a solution available in the same linked merge request and copy-pasted by me below for your convenience:
```
diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
index 664c9e119787..fd3f89e3a8a6 100644
--- a/net/ipv4/tcp_bbr.c
+++ b/net/ipv4/tcp_bbr.c
@@ -312,7 +312,7 @@ static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
bytes = sk->sk_pacing_rate >> sk->sk_pacing_shift;
bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
- segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
+ segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
return segs;
}
diff --git a/net/ipv4/tcp_bbr2.c b/net/ipv4/tcp_bbr2.c
index fa49e17c47ca..488429f0f3d0 100644
--- a/net/ipv4/tcp_bbr2.c
+++ b/net/ipv4/tcp_bbr2.c
@@ -588,7 +588,7 @@ static void bbr_debug(struct sock *sk, u32 acked,
bbr_rate_kbps(sk, bbr_max_bw(sk)), /* bw: max bw */
0ULL, /* lb: [obsolete] */
0ULL, /* ib: [obsolete] */
- (u64)sk->sk_pacing_rate * 8 / 1000,
+ div_u64((u64)sk->sk_pacing_rate * 8, 1000),
acked,
tcp_packets_in_flight(tp),
rs->is_ack_delayed ? 'd' : '.',
@@ -698,7 +698,7 @@ static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
}
bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
- segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
+ segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
return segs;
}
```
Could you please evaluate this report and check whether it is correct, and also check whether the suggested patch is acceptable?
Thanks.
[1] https://gitlab.com/post-factum/pf-kernel/-/merge_requests/6
--
Oleksandr Natalenko (post-factum)
From: Oleksandr Natalenko
> Sent: 22 May 2022 23:30
> To: Neal Cardwell <[email protected]>
>
> Hello Neal.
>
> It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on
> 32 bit systems.
Do any of these divisions ever actually have 64bit operands?
Even on x86-64 64bit divide is significantly slower than 32bit divide.
It is quite clear that x * 8 / 1000 is the same as x / (1000 / 8).
So promoting to 64bit cannot be needed.
David
>
> Konstantin suggested a solution available in the same linked merge request and copy-pasted by me below
> for your convenience:
>
> ```
> diff --git a/net/ipv4/tcp_bbr.c b/net/ipv4/tcp_bbr.c
> index 664c9e119787..fd3f89e3a8a6 100644
> --- a/net/ipv4/tcp_bbr.c
> +++ b/net/ipv4/tcp_bbr.c
> @@ -312,7 +312,7 @@ static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
> bytes = sk->sk_pacing_rate >> sk->sk_pacing_shift;
>
> bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
> - segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
> + segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
> return segs;
> }
>
> diff --git a/net/ipv4/tcp_bbr2.c b/net/ipv4/tcp_bbr2.c
> index fa49e17c47ca..488429f0f3d0 100644
> --- a/net/ipv4/tcp_bbr2.c
> +++ b/net/ipv4/tcp_bbr2.c
> @@ -588,7 +588,7 @@ static void bbr_debug(struct sock *sk, u32 acked,
> bbr_rate_kbps(sk, bbr_max_bw(sk)), /* bw: max bw */
> 0ULL, /* lb: [obsolete] */
> 0ULL, /* ib: [obsolete] */
> - (u64)sk->sk_pacing_rate * 8 / 1000,
> + div_u64((u64)sk->sk_pacing_rate * 8, 1000),
> acked,
> tcp_packets_in_flight(tp),
> rs->is_ack_delayed ? 'd' : '.',
> @@ -698,7 +698,7 @@ static u32 bbr_tso_segs_generic(struct sock *sk, unsigned int mss_now,
> }
>
> bytes = min_t(u32, bytes, gso_max_size - 1 - MAX_TCP_HEADER);
> - segs = max_t(u32, bytes / mss_now, bbr_min_tso_segs(sk));
> + segs = max_t(u32, div_u64(bytes, mss_now), bbr_min_tso_segs(sk));
> return segs;
> }
> ```
>
> Could you please evaluate this report and check whether it is correct, and also check whether the
> suggested patch is acceptable?
>
> Thanks.
>
> [1] https://gitlab.com/post-factum/pf-kernel/-/merge_requests/6
>
> --
> Oleksandr Natalenko (post-factum)
>
-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)
On Tue, May 24, 2022 at 1:06 PM Neal Cardwell <[email protected]> wrote:
>
> On Tue, May 24, 2022 at 4:01 AM David Laight <[email protected]> wrote:
> >
> > From: Oleksandr Natalenko
> > > Sent: 22 May 2022 23:30
> > > To: Neal Cardwell <[email protected]>
> > >
> > > Hello Neal.
> > >
> > > It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on
> > > 32 bit systems.
> >
> > Do any of these divisions ever actually have 64bit operands?
> > Even on x86-64 64bit divide is significantly slower than 32bit divide.
> >
> > It is quite clear that x * 8 / 1000 is the same as x / (1000 / 8).
> > So promoting to 64bit cannot be needed.
> >
> > David
>
> The sk->sk_pacing_rate can definitely be bigger than 32 bits if the
> network path can support more than 34 Gbit/sec (a pacing rate of 2^32
> bytes per sec is roughly 34 Gibt/sec). This definitely happens.
>
> So this one seems reasonable to me (and is only in debug code, so the
> performance is probably fine):
> - (u64)sk->sk_pacing_rate * 8 / 1000,
> + div_u64((u64)sk->sk_pacing_rate * 8, 1000),
>
> For the other two I agree we should rework them to avoid the 64-bit
> divide, since we don't need it.
>
> There is similar logic in mainline Linux in tcp_tso_autosize(), which
> is currently using "unsigned long" for bytes.
>
Not sure I follow.
sk_pacing_rate is also 'unsigned long'
So tcp_tso_autosize() is correct on 32bit and 64bit arches.
There is no forced 64bit operation there.
> Eric, what do you advise?
>
> thanks,
> neal
On Tue, May 24, 2022 at 4:01 AM David Laight <[email protected]> wrote:
>
> From: Oleksandr Natalenko
> > Sent: 22 May 2022 23:30
> > To: Neal Cardwell <[email protected]>
> >
> > Hello Neal.
> >
> > It was reported to me [1] by Konstantin (in Cc) that BBRv2 code suffers from integer division issue on
> > 32 bit systems.
>
> Do any of these divisions ever actually have 64bit operands?
> Even on x86-64 64bit divide is significantly slower than 32bit divide.
>
> It is quite clear that x * 8 / 1000 is the same as x / (1000 / 8).
> So promoting to 64bit cannot be needed.
>
> David
The sk->sk_pacing_rate can definitely be bigger than 32 bits if the
network path can support more than 34 Gbit/sec (a pacing rate of 2^32
bytes per sec is roughly 34 Gibt/sec). This definitely happens.
So this one seems reasonable to me (and is only in debug code, so the
performance is probably fine):
- (u64)sk->sk_pacing_rate * 8 / 1000,
+ div_u64((u64)sk->sk_pacing_rate * 8, 1000),
For the other two I agree we should rework them to avoid the 64-bit
divide, since we don't need it.
There is similar logic in mainline Linux in tcp_tso_autosize(), which
is currently using "unsigned long" for bytes.
Eric, what do you advise?
thanks,
neal