2023-06-19 10:20:37

by Florian Kauer

[permalink] [raw]
Subject: [PATCH net v2 0/6] igc: Fix corner cases for TSN offload

The igc driver supports several different offloading capabilities
relevant in the TSN context. Recent patches in this area introduced
regressions for certain corner cases that are fixed in this series.

Each of the patches (except the first one) addresses a different
regression that can be separately reproduced. Still, they have
overlapping code changes so they should not be separately applied.

Especially #4 and #6 address the same observation,
but both need to be applied to avoid TX hang occurrences in
the scenario described in the patches.

Signed-off-by: Florian Kauer <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>

---

v2: - Rebased onto net. #1-#2 needed adaptations, others unmodified.
- Extend #3 commit message that it only regards i225.

---

Florian Kauer (6):
igc: Rename qbv_enable to taprio_offload_enable
igc: Do not enable taprio offload for invalid arguments
igc: Handle already enabled taprio offload for basetime 0
igc: No strict mode in pure launchtime/CBS offload
igc: Fix launchtime before start of cycle
igc: Fix inserting of empty frame for launchtime

drivers/net/ethernet/intel/igc/igc.h | 2 +-
drivers/net/ethernet/intel/igc/igc_main.c | 10 ++++-----
drivers/net/ethernet/intel/igc/igc_tsn.c | 26 ++++++++++++++++++++---
3 files changed, 29 insertions(+), 9 deletions(-)

--
2.39.2



2023-06-19 10:21:21

by Florian Kauer

[permalink] [raw]
Subject: [PATCH net v2 6/6] igc: Fix inserting of empty frame for launchtime

The insertion of an empty frame was introduced with
commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
in order to ensure that the current cycle has at least one packet if
there is some packet to be scheduled for the next cycle.

However, the current implementation does not properly check if
a packet is already scheduled for the current cycle. Currently,
an empty packet is always inserted if and only if
txtime >= end_of_cycle && txtime > last_tx_cycle
but since last_tx_cycle is always either the end of the current
cycle (end_of_cycle) or the end of a previous cycle, the
second part (txtime > last_tx_cycle) is always true unless
txtime == last_tx_cycle.

What actually needs to be checked here is if the last_tx_cycle
was already written within the current cycle, so an empty frame
should only be inserted if and only if
txtime >= end_of_cycle && end_of_cycle > last_tx_cycle.

This patch does not only avoid an unnecessary insertion, but it
can actually be harmful to insert an empty packet if packets
are already scheduled in the current cycle, because it can lead
to a situation where the empty packet is actually processed
as the first packet in the upcoming cycle shifting the packet
with the first_flag even one cycle into the future, finally leading
to a TX hang.

The TX hang can be reproduced on a i225 with:

sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 \
base-time 0 \
sched-entry S 01 300000 \
flags 0x1 \
txtime-delay 500000 \
clockid CLOCK_TAI
sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
clockid CLOCK_TAI \
delta 500000 \
offload \
skip_sock_check

and traffic generator

sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns

with traffic.cfg

#define ETH_P_IP 0x0800

{
/* Ethernet Header */
0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed
0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed
const16(ETH_P_IP),

/* IPv4 Header */
0b01000101, 0, # IPv4 version, IHL, TOS
const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
const16(2), # IPv4 ident
0b01000000, 0, # IPv4 flags, fragmentation off
64, # IPv4 TTL
17, # Protocol UDP
csumip(14, 33), # IPv4 checksum

/* UDP Header */
10, 0, 48, 1, # IP Src - adapt as needed
10, 0, 48, 10, # IP Dest - adapt as needed
const16(5555), # UDP Src Port
const16(6666), # UDP Dest Port
const16(1008), # UDP length (UDP header 8 bytes + payload length)
csumudp(14, 34), # UDP checksum

/* Payload */
fill('W', 1000),
}

and the observed message with that is for example

igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
Tx Queue <0>
TDH <32>
TDT <3c>
next_to_use <3c>
next_to_clean <32>
buffer_info[next_to_clean]
time_stamp <ffff26a8>
next_to_watch <00000000632a1828>
jiffies <ffff27f8>
desc.status <1048000>

Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
Signed-off-by: Florian Kauer <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 565c72bd737d..f847c9a408d6 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1030,7 +1030,7 @@ static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime,
*first_flag = true;
ring->last_ff_cycle = baset_est;

- if (ktime_compare(txtime, ring->last_tx_cycle) > 0)
+ if (ktime_compare(end_of_cycle, ring->last_tx_cycle) > 0)
*insert_empty = true;
}
}
--
2.39.2


2023-06-19 10:22:30

by Florian Kauer

[permalink] [raw]
Subject: [PATCH net v2 5/6] igc: Fix launchtime before start of cycle

It is possible (verified on a running system) that frames are processed
by igc_tx_launchtime with a txtime before the start of the cycle
(baset_est).

However, the result of txtime - baset_est is written into a u32,
leading to a wrap around to a positive number. The following
launchtime > 0 check will only branch to executing launchtime = 0
if launchtime is already 0.

Fix it by using a s32 before checking launchtime > 0.

Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
Signed-off-by: Florian Kauer <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 9c04df900b59..565c72bd737d 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -1017,7 +1017,7 @@ static __le32 igc_tx_launchtime(struct igc_ring *ring, ktime_t txtime,
ktime_t base_time = adapter->base_time;
ktime_t now = ktime_get_clocktai();
ktime_t baset_est, end_of_cycle;
- u32 launchtime;
+ s32 launchtime;
s64 n;

n = div64_s64(ktime_sub_ns(now, base_time), cycle_time);
--
2.39.2


2023-06-19 10:34:37

by Florian Kauer

[permalink] [raw]
Subject: [PATCH net v2 2/6] igc: Do not enable taprio offload for invalid arguments

Only set adapter->taprio_offload_enable after validating the arguments.
Otherwise, it stays set even if the offload was not enabled.
Since the subsequent code does not get executed in case of invalid
arguments, it will not be read at first.
However, by activating and then deactivating another offload
(e.g. ETF/TX launchtime offload), taprio_offload_enable is read
and erroneously keeps the offload feature of the NIC enabled.

This can be reproduced as follows:

# TAPRIO offload (flags == 0x2) and negative base-time leading to expected -ERANGE
sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 \
base-time -1000 \
sched-entry S 01 300000 \
flags 0x2

# IGC_TQAVCTRL is 0x0 as expected (iomem=relaxed for reading register)
sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1

# Activate ETF offload
sudo tc qdisc replace dev enp1s0 parent root handle 6666 mqprio \
num_tc 3 \
map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
queues 1@0 1@1 2@2 \
hw 0
sudo tc qdisc add dev enp1s0 parent 6666:1 etf \
clockid CLOCK_TAI \
delta 500000 \
offload

# IGC_TQAVCTRL is 0x9 as expected
sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1

# Deactivate ETF offload again
sudo tc qdisc delete dev enp1s0 parent 6666:1

# IGC_TQAVCTRL should now be 0x0 again, but is observed as 0x9
sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1

Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
Signed-off-by: Florian Kauer <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index dda057a3b5e3..290daa5827f0 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6053,6 +6053,7 @@ static int igc_tsn_clear_schedule(struct igc_adapter *adapter)

adapter->base_time = 0;
adapter->cycle_time = NSEC_PER_SEC;
+ adapter->taprio_offload_enable = false;
adapter->qbv_config_change_errors = 0;

for (i = 0; i < adapter->num_tx_queues; i++) {
@@ -6075,8 +6076,6 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
size_t n;
int i;

- adapter->taprio_offload_enable = qopt->enable;
-
if (!qopt->enable)
return igc_tsn_clear_schedule(adapter);

@@ -6091,6 +6090,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,

adapter->cycle_time = qopt->cycle_time;
adapter->base_time = qopt->base_time;
+ adapter->taprio_offload_enable = true;

for (n = 0; n < qopt->num_entries; n++) {
struct tc_taprio_sched_entry *e = &qopt->entries[n];
--
2.39.2


2023-06-19 10:36:41

by Florian Kauer

[permalink] [raw]
Subject: [PATCH net v2 3/6] igc: Handle already enabled taprio offload for basetime 0

Since commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
it is possible to enable taprio offload with a basetime of 0.
However, the check if taprio offload is already enabled for i225
(and thus -EALREADY should be returned for igc_save_qbv_schedule)
still relied on adapter->base_time > 0.

This can be reproduced as follows:

# TAPRIO offload (flags == 0x2) and base-time = 0
sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 \
base-time 0 \
sched-entry S 01 300000 \
flags 0x2

# The second call should fail with "Error: Device failed to setup taprio offload."
# But that only happens if base-time was != 0
sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 \
base-time 0 \
sched-entry S 01 300000 \
flags 0x2

Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
Signed-off-by: Florian Kauer <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_main.c b/drivers/net/ethernet/intel/igc/igc_main.c
index 290daa5827f0..9c04df900b59 100644
--- a/drivers/net/ethernet/intel/igc/igc_main.c
+++ b/drivers/net/ethernet/intel/igc/igc_main.c
@@ -6082,7 +6082,7 @@ static int igc_save_qbv_schedule(struct igc_adapter *adapter,
if (qopt->base_time < 0)
return -ERANGE;

- if (igc_is_device_id_i225(hw) && adapter->base_time)
+ if (igc_is_device_id_i225(hw) && adapter->taprio_offload_enable)
return -EALREADY;

if (!validate_schedule(adapter, qopt))
--
2.39.2


2023-06-19 10:36:41

by Florian Kauer

[permalink] [raw]
Subject: [PATCH net v2 4/6] igc: No strict mode in pure launchtime/CBS offload

The flags IGC_TXQCTL_STRICT_CYCLE and IGC_TXQCTL_STRICT_END
prevent the packet transmission over slot and cycle boundaries.
This is important for taprio offload where the slots and
cycles correspond to the slots and cycles configured for the
network.

However, the Qbv offload feature of the i225 is also used for
enabling TX launchtime / ETF offload. In that case, however,
the cycle has no meaning for the network and is only used
internally to adapt the base time register after a second has
passed.

Enabling strict mode in this case would unneccesarily prevent
the transmission of certain packets (i.e. at the boundary of a
second) and thus interfers with the ETF qdisc that promises
transmission at a certain point in time.

Similar to ETF, this also applies to CBS offload that also should
not be influenced by strict mode unless taprio offload would be
enabled at the same time.

This fully reverts
commit d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling")
but its commit message only describes what was already implemented
before that commit. The difference to a plain revert of that commit
is that it now copes with the base_time = 0 case that was fixed with
commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")

In particular, enabling strict mode leads to TX hang situations
under high traffic if taprio is applied WITHOUT taprio offload
but WITH ETF offload, e.g. as in

sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
num_tc 1 \
map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
queues 1@0 \
base-time 0 \
sched-entry S 01 300000 \
flags 0x1 \
txtime-delay 500000 \
clockid CLOCK_TAI
sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
clockid CLOCK_TAI \
delta 500000 \
offload \
skip_sock_check

and traffic generator

sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns

with traffic.cfg

#define ETH_P_IP 0x0800

{
/* Ethernet Header */
0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed
0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed
const16(ETH_P_IP),

/* IPv4 Header */
0b01000101, 0, # IPv4 version, IHL, TOS
const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
const16(2), # IPv4 ident
0b01000000, 0, # IPv4 flags, fragmentation off
64, # IPv4 TTL
17, # Protocol UDP
csumip(14, 33), # IPv4 checksum

/* UDP Header */
10, 0, 48, 1, # IP Src - adapt as needed
10, 0, 48, 10, # IP Dest - adapt as needed
const16(5555), # UDP Src Port
const16(6666), # UDP Dest Port
const16(1008), # UDP length (UDP header 8 bytes + payload length)
csumudp(14, 34), # UDP checksum

/* Payload */
fill('W', 1000),
}

and the observed message with that is for example

igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
Tx Queue <0>
TDH <d0>
TDT <f0>
next_to_use <f0>
next_to_clean <d0>
buffer_info[next_to_clean]
time_stamp <ffff661f>
next_to_watch <00000000245a4efb>
jiffies <ffff6e48>
desc.status <1048000>

Fixes: d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling")
Signed-off-by: Florian Kauer <[email protected]>
Reviewed-by: Kurt Kanzenbach <[email protected]>
---
drivers/net/ethernet/intel/igc/igc_tsn.c | 24 ++++++++++++++++++++++--
1 file changed, 22 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igc/igc_tsn.c b/drivers/net/ethernet/intel/igc/igc_tsn.c
index c6636a7264d5..63d410e7b876 100644
--- a/drivers/net/ethernet/intel/igc/igc_tsn.c
+++ b/drivers/net/ethernet/intel/igc/igc_tsn.c
@@ -133,8 +133,28 @@ static int igc_tsn_enable_offload(struct igc_adapter *adapter)
wr32(IGC_STQT(i), ring->start_time);
wr32(IGC_ENDQT(i), ring->end_time);

- txqctl |= IGC_TXQCTL_STRICT_CYCLE |
- IGC_TXQCTL_STRICT_END;
+ if (adapter->taprio_offload_enable) {
+ /* If taprio_offload_enable is set we are in "taprio"
+ * mode and we need to be strict about the
+ * cycles: only transmit a packet if it can be
+ * completed during that cycle.
+ *
+ * If taprio_offload_enable is NOT true when
+ * enabling TSN offload, the cycle should have
+ * no external effects, but is only used internally
+ * to adapt the base time register after a second
+ * has passed.
+ *
+ * Enabling strict mode in this case would
+ * unneccesarily prevent the transmission of
+ * certain packets (i.e. at the boundary of a
+ * second) and thus interfer with the launchtime
+ * feature that promises transmission at a
+ * certain point in time.
+ */
+ txqctl |= IGC_TXQCTL_STRICT_CYCLE |
+ IGC_TXQCTL_STRICT_END;
+ }

if (ring->launchtime_enable)
txqctl |= IGC_TXQCTL_QUEUE_MODE_LAUNCHT;
--
2.39.2


2023-06-20 23:05:51

by Vinicius Costa Gomes

[permalink] [raw]
Subject: Re: [PATCH net v2 0/6] igc: Fix corner cases for TSN offload

Florian Kauer <[email protected]> writes:

> The igc driver supports several different offloading capabilities
> relevant in the TSN context. Recent patches in this area introduced
> regressions for certain corner cases that are fixed in this series.
>
> Each of the patches (except the first one) addresses a different
> regression that can be separately reproduced. Still, they have
> overlapping code changes so they should not be separately applied.
>
> Especially #4 and #6 address the same observation,
> but both need to be applied to avoid TX hang occurrences in
> the scenario described in the patches.
>
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
>
> ---

For the series:

Acked-by: Vinicius Costa Gomes <[email protected]>


Cheers,
--
Vinicius

Subject: RE: [PATCH net v2 0/6] igc: Fix corner cases for TSN offload



> -----Original Message-----
> From: Florian Kauer <[email protected]>
> Sent: Monday, 19 June, 2023 6:09 PM
> To: Brandeburg, Jesse <[email protected]>; Nguyen, Anthony L
> <[email protected]>; Gomes, Vinicius
> <[email protected]>; David S . Miller <[email protected]>; Eric
> Dumazet <[email protected]>; Jakub Kicinski <[email protected]>;
> Paolo Abeni <[email protected]>; Tan Tee Min
> <[email protected]>; Zulkifli, Muhammad Husaini
> <[email protected]>; Gunasekaran, Aravindhan
> <[email protected]>; Chilakala, Mallikarjuna
> <[email protected]>
> Cc: [email protected]; [email protected]; linux-
> [email protected]; [email protected]; [email protected]
> Subject: [PATCH net v2 0/6] igc: Fix corner cases for TSN offload
>
> The igc driver supports several different offloading capabilities relevant in the
> TSN context. Recent patches in this area introduced regressions for certain
> corner cases that are fixed in this series.
>
> Each of the patches (except the first one) addresses a different regression
> that can be separately reproduced. Still, they have overlapping code changes
> so they should not be separately applied.
>
> Especially #4 and #6 address the same observation, but both need to be
> applied to avoid TX hang occurrences in the scenario described in the
> patches.
>
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
>
> ---
>
> v2: - Rebased onto net. #1-#2 needed adaptations, others unmodified.
> - Extend #3 commit message that it only regards i225.
>
> ---
>
> Florian Kauer (6):
> igc: Rename qbv_enable to taprio_offload_enable
> igc: Do not enable taprio offload for invalid arguments
> igc: Handle already enabled taprio offload for basetime 0
> igc: No strict mode in pure launchtime/CBS offload
> igc: Fix launchtime before start of cycle
> igc: Fix inserting of empty frame for launchtime

Thanks!

Reviewed-by: Muhammad Husaini Zulkifli <[email protected]>

>
> drivers/net/ethernet/intel/igc/igc.h | 2 +-
> drivers/net/ethernet/intel/igc/igc_main.c | 10 ++++-----
> drivers/net/ethernet/intel/igc/igc_tsn.c | 26 ++++++++++++++++++++---
> 3 files changed, 29 insertions(+), 9 deletions(-)
>
> --
> 2.39.2


2023-07-03 08:50:38

by naamax.meir

[permalink] [raw]
Subject: Re: [Intel-wired-lan] [PATCH net v2 2/6] igc: Do not enable taprio offload for invalid arguments

On 6/19/2023 13:08, Florian Kauer wrote:
> Only set adapter->taprio_offload_enable after validating the arguments.
> Otherwise, it stays set even if the offload was not enabled.
> Since the subsequent code does not get executed in case of invalid
> arguments, it will not be read at first.
> However, by activating and then deactivating another offload
> (e.g. ETF/TX launchtime offload), taprio_offload_enable is read
> and erroneously keeps the offload feature of the NIC enabled.
>
> This can be reproduced as follows:
>
> # TAPRIO offload (flags == 0x2) and negative base-time leading to expected -ERANGE
> sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
> num_tc 1 \
> map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
> queues 1@0 \
> base-time -1000 \
> sched-entry S 01 300000 \
> flags 0x2
>
> # IGC_TQAVCTRL is 0x0 as expected (iomem=relaxed for reading register)
> sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1
>
> # Activate ETF offload
> sudo tc qdisc replace dev enp1s0 parent root handle 6666 mqprio \
> num_tc 3 \
> map 2 2 1 0 2 2 2 2 2 2 2 2 2 2 2 2 \
> queues 1@0 1@1 2@2 \
> hw 0
> sudo tc qdisc add dev enp1s0 parent 6666:1 etf \
> clockid CLOCK_TAI \
> delta 500000 \
> offload
>
> # IGC_TQAVCTRL is 0x9 as expected
> sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1
>
> # Deactivate ETF offload again
> sudo tc qdisc delete dev enp1s0 parent 6666:1
>
> # IGC_TQAVCTRL should now be 0x0 again, but is observed as 0x9
> sudo pcimem /sys/bus/pci/devices/0000:01:00.0/resource0 0x3570 w*1
>
> Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)


Tested-by: Naama Meir <[email protected]>

2023-07-04 05:43:29

by naamax.meir

[permalink] [raw]
Subject: Re: [Intel-wired-lan] [PATCH net v2 3/6] igc: Handle already enabled taprio offload for basetime 0

On 6/19/2023 13:08, Florian Kauer wrote:
> Since commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
> it is possible to enable taprio offload with a basetime of 0.
> However, the check if taprio offload is already enabled for i225
> (and thus -EALREADY should be returned for igc_save_qbv_schedule)
> still relied on adapter->base_time > 0.
>
> This can be reproduced as follows:
>
> # TAPRIO offload (flags == 0x2) and base-time = 0
> sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
> num_tc 1 \
> map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
> queues 1@0 \
> base-time 0 \
> sched-entry S 01 300000 \
> flags 0x2
>
> # The second call should fail with "Error: Device failed to setup taprio offload."
> # But that only happens if base-time was != 0
> sudo tc qdisc replace dev enp1s0 parent root handle 100 stab overhead 24 taprio \
> num_tc 1 \
> map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
> queues 1@0 \
> base-time 0 \
> sched-entry S 01 300000 \
> flags 0x2
>
> Fixes: e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Tested-by: Naama Meir <[email protected]>

2023-07-04 09:48:19

by naamax.meir

[permalink] [raw]
Subject: Re: [Intel-wired-lan] [PATCH net v2 4/6] igc: No strict mode in pure launchtime/CBS offload

On 6/19/2023 13:08, Florian Kauer wrote:
> The flags IGC_TXQCTL_STRICT_CYCLE and IGC_TXQCTL_STRICT_END
> prevent the packet transmission over slot and cycle boundaries.
> This is important for taprio offload where the slots and
> cycles correspond to the slots and cycles configured for the
> network.
>
> However, the Qbv offload feature of the i225 is also used for
> enabling TX launchtime / ETF offload. In that case, however,
> the cycle has no meaning for the network and is only used
> internally to adapt the base time register after a second has
> passed.
>
> Enabling strict mode in this case would unneccesarily prevent
> the transmission of certain packets (i.e. at the boundary of a
> second) and thus interfers with the ETF qdisc that promises
> transmission at a certain point in time.
>
> Similar to ETF, this also applies to CBS offload that also should
> not be influenced by strict mode unless taprio offload would be
> enabled at the same time.
>
> This fully reverts
> commit d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling")
> but its commit message only describes what was already implemented
> before that commit. The difference to a plain revert of that commit
> is that it now copes with the base_time = 0 case that was fixed with
> commit e17090eb2494 ("igc: allow BaseTime 0 enrollment for Qbv")
>
> In particular, enabling strict mode leads to TX hang situations
> under high traffic if taprio is applied WITHOUT taprio offload
> but WITH ETF offload, e.g. as in
>
> sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
> num_tc 1 \
> map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
> queues 1@0 \
> base-time 0 \
> sched-entry S 01 300000 \
> flags 0x1 \
> txtime-delay 500000 \
> clockid CLOCK_TAI
> sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
> clockid CLOCK_TAI \
> delta 500000 \
> offload \
> skip_sock_check
>
> and traffic generator
>
> sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns
>
> with traffic.cfg
>
> #define ETH_P_IP 0x0800
>
> {
> /* Ethernet Header */
> 0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed
> 0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed
> const16(ETH_P_IP),
>
> /* IPv4 Header */
> 0b01000101, 0, # IPv4 version, IHL, TOS
> const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
> const16(2), # IPv4 ident
> 0b01000000, 0, # IPv4 flags, fragmentation off
> 64, # IPv4 TTL
> 17, # Protocol UDP
> csumip(14, 33), # IPv4 checksum
>
> /* UDP Header */
> 10, 0, 48, 1, # IP Src - adapt as needed
> 10, 0, 48, 10, # IP Dest - adapt as needed
> const16(5555), # UDP Src Port
> const16(6666), # UDP Dest Port
> const16(1008), # UDP length (UDP header 8 bytes + payload length)
> csumudp(14, 34), # UDP checksum
>
> /* Payload */
> fill('W', 1000),
> }
>
> and the observed message with that is for example
>
> igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
> Tx Queue <0>
> TDH <d0>
> TDT <f0>
> next_to_use <f0>
> next_to_clean <d0>
> buffer_info[next_to_clean]
> time_stamp <ffff661f>
> next_to_watch <00000000245a4efb>
> jiffies <ffff6e48>
> desc.status <1048000>
>
> Fixes: d8f45be01dd9 ("igc: Use strict cycles for Qbv scheduling")
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
> ---
> drivers/net/ethernet/intel/igc/igc_tsn.c | 24 ++++++++++++++++++++++--
> 1 file changed, 22 insertions(+), 2 deletions(-)

Tested-by: Naama Meir <[email protected]>

2023-07-04 10:26:01

by Paul Menzel

[permalink] [raw]
Subject: Re: [Intel-wired-lan] [PATCH net v2 4/6] igc: No strict mode in pure launchtime/CBS offload

Dear Florian,


Thank you for your patch.

Am 19.06.23 um 12:08 schrieb Florian Kauer:

[…]

For the commit summary/title, maybe make it a statement by using a verb
(in imperative mood)? Maybe:

> Forbid strict mode in pure launchtime/CBS offload


Kind regards,

Paul

2023-07-05 11:43:25

by naamax.meir

[permalink] [raw]
Subject: Re: [Intel-wired-lan] [PATCH net v2 5/6] igc: Fix launchtime before start of cycle

On 6/19/2023 13:08, Florian Kauer wrote:
> It is possible (verified on a running system) that frames are processed
> by igc_tx_launchtime with a txtime before the start of the cycle
> (baset_est).
>
> However, the result of txtime - baset_est is written into a u32,
> leading to a wrap around to a positive number. The following
> launchtime > 0 check will only branch to executing launchtime = 0
> if launchtime is already 0.
>
> Fix it by using a s32 before checking launchtime > 0.
>
> Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)


Tested-by: Naama Meir <[email protected]>

2023-07-06 12:09:28

by naamax.meir

[permalink] [raw]
Subject: Re: [Intel-wired-lan] [PATCH net v2 6/6] igc: Fix inserting of empty frame for launchtime

On 6/19/2023 13:08, Florian Kauer wrote:
> The insertion of an empty frame was introduced with
> commit db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
> in order to ensure that the current cycle has at least one packet if
> there is some packet to be scheduled for the next cycle.
>
> However, the current implementation does not properly check if
> a packet is already scheduled for the current cycle. Currently,
> an empty packet is always inserted if and only if
> txtime >= end_of_cycle && txtime > last_tx_cycle
> but since last_tx_cycle is always either the end of the current
> cycle (end_of_cycle) or the end of a previous cycle, the
> second part (txtime > last_tx_cycle) is always true unless
> txtime == last_tx_cycle.
>
> What actually needs to be checked here is if the last_tx_cycle
> was already written within the current cycle, so an empty frame
> should only be inserted if and only if
> txtime >= end_of_cycle && end_of_cycle > last_tx_cycle.
>
> This patch does not only avoid an unnecessary insertion, but it
> can actually be harmful to insert an empty packet if packets
> are already scheduled in the current cycle, because it can lead
> to a situation where the empty packet is actually processed
> as the first packet in the upcoming cycle shifting the packet
> with the first_flag even one cycle into the future, finally leading
> to a TX hang.
>
> The TX hang can be reproduced on a i225 with:
>
> sudo tc qdisc replace dev enp1s0 parent root handle 100 taprio \
> num_tc 1 \
> map 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 \
> queues 1@0 \
> base-time 0 \
> sched-entry S 01 300000 \
> flags 0x1 \
> txtime-delay 500000 \
> clockid CLOCK_TAI
> sudo tc qdisc replace dev enp1s0 parent 100:1 etf \
> clockid CLOCK_TAI \
> delta 500000 \
> offload \
> skip_sock_check
>
> and traffic generator
>
> sudo trafgen -i traffic.cfg -o enp1s0 --cpp -n0 -q -t1400ns
>
> with traffic.cfg
>
> #define ETH_P_IP 0x0800
>
> {
> /* Ethernet Header */
> 0x30, 0x1f, 0x9a, 0xd0, 0xf0, 0x0e, # MAC Dest - adapt as needed
> 0x24, 0x5e, 0xbe, 0x57, 0x2e, 0x36, # MAC Src - adapt as needed
> const16(ETH_P_IP),
>
> /* IPv4 Header */
> 0b01000101, 0, # IPv4 version, IHL, TOS
> const16(1028), # IPv4 total length (UDP length + 20 bytes (IP header))
> const16(2), # IPv4 ident
> 0b01000000, 0, # IPv4 flags, fragmentation off
> 64, # IPv4 TTL
> 17, # Protocol UDP
> csumip(14, 33), # IPv4 checksum
>
> /* UDP Header */
> 10, 0, 48, 1, # IP Src - adapt as needed
> 10, 0, 48, 10, # IP Dest - adapt as needed
> const16(5555), # UDP Src Port
> const16(6666), # UDP Dest Port
> const16(1008), # UDP length (UDP header 8 bytes + payload length)
> csumudp(14, 34), # UDP checksum
>
> /* Payload */
> fill('W', 1000),
> }
>
> and the observed message with that is for example
>
> igc 0000:01:00.0 enp1s0: Detected Tx Unit Hang
> Tx Queue <0>
> TDH <32>
> TDT <3c>
> next_to_use <3c>
> next_to_clean <32>
> buffer_info[next_to_clean]
> time_stamp <ffff26a8>
> next_to_watch <00000000632a1828>
> jiffies <ffff27f8>
> desc.status <1048000>
>
> Fixes: db0b124f02ba ("igc: Enhance Qbv scheduling by using first flag bit")
> Signed-off-by: Florian Kauer <[email protected]>
> Reviewed-by: Kurt Kanzenbach <[email protected]>
> ---
> drivers/net/ethernet/intel/igc/igc_main.c | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)

Tested-by: Naama Meir <[email protected]>