2022-04-25 09:45:56

by Pavel Pisa

[permalink] [raw]
Subject: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

The minor problems reported by actual checkpatch.pl and patchwork
rules has been resolved at price of disable of some debugging
options used initially to locate FPGA PCIe integration problems.

The CTU CAN FD IP core driver documentation has been linked
into CAN drivers index.

The code has been tested on QEMU with CTU CAN FD IP core
included functional model of PCIe integration.

The Linux net-next CTU CAN FD driver has been tested on real Xilinx
Zynq hardware by Matej Vasilevski even together with his
timestamp support patches. Preparation for public discussion
and mainlining is work in progress.

Jiapeng Chong (2):
can: ctucanfd: Remove unnecessary print function dev_err()
can: ctucanfd: Remove unused including <linux/version.h>

Pavel Pisa (2):
can: ctucanfd: remove PCI module debug parameters and core debug
statements.
docs: networking: device drivers: can: add ctucanfd and its author
e-mail update

.../can/ctu/ctucanfd-driver.rst | 2 +-
.../networking/device_drivers/can/index.rst | 1 +
drivers/net/can/ctucanfd/ctucanfd_base.c | 34 ++-----------------
drivers/net/can/ctucanfd/ctucanfd_pci.c | 22 ++++--------
drivers/net/can/ctucanfd/ctucanfd_platform.c | 1 -
5 files changed, 11 insertions(+), 49 deletions(-)

--
2.20.1



2022-04-29 04:36:58

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

On 24.04.2022 18:28:07, Pavel Pisa wrote:
> The minor problems reported by actual checkpatch.pl and patchwork
> rules has been resolved at price of disable of some debugging
> options used initially to locate FPGA PCIe integration problems.
>
> The CTU CAN FD IP core driver documentation has been linked
> into CAN drivers index.
>
> The code has been tested on QEMU with CTU CAN FD IP core
> included functional model of PCIe integration.
>
> The Linux net-next CTU CAN FD driver has been tested on real Xilinx
> Zynq hardware by Matej Vasilevski even together with his
> timestamp support patches. Preparation for public discussion
> and mainlining is work in progress.
>
> Jiapeng Chong (2):
> can: ctucanfd: Remove unnecessary print function dev_err()
> can: ctucanfd: Remove unused including <linux/version.h>

I had these already applied.

> Pavel Pisa (2):
> can: ctucanfd: remove PCI module debug parameters and core debug
> statements.
> docs: networking: device drivers: can: add ctucanfd and its author
> e-mail update

Split into separate patches and applied.

Thanks,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (1.38 kB)
signature.asc (499.00 B)
Download all attachments

2022-05-02 09:14:59

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

On 29.04.2022 23:31:28, Pavel Pisa wrote:
> > Split into separate patches and applied.
>
> Excuse me for late reply and thanks much for split to preferred
> form. Matej Vasilevski has tested updated linux-can-next testing
> on Xilinx Zynq 7000 based MZ_APO board and used it with his
> patches to do proceed next round of testing of Jan Charvat's NuttX
> TWAI (CAN) driver on ESP32C3. We plan that CTU CAN FD timestamping
> will be send for RFC/discussion soon.

Sounds good!

> I would like to thank to Andrew Dennison who implemented, tested
> and shares integration with LiteX and RISC-V
>
> https://github.com/litex-hub/linux-on-litex-vexriscv
>
> He uses development version of the CTU CAN FD IP core with configurable
> number of Tx buffers (2 to 8) for which will be required
> automatic setup logic in the driver.
>
> I need to discuss with Ondrej Ille actual state and his plans.
> But basically ntxbufs in the ctucan_probe_common() has to be assigned
> from TXTB_INFO TXT_BUFFER_COUNT field. For older core version
> the TXT_BUFFER_COUNT field bits should be equal to zero so when
> value is zero, the original version with fixed 4 buffers will
> be recognized.

Makes sense

> When value is configurable then for (uncommon) number
> of buffers which is not power of two, there will be likely
> a problem with way how buffers queue is implemented
>
> txtb_id = priv->txb_head % priv->ntxbufs;
> ...
> priv->txb_head++;
> ...
> priv->txb_tail++;
>
> When I have provided example for this type of queue many years
> ago I have probably shown example with power of 2 masking,
> but modulo by arbitrary number does not work with sequence
> overflow. Which means to add there two "if"s unfortunately
>
> if (++priv->txb_tail == 2 * priv->ntxbufs)
> priv->txb_tail = 0;

There's another way to implement this, here for ring->obj_num being
power of 2:

| static inline u8 mcp251xfd_get_tx_head(const struct mcp251xfd_tx_ring *ring)
| {
| return ring->head & (ring->obj_num - 1);
| }
|
| static inline u8 mcp251xfd_get_tx_tail(const struct mcp251xfd_tx_ring *ring)
| {
| return ring->tail & (ring->obj_num - 1);
| }
|
| static inline u8 mcp251xfd_get_tx_free(const struct mcp251xfd_tx_ring *ring)
| {
| return ring->obj_num - (ring->head - ring->tail);
| }

If you want to allow not power of 2 ring->obj_num, use "% ring->obj_num"
instead of "& (ring->obj_num - 1)".

I'm not sure of there is a real world benefit (only gut feeling, should
be measured) of using more than 4, but less than 8 TX buffers.

You can make use of more TX buffers, if you implement (fully hardware
based) TX IRQ coalescing (== handle more than one TX complete interrupt
at a time) like in the mcp251xfd driver, or BQL support (== send more
than one TX CAN frame at a time). I've played a bit with BQL support on
the mcp251xfd driver (which is attached by SPI), but with mixed results.
Probably an issue with proper configuration.

> We need 2 * priv->ntxbufs range to distinguish empty and full queue...
> But modulo is not nice either so I probably come with some other
> solution in a longer term. In the long term, I want to implement
> virtual queues to allow multiqueue to use dynamic Tx priority
> of up to 8 the buffers...

ACK, multiqueue TX support would be nice for things like the Earliest TX
Time First scheduler (ETF). 1 TX queue for ETF, the other for bulk
messages.

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (3.68 kB)
signature.asc (499.00 B)
Download all attachments

2022-05-02 18:24:36

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

On 02.05.2022 09:21:51, Marc Kleine-Budde wrote:
> On 29.04.2022 23:31:28, Pavel Pisa wrote:
> > > Split into separate patches and applied.
> >
> > Excuse me for late reply and thanks much for split to preferred
> > form. Matej Vasilevski has tested updated linux-can-next testing
> > on Xilinx Zynq 7000 based MZ_APO board and used it with his
> > patches to do proceed next round of testing of Jan Charvat's NuttX
> > TWAI (CAN) driver on ESP32C3. We plan that CTU CAN FD timestamping
> > will be send for RFC/discussion soon.
>
> Sounds good!

Please make use of struct cyclecounter/struct timecounter if needed,
e.g.:

| https://elixir.bootlin.com/linux/v5.18-rc5/source/drivers/net/can/spi/mcp251xfd/mcp251xfd-timestamp.c

Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (1.00 kB)
signature.asc (499.00 B)
Download all attachments

2022-05-03 00:38:01

by Pavel Pisa

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

Hello Marc,

On Thursday 28 of April 2022 09:22:39 Marc Kleine-Budde wrote:
> > Jiapeng Chong (2):
> > can: ctucanfd: Remove unnecessary print function dev_err()
> > can: ctucanfd: Remove unused including <linux/version.h>
>
> I had these already applied.
>
> > Pavel Pisa (2):
> > can: ctucanfd: remove PCI module debug parameters and core debug
> > statements.
> > docs: networking: device drivers: can: add ctucanfd and its author
> > e-mail update
>
> Split into separate patches and applied.

Excuse me for late reply and thanks much for split to preferred
form. Matej Vasilevski has tested updated linux-can-next testing
on Xilinx Zynq 7000 based MZ_APO board and used it with his
patches to do proceed next round of testing of Jan Charvat's NuttX
TWAI (CAN) driver on ESP32C3. We plan that CTU CAN FD timestamping
will be send for RFC/discussion soon.

I would like to thank to Andrew Dennison who implemented, tested
and shares integration with LiteX and RISC-V

https://github.com/litex-hub/linux-on-litex-vexriscv

He uses development version of the CTU CAN FD IP core with configurable
number of Tx buffers (2 to 8) for which will be required
automatic setup logic in the driver.

I need to discuss with Ondrej Ille actual state and his plans.
But basically ntxbufs in the ctucan_probe_common() has to be assigned
from TXTB_INFO TXT_BUFFER_COUNT field. For older core version
the TXT_BUFFER_COUNT field bits should be equal to zero so when
value is zero, the original version with fixed 4 buffers will
be recognized. When value is configurable then for (uncommon) number
of buffers which is not power of two, there will be likely
a problem with way how buffers queue is implemented

txtb_id = priv->txb_head % priv->ntxbufs;
...
priv->txb_head++;
...
priv->txb_tail++;

When I have provided example for this type of queue many years
ago I have probably shown example with power of 2 masking,
but modulo by arbitrary number does not work with sequence
overflow. Which means to add there two "if"s unfortunately

if (++priv->txb_tail == 2 * priv->ntxbufs)
priv->txb_tail = 0;

We need 2 * priv->ntxbufs range to distinguish empty and full queue...
But modulo is not nice either so I probably come with some other
solution in a longer term. In the long term, I want to implement
virtual queues to allow multiqueue to use dynamic Tx priority
of up to 8 the buffers...

Best wishes,

Pavel Pisa
phone: +420 603531357
e-mail: [email protected]
Department of Control Engineering FEE CVUT
Karlovo namesti 13, 121 35, Prague 2
university: http://control.fel.cvut.cz/
personal: http://cmp.felk.cvut.cz/~pisa
projects: https://www.openhub.net/accounts/ppisa
CAN related:http://canbus.pages.fel.cvut.cz/
Open Technologies Research Education and Exchange Services
https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home

2022-05-03 07:20:01

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

On 03.05.2022 16:32:32, Andrew Dennison wrote:
> > > When value is configurable then for (uncommon) number
> > > of buffers which is not power of two, there will be likely
> > > a problem with way how buffers queue is implemented
> >
>
> Only power of 2 makes sense to me: I didn't consider those corner
> cases but the driver could just round down to the next power of 2 and
> warn about a misconfiguration of the IP core.

+1

> I added the dynamic detection because the IP core default had changed
> to 2 TX buffers and this broke some hard coded assumptions in the
> driver in a rather obscure way that had me debugging for a bit...

The mainline driver uses a hard coded default of 4 still... Can you
provide that patch soonish?

> > You can make use of more TX buffers, if you implement (fully
> > hardware based) TX IRQ coalescing (== handle more than one TX
> > complete interrupt at a time) like in the mcp251xfd driver, or BQL
> > support (== send more than one TX CAN frame at a time). I've played
> > a bit with BQL support on the mcp251xfd driver (which is attached by
> > SPI), but with mixed results. Probably an issue with proper
> > configuration.
>
> Reducing CAN IRQ load would be good.

IRQ coalescing comes at the price of increased latency, but if you have
a timeout in hardware you can configure the latencies precisely.

> > > We need 2 * priv->ntxbufs range to distinguish empty and full
> > > queue... But modulo is not nice either so I probably come with
> > > some other solution in a longer term. In the long term, I want to
> > > implement virtual queues to allow multiqueue to use dynamic Tx
> > > priority of up to 8 the buffers...
> >
> > ACK, multiqueue TX support would be nice for things like the
> > Earliest TX Time First scheduler (ETF). 1 TX queue for ETF, the
> > other for bulk messages.
>
> Would be nice, I have multi-queue in the CAN layer I wrote for a
> little RTOS (predates socketcan) and have used for a while.

Out of interest:
What are the use cases? How did you decide which queue to use?

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (2.33 kB)
signature.asc (499.00 B)
Download all attachments

2022-05-03 07:49:12

by Pavel Pisa

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

Hello Marc and Andrew,

On Tuesday 03 of May 2022 08:46:26 Marc Kleine-Budde wrote:
> On 03.05.2022 16:32:32, Andrew Dennison wrote:
> > > > When value is configurable then for (uncommon) number
> > > > of buffers which is not power of two, there will be likely
> > > > a problem with way how buffers queue is implemented
> >
> > Only power of 2 makes sense to me: I didn't consider those corner
> > cases but the driver could just round down to the next power of 2 and
> > warn about a misconfiguration of the IP core.
>
> +1

Then (n-1) mask be used instead of modulo which is what I used for these
kind of queues.

https://sourceforge.net/p/ulan/ulut/ci/master/tree/ulut/ul_dqfifo.h

> > I added the dynamic detection because the IP core default had changed
> > to 2 TX buffers and this broke some hard coded assumptions in the
> > driver in a rather obscure way that had me debugging for a bit...
>
> The mainline driver uses a hard coded default of 4 still... Can you
> provide that patch soonish?

We discuss with Ondrej Ille final location of the bits with queue
length information etc... The version 3.0 of the core is in development
still. So I do not like to introduce something which would break
compatability with future revisions. Yes, we can check for version
reported by IP core but when it is possible I would not introduce
such logic... So if it gets to 5.19 it would be great but if we should
risk incompatible changes or too cluttered logic then it will be
5.20 material. Other option is to add Kconfig option to specify
maximal number of TX buffers used by the driver for now.

> > > You can make use of more TX buffers, if you implement (fully
> > > hardware based) TX IRQ coalescing (== handle more than one TX
> > > complete interrupt at a time) like in the mcp251xfd driver, or BQL
> > > support (== send more than one TX CAN frame at a time). I've played
> > > a bit with BQL support on the mcp251xfd driver (which is attached by
> > > SPI), but with mixed results. Probably an issue with proper
> > > configuration.
> >
> > Reducing CAN IRQ load would be good.
>
> IRQ coalescing comes at the price of increased latency, but if you have
> a timeout in hardware you can configure the latencies precisely.

HW coalescing not considered yet. Generally my intention for CAN use
is usually robotic and motion control and there is CAN and even CAN FD
on edge with its latencies already and SocketCAN layer adds yet
another level due common tasklets and threads with other often
dense and complex protocols on ETHERNET so to lover CPU load
by IRQ coalescing is not my priority. We have done latencies
evaluation of SocketCAN, LinCAN and RTEMS years ago on Oliver Hartkopp's
requests on standard and fully-preemptive kernels on more targets
(x86, PowerPC, ...) and I hope that we revive CAN Bench project
on Xilinx Zynq based MZ_APO again, see Martin Jerabek's theses FPGA
Based CAN Bus Channels Mutual Latency Tester and Evaluation, 2016

https://gitlab.fel.cvut.cz/canbus/zynq/zynq-can-sja1000-top/wikis/uploads/56b4d27d8f81ae390fc98bdce803398f/F3-BP-2016-Jerabek-Martin-Jerabek-thesis-2016.pdf

It is actual work of Matej Vasilevski. So I hope to have again more
insight into latencies on CAN. By the way, I plan to speak with
Carsten Emde on Embedded World if these is interrest to add
continuous HW timestamping based CAN latencies testing into OSADL
QA Farm

https://www.osadl.org/OSADL-QA-Farm-Real-time.linux-real-time.0.html

Other option is to setup system and run it locally at CTU
as we run complete CI on CTU CAN FD.

> > > > We need 2 * priv->ntxbufs range to distinguish empty and full
> > > > queue... But modulo is not nice either so I probably come with
> > > > some other solution in a longer term. In the long term, I want to
> > > > implement virtual queues to allow multiqueue to use dynamic Tx
> > > > priority of up to 8 the buffers...
> > >
> > > ACK, multiqueue TX support would be nice for things like the
> > > Earliest TX Time First scheduler (ETF). 1 TX queue for ETF, the
> > > other for bulk messages.
> >
> > Would be nice, I have multi-queue in the CAN layer I wrote for a
> > little RTOS (predates socketcan) and have used for a while.
>
> Out of interest:
> What are the use cases? How did you decide which queue to use?

For example for CAN open there should be at least three queues
to prevent CAN Tx priority inversion. one for NMT (network
management), one for PDO (process data objects) and least priority
for SDO (service data objects). That such applications works
somehow with single queue is only matter of luck and low
level of the link bandwidth utilization.

We have done research how to use Linux networking infrastructure
to route application send CAN messages into multiple queues
according to the CAN ID priorities. There are some results in mainline
from that work

Rostislav Lisovy 2014: can: Propagate SO_PRIORITY of raw sockets to skbs
Rostislav Lisovy 2012: net: em_canid: Ematch rule to match CAN frames
according to their identifiers

https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/sched/em_canid.c

So some enhancements and testing in this direction belongs
between my long horizon goals. But low priority now because
my company and even studnets at university are paid from
other projects (silicon-heaven, ESA, Bluetooth-monitoring,
NuttX etc.) so Linux CAN is hobby only at this moment.
But others have contracts for CTU CAN FD, Skoda Auto
testers etc. there...

Best wishes,

Pavel Pisa
phone: +420 603531357
e-mail: [email protected]
Department of Control Engineering FEE CVUT
Karlovo namesti 13, 121 35, Prague 2
university: http://control.fel.cvut.cz/
personal: http://cmp.felk.cvut.cz/~pisa
projects: https://www.openhub.net/accounts/ppisa
CAN related:http://canbus.pages.fel.cvut.cz/
Open Technologies Research Education and Exchange Services
https://gitlab.fel.cvut.cz/otrees/org/-/wikis/home

2022-05-03 08:24:45

by Andrew Dennison

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

plain text this time...

On Tue, 3 May 2022 at 16:46, Marc Kleine-Budde <[email protected]> wrote:
>
> On 03.05.2022 16:32:32, Andrew Dennison wrote:
> > > > When value is configurable then for (uncommon) number
> > > > of buffers which is not power of two, there will be likely
> > > > a problem with way how buffers queue is implemented
> > >
> >
> > Only power of 2 makes sense to me: I didn't consider those corner
> > cases but the driver could just round down to the next power of 2 and
> > warn about a misconfiguration of the IP core.
>
> +1
>
> > I added the dynamic detection because the IP core default had changed
> > to 2 TX buffers and this broke some hard coded assumptions in the
> > driver in a rather obscure way that had me debugging for a bit...
>
> The mainline driver uses a hard coded default of 4 still... Can you
> provide that patch soonish?

I was using the out of tree driver but can have a look at this, unless
Pavel wants to merge this in his tree and submit?

>
> > > You can make use of more TX buffers, if you implement (fully
> > > hardware based) TX IRQ coalescing (== handle more than one TX
> > > complete interrupt at a time) like in the mcp251xfd driver, or BQL
> > > support (== send more than one TX CAN frame at a time). I've played
> > > a bit with BQL support on the mcp251xfd driver (which is attached by
> > > SPI), but with mixed results. Probably an issue with proper
> > > configuration.
> >
> > Reducing CAN IRQ load would be good.
>
> IRQ coalescing comes at the price of increased latency, but if you have
> a timeout in hardware you can configure the latencies precisely.
>
> > > > We need 2 * priv->ntxbufs range to distinguish empty and full
> > > > queue... But modulo is not nice either so I probably come with
> > > > some other solution in a longer term. In the long term, I want to
> > > > implement virtual queues to allow multiqueue to use dynamic Tx
> > > > priority of up to 8 the buffers...
> > >
> > > ACK, multiqueue TX support would be nice for things like the
> > > Earliest TX Time First scheduler (ETF). 1 TX queue for ETF, the
> > > other for bulk messages.
> >
> > Would be nice, I have multi-queue in the CAN layer I wrote for a
> > little RTOS (predates socketcan) and have used for a while.
>
> Out of interest:
> What are the use cases? How did you decide which queue to use?

I had a queue per fd, with queues sorted by id of the next message,
then sent the lowest ID next for hardware with a single queue. For
hardware with lots of buffers there was a hw buffer per queue. I
didn't have to deal with the generic cases that would need to be
handled in linux. I must say ctucanfd has a much nicer interface than
the other can hardware I've used.

Kind regards,

Andrew

2022-05-03 11:50:56

by Marc Kleine-Budde

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

On 03.05.2022 09:27:15, Pavel Pisa wrote:
> Hello Marc and Andrew,
>
> On Tuesday 03 of May 2022 08:46:26 Marc Kleine-Budde wrote:
> > On 03.05.2022 16:32:32, Andrew Dennison wrote:
> > > > > When value is configurable then for (uncommon) number
> > > > > of buffers which is not power of two, there will be likely
> > > > > a problem with way how buffers queue is implemented
> > >
> > > Only power of 2 makes sense to me: I didn't consider those corner
> > > cases but the driver could just round down to the next power of 2 and
> > > warn about a misconfiguration of the IP core.
> >
> > +1
>
> Then (n-1) mask be used instead of modulo which is what I used for these
> kind of queues.
>
> https://sourceforge.net/p/ulan/ulut/ci/master/tree/ulut/ul_dqfifo.h

ACK

> > > I added the dynamic detection because the IP core default had changed
> > > to 2 TX buffers and this broke some hard coded assumptions in the
> > > driver in a rather obscure way that had me debugging for a bit...
> >
> > The mainline driver uses a hard coded default of 4 still... Can you
> > provide that patch soonish?
>
> We discuss with Ondrej Ille final location of the bits with queue
> length information etc... The version 3.0 of the core is in
> development still. So I do not like to introduce something which would
> break compatability with future revisions.

Makes sense. I'm a bit nervous, as Andrew said the default number of TX
buffers changed to 2 in the hardware.

> Yes, we can check for version reported by IP core but when it is
> possible I would not introduce such logic... So if it gets to 5.19 it
> would be great but if we should risk incompatible changes or too
> cluttered logic then it will be 5.20 material. Other option is to add
> Kconfig option to specify maximal number of TX buffers used by the
> driver for now.

No Kconfig option please! The hardware should be introspectable,
preferred a register holds the number of TX buffers. If this is not
available use a version register and hard code the number per version.

> > > > You can make use of more TX buffers, if you implement (fully
> > > > hardware based) TX IRQ coalescing (== handle more than one TX
> > > > complete interrupt at a time) like in the mcp251xfd driver, or BQL
> > > > support (== send more than one TX CAN frame at a time). I've played
> > > > a bit with BQL support on the mcp251xfd driver (which is attached by
> > > > SPI), but with mixed results. Probably an issue with proper
> > > > configuration.
> > >
> > > Reducing CAN IRQ load would be good.
> >
> > IRQ coalescing comes at the price of increased latency, but if you have
> > a timeout in hardware you can configure the latencies precisely.
>
> HW coalescing not considered yet. Generally my intention for CAN use
> is usually robotic and motion control and there is CAN and even CAN FD
> on edge with its latencies already and SocketCAN layer adds yet
> another level due common tasklets and threads with other often dense
> and complex protocols on ETHERNET so to lover CPU load by IRQ
> coalescing is not my priority.

For MMIO attached hardware IRQ load is a far less problem than slow
busses like SPI.

> We have done latencies evaluation of SocketCAN, LinCAN and RTEMS years
> ago on Oliver Hartkopp's requests on standard and fully-preemptive
> kernels on more targets (x86, PowerPC, ...) and I hope that we revive
> CAN Bench project on Xilinx Zynq based MZ_APO again, see Martin
> Jerabek's theses FPGA Based CAN Bus Channels Mutual Latency Tester and
> Evaluation, 2016
>
> https://gitlab.fel.cvut.cz/canbus/zynq/zynq-can-sja1000-top/wikis/uploads/56b4d27d8f81ae390fc98bdce803398f/F3-BP-2016-Jerabek-Martin-Jerabek-thesis-2016.pdf

Meanwhile you can configure the NAPI to be handled in per interface
kernel thread, which can be prioritized. This should reduce latencies
introduced by NAPI of other network cards using the global soft IRQ.

> It is actual work of Matej Vasilevski. So I hope to have again more
> insight into latencies on CAN. By the way, I plan to speak with
> Carsten Emde on Embedded World if these is interrest to add
> continuous HW timestamping based CAN latencies testing into OSADL
> QA Farm
>
> https://www.osadl.org/OSADL-QA-Farm-Real-time.linux-real-time.0.html
>
> Other option is to setup system and run it locally at CTU
> as we run complete CI on CTU CAN FD.
>
> > > > > We need 2 * priv->ntxbufs range to distinguish empty and full
> > > > > queue... But modulo is not nice either so I probably come with
> > > > > some other solution in a longer term. In the long term, I want to
> > > > > implement virtual queues to allow multiqueue to use dynamic Tx
> > > > > priority of up to 8 the buffers...
> > > >
> > > > ACK, multiqueue TX support would be nice for things like the
> > > > Earliest TX Time First scheduler (ETF). 1 TX queue for ETF, the
> > > > other for bulk messages.
> > >
> > > Would be nice, I have multi-queue in the CAN layer I wrote for a
> > > little RTOS (predates socketcan) and have used for a while.
> >
> > Out of interest:
> > What are the use cases? How did you decide which queue to use?
>
> For example for CAN open there should be at least three queues
> to prevent CAN Tx priority inversion. one for NMT (network
> management), one for PDO (process data objects) and least priority
> for SDO (service data objects). That such applications works
> somehow with single queue is only matter of luck and low
> level of the link bandwidth utilization.
>
> We have done research how to use Linux networking infrastructure
> to route application send CAN messages into multiple queues
> according to the CAN ID priorities. There are some results in mainline
> from that work
>
> Rostislav Lisovy 2014: can: Propagate SO_PRIORITY of raw sockets to skbs
> Rostislav Lisovy 2012: net: em_canid: Ematch rule to match CAN frames
> according to their identifiers
>
> https://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git/tree/net/sched/em_canid.c
>
> So some enhancements and testing in this direction belongs
> between my long horizon goals. But low priority now because
> my company and even studnets at university are paid from
> other projects (silicon-heaven, ESA, Bluetooth-monitoring,
> NuttX etc.) so Linux CAN is hobby only at this moment.
> But others have contracts for CTU CAN FD, Skoda Auto
> testers etc. there...

regards,
Marc

--
Pengutronix e.K. | Marc Kleine-Budde |
Embedded Linux | https://www.pengutronix.de |
Vertretung West/Dortmund | Phone: +49-231-2826-924 |
Amtsgericht Hildesheim, HRA 2686 | Fax: +49-5121-206917-5555 |


Attachments:
(No filename) (6.62 kB)
signature.asc (499.00 B)
Download all attachments

2022-05-09 03:12:59

by Andrew Dennison

[permalink] [raw]
Subject: Re: [PATCH v1 0/4] can: ctucanfd: clenup acoording to the actual rules and documentation linking

On Sun, 8 May 2022 at 01:51, Ondrej Ille <[email protected]> wrote:
>
> Hello all,
>
> again, sorry for the late reply, my daily job keeps me very busy, and the vision of completely new silicon
> coming to our office if we meet a tape-out date is somehow motivating :)
>
> Just few notes about the discussion:
>
> 1. Number of TXT Buffers
> I agree that driver should read-out this information from the HW, not rely on fixed configuration.
> True, the default value in HW master is 2, but Linux driver had 4 hard-coded. This was coming from
> times when there were only 4 buffers (no genericity in the HW). IMHO this is HW bug, because the
> intention when doing the "arbitrary number of buffers" extension, was to keep default value the
> same as in previous implementation. System architecture document also has "4" as value of txt_buffer_count generic.
>
> I will fix this ASAP in the master branch, hopefully regression will not complain so that the current driver
> version is compatible with default HW.
>
> As per reading out number of TXT Buffers from HW, Pavel proposed moving TXTB_INFO else-where
> so that there is more space for TX_COMMAND in the same memory word. The rationale here, is having
> reserve in case of an increasing number of TXT Buffers.
>
> But, with the current format of TX_COMMAND, the reserve would be up to 24 TXT Buffers. Even if there
> ever was a use-case for more than 8 buffers, there would need to be another non-compatible changes
> in TX_PRIORITY and TX_STATUS, and the whole register map would anyway not be backwards compatible...
> So, I propose to keep TXTB_INFO in its current location.

Hi Ondrej,

Based on this it seems my patches can be cleaned up for merging.

Pavel / Odjrej: did you want to include the patches in your public
driver tree and submit from there, or shall I submit them? Adding to
yoru tree would keep it in sync with the upstream driver already
submitted. If you provide a review I'll cleanup any issues reported. I
can submit directly to Linux as Marc proposed but having a single
authoritative source seems cleanest to me.

My current patches are on master in this tree:
https://github.com/AndrewD/ctucanfd_ip_core

I'll add "Signed-off-by: " to the commit messages next time I touch
this - once I get clarity on the way forward. I don't have an
immediate need for a Linux driver for ctucanfd so haven't touched this
recently.

Kind regards,

Andrew