2020-01-09 19:02:39

by Horatiu Vultur

[permalink] [raw]
Subject: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

Media Redundancy Protocol is a data network protocol standardized by
International Electrotechnical Commission as IEC 62439-2. It allows rings of
Ethernet switches to overcome any single failure with recovery time faster than
STP. It is primarily used in Industrial Ethernet applications.

This is the first proposal of implementing a subset of the standard. It supports
only 2 roles of an MRP node. It supports only Media Redundancy Manager(MRM) and
Media Redundancy Client(MRC). In a MRP ring, each node needs to support MRP and
in a ring can be only one MRM and multiple MRC. It is possible to have multiple
instances of MRP on a single node. But a port can be part of only one MRP
instance.

The MRM is responsible for detecting when there is a loop in the ring. It is
sending the frame MRP_Test to detect the loops. It would send MRP_Test on both
ports in the ring and if the frame is received at the other end, then the ring
is closed. Meaning that there is a loop. In this case it sets the port state to
BLOCKED, not allowing traffic to pass through except MRP frames. In case it
stops receiving MRP_Test frames from itself then the MRM will detect that the
ring is open, therefor it would notify the other nodes of this change and will
set the state of the port to be FORWARDING.

The MRC is responsible for forwarding MRP_Test frames between the ring ports
(and not to flood on other ports) and to listen when there is a change in the
network to clear the FDB.

Similar with STP, MRP is implemented on top of the bridge and they can't be
enable at the same time. While STP runs on all ports of the bridge, MRP needs to
run only on 2 ports.

The bridge needs to:
- notify when the link of one of the ports goes down or up, because MRP instance
needs to react to link changes by sending MRP_LinkChange frames.
- notify when one of the ports are removed from the bridge or when the bridge
is destroyed, because if the port is part of the MRP ring then MRP state
machine should be stopped.
- add a handler to allow MRP instance to process MRP frames, if MRP is enabled.
This is similar with STP design.
- add logic for MRP frames inside the bridge. The bridge will just detect MRP
frames and it would forward them to the upper layer to allow to process it.
- update the logic to update non-MRP frames. If MRP is enabled, then look also
at the state of the port to decide to forward or not.

To create a MRP instance on the bridge:
$ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1

Where:
p_port, s_port: can be any port under the bridge
ring_role: can have the value 1(MRC - Media Redundancy Client) or
2(MRM - Media Redundancy Manager). In a ring can be only one MRM.
ring_id: unique id for each MRP instance.

It is possible to create multiple instances. Each instance has to have it's own
ring_id and a port can't be part of multiple instances:
$ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2

To see current MRP instances and their status:
$ bridge mrp show
dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4

If this patch series is well received, the in the future it could be extended
with the following:
- add support for Media Redundancy Automanager. This role allows a node to
detect if needs to behave as a MRM or MRC. The advantage of this role is that
the user doesn't need to configure the nodes each time they are added/removed
from a ring and it adds redundancy to the manager.
- add support for Interconnect rings. This allow to connect multiple rings.
- add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
ms). To be able to achieve 30 and 10 it is required by the HW to generate the
MRP_Test frames and detect when the ring is open/closed.

Horatiu Vultur (3):
net: bridge: mrp: Add support for Media Redundancy Protocol
net: bridge: mrp: Integrate MRP into the bridge
net: bridge: mrp: Add netlink support to configure MRP

include/uapi/linux/if_bridge.h | 27 +
include/uapi/linux/if_ether.h | 1 +
include/uapi/linux/rtnetlink.h | 7 +
net/bridge/Kconfig | 12 +
net/bridge/Makefile | 2 +
net/bridge/br.c | 19 +
net/bridge/br_device.c | 3 +
net/bridge/br_forward.c | 1 +
net/bridge/br_if.c | 10 +
net/bridge/br_input.c | 22 +
net/bridge/br_mrp.c | 1517 ++++++++++++++++++++++++++++++++
net/bridge/br_mrp_timer.c | 227 +++++
net/bridge/br_netlink.c | 9 +
net/bridge/br_private.h | 30 +
net/bridge/br_private_mrp.h | 208 +++++
security/selinux/nlmsgtab.c | 5 +-
16 files changed, 2099 insertions(+), 1 deletion(-)
create mode 100644 net/bridge/br_mrp.c
create mode 100644 net/bridge/br_mrp_timer.c
create mode 100644 net/bridge/br_private_mrp.h

--
2.17.1


2020-01-09 20:28:44

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

On Thu, 9 Jan 2020 16:06:37 +0100
Horatiu Vultur <[email protected]> wrote:

> Media Redundancy Protocol is a data network protocol standardized by
> International Electrotechnical Commission as IEC 62439-2. It allows rings of
> Ethernet switches to overcome any single failure with recovery time faster than
> STP. It is primarily used in Industrial Ethernet applications.
>
> This is the first proposal of implementing a subset of the standard. It supports
> only 2 roles of an MRP node. It supports only Media Redundancy Manager(MRM) and
> Media Redundancy Client(MRC). In a MRP ring, each node needs to support MRP and
> in a ring can be only one MRM and multiple MRC. It is possible to have multiple
> instances of MRP on a single node. But a port can be part of only one MRP
> instance.
>
> The MRM is responsible for detecting when there is a loop in the ring. It is
> sending the frame MRP_Test to detect the loops. It would send MRP_Test on both
> ports in the ring and if the frame is received at the other end, then the ring
> is closed. Meaning that there is a loop. In this case it sets the port state to
> BLOCKED, not allowing traffic to pass through except MRP frames. In case it
> stops receiving MRP_Test frames from itself then the MRM will detect that the
> ring is open, therefor it would notify the other nodes of this change and will
> set the state of the port to be FORWARDING.
>
> The MRC is responsible for forwarding MRP_Test frames between the ring ports
> (and not to flood on other ports) and to listen when there is a change in the
> network to clear the FDB.
>
> Similar with STP, MRP is implemented on top of the bridge and they can't be
> enable at the same time. While STP runs on all ports of the bridge, MRP needs to
> run only on 2 ports.
>
> The bridge needs to:
> - notify when the link of one of the ports goes down or up, because MRP instance
> needs to react to link changes by sending MRP_LinkChange frames.
> - notify when one of the ports are removed from the bridge or when the bridge
> is destroyed, because if the port is part of the MRP ring then MRP state
> machine should be stopped.
> - add a handler to allow MRP instance to process MRP frames, if MRP is enabled.
> This is similar with STP design.
> - add logic for MRP frames inside the bridge. The bridge will just detect MRP
> frames and it would forward them to the upper layer to allow to process it.
> - update the logic to update non-MRP frames. If MRP is enabled, then look also
> at the state of the port to decide to forward or not.
>
> To create a MRP instance on the bridge:
> $ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1
>
> Where:
> p_port, s_port: can be any port under the bridge
> ring_role: can have the value 1(MRC - Media Redundancy Client) or
> 2(MRM - Media Redundancy Manager). In a ring can be only one MRM.
> ring_id: unique id for each MRP instance.
>
> It is possible to create multiple instances. Each instance has to have it's own
> ring_id and a port can't be part of multiple instances:
> $ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2
>
> To see current MRP instances and their status:
> $ bridge mrp show
> dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
> dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4
>
> If this patch series is well received, the in the future it could be extended
> with the following:
> - add support for Media Redundancy Automanager. This role allows a node to
> detect if needs to behave as a MRM or MRC. The advantage of this role is that
> the user doesn't need to configure the nodes each time they are added/removed
> from a ring and it adds redundancy to the manager.
> - add support for Interconnect rings. This allow to connect multiple rings.
> - add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
> ms). To be able to achieve 30 and 10 it is required by the HW to generate the
> MRP_Test frames and detect when the ring is open/closed.
>
> Horatiu Vultur (3):
> net: bridge: mrp: Add support for Media Redundancy Protocol
> net: bridge: mrp: Integrate MRP into the bridge
> net: bridge: mrp: Add netlink support to configure MRP
>
> include/uapi/linux/if_bridge.h | 27 +
> include/uapi/linux/if_ether.h | 1 +
> include/uapi/linux/rtnetlink.h | 7 +
> net/bridge/Kconfig | 12 +
> net/bridge/Makefile | 2 +
> net/bridge/br.c | 19 +
> net/bridge/br_device.c | 3 +
> net/bridge/br_forward.c | 1 +
> net/bridge/br_if.c | 10 +
> net/bridge/br_input.c | 22 +
> net/bridge/br_mrp.c | 1517 ++++++++++++++++++++++++++++++++
> net/bridge/br_mrp_timer.c | 227 +++++
> net/bridge/br_netlink.c | 9 +
> net/bridge/br_private.h | 30 +
> net/bridge/br_private_mrp.h | 208 +++++
> security/selinux/nlmsgtab.c | 5 +-
> 16 files changed, 2099 insertions(+), 1 deletion(-)
> create mode 100644 net/bridge/br_mrp.c
> create mode 100644 net/bridge/br_mrp_timer.c
> create mode 100644 net/bridge/br_private_mrp.h
>

Can this be implemented in userspace?

Putting STP in the kernel was a mistake (even original author says so).
Adding more control protocols in kernel is a security and stability risk.

2020-01-09 20:41:00

by Asbjørn Sloth Tønnesen

[permalink] [raw]
Subject: Re: [Bridge] [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

Hi Horatiu and Stephen,

Horatiu, thanks for giving this a try. I am looking forward to maybe someday
be able to run ERPS on white box switches.

On 1/9/20 4:19 PM, Stephen Hemminger wrote:
> Can this be implemented in userspace?
>
> Putting STP in the kernel was a mistake (even original author says so).
> Adding more control protocols in kernel is a security and stability risk.

Another case is VRRP, ERPS (ITU-T G.8032), VRRP group.

My use-case might not be common, but I have machines with about 10k net_dev (QinQ),
I would like to be able to do VRRP group on the outer VLANs, which are only a few
hundred instances without excessive context switching. I would then keep the the
normal keep-alive state machine in kernel, basically a BPF-based timed periodic
packet emitter facility and a XDP recieve hook. So only setup and event handling
has to context switched to user-space.

Unfortunately I haven't had time to explore this yet, but I think such an approach
could solve a few of the reasons that scalable bridge/ring/ha protocols have to wait
20 years before being implemented in Linux.

--
Best regards
Asbjørn Sloth Tønnesen
Network Engineer
Fiberby ApS - AS42541

2020-01-10 09:04:34

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

The 01/09/2020 08:19, Stephen Hemminger wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> On Thu, 9 Jan 2020 16:06:37 +0100
> Horatiu Vultur <[email protected]> wrote:
>
> > Media Redundancy Protocol is a data network protocol standardized by
> > International Electrotechnical Commission as IEC 62439-2. It allows rings of
> > Ethernet switches to overcome any single failure with recovery time faster than
> > STP. It is primarily used in Industrial Ethernet applications.
> >
> > This is the first proposal of implementing a subset of the standard. It supports
> > only 2 roles of an MRP node. It supports only Media Redundancy Manager(MRM) and
> > Media Redundancy Client(MRC). In a MRP ring, each node needs to support MRP and
> > in a ring can be only one MRM and multiple MRC. It is possible to have multiple
> > instances of MRP on a single node. But a port can be part of only one MRP
> > instance.
> >
> > The MRM is responsible for detecting when there is a loop in the ring. It is
> > sending the frame MRP_Test to detect the loops. It would send MRP_Test on both
> > ports in the ring and if the frame is received at the other end, then the ring
> > is closed. Meaning that there is a loop. In this case it sets the port state to
> > BLOCKED, not allowing traffic to pass through except MRP frames. In case it
> > stops receiving MRP_Test frames from itself then the MRM will detect that the
> > ring is open, therefor it would notify the other nodes of this change and will
> > set the state of the port to be FORWARDING.
> >
> > The MRC is responsible for forwarding MRP_Test frames between the ring ports
> > (and not to flood on other ports) and to listen when there is a change in the
> > network to clear the FDB.
> >
> > Similar with STP, MRP is implemented on top of the bridge and they can't be
> > enable at the same time. While STP runs on all ports of the bridge, MRP needs to
> > run only on 2 ports.
> >
> > The bridge needs to:
> > - notify when the link of one of the ports goes down or up, because MRP instance
> > needs to react to link changes by sending MRP_LinkChange frames.
> > - notify when one of the ports are removed from the bridge or when the bridge
> > is destroyed, because if the port is part of the MRP ring then MRP state
> > machine should be stopped.
> > - add a handler to allow MRP instance to process MRP frames, if MRP is enabled.
> > This is similar with STP design.
> > - add logic for MRP frames inside the bridge. The bridge will just detect MRP
> > frames and it would forward them to the upper layer to allow to process it.
> > - update the logic to update non-MRP frames. If MRP is enabled, then look also
> > at the state of the port to decide to forward or not.
> >
> > To create a MRP instance on the bridge:
> > $ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1
> >
> > Where:
> > p_port, s_port: can be any port under the bridge
> > ring_role: can have the value 1(MRC - Media Redundancy Client) or
> > 2(MRM - Media Redundancy Manager). In a ring can be only one MRM.
> > ring_id: unique id for each MRP instance.
> >
> > It is possible to create multiple instances. Each instance has to have it's own
> > ring_id and a port can't be part of multiple instances:
> > $ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2
> >
> > To see current MRP instances and their status:
> > $ bridge mrp show
> > dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
> > dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4
> >
> > If this patch series is well received, the in the future it could be extended
> > with the following:
> > - add support for Media Redundancy Automanager. This role allows a node to
> > detect if needs to behave as a MRM or MRC. The advantage of this role is that
> > the user doesn't need to configure the nodes each time they are added/removed
> > from a ring and it adds redundancy to the manager.
> > - add support for Interconnect rings. This allow to connect multiple rings.
> > - add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
> > ms). To be able to achieve 30 and 10 it is required by the HW to generate the
> > MRP_Test frames and detect when the ring is open/closed.
> >
> > Horatiu Vultur (3):
> > net: bridge: mrp: Add support for Media Redundancy Protocol
> > net: bridge: mrp: Integrate MRP into the bridge
> > net: bridge: mrp: Add netlink support to configure MRP
> >
> > include/uapi/linux/if_bridge.h | 27 +
> > include/uapi/linux/if_ether.h | 1 +
> > include/uapi/linux/rtnetlink.h | 7 +
> > net/bridge/Kconfig | 12 +
> > net/bridge/Makefile | 2 +
> > net/bridge/br.c | 19 +
> > net/bridge/br_device.c | 3 +
> > net/bridge/br_forward.c | 1 +
> > net/bridge/br_if.c | 10 +
> > net/bridge/br_input.c | 22 +
> > net/bridge/br_mrp.c | 1517 ++++++++++++++++++++++++++++++++
> > net/bridge/br_mrp_timer.c | 227 +++++
> > net/bridge/br_netlink.c | 9 +
> > net/bridge/br_private.h | 30 +
> > net/bridge/br_private_mrp.h | 208 +++++
> > security/selinux/nlmsgtab.c | 5 +-
> > 16 files changed, 2099 insertions(+), 1 deletion(-)
> > create mode 100644 net/bridge/br_mrp.c
> > create mode 100644 net/bridge/br_mrp_timer.c
> > create mode 100644 net/bridge/br_private_mrp.h
> >
>
> Can this be implemented in userspace?

The reason for putting this in kernal space is to HW offload this in
switchdev/dsa driver. The switches which typically supports this are
small and don't have a lot of CPU power and the bandwidth between the
CPU and switch core is typically limited(at least this is the case with
the switches that we are working). Therefor we need to use HW offload
components which can inject the frames at the needed frequency and other
components which can terminate the expected frames and just raise and
interrupt if the test frames are not received as expected(and a few
other HW features).

To put this in user-space we see two options:
1. We need to define a netlink interface which allows a user-space
control application to ask the kernel to ask the switchdev driver to
setup the frame-injector or frame-terminator. In theory this would be
possible, and we have considered it, but we think that this interface
will be too specific for our HW and will need to be changed every time
we want to add support for a new SoC. By focusing the user-space
interfaces on the protocol requirement, we feel more confident that we
have an interface which we can continue to be backwards compatible with,
and also support future/other chips with what ever facilities (if any)
they have to HW offload.

2. Do a UIO driver and keep protocol and driver in user-space. We do not
really like this approach for many reasons: it pretty much prevents us from
collaborating with the community to solve this and it will be really hard
to have the switchdev driver controlling part of the chip and a
user-space driver controlling other parts.

>
> Putting STP in the kernel was a mistake (even original author says so).
> Adding more control protocols in kernel is a security and stability risk.

--
/Horatiu

2020-01-10 14:15:38

by Nikolay Aleksandrov

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

On 09/01/2020 17:06, Horatiu Vultur wrote:
> Media Redundancy Protocol is a data network protocol standardized by
> International Electrotechnical Commission as IEC 62439-2. It allows rings of
> Ethernet switches to overcome any single failure with recovery time faster than
> STP. It is primarily used in Industrial Ethernet applications.
>
> This is the first proposal of implementing a subset of the standard. It supports
> only 2 roles of an MRP node. It supports only Media Redundancy Manager(MRM) and
> Media Redundancy Client(MRC). In a MRP ring, each node needs to support MRP and
> in a ring can be only one MRM and multiple MRC. It is possible to have multiple
> instances of MRP on a single node. But a port can be part of only one MRP
> instance.
>
> The MRM is responsible for detecting when there is a loop in the ring. It is
> sending the frame MRP_Test to detect the loops. It would send MRP_Test on both
> ports in the ring and if the frame is received at the other end, then the ring
> is closed. Meaning that there is a loop. In this case it sets the port state to
> BLOCKED, not allowing traffic to pass through except MRP frames. In case it
> stops receiving MRP_Test frames from itself then the MRM will detect that the
> ring is open, therefor it would notify the other nodes of this change and will
> set the state of the port to be FORWARDING.
>
> The MRC is responsible for forwarding MRP_Test frames between the ring ports
> (and not to flood on other ports) and to listen when there is a change in the
> network to clear the FDB.
>
> Similar with STP, MRP is implemented on top of the bridge and they can't be
> enable at the same time. While STP runs on all ports of the bridge, MRP needs to
> run only on 2 ports.
>
> The bridge needs to:
> - notify when the link of one of the ports goes down or up, because MRP instance
> needs to react to link changes by sending MRP_LinkChange frames.
> - notify when one of the ports are removed from the bridge or when the bridge
> is destroyed, because if the port is part of the MRP ring then MRP state
> machine should be stopped.
> - add a handler to allow MRP instance to process MRP frames, if MRP is enabled.
> This is similar with STP design.
> - add logic for MRP frames inside the bridge. The bridge will just detect MRP
> frames and it would forward them to the upper layer to allow to process it.
> - update the logic to update non-MRP frames. If MRP is enabled, then look also
> at the state of the port to decide to forward or not.
>
> To create a MRP instance on the bridge:
> $ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1
>
> Where:
> p_port, s_port: can be any port under the bridge
> ring_role: can have the value 1(MRC - Media Redundancy Client) or
> 2(MRM - Media Redundancy Manager). In a ring can be only one MRM.
> ring_id: unique id for each MRP instance.
>
> It is possible to create multiple instances. Each instance has to have it's own
> ring_id and a port can't be part of multiple instances:
> $ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2
>
> To see current MRP instances and their status:
> $ bridge mrp show
> dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
> dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4
>
> If this patch series is well received, the in the future it could be extended
> with the following:
> - add support for Media Redundancy Automanager. This role allows a node to
> detect if needs to behave as a MRM or MRC. The advantage of this role is that
> the user doesn't need to configure the nodes each time they are added/removed
> from a ring and it adds redundancy to the manager.
> - add support for Interconnect rings. This allow to connect multiple rings.
> - add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
> ms). To be able to achieve 30 and 10 it is required by the HW to generate the
> MRP_Test frames and detect when the ring is open/closed.
>
> Horatiu Vultur (3):
> net: bridge: mrp: Add support for Media Redundancy Protocol
> net: bridge: mrp: Integrate MRP into the bridge
> net: bridge: mrp: Add netlink support to configure MRP
>
> include/uapi/linux/if_bridge.h | 27 +
> include/uapi/linux/if_ether.h | 1 +
> include/uapi/linux/rtnetlink.h | 7 +
> net/bridge/Kconfig | 12 +
> net/bridge/Makefile | 2 +
> net/bridge/br.c | 19 +
> net/bridge/br_device.c | 3 +
> net/bridge/br_forward.c | 1 +
> net/bridge/br_if.c | 10 +
> net/bridge/br_input.c | 22 +
> net/bridge/br_mrp.c | 1517 ++++++++++++++++++++++++++++++++
> net/bridge/br_mrp_timer.c | 227 +++++
> net/bridge/br_netlink.c | 9 +
> net/bridge/br_private.h | 30 +
> net/bridge/br_private_mrp.h | 208 +++++
> security/selinux/nlmsgtab.c | 5 +-
> 16 files changed, 2099 insertions(+), 1 deletion(-)
> create mode 100644 net/bridge/br_mrp.c
> create mode 100644 net/bridge/br_mrp_timer.c
> create mode 100644 net/bridge/br_private_mrp.h
>

Hi all,
I agree with Stephen here, IMO you have to take note of how STP has progressed
and that bringing it in the kernel was a mistake, these days mstpd has an active
community and much better support which is being extended. This looks best implemented
in user-space in my opinion with minimal kernel changes to support it. You could simply
open a packet socket with a filter and work through that, you don't need new netlink
sockets. I'm not familiar with the protocol so can't really be the judge of that, if
you present a good argument for needing a new netlink socket for these packets - then
sure, ok.

If you do decide to continue with the kernel version (which I would again discourage)
a few general points (from a quick scan):
- the single 1.6+k line patch is just hard to review, please break it into more digestable
and logical pieces
- the locking is wrong, also there're a few use-after-free bugs
- please re-work the bridge integration code, it can be simplified and tests can be eliminated
- your netlink helpers usage is generally wrong and needs more work
- use the already existing port states instead of adding new ones and you can avoid some tests in fast-path
- perhaps look into using br_afspec() for configuration/retrieval initially ? I don't think you need the new rtm messages yet.
- I'm sure I can go on, but I really think all of this should be put in user-space -
in-kernel STP is a great example of how _not_ to do it. :) As a bonus you'll avoid 90% of the
problems above just by making your own abstractions and using them for it.


Thanks,
Nik

2020-01-10 15:38:22

by Stephen Hemminger

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

On Fri, 10 Jan 2020 10:02:06 +0100
Horatiu Vultur <[email protected]> wrote:

> >
> > Can this be implemented in userspace?
>
> The reason for putting this in kernal space is to HW offload this in
> switchdev/dsa driver. The switches which typically supports this are
> small and don't have a lot of CPU power and the bandwidth between the
> CPU and switch core is typically limited(at least this is the case with
> the switches that we are working). Therefor we need to use HW offload
> components which can inject the frames at the needed frequency and other
> components which can terminate the expected frames and just raise and
> interrupt if the test frames are not received as expected(and a few
> other HW features).
>
> To put this in user-space we see two options:
> 1. We need to define a netlink interface which allows a user-space
> control application to ask the kernel to ask the switchdev driver to
> setup the frame-injector or frame-terminator. In theory this would be
> possible, and we have considered it, but we think that this interface
> will be too specific for our HW and will need to be changed every time
> we want to add support for a new SoC. By focusing the user-space
> interfaces on the protocol requirement, we feel more confident that we
> have an interface which we can continue to be backwards compatible with,
> and also support future/other chips with what ever facilities (if any)
> they have to HW offload.
>
> 2. Do a UIO driver and keep protocol and driver in user-space. We do not
> really like this approach for many reasons: it pretty much prevents us from
> collaborating with the community to solve this and it will be really hard
> to have the switchdev driver controlling part of the chip and a
> user-space driver controlling other parts.
>
> >
> > Putting STP in the kernel was a mistake (even original author says so).
> > Adding more control protocols in kernel is a security and stability risk.

The principal in networking is to separate control and data plane.
This is widely adopted in many areas: OVS, routing, etc.

There is an existing devlink interface for device control, it would
make sense to extend it to allow for more control of frame inject etc.

2020-01-10 15:40:12

by Nikolay Aleksandrov

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

On 10/01/2020 16:13, Nikolay Aleksandrov wrote:
> On 09/01/2020 17:06, Horatiu Vultur wrote:
>> Media Redundancy Protocol is a data network protocol standardized by
>> International Electrotechnical Commission as IEC 62439-2. It allows rings of
>> Ethernet switches to overcome any single failure with recovery time faster than
>> STP. It is primarily used in Industrial Ethernet applications.
>>
>> This is the first proposal of implementing a subset of the standard. It supports
>> only 2 roles of an MRP node. It supports only Media Redundancy Manager(MRM) and
>> Media Redundancy Client(MRC). In a MRP ring, each node needs to support MRP and
>> in a ring can be only one MRM and multiple MRC. It is possible to have multiple
>> instances of MRP on a single node. But a port can be part of only one MRP
>> instance.
>>
>> The MRM is responsible for detecting when there is a loop in the ring. It is
>> sending the frame MRP_Test to detect the loops. It would send MRP_Test on both
>> ports in the ring and if the frame is received at the other end, then the ring
>> is closed. Meaning that there is a loop. In this case it sets the port state to
>> BLOCKED, not allowing traffic to pass through except MRP frames. In case it
>> stops receiving MRP_Test frames from itself then the MRM will detect that the
>> ring is open, therefor it would notify the other nodes of this change and will
>> set the state of the port to be FORWARDING.
>>
>> The MRC is responsible for forwarding MRP_Test frames between the ring ports
>> (and not to flood on other ports) and to listen when there is a change in the
>> network to clear the FDB.
>>
>> Similar with STP, MRP is implemented on top of the bridge and they can't be
>> enable at the same time. While STP runs on all ports of the bridge, MRP needs to
>> run only on 2 ports.
>>
>> The bridge needs to:
>> - notify when the link of one of the ports goes down or up, because MRP instance
>> needs to react to link changes by sending MRP_LinkChange frames.
>> - notify when one of the ports are removed from the bridge or when the bridge
>> is destroyed, because if the port is part of the MRP ring then MRP state
>> machine should be stopped.
>> - add a handler to allow MRP instance to process MRP frames, if MRP is enabled.
>> This is similar with STP design.
>> - add logic for MRP frames inside the bridge. The bridge will just detect MRP
>> frames and it would forward them to the upper layer to allow to process it.
>> - update the logic to update non-MRP frames. If MRP is enabled, then look also
>> at the state of the port to decide to forward or not.
>>
>> To create a MRP instance on the bridge:
>> $ bridge mrp add dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1
>>
>> Where:
>> p_port, s_port: can be any port under the bridge
>> ring_role: can have the value 1(MRC - Media Redundancy Client) or
>> 2(MRM - Media Redundancy Manager). In a ring can be only one MRM.
>> ring_id: unique id for each MRP instance.
>>
>> It is possible to create multiple instances. Each instance has to have it's own
>> ring_id and a port can't be part of multiple instances:
>> $ bridge mrp add dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2
>>
>> To see current MRP instances and their status:
>> $ bridge mrp show
>> dev br0 p_port eth2 s_port eth3 ring_role 1 ring_id 2 ring_state 3
>> dev br0 p_port eth0 s_port eth1 ring_role 2 ring_id 1 ring_state 4
>>
>> If this patch series is well received, the in the future it could be extended
>> with the following:
>> - add support for Media Redundancy Automanager. This role allows a node to
>> detect if needs to behave as a MRM or MRC. The advantage of this role is that
>> the user doesn't need to configure the nodes each time they are added/removed
>> from a ring and it adds redundancy to the manager.
>> - add support for Interconnect rings. This allow to connect multiple rings.
>> - add HW offloading. The standard defines 4 recovery times (500, 200, 30 and 10
>> ms). To be able to achieve 30 and 10 it is required by the HW to generate the
>> MRP_Test frames and detect when the ring is open/closed.
>>
>> Horatiu Vultur (3):
>> net: bridge: mrp: Add support for Media Redundancy Protocol
>> net: bridge: mrp: Integrate MRP into the bridge
>> net: bridge: mrp: Add netlink support to configure MRP
>>
>> include/uapi/linux/if_bridge.h | 27 +
>> include/uapi/linux/if_ether.h | 1 +
>> include/uapi/linux/rtnetlink.h | 7 +
>> net/bridge/Kconfig | 12 +
>> net/bridge/Makefile | 2 +
>> net/bridge/br.c | 19 +
>> net/bridge/br_device.c | 3 +
>> net/bridge/br_forward.c | 1 +
>> net/bridge/br_if.c | 10 +
>> net/bridge/br_input.c | 22 +
>> net/bridge/br_mrp.c | 1517 ++++++++++++++++++++++++++++++++
>> net/bridge/br_mrp_timer.c | 227 +++++
>> net/bridge/br_netlink.c | 9 +
>> net/bridge/br_private.h | 30 +
>> net/bridge/br_private_mrp.h | 208 +++++
>> security/selinux/nlmsgtab.c | 5 +-
>> 16 files changed, 2099 insertions(+), 1 deletion(-)
>> create mode 100644 net/bridge/br_mrp.c
>> create mode 100644 net/bridge/br_mrp_timer.c
>> create mode 100644 net/bridge/br_private_mrp.h
>>
>
> Hi all,
> I agree with Stephen here, IMO you have to take note of how STP has progressed
> and that bringing it in the kernel was a mistake, these days mstpd has an active
> community and much better support which is being extended. This looks best implemented
> in user-space in my opinion with minimal kernel changes to support it. You could simply
> open a packet socket with a filter and work through that, you don't need new netlink
> sockets. I'm not familiar with the protocol so can't really be the judge of that, if
> you present a good argument for needing a new netlink socket for these packets - then
> sure, ok.

nevermind the last sentence (about packet/netlink), I misread your earlier reply :)

>
> If you do decide to continue with the kernel version (which I would again discourage)
> a few general points (from a quick scan):
> - the single 1.6+k line patch is just hard to review, please break it into more digestable
> and logical pieces
> - the locking is wrong, also there're a few use-after-free bugs
> - please re-work the bridge integration code, it can be simplified and tests can be eliminated
> - your netlink helpers usage is generally wrong and needs more work
> - use the already existing port states instead of adding new ones and you can avoid some tests in fast-path
> - perhaps look into using br_afspec() for configuration/retrieval initially ? I don't think you need the new rtm messages yet.
> - I'm sure I can go on, but I really think all of this should be put in user-space -
> in-kernel STP is a great example of how _not_ to do it. :) As a bonus you'll avoid 90% of the
> problems above just by making your own abstractions and using them for it.
>
>
> Thanks,
> Nik
>

2020-01-10 16:07:12

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

Hi Nik,

> I agree with Stephen here, IMO you have to take note of how STP has progressed
> and that bringing it in the kernel was a mistake, these days mstpd has an active
> community and much better support which is being extended. This looks best implemented
> in user-space in my opinion with minimal kernel changes to support it. You could simply
> open a packet socket with a filter and work through that, you don't need new netlink
> sockets. I'm not familiar with the protocol so can't really be the judge of that, if
> you present a good argument for needing a new netlink socket for these packets - then
> sure, ok.

We are aware of the STP story, and in case of STP I do agree, it is much
better to have this in user-space. But while MRP has much in common with
STP, it also differs in some important areas.

Most importantly, MRP requires sending and receiving thousands of frames
per second. To achieve the 10ms recovery time, the tx period per
interface is 500us, on two interfaces, adding up to 4000 frames per
second to RX and 4000 to TX(if the ring is closed). And this is per
ring...

The CPU systems in the kind of switches we are working on can not handle
this load, and it was not meant to handle this. Instead the switch core
can do the periodic injection of frames and automatic terminate them.

In patch posted, we have not added this HW offload (we have this in our
internal repos, where we also have implemented the remaining part of the
protocol). The reason for this is that we wanted to do a proper SW
implementation and then HW offload it.

Looking back, I can see that what we have presented here could be done
equally good in user-space (roughly), but that is because the HW offload
is not part of this patch.

The problem in putting it in user-space is that we do not have a nice a
clean API where it is just putting a port in forwarding/blocking state
(like we have with STP). To do an abstraction that actually allow us to
utilize the HW to offload a protocol like MRP will very easy become too
specific for our SoC and rejected with that argument.

>
> If you do decide to continue with the kernel version (which I would again discourage)
> a few general points (from a quick scan):
> - the single 1.6+k line patch is just hard to review, please break it into more digestable
> and logical pieces
We will work in this.

> - the locking is wrong, also there're a few use-after-free bugs
Oops, that is not good - happy that you caught it. A hint on where,
would be great.

> - please re-work the bridge integration code, it can be simplified and tests can be eliminated
We will have a second look at that.

> - your netlink helpers usage is generally wrong and needs more work
Ok - some hints on what we did wrong would be great.

> - use the already existing port states instead of adding new ones and you can avoid some tests in fast-path
I assume you want us to re-use the STP concept of forwarding/blocking
and relay on the checks it already has.

> - perhaps look into using br_afspec() for configuration/retrieval initially ? I don't think you need the new rtm messages yet.
Is that a good example on how to do the netlink interface, and you want
us to use that as a reference?

> - I'm sure I can go on, but I really think all of this should be put in user-space -
> in-kernel STP is a great example of how _not_ to do it. :) As a bonus you'll avoid 90% of the
> problems above just by making your own abstractions and using them for it.
Please continue.

We do not see any good paths for getting user-space based solutions
which actually does use the HW offloading accepted upstream. If this
path exists then we would like to understand it and evaluate it
properly.

--
/Horatiu

2020-01-10 16:24:00

by Vladimir Oltean

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

Hi Horatiu,

On Fri, 10 Jan 2020 at 18:04, Horatiu Vultur
<[email protected]> wrote:
>
> Hi Nik,
>
> > I agree with Stephen here, IMO you have to take note of how STP has progressed
> > and that bringing it in the kernel was a mistake, these days mstpd has an active
> > community and much better support which is being extended. This looks best implemented
> > in user-space in my opinion with minimal kernel changes to support it. You could simply
> > open a packet socket with a filter and work through that, you don't need new netlink
> > sockets. I'm not familiar with the protocol so can't really be the judge of that, if
> > you present a good argument for needing a new netlink socket for these packets - then
> > sure, ok.
>
> We are aware of the STP story, and in case of STP I do agree, it is much
> better to have this in user-space. But while MRP has much in common with
> STP, it also differs in some important areas.
>
> Most importantly, MRP requires sending and receiving thousands of frames
> per second. To achieve the 10ms recovery time, the tx period per
> interface is 500us, on two interfaces, adding up to 4000 frames per
> second to RX and 4000 to TX(if the ring is closed). And this is per
> ring...
>
> The CPU systems in the kind of switches we are working on can not handle
> this load, and it was not meant to handle this. Instead the switch core
> can do the periodic injection of frames and automatic terminate them.
>
> In patch posted, we have not added this HW offload (we have this in our
> internal repos, where we also have implemented the remaining part of the
> protocol). The reason for this is that we wanted to do a proper SW
> implementation and then HW offload it.
>
> Looking back, I can see that what we have presented here could be done
> equally good in user-space (roughly), but that is because the HW offload
> is not part of this patch.
>
> The problem in putting it in user-space is that we do not have a nice a
> clean API where it is just putting a port in forwarding/blocking state
> (like we have with STP). To do an abstraction that actually allow us to
> utilize the HW to offload a protocol like MRP will very easy become too
> specific for our SoC and rejected with that argument.
>
> >
> > If you do decide to continue with the kernel version (which I would again discourage)
> > a few general points (from a quick scan):
> > - the single 1.6+k line patch is just hard to review, please break it into more digestable
> > and logical pieces
> We will work in this.
>
> > - the locking is wrong, also there're a few use-after-free bugs
> Oops, that is not good - happy that you caught it. A hint on where,
> would be great.
>
> > - please re-work the bridge integration code, it can be simplified and tests can be eliminated
> We will have a second look at that.
>
> > - your netlink helpers usage is generally wrong and needs more work
> Ok - some hints on what we did wrong would be great.
>
> > - use the already existing port states instead of adding new ones and you can avoid some tests in fast-path
> I assume you want us to re-use the STP concept of forwarding/blocking
> and relay on the checks it already has.
>
> > - perhaps look into using br_afspec() for configuration/retrieval initially ? I don't think you need the new rtm messages yet.
> Is that a good example on how to do the netlink interface, and you want
> us to use that as a reference?
>
> > - I'm sure I can go on, but I really think all of this should be put in user-space -
> > in-kernel STP is a great example of how _not_ to do it. :) As a bonus you'll avoid 90% of the
> > problems above just by making your own abstractions and using them for it.
> Please continue.
>
> We do not see any good paths for getting user-space based solutions
> which actually does use the HW offloading accepted upstream. If this
> path exists then we would like to understand it and evaluate it
> properly.
>
> --
> /Horatiu

I think it would help your case if you explained a bit more about the
hw offload primitives you have implemented internally. I believe you
are talking about the frame generation engine in the Ocelot switch
which has 1024 frame slots that are periodically sent based on one of
8 timers. For receive, I believe that the functionality is to offload
the consumption of these periodic frames, and just raise an interrupt
if frames were expected but not received.
For your use case of MRP, it makes perfect sense to have this. I am
just not sure (and not knowledgeable enough in Linux) what this engine
is offloading from the operating system's perspective.

Your justification for implementing MRP in the kernel seems to be that
it's better to make MRP part of the kernel uapi than a configuration
interface for your periodic engine, which in principle I agree with.
I'm just not sure if the offload that you propose will have a trivial
path into the kernel either, so it would make sense for reviewers to
see everything put together first.

Thanks,
-Vladimir

2020-01-10 16:49:23

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

> I think it would help your case if you explained a bit more about
> the hw offload primitives you have implemented internally.

Agreed.

Horatiu, could you also give some references to the frames that need
to be sent. I've no idea what information they need to contain, if the
contents is dynamic, or static, etc.

Andrew

2020-01-10 17:26:47

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)


Hi Valdimir and Andrew

The 01/10/2020 18:21, Vladimir Oltean wrote:
> I think it would help your case if you explained a bit more about the
> hw offload primitives you have implemented internally. I believe you
> are talking about the frame generation engine in the Ocelot switch
> which has 1024 frame slots that are periodically sent based on one of
> 8 timers. For receive, I believe that the functionality is to offload
> the consumption of these periodic frames, and just raise an interrupt
> if frames were expected but not received.
Yes something like this. But it is worth mention that it is not just about
injecting frames, sequence number needs to be incremented (by HW) etc.

> For your use case of MRP, it makes perfect sense to have this. I am
> just not sure (and not knowledgeable enough in Linux) what this engine
> is offloading from the operating system's perspective.
We will try to make that more clear.

> Your justification for implementing MRP in the kernel seems to be that
> it's better to make MRP part of the kernel uapi than a configuration
> interface for your periodic engine, which in principle I agree with.
> I'm just not sure if the offload that you propose will have a trivial
> path into the kernel either, so it would make sense for reviewers to
> see everything put together first.
You are right. The decision of start by publishing a pure SW implementation with
no HW offload was not the best.

I can do a new RFC that does including the HW offload hooks, and
describe what configurations we do when these hooks are called. The
actual HW which implements these hooks is still not released (and the
SwitchDev driver for this device is still not submitted).

> Horatiu, could you also give some references to the frames that need
> to be sent. I've no idea what information they need to contain, if the
> contents is dynamic, or static, etc.
It is dynamic - but trivial... Here is a dump from WireShark with
annotation on what our HW can update:

Ethernet II, Src: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1), Dst: Iec_00:00:01 (01:15:4e:00:00:01)
Destination: Iec_00:00:01 (01:15:4e:00:00:01)
Source: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1)
Type: MRP (0x88e3)
PROFINET MRP MRP_Test, MRP_Common, MRP_End
MRP_Version: 1
MRP_TLVHeader.Type: MRP_Test (0x02)
MRP_TLVHeader.Type: MRP_Test (0x02)
MRP_TLVHeader.Length: 18
MRP_Prio: 0x1f40 High priorities
MRP_SA: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1)
MRP_PortRole: Primary ring port (0x0000)
MRP_RingState: Ring closed (0x0001)
MRP_Transition: 0x0001
MRP_TimeStamp [ms]: 0x000cf574 <---------- Updated automatic
MRP_TLVHeader.Type: MRP_Common (0x01)
MRP_TLVHeader.Type: MRP_Common (0x01)
MRP_TLVHeader.Length: 18
MRP_SequenceID: 0x00e9 <---------- Updated automatic
MRP_DomainUUID: ffffffff-ffff-ffff-ffff-ffffffffffff
MRP_TLVHeader.Type: MRP_End (0x00)
MRP_TLVHeader.Type: MRP_End (0x00)
MRP_TLVHeader.Length: 0

But all the fields can change, but to change the other fields we need to
interact with the HW. Other SoC may have other capabilities in their
offload. As an example, if the ring becomes open then the fields
MRP_RingState and MRP_Transition need to change and in our case this
requires SW interference.

Would you like a PCAP file as an example? Or do you want a better
description of the frame format.

/Horatiu

2020-01-10 17:57:17

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

> > Horatiu, could you also give some references to the frames that need
> > to be sent. I've no idea what information they need to contain, if the
> > contents is dynamic, or static, etc.
> It is dynamic - but trivial...

If it is trivial, i don't see why you are so worried about abstracting
it?

> Here is a dump from WireShark with
> annotation on what our HW can update:
>
> Ethernet II, Src: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1), Dst: Iec_00:00:01 (01:15:4e:00:00:01)
> Destination: Iec_00:00:01 (01:15:4e:00:00:01)
> Source: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1)
> Type: MRP (0x88e3)
> PROFINET MRP MRP_Test, MRP_Common, MRP_End
> MRP_Version: 1
> MRP_TLVHeader.Type: MRP_Test (0x02)
> MRP_TLVHeader.Type: MRP_Test (0x02)
> MRP_TLVHeader.Length: 18
> MRP_Prio: 0x1f40 High priorities
> MRP_SA: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1)
> MRP_PortRole: Primary ring port (0x0000)
> MRP_RingState: Ring closed (0x0001)
> MRP_Transition: 0x0001
> MRP_TimeStamp [ms]: 0x000cf574 <---------- Updated automatic
> MRP_TLVHeader.Type: MRP_Common (0x01)
> MRP_TLVHeader.Type: MRP_Common (0x01)
> MRP_TLVHeader.Length: 18
> MRP_SequenceID: 0x00e9 <---------- Updated automatic
> MRP_DomainUUID: ffffffff-ffff-ffff-ffff-ffffffffffff
> MRP_TLVHeader.Type: MRP_End (0x00)
> MRP_TLVHeader.Type: MRP_End (0x00)
> MRP_TLVHeader.Length: 0
>
> But all the fields can change, but to change the other fields we need to
> interact with the HW. Other SoC may have other capabilities in their
> offload. As an example, if the ring becomes open then the fields
> MRP_RingState and MRP_Transition need to change and in our case this
> requires SW interference.

Isn't SW always required? You need to tell your state machine that the
state has changed.

> Would you like a PCAP file as an example? Or do you want a better
> description of the frame format.

I was hoping for a link to an RFC, or some standards document.

Andrew

2020-01-10 19:28:50

by David Miller

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

From: Nikolay Aleksandrov <[email protected]>
Date: Fri, 10 Jan 2020 16:13:36 +0200

> I agree with Stephen here, IMO you have to take note of how STP has progressed
> and that bringing it in the kernel was a mistake, these days mstpd has an active
> community and much better support which is being extended. This looks best implemented
> in user-space in my opinion with minimal kernel changes to support it. You could simply
> open a packet socket with a filter and work through that, you don't need new netlink
> sockets. I'm not familiar with the protocol so can't really be the judge of that, if
> you present a good argument for needing a new netlink socket for these packets - then
> sure, ok.

With a userland implementation, what approach do you suggest for DSA/switchdev offload
of this stuff?

2020-01-10 20:04:54

by Nikolay Aleksandrov

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

On 10 January 2020 21:27:36 EET, David Miller <[email protected]> wrote:
>From: Nikolay Aleksandrov <[email protected]>
>Date: Fri, 10 Jan 2020 16:13:36 +0200
>
>> I agree with Stephen here, IMO you have to take note of how STP has
>progressed
>> and that bringing it in the kernel was a mistake, these days mstpd
>has an active
>> community and much better support which is being extended. This looks
>best implemented
>> in user-space in my opinion with minimal kernel changes to support
>it. You could simply
>> open a packet socket with a filter and work through that, you don't
>need new netlink
>> sockets. I'm not familiar with the protocol so can't really be the
>judge of that, if
>> you present a good argument for needing a new netlink socket for
>these packets - then
>> sure, ok.
>
>With a userland implementation, what approach do you suggest for
>DSA/switchdev offload
>of this stuff?

Good question, there was no mention of that initially, or I missed it at least.
There aren't many details about what/how will be offloaded right now.
We need more information about what will be offloaded and how it will fit.



l

2020-01-10 20:14:04

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

The 01/10/2020 18:56, Andrew Lunn wrote:
> EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
>
> > > Horatiu, could you also give some references to the frames that need
> > > to be sent. I've no idea what information they need to contain, if the
> > > contents is dynamic, or static, etc.
> > It is dynamic - but trivial...
>
> If it is trivial, i don't see why you are so worried about abstracting
> it?
Maybe we misunderstood each other. When you asked if it is dynamic or
static, I thought you meant if it is the same frame being send repeated
or if it needs to be changed. It needs to be changed but the changes are
trivial, but it means that a non-MRP aware frame generator can't
properly offload this.

>
> > Here is a dump from WireShark with
> > annotation on what our HW can update:
> >
> > Ethernet II, Src: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1), Dst: Iec_00:00:01 (01:15:4e:00:00:01)
> > Destination: Iec_00:00:01 (01:15:4e:00:00:01)
> > Source: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1)
> > Type: MRP (0x88e3)
> > PROFINET MRP MRP_Test, MRP_Common, MRP_End
> > MRP_Version: 1
> > MRP_TLVHeader.Type: MRP_Test (0x02)
> > MRP_TLVHeader.Type: MRP_Test (0x02)
> > MRP_TLVHeader.Length: 18
> > MRP_Prio: 0x1f40 High priorities
> > MRP_SA: 7a:8b:b1:35:96:e1 (7a:8b:b1:35:96:e1)
> > MRP_PortRole: Primary ring port (0x0000)
> > MRP_RingState: Ring closed (0x0001)
> > MRP_Transition: 0x0001
> > MRP_TimeStamp [ms]: 0x000cf574 <---------- Updated automatic
> > MRP_TLVHeader.Type: MRP_Common (0x01)
> > MRP_TLVHeader.Type: MRP_Common (0x01)
> > MRP_TLVHeader.Length: 18
> > MRP_SequenceID: 0x00e9 <---------- Updated automatic
> > MRP_DomainUUID: ffffffff-ffff-ffff-ffff-ffffffffffff
> > MRP_TLVHeader.Type: MRP_End (0x00)
> > MRP_TLVHeader.Type: MRP_End (0x00)
> > MRP_TLVHeader.Length: 0
> >
> > But all the fields can change, but to change the other fields we need to
> > interact with the HW. Other SoC may have other capabilities in their
> > offload. As an example, if the ring becomes open then the fields
> > MRP_RingState and MRP_Transition need to change and in our case this
> > requires SW interference.
>
> Isn't SW always required? You need to tell your state machine that the
> state has changed.
In the manager role(MRM), SW is always required. The client can be
implemented simply by limiting the flood-mask of the L2 multicast MAC
used.

The auto and the interconnect roles also required SW to drive the
protocol. These roles are not part of this patch set.

>
> > Would you like a PCAP file as an example? Or do you want a better
> > description of the frame format.
>
> I was hoping for a link to an RFC, or some standards document.
Yeah, I would love to have that as well... Unfortunately this is
standardized by IEC, and the standards are not free.
It can be bought here: https://webstore.iec.ch/publication/24434

Due to the copyright/license of the PDF, I'm not allowed to give you a
copy of the one I have.

>
> Andrew

--
/Horatiu

2020-01-10 20:26:19

by Horatiu Vultur

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

> >With a userland implementation, what approach do you suggest for
> >DSA/switchdev offload
> >of this stuff?
>
> Good question, there was no mention of that initially, or I missed it at least.
> There aren't many details about what/how will be offloaded right now.
> We need more information about what will be offloaded and how it will fit.
I think we should do a new version of the RFC-Patch with the hooks to
offload included. Just the signatures and the invocations should give
the context we are missing in the discussion.

Depending on how the discussion goes from there, we can then either work
on putting this in user-space or fix the issues pointed out in the
original attempt.

/Horatiu

2020-01-10 20:34:54

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC net-next Patch 0/3] net: bridge: mrp: Add support for Media Redundancy Protocol(MRP)

On Fri, Jan 10, 2020 at 09:12:48PM +0100, Horatiu Vultur wrote:
> The 01/10/2020 18:56, Andrew Lunn wrote:
> > EXTERNAL EMAIL: Do not click links or open attachments unless you know the content is safe
> >
> > > > Horatiu, could you also give some references to the frames that need
> > > > to be sent. I've no idea what information they need to contain, if the
> > > > contents is dynamic, or static, etc.
> > > It is dynamic - but trivial...
> >
> > If it is trivial, i don't see why you are so worried about abstracting
> > it?
> Maybe we misunderstood each other. When you asked if it is dynamic or
> static, I thought you meant if it is the same frame being send repeated
> or if it needs to be changed. It needs to be changed but the changes are
> trivial, but it means that a non-MRP aware frame generator can't
> properly offload this.

The only frame generator i've ever seen are for generating test
packets. They generally have random content, random length, maybe the
option to send invalid CRC etc. These are never going to work for MRP.
So we should limit our thinking to hardware specifically designed for
MRP offload.

What we need to think about is an abstract model for MRP offload. What
must such a bit of hardware do? What parameters do we need to pass to
it? When should it interrupt us because some event has happened?

Once we have an abstract model, we can define netlink messages, or
devlink messages. And you can implement driver code which takes this
abstract model and implements it for your real hardware. And if you
have the abstract model correct, other vendors should also be able to
implement drivers as well.

Since this is a closed standard, there is not much the rest of us can
do. You need to define this abstract model. We can then review it.

Andrew