Enhance the error reporting mechanism in the switchdev framework to
provide more informative and user-friendly error messages.
Following feedback from users struggling to understand the implications
of error messages like "failed (err=-28) to add object (id=2)", this
update aims to clarify what operation failed and how this might impact
the system or network.
With this change, error messages now include a description of the failed
operation, the specific object involved, and a brief explanation of the
potential impact on the system. This approach helps administrators and
developers better understand the context and severity of errors,
facilitating quicker and more effective troubleshooting.
Example of the improved logging:
[ 70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast
Database entry (object id=2) with error: -ENOSPC (-28).
[ 70.516446] Failure in updating the port's Multicast Database could
lead to multicast forwarding issues.
[ 70.516446] Current HW/SW setup lacks sufficient resources.
This comprehensive update includes handling for a range of switchdev
object IDs, ensuring that most operations within the switchdev framework
benefit from clearer error reporting.
Signed-off-by: Oleksij Rempel <[email protected]>
Reviewed-by: Simon Horman <[email protected]>
---
changes v2:
- make all variables const
- add Reviewed-by: Simon Horman <[email protected]>
net/switchdev/switchdev.c | 106 ++++++++++++++++++++++++++++++++++++--
1 file changed, 102 insertions(+), 4 deletions(-)
diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
index c9189a970eec3..de4a7505ceb4d 100644
--- a/net/switchdev/switchdev.c
+++ b/net/switchdev/switchdev.c
@@ -244,6 +244,106 @@ static int switchdev_port_obj_notify(enum switchdev_notifier_type nt,
return 0;
}
+static void switchdev_obj_id_to_helpful_msg(struct net_device *dev,
+ enum switchdev_obj_id obj_id,
+ int err, bool add)
+{
+ const char *action = add ? "add" : "del";
+ const char *reason = "";
+ const char *problem;
+ const char *obj_str;
+
+ switch (obj_id) {
+ case SWITCHDEV_OBJ_ID_UNDEFINED:
+ obj_str = "Undefined object";
+ problem = "Attempted operation is undefined, indicating a "
+ "possible programming error.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_PORT_VLAN:
+ obj_str = "VLAN entry";
+ problem = "Failure in VLAN settings on this port might disrupt "
+ "network segmentation or traffic isolation, affecting\n"
+ "network partitioning.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_PORT_MDB:
+ obj_str = "Port Multicast Database entry";
+ problem = "Failure in updating the port's Multicast Database "
+ "could lead to multicast forwarding issues.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_HOST_MDB:
+ obj_str = "Host Multicast Database entry";
+ problem = "Failure in updating the host's Multicast Database"
+ "may impact multicast group memberships or\n"
+ "traffic delivery, affecting multicast communication.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_MRP:
+ obj_str = "Media Redundancy Protocol configuration for port";
+ problem = "Failure to set MRP ring ID on this port prevents"
+ "communication with the specified redundancy ring,\n"
+ "resulting in an inability to engage in MRP-based "
+ "network operations.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_RING_TEST_MRP:
+ obj_str = "MRP Test Frame Operations for port";
+ problem = "Failure to generate/monitor MRP test frames may lead"
+ "to inability to assess the ring's operational\n"
+ "integrity and fault response, hindering proactive "
+ "network management.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_RING_ROLE_MRP:
+ obj_str = "MRP Ring Role Configuration";
+ problem = "Improper MRP ring role configuration may create "
+ "conflicts in the ring, disrupting communication\n"
+ "for all participants, or isolate the local system "
+ "from the ring, hindering its ability to communicate "
+ "with other participants.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_RING_STATE_MRP:
+ obj_str = "MRP Ring State Configuration";
+ problem = "Failure to correctly set the MRP ring state can "
+ "result in network loops or leave segments without\n"
+ "communication. In a Closed state, it maintains loop "
+ "prevention by blocking one MRM port, while an Open\n"
+ "state activates in response to failures, changing "
+ "port states to preserve network connectivity.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_IN_TEST_MRP:
+ obj_str = "MRP_InTest Frame Generation Configuration";
+ problem = "Failure in managing MRP_InTest frame generation can "
+ "misjudge the interconnection ring's state, leading\n"
+ "to incorrect blocking or unblocking of the I/C port."
+ "This misconfiguration might result in unintended\n"
+ "network loops or isolate critical network segments, "
+ "compromising network integrity and reliability.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_IN_ROLE_MRP:
+ obj_str = "Interconnection Ring Role Configuration";
+ problem = "Failure in incorrect assignment of interconnection "
+ "ring roles (MIM/MIC) can impair the formation of the\n"
+ "interconnection rings.\n";
+ break;
+ case SWITCHDEV_OBJ_ID_IN_STATE_MRP:
+ obj_str = "Interconnection Ring State Configuration";
+ problem = "Failure in updating the interconnection ring state "
+ "can lead in case of Open state to incorrect blocking\n"
+ "or unblocking of the I/C port, resulting in unintended"
+ "network loops or isolation of critical network\n";
+ break;
+ default:
+ obj_str = "Unknown object";
+ problem = "Indicating a possible programming error.\n";
+ }
+
+ switch (err) {
+ case -ENOSPC:
+ reason = "Current HW/SW setup lacks sufficient resources.\n";
+ break;
+ }
+
+ netdev_err(dev, "Failed to %s %s (object id=%d) with error: %pe (%d).\n%s%s\n",
+ action, obj_str, obj_id, ERR_PTR(err), err, problem, reason);
+}
+
static void switchdev_port_obj_add_deferred(struct net_device *dev,
const void *data)
{
@@ -254,8 +354,7 @@ static void switchdev_port_obj_add_deferred(struct net_device *dev,
err = switchdev_port_obj_notify(SWITCHDEV_PORT_OBJ_ADD,
dev, obj, NULL);
if (err && err != -EOPNOTSUPP)
- netdev_err(dev, "failed (err=%d) to add object (id=%d)\n",
- err, obj->id);
+ switchdev_obj_id_to_helpful_msg(dev, obj->id, err, true);
if (obj->complete)
obj->complete(dev, err, obj->complete_priv);
}
@@ -304,8 +403,7 @@ static void switchdev_port_obj_del_deferred(struct net_device *dev,
err = switchdev_port_obj_del_now(dev, obj);
if (err && err != -EOPNOTSUPP)
- netdev_err(dev, "failed (err=%d) to del object (id=%d)\n",
- err, obj->id);
+ switchdev_obj_id_to_helpful_msg(dev, obj->id, err, false);
if (obj->complete)
obj->complete(dev, err, obj->complete_priv);
}
--
2.39.2
On Fri, 8 Mar 2024 11:47:24 +0100 Oleksij Rempel wrote:
> + problem = "Failure in VLAN settings on this port might disrupt "
> + "network segmentation or traffic isolation, affecting\n"
> + "network partitioning.\n";
nit: checkpatch spies with its little eye that there are spaces instead
of tabs here
FWIW I'd also personally go for splitting the string only where the \n
are, but that's up to you.
--
pw-bot: cr
On Fri, 2024-03-08 at 11:47 +0100, Oleksij Rempel wrote:
> Enhance the error reporting mechanism in the switchdev framework to
> provide more informative and user-friendly error messages.
>
> Following feedback from users struggling to understand the implications
> of error messages like "failed (err=-28) to add object (id=2)", this
> update aims to clarify what operation failed and how this might impact
> the system or network.
>
> With this change, error messages now include a description of the failed
> operation, the specific object involved, and a brief explanation of the
> potential impact on the system. This approach helps administrators and
> developers better understand the context and severity of errors,
> facilitating quicker and more effective troubleshooting.
>
> Example of the improved logging:
>
> [ 70.516446] ksz-switch spi0.0 uplink: Failed to add Port Multicast
> Database entry (object id=2) with error: -ENOSPC (-28).
> [ 70.516446] Failure in updating the port's Multicast Database could
> lead to multicast forwarding issues.
> [ 70.516446] Current HW/SW setup lacks sufficient resources.
In general, I think the "problem" is being written with 80
columns in mind in the source and is not well thought out
how someone might read the log.
Generally, I think it's better to have single line statements
in the log.
[]
> diff --git a/net/switchdev/switchdev.c b/net/switchdev/switchdev.c
[]
> @@ -244,6 +244,106 @@ static int switchdev_port_obj_notify(enum switchdev_notifier_type nt,
> return 0;
> }
>
> +static void switchdev_obj_id_to_helpful_msg(struct net_device *dev,
> + enum switchdev_obj_id obj_id,
> + int err, bool add)
> +{
> + const char *action = add ? "add" : "del";
> + const char *reason = "";
> + const char *problem;
> + const char *obj_str;
> +
> + switch (obj_id) {
> + case SWITCHDEV_OBJ_ID_UNDEFINED:
> + obj_str = "Undefined object";
> + problem = "Attempted operation is undefined, indicating a "
> + "possible programming error.\n";
My preference would be to write
problem = "Attempted operation is undefined indicating a possible programming error\n";
> + break;
> + case SWITCHDEV_OBJ_ID_PORT_VLAN:
> + obj_str = "VLAN entry";
> + problem = "Failure in VLAN settings on this port might disrupt "
> + "network segmentation or traffic isolation, affecting\n"
> + "network partitioning.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_PORT_MDB:
> + obj_str = "Port Multicast Database entry";
> + problem = "Failure in updating the port's Multicast Database "
> + "could lead to multicast forwarding issues.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_HOST_MDB:
> + obj_str = "Host Multicast Database entry";
> + problem = "Failure in updating the host's Multicast Database"
> + "may impact multicast group memberships or\n"
No space after Database makes the output "Databasemay"
> + "traffic delivery, affecting multicast communication.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_MRP:
> + obj_str = "Media Redundancy Protocol configuration for port";
> + problem = "Failure to set MRP ring ID on this port prevents"
> + "communication with the specified redundancy ring,\n"
portcommunication
> + "resulting in an inability to engage in MRP-based "
> + "network operations.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_RING_TEST_MRP:
> + obj_str = "MRP Test Frame Operations for port";
> + problem = "Failure to generate/monitor MRP test frames may lead"
> + "to inability to assess the ring's operational\n"
leadto
> + "integrity and fault response, hindering proactive "
> + "network management.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_RING_ROLE_MRP:
> + obj_str = "MRP Ring Role Configuration";
> + problem = "Improper MRP ring role configuration may create "
> + "conflicts in the ring, disrupting communication\n"
> + "for all participants, or isolate the local system "
> + "from the ring, hindering its ability to communicate "
> + "with other participants.\n";
A bunch of unnecessary commas.
> + break;
> + case SWITCHDEV_OBJ_ID_RING_STATE_MRP:
> + obj_str = "MRP Ring State Configuration";
> + problem = "Failure to correctly set the MRP ring state can "
> + "result in network loops or leave segments without\n"
> + "communication. In a Closed state, it maintains loop "
> + "prevention by blocking one MRM port, while an Open\n"
> + "state activates in response to failures, changing "
> + "port states to preserve network connectivity.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_IN_TEST_MRP:
> + obj_str = "MRP_InTest Frame Generation Configuration";
> + problem = "Failure in managing MRP_InTest frame generation can "
> + "misjudge the interconnection ring's state, leading\n"
> + "to incorrect blocking or unblocking of the I/C port."
> + "This misconfiguration might result in unintended\n"
> + "network loops or isolate critical network segments, "
> + "compromising network integrity and reliability.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_IN_ROLE_MRP:
> + obj_str = "Interconnection Ring Role Configuration";
> + problem = "Failure in incorrect assignment of interconnection "
> + "ring roles (MIM/MIC) can impair the formation of the\n"
> + "interconnection rings.\n";
> + break;
> + case SWITCHDEV_OBJ_ID_IN_STATE_MRP:
> + obj_str = "Interconnection Ring State Configuration";
> + problem = "Failure in updating the interconnection ring state "
> + "can lead in case of Open state to incorrect blocking\n"
> + "or unblocking of the I/C port, resulting in unintended"
> + "network loops or isolation of critical network\n";
> + break;
> + default:
> + obj_str = "Unknown object";
> + problem = "Indicating a possible programming error.\n";
> + }
> +
> + switch (err) {
> + case -ENOSPC:
> + reason = "Current HW/SW setup lacks sufficient resources.\n";
And adding a newline here puts an unnecessary newline between
logging output as the format also has a trailing newline.
> + break;
> + }
> +
> + netdev_err(dev, "Failed to %s %s (object id=%d) with error: %pe (%d).\n%s%s\n",
> + action, obj_str, obj_id, ERR_PTR(err), err, problem, reason);
> +}
> +
On Sat, 09 Mar 2024 06:26:28 -0800
Joe Perches <[email protected]> wrote:
> > + problem = "Attempted operation is undefined, indicating a "
> > + "possible programming error.\n";
>
> My preference would be to write
> problem = "Attempted operation is undefined indicating a possible programming error\n";
ETOOWORDY.
Be concise please.
problem = "Operations is undefined\n";
On Sat, 2024-03-09 at 09:09 -0800, Stephen Hemminger wrote:
> On Sat, 09 Mar 2024 06:26:28 -0800
> Joe Perches <[email protected]> wrote:
>
> > > + problem = "Attempted operation is undefined, indicating a "
> > > + "possible programming error.\n";
> >
> > My preference would be to write
> > problem = "Attempted operation is undefined indicating a possible programming error\n";
>
> ETOOWORDY.
> Be concise please.
> problem = "Operations is undefined\n";
>
Concision is good, plurality matching is too.