2020-11-19 03:58:41

by Huazhong Tan

[permalink] [raw]
Subject: [RFC net-next 0/2] net: updates for -next

#2 will add DIM for the HNS3 ethernet driver, then there will
be two implemation of IRQ adaptive coalescing (DIM and driver
custiom, so #1 adds a new netlink attribute to the
ETHTOOL_MSG_COALESCE_GET/ETHTOOL_MSG_COALESCE_SET commands
which controls the type of adaptive coalescing.

Huazhong Tan (2):
ethtool: add support for controling the type of adaptive coalescing
net: hns3: add support for dynamic interrupt moderation

drivers/net/ethernet/hisilicon/Kconfig | 1 +
drivers/net/ethernet/hisilicon/hns3/hns3_enet.c | 87 +++++++++++++++++++++-
drivers/net/ethernet/hisilicon/hns3/hns3_enet.h | 4 +
drivers/net/ethernet/hisilicon/hns3/hns3_ethtool.c | 7 +-
include/linux/ethtool.h | 1 +
include/uapi/linux/ethtool.h | 2 +
include/uapi/linux/ethtool_netlink.h | 1 +
net/ethtool/coalesce.c | 11 ++-
net/ethtool/netlink.h | 2 +-
9 files changed, 111 insertions(+), 5 deletions(-)

--
2.7.4


2020-11-19 03:58:58

by Huazhong Tan

[permalink] [raw]
Subject: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

Since the information whether the adaptive behavior is implemented
by DIM or driver custom is useful, so add new attribute to
ETHTOOL_MSG_COALESCE_GET/ETHTOOL_MSG_COALESCE_SET commands to control
the type of adaptive coalescing.

Suggested-by: Jakub Kicinski <[email protected]>
Signed-off-by: Huazhong Tan <[email protected]>
---
include/linux/ethtool.h | 1 +
include/uapi/linux/ethtool.h | 2 ++
include/uapi/linux/ethtool_netlink.h | 1 +
net/ethtool/coalesce.c | 11 +++++++++--
net/ethtool/netlink.h | 2 +-
5 files changed, 14 insertions(+), 3 deletions(-)

diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 6408b44..f57cb92 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -215,6 +215,7 @@ bool ethtool_convert_link_mode_to_legacy_u32(u32 *legacy_u32,
#define ETHTOOL_COALESCE_TX_USECS_HIGH BIT(19)
#define ETHTOOL_COALESCE_TX_MAX_FRAMES_HIGH BIT(20)
#define ETHTOOL_COALESCE_RATE_SAMPLE_INTERVAL BIT(21)
+#define ETHTOOL_COALESCE_USE_DIM BIT(22)

#define ETHTOOL_COALESCE_USECS \
(ETHTOOL_COALESCE_RX_USECS | ETHTOOL_COALESCE_TX_USECS)
diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
index 9ca87bc..afd8de2 100644
--- a/include/uapi/linux/ethtool.h
+++ b/include/uapi/linux/ethtool.h
@@ -433,6 +433,7 @@ struct ethtool_modinfo {
* a TX interrupt, when the packet rate is above @pkt_rate_high.
* @rate_sample_interval: How often to do adaptive coalescing packet rate
* sampling, measured in seconds. Must not be zero.
+ * @use_dim: Use DIM for IRQ coalescing, if adaptive coalescing is enabled.
*
* Each pair of (usecs, max_frames) fields specifies that interrupts
* should be coalesced until
@@ -483,6 +484,7 @@ struct ethtool_coalesce {
__u32 tx_coalesce_usecs_high;
__u32 tx_max_coalesced_frames_high;
__u32 rate_sample_interval;
+ __u32 use_dim;
};

/**
diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
index e2bf36e..e3458d9 100644
--- a/include/uapi/linux/ethtool_netlink.h
+++ b/include/uapi/linux/ethtool_netlink.h
@@ -366,6 +366,7 @@ enum {
ETHTOOL_A_COALESCE_TX_USECS_HIGH, /* u32 */
ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH, /* u32 */
ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL, /* u32 */
+ ETHTOOL_A_COALESCE_USE_DIM, /* u8 */

/* add new constants above here */
__ETHTOOL_A_COALESCE_CNT,
diff --git a/net/ethtool/coalesce.c b/net/ethtool/coalesce.c
index 1d6bc13..f12e5de 100644
--- a/net/ethtool/coalesce.c
+++ b/net/ethtool/coalesce.c
@@ -50,6 +50,7 @@ __CHECK_SUPPORTED_OFFSET(COALESCE_RX_MAX_FRAMES_HIGH);
__CHECK_SUPPORTED_OFFSET(COALESCE_TX_USECS_HIGH);
__CHECK_SUPPORTED_OFFSET(COALESCE_TX_MAX_FRAMES_HIGH);
__CHECK_SUPPORTED_OFFSET(COALESCE_RATE_SAMPLE_INTERVAL);
+__CHECK_SUPPORTED_OFFSET(COALESCE_USE_DIM);

const struct nla_policy ethnl_coalesce_get_policy[] = {
[ETHTOOL_A_COALESCE_HEADER] =
@@ -100,7 +101,8 @@ static int coalesce_reply_size(const struct ethnl_req_info *req_base,
nla_total_size(sizeof(u32)) + /* _RX_MAX_FRAMES_HIGH */
nla_total_size(sizeof(u32)) + /* _TX_USECS_HIGH */
nla_total_size(sizeof(u32)) + /* _TX_MAX_FRAMES_HIGH */
- nla_total_size(sizeof(u32)); /* _RATE_SAMPLE_INTERVAL */
+ nla_total_size(sizeof(u32)) + /* _RATE_SAMPLE_INTERVAL */
+ nla_total_size(sizeof(u8)); /* _USE_DIM */
}

static bool coalesce_put_u32(struct sk_buff *skb, u16 attr_type, u32 val,
@@ -170,7 +172,9 @@ static int coalesce_fill_reply(struct sk_buff *skb,
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH,
coal->tx_max_coalesced_frames_high, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL,
- coal->rate_sample_interval, supported))
+ coal->rate_sample_interval, supported) ||
+ coalesce_put_bool(skb, ETHTOOL_A_COALESCE_USE_DIM,
+ coal->use_dim, supported))
return -EMSGSIZE;

return 0;
@@ -215,6 +219,7 @@ const struct nla_policy ethnl_coalesce_set_policy[] = {
[ETHTOOL_A_COALESCE_TX_USECS_HIGH] = { .type = NLA_U32 },
[ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH] = { .type = NLA_U32 },
[ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL] = { .type = NLA_U32 },
+ [ETHTOOL_A_COALESCE_USE_DIM] = { .type = NLA_U8 },
};

int ethnl_set_coalesce(struct sk_buff *skb, struct genl_info *info)
@@ -303,6 +308,8 @@ int ethnl_set_coalesce(struct sk_buff *skb, struct genl_info *info)
tb[ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH], &mod);
ethnl_update_u32(&coalesce.rate_sample_interval,
tb[ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL], &mod);
+ ethnl_update_bool32(&coalesce.use_dim,
+ tb[ETHTOOL_A_COALESCE_USE_DIM], &mod);
ret = 0;
if (!mod)
goto out_ops;
diff --git a/net/ethtool/netlink.h b/net/ethtool/netlink.h
index d8efec5..e63f44a 100644
--- a/net/ethtool/netlink.h
+++ b/net/ethtool/netlink.h
@@ -366,7 +366,7 @@ extern const struct nla_policy ethnl_rings_set_policy[ETHTOOL_A_RINGS_TX + 1];
extern const struct nla_policy ethnl_channels_get_policy[ETHTOOL_A_CHANNELS_HEADER + 1];
extern const struct nla_policy ethnl_channels_set_policy[ETHTOOL_A_CHANNELS_COMBINED_COUNT + 1];
extern const struct nla_policy ethnl_coalesce_get_policy[ETHTOOL_A_COALESCE_HEADER + 1];
-extern const struct nla_policy ethnl_coalesce_set_policy[ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL + 1];
+extern const struct nla_policy ethnl_coalesce_set_policy[ETHTOOL_A_COALESCE_USE_DIM + 1];
extern const struct nla_policy ethnl_pause_get_policy[ETHTOOL_A_PAUSE_HEADER + 1];
extern const struct nla_policy ethnl_pause_set_policy[ETHTOOL_A_PAUSE_TX + 1];
extern const struct nla_policy ethnl_eee_get_policy[ETHTOOL_A_EEE_HEADER + 1];
--
2.7.4

2020-11-19 04:18:00

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> index 9ca87bc..afd8de2 100644
> --- a/include/uapi/linux/ethtool.h
> +++ b/include/uapi/linux/ethtool.h
> @@ -433,6 +433,7 @@ struct ethtool_modinfo {
> * a TX interrupt, when the packet rate is above @pkt_rate_high.
> * @rate_sample_interval: How often to do adaptive coalescing packet rate
> * sampling, measured in seconds. Must not be zero.
> + * @use_dim: Use DIM for IRQ coalescing, if adaptive coalescing is enabled.
> *
> * Each pair of (usecs, max_frames) fields specifies that interrupts
> * should be coalesced until
> @@ -483,6 +484,7 @@ struct ethtool_coalesce {
> __u32 tx_coalesce_usecs_high;
> __u32 tx_max_coalesced_frames_high;
> __u32 rate_sample_interval;
> + __u32 use_dim;
> };

You cannot do this.

static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
void __user *useraddr)
{
struct ethtool_coalesce coalesce;
int ret;

if (!dev->ethtool_ops->set_coalesce)
return -EOPNOTSUPP;

if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
return -EFAULT;

An old ethtool binary is not going to set this extra last byte to
anything meaningful. You cannot tell if you have an old or new user
space, so you have no idea if it put anything into use_dim, or if it
is random junk.

You have to leave the IOCTL interface unchanged, and limit this new
feature to the netlink API.

> diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
> index e2bf36e..e3458d9 100644
> --- a/include/uapi/linux/ethtool_netlink.h
> +++ b/include/uapi/linux/ethtool_netlink.h
> @@ -366,6 +366,7 @@ enum {
> ETHTOOL_A_COALESCE_TX_USECS_HIGH, /* u32 */
> ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH, /* u32 */
> ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL, /* u32 */
> + ETHTOOL_A_COALESCE_USE_DIM, /* u8 */

This appears to be a boolean? So /* flag */ would be better. Or do you
think there is scope for a few different algorithms, and an enum would
be better. If so, you should add the enum with the two current
options.

Andrew

2020-11-19 08:58:32

by Huazhong Tan

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing



On 2020/11/19 12:15, Andrew Lunn wrote:
>> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
>> index 9ca87bc..afd8de2 100644
>> --- a/include/uapi/linux/ethtool.h
>> +++ b/include/uapi/linux/ethtool.h
>> @@ -433,6 +433,7 @@ struct ethtool_modinfo {
>> * a TX interrupt, when the packet rate is above @pkt_rate_high.
>> * @rate_sample_interval: How often to do adaptive coalescing packet rate
>> * sampling, measured in seconds. Must not be zero.
>> + * @use_dim: Use DIM for IRQ coalescing, if adaptive coalescing is enabled.
>> *
>> * Each pair of (usecs, max_frames) fields specifies that interrupts
>> * should be coalesced until
>> @@ -483,6 +484,7 @@ struct ethtool_coalesce {
>> __u32 tx_coalesce_usecs_high;
>> __u32 tx_max_coalesced_frames_high;
>> __u32 rate_sample_interval;
>> + __u32 use_dim;
>> };
>
> You cannot do this.
>
> static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
> void __user *useraddr)
> {
> struct ethtool_coalesce coalesce;
> int ret;
>
> if (!dev->ethtool_ops->set_coalesce)
> return -EOPNOTSUPP;
>
> if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
> return -EFAULT;
>
> An old ethtool binary is not going to set this extra last byte to
> anything meaningful. You cannot tell if you have an old or new user
> space, so you have no idea if it put anything into use_dim, or if it
> is random junk.
>
> You have to leave the IOCTL interface unchanged, and limit this new
> feature to the netlink API.
>

Hi, Andrew.
thanks for pointing out this problem, i will fix it.
without callling set_coalesce/set_coalesce of ethtool_ops, do you have
any suggestion for writing/reading this new attribute to/from the
driver? add a new field in net_device or a new callback function in
ethtool_ops seems not good.

>> diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
>> index e2bf36e..e3458d9 100644
>> --- a/include/uapi/linux/ethtool_netlink.h
>> +++ b/include/uapi/linux/ethtool_netlink.h
>> @@ -366,6 +366,7 @@ enum {
>> ETHTOOL_A_COALESCE_TX_USECS_HIGH, /* u32 */
>> ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH, /* u32 */
>> ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL, /* u32 */
>> + ETHTOOL_A_COALESCE_USE_DIM, /* u8 */
>
> This appears to be a boolean? So /* flag */ would be better. Or do you
> think there is scope for a few different algorithms, and an enum would
> be better. If so, you should add the enum with the two current
> options.
>
> Andrew
>

ok, boolean seems enough.

Thanks.
Huazhong.
> .
>

2020-11-19 21:46:20

by Michal Kubecek

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

On Thu, Nov 19, 2020 at 05:15:57AM +0100, Andrew Lunn wrote:
> > diff --git a/include/uapi/linux/ethtool_netlink.h b/include/uapi/linux/ethtool_netlink.h
> > index e2bf36e..e3458d9 100644
> > --- a/include/uapi/linux/ethtool_netlink.h
> > +++ b/include/uapi/linux/ethtool_netlink.h
> > @@ -366,6 +366,7 @@ enum {
> > ETHTOOL_A_COALESCE_TX_USECS_HIGH, /* u32 */
> > ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH, /* u32 */
> > ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL, /* u32 */
> > + ETHTOOL_A_COALESCE_USE_DIM, /* u8 */
>
> This appears to be a boolean? So /* flag */ would be better. Or do you
> think there is scope for a few different algorithms, and an enum would
> be better. If so, you should add the enum with the two current
> options.

NLA_FLAG would suffice for a read only bool attribute but if the
attribute is expected to be set, we need to distinguish three cases:
set to true, set to false and keep current value. That's why we use
NLA_U8 for read write bool attributes (0 false, anything else true and
attribute not present means leave untouched).

Michal

2020-11-19 22:06:25

by Michal Kubecek

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

On Thu, Nov 19, 2020 at 04:56:42PM +0800, tanhuazhong wrote:
> On 2020/11/19 12:15, Andrew Lunn wrote:
> > > diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
> > > index 9ca87bc..afd8de2 100644
> > > --- a/include/uapi/linux/ethtool.h
> > > +++ b/include/uapi/linux/ethtool.h
> > > @@ -433,6 +433,7 @@ struct ethtool_modinfo {
> > > * a TX interrupt, when the packet rate is above @pkt_rate_high.
> > > * @rate_sample_interval: How often to do adaptive coalescing packet rate
> > > * sampling, measured in seconds. Must not be zero.
> > > + * @use_dim: Use DIM for IRQ coalescing, if adaptive coalescing is enabled.
> > > *
> > > * Each pair of (usecs, max_frames) fields specifies that interrupts
> > > * should be coalesced until
> > > @@ -483,6 +484,7 @@ struct ethtool_coalesce {
> > > __u32 tx_coalesce_usecs_high;
> > > __u32 tx_max_coalesced_frames_high;
> > > __u32 rate_sample_interval;
> > > + __u32 use_dim;
> > > };
> >
> > You cannot do this.
> >
> > static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
> > void __user *useraddr)
> > {
> > struct ethtool_coalesce coalesce;
> > int ret;
> >
> > if (!dev->ethtool_ops->set_coalesce)
> > return -EOPNOTSUPP;
> >
> > if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
> > return -EFAULT;
> >
> > An old ethtool binary is not going to set this extra last byte to
> > anything meaningful. You cannot tell if you have an old or new user
> > space, so you have no idea if it put anything into use_dim, or if it
> > is random junk.

Even worse, as there is no indication of data length, ETHTOOL_GCOALESCE
ioctl request from old ethtool on new kernel would result in kernel
writing past the end of userspace buffer.

> > You have to leave the IOCTL interface unchanged, and limit this new
> > feature to the netlink API.
> >
>
> Hi, Andrew.
> thanks for pointing out this problem, i will fix it.
> without callling set_coalesce/set_coalesce of ethtool_ops, do you have any
> suggestion for writing/reading this new attribute to/from the driver? add a
> new field in net_device or a new callback function in ethtool_ops seems not
> good.

We could use a similar approach as struct ethtool_link_ksettings, e.g.

struct kernel_ethtool_coalesce {
struct ethtool_coalesce base;
/* new members which are not part of UAPI */
}

get_coalesce() and set_coalesce() would get pointer to struct
kernel_ethtool_coalesce and ioctl code would be modified to only touch
the base (legacy?) part.

While already changing the ops arguments, we could also add extack
pointer, either as a separate argument or as struct member (I slightly
prefer the former).

BtW, please don't forget to update the message descriptions in
Documentation/networking/ethtool-netlink.rst

Michal

2020-11-20 01:55:37

by Huazhong Tan

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing



On 2020/11/20 6:02, Michal Kubecek wrote:
> On Thu, Nov 19, 2020 at 04:56:42PM +0800, tanhuazhong wrote:
>> On 2020/11/19 12:15, Andrew Lunn wrote:
>>>> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
>>>> index 9ca87bc..afd8de2 100644
>>>> --- a/include/uapi/linux/ethtool.h
>>>> +++ b/include/uapi/linux/ethtool.h
>>>> @@ -433,6 +433,7 @@ struct ethtool_modinfo {
>>>> * a TX interrupt, when the packet rate is above @pkt_rate_high.
>>>> * @rate_sample_interval: How often to do adaptive coalescing packet rate
>>>> * sampling, measured in seconds. Must not be zero.
>>>> + * @use_dim: Use DIM for IRQ coalescing, if adaptive coalescing is enabled.
>>>> *
>>>> * Each pair of (usecs, max_frames) fields specifies that interrupts
>>>> * should be coalesced until
>>>> @@ -483,6 +484,7 @@ struct ethtool_coalesce {
>>>> __u32 tx_coalesce_usecs_high;
>>>> __u32 tx_max_coalesced_frames_high;
>>>> __u32 rate_sample_interval;
>>>> + __u32 use_dim;
>>>> };
>>>
>>> You cannot do this.
>>>
>>> static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
>>> void __user *useraddr)
>>> {
>>> struct ethtool_coalesce coalesce;
>>> int ret;
>>>
>>> if (!dev->ethtool_ops->set_coalesce)
>>> return -EOPNOTSUPP;
>>>
>>> if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
>>> return -EFAULT;
>>>
>>> An old ethtool binary is not going to set this extra last byte to
>>> anything meaningful. You cannot tell if you have an old or new user
>>> space, so you have no idea if it put anything into use_dim, or if it
>>> is random junk.
>
> Even worse, as there is no indication of data length, ETHTOOL_GCOALESCE
> ioctl request from old ethtool on new kernel would result in kernel
> writing past the end of userspace buffer.
>
>>> You have to leave the IOCTL interface unchanged, and limit this new
>>> feature to the netlink API.
>>>
>>
>> Hi, Andrew.
>> thanks for pointing out this problem, i will fix it.
>> without callling set_coalesce/set_coalesce of ethtool_ops, do you have any
>> suggestion for writing/reading this new attribute to/from the driver? add a
>> new field in net_device or a new callback function in ethtool_ops seems not
>> good.
>
> We could use a similar approach as struct ethtool_link_ksettings, e.g.
>
> struct kernel_ethtool_coalesce {
> struct ethtool_coalesce base;
> /* new members which are not part of UAPI */
> }
>
> get_coalesce() and set_coalesce() would get pointer to struct
> kernel_ethtool_coalesce and ioctl code would be modified to only touch
> the base (legacy?) part.
>

Thanks for your suggestion. i will implement it in V2.

> While already changing the ops arguments, we could also add extack
> pointer, either as a separate argument or as struct member (I slightly
> prefer the former).
>
> BtW, please don't forget to update the message descriptions in
> Documentation/networking/ethtool-netlink.rst
>
> Michal
>

will update the doc in V2.

Thanks.
Huazhong.

> .
>

2020-11-20 03:02:01

by Huazhong Tan

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing



On 2020/11/20 6:02, Michal Kubecek wrote:
> On Thu, Nov 19, 2020 at 04:56:42PM +0800, tanhuazhong wrote:
>> On 2020/11/19 12:15, Andrew Lunn wrote:
>>>> diff --git a/include/uapi/linux/ethtool.h b/include/uapi/linux/ethtool.h
>>>> index 9ca87bc..afd8de2 100644
>>>> --- a/include/uapi/linux/ethtool.h
>>>> +++ b/include/uapi/linux/ethtool.h
>>>> @@ -433,6 +433,7 @@ struct ethtool_modinfo {
>>>> * a TX interrupt, when the packet rate is above @pkt_rate_high.
>>>> * @rate_sample_interval: How often to do adaptive coalescing packet rate
>>>> * sampling, measured in seconds. Must not be zero.
>>>> + * @use_dim: Use DIM for IRQ coalescing, if adaptive coalescing is enabled.
>>>> *
>>>> * Each pair of (usecs, max_frames) fields specifies that interrupts
>>>> * should be coalesced until
>>>> @@ -483,6 +484,7 @@ struct ethtool_coalesce {
>>>> __u32 tx_coalesce_usecs_high;
>>>> __u32 tx_max_coalesced_frames_high;
>>>> __u32 rate_sample_interval;
>>>> + __u32 use_dim;
>>>> };
>>>
>>> You cannot do this.
>>>
>>> static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
>>> void __user *useraddr)
>>> {
>>> struct ethtool_coalesce coalesce;
>>> int ret;
>>>
>>> if (!dev->ethtool_ops->set_coalesce)
>>> return -EOPNOTSUPP;
>>>
>>> if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
>>> return -EFAULT;
>>>
>>> An old ethtool binary is not going to set this extra last byte to
>>> anything meaningful. You cannot tell if you have an old or new user
>>> space, so you have no idea if it put anything into use_dim, or if it
>>> is random junk.
>
> Even worse, as there is no indication of data length, ETHTOOL_GCOALESCE
> ioctl request from old ethtool on new kernel would result in kernel
> writing past the end of userspace buffer.
>
>>> You have to leave the IOCTL interface unchanged, and limit this new
>>> feature to the netlink API.
>>>
>>
>> Hi, Andrew.
>> thanks for pointing out this problem, i will fix it.
>> without callling set_coalesce/set_coalesce of ethtool_ops, do you have any
>> suggestion for writing/reading this new attribute to/from the driver? add a
>> new field in net_device or a new callback function in ethtool_ops seems not
>> good.
>
> We could use a similar approach as struct ethtool_link_ksettings, e.g.
>
> struct kernel_ethtool_coalesce {
> struct ethtool_coalesce base;
> /* new members which are not part of UAPI */
> }
>
> get_coalesce() and set_coalesce() would get pointer to struct
> kernel_ethtool_coalesce and ioctl code would be modified to only touch
> the base (legacy?) part.
> > While already changing the ops arguments, we could also add extack
> pointer, either as a separate argument or as struct member (I slightly
> prefer the former).
>


Hi, Michal.

If changing the ops arguments, each driver who implement
set_coalesce/get_coalesce of ethtool_ops need to be updated. Is it
acceptable adding two new ops to get/set ext_coalesce info (like
ecc31c60240b ("ethtool: Add link extended state") does)? Maybe i can
send V2 in this way, and then could you help to see which one is more
suitable?


> BtW, please don't forget to update the message descriptions in
> Documentation/networking/ethtool-netlink.rst
>
> Michal
>
> .
>

2020-11-20 07:25:34

by Michal Kubecek

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

On Fri, Nov 20, 2020 at 10:59:59AM +0800, tanhuazhong wrote:
> On 2020/11/20 6:02, Michal Kubecek wrote:
> >
> > We could use a similar approach as struct ethtool_link_ksettings, e.g.
> >
> > struct kernel_ethtool_coalesce {
> > struct ethtool_coalesce base;
> > /* new members which are not part of UAPI */
> > }
> >
> > get_coalesce() and set_coalesce() would get pointer to struct
> > kernel_ethtool_coalesce and ioctl code would be modified to only touch
> > the base (legacy?) part.
> > > While already changing the ops arguments, we could also add extack
> > pointer, either as a separate argument or as struct member (I slightly
> > prefer the former).
>
> If changing the ops arguments, each driver who implement
> set_coalesce/get_coalesce of ethtool_ops need to be updated. Is it
> acceptable adding two new ops to get/set ext_coalesce info (like
> ecc31c60240b ("ethtool: Add link extended state") does)? Maybe i can send V2
> in this way, and then could you help to see which one is more suitable?

If it were just this one case, adding an extra op would be perfectly
fine. But from long term point of view, we should expect extending also
other existing ethtool requests and going this way for all of them would
essentially double the number of callbacks in struct ethtool_ops.

Michal

2020-11-20 13:42:05

by Andrew Lunn

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

On Fri, Nov 20, 2020 at 08:23:22AM +0100, Michal Kubecek wrote:
> On Fri, Nov 20, 2020 at 10:59:59AM +0800, tanhuazhong wrote:
> > On 2020/11/20 6:02, Michal Kubecek wrote:
> > >
> > > We could use a similar approach as struct ethtool_link_ksettings, e.g.
> > >
> > > struct kernel_ethtool_coalesce {
> > > struct ethtool_coalesce base;
> > > /* new members which are not part of UAPI */
> > > }
> > >
> > > get_coalesce() and set_coalesce() would get pointer to struct
> > > kernel_ethtool_coalesce and ioctl code would be modified to only touch
> > > the base (legacy?) part.
> > > > While already changing the ops arguments, we could also add extack
> > > pointer, either as a separate argument or as struct member (I slightly
> > > prefer the former).
> >
> > If changing the ops arguments, each driver who implement
> > set_coalesce/get_coalesce of ethtool_ops need to be updated. Is it
> > acceptable adding two new ops to get/set ext_coalesce info (like
> > ecc31c60240b ("ethtool: Add link extended state") does)? Maybe i can send V2
> > in this way, and then could you help to see which one is more suitable?
>
> If it were just this one case, adding an extra op would be perfectly
> fine. But from long term point of view, we should expect extending also
> other existing ethtool requests and going this way for all of them would
> essentially double the number of callbacks in struct ethtool_ops.

coccinella might be useful here.

Andrew

2020-11-20 21:24:43

by Michal Kubecek

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing

On Fri, Nov 20, 2020 at 02:39:38PM +0100, Andrew Lunn wrote:
> On Fri, Nov 20, 2020 at 08:23:22AM +0100, Michal Kubecek wrote:
> > On Fri, Nov 20, 2020 at 10:59:59AM +0800, tanhuazhong wrote:
> > > On 2020/11/20 6:02, Michal Kubecek wrote:
> > > >
> > > > We could use a similar approach as struct ethtool_link_ksettings, e.g.
> > > >
> > > > struct kernel_ethtool_coalesce {
> > > > struct ethtool_coalesce base;
> > > > /* new members which are not part of UAPI */
> > > > }
> > > >
> > > > get_coalesce() and set_coalesce() would get pointer to struct
> > > > kernel_ethtool_coalesce and ioctl code would be modified to only touch
> > > > the base (legacy?) part.
> > > > > While already changing the ops arguments, we could also add extack
> > > > pointer, either as a separate argument or as struct member (I slightly
> > > > prefer the former).
> > >
> > > If changing the ops arguments, each driver who implement
> > > set_coalesce/get_coalesce of ethtool_ops need to be updated. Is it
> > > acceptable adding two new ops to get/set ext_coalesce info (like
> > > ecc31c60240b ("ethtool: Add link extended state") does)? Maybe i can send V2
> > > in this way, and then could you help to see which one is more suitable?
> >
> > If it were just this one case, adding an extra op would be perfectly
> > fine. But from long term point of view, we should expect extending also
> > other existing ethtool requests and going this way for all of them would
> > essentially double the number of callbacks in struct ethtool_ops.
>
> coccinella might be useful here.

I played with spatch a bit and it with the spatch and patch below, I got
only three build failures (with allmodconfig) that would need to be
addressed manually - these drivers call the set_coalesce() callback on
device initialization.

I also tried to make the structure argument const in ->set_coalesce()
but that was more tricky as adjusting other functions that the structure
is passed to required either running the spatch three times or repeating
the same two rules three times in the spatch (or perhaps there is
a cleaner way but I'm missing relevant knowledge of coccinelle). Then
there was one more problem in i40e driver which modifies the structure
before passing it on to its helpers. It could be worked around but I'm
not sure if constifying the argument is worth these extra complications.

Michal


semantic patch:
------------------------------------------------------------------------
@getc@
identifier fn;
identifier ops;
@@
struct ethtool_ops ops = {
...
,.get_coalesce = fn,
...
};


@@
identifier getc.fn;
identifier dev, data;
@@
-int fn(struct net_device *dev, struct ethtool_coalesce *data)
+int fn(struct net_device *dev, struct netlink_ext_ack *extack, struct kernel_ethtool_coalesce *data)
{
+struct ethtool_coalesce *coal_base = &data->base;
<...
-data
+coal_base
...>
}


@setc@
identifier fn;
identifier ops;
@@
struct ethtool_ops ops = {
...
,.set_coalesce = fn,
...
};


@@
identifier setc.fn;
identifier dev, data;
@@
-int fn(struct net_device *dev, struct ethtool_coalesce *data)
+int fn(struct net_device *dev, struct netlink_ext_ack *extack, struct kernel_ethtool_coalesce *data)
{
+struct ethtool_coalesce *coal_base = &data->base;
<...
-data
+coal_base
...>
}
------------------------------------------------------------------------

maual part:
------------------------------------------------------------------------
diff --git a/include/linux/ethtool.h b/include/linux/ethtool.h
index 6408b446051f..01d049d36120 100644
--- a/include/linux/ethtool.h
+++ b/include/linux/ethtool.h
@@ -15,6 +15,7 @@

#include <linux/bitmap.h>
#include <linux/compat.h>
+#include <linux/netlink.h>
#include <uapi/linux/ethtool.h>

#ifdef CONFIG_COMPAT
@@ -176,6 +177,10 @@ extern int
__ethtool_get_link_ksettings(struct net_device *dev,
struct ethtool_link_ksettings *link_ksettings);

+struct kernel_ethtool_coalesce {
+ struct ethtool_coalesce base;
+};
+
/**
* ethtool_intersect_link_masks - Given two link masks, AND them together
* @dst: first mask and where result is stored
@@ -436,8 +441,12 @@ struct ethtool_ops {
struct ethtool_eeprom *, u8 *);
int (*set_eeprom)(struct net_device *,
struct ethtool_eeprom *, u8 *);
- int (*get_coalesce)(struct net_device *, struct ethtool_coalesce *);
- int (*set_coalesce)(struct net_device *, struct ethtool_coalesce *);
+ int (*get_coalesce)(struct net_device *,
+ struct netlink_ext_ack *extack,
+ struct kernel_ethtool_coalesce *);
+ int (*set_coalesce)(struct net_device *,
+ struct netlink_ext_ack *extack,
+ struct kernel_ethtool_coalesce *);
void (*get_ringparam)(struct net_device *,
struct ethtool_ringparam *);
int (*set_ringparam)(struct net_device *,
diff --git a/net/ethtool/coalesce.c b/net/ethtool/coalesce.c
index 1d6bc132aa4d..6f64dad0202e 100644
--- a/net/ethtool/coalesce.c
+++ b/net/ethtool/coalesce.c
@@ -9,7 +9,7 @@ struct coalesce_req_info {

struct coalesce_reply_data {
struct ethnl_reply_data base;
- struct ethtool_coalesce coalesce;
+ struct kernel_ethtool_coalesce coalesce;
u32 supported_params;
};

@@ -61,6 +61,7 @@ static int coalesce_prepare_data(const struct ethnl_req_info *req_base,
struct genl_info *info)
{
struct coalesce_reply_data *data = COALESCE_REPDATA(reply_base);
+ struct netlink_ext_ack *extack = info ? info->extack : NULL;
struct net_device *dev = reply_base->dev;
int ret;

@@ -70,7 +71,7 @@ static int coalesce_prepare_data(const struct ethnl_req_info *req_base,
ret = ethnl_ops_begin(dev);
if (ret < 0)
return ret;
- ret = dev->ethtool_ops->get_coalesce(dev, &data->coalesce);
+ ret = dev->ethtool_ops->get_coalesce(dev, extack, &data->coalesce);
ethnl_ops_complete(dev);

return ret;
@@ -124,53 +125,53 @@ static int coalesce_fill_reply(struct sk_buff *skb,
const struct ethnl_reply_data *reply_base)
{
const struct coalesce_reply_data *data = COALESCE_REPDATA(reply_base);
- const struct ethtool_coalesce *coal = &data->coalesce;
+ const struct ethtool_coalesce *cbase = &data->coalesce.base;
u32 supported = data->supported_params;

if (coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_USECS,
- coal->rx_coalesce_usecs, supported) ||
+ cbase->rx_coalesce_usecs, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_MAX_FRAMES,
- coal->rx_max_coalesced_frames, supported) ||
+ cbase->rx_max_coalesced_frames, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_USECS_IRQ,
- coal->rx_coalesce_usecs_irq, supported) ||
+ cbase->rx_coalesce_usecs_irq, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_MAX_FRAMES_IRQ,
- coal->rx_max_coalesced_frames_irq, supported) ||
+ cbase->rx_max_coalesced_frames_irq, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_USECS,
- coal->tx_coalesce_usecs, supported) ||
+ cbase->tx_coalesce_usecs, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_MAX_FRAMES,
- coal->tx_max_coalesced_frames, supported) ||
+ cbase->tx_max_coalesced_frames, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_USECS_IRQ,
- coal->tx_coalesce_usecs_irq, supported) ||
+ cbase->tx_coalesce_usecs_irq, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_MAX_FRAMES_IRQ,
- coal->tx_max_coalesced_frames_irq, supported) ||
+ cbase->tx_max_coalesced_frames_irq, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_STATS_BLOCK_USECS,
- coal->stats_block_coalesce_usecs, supported) ||
+ cbase->stats_block_coalesce_usecs, supported) ||
coalesce_put_bool(skb, ETHTOOL_A_COALESCE_USE_ADAPTIVE_RX,
- coal->use_adaptive_rx_coalesce, supported) ||
+ cbase->use_adaptive_rx_coalesce, supported) ||
coalesce_put_bool(skb, ETHTOOL_A_COALESCE_USE_ADAPTIVE_TX,
- coal->use_adaptive_tx_coalesce, supported) ||
+ cbase->use_adaptive_tx_coalesce, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_PKT_RATE_LOW,
- coal->pkt_rate_low, supported) ||
+ cbase->pkt_rate_low, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_USECS_LOW,
- coal->rx_coalesce_usecs_low, supported) ||
+ cbase->rx_coalesce_usecs_low, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_MAX_FRAMES_LOW,
- coal->rx_max_coalesced_frames_low, supported) ||
+ cbase->rx_max_coalesced_frames_low, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_USECS_LOW,
- coal->tx_coalesce_usecs_low, supported) ||
+ cbase->tx_coalesce_usecs_low, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_MAX_FRAMES_LOW,
- coal->tx_max_coalesced_frames_low, supported) ||
+ cbase->tx_max_coalesced_frames_low, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_PKT_RATE_HIGH,
- coal->pkt_rate_high, supported) ||
+ cbase->pkt_rate_high, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_USECS_HIGH,
- coal->rx_coalesce_usecs_high, supported) ||
+ cbase->rx_coalesce_usecs_high, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RX_MAX_FRAMES_HIGH,
- coal->rx_max_coalesced_frames_high, supported) ||
+ cbase->rx_max_coalesced_frames_high, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_USECS_HIGH,
- coal->tx_coalesce_usecs_high, supported) ||
+ cbase->tx_coalesce_usecs_high, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH,
- coal->tx_max_coalesced_frames_high, supported) ||
+ cbase->tx_max_coalesced_frames_high, supported) ||
coalesce_put_u32(skb, ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL,
- coal->rate_sample_interval, supported))
+ cbase->rate_sample_interval, supported))
return -EMSGSIZE;

return 0;
@@ -219,7 +220,8 @@ const struct nla_policy ethnl_coalesce_set_policy[] = {

int ethnl_set_coalesce(struct sk_buff *skb, struct genl_info *info)
{
- struct ethtool_coalesce coalesce = {};
+ struct kernel_ethtool_coalesce coalesce = {};
+ struct ethtool_coalesce *coal_base = &coalesce.base;
struct ethnl_req_info req_info = {};
struct nlattr **tb = info->attrs;
const struct ethtool_ops *ops;
@@ -255,59 +257,59 @@ int ethnl_set_coalesce(struct sk_buff *skb, struct genl_info *info)
ret = ethnl_ops_begin(dev);
if (ret < 0)
goto out_rtnl;
- ret = ops->get_coalesce(dev, &coalesce);
+ ret = ops->get_coalesce(dev, info->extack, &coalesce);
if (ret < 0)
goto out_ops;

- ethnl_update_u32(&coalesce.rx_coalesce_usecs,
+ ethnl_update_u32(&coal_base->rx_coalesce_usecs,
tb[ETHTOOL_A_COALESCE_RX_USECS], &mod);
- ethnl_update_u32(&coalesce.rx_max_coalesced_frames,
+ ethnl_update_u32(&coal_base->rx_max_coalesced_frames,
tb[ETHTOOL_A_COALESCE_RX_MAX_FRAMES], &mod);
- ethnl_update_u32(&coalesce.rx_coalesce_usecs_irq,
+ ethnl_update_u32(&coal_base->rx_coalesce_usecs_irq,
tb[ETHTOOL_A_COALESCE_RX_USECS_IRQ], &mod);
- ethnl_update_u32(&coalesce.rx_max_coalesced_frames_irq,
+ ethnl_update_u32(&coal_base->rx_max_coalesced_frames_irq,
tb[ETHTOOL_A_COALESCE_RX_MAX_FRAMES_IRQ], &mod);
- ethnl_update_u32(&coalesce.tx_coalesce_usecs,
+ ethnl_update_u32(&coal_base->tx_coalesce_usecs,
tb[ETHTOOL_A_COALESCE_TX_USECS], &mod);
- ethnl_update_u32(&coalesce.tx_max_coalesced_frames,
+ ethnl_update_u32(&coal_base->tx_max_coalesced_frames,
tb[ETHTOOL_A_COALESCE_TX_MAX_FRAMES], &mod);
- ethnl_update_u32(&coalesce.tx_coalesce_usecs_irq,
+ ethnl_update_u32(&coal_base->tx_coalesce_usecs_irq,
tb[ETHTOOL_A_COALESCE_TX_USECS_IRQ], &mod);
- ethnl_update_u32(&coalesce.tx_max_coalesced_frames_irq,
+ ethnl_update_u32(&coal_base->tx_max_coalesced_frames_irq,
tb[ETHTOOL_A_COALESCE_TX_MAX_FRAMES_IRQ], &mod);
- ethnl_update_u32(&coalesce.stats_block_coalesce_usecs,
+ ethnl_update_u32(&coal_base->stats_block_coalesce_usecs,
tb[ETHTOOL_A_COALESCE_STATS_BLOCK_USECS], &mod);
- ethnl_update_bool32(&coalesce.use_adaptive_rx_coalesce,
+ ethnl_update_bool32(&coal_base->use_adaptive_rx_coalesce,
tb[ETHTOOL_A_COALESCE_USE_ADAPTIVE_RX], &mod);
- ethnl_update_bool32(&coalesce.use_adaptive_tx_coalesce,
+ ethnl_update_bool32(&coal_base->use_adaptive_tx_coalesce,
tb[ETHTOOL_A_COALESCE_USE_ADAPTIVE_TX], &mod);
- ethnl_update_u32(&coalesce.pkt_rate_low,
+ ethnl_update_u32(&coal_base->pkt_rate_low,
tb[ETHTOOL_A_COALESCE_PKT_RATE_LOW], &mod);
- ethnl_update_u32(&coalesce.rx_coalesce_usecs_low,
+ ethnl_update_u32(&coal_base->rx_coalesce_usecs_low,
tb[ETHTOOL_A_COALESCE_RX_USECS_LOW], &mod);
- ethnl_update_u32(&coalesce.rx_max_coalesced_frames_low,
+ ethnl_update_u32(&coal_base->rx_max_coalesced_frames_low,
tb[ETHTOOL_A_COALESCE_RX_MAX_FRAMES_LOW], &mod);
- ethnl_update_u32(&coalesce.tx_coalesce_usecs_low,
+ ethnl_update_u32(&coal_base->tx_coalesce_usecs_low,
tb[ETHTOOL_A_COALESCE_TX_USECS_LOW], &mod);
- ethnl_update_u32(&coalesce.tx_max_coalesced_frames_low,
+ ethnl_update_u32(&coal_base->tx_max_coalesced_frames_low,
tb[ETHTOOL_A_COALESCE_TX_MAX_FRAMES_LOW], &mod);
- ethnl_update_u32(&coalesce.pkt_rate_high,
+ ethnl_update_u32(&coal_base->pkt_rate_high,
tb[ETHTOOL_A_COALESCE_PKT_RATE_HIGH], &mod);
- ethnl_update_u32(&coalesce.rx_coalesce_usecs_high,
+ ethnl_update_u32(&coal_base->rx_coalesce_usecs_high,
tb[ETHTOOL_A_COALESCE_RX_USECS_HIGH], &mod);
- ethnl_update_u32(&coalesce.rx_max_coalesced_frames_high,
+ ethnl_update_u32(&coal_base->rx_max_coalesced_frames_high,
tb[ETHTOOL_A_COALESCE_RX_MAX_FRAMES_HIGH], &mod);
- ethnl_update_u32(&coalesce.tx_coalesce_usecs_high,
+ ethnl_update_u32(&coal_base->tx_coalesce_usecs_high,
tb[ETHTOOL_A_COALESCE_TX_USECS_HIGH], &mod);
- ethnl_update_u32(&coalesce.tx_max_coalesced_frames_high,
+ ethnl_update_u32(&coal_base->tx_max_coalesced_frames_high,
tb[ETHTOOL_A_COALESCE_TX_MAX_FRAMES_HIGH], &mod);
- ethnl_update_u32(&coalesce.rate_sample_interval,
+ ethnl_update_u32(&coal_base->rate_sample_interval,
tb[ETHTOOL_A_COALESCE_RATE_SAMPLE_INTERVAL], &mod);
ret = 0;
if (!mod)
goto out_ops;

- ret = dev->ethtool_ops->set_coalesce(dev, &coalesce);
+ ret = dev->ethtool_ops->set_coalesce(dev, info->extack, &coalesce);
if (ret < 0)
goto out_ops;
ethtool_notify(dev, ETHTOOL_MSG_COALESCE_NTF, NULL);
diff --git a/net/ethtool/ioctl.c b/net/ethtool/ioctl.c
index 771688e1b0da..cb378fd8fcae 100644
--- a/net/ethtool/ioctl.c
+++ b/net/ethtool/ioctl.c
@@ -1513,71 +1513,74 @@ static int ethtool_set_eeprom(struct net_device *dev, void __user *useraddr)
static noinline_for_stack int ethtool_get_coalesce(struct net_device *dev,
void __user *useraddr)
{
- struct ethtool_coalesce coalesce = { .cmd = ETHTOOL_GCOALESCE };
+ struct kernel_ethtool_coalesce coalesce = {
+ .base = { .cmd = ETHTOOL_GCOALESCE }
+ };
int ret;

if (!dev->ethtool_ops->get_coalesce)
return -EOPNOTSUPP;

- ret = dev->ethtool_ops->get_coalesce(dev, &coalesce);
+ ret = dev->ethtool_ops->get_coalesce(dev, NULL, &coalesce);
if (ret)
return ret;

- if (copy_to_user(useraddr, &coalesce, sizeof(coalesce)))
+ if (copy_to_user(useraddr, &coalesce.base, sizeof(coalesce.base)))
return -EFAULT;
return 0;
}

static bool
ethtool_set_coalesce_supported(struct net_device *dev,
- struct ethtool_coalesce *coalesce)
+ struct kernel_ethtool_coalesce *coalesce)
{
u32 supported_params = dev->ethtool_ops->supported_coalesce_params;
+ const struct ethtool_coalesce *coal_base = &coalesce->base;
u32 nonzero_params = 0;

- if (coalesce->rx_coalesce_usecs)
+ if (coal_base->rx_coalesce_usecs)
nonzero_params |= ETHTOOL_COALESCE_RX_USECS;
- if (coalesce->rx_max_coalesced_frames)
+ if (coal_base->rx_max_coalesced_frames)
nonzero_params |= ETHTOOL_COALESCE_RX_MAX_FRAMES;
- if (coalesce->rx_coalesce_usecs_irq)
+ if (coal_base->rx_coalesce_usecs_irq)
nonzero_params |= ETHTOOL_COALESCE_RX_USECS_IRQ;
- if (coalesce->rx_max_coalesced_frames_irq)
+ if (coal_base->rx_max_coalesced_frames_irq)
nonzero_params |= ETHTOOL_COALESCE_RX_MAX_FRAMES_IRQ;
- if (coalesce->tx_coalesce_usecs)
+ if (coal_base->tx_coalesce_usecs)
nonzero_params |= ETHTOOL_COALESCE_TX_USECS;
- if (coalesce->tx_max_coalesced_frames)
+ if (coal_base->tx_max_coalesced_frames)
nonzero_params |= ETHTOOL_COALESCE_TX_MAX_FRAMES;
- if (coalesce->tx_coalesce_usecs_irq)
+ if (coal_base->tx_coalesce_usecs_irq)
nonzero_params |= ETHTOOL_COALESCE_TX_USECS_IRQ;
- if (coalesce->tx_max_coalesced_frames_irq)
+ if (coal_base->tx_max_coalesced_frames_irq)
nonzero_params |= ETHTOOL_COALESCE_TX_MAX_FRAMES_IRQ;
- if (coalesce->stats_block_coalesce_usecs)
+ if (coal_base->stats_block_coalesce_usecs)
nonzero_params |= ETHTOOL_COALESCE_STATS_BLOCK_USECS;
- if (coalesce->use_adaptive_rx_coalesce)
+ if (coal_base->use_adaptive_rx_coalesce)
nonzero_params |= ETHTOOL_COALESCE_USE_ADAPTIVE_RX;
- if (coalesce->use_adaptive_tx_coalesce)
+ if (coal_base->use_adaptive_tx_coalesce)
nonzero_params |= ETHTOOL_COALESCE_USE_ADAPTIVE_TX;
- if (coalesce->pkt_rate_low)
+ if (coal_base->pkt_rate_low)
nonzero_params |= ETHTOOL_COALESCE_PKT_RATE_LOW;
- if (coalesce->rx_coalesce_usecs_low)
+ if (coal_base->rx_coalesce_usecs_low)
nonzero_params |= ETHTOOL_COALESCE_RX_USECS_LOW;
- if (coalesce->rx_max_coalesced_frames_low)
+ if (coal_base->rx_max_coalesced_frames_low)
nonzero_params |= ETHTOOL_COALESCE_RX_MAX_FRAMES_LOW;
- if (coalesce->tx_coalesce_usecs_low)
+ if (coal_base->tx_coalesce_usecs_low)
nonzero_params |= ETHTOOL_COALESCE_TX_USECS_LOW;
- if (coalesce->tx_max_coalesced_frames_low)
+ if (coal_base->tx_max_coalesced_frames_low)
nonzero_params |= ETHTOOL_COALESCE_TX_MAX_FRAMES_LOW;
- if (coalesce->pkt_rate_high)
+ if (coal_base->pkt_rate_high)
nonzero_params |= ETHTOOL_COALESCE_PKT_RATE_HIGH;
- if (coalesce->rx_coalesce_usecs_high)
+ if (coal_base->rx_coalesce_usecs_high)
nonzero_params |= ETHTOOL_COALESCE_RX_USECS_HIGH;
- if (coalesce->rx_max_coalesced_frames_high)
+ if (coal_base->rx_max_coalesced_frames_high)
nonzero_params |= ETHTOOL_COALESCE_RX_MAX_FRAMES_HIGH;
- if (coalesce->tx_coalesce_usecs_high)
+ if (coal_base->tx_coalesce_usecs_high)
nonzero_params |= ETHTOOL_COALESCE_TX_USECS_HIGH;
- if (coalesce->tx_max_coalesced_frames_high)
+ if (coal_base->tx_max_coalesced_frames_high)
nonzero_params |= ETHTOOL_COALESCE_TX_MAX_FRAMES_HIGH;
- if (coalesce->rate_sample_interval)
+ if (coal_base->rate_sample_interval)
nonzero_params |= ETHTOOL_COALESCE_RATE_SAMPLE_INTERVAL;

return (supported_params & nonzero_params) == nonzero_params;
@@ -1586,19 +1589,22 @@ ethtool_set_coalesce_supported(struct net_device *dev,
static noinline_for_stack int ethtool_set_coalesce(struct net_device *dev,
void __user *useraddr)
{
- struct ethtool_coalesce coalesce;
+ struct kernel_ethtool_coalesce coalesce = {};
int ret;

- if (!dev->ethtool_ops->set_coalesce)
+ if (!dev->ethtool_ops->get_coalesce || !dev->ethtool_ops->set_coalesce)
return -EOPNOTSUPP;

- if (copy_from_user(&coalesce, useraddr, sizeof(coalesce)))
+ ret = dev->ethtool_ops->get_coalesce(dev, NULL, &coalesce);
+ if (ret)
+ return ret;
+ if (copy_from_user(&coalesce.base, useraddr, sizeof(coalesce.base)))
return -EFAULT;

if (!ethtool_set_coalesce_supported(dev, &coalesce))
return -EOPNOTSUPP;

- ret = dev->ethtool_ops->set_coalesce(dev, &coalesce);
+ ret = dev->ethtool_ops->set_coalesce(dev, NULL, &coalesce);
if (!ret)
ethtool_notify(dev, ETHTOOL_MSG_COALESCE_NTF, NULL);
return ret;
@@ -2362,6 +2368,7 @@ ethtool_set_per_queue_coalesce(struct net_device *dev,
int i, ret = 0;
int n_queue;
struct ethtool_coalesce *backup = NULL, *tmp = NULL;
+ struct kernel_ethtool_coalesce coalesce = {};
DECLARE_BITMAP(queue_mask, MAX_NUM_QUEUE);

if ((!dev->ethtool_ops->set_per_queue_coalesce) ||
@@ -2377,15 +2384,14 @@ ethtool_set_per_queue_coalesce(struct net_device *dev,
return -ENOMEM;

for_each_set_bit(bit, queue_mask, MAX_NUM_QUEUE) {
- struct ethtool_coalesce coalesce;
-
ret = dev->ethtool_ops->get_per_queue_coalesce(dev, bit, tmp);
if (ret != 0)
goto roll_back;

tmp++;

- if (copy_from_user(&coalesce, useraddr, sizeof(coalesce))) {
+ if (copy_from_user(&coalesce.base, useraddr,
+ sizeof(coalesce.base))) {
ret = -EFAULT;
goto roll_back;
}
@@ -2395,7 +2401,8 @@ ethtool_set_per_queue_coalesce(struct net_device *dev,
goto roll_back;
}

- ret = dev->ethtool_ops->set_per_queue_coalesce(dev, bit, &coalesce);
+ ret = dev->ethtool_ops->set_per_queue_coalesce(dev, bit,
+ &coalesce.base);
if (ret != 0)
goto roll_back;

------------------------------------------------------------------------

2020-11-21 02:00:41

by Huazhong Tan

[permalink] [raw]
Subject: Re: [RFC net-next 1/2] ethtool: add support for controling the type of adaptive coalescing



On 2020/11/21 5:22, Michal Kubecek wrote:
> On Fri, Nov 20, 2020 at 02:39:38PM +0100, Andrew Lunn wrote:
>> On Fri, Nov 20, 2020 at 08:23:22AM +0100, Michal Kubecek wrote:
>>> On Fri, Nov 20, 2020 at 10:59:59AM +0800, tanhuazhong wrote:
>>>> On 2020/11/20 6:02, Michal Kubecek wrote:
>>>>> We could use a similar approach as struct ethtool_link_ksettings, e.g.
>>>>>
>>>>> struct kernel_ethtool_coalesce {
>>>>> struct ethtool_coalesce base;
>>>>> /* new members which are not part of UAPI */
>>>>> }
>>>>>
>>>>> get_coalesce() and set_coalesce() would get pointer to struct
>>>>> kernel_ethtool_coalesce and ioctl code would be modified to only touch
>>>>> the base (legacy?) part.
>>>>> > While already changing the ops arguments, we could also add extack
>>>>> pointer, either as a separate argument or as struct member (I slightly
>>>>> prefer the former).
>>>> If changing the ops arguments, each driver who implement
>>>> set_coalesce/get_coalesce of ethtool_ops need to be updated. Is it
>>>> acceptable adding two new ops to get/set ext_coalesce info (like
>>>> ecc31c60240b ("ethtool: Add link extended state") does)? Maybe i can send V2
>>>> in this way, and then could you help to see which one is more suitable?
>>> If it were just this one case, adding an extra op would be perfectly
>>> fine. But from long term point of view, we should expect extending also
>>> other existing ethtool requests and going this way for all of them would
>>> essentially double the number of callbacks in struct ethtool_ops.
>> coccinella might be useful here.
> I played with spatch a bit and it with the spatch and patch below, I got
> only three build failures (with allmodconfig) that would need to be
> addressed manually - these drivers call the set_coalesce() callback on
> device initialization.
>
> I also tried to make the structure argument const in ->set_coalesce()
> but that was more tricky as adjusting other functions that the structure
> is passed to required either running the spatch three times or repeating
> the same two rules three times in the spatch (or perhaps there is
> a cleaner way but I'm missing relevant knowledge of coccinelle). Then
> there was one more problem in i40e driver which modifies the structure
> before passing it on to its helpers. It could be worked around but I'm
> not sure if constifying the argument is worth these extra complications.
>
> Michal

will implement it like this in V3.

Regards,
Huazhong.