2015-07-03 14:31:16

by Maxime Ripard

Subject: [PATCH 0/6] net: mvneta: Switch to per-CPU irq and make rxq_def useful

Hi,

This patchset reworks the Marvell neta driver to really support its
per-CPU interrupts, instead of faking them as SPIs, and to allow the use
of any RX queue instead of the RX queue 0 that is currently hardcoded.

Let me know what you think,
Maxime

Maxime Ripard (6):
net: mvneta: Fix CPU_MAP registers initialisation
genirq: Fix the documentation of request_percpu_irq
irqchip: armada-370-xp: Rework per-cpu interrupts handling
net: mvneta: Handle per-cpu interrupts
net: mvneta: Allow different queues
net: mvneta: Statically assign queues to CPUs

drivers/irqchip/irq-armada-370-xp.c | 14 +--
drivers/net/ethernet/marvell/mvneta.c | 187 ++++++++++++++++------------------
kernel/irq/manage.c | 7 +-
3 files changed, 98 insertions(+), 110 deletions(-)

--
2.4.5


2015-07-03 14:30:06

by Maxime Ripard

Subject: [PATCH 1/6] net: mvneta: Fix CPU_MAP registers initialisation

The CPU_MAP register is duplicated for each CPU, each instance being at a
different address.

However, the code so far was using CONFIG_NR_CPUS to initialise the CPU_MAP
register for each CPU, while the SoCs embed at most 4 CPUs.

This is especially an issue with multi_v7_defconfig, where CONFIG_NR_CPUS
is currently set to 16, resulting in writes to registers that are not
CPU_MAP.

Fixes: c5aff18204da ("net: mvneta: driver for Marvell Armada 370/XP network unit")
Signed-off-by: Maxime Ripard <[email protected]>
Cc: <[email protected]> # v3.8+
---
drivers/net/ethernet/marvell/mvneta.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index ce5f7f9cff06..b7717375ec4d 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -948,7 +948,7 @@ static void mvneta_defaults_set(struct mvneta_port *pp)
/* Set CPU queue access map - all CPUs have access to all RX
* queues and to all TX queues
*/
- for (cpu = 0; cpu < CONFIG_NR_CPUS; cpu++)
+ for_each_present_cpu(cpu)
mvreg_write(pp, MVNETA_CPU_MAP(cpu),
(MVNETA_CPU_RXQ_ACCESS_ALL_MASK |
MVNETA_CPU_TXQ_ACCESS_ALL_MASK));
--
2.4.5

2015-07-03 14:30:16

by Maxime Ripard

Subject: [PATCH 2/6] genirq: Fix the documentation of request_percpu_irq

The documentation of request_percpu_irq is confusing and suggests that the
interrupt is not enabled at all, while it is actually enabled on the local
CPU.

Clarify that.

Signed-off-by: Maxime Ripard <[email protected]>
---
kernel/irq/manage.c | 7 ++++---
1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index e68932bb308e..ec31697f29b3 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1757,9 +1757,10 @@ int setup_percpu_irq(unsigned int irq, struct irqaction *act)
* @devname: An ascii name for the claiming device
* @dev_id: A percpu cookie passed back to the handler function
*
- * This call allocates interrupt resources, but doesn't
- * automatically enable the interrupt. It has to be done on each
- * CPU using enable_percpu_irq().
+ * This call allocates interrupt resources and enables the
+ * interrupt on the local CPU. If the interrupt is supposed to be
+ * enabled on other CPUs, it has to be done on each CPU using
+ * enable_percpu_irq().
*
* Dev_id must be globally unique. It is a per-cpu variable, and
* the handler gets called with the interrupted CPU's instance of
--
2.4.5

2015-07-03 14:30:29

by Maxime Ripard

Subject: [PATCH 3/6] irqchip: armada-370-xp: Rework per-cpu interrupts handling

The MPIC driver currently has a hardcoded list of interrupts to handle as
per-CPU.

Since the timer, fabric and neta interrupts are the only per-CPU interrupts
in the system, we can now remove the switch and just check the hardware irq
number to determine whether a given interrupt is per-CPU or not.

Signed-off-by: Maxime Ripard <[email protected]>
---
drivers/irqchip/irq-armada-370-xp.c | 14 ++++----------
1 file changed, 4 insertions(+), 10 deletions(-)

diff --git a/drivers/irqchip/irq-armada-370-xp.c b/drivers/irqchip/irq-armada-370-xp.c
index daccc8bdbb42..42c69bd95bf8 100644
--- a/drivers/irqchip/irq-armada-370-xp.c
+++ b/drivers/irqchip/irq-armada-370-xp.c
@@ -57,9 +57,6 @@

#define ARMADA_370_XP_MAX_PER_CPU_IRQS (28)

-#define ARMADA_370_XP_TIMER0_PER_CPU_IRQ (5)
-#define ARMADA_370_XP_FABRIC_IRQ (3)
-
#define IPI_DOORBELL_START (0)
#define IPI_DOORBELL_END (8)
#define IPI_DOORBELL_MASK 0xFF
@@ -82,13 +79,10 @@ static phys_addr_t msi_doorbell_addr;

static inline bool is_percpu_irq(irq_hw_number_t irq)
{
- switch (irq) {
- case ARMADA_370_XP_TIMER0_PER_CPU_IRQ:
- case ARMADA_370_XP_FABRIC_IRQ:
+ if (irq <= ARMADA_370_XP_MAX_PER_CPU_IRQS)
return true;
- default:
- return false;
- }
+
+ return false;
}

/*
@@ -552,7 +546,7 @@ static void armada_370_xp_mpic_resume(void)
if (virq == 0)
continue;

- if (irq != ARMADA_370_XP_TIMER0_PER_CPU_IRQ)
+ if (!is_percpu_irq(irq))
writel(irq, per_cpu_int_base +
ARMADA_370_XP_INT_CLEAR_MASK_OFFS);
else
--
2.4.5

2015-07-03 14:30:48

by Maxime Ripard

Subject: [PATCH 4/6] net: mvneta: Handle per-cpu interrupts

Now that our interrupt controller allows us to use per-CPU interrupts,
actually use them in the mvneta driver.

This obviously involves reworking the driver to have a CPU-local NAPI
structure, and to report incoming packets using that structure.

Signed-off-by: Maxime Ripard <[email protected]>
---
drivers/net/ethernet/marvell/mvneta.c | 85 ++++++++++++++++++++++++-----------
1 file changed, 58 insertions(+), 27 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index b7717375ec4d..aedd21ed9532 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -285,7 +285,21 @@ struct mvneta_pcpu_stats {
u64 tx_bytes;
};

+struct mvneta_pcpu_port {
+ /* Pointer to the shared port */
+ struct mvneta_port *pp;
+
+ /* Pointer to the CPU-local NAPI struct */
+ struct napi_struct napi;
+
+ /* Cause of the previous interrupt */
+ u32 cause_rx_tx;
+};
+
struct mvneta_port {
+ struct mvneta_pcpu_port __percpu *ports;
+ struct mvneta_pcpu_stats __percpu *stats;
+
int pkt_size;
unsigned int frag_size;
void __iomem *base;
@@ -293,15 +307,11 @@ struct mvneta_port {
struct mvneta_tx_queue *txqs;
struct net_device *dev;

- u32 cause_rx_tx;
- struct napi_struct napi;
-
/* Core clock */
struct clk *clk;
u8 mcast_count[256];
u16 tx_ring_size;
u16 rx_ring_size;
- struct mvneta_pcpu_stats *stats;

struct mii_bus *mii_bus;
struct phy_device *phy_dev;
@@ -1454,6 +1464,7 @@ static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
static int mvneta_rx(struct mvneta_port *pp, int rx_todo,
struct mvneta_rx_queue *rxq)
{
+ struct mvneta_pcpu_port *port = this_cpu_ptr(pp->ports);
struct net_device *dev = pp->dev;
int rx_done, rx_filled;
u32 rcvd_pkts = 0;
@@ -1508,7 +1519,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,

skb->protocol = eth_type_trans(skb, dev);
mvneta_rx_csum(pp, rx_status, skb);
- napi_gro_receive(&pp->napi, skb);
+ napi_gro_receive(&port->napi, skb);

rcvd_pkts++;
rcvd_bytes += rx_bytes;
@@ -1535,7 +1546,7 @@ static int mvneta_rx(struct mvneta_port *pp, int rx_todo,

mvneta_rx_csum(pp, rx_status, skb);

- napi_gro_receive(&pp->napi, skb);
+ napi_gro_receive(&port->napi, skb);

/* Refill processing */
err = mvneta_rx_refill(pp, rx_desc);
@@ -2054,12 +2065,11 @@ static void mvneta_set_rx_mode(struct net_device *dev)
/* Interrupt handling - the callback for request_irq() */
static irqreturn_t mvneta_isr(int irq, void *dev_id)
{
- struct mvneta_port *pp = (struct mvneta_port *)dev_id;
+ struct mvneta_pcpu_port *port = (struct mvneta_pcpu_port *)dev_id;

- /* Mask all interrupts */
- mvreg_write(pp, MVNETA_INTR_NEW_MASK, 0);
+ disable_percpu_irq(port->pp->dev->irq);

- napi_schedule(&pp->napi);
+ napi_schedule(&port->napi);

return IRQ_HANDLED;
}
@@ -2097,11 +2107,11 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
{
int rx_done = 0;
u32 cause_rx_tx;
- unsigned long flags;
struct mvneta_port *pp = netdev_priv(napi->dev);
+ struct mvneta_pcpu_port *port = this_cpu_ptr(pp->ports);

if (!netif_running(pp->dev)) {
- napi_complete(napi);
+ napi_complete(&port->napi);
return rx_done;
}

@@ -2128,7 +2138,7 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
/* For the case where the last mvneta_poll did not process all
* RX packets
*/
- cause_rx_tx |= pp->cause_rx_tx;
+ cause_rx_tx |= port->cause_rx_tx;
if (rxq_number > 1) {
while ((cause_rx_tx & MVNETA_RX_INTR_MASK_ALL) && (budget > 0)) {
int count;
@@ -2159,16 +2169,11 @@ static int mvneta_poll(struct napi_struct *napi, int budget)

if (budget > 0) {
cause_rx_tx = 0;
- napi_complete(napi);
- local_irq_save(flags);
- mvreg_write(pp, MVNETA_INTR_NEW_MASK,
- MVNETA_RX_INTR_MASK(rxq_number) |
- MVNETA_TX_INTR_MASK(txq_number) |
- MVNETA_MISCINTR_INTR_MASK);
- local_irq_restore(flags);
+ napi_complete(&port->napi);
+ enable_percpu_irq(pp->dev->irq, 0);
}

- pp->cause_rx_tx = cause_rx_tx;
+ port->cause_rx_tx = cause_rx_tx;
return rx_done;
}

@@ -2417,6 +2422,8 @@ static int mvneta_setup_txqs(struct mvneta_port *pp)

static void mvneta_start_dev(struct mvneta_port *pp)
{
+ unsigned int cpu;
+
mvneta_max_rx_size_set(pp, pp->pkt_size);
mvneta_txq_max_tx_size_set(pp, pp->pkt_size);

@@ -2424,7 +2431,10 @@ static void mvneta_start_dev(struct mvneta_port *pp)
mvneta_port_enable(pp);

/* Enable polling on the port */
- napi_enable(&pp->napi);
+ for_each_present_cpu(cpu) {
+ struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
+ napi_enable(&port->napi);
+ }

/* Unmask interrupts */
mvreg_write(pp, MVNETA_INTR_NEW_MASK,
@@ -2442,9 +2452,14 @@ static void mvneta_start_dev(struct mvneta_port *pp)

static void mvneta_stop_dev(struct mvneta_port *pp)
{
+ unsigned int cpu;
+
phy_stop(pp->phy_dev);

- napi_disable(&pp->napi);
+ for_each_present_cpu(cpu) {
+ struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
+ napi_disable(&port->napi);
+ }

netif_carrier_off(pp->dev);

@@ -2683,8 +2698,8 @@ static int mvneta_open(struct net_device *dev)
goto err_cleanup_rxqs;

/* Connect to port interrupt line */
- ret = request_irq(pp->dev->irq, mvneta_isr, 0,
- MVNETA_DRIVER_NAME, pp);
+ ret = request_percpu_irq(pp->dev->irq, mvneta_isr,
+ MVNETA_DRIVER_NAME, pp->ports);
if (ret) {
netdev_err(pp->dev, "cannot request irq %d\n", pp->dev->irq);
goto err_cleanup_txqs;
@@ -3005,6 +3020,7 @@ static int mvneta_probe(struct platform_device *pdev)
int phy_mode;
int fixed_phy = 0;
int err;
+ int cpu;

/* Our multiqueue support is not complete, so for now, only
* allow the usage of the first RX queue
@@ -3079,11 +3095,18 @@ static int mvneta_probe(struct platform_device *pdev)
goto err_clk;
}

+ /* Alloc per-cpu port structure */
+ pp->ports = alloc_percpu(struct mvneta_pcpu_port);
+ if (!pp->ports) {
+ err = -ENOMEM;
+ goto err_clk;
+ }
+
/* Alloc per-cpu stats */
pp->stats = netdev_alloc_pcpu_stats(struct mvneta_pcpu_stats);
if (!pp->stats) {
err = -ENOMEM;
- goto err_clk;
+ goto err_free_ports;
}

dt_mac_addr = of_get_mac_address(dn);
@@ -3121,7 +3144,12 @@ static int mvneta_probe(struct platform_device *pdev)
if (dram_target_info)
mvneta_conf_mbus_windows(pp, dram_target_info);

- netif_napi_add(dev, &pp->napi, mvneta_poll, NAPI_POLL_WEIGHT);
+ for_each_present_cpu(cpu) {
+ struct mvneta_pcpu_port *port = per_cpu_ptr(pp->ports, cpu);
+
+ netif_napi_add(dev, &port->napi, mvneta_poll, NAPI_POLL_WEIGHT);
+ port->pp = pp;
+ }

dev->features = NETIF_F_SG | NETIF_F_IP_CSUM | NETIF_F_TSO;
dev->hw_features |= dev->features;
@@ -3150,6 +3178,8 @@ static int mvneta_probe(struct platform_device *pdev)

err_free_stats:
free_percpu(pp->stats);
+err_free_ports:
+ free_percpu(pp->ports);
err_clk:
clk_disable_unprepare(pp->clk);
err_put_phy_node:
@@ -3169,6 +3199,7 @@ static int mvneta_remove(struct platform_device *pdev)

unregister_netdev(dev);
clk_disable_unprepare(pp->clk);
+ free_percpu(pp->ports);
free_percpu(pp->stats);
irq_dispose_mapping(dev->irq);
of_node_put(pp->phy_node);
--
2.4.5

2015-07-03 14:31:00

by Maxime Ripard

Subject: [PATCH 5/6] net: mvneta: Allow different queues

The mvneta driver allows changing the default RX queue through the rxq_def
module parameter.

However, the current code doesn't accept any value but 0. This is actively
checked for in the driver's probe function, because the driver makes a
number of assumptions and takes a number of shortcuts in order to just use
that RX queue.

Remove these limitations in order to be able to specify any available
queue.

Signed-off-by: Maxime Ripard <[email protected]>
---
drivers/net/ethernet/marvell/mvneta.c | 80 +++++------------------------------
1 file changed, 11 insertions(+), 69 deletions(-)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index aedd21ed9532..0d21b8a779d9 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -477,7 +477,7 @@ struct mvneta_rx_queue {
/* The hardware supports eight (8) rx queues, but we are only allowing
* the first one to be used. Therefore, let's just allocate one queue.
*/
-static int rxq_number = 1;
+static int rxq_number = 8;
static int txq_number = 8;

static int rxq_def;
@@ -765,14 +765,7 @@ static void mvneta_port_up(struct mvneta_port *pp)
mvreg_write(pp, MVNETA_TXQ_CMD, q_map);

/* Enable all initialized RXQs. */
- q_map = 0;
- for (queue = 0; queue < rxq_number; queue++) {
- struct mvneta_rx_queue *rxq = &pp->rxqs[queue];
- if (rxq->descs != NULL)
- q_map |= (1 << queue);
- }
-
- mvreg_write(pp, MVNETA_RXQ_CMD, q_map);
+ mvreg_write(pp, MVNETA_RXQ_CMD, BIT(rxq_def));
}

/* Stop the Ethernet port activity */
@@ -1429,17 +1422,6 @@ static u32 mvneta_skb_tx_csum(struct mvneta_port *pp, struct sk_buff *skb)
return MVNETA_TX_L4_CSUM_NOT;
}

-/* Returns rx queue pointer (find last set bit) according to causeRxTx
- * value
- */
-static struct mvneta_rx_queue *mvneta_rx_policy(struct mvneta_port *pp,
- u32 cause)
-{
- int queue = fls(cause >> 8) - 1;
-
- return (queue < 0 || queue >= rxq_number) ? NULL : &pp->rxqs[queue];
-}
-
/* Drop packets received by the RXQ and free buffers */
static void mvneta_rxq_drop_pkts(struct mvneta_port *pp,
struct mvneta_rx_queue *rxq)
@@ -2139,33 +2121,8 @@ static int mvneta_poll(struct napi_struct *napi, int budget)
* RX packets
*/
cause_rx_tx |= port->cause_rx_tx;
- if (rxq_number > 1) {
- while ((cause_rx_tx & MVNETA_RX_INTR_MASK_ALL) && (budget > 0)) {
- int count;
- struct mvneta_rx_queue *rxq;
- /* get rx queue number from cause_rx_tx */
- rxq = mvneta_rx_policy(pp, cause_rx_tx);
- if (!rxq)
- break;
-
- /* process the packet in that rx queue */
- count = mvneta_rx(pp, budget, rxq);
- rx_done += count;
- budget -= count;
- if (budget > 0) {
- /* set off the rx bit of the
- * corresponding bit in the cause rx
- * tx register, so that next iteration
- * will find the next rx queue where
- * packets are received on
- */
- cause_rx_tx &= ~((1 << rxq->id) << 8);
- }
- }
- } else {
- rx_done = mvneta_rx(pp, budget, &pp->rxqs[rxq_def]);
- budget -= rx_done;
- }
+ rx_done = mvneta_rx(pp, budget, &pp->rxqs[rxq_def]);
+ budget -= rx_done;

if (budget > 0) {
cause_rx_tx = 0;
@@ -2377,26 +2334,19 @@ static void mvneta_cleanup_txqs(struct mvneta_port *pp)
/* Cleanup all Rx queues */
static void mvneta_cleanup_rxqs(struct mvneta_port *pp)
{
- int queue;
-
- for (queue = 0; queue < rxq_number; queue++)
- mvneta_rxq_deinit(pp, &pp->rxqs[queue]);
+ mvneta_rxq_deinit(pp, &pp->rxqs[rxq_def]);
}


/* Init all Rx queues */
static int mvneta_setup_rxqs(struct mvneta_port *pp)
{
- int queue;
-
- for (queue = 0; queue < rxq_number; queue++) {
- int err = mvneta_rxq_init(pp, &pp->rxqs[queue]);
- if (err) {
- netdev_err(pp->dev, "%s: can't create rxq=%d\n",
- __func__, queue);
- mvneta_cleanup_rxqs(pp);
- return err;
- }
+ int err = mvneta_rxq_init(pp, &pp->rxqs[rxq_def]);
+ if (err) {
+ netdev_err(pp->dev, "%s: can't create rxq=%d\n",
+ __func__, rxq_def);
+ mvneta_cleanup_rxqs(pp);
+ return err;
}

return 0;
@@ -3022,14 +2972,6 @@ static int mvneta_probe(struct platform_device *pdev)
int err;
int cpu;

- /* Our multiqueue support is not complete, so for now, only
- * allow the usage of the first RX queue
- */
- if (rxq_def != 0) {
- dev_err(&pdev->dev, "Invalid rxq_def argument: %d\n", rxq_def);
- return -EINVAL;
- }
-
dev = alloc_etherdev_mqs(sizeof(struct mvneta_port), txq_number, rxq_number);
if (!dev)
return -ENOMEM;
--
2.4.5

2015-07-03 14:31:51

by Maxime Ripard

Subject: [PATCH 6/6] net: mvneta: Statically assign queues to CPUs

Since the switch to per-CPU interrupts, we have lost the ability to set
which CPU is going to receive our RX interrupt: it is now simply the CPU on
which the mvneta_open() function happens to run.

We can now assign our queues to their respective CPUs, and make sure that
only this CPU is going to handle our traffic.

This also paves the way for changing that at runtime, and later on for
supporting RSS.

Signed-off-by: Maxime Ripard <[email protected]>
---
drivers/net/ethernet/marvell/mvneta.c | 20 ++++++++++++++++++++
1 file changed, 20 insertions(+)

diff --git a/drivers/net/ethernet/marvell/mvneta.c b/drivers/net/ethernet/marvell/mvneta.c
index 0d21b8a779d9..658d713abc18 100644
--- a/drivers/net/ethernet/marvell/mvneta.c
+++ b/drivers/net/ethernet/marvell/mvneta.c
@@ -2630,6 +2630,13 @@ static void mvneta_mdio_remove(struct mvneta_port *pp)
pp->phy_dev = NULL;
}

+static void mvneta_percpu_enable(void *arg)
+{
+ struct mvneta_port *pp = arg;
+
+ enable_percpu_irq(pp->dev->irq, IRQ_TYPE_NONE);
+}
+
static int mvneta_open(struct net_device *dev)
{
struct mvneta_port *pp = netdev_priv(dev);
@@ -2655,6 +2662,19 @@ static int mvneta_open(struct net_device *dev)
goto err_cleanup_txqs;
}

+ /*
+ * Even though the documentation says that request_percpu_irq
+ * doesn't enable the interrupts automatically, it actually
+ * does so on the local CPU.
+ *
+ * Make sure it's disabled.
+ */
+ disable_percpu_irq(pp->dev->irq);
+
+ /* Enable per-CPU interrupt on the one CPU we care about */
+ smp_call_function_single(rxq_def % num_online_cpus(),
+ mvneta_percpu_enable, pp, true);
+
/* In default link is down */
netif_carrier_off(pp->dev);

--
2.4.5

2015-07-03 14:46:38

by Thomas Petazzoni

Subject: Re: [PATCH 6/6] net: mvneta: Statically assign queues to CPUs

Maxime,

On Fri, 3 Jul 2015 16:25:51 +0200, Maxime Ripard wrote:

> +static void mvneta_percpu_enable(void *arg)
> +{
> + struct mvneta_port *pp = arg;
> +
> + enable_percpu_irq(pp->dev->irq, IRQ_TYPE_NONE);
> +}
> +
> static int mvneta_open(struct net_device *dev)
> {
> struct mvneta_port *pp = netdev_priv(dev);
> @@ -2655,6 +2662,19 @@ static int mvneta_open(struct net_device *dev)
> goto err_cleanup_txqs;
> }
>
> + /*
> + * Even though the documentation says that request_percpu_irq
> + * doesn't enable the interrupts automatically, it actually
> + * does so on the local CPU.
> + *
> + * Make sure it's disabled.
> + */
> + disable_percpu_irq(pp->dev->irq);
> +
> + /* Enable per-CPU interrupt on the one CPU we care about */
> + smp_call_function_single(rxq_def % num_online_cpus(),
> + mvneta_percpu_enable, pp, true);

What happens if that CPU goes offline through CPU hotplug?

Thanks!

Thomas
--
Thomas Petazzoni, CTO, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com

2015-07-05 12:37:35

by Willy Tarreau

Subject: Re: [PATCH 4/6] net: mvneta: Handle per-cpu interrupts

Hi Maxime,

On Fri, Jul 03, 2015 at 04:25:49PM +0200, Maxime Ripard wrote:
> Now that our interrupt controller is allowing us to use per-CPU interrupts,
> actually use it in the mvneta driver.
>
> This involves obviously reworking the driver to have a CPU-local NAPI
> structure, and report for incoming packet using that structure.
>
> Signed-off-by: Maxime Ripard <[email protected]>

This patch breaks module build of mvneta unless you export request_percpu_irq :

diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
index ec31697..1440a92 100644
--- a/kernel/irq/manage.c
+++ b/kernel/irq/manage.c
@@ -1799,6 +1799,7 @@ int request_percpu_irq(unsigned int irq, irq_handler_t handler,

return retval;
}
+EXPORT_SYMBOL_GPL(request_percpu_irq);

Regards,
Willy

2015-07-05 13:00:30

by Willy Tarreau

Subject: Re: [PATCH 6/6] net: mvneta: Statically assign queues to CPUs

Hi Thomas,

On Fri, Jul 03, 2015 at 04:46:24PM +0200, Thomas Petazzoni wrote:
> Maxime,
>
> On Fri, 3 Jul 2015 16:25:51 +0200, Maxime Ripard wrote:
>
> > +static void mvneta_percpu_enable(void *arg)
> > +{
> > + struct mvneta_port *pp = arg;
> > +
> > + enable_percpu_irq(pp->dev->irq, IRQ_TYPE_NONE);
> > +}
> > +
> > static int mvneta_open(struct net_device *dev)
> > {
> > struct mvneta_port *pp = netdev_priv(dev);
> > @@ -2655,6 +2662,19 @@ static int mvneta_open(struct net_device *dev)
> > goto err_cleanup_txqs;
> > }
> >
> > + /*
> > + * Even though the documentation says that request_percpu_irq
> > + * doesn't enable the interrupts automatically, it actually
> > + * does so on the local CPU.
> > + *
> > + * Make sure it's disabled.
> > + */
> > + disable_percpu_irq(pp->dev->irq);
> > +
> > + /* Enable per-CPU interrupt on the one CPU we care about */
> > + smp_call_function_single(rxq_def % num_online_cpus(),
> > + mvneta_percpu_enable, pp, true);
>
> What happens if that CPU goes offline through CPU hotplug?

I just tried: if I start mvneta with "rxq_def=1", then my irq runs on
CPU1. Then I offline CPU1 and the irqs are automatically handled by CPU0.
Then I online CPU1 again and the irqs stay on CPU0.

More or less related, I found that if I enable a queue number larger than
the CPU count it does work, but then the system complains during rmmod :

[ 877.146203] ------------[ cut here ]------------
[ 877.146227] WARNING: CPU: 1 PID: 1731 at fs/proc/generic.c:552 remove_proc_entry+0x144/0x15c()
[ 877.146233] remove_proc_entry: removing non-empty directory 'irq/29', leaking at least 'mvneta'
[ 877.146238] Modules linked in: mvneta(-) [last unloaded: mvneta]
[ 877.146254] CPU: 1 PID: 1731 Comm: rmmod Tainted: G W 4.1.1-mvebu-00006-g3d317ed-dirty #5
[ 877.146260] Hardware name: Marvell Armada 370/XP (Device Tree)
[ 877.146281] [<c0017194>] (unwind_backtrace) from [<c00126d4>] (show_stack+0x10/0x14)
[ 877.146293] [<c00126d4>] (show_stack) from [<c04d32e4>] (dump_stack+0x74/0x90)
[ 877.146305] [<c04d32e4>] (dump_stack) from [<c0025200>] (warn_slowpath_common+0x74/0xb0)
[ 877.146315] [<c0025200>] (warn_slowpath_common) from [<c00252d0>] (warn_slowpath_fmt+0x30/0x40)
[ 877.146325] [<c00252d0>] (warn_slowpath_fmt) from [<c0115694>] (remove_proc_entry+0x144/0x15c)
[ 877.146336] [<c0115694>] (remove_proc_entry) from [<c0060fd4>] (unregister_irq_proc+0x8c/0xb0)
[ 877.146347] [<c0060fd4>] (unregister_irq_proc) from [<c0059f84>] (free_desc+0x28/0x58)
[ 877.146356] [<c0059f84>] (free_desc) from [<c005a028>] (irq_free_descs+0x44/0x80)
[ 877.146368] [<c005a028>] (irq_free_descs) from [<bf015748>] (mvneta_remove+0x3c/0x4c [mvneta])
[ 877.146382] [<bf015748>] (mvneta_remove [mvneta]) from [<c024d6e8>] (platform_drv_remove+0x18/0x30)
[ 877.146393] [<c024d6e8>] (platform_drv_remove) from [<c024bd50>] (__device_release_driver+0x70/0xe4)
[ 877.146402] [<c024bd50>] (__device_release_driver) from [<c024c5b8>] (driver_detach+0xcc/0xd0)
[ 877.146411] [<c024c5b8>] (driver_detach) from [<c024bbe0>] (bus_remove_driver+0x4c/0x90)
[ 877.146425] [<c024bbe0>] (bus_remove_driver) from [<c007d3f0>] (SyS_delete_module+0x164/0x1b4)
[ 877.146437] [<c007d3f0>] (SyS_delete_module) from [<c000f4c0>] (ret_fast_syscall+0x0/0x3c)
[ 877.146443] ---[ end trace 48713a9ae31204b1 ]---

This was on the AX3 (dual-proc) with rxq_def=2.

Hoping this helps,
Willy

2015-07-06 13:05:08

by Maxime Ripard

Subject: Re: [PATCH 4/6] net: mvneta: Handle per-cpu interrupts

Hi Willy,

On Sun, Jul 05, 2015 at 02:37:08PM +0200, Willy Tarreau wrote:
> Hi Maxime,
>
> On Fri, Jul 03, 2015 at 04:25:49PM +0200, Maxime Ripard wrote:
> > Now that our interrupt controller is allowing us to use per-CPU interrupts,
> > actually use it in the mvneta driver.
> >
> > This involves obviously reworking the driver to have a CPU-local NAPI
> > structure, and report for incoming packet using that structure.
> >
> > Signed-off-by: Maxime Ripard <[email protected]>
>
> This patch breaks module build of mvneta unless you export request_percpu_irq :
>
> diff --git a/kernel/irq/manage.c b/kernel/irq/manage.c
> index ec31697..1440a92 100644
> --- a/kernel/irq/manage.c
> +++ b/kernel/irq/manage.c
> @@ -1799,6 +1799,7 @@ int request_percpu_irq(unsigned int irq, irq_handler_t handler,
>
> return retval;
> }
> +EXPORT_SYMBOL_GPL(request_percpu_irq);

Ah, right. Thanks!

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



2015-07-06 13:20:14

by Maxime Ripard

Subject: Re: [PATCH 6/6] net: mvneta: Statically assign queues to CPUs

On Sun, Jul 05, 2015 at 03:00:11PM +0200, Willy Tarreau wrote:
> Hi Thomas,
>
> On Fri, Jul 03, 2015 at 04:46:24PM +0200, Thomas Petazzoni wrote:
> > Maxime,
> >
> > On Fri, 3 Jul 2015 16:25:51 +0200, Maxime Ripard wrote:
> >
> > > +static void mvneta_percpu_enable(void *arg)
> > > +{
> > > + struct mvneta_port *pp = arg;
> > > +
> > > + enable_percpu_irq(pp->dev->irq, IRQ_TYPE_NONE);
> > > +}
> > > +
> > > static int mvneta_open(struct net_device *dev)
> > > {
> > > struct mvneta_port *pp = netdev_priv(dev);
> > > @@ -2655,6 +2662,19 @@ static int mvneta_open(struct net_device *dev)
> > > goto err_cleanup_txqs;
> > > }
> > >
> > > + /*
> > > + * Even though the documentation says that request_percpu_irq
> > > + * doesn't enable the interrupts automatically, it actually
> > > + * does so on the local CPU.
> > > + *
> > > + * Make sure it's disabled.
> > > + */
> > > + disable_percpu_irq(pp->dev->irq);
> > > +
> > > + /* Enable per-CPU interrupt on the one CPU we care about */
> > > + smp_call_function_single(rxq_def % num_online_cpus(),
> > > + mvneta_percpu_enable, pp, true);
> >
> > What happens if that CPU goes offline through CPU hotplug?
>
> I just tried : if I start mvneta with "rxq_def=1", then my irq runs
> on CPU1. Then I offline CPU1 and the irqs are automatically handled
> by CPU0. Then I online CPU1 and irqs stay on CPU0.

I'm confused, I would have thought that it wouldn't even work.

I guess it can be easily solved with a cpu_notifier though.

> More or less related, I found that if I enable a queue number larger
> than the CPU count it does work, but then the system complains
> during rmmod :
>
> [ 877.146203] ------------[ cut here ]------------
> [ 877.146227] WARNING: CPU: 1 PID: 1731 at fs/proc/generic.c:552 remove_proc_entry+0x144/0x15c()
> [ 877.146233] remove_proc_entry: removing non-empty directory 'irq/29', leaking at least 'mvneta'
> [ 877.146238] Modules linked in: mvneta(-) [last unloaded: mvneta]
> [ 877.146254] CPU: 1 PID: 1731 Comm: rmmod Tainted: G W 4.1.1-mvebu-00006-g3d317ed-dirty #5
> [ 877.146260] Hardware name: Marvell Armada 370/XP (Device Tree)
> [ 877.146281] [<c0017194>] (unwind_backtrace) from [<c00126d4>] (show_stack+0x10/0x14)
> [ 877.146293] [<c00126d4>] (show_stack) from [<c04d32e4>] (dump_stack+0x74/0x90)
> [ 877.146305] [<c04d32e4>] (dump_stack) from [<c0025200>] (warn_slowpath_common+0x74/0xb0)
> [ 877.146315] [<c0025200>] (warn_slowpath_common) from [<c00252d0>] (warn_slowpath_fmt+0x30/0x40)
> [ 877.146325] [<c00252d0>] (warn_slowpath_fmt) from [<c0115694>] (remove_proc_entry+0x144/0x15c)
> [ 877.146336] [<c0115694>] (remove_proc_entry) from [<c0060fd4>] (unregister_irq_proc+0x8c/0xb0)
> [ 877.146347] [<c0060fd4>] (unregister_irq_proc) from [<c0059f84>] (free_desc+0x28/0x58)
> [ 877.146356] [<c0059f84>] (free_desc) from [<c005a028>] (irq_free_descs+0x44/0x80)
> [ 877.146368] [<c005a028>] (irq_free_descs) from [<bf015748>] (mvneta_remove+0x3c/0x4c [mvneta])
> [ 877.146382] [<bf015748>] (mvneta_remove [mvneta]) from [<c024d6e8>] (platform_drv_remove+0x18/0x30)
> [ 877.146393] [<c024d6e8>] (platform_drv_remove) from [<c024bd50>] (__device_release_driver+0x70/0xe4)
> [ 877.146402] [<c024bd50>] (__device_release_driver) from [<c024c5b8>] (driver_detach+0xcc/0xd0)
> [ 877.146411] [<c024c5b8>] (driver_detach) from [<c024bbe0>] (bus_remove_driver+0x4c/0x90)
> [ 877.146425] [<c024bbe0>] (bus_remove_driver) from [<c007d3f0>] (SyS_delete_module+0x164/0x1b4)
> [ 877.146437] [<c007d3f0>] (SyS_delete_module) from [<c000f4c0>] (ret_fast_syscall+0x0/0x3c)
> [ 877.146443] ---[ end trace 48713a9ae31204b1 ]---
>
> This was on the AX3 (dual-proc) with rxq_def=2.

Hmmm, interesting, I'll look into it, thanks!

Maxime

--
Maxime Ripard, Free Electrons
Embedded Linux, Kernel and Android engineering
http://free-electrons.com



2015-07-08 21:05:25

by David Miller

Subject: Re: [PATCH 2/6] genirq: Fix the documentation of request_percpu_irq

From: Maxime Ripard <[email protected]>
Date: Fri, 3 Jul 2015 16:25:47 +0200

> The documentation of request_percpu_irq is confusing and suggest that the
> interrupt is not enabled at all, while it is actually enabled on the local
> CPU.
>
> Clarify that.
>
> Signed-off-by: Maxime Ripard <[email protected]>

You should submit this separately to the IRQ layer maintainers, rather
than trying to include it in an unrelated set of changes targeting net-next.

Thanks.