Subject: [PATCH 1/2] mlx4: allow device removal by fixing dma unmap size

Once the network interface has been opened, a Mellanox ConnectX device
cannot be removed by hotplug because the driver has not properly unmapped
all of its DMA memory.

This happens because mlx4_en_activate_rx_rings overwrites the variable
that holds the size of the mapped memory.

This is fixed by passing to mlx4_en_destroy_rx_ring the same size that is
given to mlx4_en_create_rx_ring.

After applying this patch, hot unplugging the device works after opening
the interface.

Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_netdev.c | 3 ++-
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 4 ++--
drivers/net/ethernet/mellanox/mlx4/mlx4_en.h | 3 ++-
3 files changed, 6 insertions(+), 4 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
index 467ae58..149e60d 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_netdev.c
@@ -892,7 +892,8 @@ void mlx4_en_free_resources(struct mlx4_en_priv *priv)

for (i = 0; i < priv->rx_ring_num; i++) {
if (priv->rx_ring[i].rx_info)
- mlx4_en_destroy_rx_ring(priv, &priv->rx_ring[i]);
+ mlx4_en_destroy_rx_ring(priv, &priv->rx_ring[i],
+ priv->prof->rx_ring_size, priv->stride);
if (priv->rx_cq[i].buf)
mlx4_en_destroy_cq(priv, &priv->rx_cq[i]);
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 971d4b6..d1c631e 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -380,12 +380,12 @@ err_allocator:
}

void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
- struct mlx4_en_rx_ring *ring)
+ struct mlx4_en_rx_ring *ring, u32 size, u16 stride)
{
struct mlx4_en_dev *mdev = priv->mdev;

mlx4_en_unmap_buffer(&ring->wqres.buf);
- mlx4_free_hwq_res(mdev->dev, &ring->wqres, ring->buf_size + TXBB_SIZE);
+ mlx4_free_hwq_res(mdev->dev, &ring->wqres, size * stride + TXBB_SIZE);
vfree(ring->rx_info);
ring->rx_info = NULL;
}
diff --git a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
index 35f0884..d60335f 100644
--- a/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
+++ b/drivers/net/ethernet/mellanox/mlx4/mlx4_en.h
@@ -528,7 +528,8 @@ int mlx4_en_create_rx_ring(struct mlx4_en_priv *priv,
struct mlx4_en_rx_ring *ring,
u32 size, u16 stride);
void mlx4_en_destroy_rx_ring(struct mlx4_en_priv *priv,
- struct mlx4_en_rx_ring *ring);
+ struct mlx4_en_rx_ring *ring,
+ u32 size, u16 stride);
int mlx4_en_activate_rx_rings(struct mlx4_en_priv *priv);
void mlx4_en_deactivate_rx_ring(struct mlx4_en_priv *priv,
struct mlx4_en_rx_ring *ring);
--
1.7.4.4
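
The size mismatch that patch 1/2 addresses can be illustrated outside the
kernel. The following is a minimal user-space sketch, not driver code: the
fake_* and ring_* names are hypothetical, TXBB_SIZE is given an assumed
value, and the model assumes that the overwritten buf_size ends up smaller
than the size used at creation (the patch description only says the
variable is overridden). It shows that freeing with a stale size leaves
bytes accounted as mapped, while passing the creation-time size and stride
releases everything.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TXBB_SIZE 64u /* assumed value, for illustration only */

/* Hypothetical stand-in for a device that tracks how many bytes are mapped. */
struct fake_dev {
	size_t mapped_bytes;
};

/* Hypothetical ring: buf_size may be rewritten after creation. */
struct fake_ring {
	size_t buf_size;
};

static void fake_alloc(struct fake_dev *dev, size_t bytes)
{
	dev->mapped_bytes += bytes;
}

static void fake_free(struct fake_dev *dev, size_t bytes)
{
	assert(dev->mapped_bytes >= bytes);
	dev->mapped_bytes -= bytes;
}

static void ring_create(struct fake_dev *dev, struct fake_ring *ring,
			uint32_t size, uint16_t stride)
{
	ring->buf_size = (size_t)size * stride;
	fake_alloc(dev, ring->buf_size + TXBB_SIZE);
}

/* Models a later step that overwrites buf_size (assumption: smaller value). */
static void ring_activate(struct fake_ring *ring, uint32_t actual_size,
			  uint16_t stride)
{
	ring->buf_size = (size_t)actual_size * stride;
}

/* Old behaviour: trust whatever buf_size currently holds. */
static void ring_destroy_old(struct fake_dev *dev, struct fake_ring *ring)
{
	fake_free(dev, ring->buf_size + TXBB_SIZE);
}

/* Patched behaviour: the caller passes the creation-time size and stride. */
static void ring_destroy_new(struct fake_dev *dev, struct fake_ring *ring,
			     uint32_t size, uint16_t stride)
{
	(void)ring;
	fake_free(dev, (size_t)size * stride + TXBB_SIZE);
}

/* Returns the number of bytes still accounted as mapped after teardown. */
static size_t leaked_bytes(int use_patched_destroy)
{
	struct fake_dev dev = { 0 };
	struct fake_ring ring = { 0 };

	ring_create(&dev, &ring, 1024, 64);
	ring_activate(&ring, 512, 64);

	if (use_patched_destroy)
		ring_destroy_new(&dev, &ring, 1024, 64);
	else
		ring_destroy_old(&dev, &ring);

	return dev.mapped_bytes;
}
```

In this model, leaked_bytes(0) reports the bytes left behind by the stale
size and leaked_bytes(1) reports none, which mirrors why the unpatched
driver could not complete device removal.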


Subject: [PATCH 2/2] mlx4: fix DMA mapping leak when allocation fails

mlx4_en_prepare_rx_desc does not clean up correctly after an allocation
failure. It should unmap each page before calling put_page, but it only
does the latter.

This bug would prevent a device removal using hotplug after setting the
device MTU to 9000 and opening the network interface. After the fix, we
still see the allocation failure with MTU 9000, but we are able to
remove the device.

Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>
---
drivers/net/ethernet/mellanox/mlx4/en_rx.c | 6 +++++-
1 files changed, 5 insertions(+), 1 deletions(-)

diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index d1c631e..d4ad8c2 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -168,8 +168,12 @@ static int mlx4_en_prepare_rx_desc(struct mlx4_en_priv *priv,
return 0;

err:
- while (i--)
+ while (i--) {
+ dma_addr_t dma = be64_to_cpu(rx_desc->data[i].addr);
+ pci_unmap_single(priv->mdev->pdev, dma, skb_frags[i].size,
+ PCI_DMA_FROMDEVICE);
put_page(skb_frags[i].page);
+ }
return -ENOMEM;
}

--
1.7.4.4
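
The unwind pattern that patch 2/2 introduces, undoing the mapping of every
fragment prepared so far before dropping its page reference, can be
sketched in user space as follows. This is an illustration under stated
assumptions rather than driver code: the fake_* helpers, NUM_FRAGS, and the
fail_at parameter are hypothetical and merely stand in for the page
allocation, pci_unmap_single, and put_page calls seen in the diff.

```c
#include <assert.h>
#include <stddef.h>

#define NUM_FRAGS 4 /* hypothetical fragment count */

/* Hypothetical fragment state: one page reference and one DMA mapping. */
struct fake_frag {
	int page_refs;
	int mapped;
};

/* Allocates and maps one fragment, or fails when asked to. */
static int fake_alloc_and_map(struct fake_frag *frag, int fail)
{
	if (fail)
		return -1;
	frag->page_refs = 1;
	frag->mapped = 1;
	return 0;
}

static void fake_unmap(struct fake_frag *frag)
{
	assert(frag->mapped);
	frag->mapped = 0;
}

static void fake_put_page(struct fake_frag *frag)
{
	assert(frag->page_refs > 0);
	frag->page_refs--;
}

/* Prepares all fragments; fail_at is the index that fails, or -1 for none. */
static int prepare_desc(struct fake_frag *frags, int fail_at)
{
	int i;

	for (i = 0; i < NUM_FRAGS; i++)
		if (fake_alloc_and_map(&frags[i], i == fail_at))
			goto err;
	return 0;

err:
	/* Undo only the fragments that were fully prepared: unmap, then put. */
	while (i--) {
		fake_unmap(&frags[i]);
		fake_put_page(&frags[i]);
	}
	return -1;
}

/* Returns how many fragments remain mapped after one prepare attempt. */
static int mapped_after_prepare(int fail_at)
{
	struct fake_frag frags[NUM_FRAGS] = { { 0, 0 } };
	int i, count = 0;

	(void)prepare_desc(frags, fail_at);
	for (i = 0; i < NUM_FRAGS; i++)
		count += frags[i].mapped;
	return count;
}
```

With the unmap step in place, a failure at any index leaves no fragment
mapped; dropping the fake_unmap call from the error path reproduces the
leak that the patch description reports.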

2012-02-06 19:43:08

by David Miller

Subject: Re: [PATCH 1/2] mlx4: allow device removal by fixing dma unmap size

From: Thadeu Lima de Souza Cascardo <[email protected]>
Date: Mon, 6 Feb 2012 16:39:49 -0200

> Once the network interface has been opened, a Mellanox ConnectX device
> cannot be removed by hotplug because the driver has not properly unmapped
> all of its DMA memory.
>
> This happens because mlx4_en_activate_rx_rings overwrites the variable
> that holds the size of the mapped memory.
>
> This is fixed by passing to mlx4_en_destroy_rx_ring the same size that is
> given to mlx4_en_create_rx_ring.
>
> After applying this patch, hot unplugging the device works after opening
> the interface.
>
> Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>

Applied.

2012-02-06 19:43:16

by David Miller

Subject: Re: [PATCH 2/2] mlx4: fix DMA mapping leak when allocation fails

From: Thadeu Lima de Souza Cascardo <[email protected]>
Date: Mon, 6 Feb 2012 16:39:50 -0200

> mlx4_en_prepare_rx_desc does not clean up correctly after an allocation
> failure. It should unmap each page before calling put_page, but it only
> does the latter.
>
> This bug would prevent a device removal using hotplug after setting the
> device MTU to 9000 and opening the network interface. After the fix, we
> still see the allocation failure with MTU 9000, but we are able to
> remove the device.
>
> Signed-off-by: Thadeu Lima de Souza Cascardo <[email protected]>

Applied.