2018-07-20 18:02:51

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 0/8] Fix breakage caused by the NTB multi-port patchset

Hey,

This is mostly a resend.

To reiterate the main points on the feedback: the switchtec driver,
in the cross link mode, will not be able to implement the port number
function callbacks and will have to always return 0. It's a physical
impossibility due to the symmetry. Therefore, in order fix this feature
(which worked when merged), the clients must be changed to support not
knowing the port number in the way they worked previously (ie. a legacy
mode where there will only be two ports and you know there will be two
sets of doorbells, one for each peer). The majority of this patch set
fixes these issues.

The other point of controversy is the dma mask. I still strongly
disagree with doing it in the driver as the code is clearly common
to all and not at all driver specific. Moreover, I think as written,
it is extra dangerous seeing all impleminting drivers are operating on
the new struct device before it's initialized in ntb_register().
Patch 2 in this series also fixes that.

Logan

--

Changes since v1:
- Rebased onto ntb-next (there was a minor conflict in a recent change
to the intel driver)
- Collected Dave's ack.

--

Logan Gunthorpe (8):
NTB: ntb_tool: reading the link file should not end in a NULL byte
NTB: Setup the DMA mask globally for all drivers
NTB: Fix the default port and peer numbers for legacy drivers
NTB: ntb_pingpong: Choose doorbells based on port number
NTB: perf: Don't require one more memory window than number of peers
NTB: perf: Fix support for hardware that doesn't have port numbers
NTB: perf: Fix race condition when run with ntb_test
NTB: ntb_test: Fix bug when counting remote files

drivers/ntb/hw/amd/ntb_hw_amd.c | 4 ----
drivers/ntb/hw/idt/ntb_hw_idt.c | 6 ------
drivers/ntb/hw/intel/ntb_hw_gen1.c | 4 ----
drivers/ntb/ntb.c | 22 ++++++++++++++--------
drivers/ntb/test/ntb_perf.c | 22 +++++++++++++++++++---
drivers/ntb/test/ntb_pingpong.c | 14 ++++++--------
drivers/ntb/test/ntb_tool.c | 3 +--
tools/testing/selftests/ntb/ntb_test.sh | 2 +-
8 files changed, 41 insertions(+), 36 deletions(-)

--
2.11.0


2018-07-20 18:01:42

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 7/8] NTB: perf: Fix race condition when run with ntb_test

When running ntb_test, the script tries to run the ntb_perf test
immediately after probing the modules. Since adding multi-port support,
this fails seeing the new initialization procedure in ntb_perf
can not complete instantly.

To fix this we add a completion which is waited on when a test is
started. In this way, run can be written any time after the module is
loaded and it will wait for the initialization to complete instead of
sending an error.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/ntb/test/ntb_perf.c | 10 ++++++++--
1 file changed, 8 insertions(+), 2 deletions(-)

diff --git a/drivers/ntb/test/ntb_perf.c b/drivers/ntb/test/ntb_perf.c
index 6285cb8515ac..8e2b7630ecc9 100644
--- a/drivers/ntb/test/ntb_perf.c
+++ b/drivers/ntb/test/ntb_perf.c
@@ -158,6 +158,8 @@ struct perf_peer {
/* NTB connection setup service */
struct work_struct service;
unsigned long sts;
+
+ struct completion init_comp;
};
#define to_peer_service(__work) \
container_of(__work, struct perf_peer, service)
@@ -549,6 +551,7 @@ static int perf_setup_outbuf(struct perf_peer *peer)

/* Initialization is finally done */
set_bit(PERF_STS_DONE, &peer->sts);
+ complete_all(&peer->init_comp);

return 0;
}
@@ -639,6 +642,7 @@ static void perf_service_work(struct work_struct *work)
perf_setup_outbuf(peer);

if (test_and_clear_bit(PERF_CMD_CLEAR, &peer->sts)) {
+ init_completion(&peer->init_comp);
clear_bit(PERF_STS_DONE, &peer->sts);
if (test_bit(0, &peer->perf->busy_flag) &&
peer == peer->perf->test_peer) {
@@ -1046,8 +1050,9 @@ static int perf_submit_test(struct perf_peer *peer)
struct perf_thread *pthr;
int tidx, ret;

- if (!test_bit(PERF_STS_DONE, &peer->sts))
- return -ENOLINK;
+ ret = wait_for_completion_interruptible(&peer->init_comp);
+ if (ret < 0)
+ return ret;

if (test_and_set_bit_lock(0, &perf->busy_flag))
return -EBUSY;
@@ -1413,6 +1418,7 @@ static int perf_init_peers(struct perf_ctx *perf)
peer->gidx = pidx;
}
INIT_WORK(&peer->service, perf_service_work);
+ init_completion(&peer->init_comp);
}
if (perf->gidx == -1)
perf->gidx = pidx;
--
2.11.0


2018-07-20 18:01:46

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 2/8] NTB: Setup the DMA mask globally for all drivers

Commit 417cf39cfea9 ("NTB: Set dma mask and dma coherent mask to NTB
devices") added code to set the DMA mask for the NTB device
to each driver individually. However, it neglected to set it for the
Switchtec driver. So when the monolithic commit 7f46c8b3a552 ("NTB:
ntb_tool: Add full multi-port NTB API support") started allocating
DMA memory against the NTB device it broke the Switchtec driver.

Seeing this is setting up a property of the NTB device, it should be
done by the common NTB code (inside ntb_register_device()) so we can be
sure it's done properly for all drivers. This avoids each driver needing
to duplicate the code and helps prevent us from inadvertently breaking
one of the drivers in the future if we have to make changes in this area.

Additionally, all existing drivers are setting a property of the
struct device before calling device_initialize() which works fine now,
but may not be the safest choice if someone decides to change
something such that the propery is overridden by the initialization.

Fixes: 7f46c8b3a552 ("NTB: ntb_tool: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <[email protected]>
Acked-by: Dave Jiang <[email protected]>
---
drivers/ntb/hw/amd/ntb_hw_amd.c | 4 ----
drivers/ntb/hw/idt/ntb_hw_idt.c | 6 ------
drivers/ntb/hw/intel/ntb_hw_gen1.c | 4 ----
drivers/ntb/ntb.c | 13 ++++++++++++-
4 files changed, 12 insertions(+), 15 deletions(-)

diff --git a/drivers/ntb/hw/amd/ntb_hw_amd.c b/drivers/ntb/hw/amd/ntb_hw_amd.c
index efb214fc545a..16c10dfe0919 100644
--- a/drivers/ntb/hw/amd/ntb_hw_amd.c
+++ b/drivers/ntb/hw/amd/ntb_hw_amd.c
@@ -1020,10 +1020,6 @@ static int amd_ntb_init_pci(struct amd_ntb_dev *ndev,
goto err_dma_mask;
dev_warn(&pdev->dev, "Cannot DMA consistent highmem\n");
}
- rc = dma_coerce_mask_and_coherent(&ndev->ntb.dev,
- dma_get_mask(&pdev->dev));
- if (rc)
- goto err_dma_mask;

ndev->self_mmio = pci_iomap(pdev, 0, 0);
if (!ndev->self_mmio) {
diff --git a/drivers/ntb/hw/idt/ntb_hw_idt.c b/drivers/ntb/hw/idt/ntb_hw_idt.c
index dbe72f116017..576a2a986085 100644
--- a/drivers/ntb/hw/idt/ntb_hw_idt.c
+++ b/drivers/ntb/hw/idt/ntb_hw_idt.c
@@ -2447,12 +2447,6 @@ static int idt_init_pci(struct idt_ntb_dev *ndev)
dev_warn(&pdev->dev,
"Cannot set consistent DMA highmem bit mask\n");
}
- ret = dma_coerce_mask_and_coherent(&ndev->ntb.dev,
- dma_get_mask(&pdev->dev));
- if (ret != 0) {
- dev_err(&pdev->dev, "Failed to set NTB device DMA bit mask\n");
- return ret;
- }

/*
* Enable the device advanced error reporting. It's not critical to
diff --git a/drivers/ntb/hw/intel/ntb_hw_gen1.c b/drivers/ntb/hw/intel/ntb_hw_gen1.c
index 6aa573227279..805845fa614b 100644
--- a/drivers/ntb/hw/intel/ntb_hw_gen1.c
+++ b/drivers/ntb/hw/intel/ntb_hw_gen1.c
@@ -1770,10 +1770,6 @@ static int intel_ntb_init_pci(struct intel_ntb_dev *ndev, struct pci_dev *pdev)
goto err_dma_mask;
dev_warn(&pdev->dev, "Cannot DMA consistent highmem\n");
}
- rc = dma_coerce_mask_and_coherent(&ndev->ntb.dev,
- dma_get_mask(&pdev->dev));
- if (rc)
- goto err_dma_mask;

ndev->self_mmio = pci_iomap(pdev, 0, 0);
if (!ndev->self_mmio) {
diff --git a/drivers/ntb/ntb.c b/drivers/ntb/ntb.c
index 2581ab724c34..93f24440d11d 100644
--- a/drivers/ntb/ntb.c
+++ b/drivers/ntb/ntb.c
@@ -100,6 +100,8 @@ EXPORT_SYMBOL(ntb_unregister_client);

int ntb_register_device(struct ntb_dev *ntb)
{
+ int ret;
+
if (!ntb)
return -EINVAL;
if (!ntb->pdev)
@@ -120,7 +122,16 @@ int ntb_register_device(struct ntb_dev *ntb)
ntb->ctx_ops = NULL;
spin_lock_init(&ntb->ctx_lock);

- return device_register(&ntb->dev);
+ device_initialize(&ntb->dev);
+
+ ret = dma_coerce_mask_and_coherent(&ntb->dev,
+ dma_get_mask(&ntb->pdev->dev));
+ if (ret != 0) {
+ dev_err(&ntb->dev, "Failed to set NTB device DMA bit mask\n");
+ return ret;
+ }
+
+ return device_add(&ntb->dev);
}
EXPORT_SYMBOL(ntb_register_device);

--
2.11.0


2018-07-20 18:01:53

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

This commit fixes pingpong support for existing drivers that do not
implement ntb_default_port_number() and ntb_default_peer_port_number().
This is required for hardware (like the crosslink topology of
switchtec) which cannot assign reasonable port numbers to each port due
to its perfect symmetry.

Instead of picking the doorbell to use based on the the index of the
peer, we use the peer's port number. This is a bit clearer and easier
to understand.

Fixes: c7aeb0afdcc2 ("NTB: ntb_pp: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/ntb/test/ntb_pingpong.c | 14 ++++++--------
1 file changed, 6 insertions(+), 8 deletions(-)

diff --git a/drivers/ntb/test/ntb_pingpong.c b/drivers/ntb/test/ntb_pingpong.c
index 65865e460ab8..18d00eec7b02 100644
--- a/drivers/ntb/test/ntb_pingpong.c
+++ b/drivers/ntb/test/ntb_pingpong.c
@@ -121,15 +121,14 @@ static int pp_find_next_peer(struct pp_ctx *pp)
link = ntb_link_is_up(pp->ntb, NULL, NULL);

/* Find next available peer */
- if (link & pp->nmask) {
+ if (link & pp->nmask)
pidx = __ffs64(link & pp->nmask);
- out_db = BIT_ULL(pidx + 1);
- } else if (link & pp->pmask) {
+ else if (link & pp->pmask)
pidx = __ffs64(link & pp->pmask);
- out_db = BIT_ULL(pidx);
- } else {
+ else
return -ENODEV;
- }
+
+ out_db = BIT_ULL(ntb_peer_port_number(pp->ntb, pidx));

spin_lock(&pp->lock);
pp->out_pidx = pidx;
@@ -303,7 +302,7 @@ static void pp_init_flds(struct pp_ctx *pp)
break;
}

- pp->in_db = BIT_ULL(pidx);
+ pp->in_db = BIT_ULL(lport);
pp->pmask = GENMASK_ULL(pidx, 0) >> 1;
pp->nmask = GENMASK_ULL(pcnt - 1, pidx);

@@ -435,4 +434,3 @@ static void __exit pp_exit(void)
debugfs_remove_recursive(pp_dbgfs_topdir);
}
module_exit(pp_exit);
-
--
2.11.0


2018-07-20 18:02:10

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 8/8] NTB: ntb_test: Fix bug when counting remote files

When remote files are counted in get_files_count, without using SSH,
the code returns 0 because there is a colon prepended to $LOC. $VPATH
should have been used instead of $LOC.

Fixes: 06bd0407d06c ("NTB: ntb_test: Update ntb_tool Scratchpad tests")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
tools/testing/selftests/ntb/ntb_test.sh | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/tools/testing/selftests/ntb/ntb_test.sh b/tools/testing/selftests/ntb/ntb_test.sh
index 08cbfbbc7029..17ca36403d04 100755
--- a/tools/testing/selftests/ntb/ntb_test.sh
+++ b/tools/testing/selftests/ntb/ntb_test.sh
@@ -250,7 +250,7 @@ function get_files_count()
split_remote $LOC

if [[ "$REMOTE" == "" ]]; then
- echo $(ls -1 "$LOC"/${NAME}* 2>/dev/null | wc -l)
+ echo $(ls -1 "$VPATH"/${NAME}* 2>/dev/null | wc -l)
else
echo $(ssh "$REMOTE" "ls -1 \"$VPATH\"/${NAME}* | \
wc -l" 2> /dev/null)
--
2.11.0


2018-07-20 18:02:21

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 1/8] NTB: ntb_tool: reading the link file should not end in a NULL byte

When running ntb_test this warning is issued:

./ntb_test.sh: line 200: warning: command substitution: ignored null
byte in input

This is caused by the kernel returning one more byte than is necessary
when reading the link file.

Reduce the number of bytes read back to 2 as it was before the
commit that regressed this.

Fixes: 7f46c8b3a552 ("NTB: ntb_tool: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/ntb/test/ntb_tool.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)

diff --git a/drivers/ntb/test/ntb_tool.c b/drivers/ntb/test/ntb_tool.c
index d592c0ffbd19..ec5cf095cdb9 100644
--- a/drivers/ntb/test/ntb_tool.c
+++ b/drivers/ntb/test/ntb_tool.c
@@ -504,7 +504,7 @@ static ssize_t tool_peer_link_read(struct file *filep, char __user *ubuf,
buf[1] = '\n';
buf[2] = '\0';

- return simple_read_from_buffer(ubuf, size, offp, buf, 3);
+ return simple_read_from_buffer(ubuf, size, offp, buf, 2);
}

static TOOL_FOPS_RDWR(tool_peer_link_fops,
@@ -1690,4 +1690,3 @@ static void __exit tool_exit(void)
debugfs_remove_recursive(tool_dbgfs_topdir);
}
module_exit(tool_exit);
-
--
2.11.0


2018-07-20 18:02:40

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 6/8] NTB: perf: Fix support for hardware that doesn't have port numbers

Legacy drivers do not have port numbers (but is reliably only two ports)
and was broken by the recent commit that added mult-port support to
ntb_perf. This is especially important to support the cross link
topology which is perfectly symmetric and cannot assign unique port
numbers easily.

Hardware that returns zero for both the local port and the peer should
just always use gidx=0 for the only peer.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/ntb/test/ntb_perf.c | 10 ++++++++++
1 file changed, 10 insertions(+)

diff --git a/drivers/ntb/test/ntb_perf.c b/drivers/ntb/test/ntb_perf.c
index fe27412ffe91..6285cb8515ac 100644
--- a/drivers/ntb/test/ntb_perf.c
+++ b/drivers/ntb/test/ntb_perf.c
@@ -1417,6 +1417,16 @@ static int perf_init_peers(struct perf_ctx *perf)
if (perf->gidx == -1)
perf->gidx = pidx;

+ /*
+ * Hardware with only two ports may not have unique port
+ * numbers. In this case, the gidxs should all be zero.
+ */
+ if (perf->pcnt == 1 && ntb_port_number(perf->ntb) == 0 &&
+ ntb_peer_port_number(perf->ntb, 0) == 0) {
+ perf->gidx = 0;
+ perf->peers[0].gidx = 0;
+ }
+
for (pidx = 0; pidx < perf->pcnt; pidx++) {
ret = perf_setup_peer_mw(&perf->peers[pidx]);
if (ret)
--
2.11.0


2018-07-20 18:02:41

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 5/8] NTB: perf: Don't require one more memory window than number of peers

ntb_perf should not require more than one memory window per peer. This
was probably an off-by-one error.

Fixes: 5648e56d03fa ("NTB: ntb_perf: Add full multi-port NTB API support")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/ntb/test/ntb_perf.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/drivers/ntb/test/ntb_perf.c b/drivers/ntb/test/ntb_perf.c
index 2a9d6b0d1f19..fe27412ffe91 100644
--- a/drivers/ntb/test/ntb_perf.c
+++ b/drivers/ntb/test/ntb_perf.c
@@ -655,7 +655,7 @@ static int perf_init_service(struct perf_ctx *perf)
{
u64 mask;

- if (ntb_peer_mw_count(perf->ntb) < perf->pcnt + 1) {
+ if (ntb_peer_mw_count(perf->ntb) < perf->pcnt) {
dev_err(&perf->ntb->dev, "Not enough memory windows\n");
return -EINVAL;
}
--
2.11.0


2018-07-20 18:03:46

by Logan Gunthorpe

[permalink] [raw]
Subject: [PATCH v2 3/8] NTB: Fix the default port and peer numbers for legacy drivers

When the commit adding ntb_default_port_number() and
ntb_default_peer_port_number() entered the kernel there was no
users of it so it was impossible to tell what the API needed.

When a user finally landed a year later (ntb_pingpong) there were
more NTB topologies were created and no consideration was considered
to how other drivers had changed.

Now that there is a user it can be fixed to provide a sensible default
for the legacy drivers that do not implement ntb_{peer_}port_number().
Seeing ntb_pingpong doesn't check error codes returning EINVAL was also
not sensible.

Patches for ntb_pingpong and ntb_perf follow (which are broken
otherwise) to support hardware that doesn't have port numbers. This is
important not only to not break support with existing drivers but for
the cross link topology which, due to its perfect symmetry, cannot
assign unique port numbers to each side.

Fixes: 1e5301196a88 ("NTB: Add indexed ports NTB API")
Signed-off-by: Logan Gunthorpe <[email protected]>
---
drivers/ntb/ntb.c | 9 ++-------
1 file changed, 2 insertions(+), 7 deletions(-)

diff --git a/drivers/ntb/ntb.c b/drivers/ntb/ntb.c
index 93f24440d11d..d955a92a095a 100644
--- a/drivers/ntb/ntb.c
+++ b/drivers/ntb/ntb.c
@@ -225,10 +225,8 @@ int ntb_default_port_number(struct ntb_dev *ntb)
case NTB_TOPO_B2B_DSD:
return NTB_PORT_SEC_DSD;
default:
- break;
+ return 0;
}
-
- return -EINVAL;
}
EXPORT_SYMBOL(ntb_default_port_number);

@@ -251,10 +249,8 @@ int ntb_default_peer_port_number(struct ntb_dev *ntb, int pidx)
case NTB_TOPO_B2B_DSD:
return NTB_PORT_PRI_USD;
default:
- break;
+ return 0;
}
-
- return -EINVAL;
}
EXPORT_SYMBOL(ntb_default_peer_port_number);

@@ -326,4 +322,3 @@ static void __exit ntb_driver_exit(void)
bus_unregister(&ntb_bus);
}
module_exit(ntb_driver_exit);
-
--
2.11.0


2018-07-23 14:03:50

by Allen Hubbe

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

On Fri, Jul 20, 2018 at 2:00 PM Logan Gunthorpe <[email protected]> wrote:
>
> This commit fixes pingpong support for existing drivers that do not
> implement ntb_default_port_number() and ntb_default_peer_port_number().
> This is required for hardware (like the crosslink topology of
> switchtec) which cannot assign reasonable port numbers to each port due
> to its perfect symmetry.
>
> Instead of picking the doorbell to use based on the the index of the
> peer, we use the peer's port number. This is a bit clearer and easier
> to understand.

Does this solve the issue where two of the the port numbers are the
same, because of symmetry over a crosslink? I think the two ports
with the "same" number should be identified as different peer index,
even if the port numbers are the same.

Maybe the port of any peer connected over the crosslink should be the
local switch's crosslink port. The local port number might be needed
to configure translation tables in the local switch. If a globally
unique port number is needed, maybe encode a chip number in some high
bits of the port number? If a locally unique port number is needed,
maybe encode a path, that could be useful for configuring address
translations across multiple crosslinks. Encoding a path, then each
port will have a different number, depending on the perspective of the
source port, which could be confusing (already, peer index is local
perspective, so can cause the same kind of confusion). IMO port
number can be anything useful for specific ntb driver and devices, or
maybe just be informative, but peer index should be useful for ntb
client drivers.

> Fixes: c7aeb0afdcc2 ("NTB: ntb_pp: Add full multi-port NTB API support")
> Signed-off-by: Logan Gunthorpe <[email protected]>
> ---
> drivers/ntb/test/ntb_pingpong.c | 14 ++++++--------
> 1 file changed, 6 insertions(+), 8 deletions(-)
>
> diff --git a/drivers/ntb/test/ntb_pingpong.c b/drivers/ntb/test/ntb_pingpong.c
> index 65865e460ab8..18d00eec7b02 100644
> --- a/drivers/ntb/test/ntb_pingpong.c
> +++ b/drivers/ntb/test/ntb_pingpong.c
> @@ -121,15 +121,14 @@ static int pp_find_next_peer(struct pp_ctx *pp)
> link = ntb_link_is_up(pp->ntb, NULL, NULL);
>
> /* Find next available peer */
> - if (link & pp->nmask) {
> + if (link & pp->nmask)
> pidx = __ffs64(link & pp->nmask);
> - out_db = BIT_ULL(pidx + 1);

Without +1 here, does this ring the same bell again?

> - } else if (link & pp->pmask) {
> + else if (link & pp->pmask)
> pidx = __ffs64(link & pp->pmask);
> - out_db = BIT_ULL(pidx);
> - } else {
> + else
> return -ENODEV;
> - }
> +
> + out_db = BIT_ULL(ntb_peer_port_number(pp->ntb, pidx));

Can it not be made to work with peer index?

>
> spin_lock(&pp->lock);
> pp->out_pidx = pidx;
> @@ -303,7 +302,7 @@ static void pp_init_flds(struct pp_ctx *pp)
> break;
> }
>
> - pp->in_db = BIT_ULL(pidx);
> + pp->in_db = BIT_ULL(lport);
> pp->pmask = GENMASK_ULL(pidx, 0) >> 1;
> pp->nmask = GENMASK_ULL(pcnt - 1, pidx);
>
> @@ -435,4 +434,3 @@ static void __exit pp_exit(void)
> debugfs_remove_recursive(pp_dbgfs_topdir);
> }
> module_exit(pp_exit);
> -
> --
> 2.11.0

2018-07-23 16:10:16

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number



On 23/07/18 08:01 AM, Allen Hubbe wrote:
> Does this solve the issue where two of the the port numbers are the
> same, because of symmetry over a crosslink? I think the two ports
> with the "same" number should be identified as different peer index,
> even if the port numbers are the same.

I'm not sure if it's that general. It solves it for the case that both
peers have port numbers equal to zero. It doesn't solve it for the more
general case where you may have multiple partitions, including one or
more crosslink partitions. That is _much_ harder to solve and right now
I'm only focused on fixing the code to work as it did before. If someone
is interested in making a complex setup like that work, they'll have to
figure out how and post patches.

I don't think you'll ever have a case where two peers have the same
index, as the index is really an abstract concept the hardware doesn't
really know about.

> Maybe the port of any peer connected over the crosslink should be the
> local switch's crosslink port.

Well, the switches could in theory be configured in any way. But the
whole point and benefit of crosslink is for both switches to be
configured to be identical. The logical configuration is to configure
both hosts to be on port 0 for both switches and there's really no
benefit to doing it differently.


> The local port number might be needed
> to configure translation tables in the local switch.

It's not. The crosslink partition between the switches would typically
always be "port number" 1. (Though there may be instances where it's a
different port number, but that shouldn't effect anything and is handled
by the driver.)

> If a globally
> unique port number is needed, maybe encode a chip number in some high
> bits of the port number?

With crosslink, we can't figure out a chip number for the same reasons
we can't figure out a port number.

> If a locally unique port number is needed,
> maybe encode a path, that could be useful for configuring address
> translations across multiple crosslinks. Encoding a path, then each
> port will have a different number, depending on the perspective of the
> source port, which could be confusing (already, peer index is local
> perspective, so can cause the same kind of confusion).

I don't follow this. Any path you might get will be exactly the same for
both hosts in a symmetric crosslink configuration. The only way to find
a locally unique port number would be for the two hosts to negotiate
somehow and that's cumbersome and would introduce some randomness (based
on which of two identical machines happened to be first) into the
process which I really don't want seeing that makes debugging much more
difficult.


>IMO port
> number can be anything useful for specific ntb driver and devices, or
> maybe just be informative, but peer index should be useful for ntb
> client drivers.

Yes, the port number is commonly used to decide which resource (MW, DB,
etc) will be used to point to another peer. This is how Serge has coded
the existing clients and makes sense. The problem comes with crosslink
which can not provide these port numbers but worked fine in the previous
two-host code. These patches fix the clients so that they support
crosslink (and other two host systems) in a very similar way to how they
were supported before. In theory, we could now rip out the ugly code for
the Intel and AMD drivers that assign port numbers based on
primary/secondary and such. But I have no interest in doing so.

Logan

2018-07-24 17:27:30

by Allen Hubbe

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

On Mon, Jul 23, 2018 at 12:08 PM Logan Gunthorpe <[email protected]> wrote:
> I don't think you'll ever have a case where two peers have the same
> index, as the index is really an abstract concept the hardware doesn't
> really know about.

That is the point of index, that there should never be two peers with
the same index, and also that the range of index values is bounded.
Port numbers are problematic, so I'm worried about the change to use
port number in the client drivers instead of using index. For
example, this change assumes that the index value is < bits per long
long, because the value is used in BIT_ULL(port number).

Maybe I'm missing something... In the crosslink case, there is
another doorbell register on the other side of the crosslink. Whether
to use the nearby or via-crosslink doorbell depends on the peer
index... making assumptions about the hw driver, but is that about
right? Then you are selecting bits in the doorbell register based on
port number, ok, that must be how the bits of the shared db register
are associated with ports in your configuration. Maybe what's
actually needed is a ntb_peer_db_valid_mask(peer index), and if only
the port-numbered db bit (or any other combination) is valid for that
peer, so be it, that can be an implementation choice of the hw driver
and below.

2018-07-24 17:39:33

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number



On 24/07/18 11:26 AM, Allen Hubbe wrote:
> On Mon, Jul 23, 2018 at 12:08 PM Logan Gunthorpe <[email protected]> wrote:
>> I don't think you'll ever have a case where two peers have the same
>> index, as the index is really an abstract concept the hardware doesn't
>> really know about.
>
> That is the point of index, that there should never be two peers with
> the same index, and also that the range of index values is bounded.

Yes, I think we are making the same point.

> Port numbers are problematic, so I'm worried about the change to use
> port number in the client drivers instead of using index. For
> example, this change assumes that the index value is < bits per long
> long, because the value is used in BIT_ULL(port number).

Huh?, I'm not making that change... We still use the index to refer to
the peer, but the resource we are using is based on the port number,
just as it was before my changes. Perhaps you can point out in my
patches what change you are worried about?

> Maybe I'm missing something... In the crosslink case, there is
> another doorbell register on the other side of the crosslink. Whether
> to use the nearby or via-crosslink doorbell depends on the peer
> index... making assumptions about the hw driver, but is that about
> right?

Not really. Given that we know there are only two peers, we always use
the other side's doorbell register. You'd only use the nearby doorbell
register if you wanted to trigger your own interrupt -- that would be
weird and we don't really have the API sophistication to do that.

If we wanted to support multiple peers with some number in crosslink
then we'd need to revamp things _significantly_. In this case we'd have
multiple doorbell registers which each apply to a different subset of
peers. This gets _very_ complicated and hurts my head. But as I said,
I'm not trying to add new functionality for multi-peer crosslink or
anything like that. I'm just trying to fix the 2 crosslink peer case so
it works like it did when it was originally merged.

Logan

2018-07-24 18:13:29

by Allen Hubbe

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number

On Tue, Jul 24, 2018 at 1:37 PM Logan Gunthorpe <[email protected]> wrote:
> Not really. Given that we know there are only two peers, we always use
> the other side's doorbell register. You'd only use the nearby doorbell
> register if you wanted to trigger your own interrupt -- that would be
> weird and we don't really have the API sophistication to do that.
>
> If we wanted to support multiple peers with some number in crosslink
> then we'd need to revamp things _significantly_. In this case we'd have
> multiple doorbell registers which each apply to a different subset of
> peers. This gets _very_ complicated and hurts my head.

...huh, looks like peer index was omitted from ntb_peer_db_set and
friends. Adding peer index there would make the interface consistent
with other ntb_peer functions. Peer index would allow the hw driver
to select which doorbell register to use for each peer. Adding a
ntb_peer_db_valid_bits to that would allow a subset of bits in the
shared register to be associated with each peer.

I think that's all that would need to change, not significantly more,
to support multiple doorbell registers associated with different
subsets of peers. The complication would at least be hidden in the hw
driver, where it would need to maintain some mapping from peer index
to the right set of registers.

> But as I said,
> I'm not trying to add new functionality for multi-peer crosslink or
> anything like that. I'm just trying to fix the 2 crosslink peer case so
> it works like it did when it was originally merged.

I thought for sure ntb_peer_db_set already had peer index, and I was
wrong. Go ahead with the change as in your patch, I won't force the
issue or that you to do that extra work and touch all the drivers
again for this. It can be addressed when there is renued interest in
making things work more than one peer.

This patch, and the others in this series:
Acked-by: Allen Hubbe <[email protected]>

2018-07-24 18:25:16

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 4/8] NTB: ntb_pingpong: Choose doorbells based on port number



On 24/07/18 12:12 PM, Allen Hubbe wrote:
> On Tue, Jul 24, 2018 at 1:37 PM Logan Gunthorpe <[email protected]> wrote:
>> Not really. Given that we know there are only two peers, we always use
>> the other side's doorbell register. You'd only use the nearby doorbell
>> register if you wanted to trigger your own interrupt -- that would be
>> weird and we don't really have the API sophistication to do that.
>>
>> If we wanted to support multiple peers with some number in crosslink
>> then we'd need to revamp things _significantly_. In this case we'd have
>> multiple doorbell registers which each apply to a different subset of
>> peers. This gets _very_ complicated and hurts my head.
>
> ...huh, looks like peer index was omitted from ntb_peer_db_set and
> friends. Adding peer index there would make the interface consistent
> with other ntb_peer functions. Peer index would allow the hw driver
> to select which doorbell register to use for each peer. Adding a
> ntb_peer_db_valid_bits to that would allow a subset of bits in the
> shared register to be associated with each peer.

The way the switch hardware works (switchtec and I assume IDT) is that
there is only one doorbell register for all peers which can trigger any
peer that doesn't have the corresponding bit masked. So the way it was
designed without the index maxes sense except when we start to add
multi-peer crosslink insanity. I don't think adding a peer index would
be sufficient as the clients would need to know which peers share which
doorbells.

If I were to attempt something like this, I'd probably look at
introducing a segment index to the entire API. So that each switch in a
crosslink topology is it's own segment with it's own set of peers, peer
index space, doorbell registers, etc. But that's all very complicated
and messy and probably something I wouldn't even look at until we have
finished with traditional multi-host setups.

> I thought for sure ntb_peer_db_set already had peer index, and I was
> wrong. Go ahead with the change as in your patch, I won't force the
> issue or that you to do that extra work and touch all the drivers
> again for this. It can be addressed when there is renued interest in
> making things work more than one peer.
>
> This patch, and the others in this series:
> Acked-by: Allen Hubbe <[email protected]>

Thanks!

Logan


2018-09-17 15:17:01

by Alexander Fomichev

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] Fix breakage caused by the NTB multi-port patchset

On Fri, Jul 20, 2018 at 12:00:26PM -0600, Logan Gunthorpe wrote:

> To reiterate the main points on the feedback: the switchtec driver,
> in the cross link mode, will not be able to implement the port number
> function callbacks and will have to always return 0. It's a physical
> impossibility due to the symmetry. Therefore, in order fix this feature
> (which worked when merged), the clients must be changed to support not
> knowing the port number in the way they worked previously (ie. a legacy
> mode where there will only be two ports and you know there will be two
> sets of doorbells, one for each peer). The majority of this patch set
> fixes these issues.
>
> The other point of controversy is the dma mask. I still strongly
> disagree with doing it in the driver as the code is clearly common
> to all and not at all driver specific. Moreover, I think as written,
> it is extra dangerous seeing all impleminting drivers are operating on
> the new struct device before it's initialized in ntb_register().
> Patch 2 in this series also fixes that.
>
> [...]
>
> Logan Gunthorpe (8):
> NTB: ntb_tool: reading the link file should not end in a NULL byte
> NTB: Setup the DMA mask globally for all drivers
> NTB: Fix the default port and peer numbers for legacy drivers
> NTB: ntb_pingpong: Choose doorbells based on port number
> NTB: perf: Don't require one more memory window than number of peers
> NTB: perf: Fix support for hardware that doesn't have port numbers
> NTB: perf: Fix race condition when run with ntb_test
> NTB: ntb_test: Fix bug when counting remote files

Tested successfully on i5-based desktop PC, NTB on PM8535 chip, both
ends are on the local system.

Tested-by: Alexander Fomichev <[email protected]>

--
Regards,
Alexander

2018-09-17 16:07:20

by Logan Gunthorpe

[permalink] [raw]
Subject: Re: [PATCH v2 0/8] Fix breakage caused by the NTB multi-port patchset



On 17/09/18 09:16 AM, Alexander Fomichev wrote:
>> Logan Gunthorpe (8):
>> NTB: ntb_tool: reading the link file should not end in a NULL byte
>> NTB: Setup the DMA mask globally for all drivers
>> NTB: Fix the default port and peer numbers for legacy drivers
>> NTB: ntb_pingpong: Choose doorbells based on port number
>> NTB: perf: Don't require one more memory window than number of peers
>> NTB: perf: Fix support for hardware that doesn't have port numbers
>> NTB: perf: Fix race condition when run with ntb_test
>> NTB: ntb_test: Fix bug when counting remote files
>
> Tested successfully on i5-based desktop PC, NTB on PM8535 chip, both
> ends are on the local system.
>
> Tested-by: Alexander Fomichev <[email protected]>

Thanks Alexander!

Logan