From: Xinghui Li <[email protected]>
In the legacy, the vmd MSI mode can only be adjusted by configuring
vmd_ids table. This patch adds another way to adjust MSI mode by
adjusting module param, which allows users easier to adjust the vmd
according to the I/O scenario without rebuilding driver. There are two
params that could be recognized: on, off. The default param is NULL,
the goal is not to effect the existing settings of the device.
Signed-off-by: Xinghui Li <[email protected]>
Reviewed-by: Nirmal Patel <[email protected]>
---
drivers/pci/controller/vmd.c | 27 +++++++++++++++++++++++++++
1 file changed, 27 insertions(+)
diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
index 990630ec57c6..fb61181baa9e 100644
--- a/drivers/pci/controller/vmd.c
+++ b/drivers/pci/controller/vmd.c
@@ -34,6 +34,19 @@
#define MB2_SHADOW_OFFSET 0x2000
#define MB2_SHADOW_SIZE 16
+/*
+ * The VMD msi_remap module parameter provides the alternative way
+ * to adjust MSI mode when loading vmd.ko other than vmd_ids table.
+ * There are two params could be recognized:
+ *
+ * off: disable MSI remapping
+ * on: enable MSI remapping
+ *
+ */
+static char *msi_remap;
+module_param(msi_remap, charp, 0444);
+MODULE_PARM_DESC(msi_remap, "Whether to enable MSI remapping function");
+
enum vmd_features {
/*
* Device may contain registers which hint the physical location of the
@@ -875,6 +888,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
return ret;
vmd_set_msi_remapping(vmd, true);
+ dev_info(&vmd->dev->dev, "init vmd with remapping MSI\n");
ret = vmd_create_irq_domain(vmd);
if (ret)
@@ -887,6 +901,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
} else {
vmd_set_msi_remapping(vmd, false);
+ dev_info(&vmd->dev->dev, "init vmd with bypass MSI\n");
}
pci_add_resource(&resources, &vmd->resources[0]);
@@ -955,6 +970,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
return 0;
}
+static void vmd_config_msi_remap_param(unsigned long *features)
+{
+ if (msi_remap) {
+ if (strcmp(msi_remap, "on") == 0)
+ *features &= ~(VMD_FEAT_CAN_BYPASS_MSI_REMAP);
+ else if (strcmp(msi_remap, "off") == 0)
+ *features |= VMD_FEAT_CAN_BYPASS_MSI_REMAP;
+ }
+}
+
static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
{
unsigned long features = (unsigned long) id->driver_data;
@@ -984,6 +1009,8 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
if (err < 0)
goto out_release_instance;
+ vmd_config_msi_remap_param(&features);
+
vmd->cfgbar = pcim_iomap(dev, VMD_CFGBAR, 0);
if (!vmd->cfgbar) {
err = -ENOMEM;
--
2.31.1
Friendly ping~
On Mon, Mar 20, 2023 at 9:23 PM <[email protected]> wrote:
>
> From: Xinghui Li <[email protected]>
>
> In the legacy, the vmd MSI mode can only be adjusted by configuring
> vmd_ids table. This patch adds another way to adjust MSI mode by
> adjusting module param, which allows users easier to adjust the vmd
> according to the I/O scenario without rebuilding driver. There are two
> params that could be recognized: on, off. The default param is NULL,
> the goal is not to effect the existing settings of the device.
>
> Signed-off-by: Xinghui Li <[email protected]>
> Reviewed-by: Nirmal Patel <[email protected]>
> ---
> drivers/pci/controller/vmd.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 990630ec57c6..fb61181baa9e 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -34,6 +34,19 @@
> #define MB2_SHADOW_OFFSET 0x2000
> #define MB2_SHADOW_SIZE 16
>
> +/*
> + * The VMD msi_remap module parameter provides the alternative way
> + * to adjust MSI mode when loading vmd.ko other than vmd_ids table.
> + * There are two params could be recognized:
> + *
> + * off: disable MSI remapping
> + * on: enable MSI remapping
> + *
> + */
> +static char *msi_remap;
> +module_param(msi_remap, charp, 0444);
> +MODULE_PARM_DESC(msi_remap, "Whether to enable MSI remapping function");
> +
> enum vmd_features {
> /*
> * Device may contain registers which hint the physical location of the
> @@ -875,6 +888,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> return ret;
>
> vmd_set_msi_remapping(vmd, true);
> + dev_info(&vmd->dev->dev, "init vmd with remapping MSI\n");
>
> ret = vmd_create_irq_domain(vmd);
> if (ret)
> @@ -887,6 +901,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> } else {
> vmd_set_msi_remapping(vmd, false);
> + dev_info(&vmd->dev->dev, "init vmd with bypass MSI\n");
> }
>
> pci_add_resource(&resources, &vmd->resources[0]);
> @@ -955,6 +970,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> return 0;
> }
>
> +static void vmd_config_msi_remap_param(unsigned long *features)
> +{
> + if (msi_remap) {
> + if (strcmp(msi_remap, "on") == 0)
> + *features &= ~(VMD_FEAT_CAN_BYPASS_MSI_REMAP);
> + else if (strcmp(msi_remap, "off") == 0)
> + *features |= VMD_FEAT_CAN_BYPASS_MSI_REMAP;
> + }
> +}
> +
> static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
> {
> unsigned long features = (unsigned long) id->driver_data;
> @@ -984,6 +1009,8 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
> if (err < 0)
> goto out_release_instance;
>
> + vmd_config_msi_remap_param(&features);
> +
> vmd->cfgbar = pcim_iomap(dev, VMD_CFGBAR, 0);
> if (!vmd->cfgbar) {
> err = -ENOMEM;
> --
> 2.31.1
>
On Mon, Mar 20, 2023 at 09:23:16PM +0800, [email protected] wrote:
> From: Xinghui Li <[email protected]>
>
> In the legacy, the vmd MSI mode can only be adjusted by configuring
> vmd_ids table. This patch adds another way to adjust MSI mode by
> adjusting module param, which allows users easier to adjust the vmd
> according to the I/O scenario without rebuilding driver. There are two
> params that could be recognized: on, off. The default param is NULL,
> the goal is not to effect the existing settings of the device.
"two params: on, off ... default param is NULL" doesn't read quite
right because "NULL" is not a valid parameter value.
I think you could just omit that last sentence completely because it's
obvious that if you don't specify the parameter, it doesn't affect
anything and the existing behavior is unchanged (it's determined by
whether VMD_FEAT_CAN_BYPASS_MSI_REMAP is specified in vmd_ids[]).
> Signed-off-by: Xinghui Li <[email protected]>
> Reviewed-by: Nirmal Patel <[email protected]>
I think Lorenzo is out of the office this week, but I'm sure he'll
take care of this when he returns.
In the meantime, can you include a sample usage in the commit log so
folks don't have to dig through the patch and figure out how to use
the parameter?
It would also be nice to include a hint about why a user would choose
"on" or "off". What is the performance effect? What sort of I/O
scenario would lead you to choose "on" vs "off"?
ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when possible")
suggests that MSI-X remapping (I assume the "msi_remap=on" case):
- Limits the number MSI-X vectors available to child devices to the
number of VMD MSI-X vectors.
- Reduces interrupt handling performance because child device
interrupts have to go through the VMD MSI-X domain interrupt
handler.
So I assume "msi_remap=off" would remove that MSI-X vector limit and
improve interrupt handling performance?
But obviously there's more to consider because those are both good
things and if we could do that all the time, we would. So there must
be cases where we *have* to remap. ee81ee84f873 suggests that not all
VMD devices support disabling remap. There's also a hint that some
virt configs require it.
This patch doesn't enforce either of those things. What happens if
the user gets it wrong?
> drivers/pci/controller/vmd.c | 27 +++++++++++++++++++++++++++
> 1 file changed, 27 insertions(+)
>
> diff --git a/drivers/pci/controller/vmd.c b/drivers/pci/controller/vmd.c
> index 990630ec57c6..fb61181baa9e 100644
> --- a/drivers/pci/controller/vmd.c
> +++ b/drivers/pci/controller/vmd.c
> @@ -34,6 +34,19 @@
> #define MB2_SHADOW_OFFSET 0x2000
> #define MB2_SHADOW_SIZE 16
>
> +/*
> + * The VMD msi_remap module parameter provides the alternative way
> + * to adjust MSI mode when loading vmd.ko other than vmd_ids table.
> + * There are two params could be recognized:
> + *
> + * off: disable MSI remapping
> + * on: enable MSI remapping
> + *
Spurious blank line.
> + */
> +static char *msi_remap;
> +module_param(msi_remap, charp, 0444);
> +MODULE_PARM_DESC(msi_remap, "Whether to enable MSI remapping function");
ee81ee84f873 mentions MSI-X explicitly (which is different from MSI),
so maybe use "MSI-X" here and in the messages below to avoid any
confusion.
> enum vmd_features {
> /*
> * Device may contain registers which hint the physical location of the
> @@ -875,6 +888,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> return ret;
>
> vmd_set_msi_remapping(vmd, true);
> + dev_info(&vmd->dev->dev, "init vmd with remapping MSI\n");
>
> ret = vmd_create_irq_domain(vmd);
> if (ret)
> @@ -887,6 +901,7 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> irq_domain_update_bus_token(vmd->irq_domain, DOMAIN_BUS_VMD_MSI);
> } else {
> vmd_set_msi_remapping(vmd, false);
> + dev_info(&vmd->dev->dev, "init vmd with bypass MSI\n");
> }
>
> pci_add_resource(&resources, &vmd->resources[0]);
> @@ -955,6 +970,16 @@ static int vmd_enable_domain(struct vmd_dev *vmd, unsigned long features)
> return 0;
> }
>
> +static void vmd_config_msi_remap_param(unsigned long *features)
> +{
> + if (msi_remap) {
> + if (strcmp(msi_remap, "on") == 0)
> + *features &= ~(VMD_FEAT_CAN_BYPASS_MSI_REMAP);
> + else if (strcmp(msi_remap, "off") == 0)
> + *features |= VMD_FEAT_CAN_BYPASS_MSI_REMAP;
The usual strcmp() idiom is to test "!strcmp(...)" instead of
"strcmp(...) == 0)". No need for "()" around
VMD_FEAT_CAN_BYPASS_MSI_REMAP.
> + }
> +}
> +
> static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
> {
> unsigned long features = (unsigned long) id->driver_data;
> @@ -984,6 +1009,8 @@ static int vmd_probe(struct pci_dev *dev, const struct pci_device_id *id)
> if (err < 0)
> goto out_release_instance;
>
> + vmd_config_msi_remap_param(&features);
> +
> vmd->cfgbar = pcim_iomap(dev, VMD_CFGBAR, 0);
> if (!vmd->cfgbar) {
> err = -ENOMEM;
> --
> 2.31.1
>
On Wed, Mar 29, 2023 at 5:34 AM Bjorn Helgaas <[email protected]> wrote:
>
> "two params: on, off ... default param is NULL" doesn't read quite
> right because "NULL" is not a valid parameter value.
>
> I think you could just omit that last sentence completely because it's
> obvious that if you don't specify the parameter, it doesn't affect
> anything and the existing behavior is unchanged (it's determined by
> whether VMD_FEAT_CAN_BYPASS_MSI_REMAP is specified in vmd_ids[]).
>
Your opinion is more reasonable. I will omit this sentence.
>
> > Signed-off-by: Xinghui Li <[email protected]>
> > Reviewed-by: Nirmal Patel <[email protected]>
>
> I think Lorenzo is out of the office this week, but I'm sure he'll
> take care of this when he returns.
>
> In the meantime, can you include a sample usage in the commit log so
> folks don't have to dig through the patch and figure out how to use
> the parameter?
>
I will add instructions on how to use this parameter.
>
> It would also be nice to include a hint about why a user would choose
> "on" or "off". What is the performance effect? What sort of I/O
> scenario would lead you to choose "on" vs "off"?
>
Before this patch, I sent the patch named :
PCI: vmd: Do not disable MSI-X remapping in VMD 28C0 controller
(patchwork link:
https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/)
We found the 4k rand read's iops could drop 50% if 4 NVMEs were
mounted in one PCIE port with VMD MSI bypass.
I suppose this is because the VMD Controller can aggregate interrupts.
But those test result is so long that I didn't add them to this patch
commit log.
If you believe it is necessary, I will try to add some simple instructions
>
> ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when possible")
> suggests that MSI-X remapping (I assume the "msi_remap=on" case):
>
> - Limits the number MSI-X vectors available to child devices to the
> number of VMD MSI-X vectors.
>
> - Reduces interrupt handling performance because child device
> interrupts have to go through the VMD MSI-X domain interrupt
> handler.
>
> So I assume "msi_remap=off" would remove that MSI-X vector limit and
> improve interrupt handling performance?
>
> But obviously there's more to consider because those are both good
> things and if we could do that all the time, we would. So there must
> be cases where we *have* to remap. ee81ee84f873 suggests that not all
> VMD devices support disabling remap. There's also a hint that some
> virt configs require it.
>
I used to just want to disable 28C0's VMD MSI bypass by default.
But Nirmal suggested the current method by adjusting the param.
Because he and other reviewers worry there are some other scenarios we
didn't consider.
Adding a method to adjust VMD'S MSI-X mode is better.
> This patch doesn't enforce either of those things. What happens if
> the user gets it wrong?
>
If I am wrong, please feel free to correct me at any time.
I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
param configuring helper front
"vmd_enable_domain". So, It will not change the logic disabling
remapping from ee81ee84f873, such as
"Currently MSI remapping must be enabled in guest passthrough mode".
So, if the user config the wrong type, it will not work, and they can
find it by dmesg.
And that's why I add the MSI-X mode init print.
>
> Spurious blank line.
sorry for this
>
>
> ee81ee84f873 mentions MSI-X explicitly (which is different from MSI),
> so maybe use "MSI-X" here and in the messages below to avoid any
> confusion.
>
That's nicer.
>
> The usual strcmp() idiom is to test "!strcmp(...)" instead of
> "strcmp(...) == 0)". No need for "()" around
> VMD_FEAT_CAN_BYPASS_MSI_REMAP.
>
All right, I will change it to "!strcmp()" way.
Thanks for you review~
On Wed, Mar 29, 2023 at 04:57:08PM +0800, Xinghui Li wrote:
> On Wed, Mar 29, 2023 at 5:34 AM Bjorn Helgaas <[email protected]> wrote:
> > It would also be nice to include a hint about why a user would choose
> > "on" or "off". What is the performance effect? What sort of I/O
> > scenario would lead you to choose "on" vs "off"?
> >
> Before this patch, I sent the patch named :
> PCI: vmd: Do not disable MSI-X remapping in VMD 28C0 controller
> (patchwork link:
> https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/)
> We found the 4k rand read's iops could drop 50% if 4 NVMEs were
> mounted in one PCIE port with VMD MSI bypass.
> I suppose this is because the VMD Controller can aggregate interrupts.
> But those test result is so long that I didn't add them to this patch
> commit log.
> If you believe it is necessary, I will try to add some simple instructions
I don't think we need detailed performance numbers, but we need
something like:
- "msi_remap=off" improves interrupt handling performance by
avoiding the VMD MSI-X domain interrupt handler
- But "msi_remap=on" is needed when ...?
> > ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when possible")
> > suggests that MSI-X remapping (I assume the "msi_remap=on" case):
> >
> > - Limits the number MSI-X vectors available to child devices to the
> > number of VMD MSI-X vectors.
> >
> > - Reduces interrupt handling performance because child device
> > interrupts have to go through the VMD MSI-X domain interrupt
> > handler.
> >
> > So I assume "msi_remap=off" would remove that MSI-X vector limit and
> > improve interrupt handling performance?
> >
> > But obviously there's more to consider because those are both good
> > things and if we could do that all the time, we would. So there must
> > be cases where we *have* to remap. ee81ee84f873 suggests that not all
> > VMD devices support disabling remap. There's also a hint that some
> > virt configs require it.
> >
> I used to just want to disable 28C0's VMD MSI bypass by default.
> But Nirmal suggested the current method by adjusting the param.
> Because he and other reviewers worry there are some other scenarios we
> didn't consider.
> Adding a method to adjust VMD'S MSI-X mode is better.
This commit log doesn't outline any of those other scenarios, and it
doesn't say anything about when "msi_remap=on" or "msi_remap=off"
would be necessary or desired, so I have no idea how users are
supposed to figure out whether or not to use this parameter.
> > This patch doesn't enforce either of those things. What happens if
> > the user gets it wrong?
>
> If I am wrong, please feel free to correct me at any time.
> I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
> param configuring helper front
> "vmd_enable_domain". So, It will not change the logic disabling
> remapping from ee81ee84f873, such as
> "Currently MSI remapping must be enabled in guest passthrough mode".
> So, if the user config the wrong type, it will not work, and they can
> find it by dmesg.
That's kind of a problem. I'm not in favor of something failing and
the user having to debug it via dmesg. That causes user frustration
and problem reports.
I don't know what "guest passthrough mode" is. Can you detect that
automatically?
Bjorn
On 3/29/2023 9:31 AM, Bjorn Helgaas wrote:
> On Wed, Mar 29, 2023 at 04:57:08PM +0800, Xinghui Li wrote:
>> On Wed, Mar 29, 2023 at 5:34 AM Bjorn Helgaas <[email protected]> wrote:
>>> It would also be nice to include a hint about why a user would choose
>>> "on" or "off". What is the performance effect? What sort of I/O
>>> scenario would lead you to choose "on" vs "off"?
>>>
>> Before this patch, I sent the patch named :
>> PCI: vmd: Do not disable MSI-X remapping in VMD 28C0 controller
>> (patchwork link:
>> https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/)
>> We found the 4k rand read's iops could drop 50% if 4 NVMEs were
>> mounted in one PCIE port with VMD MSI bypass.
>> I suppose this is because the VMD Controller can aggregate interrupts.
>> But those test result is so long that I didn't add them to this patch
>> commit log.
>> If you believe it is necessary, I will try to add some simple instructions
> I don't think we need detailed performance numbers, but we need
> something like:
>
> - "msi_remap=off" improves interrupt handling performance by
> avoiding the VMD MSI-X domain interrupt handler
>
> - But "msi_remap=on" is needed when ...?
>
>>> ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when possible")
>>> suggests that MSI-X remapping (I assume the "msi_remap=on" case):
>>>
>>> - Limits the number MSI-X vectors available to child devices to the
>>> number of VMD MSI-X vectors.
>>>
>>> - Reduces interrupt handling performance because child device
>>> interrupts have to go through the VMD MSI-X domain interrupt
>>> handler.
>>>
>>> So I assume "msi_remap=off" would remove that MSI-X vector limit and
>>> improve interrupt handling performance?
>>>
>>> But obviously there's more to consider because those are both good
>>> things and if we could do that all the time, we would. So there must
>>> be cases where we *have* to remap. ee81ee84f873 suggests that not all
>>> VMD devices support disabling remap. There's also a hint that some
>>> virt configs require it.
>>>
>> I used to just want to disable 28C0's VMD MSI bypass by default.
>> But Nirmal suggested the current method by adjusting the param.
>> Because he and other reviewers worry there are some other scenarios we
>> didn't consider.
>> Adding a method to adjust VMD'S MSI-X mode is better.
> This commit log doesn't outline any of those other scenarios, and it
> doesn't say anything about when "msi_remap=on" or "msi_remap=off"
> would be necessary or desired, so I have no idea how users are
> supposed to figure out whether or not to use this parameter.
>
>>> This patch doesn't enforce either of those things. What happens if
>>> the user gets it wrong?
>> If I am wrong, please feel free to correct me at any time.
>> I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
>> param configuring helper front
>> "vmd_enable_domain". So, It will not change the logic disabling
>> remapping from ee81ee84f873, such as
>> "Currently MSI remapping must be enabled in guest passthrough mode".
>> So, if the user config the wrong type, it will not work, and they can
>> find it by dmesg.
> That's kind of a problem. I'm not in favor of something failing and
> the user having to debug it via dmesg. That causes user frustration
> and problem reports.
>
> I don't know what "guest passthrough mode" is. Can you detect that
> automatically?
>
> Bjorn
How about adding a boolean flag by comparing user input for module
parameter msi_remap? and add the flag at
- if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) || msi_flag
|| offset[0] || offset[1])
Correct if I am wrong, but in this way we can cover all the cases.
If user adds msi_remap=on, msi_flag=true and enables remapping.
If user adds msi_remap=off, msi_flag=false and disables remapping.
If user doesn't add anything, msi_flag=false and decision will be
made same as current implementation. This will cover guest OS case
as well.
On Thu, Mar 30, 2023 at 2:49 PM Patel, Nirmal
<[email protected]> wrote:
>
> How about adding a boolean flag by comparing user input for module
> parameter msi_remap? and add the flag at
>
> - if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) || msi_flag
> || offset[0] || offset[1])
>
> Correct if I am wrong, but in this way we can cover all the cases.
> If user adds msi_remap=on, msi_flag=true and enables remapping.
> If user adds msi_remap=off, msi_flag=false and disables remapping.
> If user doesn't add anything, msi_flag=false and decision will be
> made same as current implementation. This will cover guest OS case
> as well.
>
Sorry, I don't quite get your point. How is msi_flag assigned?
Do you mean when msi_remap=no, the msi_flag is assigned as true?
And msi_remap=off, the msi_flag is assigned as false?
Thanks~
On Thu, Mar 30, 2023 at 12:31 AM Bjorn Helgaas <[email protected]> wrote:
>
> On Wed, Mar 29, 2023 at 04:57:08PM +0800, Xinghui Li wrote:
> > On Wed, Mar 29, 2023 at 5:34 AM Bjorn Helgaas <[email protected]> wrote:
> > > It would also be nice to include a hint about why a user would choose
> > > "on" or "off". What is the performance effect? What sort of I/O
> > > scenario would lead you to choose "on" vs "off"?
> > >
> > Before this patch, I sent the patch named :
> > PCI: vmd: Do not disable MSI-X remapping in VMD 28C0 controller
> > (patchwork link:
> > https://patchwork.kernel.org/project/linux-pci/patch/[email protected]/)
> > We found the 4k rand read's iops could drop 50% if 4 NVMEs were
> > mounted in one PCIE port with VMD MSI bypass.
> > I suppose this is because the VMD Controller can aggregate interrupts.
> > But those test result is so long that I didn't add them to this patch
> > commit log.
> > If you believe it is necessary, I will try to add some simple instructions
>
> I don't think we need detailed performance numbers, but we need
> something like:
>
> - "msi_remap=off" improves interrupt handling performance by
> avoiding the VMD MSI-X domain interrupt handler
>
> - But "msi_remap=on" is needed when ...?
>
In the patch I send in last email, We speculate that the VMD
Controller aggregate interrupts,
making the PCIe port less stressed and improving iops. In this
case(lots of 4k random IO), if we enable the VMD MSI
remapping, we found the interrupts from VMD Controller's MSI are less
and the IOPS is much higher.
> > > ee81ee84f873 ("PCI: vmd: Disable MSI-X remapping when possible")
> > > suggests that MSI-X remapping (I assume the "msi_remap=on" case):
> > >
> > > - Limits the number MSI-X vectors available to child devices to the
> > > number of VMD MSI-X vectors.
> > >
> > > - Reduces interrupt handling performance because child device
> > > interrupts have to go through the VMD MSI-X domain interrupt
> > > handler.
> > >
> > > So I assume "msi_remap=off" would remove that MSI-X vector limit and
> > > improve interrupt handling performance?
> > >
> > > But obviously there's more to consider because those are both good
> > > things and if we could do that all the time, we would. So there must
> > > be cases where we *have* to remap. ee81ee84f873 suggests that not all
> > > VMD devices support disabling remap. There's also a hint that some
> > > virt configs require it.
> > >
> > I used to just want to disable 28C0's VMD MSI bypass by default.
> > But Nirmal suggested the current method by adjusting the param.
> > Because he and other reviewers worry there are some other scenarios we
> > didn't consider.
> > Adding a method to adjust VMD'S MSI-X mode is better.
>
> This commit log doesn't outline any of those other scenarios, and it
> doesn't say anything about when "msi_remap=on" or "msi_remap=off"
> would be necessary or desired, so I have no idea how users are
> supposed to figure out whether or not to use this parameter.
>
I do miss this part. I'm sorry I ignored the users outside the
community or this discussion.
I will add the necessary guidelines.
> > > This patch doesn't enforce either of those things. What happens if
> > > the user gets it wrong?
> >
> > If I am wrong, please feel free to correct me at any time.
> > I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
> > param configuring helper front
> > "vmd_enable_domain". So, It will not change the logic disabling
> > remapping from ee81ee84f873, such as
> > "Currently MSI remapping must be enabled in guest passthrough mode".
> > So, if the user config the wrong type, it will not work, and they can
> > find it by dmesg.
>
> That's kind of a problem. I'm not in favor of something failing and
> the user having to debug it via dmesg. That causes user frustration
> and problem reports.
What about adding a sysfs node for it in VMD PCI bus dir, which allows
users to catch VMD's MSI current working mode?
>
> I don't know what "guest passthrough mode" is. Can you detect that
> automatically?
I quote this from the commit ee81ee84f873's comment, it can be detected by the
logic like this:
if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
offset[0] || offset[1])
I just want to answer your comment: "There's also a hint that some
virt configs require it."
This patch will not modify the logic of determining whether MSI
remapping is enabled
when running VMD in Guest.
Thanks~
On 4/2/2023 7:34 AM, Xinghui Li wrote:
> On Thu, Mar 30, 2023 at 2:49 PM Patel, Nirmal
> <[email protected]> wrote:
>> How about adding a boolean flag by comparing user input for module
>> parameter msi_remap? and add the flag at
>>
>> - if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) || msi_flag
>> || offset[0] || offset[1])
>>
>> Correct if I am wrong, but in this way we can cover all the cases.
>> If user adds msi_remap=on, msi_flag=true and enables remapping.
>> If user adds msi_remap=off, msi_flag=false and disables remapping.
>> If user doesn't add anything, msi_flag=false and decision will be
>> made same as current implementation. This will cover guest OS case
>> as well.
>>
> Sorry, I don't quite get your point. How is msi_flag assigned?
> Do you mean when msi_remap=no, the msi_flag is assigned as true?
> And msi_remap=off, the msi_flag is assigned as false?
>
> Thanks~
if msi_remap=on, then msi_flag=true;
if msi_remap=off, then msi_flag=false.
On Sun, Apr 02, 2023 at 11:02:07PM +0800, Xinghui Li wrote:
> On Thu, Mar 30, 2023 at 12:31 AM Bjorn Helgaas <[email protected]> wrote:
> > On Wed, Mar 29, 2023 at 04:57:08PM +0800, Xinghui Li wrote:
> > > On Wed, Mar 29, 2023 at 5:34 AM Bjorn Helgaas <[email protected]> wrote:
> > > > It would also be nice to include a hint about why a user would choose
> > > > "on" or "off". What is the performance effect? What sort of I/O
> > > > scenario would lead you to choose "on" vs "off"?
> > I don't think we need detailed performance numbers, but we need
> > something like:
> >
> > - "msi_remap=off" improves interrupt handling performance by
> > avoiding the VMD MSI-X domain interrupt handler
> >
> > - But "msi_remap=on" is needed when ...?
> >
> In the patch I send in last email, We speculate that the VMD
> Controller aggregate interrupts,
> making the PCIe port less stressed and improving iops. In this
> case(lots of 4k random IO), if we enable the VMD MSI
> remapping, we found the interrupts from VMD Controller's MSI are less
> and the IOPS is much higher.
Great, that's useful information about why users would want to use
this parameter.
Obviously the other half is to mention the reasons why they may not be
able to use it. If "msi_remap=off" were *always* safe and desirable,
we would just do that unconditionally and there would be no point in a
parameter.
> > > I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
> > > param configuring helper front
> > > "vmd_enable_domain". So, It will not change the logic disabling
> > > remapping from ee81ee84f873, such as
> > > "Currently MSI remapping must be enabled in guest passthrough mode".
> > > So, if the user config the wrong type, it will not work, and they can
> > > find it by dmesg.
> >
> > That's kind of a problem. I'm not in favor of something failing and
> > the user having to debug it via dmesg. That causes user frustration
> > and problem reports.
>
> What about adding a sysfs node for it in VMD PCI bus dir, which allows
> users to catch VMD's MSI current working mode?
No, a sysfs node is not the answer to this. If we *can* avoid a
non-working situation, we should avoid it. If users see a non-working
situation, they will rightly complain, and somebody will have to debug
it. We don't want that.
> > I don't know what "guest passthrough mode" is. Can you detect that
> > automatically?
>
> I quote this from the commit ee81ee84f873's comment, it can be detected by the
> logic like this:
> if (!(features & VMD_FEAT_CAN_BYPASS_MSI_REMAP) ||
> offset[0] || offset[1])
> I just want to answer your comment: "There's also a hint that some
> virt configs require it."
> This patch will not modify the logic of determining whether MSI
> remapping is enabled
> when running VMD in Guest.
My point is that apparently guest passthrough mode is relevant and
maybe it should be part of the commit log about how this parameter
should be used.
Bjorn
On Tue, Apr 4, 2023 at 6:45 AM Bjorn Helgaas <[email protected]> wrote:
>
> > In the patch I send in last email, We speculate that the VMD
> > Controller aggregate interrupts,
> > making the PCIe port less stressed and improving iops. In this
> > case(lots of 4k random IO), if we enable the VMD MSI
> > remapping, we found the interrupts from VMD Controller's MSI are less
> > and the IOPS is much higher.
>
> Great, that's useful information about why users would want to use
> this parameter.
>
> Obviously the other half is to mention the reasons why they may not be
> able to use it. If "msi_remap=off" were *always* safe and desirable,
> we would just do that unconditionally and there would be no point in a
> parameter.
>
> > > > I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
> > > > param configuring helper front
> > > > "vmd_enable_domain". So, It will not change the logic disabling
> > > > remapping from ee81ee84f873, such as
> > > > "Currently MSI remapping must be enabled in guest passthrough mode".
> > > > So, if the user config the wrong type, it will not work, and they can
> > > > find it by dmesg.
> > >
> > > That's kind of a problem. I'm not in favor of something failing and
> > > the user having to debug it via dmesg. That causes user frustration
> > > and problem reports.
> >
> > What about adding a sysfs node for it in VMD PCI bus dir, which allows
> > users to catch VMD's MSI current working mode?
>
> No, a sysfs node is not the answer to this. If we *can* avoid a
> non-working situation, we should avoid it. If users see a non-working
> situation, they will rightly complain, and somebody will have to debug
> it. We don't want that.
emm~
I privately suppose: In the traditional way without module parameters,
this problem also exists. If the user disables remapping in guest os, the VMD
driver will force it to remapping mode.
What about I add the additional description in MODULE_PARM_DESC and comment?
As for some devices not support disable remapping, What about changing
the param to disabae_bypass=0/1?
And make it clear in the description:
this parameter will only affect the disabling of bypass mode on
devices that support it.
>
> My point is that apparently guest passthrough mode is relevant and
> maybe it should be part of the commit log about how this parameter
> should be used.
You are right~
I will add some clarification on how to configure VMD MSI in the guest.
>
> Bjorn
Friendly ping~
On Tue, Apr 4, 2023 at 7:02 PM Xinghui Li <[email protected]> wrote:
>
> On Tue, Apr 4, 2023 at 6:45 AM Bjorn Helgaas <[email protected]> wrote:
> >
> > > In the patch I send in last email, We speculate that the VMD
> > > Controller aggregate interrupts,
> > > making the PCIe port less stressed and improving iops. In this
> > > case(lots of 4k random IO), if we enable the VMD MSI
> > > remapping, we found the interrupts from VMD Controller's MSI are less
> > > and the IOPS is much higher.
> >
> > Great, that's useful information about why users would want to use
> > this parameter.
> >
> > Obviously the other half is to mention the reasons why they may not be
> > able to use it. If "msi_remap=off" were *always* safe and desirable,
> > we would just do that unconditionally and there would be no point in a
> > parameter.
> >
> > > > > I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
> > > > > param configuring helper front
> > > > > "vmd_enable_domain". So, It will not change the logic disabling
> > > > > remapping from ee81ee84f873, such as
> > > > > "Currently MSI remapping must be enabled in guest passthrough mode".
> > > > > So, if the user config the wrong type, it will not work, and they can
> > > > > find it by dmesg.
> > > >
> > > > That's kind of a problem. I'm not in favor of something failing and
> > > > the user having to debug it via dmesg. That causes user frustration
> > > > and problem reports.
> > >
> > > What about adding a sysfs node for it in VMD PCI bus dir, which allows
> > > users to catch VMD's MSI current working mode?
> >
> > No, a sysfs node is not the answer to this. If we *can* avoid a
> > non-working situation, we should avoid it. If users see a non-working
> > situation, they will rightly complain, and somebody will have to debug
> > it. We don't want that.
> emm~
> I privately suppose: In the traditional way without module parameters,
> this problem also exists. If the user disables remapping in guest os, the VMD
> driver will force it to remapping mode.
> What about I add the additional description in MODULE_PARM_DESC and comment?
> As for some devices not support disable remapping, What about changing
> the param to disabae_bypass=0/1?
> And make it clear in the description:
> this parameter will only affect the disabling of bypass mode on
> devices that support it.
> >
> > My point is that apparently guest passthrough mode is relevant and
> > maybe it should be part of the commit log about how this parameter
> > should be used.
> You are right~
> I will add some clarification on how to configure VMD MSI in the guest.
> >
> > Bjorn
On Mon, Apr 17, 2023 at 05:15:00PM +0800, Xinghui Li wrote:
> Friendly ping~
We had quite a bit of discussion that I don't see reflected in the
latest patch, so we're waiting on a v5 patch that addresses the
comments.
> On Tue, Apr 4, 2023 at 7:02 PM Xinghui Li <[email protected]> wrote:
> >
> > On Tue, Apr 4, 2023 at 6:45 AM Bjorn Helgaas <[email protected]> wrote:
> > >
> > > > In the patch I send in last email, We speculate that the VMD
> > > > Controller aggregate interrupts,
> > > > making the PCIe port less stressed and improving iops. In this
> > > > case(lots of 4k random IO), if we enable the VMD MSI
> > > > remapping, we found the interrupts from VMD Controller's MSI are less
> > > > and the IOPS is much higher.
> > >
> > > Great, that's useful information about why users would want to use
> > > this parameter.
> > >
> > > Obviously the other half is to mention the reasons why they may not be
> > > able to use it. If "msi_remap=off" were *always* safe and desirable,
> > > we would just do that unconditionally and there would be no point in a
> > > parameter.
> > >
> > > > > > I place the "vmd_config_msi_remap_param" that is VMD MSI-X's mode
> > > > > > param configuring helper front
> > > > > > "vmd_enable_domain". So, It will not change the logic disabling
> > > > > > remapping from ee81ee84f873, such as
> > > > > > "Currently MSI remapping must be enabled in guest passthrough mode".
> > > > > > So, if the user config the wrong type, it will not work, and they can
> > > > > > find it by dmesg.
> > > > >
> > > > > That's kind of a problem. I'm not in favor of something failing and
> > > > > the user having to debug it via dmesg. That causes user frustration
> > > > > and problem reports.
> > > >
> > > > What about adding a sysfs node for it in VMD PCI bus dir, which allows
> > > > users to catch VMD's MSI current working mode?
> > >
> > > No, a sysfs node is not the answer to this. If we *can* avoid a
> > > non-working situation, we should avoid it. If users see a non-working
> > > situation, they will rightly complain, and somebody will have to debug
> > > it. We don't want that.
> > emm~
> > I privately suppose: In the traditional way without module parameters,
> > this problem also exists. If the user disables remapping in guest os, the VMD
> > driver will force it to remapping mode.
> > What about I add the additional description in MODULE_PARM_DESC and comment?
> > As for some devices not support disable remapping, What about changing
> > the param to disabae_bypass=0/1?
> > And make it clear in the description:
> > this parameter will only affect the disabling of bypass mode on
> > devices that support it.
> > >
> > > My point is that apparently guest passthrough mode is relevant and
> > > maybe it should be part of the commit log about how this parameter
> > > should be used.
> > You are right~
> > I will add some clarification on how to configure VMD MSI in the guest.
> > >
> > > Bjorn
On Tue, Apr 18, 2023 at 5:45 AM Bjorn Helgaas <[email protected]> wrote:
>
> On Mon, Apr 17, 2023 at 05:15:00PM +0800, Xinghui Li wrote:
> > Friendly ping~
>
> We had quite a bit of discussion that I don't see reflected in the
> latest patch, so we're waiting on a v5 patch that addresses the
> comments.
>
Got it.
I will send the next version asap.
Thanks~