2024-02-16 16:02:01

by Robert Richter

Subject: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

The Linux CXL subsystem is built on the assumption that HPA == SPA.
That is, the host physical addresses (HPA) that the HDM decoder
registers are programmed with are system physical addresses (SPA).

During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
8.1.3.8) are checked to verify that memory is enabled and that the CXL
range lies within an HPA window described in a CFMWS structure of the
CXL host bridge (cxl-3.1, 9.18.1.3).

Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
window and the CXL memory range is disabled. The HDM decoder then
stops working, which disables system memory and leads to a system hang
during HDM decoder initialization, typically when a CXL-enabled kernel
boots.

Prevent the system hang by not disabling the HDM decoder if the
decoder's CXL range is not found in a CFMWS window.

Note that this change only fixes a hardware hang; it does not implement
HPA/SPA translation. Support for that can be added in a follow-on
patch series.

Signed-off-by: Robert Richter <[email protected]>
---
drivers/cxl/core/pci.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index a0e7ed5ae25f..18616ca873e5 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
}

if (!allowed) {
- cxl_set_mem_enable(cxlds, 0);
- info->mem_enabled = 0;
+ dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
+ return -ENXIO;
}

/*
--
2.39.2
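
For readers following the thread, the window check described in the commit
message can be sketched in plain C. This is a simplified user-space
illustration under stated assumptions, not the kernel code: struct addr_range,
range_in_window() and count_allowed() are hypothetical names, and the flat
arrays stand in for whatever structures the driver actually walks.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Inclusive address range [start, end]. Hypothetical type for illustration. */
struct addr_range {
	uint64_t start;
	uint64_t end;
};

/* True if range r lies entirely inside window w. */
static bool range_in_window(struct addr_range w, struct addr_range r)
{
	return w.start <= r.start && r.end <= w.end;
}

/*
 * Count how many device ranges fall inside at least one platform window.
 * In the thread's terms: ranges model the DVSEC CXL ranges, windows model
 * the CFMWS windows, and the result plays the role of "allowed".
 */
static int count_allowed(const struct addr_range *windows, size_t nr_windows,
			 const struct addr_range *ranges, size_t nr_ranges)
{
	int allowed = 0;

	for (size_t i = 0; i < nr_ranges; i++) {
		for (size_t j = 0; j < nr_windows; j++) {
			if (range_in_window(windows[j], ranges[i])) {
				allowed++;
				break; /* one matching window is enough */
			}
		}
	}
	return allowed;
}
```

When the range registers hold addresses that are not SPAs, no window matches
and the count stays at zero, which is the case the patch changes from
"disable memory" to "fail with -ENXIO".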



2024-02-16 18:03:19

by Dan Williams

Subject: Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Robert Richter wrote:
> The Linux CXL subsystem is built on the assumption that HPA == SPA.
> That is, the host physical address (HPA) the HDM decoder registers are
> programmed with are system physical addresses (SPA).
>
> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> a HPA window that is described in a CFMWS structure of the CXL host
> bridge (cxl-3.1, 9.18.1.3).
>
> Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> window and the CXL memory range will be disabled then. The HDM decoder
> stops working which causes system memory being disabled and further a
> system hang during HDM decoder initialization, typically when a CXL
> enabled kernel boots.
>
> Prevent a system hang and do not disable the HDM decoder if the
> decoder's CXL range is not found in a CFMWS window.
>
> Note the change only fixes a hardware hang, but does not implement
> HPA/SPA translation. Support for this can be added in a follow on
> patch series.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> drivers/cxl/core/pci.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a0e7ed5ae25f..18616ca873e5 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> }
>
> if (!allowed) {
> - cxl_set_mem_enable(cxlds, 0);
> - info->mem_enabled = 0;
> + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> + return -ENXIO;
> }

This looks good to me.

2024-02-16 18:10:52

by Robert Richter

Subject: Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

On 16.02.24 10:02:51, Dan Williams wrote:
> Robert Richter wrote:
> > The Linux CXL subsystem is built on the assumption that HPA == SPA.
> > That is, the host physical address (HPA) the HDM decoder registers are
> > programmed with are system physical addresses (SPA).
> >
> > During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> > 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> > a HPA window that is described in a CFMWS structure of the CXL host
> > bridge (cxl-3.1, 9.18.1.3).
> >
> > Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> > window and the CXL memory range will be disabled then. The HDM decoder
> > stops working which causes system memory being disabled and further a
> > system hang during HDM decoder initialization, typically when a CXL
> > enabled kernel boots.
> >
> > Prevent a system hang and do not disable the HDM decoder if the
> > decoder's CXL range is not found in a CFMWS window.
> >
> > Note the change only fixes a hardware hang, but does not implement
> > HPA/SPA translation. Support for this can be added in a follow on
> > patch series.
> >
> > Signed-off-by: Robert Richter <[email protected]>
> > ---
> > drivers/cxl/core/pci.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index a0e7ed5ae25f..18616ca873e5 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > }
> >
> > if (!allowed) {
> > - cxl_set_mem_enable(cxlds, 0);
> > - info->mem_enabled = 0;
> > + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> > + return -ENXIO;
> > }
>
> This looks good to me.

Thanks, Dan

2024-02-16 22:05:04

by Robert Richter

Subject: Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

On 16.02.24 19:09:30, Robert Richter wrote:
> On 16.02.24 10:02:51, Dan Williams wrote:
> > Robert Richter wrote:
> > > The Linux CXL subsystem is built on the assumption that HPA == SPA.
> > > That is, the host physical address (HPA) the HDM decoder registers are
> > > programmed with are system physical addresses (SPA).
> > >
> > > During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> > > 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> > > a HPA window that is described in a CFMWS structure of the CXL host
> > > bridge (cxl-3.1, 9.18.1.3).
> > >
> > > Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> > > window and the CXL memory range will be disabled then. The HDM decoder
> > > stops working which causes system memory being disabled and further a
> > > system hang during HDM decoder initialization, typically when a CXL
> > > enabled kernel boots.
> > >
> > > Prevent a system hang and do not disable the HDM decoder if the
> > > decoder's CXL range is not found in a CFMWS window.
> > >
> > > Note the change only fixes a hardware hang, but does not implement
> > > HPA/SPA translation. Support for this can be added in a follow on
> > > patch series.
> > >

Fixes: 9de321e93c3b ("cxl/pci: Refactor cxl_hdm_decode_init()")
Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges")
Cc: [email protected]

Sorry, I forgot those tags, please add.

Thanks,

-Robert

> > > Signed-off-by: Robert Richter <[email protected]>
> > > ---
> > > drivers/cxl/core/pci.c | 4 ++--
> > > 1 file changed, 2 insertions(+), 2 deletions(-)

2024-02-17 03:07:21

by Dan Williams

Subject: Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Robert Richter wrote:
> On 16.02.24 19:09:30, Robert Richter wrote:
> > On 16.02.24 10:02:51, Dan Williams wrote:
> > > Robert Richter wrote:
> > > > The Linux CXL subsystem is built on the assumption that HPA == SPA.
> > > > That is, the host physical address (HPA) the HDM decoder registers are
> > > > programmed with are system physical addresses (SPA).
> > > >
> > > > During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> > > > 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> > > > a HPA window that is described in a CFMWS structure of the CXL host
> > > > bridge (cxl-3.1, 9.18.1.3).
> > > >
> > > > Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> > > > window and the CXL memory range will be disabled then. The HDM decoder
> > > > stops working which causes system memory being disabled and further a
> > > > system hang during HDM decoder initialization, typically when a CXL
> > > > enabled kernel boots.
> > > >
> > > > Prevent a system hang and do not disable the HDM decoder if the
> > > > decoder's CXL range is not found in a CFMWS window.
> > > >
> > > > Note the change only fixes a hardware hang, but does not implement
> > > > HPA/SPA translation. Support for this can be added in a follow on
> > > > patch series.
> > > >
>
> Fixes: 9de321e93c3b ("cxl/pci: Refactor cxl_hdm_decode_init()")

This patch just moves the memory-disable call from one place to another.

> Fixes: 34e37b4c432c ("cxl/port: Enable HDM Capability after validating DVSEC Ranges")

This is the proper Fixes tag.

> Cc: [email protected]

Added.

2024-02-17 04:23:16

by Dan Williams

Subject: Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Robert Richter wrote:
> The Linux CXL subsystem is built on the assumption that HPA == SPA.
> That is, the host physical address (HPA) the HDM decoder registers are
> programmed with are system physical addresses (SPA).
>
> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> a HPA window that is described in a CFMWS structure of the CXL host
> bridge (cxl-3.1, 9.18.1.3).
>
> Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> window and the CXL memory range will be disabled then. The HDM decoder
> stops working which causes system memory being disabled and further a
> system hang during HDM decoder initialization, typically when a CXL
> enabled kernel boots.
>
> Prevent a system hang and do not disable the HDM decoder if the
> decoder's CXL range is not found in a CFMWS window.
>
> Note the change only fixes a hardware hang, but does not implement
> HPA/SPA translation. Support for this can be added in a follow on
> patch series.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> drivers/cxl/core/pci.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a0e7ed5ae25f..18616ca873e5 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> }
>
> if (!allowed) {
> - cxl_set_mem_enable(cxlds, 0);
> - info->mem_enabled = 0;
> + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> + return -ENXIO;

While testing I found this needs the following fixup:

diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
index e24ffae8135f..e9e6c81ce034 100644
--- a/drivers/cxl/core/pci.c
+++ b/drivers/cxl/core/pci.c
@@ -477,7 +477,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
allowed++;
}

- if (!allowed) {
+ if (!allowed && info->mem_enabled) {
dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
return -ENXIO;
}


...i.e. Linux should only give up if it does not understand an active
decode region.

Now this SPA/HPA mismatch will still cause problems later in region
creation flow, but that's a separate issue.
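
The effect of the fixup can be summarized as a small decision function. The
helper below is a hypothetical stand-in for the condition shown in the diff,
written as a user-space sketch; validate_decode() is not a kernel function,
and the two parameters only model the "allowed" counter and
info->mem_enabled.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * Returns 0 if initialization may continue, or -ENXIO if the device is
 * actively decoding (mem_enabled) but none of its ranges matched a
 * platform window (allowed == 0).
 */
static int validate_decode(int allowed, bool mem_enabled)
{
	if (!allowed && mem_enabled)
		return -ENXIO;

	return 0;
}
```

With the original v2 condition, the (allowed == 0, mem_enabled == false) case
would also have returned -ENXIO; the fixup lets that case proceed because
there is no active decode to misunderstand.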

2024-02-17 21:27:58

by Robert Richter

Subject: Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

On 16.02.24 20:22:58, Dan Williams wrote:
> Robert Richter wrote:
> > The Linux CXL subsystem is built on the assumption that HPA == SPA.
> > That is, the host physical address (HPA) the HDM decoder registers are
> > programmed with are system physical addresses (SPA).
> >
> > During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> > 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> > a HPA window that is described in a CFMWS structure of the CXL host
> > bridge (cxl-3.1, 9.18.1.3).
> >
> > Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> > window and the CXL memory range will be disabled then. The HDM decoder
> > stops working which causes system memory being disabled and further a
> > system hang during HDM decoder initialization, typically when a CXL
> > enabled kernel boots.
> >
> > Prevent a system hang and do not disable the HDM decoder if the
> > decoder's CXL range is not found in a CFMWS window.
> >
> > Note the change only fixes a hardware hang, but does not implement
> > HPA/SPA translation. Support for this can be added in a follow on
> > patch series.
> >
> > Signed-off-by: Robert Richter <[email protected]>
> > ---
> > drivers/cxl/core/pci.c | 4 ++--
> > 1 file changed, 2 insertions(+), 2 deletions(-)
> >
> > diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> > index a0e7ed5ae25f..18616ca873e5 100644
> > --- a/drivers/cxl/core/pci.c
> > +++ b/drivers/cxl/core/pci.c
> > @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> > }
> >
> > if (!allowed) {
> > - cxl_set_mem_enable(cxlds, 0);
> > - info->mem_enabled = 0;
> > + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> > + return -ENXIO;
>
> While testing I found this needs the following fixup:
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index e24ffae8135f..e9e6c81ce034 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -477,7 +477,7 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> allowed++;
> }
>
> - if (!allowed) {
> + if (!allowed && info->mem_enabled) {
> dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> return -ENXIO;
> }

The change looks correct to me, thanks for fixing.

-Robert

>
>
> ...i.e. Linux should only give up if it does not understand an active
> decode region.
>
> Now this SPA/HPA mismatch will still cause problems later in region
> creation flow, but that's a separate issue.

2024-03-22 03:17:18

by Li Zhijian

Subject: [Problem ?] cxl list show nothing after reboot Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Robert, Dan

I noticed that 'cxl list' shows nothing after a reboot in v6.8 (a fresh boot works).
Git bisection points to this commit.

I haven't investigated it deeply yet; I'm wondering whether it's a QEMU problem or
something wrong with this patch.


Steps to reproduce:

1. Start a cxl QEMU VM
2. cxl list works
cxl list
[
{
"memdev":"mem0",
"ram_size":2147483648,
"serial":0,
"host":"0000:54:00.0"
},
{
"memdev":"mem1",
"pmem_size":2147483648,
"serial":0,
"host":"0000:36:00.0"
}
]

3. reboot VM
4. cxl list shows nothing and dmesg has the following

cxl list
[
]
Warning: no matching devices found

...

[ 6.249188] pci0000:53: host supports CXL
[ 6.258168] pci0000:35: host supports CXL
[ 6.490568] cxl_pci 0000:54:00.0: Range register decodes outside platform defined CXL ranges.
[ 6.494298] cxl_mem mem0: endpoint3 failed probe
[ 6.506072] cxl_pci 0000:36:00.0: Range register decodes outside platform defined CXL ranges.
[ 6.515092] cxl_mem mem1: endpoint3 failed probe
[ 12.181188] kauditd_printk_skb: 18 callbacks suppressed


Thanks
Zhijian


On 17/02/2024 00:01, Robert Richter wrote:
> The Linux CXL subsystem is built on the assumption that HPA == SPA.
> That is, the host physical address (HPA) the HDM decoder registers are
> programmed with are system physical addresses (SPA).
>
> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> a HPA window that is described in a CFMWS structure of the CXL host
> bridge (cxl-3.1, 9.18.1.3).
>
> Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> window and the CXL memory range will be disabled then. The HDM decoder
> stops working which causes system memory being disabled and further a
> system hang during HDM decoder initialization, typically when a CXL
> enabled kernel boots.
>
> Prevent a system hang and do not disable the HDM decoder if the
> decoder's CXL range is not found in a CFMWS window.
>
> Note the change only fixes a hardware hang, but does not implement
> HPA/SPA translation. Support for this can be added in a follow on
> patch series.
>
> Signed-off-by: Robert Richter <[email protected]>
> ---
> drivers/cxl/core/pci.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> index a0e7ed5ae25f..18616ca873e5 100644
> --- a/drivers/cxl/core/pci.c
> +++ b/drivers/cxl/core/pci.c
> @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> }
>
> if (!allowed) {
> - cxl_set_mem_enable(cxlds, 0);
> - info->mem_enabled = 0;
> + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> + return -ENXIO;
> }
>
> /*

2024-03-26 08:27:53

by Li Zhijian

Subject: Re: [Problem ?] cxl list show nothing after reboot Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Hi all,

In order to make the CXL memdev work again, I had to modify the QEMU side
so that it resets the "DVSEC CXL Control" register during reboot. A draft change is below.

Per 8.1.3.2 DVSEC CXL Control (Offset 0Ch), the default value of BIT(2) is 0. So is it
reasonable to reset the DVSECs in QEMU during reboot?

Any comments, @Jonathan?


[root@iaas-rpma qemu]# git diff
diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
index b0a7e9f11b64..31755a9f9aab 100644
--- a/hw/mem/cxl_type3.c
+++ b/hw/mem/cxl_type3.c
@@ -899,6 +899,26 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
return address_space_write(as, dpa_offset, attrs, &data, size);
}

+static void dvsecs_ctrl_reset(CXLType3Dev *ct3d)
+{
+
+ if (ct3d->sn != UI64_NULL) {
+ pcie_dev_ser_num_init(ct3d->cxl_cstate.pdev, 0x100, ct3d->sn);
+ ct3d->cxl_cstate.dvsec_offset = 0x100 + 0x0c;
+ } else {
+ ct3d->cxl_cstate.dvsec_offset = 0x100;
+ }
+
+ // FIXME?: only reset ctrl instead of rebuilding the whole dvsecs
+#if 0
+ memcpy(pdev->config + offset + sizeof(DVSECHeader),
+ body + sizeof(DVSECHeader),
+ length - sizeof(DVSECHeader));
+#else
+ build_dvsecs(ct3d);
+#endif
+}
+
static void ct3d_reset(DeviceState *dev)
{
CXLType3Dev *ct3d = CXL_TYPE3(dev);
@@ -907,6 +927,7 @@ static void ct3d_reset(DeviceState *dev)

cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
cxl_device_register_init_t3(ct3d);
+ dvsecs_ctrl_reset(ct3d);

/*
* Bring up an endpoint to target with MCTP over VDM.
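
The FIXME in the draft asks whether only the control register could be reset
instead of rebuilding every DVSEC. A minimal sketch of that narrower reset is
below. It is a user-space model, not QEMU code: the byte array stands in for
emulated config space, all helper names are hypothetical, and it assumes
BIT(2) of DVSEC CXL Control (offset 0Ch) is the bit the kernel check reads as
"memory enabled". A full reset would also need to restore the other fields'
defaults, which this sketch does not attempt.

```c
#include <stddef.h>
#include <stdint.h>

/* Offset of DVSEC CXL Control within the DVSEC (0Ch, as cited in the mail). */
#define DVSEC_CTRL_OFFSET	0x0c
/* BIT(2) of that register; assumed here to be the memory-enable bit. */
#define DVSEC_CTRL_BIT2		(1u << 2)

/* Read a little-endian 16-bit value from the modeled config space. */
static uint16_t cfg_read16(const uint8_t *cfg, size_t off)
{
	return (uint16_t)(cfg[off] | ((uint16_t)cfg[off + 1] << 8));
}

/* Write a little-endian 16-bit value to the modeled config space. */
static void cfg_write16(uint8_t *cfg, size_t off, uint16_t val)
{
	cfg[off] = (uint8_t)(val & 0xff);
	cfg[off + 1] = (uint8_t)(val >> 8);
}

/* Clear only BIT(2) of the control register, leaving other bits untouched. */
static void dvsec_ctrl_clear_bit2(uint8_t *cfg, size_t dvsec_base)
{
	size_t off = dvsec_base + DVSEC_CTRL_OFFSET;
	uint16_t ctrl = cfg_read16(cfg, off);

	cfg_write16(cfg, off, (uint16_t)(ctrl & ~DVSEC_CTRL_BIT2));
}
```

The dvsec_base argument corresponds to the 0x100 or 0x100 + 0x0c offset that
the draft computes depending on whether a serial number capability is present.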





On 22/03/2024 11:15, Zhijian Li (Fujitsu) wrote:
> Robert, Dan
>
> It's noticed that 'cxl list' show nothing after a reboot in v6.8.(A fresh boot works)
> The git bisection pointed to this commit.
>
> Haven't investigated it deeply, I'm wondering if it's a QEMU problem or
> something wrong with this patch.
>
>
> Reproduce step:
>
> 1. Start a cxl QEMU VM
> 2. cxl list works
> cxl list
> [
> {
> "memdev":"mem0",
> "ram_size":2147483648,
> "serial":0,
> "host":"0000:54:00.0"
> },
> {
> "memdev":"mem1",
> "pmem_size":2147483648,
> "serial":0,
> "host":"0000:36:00.0"
> }
> ]
>
> 3. reboot VM
> 4. cxl list show nothing and has following dmesg
>
> cxl list
> [
> ]
> Warning: no matching devices found
>
> ...
>
> [ 6.249188] pci0000:53: host supports CXL
> [ 6.258168] pci0000:35: host supports CXL
> [ 6.490568] cxl_pci 0000:54:00.0: Range register decodes outside platform defined CXL ranges.
> [ 6.494298] cxl_mem mem0: endpoint3 failed probe
> [ 6.506072] cxl_pci 0000:36:00.0: Range register decodes outside platform defined CXL ranges.
> [ 6.515092] cxl_mem mem1: endpoint3 failed probe
> [ 12.181188] kauditd_printk_skb: 18 callbacks suppressed
>
>
> Thanks
> Zhijian
>
>
> On 17/02/2024 00:01, Robert Richter wrote:
>> The Linux CXL subsystem is built on the assumption that HPA == SPA.
>> That is, the host physical address (HPA) the HDM decoder registers are
>> programmed with are system physical addresses (SPA).
>>
>> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
>> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
>> a HPA window that is described in a CFMWS structure of the CXL host
>> bridge (cxl-3.1, 9.18.1.3).
>>
>> Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
>> window and the CXL memory range will be disabled then. The HDM decoder
>> stops working which causes system memory being disabled and further a
>> system hang during HDM decoder initialization, typically when a CXL
>> enabled kernel boots.
>>
>> Prevent a system hang and do not disable the HDM decoder if the
>> decoder's CXL range is not found in a CFMWS window.
>>
>> Note the change only fixes a hardware hang, but does not implement
>> HPA/SPA translation. Support for this can be added in a follow on
>> patch series.
>>
>> Signed-off-by: Robert Richter <[email protected]>
>> ---
>> drivers/cxl/core/pci.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>> index a0e7ed5ae25f..18616ca873e5 100644
>> --- a/drivers/cxl/core/pci.c
>> +++ b/drivers/cxl/core/pci.c
>> @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>> }
>>
>> if (!allowed) {
>> - cxl_set_mem_enable(cxlds, 0);
>> - info->mem_enabled = 0;
>> + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
>> + return -ENXIO;
>> }
>>
>> /*

2024-04-05 16:57:27

by Jonathan Cameron

Subject: Re: [Problem ?] cxl list show nothing after reboot Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

On Tue, 26 Mar 2024 08:26:21 +0000
"Zhijian Li (Fujitsu)" <[email protected]> wrote:

> All guys,
>
> In order to make the CXL memdev work again, i have to modify the QEMU side
> where it resets the "DVSEC CXL Control" during reboot. A draft changes is as below:
>
> Per 8.1.3.2 DVSEC CXL Control (Offset 0Ch), Default value of BIT(2) is 0. So is it reasonable
> to have a reset dvsecs in QEMU during reboot?
>
> Any comments @Janathan

Hi,

Sorry it took me so long to get to this.

What are you attempting to do? Use an OS reboot on QEMU to check that the flows
meant for BIOS configuration work - i.e. the OS rebuilds the state
correctly by reading the current state of the devices?

Would be good to fix that case but I want to check that's the aim before looking
too closely at this.

Thanks,

Jonathan

>
>
> [root@iaas-rpma qemu]# git diff
> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
> index b0a7e9f11b64..31755a9f9aab 100644
> --- a/hw/mem/cxl_type3.c
> +++ b/hw/mem/cxl_type3.c
> @@ -899,6 +899,26 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
> return address_space_write(as, dpa_offset, attrs, &data, size);
> }
>
> +static void dvsecs_ctrl_reset(CXLType3Dev *ct3d)
> +{
> +
> + if (ct3d->sn != UI64_NULL) {
> + pcie_dev_ser_num_init(ct3d->cxl_cstate.pdev, 0x100, ct3d->sn);
> + ct3d->cxl_cstate.dvsec_offset = 0x100 + 0x0c;
> + } else {
> + ct3d->cxl_cstate.dvsec_offset = 0x100;
> + }
> +
> + // FIXME?: only reset ctrl instead of rebuilding the whole dvsecs
> +#if 0
> + memcpy(pdev->config + offset + sizeof(DVSECHeader),
> + body + sizeof(DVSECHeader),
> + length - sizeof(DVSECHeader));
> +#else
> + build_dvsecs(ct3d);
> +#endif
> +}
> +
> static void ct3d_reset(DeviceState *dev)
> {
> CXLType3Dev *ct3d = CXL_TYPE3(dev);
> @@ -907,6 +927,7 @@ static void ct3d_reset(DeviceState *dev)
>
> cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
> cxl_device_register_init_t3(ct3d);
> + dvsecs_ctrl_reset(ct3d);
>
> /*
> * Bring up an endpoint to target with MCTP over VDM.
>
>
>
>
>
> On 22/03/2024 11:15, Zhijian Li (Fujitsu) wrote:
> > Robert, Dan
> >
> > It's noticed that 'cxl list' show nothing after a reboot in v6.8.(A fresh boot works)
> > The git bisection pointed to this commit.
> >
> > Haven't investigated it deeply, I'm wondering if it's a QEMU problem or
> > something wrong with this patch.
> >
> >
> > Reproduce step:
> >
> > 1. Start a cxl QEMU VM
> > 2. cxl list works
> > cxl list
> > [
> > {
> > "memdev":"mem0",
> > "ram_size":2147483648,
> > "serial":0,
> > "host":"0000:54:00.0"
> > },
> > {
> > "memdev":"mem1",
> > "pmem_size":2147483648,
> > "serial":0,
> > "host":"0000:36:00.0"
> > }
> > ]
> >
> > 3. reboot VM
> > 4. cxl list show nothing and has following dmesg
> >
> > cxl list
> > [
> > ]
> > Warning: no matching devices found
> >
> > ...
> >
> > [ 6.249188] pci0000:53: host supports CXL
> > [ 6.258168] pci0000:35: host supports CXL
> > [ 6.490568] cxl_pci 0000:54:00.0: Range register decodes outside platform defined CXL ranges.
> > [ 6.494298] cxl_mem mem0: endpoint3 failed probe
> > [ 6.506072] cxl_pci 0000:36:00.0: Range register decodes outside platform defined CXL ranges.
> > [ 6.515092] cxl_mem mem1: endpoint3 failed probe
> > [ 12.181188] kauditd_printk_skb: 18 callbacks suppressed
> >
> >
> > Thanks
> > Zhijian
> >
> >
> > On 17/02/2024 00:01, Robert Richter wrote:
> >> The Linux CXL subsystem is built on the assumption that HPA == SPA.
> >> That is, the host physical address (HPA) the HDM decoder registers are
> >> programmed with are system physical addresses (SPA).
> >>
> >> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
> >> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
> >> a HPA window that is described in a CFMWS structure of the CXL host
> >> bridge (cxl-3.1, 9.18.1.3).
> >>
> >> Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
> >> window and the CXL memory range will be disabled then. The HDM decoder
> >> stops working which causes system memory being disabled and further a
> >> system hang during HDM decoder initialization, typically when a CXL
> >> enabled kernel boots.
> >>
> >> Prevent a system hang and do not disable the HDM decoder if the
> >> decoder's CXL range is not found in a CFMWS window.
> >>
> >> Note the change only fixes a hardware hang, but does not implement
> >> HPA/SPA translation. Support for this can be added in a follow on
> >> patch series.
> >>
> >> Signed-off-by: Robert Richter <[email protected]>
> >> ---
> >> drivers/cxl/core/pci.c | 4 ++--
> >> 1 file changed, 2 insertions(+), 2 deletions(-)
> >>
> >> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
> >> index a0e7ed5ae25f..18616ca873e5 100644
> >> --- a/drivers/cxl/core/pci.c
> >> +++ b/drivers/cxl/core/pci.c
> >> @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
> >> }
> >>
> >> if (!allowed) {
> >> - cxl_set_mem_enable(cxlds, 0);
> >> - info->mem_enabled = 0;
> >> + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
> >> + return -ENXIO;
> >> }
> >>
> >> /


2024-04-07 03:53:12

by Li Zhijian

Subject: Re: [Problem ?] cxl list show nothing after reboot Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window



On 06/04/2024 00:57, Jonathan Cameron wrote:
> On Tue, 26 Mar 2024 08:26:21 +0000
> "Zhijian Li (Fujitsu)" <[email protected]> wrote:
>
>> All guys,
>>
>> In order to make the CXL memdev work again, i have to modify the QEMU side
>> where it resets the "DVSEC CXL Control" during reboot. A draft changes is as below:
>>
>> Per 8.1.3.2 DVSEC CXL Control (Offset 0Ch), Default value of BIT(2) is 0. So is it reasonable
>> to have a reset dvsecs in QEMU during reboot?
>>
>> Any comments @Janathan
>
> Hi,
>
> Sorry it took me so long to get to this.
>
> What are you attempting to do? Use an OS reboot on QEMU to check that the flows
> meant for BIOS configuration work -


There is no doubt that rebuilding the state correctly is the OS's responsibility.
Providing a consistent device state is the *Device*'s responsibility.

So on reboot, the device should present the same state as on a fresh boot.
My changes are intended to let the *Device* emulated by QEMU provide that
consistent state.


Thanks
Zhijian

> i.e. the OS rebuilds the state
> correctly by reading the current state of the devices?
>
> Would be good to fix that case but I want to check that's the aim before looking
> too closely at this.
>
> Thanks,
>
> Jonathan
>
>>
>>
>> [root@iaas-rpma qemu]# git diff
>> diff --git a/hw/mem/cxl_type3.c b/hw/mem/cxl_type3.c
>> index b0a7e9f11b64..31755a9f9aab 100644
>> --- a/hw/mem/cxl_type3.c
>> +++ b/hw/mem/cxl_type3.c
>> @@ -899,6 +899,26 @@ MemTxResult cxl_type3_write(PCIDevice *d, hwaddr host_addr, uint64_t data,
>> return address_space_write(as, dpa_offset, attrs, &data, size);
>> }
>>
>> +static void dvsecs_ctrl_reset(CXLType3Dev *ct3d)
>> +{
>> +
>> + if (ct3d->sn != UI64_NULL) {
>> + pcie_dev_ser_num_init(ct3d->cxl_cstate.pdev, 0x100, ct3d->sn);
>> + ct3d->cxl_cstate.dvsec_offset = 0x100 + 0x0c;
>> + } else {
>> + ct3d->cxl_cstate.dvsec_offset = 0x100;
>> + }
>> +
>> + // FIXME?: only reset ctrl instead of rebuilding the whole dvsecs
>> +#if 0
>> + memcpy(pdev->config + offset + sizeof(DVSECHeader),
>> + body + sizeof(DVSECHeader),
>> + length - sizeof(DVSECHeader));
>> +#else
>> + build_dvsecs(ct3d);
>> +#endif
>> +}
>> +
>> static void ct3d_reset(DeviceState *dev)
>> {
>> CXLType3Dev *ct3d = CXL_TYPE3(dev);
>> @@ -907,6 +927,7 @@ static void ct3d_reset(DeviceState *dev)
>>
>> cxl_component_register_init_common(reg_state, write_msk, CXL2_TYPE3_DEVICE);
>> cxl_device_register_init_t3(ct3d);
>> + dvsecs_ctrl_reset(ct3d);
>>
>> /*
>> * Bring up an endpoint to target with MCTP over VDM.
>>
>>
>>
>>
>>
>> On 22/03/2024 11:15, Zhijian Li (Fujitsu) wrote:
>>> Robert, Dan
>>>
>>> It's noticed that 'cxl list' show nothing after a reboot in v6.8.(A fresh boot works)
>>> The git bisection pointed to this commit.
>>>
>>> Haven't investigated it deeply, I'm wondering if it's a QEMU problem or
>>> something wrong with this patch.
>>>
>>>
>>> Reproduce step:
>>>
>>> 1. Start a cxl QEMU VM
>>> 2. cxl list works
>>> cxl list
>>> [
>>> {
>>> "memdev":"mem0",
>>> "ram_size":2147483648,
>>> "serial":0,
>>> "host":"0000:54:00.0"
>>> },
>>> {
>>> "memdev":"mem1",
>>> "pmem_size":2147483648,
>>> "serial":0,
>>> "host":"0000:36:00.0"
>>> }
>>> ]
>>>
>>> 3. reboot VM
>>> 4. cxl list show nothing and has following dmesg
>>>
>>> cxl list
>>> [
>>> ]
>>> Warning: no matching devices found
>>>
>>> ...
>>>
>>> [ 6.249188] pci0000:53: host supports CXL
>>> [ 6.258168] pci0000:35: host supports CXL
>>> [ 6.490568] cxl_pci 0000:54:00.0: Range register decodes outside platform defined CXL ranges.
>>> [ 6.494298] cxl_mem mem0: endpoint3 failed probe
>>> [ 6.506072] cxl_pci 0000:36:00.0: Range register decodes outside platform defined CXL ranges.
>>> [ 6.515092] cxl_mem mem1: endpoint3 failed probe
>>> [ 12.181188] kauditd_printk_skb: 18 callbacks suppressed
>>>
>>>
>>> Thanks
>>> Zhijian
>>>
>>>
>>> On 17/02/2024 00:01, Robert Richter wrote:
>>>> The Linux CXL subsystem is built on the assumption that HPA == SPA.
>>>> That is, the host physical address (HPA) the HDM decoder registers are
>>>> programmed with are system physical addresses (SPA).
>>>>
>>>> During HDM decoder setup, the DVSEC CXL range registers (cxl-3.1,
>>>> 8.1.3.8) are checked if the memory is enabled and the CXL range is in
>>>> a HPA window that is described in a CFMWS structure of the CXL host
>>>> bridge (cxl-3.1, 9.18.1.3).
>>>>
>>>> Now, if the HPA is not an SPA, the CXL range does not match a CFMWS
>>>> window and the CXL memory range will be disabled then. The HDM decoder
>>>> stops working which causes system memory being disabled and further a
>>>> system hang during HDM decoder initialization, typically when a CXL
>>>> enabled kernel boots.
>>>>
>>>> Prevent a system hang and do not disable the HDM decoder if the
>>>> decoder's CXL range is not found in a CFMWS window.
>>>>
>>>> Note the change only fixes a hardware hang, but does not implement
>>>> HPA/SPA translation. Support for this can be added in a follow on
>>>> patch series.
>>>>
>>>> Signed-off-by: Robert Richter <[email protected]>
>>>> ---
>>>> drivers/cxl/core/pci.c | 4 ++--
>>>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>>>
>>>> diff --git a/drivers/cxl/core/pci.c b/drivers/cxl/core/pci.c
>>>> index a0e7ed5ae25f..18616ca873e5 100644
>>>> --- a/drivers/cxl/core/pci.c
>>>> +++ b/drivers/cxl/core/pci.c
>>>> @@ -478,8 +478,8 @@ int cxl_hdm_decode_init(struct cxl_dev_state *cxlds, struct cxl_hdm *cxlhdm,
>>>> }
>>>>
>>>> if (!allowed) {
>>>> - cxl_set_mem_enable(cxlds, 0);
>>>> - info->mem_enabled = 0;
>>>> + dev_err(dev, "Range register decodes outside platform defined CXL ranges.\n");
>>>> + return -ENXIO;
>>>> }
>>>>
>>>> /
>

2024-04-08 23:14:43

by Dan Williams

[permalink] [raw]
Subject: Re: [Problem ?] cxl list show nothing after reboot Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window

Zhijian Li (Fujitsu) wrote:
>
>
> On 06/04/2024 00:57, Jonathan Cameron wrote:
> > On Tue, 26 Mar 2024 08:26:21 +0000
> > "Zhijian Li (Fujitsu)" <[email protected]> wrote:
> >
> >> Hi all,
> >>
> >> In order to make the CXL memdev work again, I have to modify the QEMU side
> >> so that it resets the "DVSEC CXL Control" during reboot. A draft change is below:
> >>
> >> Per 8.1.3.2 DVSEC CXL Control (Offset 0Ch), the default value of BIT(2) is 0. So is it reasonable
> >> to reset the DVSECs in QEMU during reboot?
> >>
> >> Any comments, @Jonathan?
> >
> > Hi,
> >
> > Sorry it took me so long to get to this.
> >
> > What are you attempting to do? Use an OS reboot on QEMU to check that the flows
> > meant for BIOS configuration work -
>
>
> > There is no doubt that *rebuilding the state correctly* is the OS's responsibility.
> > Providing a consistent device state is the *Device*'s responsibility.
> >
> > So on reboot, the device should have the same state as after a fresh boot.
> > My changes are intended to let the *Device* emulated by QEMU provide a consistent
> > device state.

Why? Typically the QEMU CXL enabling is for basic checkout not for
real-world fidelity. If QEMU reboots do not result in restoring the same
device configuration as re-launching QEMU, why is that worth fixing?
Just document it as a quirk. Now, if it is a simple fix, great, but it
seems low priority given the enabling is really only useful for kernel
development and relaunching QEMU is expected.

2024-04-09 06:55:11

by Li Zhijian

[permalink] [raw]
Subject: Re: [Problem ?] cxl list show nothing after reboot Re: [PATCH v2] cxl/pci: Fix disabling memory if DVSEC CXL Range does not match a CFMWS window



On 09/04/2024 07:14, Dan Williams wrote:
> Zhijian Li (Fujitsu) wrote:
>>
>>
>> On 06/04/2024 00:57, Jonathan Cameron wrote:
>>> On Tue, 26 Mar 2024 08:26:21 +0000
>>> "Zhijian Li (Fujitsu)" <[email protected]> wrote:
>>>
>>>> Hi all,
>>>>
>>>> In order to make the CXL memdev work again, I have to modify the QEMU side
>>>> so that it resets the "DVSEC CXL Control" during reboot. A draft change is below:
>>>>
>>>> Per 8.1.3.2 DVSEC CXL Control (Offset 0Ch), the default value of BIT(2) is 0. So is it reasonable
>>>> to reset the DVSECs in QEMU during reboot?
>>>>
>>>> Any comments, @Jonathan?
>>>
>>> Hi,
>>>
>>> Sorry it took me so long to get to this.
>>>
>>> What are you attempting to do? Use an OS reboot on QEMU to check that the flows
>>> meant for BIOS configuration work -
>>
>>
>> There is no doubt that *rebuilding the state correctly* is the OS's responsibility.
>> Providing a consistent device state is the *Device*'s responsibility.
>>
>> So on reboot, the device should have the same state as after a fresh boot.
>> My changes are intended to let the *Device* emulated by QEMU provide a consistent
>> device state.
>
> Why? Typically the QEMU CXL enabling is for basic checkout not for
> real-world fidelity. If QEMU reboots do not result in restoring the same
> device configuration as re-launching QEMU,

This was confirmed to be true in current QEMU, so we should fix QEMU [1].

> why is that worth fixing?
> Just document it as a quirk. Now, if it is a simple fix, great, but it
> seems low priority given the enabling is really only useful for kernel
> development and relaunching QEMU is expected.

Personally, I think QEMU deserves a better solution than relaunching.


[1] https://lore.kernel.org/all/[email protected]/
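
For illustration only, the reset behaviour discussed above can be sketched in plain C. This is not the QEMU change referenced in [1]; it is a minimal model that assumes only what the thread states: the DVSEC CXL Control register sits at offset 0Ch and BIT(2) defaults to 0. The struct and all fake_dvsec_* names are hypothetical, and a real reset would restore every register default, not just this one bit.

```c
#include <assert.h>
#include <stdint.h>

/* Values cited in the thread: 8.1.3.2 DVSEC CXL Control, offset 0Ch, BIT(2). */
#define DVSEC_CXL_CTRL_OFFSET  0x0c
#define DVSEC_CTRL_MEM_ENABLE  (1u << 2)

/* Hypothetical stand-in for an emulated device's DVSEC register block. */
struct fake_dvsec {
	uint8_t regs[0x40];
};

/* Read the 16-bit control register (little-endian, as in PCI config space). */
uint16_t fake_dvsec_read_ctrl(const struct fake_dvsec *d)
{
	return (uint16_t)(d->regs[DVSEC_CXL_CTRL_OFFSET] |
			  (d->regs[DVSEC_CXL_CTRL_OFFSET + 1] << 8));
}

/* Write the 16-bit control register. */
void fake_dvsec_write_ctrl(struct fake_dvsec *d, uint16_t val)
{
	d->regs[DVSEC_CXL_CTRL_OFFSET] = (uint8_t)(val & 0xff);
	d->regs[DVSEC_CXL_CTRL_OFFSET + 1] = (uint8_t)(val >> 8);
}

/*
 * What a reset hook would need to do for this one bit: put BIT(2) back
 * to its default of 0 so that a rebooted guest sees the same state as
 * after a fresh launch. Other bits are left untouched in this sketch.
 */
void fake_dvsec_reset(struct fake_dvsec *d)
{
	uint16_t ctrl = fake_dvsec_read_ctrl(d);

	fake_dvsec_write_ctrl(d, ctrl & (uint16_t)~DVSEC_CTRL_MEM_ENABLE);
}

/* Usage: the guest sets BIT(2) during the first boot, then reboots. */
void fake_dvsec_reboot_example(void)
{
	struct fake_dvsec d = { { 0 } };

	fake_dvsec_write_ctrl(&d, DVSEC_CTRL_MEM_ENABLE);
	fake_dvsec_reset(&d);
	assert((fake_dvsec_read_ctrl(&d) & DVSEC_CTRL_MEM_ENABLE) == 0);
}
```

Without such a reset, the bit set during the first boot is still set when the kernel probes again after the reboot, which is the situation reported above.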