2023-05-19 02:37:22

by Davidlohr Bueso

Subject: Re: CXL memory device not created correctly

On Fri, 19 May 2023, LiuLele wrote:

>In my testing CXL device /sys/bus/cxl/devices/mem0 not created, and the get error messages :
>
>```
>cxl_pci 0000:0d:00.0: Failed to get interrupt for event Info log
>```
>
>My test environment is a qemu CXL emulator with qemu v8.0.0, Linux kernel v6.3.0.
>While with kernel 5.9.13, /sys/bus/cxl/devices/mem0 can be created.

Yes, this can be annoying, and I would argue the probe should not error out.
Regardless, the actual qemu support is in Jonathan's tree:

https://gitlab.com/jic23/qemu/-/commit/a04e6476df363d1f6bc160577b30dda6564d3f67

Thanks,
Davidlohr
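
For anyone hitting the same symptom, here is a minimal sketch for confirming it from inside the guest. It is only an illustration: the function names are made up for this example, and the regular expression is built from the single log line quoted above, so the wording may need adjusting for other kernel versions.

```python
import os
import re

# Pattern built from the single log line quoted in this thread. The wording
# may differ on other kernel versions, so treat it as an assumption.
EVENT_IRQ_FAILURE = re.compile(
    r"cxl_pci (?P<bdf>[0-9a-fA-F]{4}:[0-9a-fA-F]{2}:[0-9a-fA-F]{2}\.[0-7]): "
    r"Failed to get interrupt for event (?P<log>.+?) log"
)


def find_event_irq_failures(kernel_log):
    """Return (PCI address, event log name) pairs found in kernel log text."""
    return [(m.group("bdf"), m.group("log"))
            for m in EVENT_IRQ_FAILURE.finditer(kernel_log)]


def memdev_present(name="mem0", sysfs_root="/sys/bus/cxl/devices"):
    """Report whether the CXL memdev node exists under sysfs."""
    return os.path.exists(os.path.join(sysfs_root, name))


def diagnose(kernel_log, name="mem0", sysfs_root="/sys/bus/cxl/devices"):
    """Summarize the symptom as a short human-readable string."""
    if memdev_present(name, sysfs_root):
        return f"{name} is present"
    failures = find_event_irq_failures(kernel_log)
    if failures:
        devices = ", ".join(f"{bdf} ({log} log)" for bdf, log in failures)
        return f"{name} is missing; event interrupt setup failed on {devices}"
    return f"{name} is missing; no event interrupt failure found in the log"
```

Feeding the output of `dmesg` to `diagnose()` as a string should be enough to tell this case apart from other reasons for a missing mem0.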


2023-05-19 15:19:35

by Jonathan Cameron

Subject: Re: CXL memory device not created correctly

On Thu, 18 May 2023 18:38:46 -0700
Davidlohr Bueso wrote:

> On Fri, 19 May 2023, LiuLele wrote:
>
> >In my testing CXL device /sys/bus/cxl/devices/mem0 not created, and the get error messages :
> >
> >```
> >cxl_pci 0000:0d:00.0: Failed to get interrupt for event Info log
> >```
> >
> >My test environment is a qemu CXL emulator with qemu v8.0.0, Linux kernel v6.3.0.
> >While with kernel 5.9.13, /sys/bus/cxl/devices/mem0 can be created.
>
> Yes, this can be annoying and would argue the probe should not error out.
> Regardless, the actual qemu support is in Jonathan's tree:
>
> https://gitlab.com/jic23/qemu/-/commit/a04e6476df363d1f6bc160577b30dda6564d3f67

That just failed to make it into an upstream pull request today due to some
bugs in the poison list set that came before it :(

v6 of the poison list and events support are both on the list now and will
hopefully make this QEMU cycle, so they should be in 8.1.

Jonathan


>
> Thanks,
> Davidlohr


2023-05-19 15:31:34

by Ira Weiny

Subject: Re: CXL memory device not created correctly

Davidlohr Bueso wrote:
> On Fri, 19 May 2023, LiuLele wrote:
>
> >In my testing CXL device /sys/bus/cxl/devices/mem0 not created, and the get error messages :
> >
> >```
> >cxl_pci 0000:0d:00.0: Failed to get interrupt for event Info log
> >```
> >
> >My test environment is a qemu CXL emulator with qemu v8.0.0, Linux kernel v6.3.0.
> >While with kernel 5.9.13, /sys/bus/cxl/devices/mem0 can be created.
>
> Yes, this can be annoying and would argue the probe should not error out.

I had to double check. Events are mandatory on devices. On checking
again interrupt support is mandatory as well. So that is why I errored
out here. With real HW this should not be an issue.
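
To make the two positions concrete, here is a toy model of the policy choice. It is not the cxl_pci code: `FakeDevice`, `ProbeError`, `probe`, and the `strict` flag are all hypothetical names used only for illustration, and the sketch models the decision itself, not the order in which the real driver sets things up.

```python
from dataclasses import dataclass


class ProbeError(Exception):
    """Raised when the toy probe refuses to bind the device."""


@dataclass
class FakeDevice:
    # Hypothetical stand-in for an emulated or real CXL device.
    address: str
    has_event_irq: bool


def probe(dev, strict=True):
    """Toy probe: return the feature set the driver would expose."""
    if dev.has_event_irq:
        return {"memdev": True, "events": True}
    if strict:
        # Spec-following policy: a mandatory feature is missing, so fail.
        raise ProbeError(f"{dev.address}: Failed to get interrupt for event log")
    # Lenient policy: keep the memdev but run without event support.
    return {"memdev": True, "events": False}
```

With `strict=True`, a device lacking the event interrupt gets no memdev at all, which matches the symptom reported; `strict=False` is the alternative being argued for.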

> Regardless, the actual qemu support is in Jonathan's tree:
>
> https://gitlab.com/jic23/qemu/-/commit/a04e6476df363d1f6bc160577b30dda6564d3f67

That is the commit you need, but it is probably best to use one of
Jonathan's 'official' branches. It looks like he just pushed a new one today.

https://gitlab.com/jic23/qemu/-/tree/cxl-2023-05-19

I've not run that one yet, so if you have issues, try his previous one; it
is working well for me.

https://gitlab.com/jic23/qemu/-/tree/cxl-2023-04-19

Ira

>
> Thanks,
> Davidlohr



2023-05-19 15:43:25

by Jonathan Cameron

Subject: Re: CXL memory device not created correctly

On Fri, 19 May 2023 08:20:44 -0700
Ira Weiny wrote:

> Davidlohr Bueso wrote:
> > On Fri, 19 May 2023, LiuLele wrote:
> >
> > >In my testing CXL device /sys/bus/cxl/devices/mem0 not created, and the get error messages :
> > >
> > >```
> > >cxl_pci 0000:0d:00.0: Failed to get interrupt for event Info log
> > >```
> > >
> > >My test environment is a qemu CXL emulator with qemu v8.0.0, Linux kernel v6.3.0.
> > >While with kernel 5.9.13, /sys/bus/cxl/devices/mem0 can be created.
> >
> > Yes, this can be annoying and would argue the probe should not error out.
>
> I had to double check. Events are mandatory on devices. On checking
> again interrupt support is mandatory as well. So that is why I errored
> out here. With real HW this should not be an issue.
>
> > Regardless, the actual qemu support is in Jonathan's tree:
> >
> > https://gitlab.com/jic23/qemu/-/commit/a04e6476df363d1f6bc160577b30dda6564d3f67
>
> That is the commit you need but it is probably best to use one of
> Jonathans 'official' branches. Looks like he just pushed a new one today.
>
> https://gitlab.com/jic23/qemu/-/tree/cxl-2023-05-19

Leave that one for now. It was to get the CI tests to run. I need to tidy up
a bit and will announce when I have a clean one...

>
> I've not run that one yet. So if you have issues try his previous one it
> is working well for me.
>
> https://gitlab.com/jic23/qemu/-/tree/cxl-2023-04-19

That one should still be good to go, I think.

Jonathan

>
> Ira
>
> >
> > Thanks,
> > Davidlohr
>
>


2023-05-31 02:33:26

by Luis Chamberlain

Subject: Re: CXL memory device not created correctly

On Fri, May 19, 2023 at 08:20:44AM -0700, Ira Weiny wrote:
> Davidlohr Bueso wrote:
> > On Fri, 19 May 2023, LiuLele wrote:
> >
> > >In my testing CXL device /sys/bus/cxl/devices/mem0 not created, and the get error messages :
> > >
> > >```
> > >cxl_pci 0000:0d:00.0: Failed to get interrupt for event Info log
> > >```
> > >
> > >My test environment is a qemu CXL emulator with qemu v8.0.0, Linux kernel v6.3.0.
> > >While with kernel 5.9.13, /sys/bus/cxl/devices/mem0 can be created.
> >
> > Yes, this can be annoying and would argue the probe should not error out.
>
> I had to double check. Events are mandatory on devices. On checking
> again interrupt support is mandatory as well. So that is why I errored
> out here.

The failure essentially creates a user-visible regression, where
booting an older kernel fixes it. The error message is not friendly
when testing kernels, upgrading, or maintaining test environments. The
only thing I can think of is introducing a new kconfig symbol to make
such cases a bit clearer for now as things get settled.

Otherwise, for testing, this creates a few cycles of just noise, and I'd
imagine it even costs a few developer hours.

Luis

2023-06-01 03:16:38

by Ira Weiny

Subject: Re: CXL memory device not created correctly

Luis Chamberlain wrote:
> On Fri, May 19, 2023 at 08:20:44AM -0700, Ira Weiny wrote:
> > Davidlohr Bueso wrote:
> > > On Fri, 19 May 2023, LiuLele wrote:
> > >
> > > >In my testing CXL device /sys/bus/cxl/devices/mem0 not created, and the get error messages :
> > > >
> > > >```
> > > >cxl_pci 0000:0d:00.0: Failed to get interrupt for event Info log
> > > >```
> > > >
> > > >My test environment is a qemu CXL emulator with qemu v8.0.0, Linux kernel v6.3.0.
> > > >While with kernel 5.9.13, /sys/bus/cxl/devices/mem0 can be created.
> > >
> > > Yes, this can be annoying and would argue the probe should not error out.
> >
> > I had to double check. Events are mandatory on devices. On checking
> > again interrupt support is mandatory as well. So that is why I errored
> > out here.
>
> The failure essentially creates a user visible regression whereas
> booting an older kernel fixes it. It is not a friendly error message
> when testing kernels / upgrading / test environments. The only thing
> I can think of is if a new kconfig symbol is introduced so to make
> such cases a bit more clearer for now as things get settled.

Ah I see now. This is a qemu without the event support. :-/

>
> Otherwise for testing this creates a few cycles of just noise. And I'd
> imagine even a few developer hours.

I don't think the kernel should be changed for following the spec. But I
do sympathize with you. I know Jonathan is working to get the event
support into qemu soon. I've reviewed that series (the patches I did not
author) so I think it will land soon.

Can this be weathered until then?

Ira

2023-06-01 04:22:11

by Davidlohr Bueso

Subject: Re: CXL memory device not created correctly

On Wed, 31 May 2023, Ira Weiny wrote:

>I don't think the kernel should be changed for following the spec.

Agreed. If events are mandatory we just have to bite the bullet.

Thanks,
Davidlohr