2011-06-06 09:09:33

by padmanabh ratnakar

Subject: Seeing DMAR errors after multiple load/unload with SR-IOV

Hi,
I am using Linux kernel 2.6.39 on an IBM x3650 M3 system.
I booted with the following options:
intel_iommu=on iommu=pt

I was repeatedly loading and unloading my NIC driver (be2net) with num_vfs=7.

After some iterations I get the following DMAR errors:
Jun 4 03:50:20 rhel6 kernel: Uhhuh. NMI received for unknown reason 2d on CPU 0.
Jun 4 03:50:20 rhel6 kernel: Do you have a strange power saving mode enabled?
Jun 4 03:50:20 rhel6 kernel: Dazed and confused, but trying to continue
Jun 4 03:50:20 rhel6 kernel: DRHD: handling fault status reg 2
Jun 4 03:50:20 rhel6 kernel: DMAR:[DMA Read] Request device [1a:00.2] fault addr 78077000
Jun 4 03:50:20 rhel6 kernel: DMAR:[fault reason 02] Present bit in context entry is clear
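
For readers decoding the fault line: the source-id [1a:00.2] is printed as
bus:device.function. Below is a minimal user-space sketch (not kernel code)
that splits such a string and packs device and function into a devfn using
the usual PCI layout of 5 device bits and 3 function bits. The struct and
function names are made up for this illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical holder for a decoded PCI source-id (not a kernel type). */
struct bdf {
	unsigned int bus;
	unsigned int dev;
	unsigned int fn;
};

/*
 * Parse "bb:dd.f" (hex fields, as printed in the DMAR fault line).
 * Returns 0 on success, -1 if the string is malformed or out of range.
 */
static int parse_bdf(const char *s, struct bdf *out)
{
	unsigned int bus, dev, fn;

	assert(s != NULL && out != NULL);
	if (sscanf(s, "%x:%x.%x", &bus, &dev, &fn) != 3)
		return -1;
	if (bus > 0xff || dev > 0x1f || fn > 0x7)
		return -1;
	out->bus = bus;
	out->dev = dev;
	out->fn = fn;
	return 0;
}

/* Usual PCI packing: 5 bits of device number, 3 bits of function. */
static unsigned int bdf_devfn(const struct bdf *b)
{
	return (b->dev << 3) | b->fn;
}
```

Calling parse_bdf("1a:00.2", &b) fills in bus 0x1a, device 0 and function 2,
and bdf_devfn(&b) then gives the devfn value used when matching on bus/devfn.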

I was trying to debug this. I don't understand the iommu code much.
The physical address belongs to the printed PCI function, so there should
not have been an error.

I am unable to see the pci_dev (pdev) of the VFs getting removed from the
si_domain->devices list (intel-iommu.c) when the driver is unloaded and
calls pci_disable_sriov(), which frees the VF pdevs.
It looks like the issue happens when a freed pdev is allocated again: as it
is already in the list, the required initializations don't happen.

I don't know if my understanding is correct. Can anyone point me to
what the issue may be?

Thanks,
Padmanabh


2011-06-06 22:18:23

by Alex Williamson

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

On Mon, 2011-06-06 at 14:39 +0530, padmanabh ratnakar wrote:
> Hi,
> I am using linux kernel 2.6.39. I have a IBM x3650 M3 system.
> I have used following boot options -
> intel_iommu=on iommu=pt
>
> I was loading/unloading my NIC driver(be2net) with num_vfs=7.
>
> After some iterations I get following DMAR errors -
> Jun 4 03:50:20 rhel6 kernel: Uhhuh. NMI received for unknown reason
> 2d on CPU 0.
> Jun 4 03:50:20 rhel6 kernel: Do you have a strange power saving mode enabled?
> Jun 4 03:50:20 rhel6 kernel: Dazed and confused, but trying to continue
> Jun 4 03:50:20 rhel6 kernel: DRHD: handling fault status reg 2
> Jun 4 03:50:20 rhel6 kernel: DMAR:[DMA Read] Request device [1a:00.2]
> fault addr 78077000
> Jun 4 03:50:20 rhel6 kernel: DMAR:[fault reason 02] Present bit in
> context entry is clear
>
> I was trying to debug this. I dont understand iommu code much.
> The physical address belongs the printed PCI function and there should
> not have been an error.
>
> I am unable to see pci_dev(pdev) of VFs getting removed from
> si_domain->devices list(intel-iommu.c)
> when driver gets unloaded calling pci_disable_sriov() freeing VF pdevs.
> Looks like issue happens when when freed pdev is allocated again and
> as it is already in list,
> required initializations dont happen.
>
> I dont know if my understanding is correct. Can anyone point me to
> what the issue may be?

Typically devices are removed from the domain via
drivers/pci/intel-iommu.c:device_notifier(), which is called as the
device is unbound from the driver. However, this seems to get skipped
when running in passthrough mode, so I'm not sure where that's supposed
to occur. Does it happen w/o passthrough? Also note that some
intel-iommu fixes have rolled into 3.0.0-rc2, you might want to update
and see if anything is better there. Thanks,

Alex

2011-06-06 22:43:31

by Chris Wright

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

* Alex Williamson wrote:
> On Mon, 2011-06-06 at 14:39 +0530, padmanabh ratnakar wrote:
> > Hi,
> > I am using linux kernel 2.6.39. I have a IBM x3650 M3 system.
> > I have used following boot options -
> > intel_iommu=on iommu=pt
> >
> > I was loading/unloading my NIC driver(be2net) with num_vfs=7.
> >
> > After some iterations I get following DMAR errors -
> > Jun 4 03:50:20 rhel6 kernel: Uhhuh. NMI received for unknown reason
> > 2d on CPU 0.
> > Jun 4 03:50:20 rhel6 kernel: Do you have a strange power saving mode enabled?
> > Jun 4 03:50:20 rhel6 kernel: Dazed and confused, but trying to continue
> > Jun 4 03:50:20 rhel6 kernel: DRHD: handling fault status reg 2
> > Jun 4 03:50:20 rhel6 kernel: DMAR:[DMA Read] Request device [1a:00.2]
> > fault addr 78077000
> > Jun 4 03:50:20 rhel6 kernel: DMAR:[fault reason 02] Present bit in
> > context entry is clear
> >
> > I was trying to debug this. I dont understand iommu code much.
> > The physical address belongs the printed PCI function and there should
> > not have been an error.
> >
> > I am unable to see pci_dev(pdev) of VFs getting removed from
> > si_domain->devices list(intel-iommu.c)
> > when driver gets unloaded calling pci_disable_sriov() freeing VF pdevs.
> > Looks like issue happens when when freed pdev is allocated again and
> > as it is already in list,
> > required initializations dont happen.
> >
> > I dont know if my understanding is correct. Can anyone point me to
> > what the issue may be?

Yes, that's correct. The (now replaced) check identity_mapping()
will succeed when the pci_dev is recycled (it's freed, but never
removed from the list; this is an issue with passthrough mode and device
creation/destruction). This false match happens w/ a brand new pci_dev
which still has the default 32-bit DMA mask, so it is removed from the pt domain.
During removal, the domain_remove_one_dev_info() test, which matches only
on bus/devfn (now also segment), will match despite the fact that
info->pdev != pdev->dev.archdata.iommu. Then...Oops
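
The recycled-pointer false match can be reproduced in a small user-space
model. This is only a sketch: the types, the one-slot pool (used so that
"free then allocate" deterministically returns the same address) and all
function names are invented here and are not the intel-iommu code.

```c
#include <assert.h>
#include <stddef.h>

/* User-space model only; none of these are kernel types. */
struct model_dev {
	unsigned int bus, devfn;
	unsigned long long dma_mask;
	int in_use;
};

#define MAX_TRACKED 8

/* Stand-in for the domain's device list: it stores raw pointers. */
struct model_domain {
	const struct model_dev *devs[MAX_TRACKED];
	size_t count;
};

/* One-slot pool, so a freed device and its successor share an address. */
static struct model_dev pool_slot;

static struct model_dev *model_alloc(unsigned int bus, unsigned int devfn)
{
	assert(!pool_slot.in_use);
	pool_slot.bus = bus;
	pool_slot.devfn = devfn;
	pool_slot.dma_mask = 0xffffffffULL; /* fresh device: 32-bit default */
	pool_slot.in_use = 1;
	return &pool_slot;
}

static void model_free(struct model_dev *dev)
{
	dev->in_use = 0; /* nothing here removes dev from a domain list */
}

static int domain_add(struct model_domain *d, const struct model_dev *dev)
{
	if (d->count == MAX_TRACKED)
		return -1;
	d->devs[d->count++] = dev;
	return 0;
}

/* Pointer-only membership test, modelling the check described above. */
static int domain_contains(const struct model_domain *d,
			   const struct model_dev *dev)
{
	size_t i;

	for (i = 0; i < d->count; i++)
		if (d->devs[i] == dev)
			return 1;
	return 0;
}
```

Adding a device, freeing it without touching the list, and allocating again
yields a brand new device (32-bit mask) for which domain_contains() still
returns 1, which is the false match.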

> Typically devices are removed from the domain via
> drivers/pci/intel-iommu.c:device_notifier(), which is called as the
> device is unbound from the driver. However, this seems to get skipped
> when running in passthrough mode, so I'm not sure where that's supposed
> to occur. Does it happen w/o passthrough?

If you blacklist the driver then a create/delete may do similar (haven't
tested that idea).

> Also note that some
> intel-iommu fixes have rolled into 3.0.0-rc2, you might want to update
> and see if anything is better there. Thanks,

The change in identity_mapping() means we won't demote to 32-bit DMA
(drop out of pt domain), so I don't think we'll see the same issue.
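
As a rough picture of the demotion being discussed, assume a device keeps
its identity (pt) mapping only while its DMA mask covers everything the
machine needs addressed. The sketch below is a user-space model under that
assumption; the function names and the required-mask parameter are
illustrative, not the kernel's.

```c
#include <assert.h>
#include <stdint.h>

/* Build an n-bit DMA mask, e.g. 32 -> 0xffffffff. */
static uint64_t mask_for_bits(unsigned int bits)
{
	assert(bits >= 1 && bits <= 64);
	return bits == 64 ? UINT64_MAX : ((UINT64_C(1) << bits) - 1);
}

/*
 * Model of the decision: non-zero means the device can stay in the pt
 * domain, zero means it would be demoted to a translated domain.
 */
static int keeps_identity_map(uint64_t dev_mask, uint64_t required_mask)
{
	return dev_mask >= required_mask;
}
```

With a required mask of, say, 36 bits, a freshly created device that still
has its 32-bit default fails the test, while a device whose driver has set a
64-bit mask passes.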

thanks,
-chris

2011-06-07 06:23:25

by padmanabh ratnakar

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

On Tue, Jun 7, 2011 at 4:04 AM, Chris Wright wrote:
> * Alex Williamson wrote:
>> On Mon, 2011-06-06 at 14:39 +0530, padmanabh ratnakar wrote:
>> > Hi,
>> > I am using linux kernel 2.6.39. I have a IBM x3650 M3 system.
>> > I have used following boot options -
>> > intel_iommu=on iommu=pt
>> >
>> > I was loading/unloading my NIC driver(be2net) with num_vfs=7.
>> >
>> > After some iterations I get following DMAR errors -
>> > Jun 4 03:50:20 rhel6 kernel: Uhhuh. NMI received for unknown reason
>> > 2d on CPU 0.
>> > Jun 4 03:50:20 rhel6 kernel: Do you have a strange power saving mode enabled?
>> > Jun 4 03:50:20 rhel6 kernel: Dazed and confused, but trying to continue
>> > Jun 4 03:50:20 rhel6 kernel: DRHD: handling fault status reg 2
>> > Jun 4 03:50:20 rhel6 kernel: DMAR:[DMA Read] Request device [1a:00.2]
>> > fault addr 78077000
>> > Jun 4 03:50:20 rhel6 kernel: DMAR:[fault reason 02] Present bit in
>> > context entry is clear
>> >
>> > I was trying to debug this. I dont understand iommu code much.
>> > The physical address belongs the printed PCI function and there should
>> > not have been an error.
>> >
>> > I am unable to see pci_dev(pdev) of VFs getting removed from
>> > si_domain->devices list(intel-iommu.c)
>> > when driver gets unloaded calling pci_disable_sriov() freeing VF pdevs.
>> > Looks like issue happens when when freed pdev is allocated again and
>> > as it is already in list,
>> > required initializations dont happen.
>> >
>> > I dont know if my understanding is correct. Can anyone point me to
>> > what the issue may be?
>
> Yes, that's correct. The (now replaced) check identity_mapping()
> will succeed when the pci_dev is recycled (it's freed, but never
> removed from the list; this is an issue with passthrough mode and device
> creation/destruction). This false match happens w/ a brand new pci_dev
> which still has the default 32-bit DMA mask, so it is removed from the pt domain.
> During removal, the domain_remove_one_dev_info() test, which matches only
> on bus/devfn (now also segment), will match despite the fact that
> info->pdev != pdev->dev.archdata.iommu. Then...Oops
>
>> Typically devices are removed from the domain via
>> drivers/pci/intel-iommu.c:device_notifier(), which is called as the
>> device is unbound from the driver. However, this seems to get skipped
>> when running in passthrough mode, so I'm not sure where that's supposed
>> to occur. Does it happen w/o passthrough?
>
I had tried without passthrough on the RHEL 6.1 GA kernel and was seeing
hangs and panics. I will check whether non-passthrough mode works on the
latest kernel.

> If you blacklist the driver then a create/delete may do similar (haven't
> tested that idea).
>
>> Also note that some
>> intel-iommu fixes have rolled into 3.0.0-rc2, you might want to update
>> and see if anything is better there. Thanks,
>
> The change in identity_mapping() means we won't demote to 32-bit DMA
> (drop out of pt domain), so I don't think we'll see the same issue.
>
For testing I had made a hack in the 2.6.39 kernel that prevents
demoting to a 32-bit DMA mask and thereby prevents
domain_remove_one_dev_info() from being called for the specific VF
device I was using, and it worked.
So, as you said, I may not hit the issue in the latest kernel. I will try that.

> thanks,
> -chris
>

Thanks for the response and suggestions.
Padmanabh

2011-06-07 13:38:23

by Chris Wright

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

* padmanabh ratnakar wrote:
> On Tue, Jun 7, 2011 at 4:04 AM, Chris Wright wrote:
> > * Alex Williamson wrote:
> >> On Mon, 2011-06-06 at 14:39 +0530, padmanabh ratnakar wrote:
> >> > Hi,
> >> > I am using linux kernel 2.6.39. I have a IBM x3650 M3 system.
> >> > I have used following boot options -
> >> > intel_iommu=on iommu=pt
> >> >
> >> > I was loading/unloading my NIC driver(be2net) with num_vfs=7.
> >> >
> >> > After some iterations I get following DMAR errors -
> >> > Jun 4 03:50:20 rhel6 kernel: Uhhuh. NMI received for unknown reason
> >> > 2d on CPU 0.
> >> > Jun 4 03:50:20 rhel6 kernel: Do you have a strange power saving mode enabled?
> >> > Jun 4 03:50:20 rhel6 kernel: Dazed and confused, but trying to continue
> >> > Jun 4 03:50:20 rhel6 kernel: DRHD: handling fault status reg 2
> >> > Jun 4 03:50:20 rhel6 kernel: DMAR:[DMA Read] Request device [1a:00.2]
> >> > fault addr 78077000
> >> > Jun 4 03:50:20 rhel6 kernel: DMAR:[fault reason 02] Present bit in
> >> > context entry is clear
> >> >
> >> > I was trying to debug this. I dont understand iommu code much.
> >> > The physical address belongs the printed PCI function and there should
> >> > not have been an error.
> >> >
> >> > I am unable to see pci_dev(pdev) of VFs getting removed from
> >> > si_domain->devices list(intel-iommu.c)
> >> > when driver gets unloaded calling pci_disable_sriov() freeing VF pdevs.
> >> > Looks like issue happens when when freed pdev is allocated again and
> >> > as it is already in list,
> >> > required initializations dont happen.
> >> >
> >> > I dont know if my understanding is correct. Can anyone point me to
> >> > what the issue may be?
> >
> > Yes, that's correct. The (now replaced) check identity_mapping()
> > will succeed when the pci_dev is recycled (it's freed, but never
> > removed from the list; this is an issue with passthrough mode and device
> > creation/destruction). This false match happens w/ a brand new pci_dev
> > which still has the default 32-bit DMA mask, so it is removed from the pt domain.
> > During removal, the domain_remove_one_dev_info() test, which matches only
> > on bus/devfn (now also segment), will match despite the fact that
> > info->pdev != pdev->dev.archdata.iommu. Then...Oops
> >
> >> Typically devices are removed from the domain via
> >> drivers/pci/intel-iommu.c:device_notifier(), which is called as the
> >> device is unbound from the driver. However, this seems to get skipped
> >> when running in passthrough mode, so I'm not sure where that's supposed
> >> to occur. Does it happen w/o passthrough?
> >
> I had tried without passthrough on RHEL 6.1 GA kernel. Was seeing
> hangs and panics. Will check if non passthrough mode works on latest kernel.
>
> > If you blacklist the driver then a create/delete may do similar (haven't
> > tested that idea).
> >
> >> Also note that some
> >> intel-iommu fixes have rolled into 3.0.0-rc2, you might want to update
> >> and see if anything is better there. Thanks,
> >
> > The change in identity_mapping() means we won't demote to 32-bit DMA
> > (drop out of pt domain), so I don't think we'll see the same issue.
> >
> For testing I had made a hack in 2.6.39 kernel which will prevent
> demoting to 32bit DMA mask
> and thereby prevent calling of domain_remove_one_dev_info() for the
> specific VF device I was using
> and it had worked.
> So as you said I may not hit the issue in latest kernel. Will try that.

I think we still leak the list entry though. Bottom line is that we
need to handle hotplug ADD_DEVICE and DEL_DEVICE notifications. We
happen to pick up ADD_DEVICE by accident, but it's all pretty sloppy.
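
One way to picture symmetric ADD_DEVICE/DEL_DEVICE handling is a single
callback that links a device into the domain list on add and unlinks it on
delete, so no entry can outlive its device. The following is a user-space
model, not a patch: the event codes, types and function names are
hypothetical stand-ins and do not reproduce the kernel's bus notifier API.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical event codes for this model only. */
enum model_event { MODEL_ADD_DEVICE, MODEL_DEL_DEVICE };

struct tracked_dev {
	unsigned int bus, devfn;
	struct tracked_dev *next; /* link in the domain's list */
	int linked;
};

struct tracked_domain {
	struct tracked_dev *head;
};

static void domain_link(struct tracked_domain *d, struct tracked_dev *dev)
{
	if (dev->linked)
		return; /* already tracked, keep the list duplicate-free */
	dev->next = d->head;
	d->head = dev;
	dev->linked = 1;
}

static void domain_unlink(struct tracked_domain *d, struct tracked_dev *dev)
{
	struct tracked_dev **pp;

	for (pp = &d->head; *pp; pp = &(*pp)->next) {
		if (*pp == dev) {
			*pp = dev->next;
			dev->next = NULL;
			dev->linked = 0;
			return;
		}
	}
}

/* Single entry point: every add is paired with a delete. */
static void model_notify(struct tracked_domain *d, enum model_event ev,
			 struct tracked_dev *dev)
{
	assert(d != NULL && dev != NULL);
	switch (ev) {
	case MODEL_ADD_DEVICE:
		domain_link(d, dev);
		break;
	case MODEL_DEL_DEVICE:
		domain_unlink(d, dev);
		break;
	}
}

static size_t domain_count(const struct tracked_domain *d)
{
	const struct tracked_dev *p;
	size_t n = 0;

	for (p = d->head; p; p = p->next)
		n++;
	return n;
}
```

After a MODEL_DEL_DEVICE event the list no longer holds the pointer, so a
later allocation at the same address cannot match a stale entry.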

thanks,
-chris

2011-06-07 13:47:05

by David Woodhouse

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

On Tue, 2011-06-07 at 06:38 -0700, Chris Wright wrote:
> I think we still leak the list entry though. Bottom line is that we
> need to handle hotplug ADD_DEVICE and DEL_DEVICE notifications. We
> happen to pick up ADD_DEVICE by accident, but it's all pretty sloppy.

Yeah, keeping a list of possibly stale 'pci_dev' pointers is stupid. We
should figure out the matching DMAR unit directly from the ACPI table at
ADD_DEVICE time, and store it in pdev->dev.archdata.iommu.
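
A minimal sketch of that idea, as a user-space model: a static table stands
in for what would be parsed from the ACPI DMAR table, and the unit is looked
up once at add time and cached in the device itself, so no separate list of
device pointers is kept. The table layout (one bus range per unit) is a
deliberate simplification, and every name below is hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-in for one remapping unit and the buses it covers. */
struct model_unit {
	const char *name;
	unsigned int segment;
	unsigned int bus_first, bus_last;
};

struct hot_dev {
	unsigned int segment, bus, devfn;
	const struct model_unit *iommu; /* plays the role of archdata.iommu */
};

/* Find the unit whose scope covers this device, or NULL if none does. */
static const struct model_unit *find_unit(const struct model_unit *units,
					  size_t n, const struct hot_dev *dev)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (units[i].segment == dev->segment &&
		    dev->bus >= units[i].bus_first &&
		    dev->bus <= units[i].bus_last)
			return &units[i];
	}
	return NULL;
}

/* At add time: resolve the unit once and cache it in the device. */
static int on_add_device(const struct model_unit *units, size_t n,
			 struct hot_dev *dev)
{
	assert(dev != NULL);
	dev->iommu = find_unit(units, n, dev);
	return dev->iommu ? 0 : -1;
}

/* At delete time: the cached pointer goes away with the device. */
static void on_del_device(struct hot_dev *dev)
{
	assert(dev != NULL);
	dev->iommu = NULL;
}
```

Because the association lives in the device, freeing the device discards it,
and a recycled address starts from a clean state.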

I saw patches which were going in that direction...

--
dwmw2

2011-06-07 15:11:19

by Chris Wright

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

* David Woodhouse wrote:
> On Tue, 2011-06-07 at 06:38 -0700, Chris Wright wrote:
> > I think we still leak the list entry though. Bottom line is that we
> > need to handle hotplug ADD_DEVICE and DEL_DEVICE notifications. We
> > happen to pick up ADD_DEVICE by accident, but it's all pretty sloppy.
>
> Yeah, keeping a list of possible stale 'pci_dev' pointers is stupid. We
> should figure out the matching DMAR unit directly from the ACPI table at
> ADD_DEVICE time, and store it in pdev->archdata.iommu.
>
> I saw patches which were going in that direction...

Cool, where are they? I'm working on something similar, and missed them.

thanks,
-chris

2011-06-07 15:33:30

by David Woodhouse

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

On Tue, 2011-06-07 at 08:10 -0700, Chris Wright wrote:
> * David Woodhouse wrote:
> > On Tue, 2011-06-07 at 06:38 -0700, Chris Wright wrote:
> > > I think we still leak the list entry though. Bottom line is that we
> > > need to handle hotplug ADD_DEVICE and DEL_DEVICE notifications. We
> > > happen to pick up ADD_DEVICE by accident, but it's all pretty sloppy.
> >
> > Yeah, keeping a list of possible stale 'pci_dev' pointers is stupid. We
> > should figure out the matching DMAR unit directly from the ACPI table at
> > ADD_DEVICE time, and store it in pdev->archdata.iommu.
> >
> > I saw patches which were going in that direction...
>
> Cool, where are they? I'm working on something similar, and missed them.

[PATCH] pci, dmar: Update dmar units devices list during hotplug

Alex was working on it.

--
dwmw2

2011-06-07 15:36:10

by Chris Wright

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

* David Woodhouse wrote:
> On Tue, 2011-06-07 at 08:10 -0700, Chris Wright wrote:
> > * David Woodhouse wrote:
> > > On Tue, 2011-06-07 at 06:38 -0700, Chris Wright wrote:
> > > > I think we still leak the list entry though. Bottom line is that we
> > > > need to handle hotplug ADD_DEVICE and DEL_DEVICE notifications. We
> > > > happen to pick up ADD_DEVICE by accident, but it's all pretty sloppy.
> > >
> > > Yeah, keeping a list of possible stale 'pci_dev' pointers is stupid. We
> > > should figure out the matching DMAR unit directly from the ACPI table at
> > > ADD_DEVICE time, and store it in pdev->archdata.iommu.
> > >
> > > I saw patches which were going in that direction...
> >
> > Cool, where are they? I'm working on something similar, and missed them.
>
> [PATCH] pci, dmar: Update dmar units devices list during hotplug

Oh yeah, thanks for the reminder.

thanks,
-chris

2011-06-07 15:41:15

by Alex Williamson

Subject: Re: Seeing DMAR errors after multiple load/unload with SR-IOV

On Tue, 2011-06-07 at 16:33 +0100, David Woodhouse wrote:
> On Tue, 2011-06-07 at 08:10 -0700, Chris Wright wrote:
> > * David Woodhouse wrote:
> > > On Tue, 2011-06-07 at 06:38 -0700, Chris Wright wrote:
> > > > I think we still leak the list entry though. Bottom line is that we
> > > > need to handle hotplug ADD_DEVICE and DEL_DEVICE notifications. We
> > > > happen to pick up ADD_DEVICE by accident, but it's all pretty sloppy.
> > >
> > > Yeah, keeping a list of possible stale 'pci_dev' pointers is stupid. We
> > > should figure out the matching DMAR unit directly from the ACPI table at
> > > ADD_DEVICE time, and store it in pdev->archdata.iommu.
> > >
> > > I saw patches which were going in that direction...
> >
> > Cool, where are they? I'm working on something similar, and missed them.
>
> [PATCH] pci, dmar: Update dmar units devices list during hotplug
>
> Alex was working on it.

Nope, I had a wip patch that did an on-the-fly lookup, that I handed off
to Yinghai, but it didn't actually work. That's when the suggestion was
made to do it at hotplug, but I'm not pursuing that right now, maybe
Yinghai is? Thanks,

Alex
