2010-04-08 19:01:53

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 0/6] pci/dmar: small cleanup

About print out and some corner case with SRIOV

Thanks

Yinghai




2010-04-08 19:00:36

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 5/6] pci/dmar: Fix link warning

could make it to be __init to remove the linking section mismatch warning.

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/dmar.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index 550f0a9..a9310ed 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -1467,7 +1467,7 @@ int dmar_reenable_qi(struct intel_iommu *iommu)
/*
* Check interrupt remapping support in DMAR table description.
*/
-int dmar_ir_support(void)
+int __init dmar_ir_support(void)
{
struct acpi_table_dmar *dmar;
dmar = (struct acpi_table_dmar *)dmar_tbl;
--
1.6.4.2

2010-04-08 19:01:12

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 3/6] pci/dmar: remove the noisy print out about unmapping

it seems this one should be removed.
because map page does not have this annonyed print out.

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/intel-iommu.c | 3 ---
1 files changed, 0 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 83ac214..7dddc7c 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -2742,9 +2742,6 @@ static void intel_unmap_page(struct device *dev, dma_addr_t dev_addr,
start_pfn = mm_to_dma_pfn(iova->pfn_lo);
last_pfn = mm_to_dma_pfn(iova->pfn_hi + 1) - 1;

- pr_debug("Device %s unmapping: pfn %lx-%lx\n",
- pci_name(pdev), start_pfn, last_pfn);
-
/* clear the whole page */
dma_pte_clear_range(domain, start_pfn, last_pfn);

--
1.6.4.2

2010-04-08 19:01:20

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 6/6] intel-iommu: Don't call domain_exit if can not attach with iommu

found when a lot of ixgbevf are used with intel iommu, will get crash

> >
> > eth304 ip=192.171.178.102 mac=56:95:16:88:4C:C1 pci=0000:0e:18.3 drv=ixgbevf
> > eth305 ip=192.171.179.102 mac=D6:41:8C:4A:87:B3 pci=0000:0e:18.5 drv=ixgbevf
> > [ 9534.886519] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> > [ 9534.889775] Call Trace:
> > [ 9534.889775] [<ffffffff81406a3d>] domain_remove_dev_info+0x34/0xcf
> > [ 9534.889775] [<ffffffff81408a1d>] domain_exit+0x23/0xc4
> > [ 9534.889775] [<ffffffff81409c1c>] T.953+0x173/0x342
> > [ 9534.889775] [<ffffffff81409fd0>] __intel_map_single+0x63/0x1b3
> > [ 9534.889775] [<ffffffff8140a22a>] intel_alloc_coherent+0xc7/0xee
> > [ 9534.889775] [<ffffffff816c3e7d>] ? ixgbevf_setup_tx_resources+0x2f/0x12c
> > [ 9534.889775] [<ffffffff816c3f36>] ixgbevf_setup_tx_resources+0xe8/0x12c
> > [ 9534.889775] [<ffffffff816c42d2>] ixgbevf_open+0x7a/0x160
> > [ 9534.889775] [<ffffffff81adf21e>] __dev_open+0x8e/0xbc
> > [ 9534.889775] [<ffffffff81adb716>] __dev_change_flags+0xad/0x130
> > [ 9534.889775] [<ffffffff81adf15a>] dev_change_flags+0x21/0x57
> > [ 9534.889775] [<ffffffff81b211e5>] devinet_ioctl+0x29d/0x541
> > [ 9534.889775] [<ffffffff810a4b92>] ? trace_hardirqs_off_caller+0x1f/0xa9
> > [ 9534.889775] [<ffffffff81b22801>] inet_ioctl+0x8f/0xa7
> > [ 9534.889775] [<ffffffff81acc258>] sock_do_ioctl+0x29/0x48
> > [ 9534.889775] [<ffffffff81acca64>] sock_ioctl+0x1fe/0x20d
> > [ 9534.889775] [<ffffffff8113b86a>] vfs_ioctl+0x32/0xa6
> > [ 9534.889775] [<ffffffff8113bd04>] do_vfs_ioctl+0x2b0/0x2cb
> > [ 9534.889775] [<ffffffff81033c4c>] ? sysret_check+0x27/0x62
> > [ 9534.889775] [<ffffffff8113bd66>] sys_ioctl+0x47/0x6a
> > [ 9534.889775] [<ffffffff81033c1b>] system_call_fastpath+0x16/0x1b
> > [ 9534.889775] Code: cd 7f cc ff 41 54 9d 48 83 c4 20 5b 41 5c 41 5d 41
> > 5e c9 c3 55 48 89 e5 e8 11 ff ff ff c9 c3 55 48 89 e5 53 48 89 fb 48 83
> > ec 08 <48> 8b 47 08 4c 8b 00 49 39 f8 74 1d 48 89 f9 48 c7 c2 7f e9 18
> > [ 9534.889775] RIP [<ffffffff813ddf94>] list_del+0xc/0x8b
> > [ 9534.889775] RSP <ffff88c070373af8>
> > [ 9534.889775] CR2: 0000000000000008
> > [ 9534.889775] ---[ end trace 072bd8cdb08a760c ]---
> > xifconfig8_2x_vf.sh: line 10: 28555 Killed ifconfig
> > $DEV $IP
> >
when intel_iommu=off or iommu=pt is used, will work well.

root cause:
domain is initialized after trying attached it,

if it can not be attached, don't call domain_exit yet.
it will cause kernel crash in domain_exit()

after patch will will get error instead of crash.

[ 1781.910241] IOMMU: no free domain ids
[ 1781.910244] Allocating domain for 0000:d0:1f.5 failed
SIOCSIFFLAGS: Cannot allocate memory


Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/intel-iommu.c | 2 +-
1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 7dddc7c..c112e5d 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -1853,7 +1853,7 @@ static struct dmar_domain *get_domain_for_dev(struct pci_dev *pdev, int gaw)

ret = iommu_attach_domain(domain, iommu);
if (ret) {
- domain_exit(domain);
+ free_domain_mem(domain);
goto error;
}

--
1.6.4.2

2010-04-08 19:01:43

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

When virtfn is used, we should use physfn to find correct drhd

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/dmar.c | 5 +++++
1 files changed, 5 insertions(+), 0 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index d2f66a6..550f0a9 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -534,6 +534,11 @@ dmar_find_matched_drhd_unit(struct pci_dev *dev)
struct dmar_drhd_unit *dmaru = NULL;
struct acpi_dmar_hardware_unit *drhd;

+#ifdef CONFIG_PCI_IOV
+ if (dev->is_virtfn)
+ dev = dev->physfn;
+#endif
+
list_for_each_entry(dmaru, &dmar_drhd_units, list) {
drhd = container_of(dmaru->hdr,
struct acpi_dmar_hardware_unit,
--
1.6.4.2

2010-04-08 19:01:29

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 2/6] pci/dmar: Print out iommu seq_id

more info on system with more than one IOMMU

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/dmar.c | 3 ++-
drivers/pci/intel-iommu.c | 9 ++++++---
drivers/pci/intr_remapping.c | 6 +++---
3 files changed, 11 insertions(+), 7 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index 3bd013f..d2f66a6 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -810,7 +810,8 @@ int alloc_iommu(struct dmar_drhd_unit *drhd)
}

ver = readl(iommu->reg + DMAR_VER_REG);
- pr_info("IOMMU %llx: ver %d:%d cap %llx ecap %llx\n",
+ pr_info("IOMMU %d: reg_base_addr %llx ver %d:%d cap %llx ecap %llx\n",
+ iommu->seq_id,
(unsigned long long)drhd->reg_base_addr,
DMAR_VER_MAJOR(ver), DMAR_VER_MINOR(ver),
(unsigned long long)iommu->cap,
diff --git a/drivers/pci/intel-iommu.c b/drivers/pci/intel-iommu.c
index 4173125..83ac214 100644
--- a/drivers/pci/intel-iommu.c
+++ b/drivers/pci/intel-iommu.c
@@ -1154,7 +1154,8 @@ static int iommu_init_domains(struct intel_iommu *iommu)
unsigned long nlongs;

ndomains = cap_ndoms(iommu->cap);
- pr_debug("Number of Domains supportd <%ld>\n", ndomains);
+ pr_debug("IOMMU %d: Number of Domains supportd <%ld>\n", iommu->seq_id,
+ ndomains);
nlongs = BITS_TO_LONGS(ndomains);

spin_lock_init(&iommu->lock);
@@ -2333,14 +2334,16 @@ int __init init_dmars(void)
*/
iommu->flush.flush_context = __iommu_flush_context;
iommu->flush.flush_iotlb = __iommu_flush_iotlb;
- printk(KERN_INFO "IOMMU 0x%Lx: using Register based "
+ printk(KERN_INFO "IOMMU %d 0x%Lx: using Register based "
"invalidation\n",
+ iommu->seq_id,
(unsigned long long)drhd->reg_base_addr);
} else {
iommu->flush.flush_context = qi_flush_context;
iommu->flush.flush_iotlb = qi_flush_iotlb;
- printk(KERN_INFO "IOMMU 0x%Lx: using Queued "
+ printk(KERN_INFO "IOMMU %d 0x%Lx: using Queued "
"invalidation\n",
+ iommu->seq_id,
(unsigned long long)drhd->reg_base_addr);
}
}
diff --git a/drivers/pci/intr_remapping.c b/drivers/pci/intr_remapping.c
index 6ee98a5..1315ac6 100644
--- a/drivers/pci/intr_remapping.c
+++ b/drivers/pci/intr_remapping.c
@@ -832,9 +832,9 @@ static int ir_parse_ioapic_hpet_scope(struct acpi_dmar_header *header,
return -1;
}

- printk(KERN_INFO "IOAPIC id %d under DRHD base"
- " 0x%Lx\n", scope->enumeration_id,
- drhd->address);
+ printk(KERN_INFO "IOAPIC id %d under DRHD base "
+ " 0x%Lx IOMMU %d\n", scope->enumeration_id,
+ drhd->address, iommu->seq_id);

ir_parse_one_ioapic_scope(scope, iommu);
} else if (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_HPET) {
--
1.6.4.2

2010-04-08 19:01:09

by Yinghai Lu

[permalink] [raw]
Subject: [PATCH 1/6] pci/dmar: Don't complain that IOPAIC is not supported

INTR_REMAP handle that type, so if that is used, we should not complain that

Signed-off-by: Yinghai Lu <[email protected]>
---
drivers/pci/dmar.c | 10 +++++++---
1 files changed, 7 insertions(+), 3 deletions(-)

diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
index 33ead97..3bd013f 100644
--- a/drivers/pci/dmar.c
+++ b/drivers/pci/dmar.c
@@ -131,9 +131,13 @@ static int __init dmar_parse_dev_scope(void *start, void *end, int *cnt,
if (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT ||
scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE)
(*cnt)++;
- else
- printk(KERN_WARNING PREFIX
- "Unsupported device scope\n");
+ else {
+#ifdef CONFIG_INTR_REMAP
+ if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_IOAPIC)
+#endif
+ printk(KERN_WARNING PREFIX
+ "Unsupported device scope\n");
+ }
start += scope->length;
}
if (*cnt == 0)
--
1.6.4.2

2010-04-08 19:10:59

by Roland Dreier

[permalink] [raw]
Subject: Re: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

> +#ifdef CONFIG_PCI_IOV
> + if (dev->is_virtfn)
> + dev = dev->physfn;
> +#endif

Seems we would prefer to avoid this ifdef... if we had a function like

struct pci_dev *pci_physfn(struct pci_dev *virtfn)

as an inline in <linux/pci.h> that just returns NULL if PCI_IOV is not
enabled, then we could write this as

if (dev->is_virtfn)
dev = pci_physfn(dev)

without any #ifdef.
--
Roland Dreier <[email protected]> || For corporate legal information go to:
http://www.cisco.com/web/about/doing_business/legal/cri/index.html

2010-04-08 19:33:49

by Chris Wright

[permalink] [raw]
Subject: Re: [PATCH 5/6] pci/dmar: Fix link warning

* Yinghai Lu ([email protected]) wrote:
> could make it to be __init to remove the linking section mismatch warning.

Hey David, I just noticed this one is already in your tree, looks like
the only commit in there that didn't get pushed, accidentally stuck
perhaps?

thanks,
-chris

2010-04-08 20:54:12

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

please check

Subject: [PATCH] pci/dmar/sriov: use physfn to search drhd for VF

When virtfn is used, we should use physfn to find correct drhd

-v2: add pci_physfn() Suggested by Roland Dreier <[email protected]>
do can remove ifdef in dmar.c

Signed-off-by: Yinghai Lu <[email protected]>

---
drivers/pci/dmar.c | 2 ++
include/linux/pci.h | 10 ++++++++++
2 files changed, 12 insertions(+)

Index: linux-2.6/drivers/pci/dmar.c
===================================================================
--- linux-2.6.orig/drivers/pci/dmar.c
+++ linux-2.6/drivers/pci/dmar.c
@@ -534,6 +534,8 @@ dmar_find_matched_drhd_unit(struct pci_d
struct dmar_drhd_unit *dmaru = NULL;
struct acpi_dmar_hardware_unit *drhd;

+ dev = pci_physfn(dev);
+
list_for_each_entry(dmaru, &dmar_drhd_units, list) {
drhd = container_of(dmaru->hdr,
struct acpi_dmar_hardware_unit,
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -334,6 +334,16 @@ struct pci_dev {
#endif
};

+static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
+{
+#ifdef CONFIG_PCI_IOV
+ if (dev->is_virtfn)
+ dev = dev->physfn;
+#endif
+
+ return dev;
+}
+
extern struct pci_dev *alloc_pci_dev(void);

#define pci_dev_b(n) list_entry(n, struct pci_dev, bus_list)

2010-04-08 23:24:55

by Chris Wright

[permalink] [raw]
Subject: Re: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

* Yinghai ([email protected]) wrote:
> --- linux-2.6.orig/drivers/pci/dmar.c
> +++ linux-2.6/drivers/pci/dmar.c
> @@ -534,6 +534,8 @@ dmar_find_matched_drhd_unit(struct pci_d
> struct dmar_drhd_unit *dmaru = NULL;
> struct acpi_dmar_hardware_unit *drhd;
>
> + dev = pci_physfn(dev);
> +

Yeah, we typically don't have enough VF's to wrap bus numbers, or we're
under a catchall IOMMU. In the catchall case both vf->bus and vf->pf->bus
will have the same domain (segment) regardless of whether we have large
VF count or big offset/stride. So, I suppose this could be done inside
of dmar_pci_device_match().

Otherwise, I think you'd want to add the same thing to
dmar_find_matched_atsr_unit() since it's the same device scopes there.

thanks,
-chris

2010-04-09 00:10:33

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

On 04/08/2010 04:24 PM, Chris Wright wrote:
> * Yinghai ([email protected]) wrote:
>> --- linux-2.6.orig/drivers/pci/dmar.c
>> +++ linux-2.6/drivers/pci/dmar.c
>> @@ -534,6 +534,8 @@ dmar_find_matched_drhd_unit(struct pci_d
>> struct dmar_drhd_unit *dmaru = NULL;
>> struct acpi_dmar_hardware_unit *drhd;
>>
>> + dev = pci_physfn(dev);
>> +
>
> Yeah, we typically don't have enough VF's to wrap bus numbers, or we're
> under a catchall IOMMU. In the catchall case both vf->bus and vf->pf->bus
> will have the same domain (segment) regardless of whether we have large
> VF count or big offset/stride. So, I suppose this could be done inside
> of dmar_pci_device_match().
>
> Otherwise, I think you'd want to add the same thing to
> dmar_find_matched_atsr_unit() since it's the same device scopes there.

please check

Subject: [PATCH] pci/dmar/sriov: use physfn to search drhd for VF

When virtfn is used, we should use physfn to find correct drhd

-v2: add pci_physfn() Suggested by Roland Dreier <[email protected]>
do can remove ifdef in dmar.c
-v3: Chris pointed out we need that for dma_find_matched_atsr_unit too
also change dmar_pci_device_match() static

Signed-off-by: Yinghai Lu <[email protected]>
Acked-by: Roland Dreier <[email protected]>

---
drivers/pci/dmar.c | 6 +++++-
include/linux/pci.h | 10 ++++++++++
2 files changed, 15 insertions(+), 1 deletion(-)

Index: linux-2.6/drivers/pci/dmar.c
===================================================================
--- linux-2.6.orig/drivers/pci/dmar.c
+++ linux-2.6/drivers/pci/dmar.c
@@ -313,6 +313,8 @@ int dmar_find_matched_atsr_unit(struct p
struct acpi_dmar_atsr *atsr;
struct dmar_atsr_unit *atsru;

+ dev = pci_physfn(dev);
+
list_for_each_entry(atsru, &dmar_atsr_units, list) {
atsr = container_of(atsru->hdr, struct acpi_dmar_atsr, header);
if (atsr->segment == pci_domain_nr(dev->bus))
@@ -511,7 +513,7 @@ parse_dmar_table(void)
return ret;
}

-int dmar_pci_device_match(struct pci_dev *devices[], int cnt,
+static int dmar_pci_device_match(struct pci_dev *devices[], int cnt,
struct pci_dev *dev)
{
int index;
@@ -534,6 +536,8 @@ dmar_find_matched_drhd_unit(struct pci_d
struct dmar_drhd_unit *dmaru = NULL;
struct acpi_dmar_hardware_unit *drhd;

+ dev = pci_physfn(dev);
+
list_for_each_entry(dmaru, &dmar_drhd_units, list) {
drhd = container_of(dmaru->hdr,
struct acpi_dmar_hardware_unit,
Index: linux-2.6/include/linux/pci.h
===================================================================
--- linux-2.6.orig/include/linux/pci.h
+++ linux-2.6/include/linux/pci.h
@@ -334,6 +334,16 @@ struct pci_dev {
#endif
};

+static inline struct pci_dev *pci_physfn(struct pci_dev *dev)
+{
+#ifdef CONFIG_PCI_IOV
+ if (dev->is_virtfn)
+ dev = dev->physfn;
+#endif
+
+ return dev;
+}
+
extern struct pci_dev *alloc_pci_dev(void);

#define pci_dev_b(n) list_entry(n, struct pci_dev, bus_list)

2010-04-09 00:17:15

by Chris Wright

[permalink] [raw]
Subject: Re: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

* Yinghai ([email protected]) wrote:
> Subject: [PATCH] pci/dmar/sriov: use physfn to search drhd for VF
>
> When virtfn is used, we should use physfn to find correct drhd
>
> -v2: add pci_physfn() Suggested by Roland Dreier <[email protected]>
> do can remove ifdef in dmar.c
> -v3: Chris pointed out we need that for dma_find_matched_atsr_unit too
> also change dmar_pci_device_match() static
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Acked-by: Roland Dreier <[email protected]>

Looks good

Acked-by: Chris Wright <[email protected]>

2010-04-09 15:33:47

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH 1/6] pci/dmar: Don't complain that IOPAIC is not supported

On Thu, 2010-04-08 at 19:58 +0100, Yinghai Lu wrote:
> INTR_REMAP handle that type, so if that is used, we should not complain that
>
> Signed-off-by: Yinghai Lu <[email protected]>
> ---
> drivers/pci/dmar.c | 10 +++++++---
> 1 files changed, 7 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/pci/dmar.c b/drivers/pci/dmar.c
> index 33ead97..3bd013f 100644
> --- a/drivers/pci/dmar.c
> +++ b/drivers/pci/dmar.c
> @@ -131,9 +131,13 @@ static int __init dmar_parse_dev_scope(void *start, void *end, int *cnt,
> if (scope->entry_type == ACPI_DMAR_SCOPE_TYPE_ENDPOINT ||
> scope->entry_type == ACPI_DMAR_SCOPE_TYPE_BRIDGE)
> (*cnt)++;
> - else
> - printk(KERN_WARNING PREFIX
> - "Unsupported device scope\n");
> + else {
> +#ifdef CONFIG_INTR_REMAP
> + if (scope->entry_type != ACPI_DMAR_SCOPE_TYPE_IOAPIC)
> +#endif
> + printk(KERN_WARNING PREFIX
> + "Unsupported device scope\n");
> + }
> start += scope->length;
> }
> if (*cnt == 0)

Applied, without the ifdef and slightly cleaned up.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2010-04-09 15:34:17

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH 3/6] pci/dmar: remove the noisy print out about unmapping

On Thu, 2010-04-08 at 19:58 +0100, Yinghai Lu wrote:
> it seems this one should be removed.
> because map page does not have this annonyed print out.

It's debug output... I'd be more inclined to add it in map_page() than
remove it from here.

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2010-04-09 15:43:21

by Woodhouse, David

[permalink] [raw]
Subject: Re: [PATCH 6/6] intel-iommu: Don't call domain_exit if can not attach with iommu

On Thu, 2010-04-08 at 19:58 +0100, Yinghai Lu wrote:
>
> --- a/drivers/pci/intel-iommu.c
> +++ b/drivers/pci/intel-iommu.c
> @@ -1853,7 +1853,7 @@ static struct dmar_domain
> *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>
> ret = iommu_attach_domain(domain, iommu);
> if (ret) {
> - domain_exit(domain);
> + free_domain_mem(domain);
> goto error;
> }
>

Um, shouldn't the next error have the same change? If domain_init()
fails before it gets round to initialising the domain->devices list, the
same oops in domain_exit() will occur, won't it? It's because
domain->devices.{next,prev} are both NULL?

--
David Woodhouse Open Source Technology Centre
[email protected] Intel Corporation

2010-04-09 15:54:48

by Jesse Barnes

[permalink] [raw]
Subject: Re: [PATCH 4/6] pci/dmar/sriov: use physfn to search drhd for VF

On Thu, 08 Apr 2010 17:07:55 -0700
Yinghai <[email protected]> wrote:

> On 04/08/2010 04:24 PM, Chris Wright wrote:
> > * Yinghai ([email protected]) wrote:
> >> --- linux-2.6.orig/drivers/pci/dmar.c
> >> +++ linux-2.6/drivers/pci/dmar.c
> >> @@ -534,6 +534,8 @@ dmar_find_matched_drhd_unit(struct pci_d
> >> struct dmar_drhd_unit *dmaru = NULL;
> >> struct acpi_dmar_hardware_unit *drhd;
> >>
> >> + dev = pci_physfn(dev);
> >> +
> >
> > Yeah, we typically don't have enough VF's to wrap bus numbers, or we're
> > under a catchall IOMMU. In the catchall case both vf->bus and vf->pf->bus
> > will have the same domain (segment) regardless of whether we have large
> > VF count or big offset/stride. So, I suppose this could be done inside
> > of dmar_pci_device_match().
> >
> > Otherwise, I think you'd want to add the same thing to
> > dmar_find_matched_atsr_unit() since it's the same device scopes there.
>
> please check
>
> Subject: [PATCH] pci/dmar/sriov: use physfn to search drhd for VF
>
> When virtfn is used, we should use physfn to find correct drhd
>
> -v2: add pci_physfn() Suggested by Roland Dreier <[email protected]>
> do can remove ifdef in dmar.c
> -v3: Chris pointed out we need that for dma_find_matched_atsr_unit too
> also change dmar_pci_device_match() static
>
> Signed-off-by: Yinghai Lu <[email protected]>
> Acked-by: Roland Dreier <[email protected]>

This one looks good:

Acked-by: Jesse Barnes <[email protected]>

Thanks,
--
Jesse Barnes, Intel Open Source Technology Center

2010-04-09 17:52:54

by Yinghai Lu

[permalink] [raw]
Subject: Re: [PATCH 6/6] intel-iommu: Don't call domain_exit if can not attach with iommu

On 04/09/2010 08:43 AM, David Woodhouse wrote:
> On Thu, 2010-04-08 at 19:58 +0100, Yinghai Lu wrote:
>>
>> --- a/drivers/pci/intel-iommu.c
>> +++ b/drivers/pci/intel-iommu.c
>> @@ -1853,7 +1853,7 @@ static struct dmar_domain
>> *get_domain_for_dev(struct pci_dev *pdev, int gaw)
>>
>> ret = iommu_attach_domain(domain, iommu);
>> if (ret) {
>> - domain_exit(domain);
>> + free_domain_mem(domain);
>> goto error;
>> }
>>
>
> Um, shouldn't the next error have the same change? If domain_init()
> fails before it gets round to initialising the domain->devices list, the
> same oops in domain_exit() will occur, won't it? It's because
> domain->devices.{next,prev} are both NULL?
>

thanks. please check

Subject: [PATCH] intel-iommu: Don't call domain_exit if can not attach with iommu

found when a lot of ixgbevf are used with intel iommu, will get crash

> >
> > eth304 ip=192.171.178.102 mac=56:95:16:88:4C:C1 pci=0000:0e:18.3 drv=ixgbevf
> > eth305 ip=192.171.179.102 mac=D6:41:8C:4A:87:B3 pci=0000:0e:18.5 drv=ixgbevf
> > [ 9534.886519] BUG: unable to handle kernel NULL pointer dereference at
> > 0000000000000008
> > [ 9534.889775] Call Trace:
> > [ 9534.889775] [<ffffffff81406a3d>] domain_remove_dev_info+0x34/0xcf
> > [ 9534.889775] [<ffffffff81408a1d>] domain_exit+0x23/0xc4
> > [ 9534.889775] [<ffffffff81409c1c>] T.953+0x173/0x342
> > [ 9534.889775] [<ffffffff81409fd0>] __intel_map_single+0x63/0x1b3
> > [ 9534.889775] [<ffffffff8140a22a>] intel_alloc_coherent+0xc7/0xee
> > [ 9534.889775] [<ffffffff816c3e7d>] ? ixgbevf_setup_tx_resources+0x2f/0x12c
> > [ 9534.889775] [<ffffffff816c3f36>] ixgbevf_setup_tx_resources+0xe8/0x12c
> > [ 9534.889775] [<ffffffff816c42d2>] ixgbevf_open+0x7a/0x160
> > [ 9534.889775] [<ffffffff81adf21e>] __dev_open+0x8e/0xbc
> > [ 9534.889775] [<ffffffff81adb716>] __dev_change_flags+0xad/0x130
> > [ 9534.889775] [<ffffffff81adf15a>] dev_change_flags+0x21/0x57
> > [ 9534.889775] [<ffffffff81b211e5>] devinet_ioctl+0x29d/0x541
> > [ 9534.889775] [<ffffffff810a4b92>] ? trace_hardirqs_off_caller+0x1f/0xa9
> > [ 9534.889775] [<ffffffff81b22801>] inet_ioctl+0x8f/0xa7
> > [ 9534.889775] [<ffffffff81acc258>] sock_do_ioctl+0x29/0x48
> > [ 9534.889775] [<ffffffff81acca64>] sock_ioctl+0x1fe/0x20d
> > [ 9534.889775] [<ffffffff8113b86a>] vfs_ioctl+0x32/0xa6
> > [ 9534.889775] [<ffffffff8113bd04>] do_vfs_ioctl+0x2b0/0x2cb
> > [ 9534.889775] [<ffffffff81033c4c>] ? sysret_check+0x27/0x62
> > [ 9534.889775] [<ffffffff8113bd66>] sys_ioctl+0x47/0x6a
> > [ 9534.889775] [<ffffffff81033c1b>] system_call_fastpath+0x16/0x1b
> > [ 9534.889775] Code: cd 7f cc ff 41 54 9d 48 83 c4 20 5b 41 5c 41 5d 41
> > 5e c9 c3 55 48 89 e5 e8 11 ff ff ff c9 c3 55 48 89 e5 53 48 89 fb 48 83
> > ec 08 <48> 8b 47 08 4c 8b 00 49 39 f8 74 1d 48 89 f9 48 c7 c2 7f e9 18
> > [ 9534.889775] RIP [<ffffffff813ddf94>] list_del+0xc/0x8b
> > [ 9534.889775] RSP <ffff88c070373af8>
> > [ 9534.889775] CR2: 0000000000000008
> > [ 9534.889775] ---[ end trace 072bd8cdb08a760c ]---
> > xifconfig8_2x_vf.sh: line 10: 28555 Killed ifconfig
> > $DEV $IP
> >
when intel_iommu=off or iommu=pt is used, will work well.

root cause:
domain is initialized after trying attached it,

if it can not be attached, don't call domain_exit yet.
it will cause kernel crash in domain_exit()

after patch will will get error instead of crash.

[ 1781.910241] IOMMU: no free domain ids
[ 1781.910244] Allocating domain for 0000:d0:1f.5 failed
SIOCSIFFLAGS: Cannot allocate memory

-v2: DavidW pointed out next error with domain_init could have the same problem.
So try to adjust sequence in domain_init() to make sure domain->devices get
initialized properly

Signed-off-by: Yinghai Lu <[email protected]>

---
drivers/pci/intel-iommu.c | 37 ++++++++++++++++++++-----------------
1 file changed, 20 insertions(+), 17 deletions(-)

Index: linux-2.6/drivers/pci/intel-iommu.c
===================================================================
--- linux-2.6.orig/drivers/pci/intel-iommu.c
+++ linux-2.6/drivers/pci/intel-iommu.c
@@ -1379,24 +1379,10 @@ static int domain_init(struct dmar_domai

domain_reserve_special_ranges(domain);

- /* calculate AGAW */
- iommu = domain_get_iommu(domain);
- if (guest_width > cap_mgaw(iommu->cap))
- guest_width = cap_mgaw(iommu->cap);
- domain->gaw = guest_width;
- adjust_width = guestwidth_to_adjustwidth(guest_width);
- agaw = width_to_agaw(adjust_width);
- sagaw = cap_sagaw(iommu->cap);
- if (!test_bit(agaw, &sagaw)) {
- /* hardware doesn't support it, choose a bigger one */
- pr_debug("IOMMU: hardware doesn't support agaw %d\n", agaw);
- agaw = find_next_bit(&sagaw, 5, agaw);
- if (agaw >= 5)
- return -ENODEV;
- }
- domain->agaw = agaw;
INIT_LIST_HEAD(&domain->devices);

+ iommu = domain_get_iommu(domain);
+
if (ecap_coherent(iommu->ecap))
domain->iommu_coherency = 1;
else
@@ -1415,6 +1401,23 @@ static int domain_init(struct dmar_domai
if (!domain->pgd)
return -ENOMEM;
__iommu_flush_cache(iommu, domain->pgd, PAGE_SIZE);
+
+ /* calculate AGAW */
+ if (guest_width > cap_mgaw(iommu->cap))
+ guest_width = cap_mgaw(iommu->cap);
+ domain->gaw = guest_width;
+ adjust_width = guestwidth_to_adjustwidth(guest_width);
+ agaw = width_to_agaw(adjust_width);
+ sagaw = cap_sagaw(iommu->cap);
+ if (!test_bit(agaw, &sagaw)) {
+ /* hardware doesn't support it, choose a bigger one */
+ pr_debug("IOMMU: hardware doesn't support agaw %d\n", agaw);
+ agaw = find_next_bit(&sagaw, 5, agaw);
+ if (agaw >= 5)
+ return -ENODEV;
+ }
+ domain->agaw = agaw;
+
return 0;
}

@@ -1853,7 +1856,7 @@ static struct dmar_domain *get_domain_fo

ret = iommu_attach_domain(domain, iommu);
if (ret) {
- domain_exit(domain);
+ free_domain_mem(domain);
goto error;
}