2011-06-20 18:44:46

by Andrew Lutomirski

Subject: [3.0 regression] DMAR faults in iwlagn

On 3.0-git today, I get:

[ 40.142872] DRHD: handling fault status reg 2
[ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
[ 40.142896] DMAR:[fault reason 05] PTE Write access is not set

lspci says:

03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)

This seems to be triggered by interface state changes. The driver
still works. I don't see this on 2.6.39.1.

--Andy


2011-06-22 20:24:46

by Johannes Berg

Subject: Re: [3.0 regression] DMAR faults in iwlagn

Andy, could you try this patch?

I haven't tested it at all yet, it's kinda late here.

johannes

iwlagn: map command buffers BIDI

From: Johannes Berg <[email protected]>

Evidently, the device sometimes wants to write back
to command buffers, even if I see no reason why it
should. Allow it to do that.

Signed-off-by: Johannes Berg <[email protected]>
---
drivers/net/wireless/iwlwifi/iwl-tx.c | 17 ++++++++++-------
1 file changed, 10 insertions(+), 7 deletions(-)

--- a/drivers/net/wireless/iwlwifi/iwl-tx.c 2011-06-22 22:19:35.000000000 +0200
+++ b/drivers/net/wireless/iwlwifi/iwl-tx.c 2011-06-22 22:21:48.000000000 +0200
@@ -126,7 +126,7 @@ static inline u8 iwl_tfd_get_num_tbs(str
}

static void iwlagn_unmap_tfd(struct iwl_priv *priv, struct iwl_cmd_meta *meta,
- struct iwl_tfd *tfd)
+ struct iwl_tfd *tfd, enum dma_data_direction dma_dir)
{
int i;
int num_tbs;
@@ -150,7 +150,7 @@ static void iwlagn_unmap_tfd(struct iwl_
/* Unmap chunks, if any. */
for (i = 1; i < num_tbs; i++)
dma_unmap_single(priv->bus.dev, iwl_tfd_tb_get_addr(tfd, i),
- iwl_tfd_tb_get_len(tfd, i), DMA_TO_DEVICE);
+ iwl_tfd_tb_get_len(tfd, i), dma_dir);
}

/**
@@ -166,7 +166,8 @@ void iwlagn_txq_free_tfd(struct iwl_priv
struct iwl_tfd *tfd_tmp = txq->tfds;
int index = txq->q.read_ptr;

- iwlagn_unmap_tfd(priv, &txq->meta[index], &tfd_tmp[index]);
+ iwlagn_unmap_tfd(priv, &txq->meta[index], &tfd_tmp[index],
+ DMA_TO_DEVICE);

/* free SKB */
if (txq->txb) {
@@ -309,7 +310,8 @@ void iwl_cmd_queue_unmap(struct iwl_priv
i = get_cmd_index(q, q->read_ptr);

if (txq->meta[i].flags & CMD_MAPPED) {
- iwlagn_unmap_tfd(priv, &txq->meta[i], &txq->tfds[i]);
+ iwlagn_unmap_tfd(priv, &txq->meta[i], &txq->tfds[i],
+ DMA_BIDIRECTIONAL);
txq->meta[i].flags = 0;
}

@@ -691,10 +693,11 @@ int iwl_enqueue_hcmd(struct iwl_priv *pr
if (!(cmd->dataflags[i] & IWL_HCMD_DFL_NOCOPY))
continue;
phys_addr = dma_map_single(priv->bus.dev, (void *)cmd->data[i],
- cmd->len[i], DMA_TO_DEVICE);
+ cmd->len[i], DMA_BIDIRECTIONAL);
if (dma_mapping_error(priv->bus.dev, phys_addr)) {
iwlagn_unmap_tfd(priv, out_meta,
- &txq->tfds[q->write_ptr]);
+ &txq->tfds[q->write_ptr],
+ DMA_BIDIRECTIONAL);
idx = -ENOMEM;
goto out;
}
@@ -798,7 +801,7 @@ void iwl_tx_cmd_complete(struct iwl_priv
cmd = txq->cmd[cmd_index];
meta = &txq->meta[cmd_index];

- iwlagn_unmap_tfd(priv, meta, &txq->tfds[index]);
+ iwlagn_unmap_tfd(priv, meta, &txq->tfds[index], DMA_BIDIRECTIONAL);

/* Input error checking is done when commands are added to queue. */
if (meta->flags & CMD_WANT_SKB) {
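
The patch above threads a direction argument through iwlagn_unmap_tfd() because an unmap should use the same direction as the matching map. As a rough illustration only, here is a userspace toy that flags a mismatched unmap, similar in spirit to what the kernel's DMA-API debugging option reports. Every name in it (chk_map, chk_unmap, struct chk_tracker, and so on) is hypothetical and not part of the driver or the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as the kernel's enum dma_data_direction,
 * redefined here so the sketch builds in userspace. */
enum chk_dma_dir {
	CHK_DMA_BIDIRECTIONAL = 0,
	CHK_DMA_TO_DEVICE = 1,
	CHK_DMA_FROM_DEVICE = 2,
};

enum chk_result { CHK_OK, CHK_NOT_MAPPED, CHK_DIR_MISMATCH };

#define CHK_MAX_ENTRIES 8

/* Hypothetical bookkeeping record for one live mapping. */
struct chk_entry {
	uint64_t addr;
	size_t len;
	enum chk_dma_dir dir;
	bool live;
};

struct chk_tracker {
	struct chk_entry entries[CHK_MAX_ENTRIES];
};

/* Record a mapping; returns 0 on success, -1 if the table is full. */
static int chk_map(struct chk_tracker *t, uint64_t addr, size_t len,
		   enum chk_dma_dir dir)
{
	assert(t != NULL);
	for (size_t i = 0; i < CHK_MAX_ENTRIES; i++) {
		if (!t->entries[i].live) {
			t->entries[i] = (struct chk_entry){ addr, len, dir, true };
			return 0;
		}
	}
	return -1;
}

/* Release a mapping, but only if the direction matches the one used to
 * create it. A mismatch leaves the entry live so the caller can retry. */
static enum chk_result chk_unmap(struct chk_tracker *t, uint64_t addr,
				 size_t len, enum chk_dma_dir dir)
{
	assert(t != NULL);
	for (size_t i = 0; i < CHK_MAX_ENTRIES; i++) {
		struct chk_entry *e = &t->entries[i];

		if (!e->live || e->addr != addr || e->len != len)
			continue;
		if (e->dir != dir)
			return CHK_DIR_MISMATCH;
		e->live = false;
		return CHK_OK;
	}
	return CHK_NOT_MAPPED;
}
```

In this toy, mapping a command chunk with CHK_DMA_BIDIRECTIONAL and then unmapping it with CHK_DMA_TO_DEVICE yields CHK_DIR_MISMATCH, which is the situation the extra dma_dir parameter avoids.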



2011-06-25 13:08:56

by Johannes Berg

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Fri, 2011-06-24 at 09:13 -0600, Andrew Lutomirski wrote:
> On Wed, Jun 22, 2011 at 2:24 PM, Johannes Berg
> <[email protected]> wrote:
> > Andy, could you try this patch?
> >
> > I haven't tested it at all yet, it's kinda late here.

> Works for me.
>
> Tested-by: Andy Lutomirski <[email protected]>

Thanks Andrew, Kyle.

I don't pretend to understand what the device is doing, but for now I
guess we'll keep this and I'll try to figure out independently why this
happens.

johannes


2011-06-21 13:38:51

by Johannes Berg

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Mon, 2011-06-20 at 14:44 -0400, Andrew Lutomirski wrote:
> On 3.0-git today, I get:
>
> [ 40.142872] DRHD: handling fault status reg 2
> [ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
> [ 40.142896] DMAR:[fault reason 05] PTE Write access is not set
>
> lspci says:
>
> 03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
>
> This seems to be triggered by interface state changes. The driver
> still works. I don't see this on 2.6.39.1.

Thanks Andy. Do you happen to know if there's any way to figure out what
ffe38000 was? Clearly the device is attempting to write something that
we haven't mapped properly, but I'm not sure _what_?

johannes
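
One way to answer the "what was ffe38000?" question is to log every address the driver maps and then match the faulting address against that log. The following is a minimal userspace sketch of that idea, not driver code. The struct, the function names and the tags are hypothetical, and it assumes the fault address is reported at 4 KiB page granularity.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SKETCH_PAGE_SIZE 4096u	/* assumption: faults are reported per 4 KiB page */

/* Hypothetical record of one logged mapping. */
struct logged_map {
	uint64_t iova;		/* bus address returned by the mapping call */
	size_t len;		/* length that was mapped */
	const char *tag;	/* free-form label written by the debug print */
};

/* Pull the address out of a line such as
 * "DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000".
 * Returns 1 on success, 0 if the line carries no fault address. */
static int parse_fault_addr(const char *line, uint64_t *addr)
{
	const char *p;
	unsigned long long v;

	assert(line != NULL && addr != NULL);
	p = strstr(line, "fault addr ");
	if (!p || sscanf(p, "fault addr %llx", &v) != 1)
		return 0;
	*addr = v;
	return 1;
}

/* Return the first logged mapping that overlaps the faulting page,
 * or NULL if nothing in the log covers it. */
static const struct logged_map *find_mapping(const struct logged_map *maps,
					     size_t n, uint64_t fault_addr)
{
	uint64_t page = fault_addr & ~(uint64_t)(SKETCH_PAGE_SIZE - 1);

	for (size_t i = 0; i < n; i++) {
		if (maps[i].iova < page + SKETCH_PAGE_SIZE &&
		    maps[i].iova + maps[i].len > page)
			return &maps[i];
	}
	return NULL;
}
```

Feeding it a fault line and a table of logged mappings returns the entry whose range covers the faulting page, or NULL when the address was never logged.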


2011-06-24 15:13:48

by Andrew Lutomirski

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Wed, Jun 22, 2011 at 2:24 PM, Johannes Berg
<[email protected]> wrote:
> Andy, could you try this patch?
>
> I haven't tested it at all yet, it's kinda late here.
>
> johannes
>
> iwlagn: map command buffers BIDI

Works for me.

Tested-by: Andy Lutomirski <[email protected]>

Thanks!

2011-06-24 20:56:08

by Kyle McMartin

Subject: Re: [3.0 regression] DMAR faults in iwlagn

Fixing it up to make it apply against 3.0-rc4 kills the warning for me
as well. Thanks for being so quick to post a fix, Johannes!

Tested-by: Kyle McMartin <[email protected]>

---
diff --git a/drivers/net/wireless/iwlwifi/iwl-tx.c b/drivers/net/wireless/iwlwifi/iwl-tx.c
index 686e176..11da90a 100644
--- a/drivers/net/wireless/iwlwifi/iwl-tx.c
+++ b/drivers/net/wireless/iwlwifi/iwl-tx.c
@@ -126,7 +126,7 @@ static inline u8 iwl_tfd_get_num_tbs(struct iwl_tfd *tfd)
}

static void iwlagn_unmap_tfd(struct iwl_priv *priv, struct iwl_cmd_meta *meta,
- struct iwl_tfd *tfd)
+ struct iwl_tfd *tfd, enum dma_data_direction dma_dir)
{
struct pci_dev *dev = priv->pci_dev;
int i;
@@ -151,7 +151,7 @@ static void iwlagn_unmap_tfd(struct iwl_priv *priv, struct iwl_cmd_meta *meta,
/* Unmap chunks, if any. */
for (i = 1; i < num_tbs; i++)
pci_unmap_single(dev, iwl_tfd_tb_get_addr(tfd, i),
- iwl_tfd_tb_get_len(tfd, i), PCI_DMA_TODEVICE);
+ iwl_tfd_tb_get_len(tfd, i), dma_dir);
}

/**
@@ -167,7 +167,8 @@ void iwlagn_txq_free_tfd(struct iwl_priv *priv, struct iwl_tx_queue *txq)
struct iwl_tfd *tfd_tmp = txq->tfds;
int index = txq->q.read_ptr;

- iwlagn_unmap_tfd(priv, &txq->meta[index], &tfd_tmp[index]);
+ iwlagn_unmap_tfd(priv, &txq->meta[index], &tfd_tmp[index],
+ PCI_DMA_TODEVICE);

/* free SKB */
if (txq->txb) {
@@ -700,10 +701,11 @@ int iwl_enqueue_hcmd(struct iwl_priv *priv, struct iwl_host_cmd *cmd)
if (!(cmd->dataflags[i] & IWL_HCMD_DFL_NOCOPY))
continue;
phys_addr = pci_map_single(priv->pci_dev, (void *)cmd->data[i],
- cmd->len[i], PCI_DMA_TODEVICE);
+ cmd->len[i], PCI_DMA_BIDIRECTIONAL);
if (pci_dma_mapping_error(priv->pci_dev, phys_addr)) {
iwlagn_unmap_tfd(priv, out_meta,
- &txq->tfds[q->write_ptr]);
+ &txq->tfds[q->write_ptr],
+ PCI_DMA_BIDIRECTIONAL);
idx = -ENOMEM;
goto out;
}
@@ -807,7 +809,7 @@ void iwl_tx_cmd_complete(struct iwl_priv *priv, struct iwl_rx_mem_buffer *rxb)
cmd = txq->cmd[cmd_index];
meta = &txq->meta[cmd_index];

- iwlagn_unmap_tfd(priv, meta, &txq->tfds[index]);
+ iwlagn_unmap_tfd(priv, meta, &txq->tfds[index], PCI_DMA_BIDIRECTIONAL);

/* Input error checking is done when commands are added to queue. */
if (meta->flags & CMD_WANT_SKB) {

2011-06-22 20:26:59

by Johannes Berg

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Wed, 2011-06-22 at 22:24 +0200, Johannes Berg wrote:
> Andy, could you try this patch?
>
> I haven't tested it at all yet, it's kinda late here.

Oh, it also requires this patch first:

http://git.kernel.org/?p=linux/kernel/git/iwlwifi/iwlwifi-2.6.git;a=commitdiff;h=3d007d90374fe263d9e7982d04217cdcbaceb54c

johannes


2011-06-20 19:22:11

by Andrew Lutomirski

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Mon, Jun 20, 2011 at 3:06 PM, wwguy <[email protected]> wrote:
> On Mon, 2011-06-20 at 11:44 -0700, Andrew Lutomirski wrote:
>> On 3.0-git today, I get:
>>
>> [ 40.142872] DRHD: handling fault status reg 2
>> [ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
>> [ 40.142896] DMAR:[fault reason 05] PTE Write access is not set
>>
>> lspci says:
>>
>> 03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
>>
>> This seems to be triggered by interface state changes. The driver
>> still works. I don't see this on 2.6.39.1.
>>
> Can I get more information? Which repo are you using?
> git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-testing.git

This is de505e709ffb09a7382ca8e0d8c7dbb171ba5830 from Linus' tree.

>
> Can you provide full log?

dmesg attached.

/var/log/messages says:

NetworkManager[804]: <info> (wlan0): bringing up device.
kernel: [ 85.153160] DRHD: handling fault status reg 2
kernel: [ 85.153175] DMAR:[DMA Write] Request device [03:00.0] fault
addr ffe0e000
kernel: [ 85.153178] DMAR:[fault reason 05] PTE Write access is not set
kernel: [ 85.207502] ADDRCONF(NETDEV_UP): wlan0: link is not ready
NetworkManager[804]: <info> Activation (wlan0/wireless): access point
'xxxx' has security, but secrets are required.
NetworkManager[804]: <info> (wlan0): device state change: config ->
need-auth (reason 'none') [50 60 0]
NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
Configure) complete.
NetworkManager[804]: <info> Activation (wlan0) Stage 1 of 5 (Device
Prepare) scheduled...
NetworkManager[804]: <info> Activation (wlan0) Stage 1 of 5 (Device
Prepare) started...
NetworkManager[804]: <info> (wlan0): device state change: need-auth ->
prepare (reason 'none') [60 40 0]
NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
Configure) scheduled...
NetworkManager[804]: <info> Activation (wlan0) Stage 1 of 5 (Device
Prepare) complete.
NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
Configure) starting...
NetworkManager[804]: <info> (wlan0): device state change: prepare ->
config (reason 'none') [40 50 0]
NetworkManager[804]: <info> Activation (wlan0/wireless): connection
'Auto Aether Wind' has security, and secrets exist. No new secrets
needed.
NetworkManager[804]: <info> Config: added 'ssid' value 'xxxx'
NetworkManager[804]: <info> Config: added 'scan_ssid' value '1'
NetworkManager[804]: <info> Config: added 'key_mgmt' value 'WPA-PSK'
NetworkManager[804]: <info> Config: added 'psk' value '<omitted>'
NetworkManager[804]: <info> Config: added 'proto' value 'WPA RSN'
NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
Configure) complete.
NetworkManager[804]: <info> Config: set interface ap_scan to 1
kernel: [ 85.228522] DRHD: handling fault status reg 2
kernel: [ 85.228526] DMAR:[DMA Write] Request device [03:00.0] fault
addr ffeb8000
kernel: [ 85.228527] DMAR:[fault reason 05] PTE Write access is not set
NetworkManager[804]: <info> (wlan0): supplicant interface state:
inactive -> scanning
kernel: [ 85.471825] DRHD: handling fault status reg 2
kernel: [ 85.471840] DMAR:[DMA Write] Request device [03:00.0] fault
addr ffe54000
kernel: [ 85.471843] DMAR:[fault reason 05] PTE Write access is not set
NetworkManager[804]: <info> (wlan0): supplicant interface state:
scanning -> authenticating


--Andy


Attachments:
dmesg.txt (89.27 kB)

2011-06-21 13:59:10

by Wey-Yi Guy

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Tue, 2011-06-21 at 06:38 -0700, Johannes Berg wrote:
> On Mon, 2011-06-20 at 14:44 -0400, Andrew Lutomirski wrote:
> > On 3.0-git today, I get:
> >
> > [ 40.142872] DRHD: handling fault status reg 2
> > [ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
> > [ 40.142896] DMAR:[fault reason 05] PTE Write access is not set
> >
> > lspci says:
> >
> > 03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
> >
> > This seems to be triggered by interface state changes. The driver
> > still works. I don't see this on 2.6.39.1.
>
> Thanks Andy. Do you happen to know if there's any way to figure out what
> ffe38000 was? Clearly the device is attempting to write something that
> we haven't mapped properly, but I'm not sure _what_?

Andy, did you have a chance to try the wireless-testing tree? I could not
see the issue, but maybe we can bisect the problem if you also don't see
it there.

Thanks
Wey


2011-06-20 19:46:36

by Wey-Yi Guy

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Mon, 2011-06-20 at 12:21 -0700, Andrew Lutomirski wrote:
> On Mon, Jun 20, 2011 at 3:06 PM, wwguy <[email protected]> wrote:
> > On Mon, 2011-06-20 at 11:44 -0700, Andrew Lutomirski wrote:
> >> On 3.0-git today, I get:
> >>
> >> [ 40.142872] DRHD: handling fault status reg 2
> >> [ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
> >> [ 40.142896] DMAR:[fault reason 05] PTE Write access is not set
> >>
> >> lspci says:
> >>
> >> 03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
> >>
> >> This seems to be triggered by interface state changes. The driver
> >> still works. I don't see this on 2.6.39.1.
> >>
> > Can I get more information? Which repo are you using?
> > git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-testing.git
>
> This is de505e709ffb09a7382ca8e0d8c7dbb171ba5830 from Linus' tree.
>
> >
> > Can you provide full log?
>
> dmesg attached.
>
> /var/log/messages says:
>
> NetworkManager[804]: <info> (wlan0): bringing up device.
> kernel: [ 85.153160] DRHD: handling fault status reg 2
> kernel: [ 85.153175] DMAR:[DMA Write] Request device [03:00.0] fault
> addr ffe0e000
> kernel: [ 85.153178] DMAR:[fault reason 05] PTE Write access is not set
> kernel: [ 85.207502] ADDRCONF(NETDEV_UP): wlan0: link is not ready
> NetworkManager[804]: <info> Activation (wlan0/wireless): access point
> 'xxxx' has security, but secrets are required.
> NetworkManager[804]: <info> (wlan0): device state change: config ->
> need-auth (reason 'none') [50 60 0]
> NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
> Configure) complete.
> NetworkManager[804]: <info> Activation (wlan0) Stage 1 of 5 (Device
> Prepare) scheduled...
> NetworkManager[804]: <info> Activation (wlan0) Stage 1 of 5 (Device
> Prepare) started...
> NetworkManager[804]: <info> (wlan0): device state change: need-auth ->
> prepare (reason 'none') [60 40 0]
> NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
> Configure) scheduled...
> NetworkManager[804]: <info> Activation (wlan0) Stage 1 of 5 (Device
> Prepare) complete.
> NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
> Configure) starting...
> NetworkManager[804]: <info> (wlan0): device state change: prepare ->
> config (reason 'none') [40 50 0]
> NetworkManager[804]: <info> Activation (wlan0/wireless): connection
> 'Auto Aether Wind' has security, and secrets exist. No new secrets
> needed.
> NetworkManager[804]: <info> Config: added 'ssid' value 'xxxx'
> NetworkManager[804]: <info> Config: added 'scan_ssid' value '1'
> NetworkManager[804]: <info> Config: added 'key_mgmt' value 'WPA-PSK'
> NetworkManager[804]: <info> Config: added 'psk' value '<omitted>'
> NetworkManager[804]: <info> Config: added 'proto' value 'WPA RSN'
> NetworkManager[804]: <info> Activation (wlan0) Stage 2 of 5 (Device
> Configure) complete.
> NetworkManager[804]: <info> Config: set interface ap_scan to 1
> kernel: [ 85.228522] DRHD: handling fault status reg 2
> kernel: [ 85.228526] DMAR:[DMA Write] Request device [03:00.0] fault
> addr ffeb8000
> kernel: [ 85.228527] DMAR:[fault reason 05] PTE Write access is not set
> NetworkManager[804]: <info> (wlan0): supplicant interface state:
> inactive -> scanning
> kernel: [ 85.471825] DRHD: handling fault status reg 2
> kernel: [ 85.471840] DMAR:[DMA Write] Request device [03:00.0] fault
> addr ffe54000
> kernel: [ 85.471843] DMAR:[fault reason 05] PTE Write access is not set
> NetworkManager[804]: <info> (wlan0): supplicant interface state:
> scanning -> authenticating
>
>
Do you see the same problem if you pull from wireless-testing? I did not
see any iwlagn-related messages in the log file, but I am pulling from
Linus' tree and will check it out.

Thanks
Wey




2011-06-20 19:10:45

by Wey-Yi Guy

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Mon, 2011-06-20 at 11:44 -0700, Andrew Lutomirski wrote:
> On 3.0-git today, I get:
>
> [ 40.142872] DRHD: handling fault status reg 2
> [ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
> [ 40.142896] DMAR:[fault reason 05] PTE Write access is not set
>
> lspci says:
>
> 03:00.0 Network controller: Intel Corporation Centrino Ultimate-N 6300 (rev 35)
>
> This seems to be triggered by interface state changes. The driver
> still works. I don't see this on 2.6.39.1.
>
Can I get more information? Which repo are you using?
git://git.kernel.org/pub/scm/linux/kernel/git/linville/wireless-testing.git

Can you provide full log?

Thanks
Wey


2011-06-22 18:54:05

by Johannes Berg

Subject: Re: [3.0 regression] DMAR faults in iwlagn

On Mon, 2011-06-20 at 14:44 -0400, Andrew Lutomirski wrote:
> On 3.0-git today, I get:
>
> [ 40.142872] DRHD: handling fault status reg 2
> [ 40.142892] DMAR:[DMA Write] Request device [03:00.0] fault addr ffe38000
> [ 40.142896] DMAR:[fault reason 05] PTE Write access is not set

Reinette also said she saw this and kindly let me use one of her systems
to track it down.

It looks like this:

iwl_enqueue_hcmd : mapping 1 at ffce2000
dmar_fault : DMAR:[DMA Write] Request device [02:00.0] fault addr ffce2000
DMAR:[fault reason 05] PTE Write access is not set

So it's really due to my command changes, and the device is attempting
to write back to at least some commands. I see it doing that for PHY
calibration data, but I think on Reinette's system it was probably for
other commands as well since it happened a lot...

I think we have to map commands BIDI, even if I don't really understand
why.

johannes
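
To make the "PTE Write access is not set" message concrete: with an IOMMU in the path, the mapping direction decides which access bits the device gets, so a buffer mapped device-read-only faults as soon as the device writes to it. The sketch below is a userspace toy model of that rule, not the intel-iommu implementation. The toy_* names and the two permission bits are assumptions made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Same numeric values as the kernel's enum dma_data_direction,
 * redefined here so the sketch builds in userspace. */
enum toy_dma_dir {
	TOY_DMA_BIDIRECTIONAL = 0,
	TOY_DMA_TO_DEVICE = 1,
	TOY_DMA_FROM_DEVICE = 2,
};

#define TOY_PTE_READ  0x1u	/* device may read host memory */
#define TOY_PTE_WRITE 0x2u	/* device may write host memory */

/* Hypothetical stand-in for one IOMMU mapping. */
struct toy_mapping {
	uint64_t iova;
	size_t len;
	unsigned int prot;
};

/* Translate a DMA direction into the access bits the device receives. */
static unsigned int toy_dir_to_prot(enum toy_dma_dir dir)
{
	switch (dir) {
	case TOY_DMA_TO_DEVICE:
		return TOY_PTE_READ;
	case TOY_DMA_FROM_DEVICE:
		return TOY_PTE_WRITE;
	case TOY_DMA_BIDIRECTIONAL:
		return TOY_PTE_READ | TOY_PTE_WRITE;
	}
	return 0;
}

static struct toy_mapping toy_map(uint64_t iova, size_t len,
				  enum toy_dma_dir dir)
{
	struct toy_mapping m = { iova, len, toy_dir_to_prot(dir) };

	assert(len > 0);
	return m;
}

static bool toy_in_range(const struct toy_mapping *m, uint64_t addr)
{
	return addr >= m->iova && addr - m->iova < m->len;
}

static bool toy_device_read_ok(const struct toy_mapping *m, uint64_t addr)
{
	return toy_in_range(m, addr) && (m->prot & TOY_PTE_READ) != 0;
}

/* A false return models the "PTE Write access is not set" fault. */
static bool toy_device_write_ok(const struct toy_mapping *m, uint64_t addr)
{
	return toy_in_range(m, addr) && (m->prot & TOY_PTE_WRITE) != 0;
}
```

In this model a command buffer mapped with TOY_DMA_TO_DEVICE passes toy_device_read_ok() but fails toy_device_write_ok(), while the same buffer mapped with TOY_DMA_BIDIRECTIONAL passes both, which is the effect of mapping commands BIDI.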