2016-11-08 07:03:49

by Cao jin

Subject: [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

When running as a guest, under certain conditions, the kernel oopses as shown
in the log that follows. The writel() in igb_configure_tx_ring() oopses because
hw->hw_addr is NULL. Other register accesses do not oops the kernel because
they use wr32/rd32, which guard against a NULL pointer.

[ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
error received: id=0101
[ 141.225523] igb 0000:01:00.1: PCIe Bus Error:
severity=Uncorrected (Fatal), type=Unaccessible,
id=0101(Unregistered Agent ID)
[ 141.299442] igb 0000:01:00.1: broadcast error_detected message
[ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
detached
[ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
detached
[ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
[ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
[ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
[ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
[ 145.312078] igb 0000:01:00.1: broadcast resume message
[ 145.322211] BUG: unable to handle kernel paging request at
0000000000003818
[ 145.361275] IP: [<ffffffffa02fd38d>]
igb_configure_tx_ring+0x14d/0x280 [igb]
[ 145.400048] PGD 0
[ 145.438007] Oops: 0002 [#1] SMP
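
The faulting address in this log is consistent with a NULL base pointer plus a register offset. The following is a minimal sketch, assuming the tail-register offset macros have the form used in the igb register header (the constants are reproduced from memory, so treat them as an assumption), and with `tail_address()` being a hypothetical helper that exists only for this illustration:

```c
#include <stdint.h>

/* Assumed to mirror E1000_TDT()/E1000_RDT() in the igb register header:
 * queues 0-3 sit in one register block, higher queues in a second one. */
static inline uint32_t tdt_offset(unsigned int n)
{
	return n < 4 ? 0x03818u + n * 0x100u : 0x0E018u + n * 0x40u;
}

static inline uint32_t rdt_offset(unsigned int n)
{
	return n < 4 ? 0x02818u + n * 0x100u : 0x0C018u + n * 0x40u;
}

/* Address that writel() would touch for a given MMIO base. A base of 0
 * stands in for hw->hw_addr == NULL. */
static inline uintptr_t tail_address(uintptr_t base, uint32_t offset)
{
	return base + offset;
}
```

With a base of 0 and TX queue 0, `tail_address(0, tdt_offset(0))` evaluates to 0x3818, which matches the address reported in the paging-request line.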

A similar issue and solution can be found at:
http://patchwork.ozlabs.org/patch/689592/

Signed-off-by: Cao jin <[email protected]>
---
drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
index edc9a6a..3f240ac 100644
--- a/drivers/net/ethernet/intel/igb/igb_main.c
+++ b/drivers/net/ethernet/intel/igb/igb_main.c
@@ -3390,7 +3390,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
 	     tdba & 0x00000000ffffffffULL);
 	wr32(E1000_TDBAH(reg_idx), tdba >> 32);

-	ring->tail = hw->hw_addr + E1000_TDT(reg_idx);
+	ring->tail = adapter->io_addr + E1000_TDT(reg_idx);
 	wr32(E1000_TDH(reg_idx), 0);
 	writel(0, ring->tail);

@@ -3729,7 +3729,7 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
 	     ring->count * sizeof(union e1000_adv_rx_desc));

 	/* initialize head and tail */
-	ring->tail = hw->hw_addr + E1000_RDT(reg_idx);
+	ring->tail = adapter->io_addr + E1000_RDT(reg_idx);
 	wr32(E1000_RDH(reg_idx), 0);
 	writel(0, ring->tail);

--
2.1.0




2016-11-08 16:42:18

by Corinna Vinschen

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Nov 8 15:06, Cao jin wrote:
> When running as guest, under certain condition, it will oops as following.
> writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> is NULL. While other register access won't oops kernel because they use
> wr32/rd32 which have a defense against NULL pointer.
>
> [ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
> error received: id=0101
> [ 141.225523] igb 0000:01:00.1: PCIe Bus Error:
> severity=Uncorrected (Fatal), type=Unaccessible,
> id=0101(Unregistered Agent ID)
> [ 141.299442] igb 0000:01:00.1: broadcast error_detected message
> [ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
> detached
> [ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
> detached
> [ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
> [ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
> [ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
> [ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
> [ 145.312078] igb 0000:01:00.1: broadcast resume message
> [ 145.322211] BUG: unable to handle kernel paging request at
> 0000000000003818
> [ 145.361275] IP: [<ffffffffa02fd38d>]
> igb_configure_tx_ring+0x14d/0x280 [igb]
> [ 145.400048] PGD 0
> [ 145.438007] Oops: 0002 [#1] SMP
>
> A similar issue & solution could be found at:
> http://patchwork.ozlabs.org/patch/689592/
>
> Signed-off-by: Cao jin <[email protected]>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index edc9a6a..3f240ac 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -3390,7 +3390,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
> tdba & 0x00000000ffffffffULL);
> wr32(E1000_TDBAH(reg_idx), tdba >> 32);
>
> - ring->tail = hw->hw_addr + E1000_TDT(reg_idx);
> + ring->tail = adapter->io_addr + E1000_TDT(reg_idx);
> wr32(E1000_TDH(reg_idx), 0);
> writel(0, ring->tail);
>
> @@ -3729,7 +3729,7 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
> ring->count * sizeof(union e1000_adv_rx_desc));
>
> /* initialize head and tail */
> - ring->tail = hw->hw_addr + E1000_RDT(reg_idx);
> + ring->tail = adapter->io_addr + E1000_RDT(reg_idx);
> wr32(E1000_RDH(reg_idx), 0);
> writel(0, ring->tail);
>
> --
> 2.1.0

Incidentally we're just looking for a solution to that problem too.
Do three patches to fix the same problem at roughly the same time already
qualify as a freak accident?

FTR, I attached my current patch, which I was planning to submit after
some external testing.

However, all three patches have one thing in common: They work around
a somewhat dubious resetting of the hardware address to NULL in case
reading from a register failed.

That makes me wonder if setting the hardware address to NULL in
rd32/igb_rd32 is really such a good idea. It's performed in a function
whose return value is *never* tested for validity in the calling
functions and leads to subsequent crashes since no tests for hw_addr ==
NULL are performed.

Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
graceful way to handle the "surprise removal"?
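
For readers without the driver source at hand, the behaviour questioned here can be modelled in user space. This is only a simplified sketch of the pattern as described in this thread, not the driver code: `struct model_adapter`, `model_rd32()`, `model_wr32()` and `model_tail_from_hw_addr()` are hypothetical names, and a plain byte array stands in for the mapped register window.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define MODEL_BAR_SIZE 0x4000u  /* hypothetical size of the fake register window */
#define MODEL_TDT0     0x3818u  /* tail register offset for TX queue 0 */

struct model_adapter {
	uint8_t *hw_addr;  /* stands in for hw->hw_addr; cleared on a suspicious read */
	uint8_t *io_addr;  /* stands in for adapter->io_addr; keeps the mapping */
	int detached;      /* stands in for the netdev having been detached */
};

static inline uint32_t load32(const uint8_t *p)
{
	uint32_t v;

	memcpy(&v, p, sizeof(v));
	return v;
}

/* Guarded read: all ones from the register and from register 0 is taken
 * to mean the device is gone, so the base pointer is dropped. */
static inline uint32_t model_rd32(struct model_adapter *a, uint32_t reg)
{
	uint8_t *base = a->hw_addr;
	uint32_t value;

	if (!base)
		return UINT32_MAX;

	value = load32(base + reg);
	if (value == UINT32_MAX && (reg == 0 || load32(base) == UINT32_MAX)) {
		a->hw_addr = NULL;
		a->detached = 1;
	}
	return value;
}

/* Guarded write: returns 1 if the store happened, 0 if it was skipped. */
static inline int model_wr32(struct model_adapter *a, uint32_t reg, uint32_t val)
{
	uint8_t *base = a->hw_addr;

	if (!base)
		return 0;
	memcpy(base + reg, &val, sizeof(val));
	return 1;
}

/* Unguarded tail computation, as in the pre-patch ring setup. With a NULL
 * base the result is just the bare register offset. */
static inline uintptr_t model_tail_from_hw_addr(const struct model_adapter *a)
{
	return (uintptr_t)a->hw_addr + MODEL_TDT0;
}
```

In this model, one read that returns all ones clears `hw_addr`. Guarded writes are then silently skipped, the unguarded tail computation yields 0x3818, and `io_addr + MODEL_TDT0` still points into the window, which is the property the patch relies on.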


Thanks,
Corinna


Attachments:
(No filename) (0.00 B)
signature.asc (819.00 B)

2016-11-08 17:26:45

by Hisashi T Fujinaka

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Tue, 8 Nov 2016, Corinna Vinschen wrote:

> On Nov 8 15:06, Cao jin wrote:
>> When running as guest, under certain condition, it will oops as following.
>> writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
>> is NULL. While other register access won't oops kernel because they use
>> wr32/rd32 which have a defense against NULL pointer.
>>
>> [ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
>> error received: id=0101
>> [ 141.225523] igb 0000:01:00.1: PCIe Bus Error:
>> severity=Uncorrected (Fatal), type=Unaccessible,
>> id=0101(Unregistered Agent ID)
>> [ 141.299442] igb 0000:01:00.1: broadcast error_detected message
>> [ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
>> detached
>> [ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
>> detached
>> [ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
>> [ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
>> [ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
>> [ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
>> [ 145.312078] igb 0000:01:00.1: broadcast resume message
>> [ 145.322211] BUG: unable to handle kernel paging request at
>> 0000000000003818
>> [ 145.361275] IP: [<ffffffffa02fd38d>]
>> igb_configure_tx_ring+0x14d/0x280 [igb]
>> [ 145.400048] PGD 0
>> [ 145.438007] Oops: 0002 [#1] SMP
>>
>> A similar issue & solution could be found at:
>> http://patchwork.ozlabs.org/patch/689592/
>>
>> Signed-off-by: Cao jin <[email protected]>
>> ---
>> drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
>> 1 file changed, 2 insertions(+), 2 deletions(-)
>>
>> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
>> index edc9a6a..3f240ac 100644
>> --- a/drivers/net/ethernet/intel/igb/igb_main.c
>> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
>> @@ -3390,7 +3390,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
>> tdba & 0x00000000ffffffffULL);
>> wr32(E1000_TDBAH(reg_idx), tdba >> 32);
>>
>> - ring->tail = hw->hw_addr + E1000_TDT(reg_idx);
>> + ring->tail = adapter->io_addr + E1000_TDT(reg_idx);
>> wr32(E1000_TDH(reg_idx), 0);
>> writel(0, ring->tail);
>>
>> @@ -3729,7 +3729,7 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
>> ring->count * sizeof(union e1000_adv_rx_desc));
>>
>> /* initialize head and tail */
>> - ring->tail = hw->hw_addr + E1000_RDT(reg_idx);
>> + ring->tail = adapter->io_addr + E1000_RDT(reg_idx);
>> wr32(E1000_RDH(reg_idx), 0);
>> writel(0, ring->tail);
>>
>> --
>> 2.1.0
>
> Incidentally we're just looking for a solution to that problem too.
> Do three patches to fix the same problem at roughly the same time already
> qualify as freak accident?
>
> FTR, I attached my current patch, which I was planning to submit after
> some external testing.
>
> However, all three patches have one thing in common: They work around
> a somewhat dubious resetting of the hardware address to NULL in case
> reading from a register failed.
>
> That makes me wonder if setting the hardware address to NULL in
> rd32/igb_rd32 is really such a good idea. It's performed in a function
> whose return value is *never* tested for validity in the calling
> functions and leads to subsequent crashes since no tests for hw_addr ==
> NULL are performed.
>
> Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
> graceful way to handle the "surprise removal"?

Answering this from my home account because, well, work is Outlook.

"Reconsidering" would be great. In fact, revert it if you'd like. I'm
uncertain that the surprise removal code actually works the way I
thought previously and I think I took a lot of it out of my local code.

Unfortunately I don't have any equipment that I can use to reproduce
surprise removal any longer so that means I wouldn't be able to test
anything. I have to defer to you or Cao Jin.

--
Hisashi T Fujinaka - [email protected] ([email protected])

2016-11-08 18:32:15

by Hisashi T Fujinaka

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Tue, 8 Nov 2016, Hisashi T Fujinaka wrote:

>> Incidentally we're just looking for a solution to that problem too.
>> Do three patches to fix the same problem at roughly the same time already
>> qualify as freak accident?
>>
>> FTR, I attached my current patch, which I was planning to submit after
>> some external testing.
>>
>> However, all three patches have one thing in common: They work around
>> a somewhat dubious resetting of the hardware address to NULL in case
>> reading from a register failed.
>>
>> That makes me wonder if setting the hardware address to NULL in
>> rd32/igb_rd32 is really such a good idea. It's performed in a function
>> whose return value is *never* tested for validity in the calling
>> functions and leads to subsequent crashes since no tests for hw_addr ==
>> NULL are performed.
>>
>> Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
>> graceful way to handle the "surprise removal"?
>
> Answering this from my home account because, well, work is Outlook.
>
> "Reconsidering" would be great. In fact, revert it if you'd like. I'm
> uncertain that the surprise removal code actually works the way I
> thought previously and I think I took a lot of it out of my local code.
>
> Unfortunately I don't have any equipment that I can use to reproduce
> surprise removal any longer so that means I wouldn't be able to test
> anything. I have to defer to you or Cao Jin.

Whoops. Never mind. I was just told that I had a bug that Alex Duyck and
Cao Jin just fixed. I'd stick to listening to Alex.

--
Hisashi T Fujinaka - [email protected]

2016-11-08 18:37:43

by Corinna Vinschen

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Nov 8 09:16, Hisashi T Fujinaka wrote:
> On Tue, 8 Nov 2016, Corinna Vinschen wrote:
> > On Nov 8 15:06, Cao jin wrote:
> > > When running as guest, under certain condition, it will oops as following.
> > > writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> > > is NULL. While other register access won't oops kernel because they use
> > > wr32/rd32 which have a defense against NULL pointer.
> > > [...]
> >
> > Incidentally we're just looking for a solution to that problem too.
> > Do three patches to fix the same problem at roughly the same time already
> > qualify as freak accident?
> >
> > FTR, I attached my current patch, which I was planning to submit after
> > some external testing.
> >
> > However, all three patches have one thing in common: They work around
> > a somewhat dubious resetting of the hardware address to NULL in case
> > reading from a register failed.
> >
> > That makes me wonder if setting the hardware address to NULL in
> > rd32/igb_rd32 is really such a good idea. It's performed in a function
> > whose return value is *never* tested for validity in the calling
> > functions and leads to subsequent crashes since no tests for hw_addr ==
> > NULL are performed.
> >
> > Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
> > graceful way to handle the "surprise removal"?
>
> Answering this from my home account because, well, work is Outlook.
>
> "Reconsidering" would be great. In fact, revert it if you'd like. I'm
> uncertain that the surprise removal code actually works the way I
> thought previously and I think I took a lot of it out of my local code.
>
> Unfortunately I don't have any equipment that I can use to reproduce
> surprise removal any longer so that means I wouldn't be able to test
> anything. I have to defer to you or Cao Jin.

I'm not too keen to rip out a PCIe NIC under power from my local
desktop machine, but I think an actual surprise removal is not the
problem.

As described in my git log entry, the error condition in igb_rd32 can be
triggered during a suspend. The HW has been put into a sleep state but
some register read requests are apparently not guarded against that
situation. Reading a register in this state returns -1, thus a suspend
is erroneously triggering the "surprise removal" sequence.

Here's a raw idea:

- Note that device is suspended in e1000_hw struct. Don't trigger
error sequence in igb_rd32 if so (...and return a 0 value???)

- Otherwise assume it's actually a surprise removal. In theory that
should somehow trigger a device removal sequence, kind of like
calling igb_remove, no?
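
The two bullets of this raw idea could be sketched as a small decision function. This is an illustration of the proposal only, not driver code: the `suspended` flag, `struct model_hw`, `enum read_verdict` and `classify_read()` are all hypothetical names introduced for the example.

```c
#include <stdbool.h>
#include <stdint.h>

enum read_verdict {
	READ_OK,               /* value looks plausible */
	READ_WHILE_SUSPENDED,  /* all ones, but the device is known to be asleep */
	READ_DEVICE_REMOVED    /* all ones with no other explanation */
};

struct model_hw {
	bool suspended;  /* hypothetical flag set by the suspend path */
	bool removed;    /* set once the removal sequence has been started */
};

/* Classify a register read. 'value' is the register that was read and
 * 'status' is a second read of register 0 used to confirm the failure. */
static inline enum read_verdict classify_read(struct model_hw *hw,
					      uint32_t value, uint32_t status)
{
	if (value != UINT32_MAX || status != UINT32_MAX)
		return READ_OK;
	if (hw->suspended)
		return READ_WHILE_SUSPENDED;  /* do not start the error sequence */
	hw->removed = true;                   /* treat as a real surprise removal */
	return READ_DEVICE_REMOVED;
}
```

The sketch leaves open what a caller should do with `READ_WHILE_SUSPENDED`, which is the same open question as the "return a 0 value???" remark in the first bullet.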


Thanks,
Corinna


Attachments:
(No filename) (2.60 kB)
signature.asc (819.00 B)

2016-11-08 19:33:23

by Alexander Duyck

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Tue, Nov 8, 2016 at 10:37 AM, Corinna Vinschen <[email protected]> wrote:
> On Nov 8 09:16, Hisashi T Fujinaka wrote:
>> On Tue, 8 Nov 2016, Corinna Vinschen wrote:
>> > On Nov 8 15:06, Cao jin wrote:
>> > > When running as guest, under certain condition, it will oops as following.
>> > > writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
>> > > is NULL. While other register access won't oops kernel because they use
>> > > wr32/rd32 which have a defense against NULL pointer.
>> > > [...]
>> >
>> > Incidentally we're just looking for a solution to that problem too.
>> > Do three patches to fix the same problem at roughly the same time already
>> > qualify as freak accident?
>> >
>> > FTR, I attached my current patch, which I was planning to submit after
>> > some external testing.
>> >
>> > However, all three patches have one thing in common: They work around
>> > a somewhat dubious resetting of the hardware address to NULL in case
>> > reading from a register failed.
>> >
>> > That makes me wonder if setting the hardware address to NULL in
>> > rd32/igb_rd32 is really such a good idea. It's performed in a function
>> > whose return value is *never* tested for validity in the calling
>> > functions and leads to subsequent crashes since no tests for hw_addr ==
>> > NULL are performed.
>> >
>> > Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
>> > graceful way to handle the "surprise removal"?
>>
>> Answering this from my home account because, well, work is Outlook.
>>
>> "Reconsidering" would be great. In fact, revert it if you'd like. I'm
>> uncertain that the surprise removal code actually works the way I
>> thought previously and I think I took a lot of it out of my local code.
>>
>> Unfortunately I don't have any equipment that I can use to reproduce
>> surprise removal any longer so that means I wouldn't be able to test
>> anything. I have to defer to you or Cao Jin.
>
> I'm not too keen to rip out a PCIe NIC under power from my local
> desktop machine, but I think an actual surprise removal is not the
> problem.
>
> As described in my git log entry, the error condition in igb_rd32 can be
> triggered during a suspend. The HW has been put into a sleep state but
> some register read requests are apparently not guarded against that
> situation. Reading a register in this state returns -1, thus a suspend
> is erroneously triggering the "surprise removal" sequence.

The question I would have is what is reading the device when it is in
this state. The watchdog and any other functions that would read the
device should be disabled.

One possibility could be a race between a call to igb_close and the
igb_suspend function. We have seen some of those pop up recently on
ixgbe and it looks like igb has the same bug. We should probably be
using the rtnl_lock to guarantee that netif_device_detach and the call
to __igb_close are completed before igb_close could possibly be called
by the network stack.
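
The ordering that the lock is meant to guarantee can be modelled in a few lines of user-space C. This is a single-threaded sketch under stated assumptions, not the igb code: the lock is represented only by a flag, and `struct model_netdev`, `model_suspend()`, `model_close()` and `model_hw_down()` are hypothetical stand-ins for the suspend path, the close callback and the hardware teardown.

```c
#include <assert.h>
#include <stdbool.h>

struct model_netdev {
	bool lock_held;    /* stand-in for holding the RTNL lock */
	bool present;      /* cleared by the detach step */
	bool running;      /* interface is up */
	bool powered;      /* device registers can still be accessed */
	int bad_accesses;  /* hardware touched after power-down */
};

/* Hardware teardown: counts as a bug if the device is already asleep. */
static inline void model_hw_down(struct model_netdev *dev)
{
	if (!dev->powered)
		dev->bad_accesses++;
	dev->running = false;
}

/* Suspend path: detach and close under the lock, then power down. */
static inline void model_suspend(struct model_netdev *dev)
{
	assert(!dev->lock_held);
	dev->lock_held = true;
	dev->present = false;
	if (dev->running)
		model_hw_down(dev);
	dev->lock_held = false;
	dev->powered = false;
}

/* Close path: taking the lock here stands in for the network stack
 * holding it around the close callback. A detached device is left alone. */
static inline int model_close(struct model_netdev *dev)
{
	assert(!dev->lock_held);
	dev->lock_held = true;
	if (dev->present)
		model_hw_down(dev);
	dev->lock_held = false;
	return 0;
}
```

Because detach and teardown complete inside one locked section, a close that arrives afterwards sees `present == false` and never touches the powered-down device. A real fix would need an actual lock to get the same effect across threads.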

> Here's a raw idea:
>
> - Note that device is suspended in e1000_hw struct. Don't trigger
> error sequence in igb_rd32 if so (...and return a 0 value???)

The thing is that a suspended device should not be accessed at all.
If we are accessing it while it is suspended then that is a bug. If
you could throw a WARN_ON call in igb_rd32 to capture where this is
being triggered that might be useful.

> - Otherwise assume it's actually a surprise removal. In theory that
> should somehow trigger a device removal sequence, kind of like
> calling igb_remove, no?

Well a read of the MMIO region while suspended is more of a surprise
read since there shouldn't be anything going on. We need to isolate
where that read is coming from and fix it.

Thanks.

- Alex

2016-11-09 13:25:27

by Cao jin

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

Thanks, Corinna, for the info.

I tested my patch; it works for me on kernel 4.9-rc4.
The "surprise removal" may be another issue to solve. This patch is enough
to solve my issue and others', so could it be accepted first?

Cao jin

On 11/09/2016 03:33 AM, Alexander Duyck wrote:
> On Tue, Nov 8, 2016 at 10:37 AM, Corinna Vinschen <[email protected]> wrote:
>> On Nov 8 09:16, Hisashi T Fujinaka wrote:
>>> On Tue, 8 Nov 2016, Corinna Vinschen wrote:
>>>> On Nov 8 15:06, Cao jin wrote:
>>>>> When running as guest, under certain condition, it will oops as following.
>>>>> writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
>>>>> is NULL. While other register access won't oops kernel because they use
>>>>> wr32/rd32 which have a defense against NULL pointer.
>>>>> [...]
>>>>
>>>> Incidentally we're just looking for a solution to that problem too.
>>>> Do three patches to fix the same problem at roughly the same time already
>>>> qualify as freak accident?
>>>>
>>>> FTR, I attached my current patch, which I was planning to submit after
>>>> some external testing.
>>>>
>>>> However, all three patches have one thing in common: They work around
>>>> a somewhat dubious resetting of the hardware address to NULL in case
>>>> reading from a register failed.
>>>>
>>>> That makes me wonder if setting the hardware address to NULL in
>>>> rd32/igb_rd32 is really such a good idea. It's performed in a function
>>>> whose return value is *never* tested for validity in the calling
>>>> functions and leads to subsequent crashes since no tests for hw_addr ==
>>>> NULL are performed.
>>>>
>>>> Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
>>>> graceful way to handle the "surprise removal"?
>>>
>>> Answering this from my home account because, well, work is Outlook.
>>>
>>> "Reconsidering" would be great. In fact, revert it if you'd like. I'm
>>> uncertain that the surprise removal code actually works the way I
>>> thought previously and I think I took a lot of it out of my local code.
>>>
>>> Unfortunately I don't have any equipment that I can use to reproduce
>>> surprise removal any longer so that means I wouldn't be able to test
>>> anything. I have to defer to you or Cao Jin.
>>
>> I'm not too keen to rip out a PCIe NIC under power from my local
>> desktop machine, but I think an actual surprise removal is not the
>> problem.
>>
>> As described in my git log entry, the error condition in igb_rd32 can be
>> triggered during a suspend. The HW has been put into a sleep state but
>> some register read requests are apparently not guarded against that
>> situation. Reading a register in this state returns -1, thus a suspend
>> is erroneously triggering the "surprise removal" sequence.
>
> The question I would have is what is reading the device when it is in
> this state. The watchdog and any other functions that would read the
> device should be disabled.
>
> One possibility could be a race between a call to igb_close and the
> igb_suspend function. We have seen some of those pop up recently on
> ixgbe and it looks like igb has the same bug. We should probably be
> using the rtnl_lock to guarantee that netif_device_detach and the call
> to __igb_close are completed before igb_close could possibly be called
> by the network stack.
>
>> Here's a raw idea:
>>
>> - Note that device is suspended in e1000_hw struct. Don't trigger
>> error sequence in igb_rd32 if so (...and return a 0 value???)
>
> The thing is that a suspended device should not be accessed at all.
> If we are accessing it while it is suspended then that is a bug. If
> you could throw a WARN_ON call in igb_rd32 to capture where this is
> being triggered that might be useful.
>
>> - Otherwise assume it's actually a surprise removal. In theory that
>> should somehow trigger a device removal sequence, kind of like
>> calling igb_remove, no?
>
> Well a read of the MMIO region while suspended is more of a surprise
> read since there shouldn't be anything going on. We need to isolate
> where that read is coming from and fix it.
>
> Thanks.
>
> - Alex




2016-11-09 16:28:37

by Alexander Duyck

Subject: Re: [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Mon, Nov 7, 2016 at 11:06 PM, Cao jin <[email protected]> wrote:
> When running as guest, under certain condition, it will oops as following.
> writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> is NULL. While other register access won't oops kernel because they use
> wr32/rd32 which have a defense against NULL pointer.
>
> [ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
> error received: id=0101
> [ 141.225523] igb 0000:01:00.1: PCIe Bus Error:
> severity=Uncorrected (Fatal), type=Unaccessible,
> id=0101(Unregistered Agent ID)
> [ 141.299442] igb 0000:01:00.1: broadcast error_detected message
> [ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
> detached
> [ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
> detached
> [ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
> [ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
> [ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
> [ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
> [ 145.312078] igb 0000:01:00.1: broadcast resume message
> [ 145.322211] BUG: unable to handle kernel paging request at
> 0000000000003818
> [ 145.361275] IP: [<ffffffffa02fd38d>]
> igb_configure_tx_ring+0x14d/0x280 [igb]
> [ 145.400048] PGD 0
> [ 145.438007] Oops: 0002 [#1] SMP
>
> A similar issue & solution could be found at:
> http://patchwork.ozlabs.org/patch/689592/
>
> Signed-off-by: Cao jin <[email protected]>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/net/ethernet/intel/igb/igb_main.c b/drivers/net/ethernet/intel/igb/igb_main.c
> index edc9a6a..3f240ac 100644
> --- a/drivers/net/ethernet/intel/igb/igb_main.c
> +++ b/drivers/net/ethernet/intel/igb/igb_main.c
> @@ -3390,7 +3390,7 @@ void igb_configure_tx_ring(struct igb_adapter *adapter,
> tdba & 0x00000000ffffffffULL);
> wr32(E1000_TDBAH(reg_idx), tdba >> 32);
>
> - ring->tail = hw->hw_addr + E1000_TDT(reg_idx);
> + ring->tail = adapter->io_addr + E1000_TDT(reg_idx);
> wr32(E1000_TDH(reg_idx), 0);
> writel(0, ring->tail);
>
> @@ -3729,7 +3729,7 @@ void igb_configure_rx_ring(struct igb_adapter *adapter,
> ring->count * sizeof(union e1000_adv_rx_desc));
>
> /* initialize head and tail */
> - ring->tail = hw->hw_addr + E1000_RDT(reg_idx);
> + ring->tail = adapter->io_addr + E1000_RDT(reg_idx);
> wr32(E1000_RDH(reg_idx), 0);
> writel(0, ring->tail);
>

This patch looks good to me.

Acked-by: Alexander Duyck <[email protected]>

2016-11-10 09:35:56

by Corinna Vinschen

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Nov 8 11:33, Alexander Duyck wrote:
> On Tue, Nov 8, 2016 at 10:37 AM, Corinna Vinschen <[email protected]> wrote:
> > On Nov 8 09:16, Hisashi T Fujinaka wrote:
> >> On Tue, 8 Nov 2016, Corinna Vinschen wrote:
> >> > On Nov 8 15:06, Cao jin wrote:
> >> > > When running as guest, under certain condition, it will oops as following.
> >> > > writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> >> > > is NULL. While other register access won't oops kernel because they use
> >> > > wr32/rd32 which have a defense against NULL pointer.
> >> > > [...]
> >> >
> >> > Incidentally we're just looking for a solution to that problem too.
> >> > Do three patches to fix the same problem at roughly the same time already
> >> > qualify as freak accident?
> >> >
> >> > FTR, I attached my current patch, which I was planning to submit after
> >> > some external testing.
> >> >
> >> > However, all three patches have one thing in common: They work around
> >> > a somewhat dubious resetting of the hardware address to NULL in case
> >> > reading from a register failed.
> >> >
> >> > That makes me wonder if setting the hardware address to NULL in
> >> > rd32/igb_rd32 is really such a good idea. It's performed in a function
> >> > whose return value is *never* tested for validity in the calling
> >> > functions and leads to subsequent crashes since no tests for hw_addr ==
> >> > NULL are performed.
> >> >
> >> > Maybe commit 22a8b2915 should be reconsidered? Isn't there some more
> >> > graceful way to handle the "surprise removal"?
> >>
> >> Answering this from my home account because, well, work is Outlook.
> >>
> >> "Reconsidering" would be great. In fact, revert it if you'd like. I'm
> >> uncertain that the surprise removal code actually works the way I
> >> thought previously and I think I took a lot of it out of my local code.
> >>
> >> Unfortunately I don't have any equipment that I can use to reproduce
> >> surprise removal any longer so that means I wouldn't be able to test
> >> anything. I have to defer to you or Cao Jin.
> >
> > I'm not too keen to rip out a PCIe NIC under power from my local
> > desktop machine, but I think an actual surprise removal is not the
> > problem.
> >
> > As described in my git log entry, the error condition in igb_rd32 can be
> > triggered during a suspend. The HW has been put into a sleep state but
> > some register read requests are apparently not guarded against that
> > situation. Reading a register in this state returns -1, thus a suspend
> > is erroneously triggering the "surprise removal" sequence.
>
> The question I would have is what is reading the device when it is in
> this state. The watchdog and any other functions that would read the
> device should be disabled.
>
> One possibility could be a race between a call to igb_close and the
> igb_suspend function. We have seen some of those pop up recently on
> ixgbe and it looks like igb has the same bug. We should probably be
> using the rtnl_lock to guarantee that netif_device_detach and the call
> to __igb_close are completed before igb_close could possibly be called
> by the network stack.

Do you have a pointer to the related ixgbe patch, by any chance?

> > Here's a raw idea:
> >
> > - Note that device is suspended in e1000_hw struct. Don't trigger
> > error sequence in igb_rd32 if so (...and return a 0 value???)
>
> The thing is that a suspended device should not be accessed at all.
> If we are accessing it while it is suspended then that is a bug. If
> you could throw a WARN_ON call in igb_rd32 to capture where this is
> being triggered that might be useful.
>
> > - Otherwise assume it's actually a surprise removal. In theory that
> > should somehow trigger a device removal sequence, kind of like
> > calling igb_remove, no?
>
> Well a read of the MMIO region while suspended is more of a surprise
> read since there shouldn't be anything going on. We need to isolate
> where that read is coming from and fix it.

That would be ideal, but the problem couldn't be reproduced yet apart
from at a customer's customer site. It's not clear yet if we can access
the machine for further testing.


Corinna


Attachments:
(No filename) (4.10 kB)
signature.asc (819.00 B)

2016-11-10 13:48:24

by Hisashi T Fujinaka

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Thu, 10 Nov 2016, Corinna Vinschen wrote:

> On Nov 8 11:33, Alexander Duyck wrote:
...
>> The question I would have is what is reading the device when it is in
>> this state. The watchdog and any other functions that would read the
>> device should be disabled.
>>
>> One possibility could be a race between a call to igb_close and the
>> igb_suspend function. We have seen some of those pop up recently on
>> ixgbe and it looks like igb has the same bug. We should probably be
>> using the rtnl_lock to guarantee that netif_device_detach and the call
>> to __igb_close are completed before igb_close could possibly be called
>> by the network stack.
>
> Do you have a pointer to the related ixgbe patch, by any chance?
...
>> The thing is that a suspended device should not be accessed at all.
>> If we are accessing it while it is suspended then that is a bug. If
>> you could throw a WARN_ON call in igb_rd32 to capture where this is
>> being triggered that might be useful.
>>
>>> - Otherwise assume it's actually a surprise removal. In theory that
>>> should somehow trigger a device removal sequence, kind of like
>>> calling igb_remove, no?
>>
>> Well a read of the MMIO region while suspended is more of a surprise
>> read since there shouldn't be anything going on. We need to isolate
>> where that read is coming from and fix it.
>
> That would be ideal, but the problem couldn't be reproduced yet apart
> from at a customer's customer site. It's not clear yet if we can access
> the machine for further testing.

Here's the initial patch for igb I have, but it's on hold awaiting more
changes in ixgbe regarding AER.

--
Hisashi T Fujinaka - [email protected]
BSEE + BSChem + BAEnglish + MSCS + $2.50 = coffee


Attachments:
igb.patch (2.25 kB)

2016-11-10 17:28:12

by Corinna Vinschen

Subject: Re: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

On Nov 10 05:48, Hisashi T Fujinaka wrote:
> On Thu, 10 Nov 2016, Corinna Vinschen wrote:
> > On Nov 8 11:33, Alexander Duyck wrote:
> ...
> > > The question I would have is what is reading the device when it is in
> > > this state. The watchdog and any other functions that would read the
> > > device should be disabled.
> > >
> > > One possibility could be a race between a call to igb_close and the
> > > igb_suspend function. We have seen some of those pop up recently on
> > > ixgbe and it looks like igb has the same bug. We should probably be
> > > using the rtnl_lock to guarantee that netif_device_detach and the call
> > > to __igb_close are completed before igb_close could possibly be called
> > > by the network stack.
> >
> > Do you have a pointer to the related ixgbe patch, by any chance?
> ...
> Here's the initial patch for igb I have, but it's on hold awaiting more
> changes in ixgbe regarding AER.

Thanks a lot!


Corinna


Attachments:
(No filename) (951.00 B)
signature.asc (819.00 B)

2016-11-23 23:48:56

by Brown, Aaron F

Subject: RE: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr

> From: Intel-wired-lan [[email protected]] on behalf of Cao jin [[email protected]]
> Sent: Monday, November 07, 2016 11:06 PM
> To: [email protected]; [email protected]
> Cc: [email protected]; [email protected]
> Subject: [Intel-wired-lan] [PATCH] igb: use igb_adapter->io_addr instead of e1000_hw->hw_addr
>
> When running as guest, under certain condition, it will oops as following.
> writel() in igb_configure_tx_ring() results in oops, because hw->hw_addr
> is NULL. While other register access won't oops kernel because they use
> wr32/rd32 which have a defense against NULL pointer.
>
> [ 141.225449] pcieport 0000:00:1c.0: AER: Multiple Uncorrected (Fatal)
> error received: id=0101
> [ 141.225523] igb 0000:01:00.1: PCIe Bus Error:
> severity=Uncorrected (Fatal), type=Unaccessible,
> id=0101(Unregistered Agent ID)
> [ 141.299442] igb 0000:01:00.1: broadcast error_detected message
> [ 141.300539] igb 0000:01:00.0 enp1s0f0: PCIe link lost, device now
> detached
> [ 141.351019] igb 0000:01:00.1 enp1s0f1: PCIe link lost, device now
> detached
> [ 143.465904] pcieport 0000:00:1c.0: Root Port link has been reset
> [ 143.465994] igb 0000:01:00.1: broadcast slot_reset message
> [ 143.466039] igb 0000:01:00.0: enabling device (0000 -> 0002)
> [ 144.389078] igb 0000:01:00.1: enabling device (0000 -> 0002)
> [ 145.312078] igb 0000:01:00.1: broadcast resume message
> [ 145.322211] BUG: unable to handle kernel paging request at
> 0000000000003818
> [ 145.361275] IP: [<ffffffffa02fd38d>]
> igb_configure_tx_ring+0x14d/0x280 [igb]
> [ 145.400048] PGD 0
> [ 145.438007] Oops: 0002 [#1] SMP
>
> A similar issue & solution could be found at:
> http://patchwork.ozlabs.org/patch/689592/
>
> Signed-off-by: Cao jin <[email protected]>
> ---
> drivers/net/ethernet/intel/igb/igb_main.c | 4 ++--
> 1 file changed, 2 insertions(+), 2 deletions(-)

Tested-by: Aaron Brown <[email protected]>