2008-07-22 15:04:49

by Lukas Hejtmanek

Subject: Regression: myri10ge driver in 2.6.26

Hello,

the myri10ge driver in the vanilla 2.6.26 kernel does not work.

Loading the driver produces only:
[ 86.641632] myri10ge: Version 1.3.99-1.347
[ 86.641632] PCI: Setting latency timer of device 0000:07:00.0 to 64

and nothing more.

The driver from Myricom web page version 1.4.2 works fine under the same
kernel:
[ 664.668857] myri10ge: Version 1.4.2
[ 664.668857] PCI: Setting latency timer of device 0000:07:00.0 to 64
[ 664.675996] firmware: requesting myri10ge_eth_z8e.dat
[ 664.704897] myri10ge 0000:07:00.0: Unable to load myri10ge_eth_z8e.dat firmware image via hotplug
[ 664.704897] myri10ge 0000:07:00.0: hotplug firmware loading failed
[ 664.704897] myri10ge 0000:07:00.0: Successfully adopted running firmware
[ 664.704897] myri10ge 0000:07:00.0: Using firmware currently running on NIC. For optimal
[ 664.704897] myri10ge 0000:07:00.0: performance consider loading optimized firmware
[ 664.704897] myri10ge 0000:07:00.0: via hotplug
[ 664.754309] firmware: requesting adopted
[ 664.754896] myri10ge 0000:07:00.0: Unable to load adopted firmware image via hotplug
[ 664.754896] myri10ge 0000:07:00.0: hotplug firmware loading failed
[ 664.754896] myri10ge 0000:07:00.0: Successfully adopted running firmware
[ 664.828230] myri10ge 0000:07:00.0: MSI IRQ 499, tx bndry 2048, fw adopted, WC Enabled


--
Lukáš Hejtmánek


2008-07-22 21:29:42

by Willy Tarreau

Subject: Re: Regression: myri10ge driver in 2.6.26

[ linux-net and brice CCed ]

There have not been many changes since 2.6.25, but some seem to affect the
firmware part. What was your last known working version? Also, you appear
not to be using the firmware with 1.4.2; is there a specific reason?

Willy

On Tue, Jul 22, 2008 at 05:04:38PM +0200, Lukas Hejtmanek wrote:
> Hello,
>
> the myri10ge driver in the vanilla 2.6.26 kernel does not work.
>
> Loading the driver produces only:
> [ 86.641632] myri10ge: Version 1.3.99-1.347
> [ 86.641632] PCI: Setting latency timer of device 0000:07:00.0 to 64
>
> and nothing more.
>
> The driver from Myricom web page version 1.4.2 works fine under the same
> kernel:
> [ 664.668857] myri10ge: Version 1.4.2
> [ 664.668857] PCI: Setting latency timer of device 0000:07:00.0 to 64
> [ 664.675996] firmware: requesting myri10ge_eth_z8e.dat
> [ 664.704897] myri10ge 0000:07:00.0: Unable to load myri10ge_eth_z8e.dat firmware image via hotplug
> [ 664.704897] myri10ge 0000:07:00.0: hotplug firmware loading failed
> [ 664.704897] myri10ge 0000:07:00.0: Successfully adopted running firmware
> [ 664.704897] myri10ge 0000:07:00.0: Using firmware currently running on NIC. For optimal
> [ 664.704897] myri10ge 0000:07:00.0: performance consider loading optimized firmware
> [ 664.704897] myri10ge 0000:07:00.0: via hotplug
> [ 664.754309] firmware: requesting adopted
> [ 664.754896] myri10ge 0000:07:00.0: Unable to load adopted firmware image via hotplug
> [ 664.754896] myri10ge 0000:07:00.0: hotplug firmware loading failed
> [ 664.754896] myri10ge 0000:07:00.0: Successfully adopted running firmware
> [ 664.828230] myri10ge 0000:07:00.0: MSI IRQ 499, tx bndry 2048, fw adopted, WC Enabled
>
>
> --
> Lukáš Hejtmánek

2008-07-22 22:11:53

by Brice Goglin

Subject: Re: Regression: myri10ge driver in 2.6.26

Lukas Hejtmanek wrote:
> Hello,
>
> the myri10ge driver in the vanilla 2.6.26 kernel does not work.
>
> Loading the driver produces only:
> [ 86.641632] myri10ge: Version 1.3.99-1.347
> [ 86.641632] PCI: Setting latency timer of device 0000:07:00.0 to 64
>
> and nothing more.
>

I have seen one machine with a similar problem. Does reverting
014377a1df693ff30a9e8b69f0bbb0a38e601f75 help? Also could you check that
current Linus' git works fine? (just get the driver source from there,
no need to build the whole tree)

This commit went into 2.6.26 without the final multislice support
enabling, but it was supposed to help anyway...

Brice

2008-07-22 22:52:31

by Brice Goglin

Subject: Re: Regression: myri10ge driver in 2.6.26

Brice Goglin wrote:
> Lukas Hejtmanek wrote:
>
>> Hello,
>>
>> the myri10ge driver in the vanilla 2.6.26 kernel does not work.
>>
>> Loading the driver produces only:
>> [ 86.641632] myri10ge: Version 1.3.99-1.347
>> [ 86.641632] PCI: Setting latency timer of device 0000:07:00.0 to 64
>>
>> and nothing more.
>>
>
> I have seen one machine with a similar problem. Does reverting
> 014377a1df693ff30a9e8b69f0bbb0a38e601f75 help? Also could you check that
> current Linus' git works fine? (just get the driver source from there,
> no need to build the whole tree)
>

If the above is correct, then the patch below should fix it. If so, I'll
push this into the next stable 2.6.26.x release.

Brice

--- linux.old/drivers/net/myri10ge/myri10ge.c 2008-07-23 00:38:52.000000000 +0200
+++ linux/drivers/net/myri10ge/myri10ge.c 2008-07-23 00:43:16.000000000 +0200
@@ -3213,26 +3213,26 @@
 	for (i = 0; i < ETH_ALEN; i++)
 		netdev->dev_addr[i] = mgp->mac_addr[i];

-	/* allocate rx done ring */
-	bytes = mgp->max_intr_slots * sizeof(*mgp->ss.rx_done.entry);
-	mgp->ss.rx_done.entry = dma_alloc_coherent(&pdev->dev, bytes,
-						   &mgp->ss.rx_done.bus, GFP_KERNEL);
-	if (mgp->ss.rx_done.entry == NULL)
-		goto abort_with_ioremap;
-	memset(mgp->ss.rx_done.entry, 0, bytes);
-
 	myri10ge_select_firmware(mgp);

 	status = myri10ge_load_firmware(mgp);
 	if (status != 0) {
 		dev_err(&pdev->dev, "failed to load firmware\n");
-		goto abort_with_rx_done;
+		goto abort_with_ioremap;
 	}

+	/* allocate rx done ring */
+	bytes = mgp->max_intr_slots * sizeof(*mgp->ss.rx_done.entry);
+	mgp->ss.rx_done.entry = dma_alloc_coherent(&pdev->dev, bytes,
+						   &mgp->ss.rx_done.bus, GFP_KERNEL);
+	if (mgp->ss.rx_done.entry == NULL)
+		goto abort_with_firmware;
+	memset(mgp->ss.rx_done.entry, 0, bytes);
+
 	status = myri10ge_reset(mgp);
 	if (status != 0) {
 		dev_err(&pdev->dev, "failed reset\n");
-		goto abort_with_firmware;
+		goto abort_with_rx_done;
 	}

 	pci_set_drvdata(pdev, mgp);
@@ -3258,7 +3258,7 @@
 	 * is set to correct value if MSI is enabled */
 	status = myri10ge_request_irq(mgp);
 	if (status != 0)
-		goto abort_with_firmware;
+		goto abort_with_rx_done;
 	netdev->irq = pdev->irq;
 	myri10ge_free_irq(mgp);

@@ -3287,14 +3287,14 @@
 abort_with_state:
 	pci_restore_state(pdev);

-abort_with_firmware:
-	myri10ge_dummy_rdma(mgp, 0);
-
 abort_with_rx_done:
 	bytes = mgp->max_intr_slots * sizeof(*mgp->ss.rx_done.entry);
 	dma_free_coherent(&pdev->dev, bytes,
 			  mgp->ss.rx_done.entry, mgp->ss.rx_done.bus);

+abort_with_firmware:
+	myri10ge_dummy_rdma(mgp, 0);
+
 abort_with_ioremap:
 	iounmap(mgp->sram);

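[Editor's note] The patch above moves the rx done ring allocation after the firmware load and reorders the abort_with_* labels to match, so that cleanup still runs in the reverse order of setup. A minimal user-space sketch of that goto-unwind pattern follows. All demo_* names are hypothetical and not part of the driver; the idea that the ring size is only known after the firmware step is an assumption made for illustration, since the thread does not state why the order matters.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the driver's private state (not the real struct). */
struct demo_dev {
	int firmware_loaded;	/* set by step 1 */
	size_t ring_slots;	/* assumed to be known only after step 1 */
	void *ring;		/* allocated by step 2 */
};

/* Step 1: pretend to load firmware; fails when 'fail' is non-zero. */
static int demo_load_firmware(struct demo_dev *d, int fail)
{
	if (fail)
		return -1;
	d->firmware_loaded = 1;
	d->ring_slots = 64;	/* arbitrary illustrative value */
	return 0;
}

/* Undo step 1. */
static void demo_unload_firmware(struct demo_dev *d)
{
	d->firmware_loaded = 0;
	d->ring_slots = 0;
}

/*
 * fail_step: 0 = succeed, 1 = firmware load fails,
 *            2 = ring allocation fails, 3 = final reset fails.
 * Each failure jumps to the label that undoes only the steps
 * already completed; the labels appear in reverse order of setup.
 */
static int demo_probe(struct demo_dev *d, int fail_step)
{
	assert(d != NULL);

	if (demo_load_firmware(d, fail_step == 1) != 0)
		goto abort_with_nothing;

	d->ring = (fail_step == 2) ? NULL : calloc(d->ring_slots, sizeof(long));
	if (d->ring == NULL)
		goto abort_with_firmware;

	if (fail_step == 3)
		goto abort_with_ring;

	return 0;

abort_with_ring:
	free(d->ring);
	d->ring = NULL;
abort_with_firmware:
	demo_unload_firmware(d);
abort_with_nothing:
	return -1;
}
```

Calling demo_probe() with fail_step 3 frees the ring and then undoes the firmware step, while fail_step 2 skips the free because the ring was never allocated. Swapping two setup steps without swapping their labels would break exactly this property.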

2008-07-23 08:07:13

by Lukas Hejtmanek

Subject: Re: Regression: myri10ge driver in 2.6.26

On Wed, Jul 23, 2008 at 12:52:16AM +0200, Brice Goglin wrote:
> > I have seen one machine with a similar problem. Does reverting
> > 014377a1df693ff30a9e8b69f0bbb0a38e601f75 help? Also could you check that
> > current Linus' git works fine? (just get the driver source from there,
> > no need to build the whole tree)
> >
>
> If the above is correct, then patch below should fix it. If so, I'll
> push this into the next stable 2.6.26.x release.

I did not try reverting the mentioned commit, but your attached patch made
the kernel driver work. So please push it into the next stable release.

--
Lukáš Hejtmánek

2008-07-23 08:18:15

by Brice Goglin

Subject: Re: Regression: myri10ge driver in 2.6.26

Lukas Hejtmanek wrote:
> On Wed, Jul 23, 2008 at 12:52:16AM +0200, Brice Goglin wrote:
>
>>> I have seen one machine with a similar problem. Does reverting
>>> 014377a1df693ff30a9e8b69f0bbb0a38e601f75 help? Also could you check that
>>> current Linus' git works fine? (just get the driver source from there,
>>> no need to build the whole tree)
>>>
>>>
>> If the above is correct, then patch below should fix it. If so, I'll
>> push this into the next stable 2.6.26.x release.
>>
>
> did not try to revert the mentioned patch but your attached patch made the
> kernel driver working. So, please push it into the next stable.
>


Thanks for testing, I just sent the patch to stable@

Brice

2008-07-23 12:54:39

by Lukas Hejtmanek

Subject: Re: Regression: myri10ge driver in 2.6.26

Hello,

On Wed, Jul 23, 2008 at 10:17:54AM +0200, Brice Goglin wrote:
> Thanks for testing, I just sent the patch to stable@

it seems I have another problem with the driver from 2.6.26:

modprobe myri10ge; ifconfig eth2 ip netmask mask up
produces:
[ 232.446645] myri10ge: Version 1.3.99-1.347
[ 232.446645] ACPI: PCI Interrupt 0000:07:00.0[A] -> GSI 18 (level, low) -> IRQ 18
[ 232.446645] PCI: Setting latency timer of device 0000:07:00.0 to 64
[ 232.446645] firmware: requesting myri10ge_eth_z8e.dat
[ 232.473332] myri10ge 0000:07:00.0: Unable to load myri10ge_eth_z8e.dat firmware image via hotplug
[ 232.473332] myri10ge 0000:07:00.0: hotplug firmware loading failed
[ 232.479988] myri10ge 0000:07:00.0: Successfully adopted running firmware
[ 232.479988] myri10ge 0000:07:00.0: Using firmware currently running on NIC. For optimal
[ 232.479988] myri10ge 0000:07:00.0: performance consider loading optimized firmware
[ 232.479988] myri10ge 0000:07:00.0: via hotplug
[ 232.532326] firmware: requesting adopted
[ 232.533331] myri10ge 0000:07:00.0: Unable to load adopted firmware image via hotplug
[ 232.533331] myri10ge 0000:07:00.0: hotplug firmware loading failed
[ 232.533331] myri10ge 0000:07:00.0: Successfully adopted running firmware
[ 232.640929] myri10ge 0000:07:00.0: MSI IRQ 499, tx bndry 2048, fw adopted, WC Enabled
[ 266.273295] myri10ge: Version 1.3.99-1.347
[ 266.273295] PCI: Setting latency timer of device 0000:07:00.0 to 64
[ 266.273295] firmware: requesting myri10ge_eth_z8e.dat
[ 266.406642] firmware: requesting myri10ge_eth_z8e.dat
[ 266.590610] myri10ge 0000:07:00.0: MSI IRQ 499, tx bndry 4096, fw myri10ge_eth_z8e.dat, WC Enabled
[ 271.500645] BUG: unable to handle kernel NULL pointer dereference at 0000000000000598
[ 271.500826] IP: [<ffffffffa0093308>] :myri10ge:myri10ge_open+0x1c8/0x880
[ 271.500957] PGD 43a88a067 PUD 43c636067 PMD 0
[ 271.501152] Oops: 0000 [1] SMP
[ 271.501301] CPU 1
[ 271.501405] Modules linked in: myri10ge ipv6 ac raid0 evdev usbhid loop serio_raw ehci_hcd i2c_i801 psmouse uhci_hcd i2c_core [last unloaded: myri10ge]
[ 271.502150] Pid: 2099, comm: ifconfig Not tainted 2.6.26 #1
[ 271.502217] RIP: 0010:[<ffffffffa0093308>] [<ffffffffa0093308>] :myri10ge:myri10ge_open+0x1c8/0x880
[ 271.502349] RSP: 0018:ffff81043a859d68 EFLAGS: 00010246
[ 271.502416] RAX: 0000000000000000 RBX: ffff81043a859d78 RCX: 0000000000000000
[ 271.502486] RDX: ffff81043a859d78 RSI: 000000000000000b RDI: 0000000000000000
[ 271.502556] RBP: ffff81043bb826c0 R08: ffff81043a858000 R09: ffffffff80730280
[ 271.502626] R10: ffff810080914000 R11: 0000000000000000 R12: 0000000000001043
[ 271.502696] R13: 0000000000001000 R14: ffff81043bb82000 R15: ffff81043bbedc10
[ 271.502766] FS: 00007f752d3686d0(0000) GS:ffff81043e42e480(0000) knlGS:0000000000000000
[ 271.502851] CS: 0010 DS: 0000 ES: 0000 CR0: 000000008005003b
[ 271.502919] CR2: 0000000000000598 CR3: 000000043b87a000 CR4: 00000000000006e0
[ 271.502988] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[ 271.503058] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
[ 271.503128] Process ifconfig (pid: 2099, threadinfo ffff81043a858000, task ffff81043c5ebff0)
[ 271.503214] Stack: 0000000000000001 0000000000000000 0000000000000296 ffffffff803b2fc1
[ 271.503516] ffffffff000a0000 0000000000000000 0000000000000296 ffff81043bb82000
[ 271.503597] 0000000000000001 0000000000001043 0000000000001002 0000000000000000
[ 271.503597] Call Trace:
[ 271.503597] [<ffffffff803b2fc1>] ? __up_read+0x21/0xb0
[ 271.503597] [<ffffffff804cea79>] ? dev_open+0x59/0xb0
[ 271.503597] [<ffffffff804cd36f>] ? dev_change_flags+0x9f/0x1d0
[ 271.503597] [<ffffffff80510d74>] ? devinet_ioctl+0x594/0x730
[ 271.503597] [<ffffffff804bf4e4>] ? sock_ioctl+0x54/0x240
[ 271.503597] [<ffffffff802af57f>] ? vfs_ioctl+0x2f/0xa0
[ 271.503597] [<ffffffff802af664>] ? do_vfs_ioctl+0x74/0x2d0
[ 271.503597] [<ffffffff802af951>] ? sys_ioctl+0x91/0xb0
[ 271.503597] [<ffffffff8020c15b>] ? system_call_after_swapgs+0x7b/0x80
[ 271.503597]
[ 271.503597]
[ 271.503597] Code: 42 ff 48 85 d0 75 ec 8d 41 14 89 85 90 05 00 00 48 8b 85 f0 04 00 00 48 8d 5c 24 10 31 c9 be 0b 00 00 00 48 89 da 48 89 44 24 08 <48> 8b 80 98 05 00 00 48 8b 7c 24 08 48 89 04 24 e8 f3 ef ff ff
[ 271.503597] RIP [<ffffffffa0093308>] :myri10ge:myri10ge_open+0x1c8/0x880
[ 271.503597] RSP <ffff81043a859d68>
[ 271.503597] CR2: 0000000000000598
[ 271.507626] ---[ end trace d24e8d9a8e3ce862 ]---


--
Lukáš Hejtmánek

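[Editor's note] In the oops above, CR2 (the faulting address) is 0x598 and RAX is 0, and the marked bytes in the Code: line, 48 8b 80 98 05 00 00, decode to a load from 0x598(%rax). That is the usual signature of reading a structure member through a NULL pointer: the fault address equals the member's offset. The sketch below reproduces only that arithmetic. struct demo_priv and its padding are hypothetical and chosen so that the member lands at 0x598; the oops alone does not say which structure or member was involved.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical layout: the padding is chosen so that 'field' sits at
 * offset 0x598. This is not the real myri10ge structure.
 */
struct demo_priv {
	unsigned char pad[0x598];
	uint64_t field;
};

/*
 * Address the CPU would try to read for base->field, computed
 * without ever dereferencing the pointer.
 */
static uintptr_t demo_fault_address(uintptr_t base)
{
	return base + offsetof(struct demo_priv, field);
}
```

With a base of 0, demo_fault_address() yields 0x598, matching the CR2 value in the trace.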
2008-07-23 20:17:57

by Brice Goglin

Subject: Re: Regression: myri10ge driver in 2.6.26

Lukas Hejtmanek wrote:
> Hello,
>
> On Wed, Jul 23, 2008 at 10:17:54AM +0200, Brice Goglin wrote:
>
>> Thanks for testing, I just sent the patch to stable@
>>
>
> it seems, I got another problem with the driver from 2.6.26:
>

Does this help ?

--- linux.old/drivers/net/myri10ge/myri10ge.c 2008-07-23 17:13:20.000000000 +0200
+++ linux/drivers/net/myri10ge/myri10ge.c 2008-07-23 22:07:58.000000000 +0200
@@ -3126,6 +3126,8 @@ static int myri10ge_probe(struct pci_dev

 	mgp = netdev_priv(netdev);
 	mgp->dev = netdev;
+	mgp->ss.mgp = mgp;
+	mgp->ss.dev = mgp->dev;
 	netif_napi_add(netdev, &mgp->ss.napi, myri10ge_poll, myri10ge_napi_weight);
 	mgp->pdev = pdev;
 	mgp->csum_flag = MXGEFW_FLAGS_CKSUM;

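[Editor's note] The two added lines set back-pointers from the embedded ss state to its owning structure at probe time. A minimal sketch of why an unset back-pointer is dangerous follows. The demo_* types and functions are hypothetical, and the explicit NULL check exists only so the failure can be observed in user space; it is not what the driver does.

```c
#include <assert.h>
#include <stddef.h>

struct demo_priv;

/* Hypothetical embedded state that keeps a pointer back to its owner. */
struct demo_slice {
	struct demo_priv *owner;	/* back-pointer; NULL until initialised */
};

struct demo_priv {
	int mtu;
	struct demo_slice ss;		/* embedded, like mgp->ss in the patch */
};

/* Same idea as the two added lines: wire the back-pointer at probe time. */
static void demo_probe_init(struct demo_priv *p)
{
	assert(p != NULL);
	p->ss.owner = p;
}

/*
 * Code that only receives the slice reaches the owner through the
 * back-pointer. Returns -1 where kernel code without a check would
 * have dereferenced NULL.
 */
static int demo_slice_mtu(const struct demo_slice *ss)
{
	if (ss->owner == NULL)
		return -1;
	return ss->owner->mtu;
}
```

Before demo_probe_init() runs, demo_slice_mtu() reports the unset pointer; afterwards it returns the owner's value.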
2008-07-24 13:43:33

by Lukas Hejtmanek

Subject: Re: Regression: myri10ge driver in 2.6.26

On Wed, Jul 23, 2008 at 10:17:31PM +0200, Brice Goglin wrote:
> Does this help ?

it looks good now.

--
Lukáš Hejtmánek

2008-07-24 17:12:49

by Brice Goglin

Subject: Re: Regression: myri10ge driver in 2.6.26

Lukas Hejtmanek wrote:
> On Wed, Jul 23, 2008 at 10:17:31PM +0200, Brice Goglin wrote:
>
>> Does this help ?
>>
>
> it looks good now.
>


Ok, sent to stable@ as well. Sorry for the mess.

Brice