2013-05-29 19:00:09

by Pravin Shelar

Subject: Re: Network issue on 3.10 rcs, bisected

On Wed, May 29, 2013 at 4:37 AM, Joao Correia
<[email protected]> wrote:
> Hello list
>
> While trying the rc's for 3.10, i've stumbled upon a problem where
> networking does not work at all. Iptables will show packet counts going up,
> but nothing actually reaches the programs.
>
> I'm running fedora under hyper-v 3 (a windows 2012 host). Only tested ipv4
> traffic, and everything times out (ping, telnet to open ports) on both
> directions. The networking devices come up apparently ok - has static ip
> set, and dmesg shows no errors (although i don't have many debugging options
> enabled).
>
> I bisected this, and git blames commit
> ec5f061564238892005257c83565a0b58ec79295 (net: Kill link between CSUM and SG
> features.). I can't revert it cleanly on current rc's.
>
Can you also send network features set on the device?
ethtool -k <dev>

Thanks,
Pravin.
> Please find the attached config used, as well as lshw output. Logs show no
> errors.
>
> I'm willing and able to test fixes.
>
> Thank you for your time,
> Joao Correia
> CIUBI
> Universidade da Beira Interior
> Portugal
>
>
>


2013-05-29 19:54:15

by Joao Correia

Subject: Re: Network issue on 3.10 rcs, bisected

On Wed, May 29, 2013 at 7:59 PM, Pravin Shelar <[email protected]> wrote:
>
> On Wed, May 29, 2013 at 4:37 AM, Joao Correia
> <[email protected]> wrote:
> > Hello list
> >
> > While trying the rc's for 3.10, i've stumbled upon a problem where
> > networking does not work at all. Iptables will show packet counts going up,
> > but nothing actually reaches the programs.
> >
> > I'm running fedora under hyper-v 3 (a windows 2012 host). Only tested ipv4
> > traffic, and everything times out (ping, telnet to open ports) on both
> > directions. The networking devices come up apparently ok - has static ip
> > set, and dmesg shows no errors (although i don't have many debugging options
> > enabled).
> >
> > I bisected this, and git blames commit
> > ec5f061564238892005257c83565a0b58ec79295 (net: Kill link between CSUM and SG
> > features.). I can't revert it cleanly on current rc's.
> >
> Can you also send network features set on the device?
> ethtool -k <dev>


As requested:
Features for eth0:
rx-checksumming: off [fixed]
tx-checksumming: off
tx-checksum-ipv4: off [fixed]
tx-checksum-ip-generic: off [fixed]
tx-checksum-ipv6: off [fixed]
tx-checksum-fcoe-crc: off [fixed]
tx-checksum-sctp: off [fixed]
scatter-gather: on
tx-scatter-gather: on
tx-scatter-gather-fraglist: off [fixed]
tcp-segmentation-offload: off
tx-tcp-segmentation: off [fixed]
tx-tcp-ecn-segmentation: off [fixed]
tx-tcp6-segmentation: off [fixed]
udp-fragmentation-offload: off [fixed]
generic-segmentation-offload: on
generic-receive-offload: on
large-receive-offload: off [fixed]
rx-vlan-offload: off
tx-vlan-offload: on
ntuple-filters: off [fixed]
receive-hashing: off [fixed]
highdma: off [fixed]
tx-vlan-ctag-hw-insert: on [fixed]
rx-vlan-ctag-hw-parse: off [fixed]
rx-vlan-ctag-filter: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
vlan-challenged: off [fixed]
tx-lockless: off [fixed]
netns-local: off [fixed]
tx-gso-robust: off [fixed]
tx-fcoe-segmentation: off [fixed]
tx-gre-segmentation: off [fixed]
tx-udp_tnl-segmentation: off [fixed]
fcoe-mtu: off [fixed]
tx-nocache-copy: off
loopback: off [fixed]
rx-fcs: off [fixed]
rx-all: off [fixed]

The output is similar on a working (3.9) and a bad (3.10) kernel.
diff-ing both outputs shows:
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
as last on the broken kernels.

Thank you for your time,
Joao Correia
CIUBI
Universidade da Beira Interior
Portugal
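
The comparison described above (saving `ethtool -k` output on the 3.9 and 3.10 kernels and diffing the two) can be sketched in Python. This is only an illustration: the function names are hypothetical, and both sample strings are abbreviated excerpts. The 3.10 excerpt is taken from the listing posted above; the 3.9 excerpt is an assumption, reconstructed from the statement that only the three stag lines differ.

```python
def parse_features(text):
    """Parse `ethtool -k` style output into {feature: (state, is_fixed)}."""
    features = {}
    for line in text.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Skip lines such as "Features for eth0:" that carry no on/off value.
        if not sep or not words or words[0] not in ("on", "off"):
            continue
        features[name.strip()] = (words[0], "[fixed]" in words)
    return features


def diff_features(old, new):
    """Return (added, removed, changed) feature names, each sorted."""
    added = sorted(set(new) - set(old))
    removed = sorted(set(old) - set(new))
    changed = sorted(k for k in set(old) & set(new) if old[k] != new[k])
    return added, removed, changed


# Abbreviated excerpt; assumed 3.9 output (not posted in the thread).
SAMPLE_3_9 = """Features for eth0:
tx-checksumming: off
scatter-gather: on
rx-vlan-ctag-filter: off [fixed]
vlan-challenged: off [fixed]
"""

# Abbreviated excerpt of the 3.10 output posted in the thread.
SAMPLE_3_10 = """Features for eth0:
tx-checksumming: off
scatter-gather: on
rx-vlan-ctag-filter: off [fixed]
tx-vlan-stag-hw-insert: off [fixed]
rx-vlan-stag-hw-parse: off [fixed]
rx-vlan-stag-filter: off [fixed]
vlan-challenged: off [fixed]
"""

if __name__ == "__main__":
    added, removed, changed = diff_features(
        parse_features(SAMPLE_3_9), parse_features(SAMPLE_3_10)
    )
    print("only on new kernel:", added)
    print("only on old kernel:", removed)
    print("state changed:", changed)
```

On real systems the two inputs would come from files saved with `ethtool -k eth0 > file` on each kernel. In this sample, no feature that exists on both kernels changes state, which is consistent with the report that the listed features alone do not explain the breakage.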

2013-05-30 16:48:30

by Pravin Shelar

Subject: Re: Network issue on 3.10 rcs, bisected

On Wed, May 29, 2013 at 12:53 PM, Joao Correia
<[email protected]> wrote:
> On Wed, May 29, 2013 at 7:59 PM, Pravin Shelar <[email protected]> wrote:
>>
>> On Wed, May 29, 2013 at 4:37 AM, Joao Correia
>> <[email protected]> wrote:
>> > Hello list
>> >
>> > While trying the rc's for 3.10, i've stumbled upon a problem where
>> > networking does not work at all. Iptables will show packet counts going up,
>> > but nothing actually reaches the programs.
>> >
>> > I'm running fedora under hyper-v 3 (a windows 2012 host). Only tested ipv4
>> > traffic, and everything times out (ping, telnet to open ports) on both
>> > directions. The networking devices come up apparently ok - has static ip
>> > set, and dmesg shows no errors (although i don't have many debugging options
>> > enabled).
>> >
>> > I bisected this, and git blames commit
>> > ec5f061564238892005257c83565a0b58ec79295 (net: Kill link between CSUM and SG
>> > features.). I can't revert it cleanly on current rc's.
>> >
>> Can you also send network features set on the device?
>> ethtool -k <dev>
>
>
> As requested:
> Features for eth0:
> rx-checksumming: off [fixed]
> tx-checksumming: off
> tx-checksum-ipv4: off [fixed]
> tx-checksum-ip-generic: off [fixed]
> tx-checksum-ipv6: off [fixed]
> tx-checksum-fcoe-crc: off [fixed]
> tx-checksum-sctp: off [fixed]
> scatter-gather: on
> tx-scatter-gather: on
> tx-scatter-gather-fraglist: off [fixed]
> tcp-segmentation-offload: off
> tx-tcp-segmentation: off [fixed]
> tx-tcp-ecn-segmentation: off [fixed]
> tx-tcp6-segmentation: off [fixed]
> udp-fragmentation-offload: off [fixed]
> generic-segmentation-offload: on
> generic-receive-offload: on
> large-receive-offload: off [fixed]
> rx-vlan-offload: off
> tx-vlan-offload: on
> ntuple-filters: off [fixed]
> receive-hashing: off [fixed]
> highdma: off [fixed]
> tx-vlan-ctag-hw-insert: on [fixed]
> rx-vlan-ctag-hw-parse: off [fixed]
> rx-vlan-ctag-filter: off [fixed]
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> vlan-challenged: off [fixed]
> tx-lockless: off [fixed]
> netns-local: off [fixed]
> tx-gso-robust: off [fixed]
> tx-fcoe-segmentation: off [fixed]
> tx-gre-segmentation: off [fixed]
> tx-udp_tnl-segmentation: off [fixed]
> fcoe-mtu: off [fixed]
> tx-nocache-copy: off
> loopback: off [fixed]
> rx-fcs: off [fixed]
> rx-all: off [fixed]
>
> The output is similar on a working (3.9) and a bad (3.10) kernel.
> diff-ing both outputs shows:
> tx-vlan-stag-hw-insert: off [fixed]
> rx-vlan-stag-hw-parse: off [fixed]
> rx-vlan-stag-filter: off [fixed]
> as last on the broken kernels.
>

I could not reproduce it, I will try it on VM. Meanwhile can you turn
off feature "sg" and try same test ?

Thanks.

> Thank you for your time,
> Joao Correia
> CIUBI
> Universidade da Beira Interior
> Portugal
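
The combination that makes "sg" a suspect here is visible in the posted listing: scatter-gather is on while tx-checksumming is off. Judging by the title of the bisected commit, earlier kernels tied SG to checksum offload, so this pairing would presumably not have been left enabled on 3.9. A minimal Python sketch that flags the pairing and builds the command for the suggested test follows. The function names are hypothetical, and the thread does not show the exact command that was run; `ethtool -K <dev> sg off` is the usual form and needs root.

```python
def sg_without_tx_csum(ethtool_k_output):
    """Return True if scatter-gather is on while tx-checksumming is off."""
    state = {}
    for line in ethtool_k_output.splitlines():
        name, sep, value = line.partition(":")
        words = value.split()
        # Keep only "feature: on|off ..." lines; headers are ignored.
        if sep and words and words[0] in ("on", "off"):
            state[name.strip()] = words[0]
    return (
        state.get("scatter-gather") == "on"
        and state.get("tx-checksumming") == "off"
    )


def sg_off_command(dev):
    """Build (but do not run) the argv for turning scatter-gather off."""
    return ["ethtool", "-K", dev, "sg", "off"]


# Abbreviated excerpt of the listing posted in the thread.
POSTED = """Features for eth0:
rx-checksumming: off [fixed]
tx-checksumming: off
scatter-gather: on
tx-scatter-gather: on
"""

if __name__ == "__main__":
    if sg_without_tx_csum(POSTED):
        # Printing rather than executing: changing offloads requires root.
        print("suspicious combination; try:", " ".join(sg_off_command("eth0")))
```

The command is returned as a list so it can be handed to `subprocess.run` unchanged by anyone who wants to automate the test.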

2013-05-30 21:15:05

by Joao Correia

Subject: Re: Network issue on 3.10 rcs, bisected

On Thu, May 30, 2013 at 5:48 PM, Pravin Shelar <[email protected]> wrote:
> On Wed, May 29, 2013 at 12:53 PM, Joao Correia
> <[email protected]> wrote:
>> On Wed, May 29, 2013 at 7:59 PM, Pravin Shelar <[email protected]> wrote:
>>>
>>> On Wed, May 29, 2013 at 4:37 AM, Joao Correia
>>> <[email protected]> wrote:
>>> > Hello list
>>> >
>>> > While trying the rc's for 3.10, i've stumbled upon a problem where
>>> > networking does not work at all. Iptables will show packet counts going up,
>>> > but nothing actually reaches the programs.
>>> >
>>> > I'm running fedora under hyper-v 3 (a windows 2012 host). Only tested ipv4
>>> > traffic, and everything times out (ping, telnet to open ports) on both
>>> > directions. The networking devices come up apparently ok - has static ip
>>> > set, and dmesg shows no errors (although i don't have many debugging options
>>> > enabled).
>>> >
>>> > I bisected this, and git blames commit
>>> > ec5f061564238892005257c83565a0b58ec79295 (net: Kill link between CSUM and SG
>>> > features.). I can't revert it cleanly on current rc's.
>>> >
>>> Can you also send network features set on the device?
>>> ethtool -k <dev>
>>
>>
>> As requested:
>> Features for eth0:
>> rx-checksumming: off [fixed]
>> tx-checksumming: off
>> tx-checksum-ipv4: off [fixed]
>> tx-checksum-ip-generic: off [fixed]
>> tx-checksum-ipv6: off [fixed]
>> tx-checksum-fcoe-crc: off [fixed]
>> tx-checksum-sctp: off [fixed]
>> scatter-gather: on
>> tx-scatter-gather: on
>> tx-scatter-gather-fraglist: off [fixed]
>> tcp-segmentation-offload: off
>> tx-tcp-segmentation: off [fixed]
>> tx-tcp-ecn-segmentation: off [fixed]
>> tx-tcp6-segmentation: off [fixed]
>> udp-fragmentation-offload: off [fixed]
>> generic-segmentation-offload: on
>> generic-receive-offload: on
>> large-receive-offload: off [fixed]
>> rx-vlan-offload: off
>> tx-vlan-offload: on
>> ntuple-filters: off [fixed]
>> receive-hashing: off [fixed]
>> highdma: off [fixed]
>> tx-vlan-ctag-hw-insert: on [fixed]
>> rx-vlan-ctag-hw-parse: off [fixed]
>> rx-vlan-ctag-filter: off [fixed]
>> tx-vlan-stag-hw-insert: off [fixed]
>> rx-vlan-stag-hw-parse: off [fixed]
>> rx-vlan-stag-filter: off [fixed]
>> vlan-challenged: off [fixed]
>> tx-lockless: off [fixed]
>> netns-local: off [fixed]
>> tx-gso-robust: off [fixed]
>> tx-fcoe-segmentation: off [fixed]
>> tx-gre-segmentation: off [fixed]
>> tx-udp_tnl-segmentation: off [fixed]
>> fcoe-mtu: off [fixed]
>> tx-nocache-copy: off
>> loopback: off [fixed]
>> rx-fcs: off [fixed]
>> rx-all: off [fixed]
>>
>> The output is similar on a working (3.9) and a bad (3.10) kernel.
>> diff-ing both outputs shows:
>> tx-vlan-stag-hw-insert: off [fixed]
>> rx-vlan-stag-hw-parse: off [fixed]
>> rx-vlan-stag-filter: off [fixed]
>> as last on the broken kernels.
>>
>
> I could not reproduce it, I will try it on VM. Meanwhile can you turn
> off feature "sg" and try same test ?

Hello

Your hint was spot-on. With sg off, i can't reproduce the problem and
networking seems fine. It works fine either way on 3.9, so this is a
regression for 3.10.

Thank you very much for your assistance.
Joao Correia
CIUBI
Universidade da Beira Interior
Portugal

>
> Thanks.
>
>> Thank you for your time,
>> Joao Correia
>> CIUBI
>> Universidade da Beira Interior
>> Portugal

2013-05-30 23:13:30

by Pravin Shelar

Subject: Re: Network issue on 3.10 rcs, bisected

On Thu, May 30, 2013 at 2:14 PM, Joao Correia
<[email protected]> wrote:
> On Thu, May 30, 2013 at 5:48 PM, Pravin Shelar <[email protected]> wrote:
>> On Wed, May 29, 2013 at 12:53 PM, Joao Correia
>> <[email protected]> wrote:
>>> On Wed, May 29, 2013 at 7:59 PM, Pravin Shelar <[email protected]> wrote:
>>>>
>>>> On Wed, May 29, 2013 at 4:37 AM, Joao Correia
>>>> <[email protected]> wrote:
>>>> > Hello list
>>>> >
>>>> > While trying the rc's for 3.10, i've stumbled upon a problem where
>>>> > networking does not work at all. Iptables will show packet counts going up,
>>>> > but nothing actually reaches the programs.
>>>> >
>>>> > I'm running fedora under hyper-v 3 (a windows 2012 host). Only tested ipv4
>>>> > traffic, and everything times out (ping, telnet to open ports) on both
>>>> > directions. The networking devices come up apparently ok - has static ip
>>>> > set, and dmesg shows no errors (although i don't have many debugging options
>>>> > enabled).
>>>> >
>>>> > I bisected this, and git blames commit
>>>> > ec5f061564238892005257c83565a0b58ec79295 (net: Kill link between CSUM and SG
>>>> > features.). I can't revert it cleanly on current rc's.
>>>> >
>>>> Can you also send network features set on the device?
>>>> ethtool -k <dev>
>>>
>>>
>>> As requested:
>>> Features for eth0:
>>> rx-checksumming: off [fixed]
>>> tx-checksumming: off
>>> tx-checksum-ipv4: off [fixed]
>>> tx-checksum-ip-generic: off [fixed]
>>> tx-checksum-ipv6: off [fixed]
>>> tx-checksum-fcoe-crc: off [fixed]
>>> tx-checksum-sctp: off [fixed]
>>> scatter-gather: on
>>> tx-scatter-gather: on
>>> tx-scatter-gather-fraglist: off [fixed]
>>> tcp-segmentation-offload: off
>>> tx-tcp-segmentation: off [fixed]
>>> tx-tcp-ecn-segmentation: off [fixed]
>>> tx-tcp6-segmentation: off [fixed]
>>> udp-fragmentation-offload: off [fixed]
>>> generic-segmentation-offload: on
>>> generic-receive-offload: on
>>> large-receive-offload: off [fixed]
>>> rx-vlan-offload: off
>>> tx-vlan-offload: on
>>> ntuple-filters: off [fixed]
>>> receive-hashing: off [fixed]
>>> highdma: off [fixed]
>>> tx-vlan-ctag-hw-insert: on [fixed]
>>> rx-vlan-ctag-hw-parse: off [fixed]
>>> rx-vlan-ctag-filter: off [fixed]
>>> tx-vlan-stag-hw-insert: off [fixed]
>>> rx-vlan-stag-hw-parse: off [fixed]
>>> rx-vlan-stag-filter: off [fixed]
>>> vlan-challenged: off [fixed]
>>> tx-lockless: off [fixed]
>>> netns-local: off [fixed]
>>> tx-gso-robust: off [fixed]
>>> tx-fcoe-segmentation: off [fixed]
>>> tx-gre-segmentation: off [fixed]
>>> tx-udp_tnl-segmentation: off [fixed]
>>> fcoe-mtu: off [fixed]
>>> tx-nocache-copy: off
>>> loopback: off [fixed]
>>> rx-fcs: off [fixed]
>>> rx-all: off [fixed]
>>>
>>> The output is similar on a working (3.9) and a bad (3.10) kernel.
>>> diff-ing both outputs shows:
>>> tx-vlan-stag-hw-insert: off [fixed]
>>> rx-vlan-stag-hw-parse: off [fixed]
>>> rx-vlan-stag-filter: off [fixed]
>>> as last on the broken kernels.
>>>
>>
>> I could not reproduce it, I will try it on VM. Meanwhile can you turn
>> off feature "sg" and try same test ?
>
> Hello
>
> Your hint was spot-on. With sg off, i can't reproduce the problem and
> networking seems fine. It works fine either way on 3.9, so this is a
> regression for 3.10.
>
Nice.
but still this does not look right. Can you tell me driver for the nic?
`ethtool -i <dev>`

Thanks,
Pravin.

2013-05-31 08:27:29

by Joao Correia

Subject: Re: Network issue on 3.10 rcs, bisected

On Fri, May 31, 2013 at 12:13 AM, Pravin Shelar <[email protected]> wrote:
> On Thu, May 30, 2013 at 2:14 PM, Joao Correia
> <[email protected]> wrote:
>> On Thu, May 30, 2013 at 5:48 PM, Pravin Shelar <[email protected]> wrote:
>>> On Wed, May 29, 2013 at 12:53 PM, Joao Correia
>>> <[email protected]> wrote:
>>>> On Wed, May 29, 2013 at 7:59 PM, Pravin Shelar <[email protected]> wrote:
>>>>>
>>>>> On Wed, May 29, 2013 at 4:37 AM, Joao Correia
>>>>> <[email protected]> wrote:
>>>>> > Hello list
>>>>> >
>>>>> > While trying the rc's for 3.10, i've stumbled upon a problem where
>>>>> > networking does not work at all. Iptables will show packet counts going up,
>>>>> > but nothing actually reaches the programs.
>>>>> >
>>>>> > I'm running fedora under hyper-v 3 (a windows 2012 host). Only tested ipv4
>>>>> > traffic, and everything times out (ping, telnet to open ports) on both
>>>>> > directions. The networking devices come up apparently ok - has static ip
>>>>> > set, and dmesg shows no errors (although i don't have many debugging options
>>>>> > enabled).
>>>>> >
>>>>> > I bisected this, and git blames commit
>>>>> > ec5f061564238892005257c83565a0b58ec79295 (net: Kill link between CSUM and SG
>>>>> > features.). I can't revert it cleanly on current rc's.
>>>>> >
>>>>> Can you also send network features set on the device?
>>>>> ethtool -k <dev>
>>>>
>>>>
>>>> As requested:
>>>> Features for eth0:
>>>> rx-checksumming: off [fixed]
>>>> tx-checksumming: off
>>>> tx-checksum-ipv4: off [fixed]
>>>> tx-checksum-ip-generic: off [fixed]
>>>> tx-checksum-ipv6: off [fixed]
>>>> tx-checksum-fcoe-crc: off [fixed]
>>>> tx-checksum-sctp: off [fixed]
>>>> scatter-gather: on
>>>> tx-scatter-gather: on
>>>> tx-scatter-gather-fraglist: off [fixed]
>>>> tcp-segmentation-offload: off
>>>> tx-tcp-segmentation: off [fixed]
>>>> tx-tcp-ecn-segmentation: off [fixed]
>>>> tx-tcp6-segmentation: off [fixed]
>>>> udp-fragmentation-offload: off [fixed]
>>>> generic-segmentation-offload: on
>>>> generic-receive-offload: on
>>>> large-receive-offload: off [fixed]
>>>> rx-vlan-offload: off
>>>> tx-vlan-offload: on
>>>> ntuple-filters: off [fixed]
>>>> receive-hashing: off [fixed]
>>>> highdma: off [fixed]
>>>> tx-vlan-ctag-hw-insert: on [fixed]
>>>> rx-vlan-ctag-hw-parse: off [fixed]
>>>> rx-vlan-ctag-filter: off [fixed]
>>>> tx-vlan-stag-hw-insert: off [fixed]
>>>> rx-vlan-stag-hw-parse: off [fixed]
>>>> rx-vlan-stag-filter: off [fixed]
>>>> vlan-challenged: off [fixed]
>>>> tx-lockless: off [fixed]
>>>> netns-local: off [fixed]
>>>> tx-gso-robust: off [fixed]
>>>> tx-fcoe-segmentation: off [fixed]
>>>> tx-gre-segmentation: off [fixed]
>>>> tx-udp_tnl-segmentation: off [fixed]
>>>> fcoe-mtu: off [fixed]
>>>> tx-nocache-copy: off
>>>> loopback: off [fixed]
>>>> rx-fcs: off [fixed]
>>>> rx-all: off [fixed]
>>>>
>>>> The output is similar on a working (3.9) and a bad (3.10) kernel.
>>>> diff-ing both outputs shows:
>>>> tx-vlan-stag-hw-insert: off [fixed]
>>>> rx-vlan-stag-hw-parse: off [fixed]
>>>> rx-vlan-stag-filter: off [fixed]
>>>> as last on the broken kernels.
>>>>
>>>
>>> I could not reproduce it, I will try it on VM. Meanwhile can you turn
>>> off feature "sg" and try same test ?
>>
>> Hello
>>
>> Your hint was spot-on. With sg off, i can't reproduce the problem and
>> networking seems fine. It works fine either way on 3.9, so this is a
>> regression for 3.10.
>>
> Nice.
> but still this does not look right. Can you tell me driver for the nic?
> `ethtool -i <dev>`

ethtool -i eth0:
driver: hv_netvsc
version: 3.1
firmware-version: N/A
bus-info:
supports-statistics: no
supports-test: no
supports-eeprom-access: no
supports-register-dump: no
supports-priv-flags: no

>
> Thanks,
> Pravin.