Hit the following, after which my network connection was pretty much hosed.

This ring any bells? I don't have a reproducer for this, so bisecting is problematic...
[ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209
[ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10
[ 1820.832732] Code: 49 8b 1f 0f 84 6a 0a 00 00 48 85 db 0f 84 4e 0a 00 00 48 8b 03 48 8b 53 08 49 89 45 00 49 8b 47 10 49 89 55 08 48 85 c0 74 15 <48> 8b 50 08 48 8b 00 49 89 95 b8 01 00 00 49 89 85 b0 01 00 00 4c
[ 1820.832847] RSP: 0018:ffffaa07c2fd7880 EFLAGS: 00010202
[ 1820.832853] RAX: 0000000000000201 RBX: ffffaa07c2fd79b0 RCX: 0000000000000000
[ 1820.832858] RDX: a4cfbfba2cbfa64c RSI: 0000000000000000 RDI: ffffffff8a8e9fa0
[ 1820.832862] RBP: ffffaa07c2fd7920 R08: 000000000000017a R09: ffffffff8a555300
[ 1820.832866] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888d18e71c00
[ 1820.832871] R13: ffff888d0a9b1200 R14: 0000000000000000 R15: ffffaa07c2fd7980
[ 1820.832876] FS: 00007faa51bdb800(0000) GS:ffff888d1d400000(0000) knlGS:0000000000000000
[ 1820.832880] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 1820.832885] CR2: 0000000000000209 CR3: 000000021e8f8001 CR4: 00000000001606e0
[ 1820.832888] Call Trace:
[ 1820.832898] ? __local_bh_enable_ip+0x119/0x260
[ 1820.832904] ? ipv6_create_tempaddr+0x259/0x5a0
[ 1820.832912] ? __local_bh_enable_ip+0x139/0x260
[ 1820.832921] ipv6_create_tempaddr+0x2da/0x5a0
[ 1820.832926] ? ipv6_create_tempaddr+0x2da/0x5a0
[ 1820.832941] manage_tempaddrs+0x1a5/0x240
[ 1820.832951] inet6_addr_del+0x20b/0x3b0
[ 1820.832959] ? nla_parse+0xce/0x1e0
[ 1820.832968] inet6_rtm_deladdr+0xd9/0x210
[ 1820.832981] rtnetlink_rcv_msg+0x1d4/0x5f0
[ 1820.832989] ? netlink_deliver_tap+0xe2/0x7a0
[ 1820.832996] ? rtnetlink_put_metrics+0x290/0x290
[ 1820.833003] netlink_rcv_skb+0x4d/0x140
[ 1820.833012] rtnetlink_rcv+0x15/0x20
[ 1820.833018] netlink_unicast+0x215/0x340
[ 1820.833027] netlink_sendmsg+0x2e3/0x6a0
[ 1820.833040] sock_sendmsg+0x6a/0xf0
[ 1820.833047] __sys_sendto+0x15d/0x220
[ 1820.833059] ? __sys_recvmsg+0x51/0x90
[ 1820.833068] ? do_syscall_64+0x30/0xee7
[ 1820.833076] __x64_sys_sendto+0x2f/0x50
[ 1820.833082] do_syscall_64+0x98/0xee7
[ 1820.833089] ? trace_hardirqs_off_caller+0x1f/0xd0
[ 1820.833096] ? trace_hardirqs_off_thunk+0x1a/0x1c
[ 1820.833107] entry_SYSCALL_64_after_hwframe+0x49/0xbe
[ 1820.833113] RIP: 0033:0x7faa50ed831d
On 6/7/18 1:17 PM, [email protected] wrote:
> Hit the following, after which my network connection was pretty much hosed
>
> This ring any bells? I don't have a reproducer for this, so bisecting is problematic....
>
> [ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209
> [ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10
> [ 1820.832732] Code: 49 8b 1f 0f 84 6a 0a 00 00 48 85 db 0f 84 4e 0a 00 00 48 8b 03 48 8b 53 08 49 89 45 00 49 8b 47 10
> 49 89 55 08 48 85 c0 74 15 <48> 8b 50 08 48 8b 00 49 89 95 b8 01 00 00 49 89 85 b0 01 00 00 4c
> [ 1820.832847] RSP: 0018:ffffaa07c2fd7880 EFLAGS: 00010202
> [ 1820.832853] RAX: 0000000000000201 RBX: ffffaa07c2fd79b0 RCX: 0000000000000000
> [ 1820.832858] RDX: a4cfbfba2cbfa64c RSI: 0000000000000000 RDI: ffffffff8a8e9fa0
> [ 1820.832862] RBP: ffffaa07c2fd7920 R08: 000000000000017a R09: ffffffff8a555300
> [ 1820.832866] R10: 0000000000000000 R11: 0000000000000000 R12: ffff888d18e71c00
> [ 1820.832871] R13: ffff888d0a9b1200 R14: 0000000000000000 R15: ffffaa07c2fd7980
> [ 1820.832876] FS: 00007faa51bdb800(0000) GS:ffff888d1d400000(0000) knlGS:0000000000000000
> [ 1820.832880] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 1820.832885] CR2: 0000000000000209 CR3: 000000021e8f8001 CR4: 00000000001606e0
> [ 1820.832888] Call Trace:
> [ 1820.832898] ? __local_bh_enable_ip+0x119/0x260
> [ 1820.832904] ? ipv6_create_tempaddr+0x259/0x5a0
> [ 1820.832912] ? __local_bh_enable_ip+0x139/0x260
> [ 1820.832921] ipv6_create_tempaddr+0x2da/0x5a0
> [ 1820.832926] ? ipv6_create_tempaddr+0x2da/0x5a0
> [ 1820.832941] manage_tempaddrs+0x1a5/0x240
> [ 1820.832951] inet6_addr_del+0x20b/0x3b0
> [ 1820.832959] ? nla_parse+0xce/0x1e0
> [ 1820.832968] inet6_rtm_deladdr+0xd9/0x210
> [ 1820.832981] rtnetlink_rcv_msg+0x1d4/0x5f0
I am the most likely guilty party. I have been staring at the code for
this stack trace for a while and nothing jumps out. Can you send me the
kernel config?
On Thu, 07 Jun 2018 16:49:07 -0700, David Ahern said:
> On 6/7/18 1:17 PM, [email protected] wrote:
> > [ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209
> > [ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10
> > [ 1820.832888] Call Trace:
> > [ 1820.832898] ? __local_bh_enable_ip+0x119/0x260
> > [ 1820.832904] ? ipv6_create_tempaddr+0x259/0x5a0
> > [ 1820.832912] ? __local_bh_enable_ip+0x139/0x260
> > [ 1820.832921] ipv6_create_tempaddr+0x2da/0x5a0
> > [ 1820.832926] ? ipv6_create_tempaddr+0x2da/0x5a0
> > [ 1820.832941] manage_tempaddrs+0x1a5/0x240
> > [ 1820.832951] inet6_addr_del+0x20b/0x3b0
> > [ 1820.832959] ? nla_parse+0xce/0x1e0
> > [ 1820.832968] inet6_rtm_deladdr+0xd9/0x210
> > [ 1820.832981] rtnetlink_rcv_msg+0x1d4/0x5f0
>
> I am the most likely guilty party. I have been staring at the code for
> this stack trace for a while and nothing jumps out. Can you send me the
> kernel config?
Attached. Note that this one happened while I was on wireless at work,
where we're *heavily* IPv6 (I've had days where I'll work for 2-3 hours before
I notice that IPv4 didn't dhcp and I've been ipv6-only the whole time).
Also, the interface was config'ed as:
conf/wlp3s0b1/temp_prefered_lft:86400
conf/wlp3s0b1/temp_valid_lft:604800
conf/wlp3s0b1/use_tempaddr:2
On 6/7/18 5:03 PM, [email protected] wrote:
> On Thu, 07 Jun 2018 16:49:07 -0700, David Ahern said:
>> On 6/7/18 1:17 PM, [email protected] wrote:
>
>>> [ 1820.832682] BUG: unable to handle kernel NULL pointer dereference at 0000000000000209
>>> [ 1820.832728] RIP: 0010:ipv6_add_addr+0x280/0xd10
>
>>> [ 1820.832888] Call Trace:
>>> [ 1820.832898] ? __local_bh_enable_ip+0x119/0x260
>>> [ 1820.832904] ? ipv6_create_tempaddr+0x259/0x5a0
>>> [ 1820.832912] ? __local_bh_enable_ip+0x139/0x260
>>> [ 1820.832921] ipv6_create_tempaddr+0x2da/0x5a0
>>> [ 1820.832926] ? ipv6_create_tempaddr+0x2da/0x5a0
>>> [ 1820.832941] manage_tempaddrs+0x1a5/0x240
>>> [ 1820.832951] inet6_addr_del+0x20b/0x3b0
>>> [ 1820.832959] ? nla_parse+0xce/0x1e0
>>> [ 1820.832968] inet6_rtm_deladdr+0xd9/0x210
>>> [ 1820.832981] rtnetlink_rcv_msg+0x1d4/0x5f0
>>
>> I am the most likely guilty party. I have been staring at the code for
>> this stack trace for a while and nothing jumps out. Can you send me the
>> kernel config?
>
> Attached. Note that this one happened while I was on wireless at work,
> where we're *heavily* IPv6 (I've had days where I'll work for 2-3 hours before
> I notice that IPv4 didn't dhcp and I've been ipv6-only the whole time.
>
> Also, the interface was config'ed as:
>
> conf/wlp3s0b1/temp_prefered_lft:86400
> conf/wlp3s0b1/temp_valid_lft:604800
> conf/wlp3s0b1/use_tempaddr:2
>
I know you don't have a reliable reproducer, but I did find one spot
where I was too clever and did not initialize a new cfg variable:
diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
index 89019bf59f46..59c22a25e654 100644
--- a/net/ipv6/addrconf.c
+++ b/net/ipv6/addrconf.c
@@ -1324,6 +1324,7 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp,
 		}
 	}
 
+	memset(&cfg, 0, sizeof(cfg));
 	cfg.valid_lft = min_t(__u32, ifp->valid_lft,
 			      idev->cnf.temp_valid_lft + age);
 	cfg.preferred_lft = cnf_temp_preferred_lft + age - idev->desync_factor;
On Thu, Jun 7, 2018 at 5:51 PM, David Ahern <[email protected]> wrote:
>
> ...
> I know you don't have a reliable reproducer, but I did find one spot
> where I was too clever and did not initialize a new cfg variable:
>
> diff --git a/net/ipv6/addrconf.c b/net/ipv6/addrconf.c
> index 89019bf59f46..59c22a25e654 100644
> --- a/net/ipv6/addrconf.c
> +++ b/net/ipv6/addrconf.c
> @@ -1324,6 +1324,7 @@ static int ipv6_create_tempaddr(struct inet6_ifaddr *ifp,
>  		}
>  	}
>
> +	memset(&cfg, 0, sizeof(cfg));
>  	cfg.valid_lft = min_t(__u32, ifp->valid_lft,
>  			      idev->cnf.temp_valid_lft + age);
>  	cfg.preferred_lft = cnf_temp_preferred_lft + age - idev->desync_factor;
This works for me. Great!
Thanks,
-- Dexuan