2012-06-28 13:36:51

by Xiaotian Feng

[permalink] [raw]
Subject: [RFC PATCH net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()

We met a kernel panic in 2.6.32.43 kernel:

[2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
<snip>
[2680311.849009] general protection fault: 0000 [#1] SMP
[2680311.853001] RIP: 0010:[<ffffffff815f155c>] [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
[2680311.853001] RSP: 0018:ffff880028303e70 EFLAGS: 00010202
[2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
[2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
<snip>
[2680311.853001] Call Trace:
[2680311.853001] <IRQ>
[2680311.853001] [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
[2680311.853001] [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
[2680311.853001] [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
[2680311.853001] [<ffffffff81049a13>] __do_softirq+0xb3/0x150
[2680311.853001] [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
[2680311.853001] [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
[2680311.853001] [<ffffffff81049957>] irq_exit+0x77/0x80
[2680311.853001] [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
[2680311.853001] [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
[2680311.853001] <EOI>
[2680311.853001] [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
[2680311.853001] [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
[2680311.853001] [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
[2680311.853001] [<ffffffff816d504d>] ? start_secondary+0x19d/0x280

rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
connection and result the general protect fault.

The "request for already hashed" warning, told us someone might change the connection flags
incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
and list_del() it, then kernel panic happened.

After code review, the only chance that kernel change connection flag without protection is
in ip_vs_ftp_init_conn().

Signed-off-by: Xiaotian Feng <[email protected]>
Cc: Wensong Zhang <[email protected]>
Cc: Simon Horman <[email protected]>
Cc: Julian Anastasov <[email protected]>
Cc: Pablo Neira Ayuso <[email protected]>
Cc: Patrick McHardy <[email protected]>
Cc: "David S. Miller" <[email protected]>
---
net/netfilter/ipvs/ip_vs_ftp.c | 2 ++
1 files changed, 2 insertions(+), 0 deletions(-)

diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
index b20b29c..c2bc264 100644
--- a/net/netfilter/ipvs/ip_vs_ftp.c
+++ b/net/netfilter/ipvs/ip_vs_ftp.c
@@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
static int
ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
{
+ spin_lock(&cp->lock);
/* We use connection tracking for the command connection */
cp->flags |= IP_VS_CONN_F_NFCT;
+ spin_unlock(&cp->lock);
return 0;
}

--
1.7.1


2012-06-29 00:17:45

by Julian Anastasov

[permalink] [raw]
Subject: Re: [RFC PATCH net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()


Hello,

On Thu, 28 Jun 2012, Xiaotian Feng wrote:

> We met a kernel panic in 2.6.32.43 kernel:
>
> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> <snip>
> [2680311.849009] general protection fault: 0000 [#1] SMP
> [2680311.853001] RIP: 0010:[<ffffffff815f155c>] [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
> [2680311.853001] RSP: 0018:ffff880028303e70 EFLAGS: 00010202
> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
> <snip>
> [2680311.853001] Call Trace:
> [2680311.853001] <IRQ>
> [2680311.853001] [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
> [2680311.853001] [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
> [2680311.853001] [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
> [2680311.853001] [<ffffffff81049a13>] __do_softirq+0xb3/0x150
> [2680311.853001] [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
> [2680311.853001] [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
> [2680311.853001] [<ffffffff81049957>] irq_exit+0x77/0x80
> [2680311.853001] [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
> [2680311.853001] [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
> [2680311.853001] <EOI>
> [2680311.853001] [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
> [2680311.853001] [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
> [2680311.853001] [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
> [2680311.853001] [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
>
> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
> connection and result the general protect fault.
>
> The "request for already hashed" warning, told us someone might change the connection flags
> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
> and list_del() it, then kernel panic happened.
>
> After code review, the only chance that kernel change connection flag without protection is
> in ip_vs_ftp_init_conn().

Hm, ip_vs_ftp_init_conn is called before 1st hashing,
from ip_vs_bind_app() in ip_vs_conn_new() before
ip_vs_conn_hash(). It should be another problem with
the flags. How different is IPVS in 2.6.32.43 compared to
recent kernels? If commit aea9d711 is present, I'm not
aware of other similar problems.

May be you can post some details for your setup on
[email protected]. I assume FTP is used,
what about master-backup sync? Can you confirm that
this fix solves the problem or it happens in rare cases?

> Signed-off-by: Xiaotian Feng <[email protected]>
> Cc: Wensong Zhang <[email protected]>
> Cc: Simon Horman <[email protected]>
> Cc: Julian Anastasov <[email protected]>
> Cc: Pablo Neira Ayuso <[email protected]>
> Cc: Patrick McHardy <[email protected]>
> Cc: "David S. Miller" <[email protected]>
> ---
> net/netfilter/ipvs/ip_vs_ftp.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
> index b20b29c..c2bc264 100644
> --- a/net/netfilter/ipvs/ip_vs_ftp.c
> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
> static int
> ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
> {
> + spin_lock(&cp->lock);
> /* We use connection tracking for the command connection */
> cp->flags |= IP_VS_CONN_F_NFCT;
> + spin_unlock(&cp->lock);
> return 0;
> }
>
> --
> 1.7.1

Regards

--
Julian Anastasov <[email protected]>

2012-06-29 00:34:52

by Simon Horman

[permalink] [raw]
Subject: Re: [RFC PATCH net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()

On Thu, Jun 28, 2012 at 09:36:27PM +0800, Xiaotian Feng wrote:
> We met a kernel panic in 2.6.32.43 kernel:
>
> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> <snip>
> [2680311.849009] general protection fault: 0000 [#1] SMP
> [2680311.853001] RIP: 0010:[<ffffffff815f155c>] [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
> [2680311.853001] RSP: 0018:ffff880028303e70 EFLAGS: 00010202
> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
> <snip>
> [2680311.853001] Call Trace:
> [2680311.853001] <IRQ>
> [2680311.853001] [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
> [2680311.853001] [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
> [2680311.853001] [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
> [2680311.853001] [<ffffffff81049a13>] __do_softirq+0xb3/0x150
> [2680311.853001] [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
> [2680311.853001] [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
> [2680311.853001] [<ffffffff81049957>] irq_exit+0x77/0x80
> [2680311.853001] [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
> [2680311.853001] [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
> [2680311.853001] <EOI>
> [2680311.853001] [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
> [2680311.853001] [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
> [2680311.853001] [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
> [2680311.853001] [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
>
> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
> connection and result the general protect fault.
>
> The "request for already hashed" warning, told us someone might change the connection flags
> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
> and list_del() it, then kernel panic happened.
>
> After code review, the only chance that kernel change connection flag without protection is
> in ip_vs_ftp_init_conn().

Thanks for being thorough.

> Signed-off-by: Xiaotian Feng <[email protected]>
> Cc: Wensong Zhang <[email protected]>
> Cc: Simon Horman <[email protected]>
> Cc: Julian Anastasov <[email protected]>
> Cc: Pablo Neira Ayuso <[email protected]>
> Cc: Patrick McHardy <[email protected]>
> Cc: "David S. Miller" <[email protected]>

Acked-by: Simon Horman <[email protected]>

Pablo, can you handle this directly through your tree?
I think it also needs to go to -stable.

> ---
> net/netfilter/ipvs/ip_vs_ftp.c | 2 ++
> 1 files changed, 2 insertions(+), 0 deletions(-)
>
> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
> index b20b29c..c2bc264 100644
> --- a/net/netfilter/ipvs/ip_vs_ftp.c
> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
> static int
> ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
> {
> + spin_lock(&cp->lock);
> /* We use connection tracking for the command connection */
> cp->flags |= IP_VS_CONN_F_NFCT;
> + spin_unlock(&cp->lock);
> return 0;
> }

2012-06-29 01:51:04

by Xiaotian Feng

[permalink] [raw]
Subject: Re: [RFC PATCH net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()

On Fri, Jun 29, 2012 at 8:17 AM, Julian Anastasov <[email protected]> wrote:
>
>        Hello,
>
> On Thu, 28 Jun 2012, Xiaotian Feng wrote:
>
>> We met a kernel panic in 2.6.32.43 kernel:
>>
>> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
>> <snip>
>> [2680311.849009] general protection fault: 0000 [#1] SMP
>> [2680311.853001] RIP: 0010:[<ffffffff815f155c>]  [<ffffffff815f155c>] ip_vs_conn_expire+0xdc/0x2f0
>> [2680311.853001] RSP: 0018:ffff880028303e70  EFLAGS: 00010202
>> [2680311.853001] RAX: dead000000200200 RBX: ffff8801aad00b80 RCX: 0000000000001d90
>> [2680311.853001] RDX: dead000000100100 RSI: 000000004fd59800 RDI: ffff8801aad00c08
>> <snip>
>> [2680311.853001] Call Trace:
>> [2680311.853001]  <IRQ>
>> [2680311.853001]  [<ffffffff815f1480>] ? ip_vs_conn_expire+0x0/0x2f0
>> [2680311.853001]  [<ffffffff8104e2a5>] run_timer_softirq+0x175/0x1d0
>> [2680311.853001]  [<ffffffff81021a48>] ? lapic_next_event+0x18/0x20
>> [2680311.853001]  [<ffffffff81049a13>] __do_softirq+0xb3/0x150
>> [2680311.853001]  [<ffffffff8100cc5c>] call_softirq+0x1c/0x30
>> [2680311.853001]  [<ffffffff8100ea9a>] do_softirq+0x4a/0x80
>> [2680311.853001]  [<ffffffff81049957>] irq_exit+0x77/0x80
>> [2680311.853001]  [<ffffffff81021f2c>] smp_apic_timer_interrupt+0x6c/0xa0
>> [2680311.853001]  [<ffffffff8100c633>] apic_timer_interrupt+0x13/0x20
>> [2680311.853001]  <EOI>
>> [2680311.853001]  [<ffffffff81013b52>] ? mwait_idle+0x52/0x70
>> [2680311.853001]  [<ffffffff8100a7b0>] ? enter_idle+0x20/0x30
>> [2680311.853001]  [<ffffffff8100ac62>] ? cpu_idle+0x52/0x80
>> [2680311.853001]  [<ffffffff816d504d>] ? start_secondary+0x19d/0x280
>>
>> rax and rdx is LIST_POISON1 and LIST_POISON2, so kernel is list_del() on an already deleted
>> connection and result the general protect fault.
>>
>> The "request for already hashed" warning, told us someone might change the connection flags
>> incorrectly, like described in commit aea9d711, it changes the connection flags, but doesn't
>> put the connection back to the list. So ip_vs_conn_hash() throw a warning and return.
>> Later, when ip_vs_conn_expire fire again, ip_vs_conn_unhash() will find the HASHED connection
>> and list_del() it, then kernel panic happened.
>>
>> After code review, the only chance that kernel change connection flag without protection is
>> in ip_vs_ftp_init_conn().
>
>        Hm, ip_vs_ftp_init_conn is called before 1st hashing,
> from ip_vs_bind_app() in ip_vs_conn_new() before
> ip_vs_conn_hash(). It should be another problem with
> the flags. How different is IPVS in 2.6.32.43 compared to
> recent kernels? If commit aea9d711 is present, I'm not
> aware of other similar problems.

ip_vs_bind_app() is also called by ip_vs_try_bind_dest(), which can be
traced to ip_vs_proc_conn().
I've checked the changes in upstream, but nothing helps since aea9d711
has been taken into 2.6.32.28 kernel.

>
>        May be you can post some details for your setup on
> [email protected]. I assume FTP is used,
> what about master-backup sync? Can you confirm that
> this fix solves the problem or it happens in rare cases?

Yes, ftp and WRR scheduler is used. Unfortunately, the oops
is unreproducible :(

>
>> Signed-off-by: Xiaotian Feng <[email protected]>
>> Cc: Wensong Zhang <[email protected]>
>> Cc: Simon Horman <[email protected]>
>> Cc: Julian Anastasov <[email protected]>
>> Cc: Pablo Neira Ayuso <[email protected]>
>> Cc: Patrick McHardy <[email protected]>
>> Cc: "David S. Miller" <[email protected]>
>> ---
>>  net/netfilter/ipvs/ip_vs_ftp.c |    2 ++
>>  1 files changed, 2 insertions(+), 0 deletions(-)
>>
>> diff --git a/net/netfilter/ipvs/ip_vs_ftp.c b/net/netfilter/ipvs/ip_vs_ftp.c
>> index b20b29c..c2bc264 100644
>> --- a/net/netfilter/ipvs/ip_vs_ftp.c
>> +++ b/net/netfilter/ipvs/ip_vs_ftp.c
>> @@ -65,8 +65,10 @@ static int ip_vs_ftp_pasv;
>>  static int
>>  ip_vs_ftp_init_conn(struct ip_vs_app *app, struct ip_vs_conn *cp)
>>  {
>> +     spin_lock(&cp->lock);
>>       /* We use connection tracking for the command connection */
>>       cp->flags |= IP_VS_CONN_F_NFCT;
>> +     spin_unlock(&cp->lock);
>>       return 0;
>>  }
>>
>> --
>> 1.7.1
>
> Regards
>
> --
> Julian Anastasov <[email protected]>

2012-06-29 08:59:49

by Julian Anastasov

[permalink] [raw]
Subject: Re: [RFC PATCH net-next] ipvs: add missing lock in ip_vs_ftp_init_conn()


Hello,

On Fri, 29 Jun 2012, Xiaotian Feng wrote:

> > On Thu, 28 Jun 2012, Xiaotian Feng wrote:
> >
> >> We met a kernel panic in 2.6.32.43 kernel:
> >>
> >> [2680191.848044] IPVS: ip_vs_conn_hash(): request for already hashed, called from run_timer_softirq+0x175/0x1d0
> >> <snip>
> >> [2680311.849009] general protection fault: 0000 [#1] SMP

What we see here is 120 seconds between 2680191 and
2680311. It can mean 2 things:

- some state timeout, it depends on your forwarding method.
What is it? NAT? DR?

- 60 seconds for ip_vs_conn_expire retries

> >> After code review, the only chance that kernel change connection flag without protection is
> >> in ip_vs_ftp_init_conn().
> >
> >        Hm, ip_vs_ftp_init_conn is called before 1st hashing,
> > from ip_vs_bind_app() in ip_vs_conn_new() before
> > ip_vs_conn_hash(). It should be another problem with
> > the flags. How different is IPVS in 2.6.32.43 compared to
> > recent kernels? If commit aea9d711 is present, I'm not
> > aware of other similar problems.
>
> ip_vs_bind_app() is also called by ip_vs_try_bind_dest(), which can be
> traced to ip_vs_proc_conn().
> I've checked the changes in upstream, but nothing helps since aea9d711
> has been taken into 2.6.32.28 kernel.

OK, this fix should make it safe for master-backup
sync and it should be applied but I suspect you are not
using sync, right? And then this fix will not solve the oops.

There are no many places that rehash conn:

ip_vs_conn_fill_cport
- used for FTP

ip_vs_check_template:
- do you have persistence configured?

After you provide details for the used forwarding
method, persistence and sync we should think how such races
with rehashing can lead to double hlist_del. May be
you can modify the debug message in ip_vs_conn_hash, so
that we can see cp->flags and ntohs of cp->cport, cp->dport
and cp->vport when oops happens again.

Regards

--
Julian Anastasov <[email protected]>