2004-09-12 15:45:44

by Wolfpaw - Dale Corse

Subject: RE: Linux 2.4.27 SECURITY BUG - TCP Local and REMOTE(verified) Denial of Service Attack

Hi Willy,

No problem :) I run the following, against SSH as the target, and I
can also kill it. (using telnet as the other side of the attack)

root@magik:/etc# telnet 0.0.0.0 22
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.
Connection closed by foreign host.
root@magik:/etc# telnet 0.0.0.0 23
Trying 0.0.0.0...
Connected to 0.0.0.0.
Escape character is '^]'.

Magik login: Connection closed by foreign host.
root@magik:/etc#

And from a remote host:

root@maximus:/home/admin# telnet XXXXXXXXXXXXXXX 22
Trying XXXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXXXXXX.
Escape character is '^]'.
Connection closed by foreign host.
root@maximus:/home/admin#

And it gets worse now..:

root@avalon:/root# telnet XXXXXXXXXXXXXXX
Trying XXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXX.
Escape character is '^]'.
telnetd: All network ports in use.
Connection closed by foreign host.
root@avalon:/root# telnet XXXXXXXXXXXXXX 22
Trying XXXXXXXXXXXXXX...
Connected to XXXXXXXXXXXXXXXXX.
Escape character is '^]'.
Connection closed by foreign host.
root@avalon:/root#

Well well.. We have ourselves a Remote Denial of Service tool.

Now.. Do you really want me to post the source code for it?

I wouldn't want to upset David again. This has basically
disabled interactive administration on that machine, by taking
out both ssh and telnet at the same time. If you want to demo
it, I can send it to you privately.. I'm a bit apprehensive
about releasing a 'ready-to-rock' remote DoS exploit on a
list though :(

Just think here a moment.. Let's say I modify it a bit more,
and turn it into a DDOS utility, so now you have (pardon my
language) .. An assload of these coming at your server, how
are you going to stop it? Simply - you can't, and your server
will run out of sockets long before all the remote hosts do.

In essence, this one takes a nice working TCP connection application
and removes the close statements, which, as you mentioned, causes the
sockets to end up in that state. What I am attempting to demonstrate
here is that it can fairly easily take out any TCP-based app, and I am
simply saying that something should be done about it. I think I know
what the actual bug is (and have said so), but I will leave that
determination to the actual kernel developers.

I was not attempting to say you should break TCP with the timeouts, or
even make them short. I just tend to think that no timeout at ALL is
a bit of a design flaw: if the other end is no longer there, the work
will never get done, yet one of the ends expects the response
indefinitely, which to me looks like an assumption. I think we have all
learned these days that you can't make any more assumptions; people use
those to break things.

I would also like to thank you for engaging in a discussion about the
bug with me in a polite manner, instead of simply writing me off as
some loudmouth neophyte.

Anyway - from my view, this is a bug in the OS, because it should not
occur; if it does, we need to find a way to ensure it doesn't. I know
a few firewall tricks that might stop it, but I'm not sure. If a regular
user can invoke this kind of response so easily, I would say it's a bad
thing.

Regards,
Dale.
> -----Original Message-----
> From: Dale Corse
> Sent: Sunday, September 12, 2004 8:58 AM
> To: Dale
> Subject: FW: Linux 2.4.27 SECURITY BUG - TCP Local (probable
> Remote) Denial of Service
>
>
>
>
> -----Original Message-----
> From: Willy Tarreau
> Sent: Sunday, September 12, 2004 4:36 AM
> To: Wolfpaw - Dale Corse
> Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local (probable
> Remote) Denial of Service
>
>
> On Sun, Sep 12, 2004 at 03:24:11AM -0600, Wolfpaw - Dale Corse wrote:
>
> > This is the odd part, try the exploit,
>
> I have nothing to try it right here.
>
> > they are detached in
> > the list, but it appears apache isn't aware of that. If you run the
> > code, and do multiple telnets from another window, you will see that
> > there are occurrences where a connection can't be established, and
> > this is where the problem is. I used a stock version of Mysql 3
> > (latest stable), stock apache, on an unmodified Linux box (except it
> > had GrSecurity) and I was able to see a noticeable slowdown in web
> > transactions with a browser. I was also the only person hitting the
> > machine.
>
> How can you be sure that your problem is not simply related
> to either apache or mysql not freeing the connection fast
> enough ? Apache is very limited in terms of simultaneous
> connections, and it is trivial for anyone to block an apache
> server by establishing as many connections as it can handle,
> sending the start of a request and doing nothing more (and it
> has a very long default time out BTW). It might be the same
> with mysql.
>
> > I am not saying you are incorrect, I'm simply clarifying what seems
> > to be occurring with the issue I found.
> >
> > Do you happen to know of any solution for sockets stuck in
> > CLOSE_WAIT, they seem to stick around forever.
>
> Yes, the only solution is to debug the process and make it
> sanely close the socket once it does not need it anymore.
> Usually, in such circumstances, you'll find that an strace on
> the process shows either :
> - a select loop with your socket in the list of the active FDs, but
> nothing in the process will do anything on this FD and the process
> will go back to the select loop => bug in FD handling
> - a select loop which does not include the FD while it has
> not been released
> => bug in FD releasing code (usually a missing close).
>
> > This bug may be more Mysql than kernel, I don't know - I still would
> > tend to think these connections should not be clogging up the
> > application's connection queue, and that CLOSE_WAIT should have a
> > settable timeout, regardless of what the RFC says about it.
>
> No, CLOSE_WAIT means that the application still has some work
> to do. Under no circumstances should the kernel destroy its
> ability to work normally !
>
> > I did experience more CLOSE_WAIT's stuck at one point with Mysql.. we
> > had an issue wherein after calling mysql_close with the C API it was
> > still leaving the sessions established, so I had moved the timeout on
> > that sql daemon to 20 seconds (it's all fast transactions) .. This
> > caused a lot of CLOSE_WAIT issues for some reason.
>
> So you've just demonstrated that it's mysql_close which is
> the culprit. If it does not really close the connection while
> you expected it to, it is the real problem. If you lower the
> mysql timeout, mysql will close on its end, but as long as
> the code using mysql_close() does not close, of course the
> socket will remain in CLOSE_WAIT. And to be clear, even if you
> had a short CLOSE_WAIT time-out, it would not help
> because you would still run out of file descriptors
> after a moment.
>
> > We then
> > added something that would go through and use 'close' on the fd of the
> > Mysql connection, after mysql_close was called. This had the odd
> > effect of the fd being reused by a connection, before it was out of
> > CLOSE_WAIT and actually closed, so it would close the new connection,
> > and also the old one :P which led us to this discovery that connect()
> > appears to reuse FD's before they are actually fully closed.. This is
> > how it appears anyway. Thus my use of specifically mysql and connect
> > in the PoC code.
>
> If you manage to write PoC code which involves
> neither apache nor mysql, and which still exhibits the
> described behaviour, then perhaps kernel developers will
> listen a bit more, but at the moment, you have only shown us how
> you could trigger a DoS by connecting to a buggy application.
>
> Cheers,
> Willy


2004-09-12 16:48:00

by Petri Kaukasoina

Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local and REMOTE(verified) Denial of Service Attack

On Sun, Sep 12, 2004 at 09:45:38AM -0600, Wolfpaw - Dale Corse wrote:
> No problem :) I run the following, against SSH as the target, and I
> can also kill it. (using telnet as the other side of the attack)
>
> root@maximus:/home/admin# telnet XXXXXXXXXXXXXXX 22
> Trying XXXXXXXXXXXXXX...
> Connected to XXXXXXXXXXXXXXXXX.
> Escape character is '^]'.
> Connection closed by foreign host.
> root@maximus:/home/admin#

> Now.. Do you really want me to post the source code for it?

With default sshd_config you can DOS sshd trivially by opening ten
connections using ten times "telnet XXXXXXXXXXXXXXX 22".

2004-09-12 18:00:16

by Willy Tarreau

Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local and REMOTE(verified) Denial of Service Attack

Hi Dale,

I've tried your code right here.
The "attacker" was 10.0.3.1, and the victim 10.0.3.2.

I could successfully generate 1 CLOSE_WAIT on the victim with your program.
It was on port 23 and attached to inetd as fd #3. So I killed inetd, the
connection was then freed, and restarted it.

I changed the code slightly to be able to pass IP/ports as arguments.
On the victim, I straced inetd (pid 1013), and captured all TCP traffic
on port 23.

attacker> ./tcpnclose2 10.0.3.2 22 10.0.3.2 23

I stopped it when it was shouting at me :
socket failed.Connecting to 10.0.3.2:22 (FD: -1)... FAILED: UNKNOWN ERROR.
socket failed.Connecting to 10.0.3.2:23 (FD: -1)... FAILED: UNKNOWN ERROR.

Then, on the victim :

victim> sudo netstat -atnp|grep -v LISTEN
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp        1      0 10.0.3.2:23             10.0.3.1:34058          CLOSE_WAIT  1013/inetd

victim> tcpdump -Svnr capture-victim.cap tcp port 34058
reading from file capture-victim.cap, link-type EN10MB (Ethernet)
19:05:10.360728 IP (tos 0x0, ttl 64, id 8168, offset 0, flags [DF], length: 48) 10.0.3.1.34058 > 10.0.3.2.23: S [tcp sum ok] 2882867180:2882867180(0) win 15920 <mss 7960,nop,nop,sackOK>
19:05:10.360764 IP (tos 0x0, ttl 64, id 0, offset 0, flags [DF], length: 48) 10.0.3.2.23 > 10.0.3.1.34058: S [tcp sum ok] 2614211278:2614211278(0) ack 2882867181 win 5840 <mss 1460,nop,nop,sackOK>
19:05:10.360863 IP (tos 0x0, ttl 64, id 8169, offset 0, flags [DF], length: 40) 10.0.3.1.34058 > 10.0.3.2.23: . [tcp sum ok] ack 2614211279 win 15920
19:06:17.668670 IP (tos 0x0, ttl 64, id 8170, offset 0, flags [DF], length: 40) 10.0.3.1.34058 > 10.0.3.2.23: F [tcp sum ok] 2882867181:2882867181(0) ack 2614211279 win 15920
19:06:17.671102 IP (tos 0x0, ttl 64, id 11127, offset 0, flags [DF], length: 40) 10.0.3.2.23 > 10.0.3.1.34058: . [tcp sum ok] ack 2882867182 win 5840

==> We see that the victim (10.0.3.2) did not send the FIN in return.
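
For readers who want to see this state in isolation, here is a minimal Python sketch. It is not the tcpnclose2 program from this thread, and it only talks to a loopback listener that it creates itself; the name half_close_demo is hypothetical. The client closes first, the accepted socket then reads end-of-stream, and it stays in CLOSE_WAIT until the application calls close(), because that call is what makes the kernel send the FIN in return.

```python
import socket


def half_close_demo():
    """Return what the server side reads after the client closes first."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    listener.listen(1)

    client = socket.create_connection(listener.getsockname())
    server_side, _ = listener.accept()

    client.close()                    # the client's FIN is ACKed by the kernel
    server_side.settimeout(5)
    data = server_side.recv(4096)     # b"" signals end-of-stream from the peer

    # Until this close() runs, the accepted socket sits in CLOSE_WAIT:
    # the kernel does not send our FIN until the application asks for it.
    server_side.close()
    listener.close()
    return data


if __name__ == "__main__":
    print(repr(half_close_demo()))
```

A process that never reaches the close() call, as in the capture above, keeps both the socket and its file descriptor allocated.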

Now let's take a closer look at inetd :

victim> cat /proc/net/tcp
  sl  local_address rem_address   st tx_queue rx_queue tr tm->when retrnsmt   uid  timeout inode
  16: 0203000A:0017 0103000A:850A 08 00000000:00000001 00:00000000 00000000     0        0 6420 1 d5dac400 1500 20 0 2 -1

==> The socket (state 8 = CLOSE_WAIT) is bound to inode 6420.
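
The hex fields in that line can be decoded mechanically. The following Python sketch is an illustration only: the function names are hypothetical, and it assumes a little-endian machine (such as x86), where the IPv4 address is printed as a host-order 32-bit value. The state table follows the kernel's TCP state numbering.

```python
import socket
import struct

# State codes as they appear in the "st" column of /proc/net/tcp.
TCP_STATES = {
    0x01: "ESTABLISHED", 0x02: "SYN_SENT", 0x03: "SYN_RECV",
    0x04: "FIN_WAIT1", 0x05: "FIN_WAIT2", 0x06: "TIME_WAIT",
    0x07: "CLOSE", 0x08: "CLOSE_WAIT", 0x09: "LAST_ACK",
    0x0A: "LISTEN", 0x0B: "CLOSING",
}


def decode_endpoint(field):
    """Turn a field such as '0203000A:0017' into an (address, port) pair."""
    addr_hex, port_hex = field.split(":")
    # Assumption: little-endian host, so the 32-bit value is byte-swapped.
    addr = socket.inet_ntoa(struct.pack("<I", int(addr_hex, 16)))
    return addr, int(port_hex, 16)


def parse_proc_net_tcp_line(line):
    """Extract endpoints, state name and inode from one /proc/net/tcp row."""
    fields = line.split()
    return {
        "local": decode_endpoint(fields[1]),
        "remote": decode_endpoint(fields[2]),
        "state": TCP_STATES.get(int(fields[3], 16), "UNKNOWN"),
        "inode": int(fields[9]),
    }


if __name__ == "__main__":
    row = ("16: 0203000A:0017 0103000A:850A 08 00000000:00000001 "
           "00:00000000 00000000 0 0 6420 1 d5dac400 1500 20 0 2 -1")
    print(parse_proc_net_tcp_line(row))
```

Applied to the row above, this gives the same connection netstat reported: 10.0.3.2:23 to 10.0.3.1:34058, in CLOSE_WAIT, inode 6420.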

victim> sudo ls -l /proc/1013/fd/|grep 6420
lrwx------ 1 root root 64 Sep 12 19:28 3 -> socket:[6420]

==> Again, it's FD #3.
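
The same inode-to-descriptor lookup can be scripted. This is a small Python sketch under stated assumptions: fds_for_socket_inode is a hypothetical helper, and it relies on the Linux convention that entries under /proc/<pid>/fd are symlinks whose target reads socket:[inode] for sockets.

```python
import os


def fds_for_socket_inode(inode, fd_dir):
    """Return the descriptor numbers in fd_dir that point at socket:[inode]."""
    target = "socket:[%d]" % inode
    matches = []
    for name in os.listdir(fd_dir):
        try:
            if os.readlink(os.path.join(fd_dir, name)) == target:
                matches.append(int(name))
        except OSError:
            continue  # descriptor vanished meanwhile, or entry is not a symlink
    return sorted(matches)


if __name__ == "__main__":
    # Inspect this process itself; reading another pid's fd directory
    # normally requires root, as with the sudo ls above.
    print(fds_for_socket_inode(6420, "/proc/self/fd"))
```

Pointing it at /proc/1013/fd with inode 6420 would be the scripted equivalent of the ls and grep pipeline shown above.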

I restarted strace on inetd, and noticed that fd#3 was not in the select fd
list anymore (remember one of the two cases I spoke about a few hours ago ?) :
victim> strace -p 1013
select(22, [4 5 6 7 8 9 11 12 13 14 15 16 17 18 19 20 21], NULL, NULL, NULL <unfinished ...>

Then, I took a look at the strace capture (184 MB !), into which I inserted line
numbers for better readability :

1:1013 accept(10, 0, NULL) = 3
2:1013 fcntl64(10, F_SETFL, O_RDONLY) = 0
3:1013 rt_sigprocmask(SIG_BLOCK, [HUP ALRM CHLD], NULL, 8) = 0
4:1013 fork() = 1108
5:1013 rt_sigprocmask(SIG_SETMASK, [], NULL, 8) = 0
6:1013 close(3) = 0

This was the last but one connection assigned to fd #3. As you see, it's
finally closed. But a few lines later :

7:1013 select(22, [4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21], NULL, NULL, NULL) = 1 (in [10])
8:1013 fcntl64(10, F_SETFL, O_RDONLY|O_NONBLOCK) = 0
9:1013 accept(10, 0, NULL) = 3
10:1013 fcntl64(10, F_SETFL, O_RDONLY) = 0
11:1013 rt_sigprocmask(SIG_BLOCK, [HUP ALRM CHLD], NULL, 8) = 0
12:1013 gettimeofday({1095008773, 685550}, NULL) = 0

The FD gets re-used, but is never scanned anymore, so never closed either :

35:1013 select(22, [4 5 6 7 8 9 11 12 13 14 15 16 17 18 19 20 21], NULL, NULL, NULL <unfinished ...>

Conclusion :
============

The problem is within inetd. In my case it could be because it was a bit
old (1999), but since you have it too, it might indicate an old bug. The
fact that it affects mysql too does not prove that the problem is in the
kernel, and I suspect that for whatever reason, there are some race
conditions in these two programs if the connection is either reused or
closed very quickly.

To demonstrate this, I've run your program against my reverse-proxy,
haproxy, which I fortunately happen to know better than these other
programs. I could not manage to get even a CLOSE_WAIT session after
several attempts. All connections are closed normally, and as you'll
see with this extract from strace, the polled file-descriptors are
active once you kill the attacker :

(...)
close(593) = 0
select(684, [3 5 603 604 605 606 607 608 609 610 611 612 613 614 615 616 617
618 619 620 621 622 623 624 625 626 627 628 629 630 631 632 633 634 635 636
637 638 639 640 641 642 643 644 645 646 647 648 649 650 651 652 653 654 655
656 657 658 659 660 661 662 663 664 665 666 667 668 669 670 671 672 673 674
675 676 677 678 679 680 681 682 683], NULL, NULL, {4, 835000}) = 81 (in [603
604 605 606 607 608 609 610 611 612 613 614 615 616 617 618 619 620 621 622
623 624 625 626 627 628 629 630 631 632 633 634 635 636 637 638 639 640 641
642 643 644 645 646 647 648 649 650 651 652 653 654 655 656 657 658 659 660
661 662 663 664 665 666 667 668 669 670 671 672 673 674 675 676 677 678 679
680 681 682 683], left {4, 836000})
gettimeofday({1095011266, 506966}, NULL) = 0
recv(603, "", 4096, 0x4000) = 0
recv(604, "", 4096, 0x4000) = 0
recv(605, "", 4096, 0x4000) = 0
(...)
close(605) = 0
close(604) = 0
close(603) = 0
select(6, [3 5], NULL, NULL, NULL <unfinished ...>
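
The pattern visible in this trace (select reports the descriptors readable, recv returns 0 bytes, the descriptor is closed) can be sketched in Python as follows. This is an illustration under assumptions, not haproxy's code: serve_until_idle and idle_timeout are hypothetical names, and received data is simply discarded.

```python
import select
import socket


def serve_until_idle(listener, idle_timeout=1.0):
    """Accept connections and close each one as soon as its peer closes.

    Returns the number of connections closed, once no client is left open
    and nothing has happened for idle_timeout seconds.
    """
    clients = set()
    closed = 0
    while True:
        readable, _, _ = select.select([listener, *clients], [], [], idle_timeout)
        if not readable and not clients:
            return closed
        for sock in readable:
            if sock is listener:
                conn, _ = listener.accept()
                clients.add(conn)      # every accepted fd joins the polled set
                continue
            try:
                data = sock.recv(4096)  # payload is ignored in this sketch
            except OSError:
                data = b""              # treat a reset like a close
            if not data:                # 0 bytes: the peer has sent its FIN
                clients.discard(sock)
                sock.close()            # leaves CLOSE_WAIT and frees the fd
                closed += 1


if __name__ == "__main__":
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))
    listener.listen(16)
    peers = [socket.create_connection(listener.getsockname()) for _ in range(5)]
    for peer in peers:
        peer.close()
    print(serve_until_idle(listener, idle_timeout=0.5))
    listener.close()
```

The point of the design is that an accepted descriptor is never dropped from the polled set without being closed, which is the invariant the inetd trace above breaks.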

So I believe you'll have to dig into some programs because at least you found
a vulnerability in both inetd and mysql :-)

Regards,
Willy

2004-09-12 18:18:55

by Willy Tarreau

Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local and REMOTE(verified) Denial of Service Attack

Hi again, Dale,

I forgot to say that you don't need to fear releasing your exploit. I
developed its equivalent 4 years ago to stress-test web servers and
proxies, and if I launch it against victim:23, I get the exact same
result within seconds : a CLOSE_WAIT socket :

attacker> ./connectdata 10.0.3.2 23 200 1
ERROR: connect()=-1, nbconn=134 : Connection refused
ERROR: connect()=-1, nbconn=135 : Connection refused
ERROR: connect()=-1, nbconn=136 : Connection refused
ERROR: connect()=-1, nbconn=137 : Connection refused

The program connects 200 sockets to the same IP:port, and sends the beginning
of an HTTP request.

victim> sudo netstat -atnp|grep -v LISTEN
Active Internet connections (servers and established)
Proto Recv-Q Send-Q Local Address           Foreign Address         State       PID/Program name
tcp       17      0 10.0.3.2:23             10.0.3.1:38214          CLOSE_WAIT  1333/inetd

It's not even necessary to send data; without it, blocking my very old
inetd is even faster :

attacker> ./connectdata-nb 10.0.3.2 23 200
200 connections established.
Press any key so exit.

This time, it issues 200 non-blocking connect() calls without any data. It
takes a fraction of a second with the same result. Hopefully, it'll
help Peter and you reproduce the problem faster on mysql.

Both programs have been freely available here for two years ; I didn't think
they would be useful again !

http://w.ods.org/tools/connect/

Regards,
Willy

2004-09-12 18:19:58

by Alan

Subject: Re: Linux 2.4.27 SECURITY BUG - TCP Local and REMOTE(verified) Denial of Service Attack

On Sul, 2004-09-12 at 18:59, Willy Tarreau wrote:
> The problem is within inetd. In my case it could be because it was a bit
> old (1999), but since you have it too,

Ancient inetd had several fd leak bugs fixed over time and some other
problems with built-in services. Not much of a surprise that a 1999 inetd
has it.

Alan