Most SIP devices use a source port of 5060/udp on SIP requests, so the
response automatically comes back to port 5060:
phone_ip:5060 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:5060 100 Trying
The newer Cisco IP phones, however, use a randomly chosen high source
port for the SIP request but expect the response on port 5060:
phone_ip:49173 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:5060 100 Trying
Standard Linux NAT, with or without nf_nat_sip, will send the reply back
to port 49173, not 5060:
phone_ip:49173 -> proxy_ip:5060 REGISTER
proxy_ip:5060 -> phone_ip:49173 100 Trying
But the phone is not listening on 49173, so it will never see the reply.
This issue was seen on a Cisco CP-7965G, firmware 8-5(3). It appears
to be a well-known problem on 7941 and newer:
http://www.voip-info.org/wiki/view/Standalone+Cisco+7941%252F7961+without+a+local+PBX
Search for "Connecting to the outside world"
I contacted Cisco support and they were not amenable to changing the
behavior. It appears to be RFC3261-compliant, as the "Sent-by port"
field in the request specifies 5060:
18.2.2 Sending Responses
The server transport uses the value of the top Via header field in
order to determine where to send a response. It MUST follow the
following process:
...
o Otherwise (for unreliable unicast transports), if the top Via
has a "received" parameter, the response MUST be sent to the
address in the "received" parameter, using the port indicated
in the "sent-by" value, or using port 5060 if none is specified
explicitly. If this fails, for example, elicits an ICMP "port
unreachable" response, the procedures of Section 5 of [4]
SHOULD be used to determine where to send the response.
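As a rough userspace illustration of the rule quoted above (not kernel code, and not part of this patch), the response port can be derived from the sent-by value of the top Via header, falling back to 5060 when no port is given. The helper name via_response_port() is hypothetical, and treating a malformed port as "not specified" is an assumption of this sketch:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define SIP_DEFAULT_PORT 5060

/*
 * Hypothetical helper: given a sent-by value such as "192.168.2.28:5060"
 * or "192.168.2.28", return the UDP port a response should be sent to.
 * Bracketed IPv6 literals are not handled in this sketch.
 */
static unsigned int via_response_port(const char *sent_by)
{
	const char *colon;
	char *end;
	unsigned long port;

	assert(sent_by != NULL);

	colon = strchr(sent_by, ':');
	if (colon == NULL)
		return SIP_DEFAULT_PORT;	/* no explicit port */

	/* strtoul() stops at the first non-digit, e.g. ";branch=..." */
	port = strtoul(colon + 1, &end, 10);
	if (end == colon + 1 || port == 0 || port > 65535)
		return SIP_DEFAULT_PORT;	/* assumption: treat as unspecified */

	return (unsigned int)port;
}
```

For the Cisco phone described above, the sent-by value carries port 5060 even though the UDP source port is a high port, which is why the phone expects the reply on 5060.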
This patch modifies nf_*_sip to work around this quirk, by rewriting
the response port to 5060 when the following conditions are met:
- User-Agent starts with "Cisco"
- Incoming TTL was exactly 64 (meaning that our system is the phone's
local router, not an intermediate router)
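The two conditions can be sketched as a standalone predicate. This is a userspace approximation for illustration only: the function name is hypothetical, incoming_ttl is assumed to be the TTL as it arrived from the phone, and strncasecmp() stands in for the kernel's case-insensitive compare:

```c
#include <assert.h>
#include <stddef.h>
#include <strings.h>	/* strncasecmp() (POSIX) */

/*
 * Hypothetical predicate: returns 1 if the request looks like it came
 * directly from a Cisco phone on the local segment, 0 otherwise.
 * user_agent need not be NUL-terminated; ua_len is its length.
 */
static int looks_like_local_cisco_phone(unsigned int incoming_ttl,
					const char *user_agent,
					size_t ua_len)
{
	static const char cisco[] = "Cisco";
	const size_t n = sizeof(cisco) - 1;

	assert(user_agent != NULL);

	if (incoming_ttl != 64)
		return 0;	/* passed through another router first */
	if (ua_len < n)
		return 0;	/* too short to start with "Cisco" */

	return strncasecmp(user_agent, cisco, n) == 0;
}
```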
Tested on Linus' latest 2.6.37-rc tree.
Signed-off-by: Kevin Cernekee <[email protected]>
---
include/linux/netfilter/nf_conntrack_sip.h | 2 ++
net/ipv4/netfilter/nf_nat_sip.c | 12 ++++++++++++
net/netfilter/nf_conntrack_sip.c | 25 +++++++++++++++++++++++++
3 files changed, 39 insertions(+), 0 deletions(-)
diff --git a/include/linux/netfilter/nf_conntrack_sip.h b/include/linux/netfilter/nf_conntrack_sip.h
index 0ce91d5..a6ea620 100644
--- a/include/linux/netfilter/nf_conntrack_sip.h
+++ b/include/linux/netfilter/nf_conntrack_sip.h
@@ -8,6 +8,7 @@
struct nf_ct_sip_master {
unsigned int register_cseq;
unsigned int invite_cseq;
+ unsigned int cisco_port_mangle;
};
enum sip_expectation_classes {
@@ -90,6 +91,7 @@ enum sip_header_types {
SIP_HDR_EXPIRES,
SIP_HDR_CONTENT_LENGTH,
SIP_HDR_CALL_ID,
+ SIP_HDR_USER_AGENT,
};
enum sdp_header_types {
diff --git a/net/ipv4/netfilter/nf_nat_sip.c b/net/ipv4/netfilter/nf_nat_sip.c
index e40cf78..4b9a46d 100644
--- a/net/ipv4/netfilter/nf_nat_sip.c
+++ b/net/ipv4/netfilter/nf_nat_sip.c
@@ -121,6 +121,7 @@ static unsigned int ip_nat_sip(struct sk_buff *skb, unsigned int dataoff,
enum ip_conntrack_info ctinfo;
struct nf_conn *ct = nf_ct_get(skb, &ctinfo);
enum ip_conntrack_dir dir = CTINFO2DIR(ctinfo);
+ struct nf_conn_help *help = nfct_help(ct);
unsigned int coff, matchoff, matchlen;
enum sip_header_types hdr;
union nf_inet_addr addr;
@@ -225,6 +226,17 @@ next:
return NF_DROP;
}
+ /* Mangle destination port for Cisco phones, then fix up checksums */
+ if (help->help.ct_sip_info.cisco_port_mangle) {
+ struct udphdr *uh;
+
+ uh = (struct udphdr *)(skb->data + ip_hdrlen(skb));
+ uh->dest = htons(SIP_PORT);
+
+ if (!nf_nat_mangle_udp_packet(skb, ct, ctinfo, 0, 0, NULL, 0))
+ return NF_DROP;
+ }
+
if (!map_sip_addr(skb, dataoff, dptr, datalen, SIP_HDR_FROM) ||
!map_sip_addr(skb, dataoff, dptr, datalen, SIP_HDR_TO))
return NF_DROP;
diff --git a/net/netfilter/nf_conntrack_sip.c b/net/netfilter/nf_conntrack_sip.c
index bcf47eb..6042f66 100644
--- a/net/netfilter/nf_conntrack_sip.c
+++ b/net/netfilter/nf_conntrack_sip.c
@@ -18,6 +18,7 @@
#include <linux/udp.h>
#include <linux/tcp.h>
#include <linux/netfilter.h>
+#include <linux/ip.h>
#include <net/netfilter/nf_conntrack.h>
#include <net/netfilter/nf_conntrack_core.h>
@@ -338,6 +339,7 @@ static const struct sip_header ct_sip_hdrs[] = {
[SIP_HDR_EXPIRES] = SIP_HDR("Expires", NULL, NULL, digits_len),
[SIP_HDR_CONTENT_LENGTH] = SIP_HDR("Content-Length", "l", NULL, digits_len),
[SIP_HDR_CALL_ID] = SIP_HDR("Call-Id", "i", NULL, callid_len),
+ [SIP_HDR_USER_AGENT] = SIP_HDR("User-Agent", NULL, NULL, string_len),
};
static const char *sip_follow_continuation(const char *dptr, const char *limit)
@@ -1366,6 +1368,29 @@ static int process_sip_request(struct sk_buff *skb, unsigned int dataoff,
unsigned int matchoff, matchlen;
unsigned int cseq, i;
+ /* Many Cisco IP phones use a high source port for SIP requests, but
+ * listen for the response on port 5060. If we are the local
+ * router for one of these phones, flag the connection here so that
+ * responses will be redirected to the correct port.
+ */
+ do {
+ static const char cisco[] = "Cisco";
+ struct iphdr *iph = ip_hdr(skb);
+ struct nf_conn_help *help = nfct_help(ct);
+
+ if (iph->ttl != 63)
+ break;
+ if (ct_sip_get_header(ct, *dptr, 0, *datalen,
+ SIP_HDR_USER_AGENT, &matchoff, &matchlen) <= 0)
+ break;
+ if (matchlen < strlen(cisco))
+ break;
+ if (strnicmp(*dptr + matchoff, cisco, strlen(cisco)) != 0)
+ break;
+
+ help->help.ct_sip_info.cisco_port_mangle = 1;
+ } while (0);
+
for (i = 0; i < ARRAY_SIZE(sip_handlers); i++) {
const struct sip_handler *handler;
--
1.7.0.4
On Sunday, 14 November 2010 at 00:32 -0800, Kevin Cernekee wrote:
> Most SIP devices use a source port of 5060/udp on SIP requests, so the
> response automatically comes back to port 5060:
>
> phone_ip:5060 -> proxy_ip:5060 REGISTER
> proxy_ip:5060 -> phone_ip:5060 100 Trying
>
> The newer Cisco IP phones, however, use a randomly chosen high source
> port for the SIP request but expect the response on port 5060:
>
> phone_ip:49173 -> proxy_ip:5060 REGISTER
> proxy_ip:5060 -> phone_ip:5060 100 Trying
>
> Standard Linux NAT, with or without nf_nat_sip, will send the reply back
> to port 49173, not 5060:
>
> phone_ip:49173 -> proxy_ip:5060 REGISTER
> proxy_ip:5060 -> phone_ip:49173 100 Trying
>
> But the phone is not listening on 49173, so it will never see the reply.
>
> This issue was seen on a Cisco CP-7965G, firmware 8-5(3). It appears
> to be a well-known problem on 7941 and newer:
>
> http://www.voip-info.org/wiki/view/Standalone+Cisco+7941%252F7961+without+a+local+PBX
>
> Search for "Connecting to the outside world"
>
> I contacted Cisco support and they were not amenable to changing the
> behavior. It appears to be RFC3261-compliant, as the "Sent-by port"
> field in the request specifies 5060:
>
There is a difference between being RFC compliant and being usable.
Most SIP software I know of will break with such stupid Cisco behavior.
> 18.2.2 Sending Responses
>
> The server transport uses the value of the top Via header field in
> order to determine where to send a response. It MUST follow the
> following process:
>
> ...
>
> o Otherwise (for unreliable unicast transports), if the top Via
> has a "received" parameter, the response MUST be sent to the
> address in the "received" parameter, using the port indicated
> in the "sent-by" value, or using port 5060 if none is specified
> explicitly. If this fails, for example, elicits an ICMP "port
> unreachable" response, the procedures of Section 5 of [4]
> SHOULD be used to determine where to send the response.
>
> This patch modifies nf_*_sip to work around this quirk, by rewriting
> the response port to 5060 when the following conditions are met:
>
> - User-Agent starts with "Cisco"
>
> - Incoming TTL was exactly 64 (meaning that our system is the phone's
> local router, not an intermediate router)
>
This seems like a hack to me, sorry. How many other vendors will switch
to the broken "Cisco" way, forcing us to patch over and over?
I would like to see an exact SIP exchange to make sure there is no
other way to handle this without adding a "Cisco" string somewhere...
Please provide a pcap or tcpdump -A output.
Thanks
On Sun, Nov 14, 2010 at 12:59 AM, Eric Dumazet <[email protected]> wrote:
> I would like to get an exact SIP exchange to make sure their is not
> another way to handle this without adding a "Cisco" string somewhere...
>
> Please provide a pcap or tcpdump -A
Existing nf_nat_sip: phone sends unauthenticated REGISTER requests
over and over again, because it is not seeing the replies sent back to
port 50070:
10:05:53.496479 IP 192.168.2.28.50070 > 67.215.241.250.5060: SIP, length: 723
E`...[[email protected] sip:losangeles.voip.ms SIP/2.0
Via: SIP/2.0/
10:05:53.587370 IP 67.215.241.250.5060 > 192.168.2.28.50070: SIP, length: 486
E.......3..fC...............SIP/2.0 100 Trying
Via: SIP/2.0/UDP 192.168.2.28:5060
10:05:53.587807 IP 67.215.241.250.5060 > 192.168.2.28.50070: SIP, length: 550
E..B....3..%C...............SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 192.168.2.2
10:05:57.496541 IP 192.168.2.28.50070 > 67.215.241.250.5060: SIP, length: 723
E`...\[email protected] sip:losangeles.voip.ms SIP/2.0
Via: SIP/2.0/
10:05:57.526716 IP 67.215.241.250.5060 > 192.168.2.28.50070: SIP, length: 486
E.......3..dC...............SIP/2.0 100 Trying
Via: SIP/2.0/UDP 192.168.2.28:5060
10:05:57.527162 IP 67.215.241.250.5060 > 192.168.2.28.50070: SIP, length: 550
E..B....3..#C...............SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 192.168.2.2
10:06:01.486821 IP 192.168.2.28.50070 > 67.215.241.250.5060: SIP, length: 723
E`...][email protected] sip:losangeles.voip.ms SIP/2.0
Via: SIP/2.0/
10:06:01.515611 IP 67.215.241.250.5060 > 192.168.2.28.50070: SIP, length: 486
E.......3..bC...............SIP/2.0 100 Trying
Via: SIP/2.0/UDP 192.168.2.28:5060
10:06:01.516024 IP 67.215.241.250.5060 > 192.168.2.28.50070: SIP, length: 550
E..B....3..!C...............SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 192.168.2.2
... continues forever ...
Patched nf_nat_sip: router sends the replies back to port 5060, so the
phone is now able to register itself and make calls:
10:09:46.221631 IP 192.168.2.28.50618 > 67.215.241.250.5060: SIP, length: 723
E`[email protected] sip:losangeles.voip.ms SIP/2.0
Via: SIP/2.0/
10:09:46.253052 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 491
E....+..4..$C...............SIP/2.0 100 Trying
Via: SIP/2.0/UDP 192.168.2.28:5060
10:09:46.253472 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 550
E..B.,..4...C...............SIP/2.0 401 Unauthorized
Via: SIP/2.0/UDP 192.168.2.2
10:09:46.261602 IP 192.168.2.28.50618 > 67.215.241.250.5060: SIP, length: 900
E`[email protected] sip:losangeles.voip.ms SIP/2.0
Via: SIP/2.0/
10:09:46.290211 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 491
E....-..4.."C...............SIP/2.0 100 Trying
Via: SIP/2.0/UDP 192.168.2.28:5060
10:09:46.295041 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 579
E.._....4...C............K..SIP/2.0 200 OK
Via: SIP/2.0/UDP 192.168.2.28:5060;bra
BTW, I thought of two possible issues with the original patch:
1) Might need to call skb_make_writable() prior to modifying the
packet. Presumably the second invocation inside
nf_nat_mangle_udp_packet() will have no effect.
(Is there a cleaner way to mangle just the port number? Most of the
utility functions only help with modifying the data area.)
2) I should probably be checking to make sure request == 0 before
mangling the packet. The current behavior is harmless if the SIP
proxy is on port 5060, but that might not always be the case.
I can roll these, along with any other suggestions, into v2.
On Sunday, 14 November 2010 at 10:33 -0800, Kevin Cernekee wrote:
> On Sun, Nov 14, 2010 at 12:59 AM, Eric Dumazet <[email protected]> wrote:
> > I would like to get an exact SIP exchange to make sure their is not
> > another way to handle this without adding a "Cisco" string somewhere...
> >
> > Please provide a pcap or tcpdump -A
>
> Existing nf_nat_sip: phone sends unauthenticated REGISTER requests
> over and over again, because it is not seeing the replies sent back to
> port 50070:
>
> 10:05:53.496479 IP 192.168.2.28.50070 > 67.215.241.250.5060: SIP, length: 723
> E`...[[email protected] sip:losangeles.voip.ms SIP/2.0
> Via: SIP/2.0/
>
Hmm, partial tcpdump... you should use "tcpdump -s 1000 -A".
The capture is missing the line
Via: SIP/2.0/UDP 192.168.2.28:5060;branch=xxxxxxxx
Maybe a fix would be to use this "5060" port from the Via header,
instead of hardcoding it like you did?
>
> Patched nf_nat_sip: router sends the replies back to port 5060, so the
> phone is now able to register itself and make calls:
>
> 10:09:46.221631 IP 192.168.2.28.50618 > 67.215.241.250.5060: SIP, length: 723
> E`[email protected] sip:losangeles.voip.ms SIP/2.0
> Via: SIP/2.0/
>
> 10:09:46.253052 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 491
> E....+..4..$C...............SIP/2.0 100 Trying
> Via: SIP/2.0/UDP 192.168.2.28:5060
>
> 10:09:46.253472 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 550
> E..B.,..4...C...............SIP/2.0 401 Unauthorized
> Via: SIP/2.0/UDP 192.168.2.2
>
> 10:09:46.261602 IP 192.168.2.28.50618 > 67.215.241.250.5060: SIP, length: 900
> E`[email protected] sip:losangeles.voip.ms SIP/2.0
> Via: SIP/2.0/
>
> 10:09:46.290211 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 491
> E....-..4.."C...............SIP/2.0 100 Trying
> Via: SIP/2.0/UDP 192.168.2.28:5060
>
> 10:09:46.295041 IP 67.215.241.250.5060 > 192.168.2.28.5060: SIP, length: 579
> E.._....4...C............K..SIP/2.0 200 OK
> Via: SIP/2.0/UDP 192.168.2.28:5060;bra
>
>
> BTW, I thought of two possible issues with the original patch:
>
> 1) Might need to call skb_make_writable() prior to modifying the
> packet. Presumably the second invocation inside
> nf_nat_mangle_udp_packet() will have no effect.
>
> (Is there a cleaner way to mangle just the port number? Most of the
> utility functions only help with modifying the data area.)
>
> 2) I should probably be checking to make sure request == 0 before
> mangling the packet. The current behavior is harmless if the SIP
> proxy is on port 5060, but that might not always be the case.
>
> I can roll these, along with any other suggestions, into v2.
On Sun, Nov 14, 2010 at 11:57 AM, Eric Dumazet <[email protected]> wrote:
> Via: SIP/2.0/UDP 192.168.2.28:5060;branch=xxxxxxxx
>
>
> Maybe a fix would be to use this "5060" port, instead of hardcoding it
> like you did ?
Just posted v2... appreciate the advice so far.
My new code in process_sip_request() looks for an address match + port
mismatch between the IP source and the Via: header. This is how it
tries to detect whether we are talking directly to an afflicted Cisco
phone. If the address doesn't match, I assume the request is passing
through a non-SIP-aware NAT router so there is no special handling.
Assuming we can reliably detect the "quirky phone" condition, is there
any way to just trick Netfilter into thinking the source port was 5060
instead of 49xxx? 3/4ths of the patch could probably be eliminated if
we could overwrite the port number inside tuplehash.
On 14.11.2010 20:57, Eric Dumazet wrote:
> On Sunday, 14 November 2010 at 10:33 -0800, Kevin Cernekee wrote:
>> On Sun, Nov 14, 2010 at 12:59 AM, Eric Dumazet <[email protected]> wrote:
>>> I would like to get an exact SIP exchange to make sure their is not
>>> another way to handle this without adding a "Cisco" string somewhere...
>>>
>>> Please provide a pcap or tcpdump -A
>>
>> Existing nf_nat_sip: phone sends unauthenticated REGISTER requests
>> over and over again, because it is not seeing the replies sent back to
>> port 50070:
>>
>> 10:05:53.496479 IP 192.168.2.28.50070 > 67.215.241.250.5060: SIP, length: 723
>> E`...[[email protected] sip:losangeles.voip.ms SIP/2.0
>> Via: SIP/2.0/
>>
>
> Hmm, partial tcpdump... you should use "tcpdump -s 1000 -A"
>
> We miss the
>
> Via: SIP/2.0/UDP 192.168.2.28:5060;branch=xxxxxxxx
>
>
> Maybe a fix would be to use this "5060" port, instead of hardcoding it
> like you did ?
I agree, using the Via header to route the response makes more sense.
On 15.11.2010 04:01, Kevin Cernekee wrote:
> On Sun, Nov 14, 2010 at 11:57 AM, Eric Dumazet <[email protected]> wrote:
>> Via: SIP/2.0/UDP 192.168.2.28:5060;branch=xxxxxxxx
>>
>>
>> Maybe a fix would be to use this "5060" port, instead of hardcoding it
>> like you did ?
>
> Just posted v2... appreciate the advice so far.
>
> My new code in process_sip_request() looks for an address match + port
> mismatch between the IP source and the Via: header. This is how it
> tries to detect whether we are talking directly to an afflicted Cisco
> phone. If the address doesn't match, I assume the request is passing
> through a non-SIP-aware NAT router so there is no special handling.
>
> Assuming we can reliably detect the "quirky phone" condition, is there
> any way to just trick Netfilter into thinking the source port was 5060
> instead of 49xxx? 3/4ths of the patch could probably be eliminated if
> we could overwrite the port number inside tuplehash.
The problem in doing this is that further packets from port 49xxx
wouldn't be recognized as belonging to the same connection. If another
packet was sent to the same destination conntrack would treat it as
a new connection, rewrite the source port number, notice the clash and
drop the packet.
The same problem exists with your current patch: packets from port
5060 to the same destination won't be recognized as belonging to the
connection that sent the REGISTER, and thus won't be able to modify the
timeout or unregister.
Basically we would need three-legged connections to handle this
situation correctly. I've actually done some work to move one of
the conntrack tuples to a ct_extend, since in most situations
(all except IPv4 NAT and ICMP packets) the tuples are symmetrical
and the second one can easily be derived. I never managed to finish
it, though, and I no longer remember what the problem was. I'll see
if I can still find those patches. With this we could simply
attach a third tuple to a connection.
On Mon, Nov 15, 2010 at 2:15 AM, Patrick McHardy <[email protected]> wrote:
> The problem in doing this is that further packets from port 49xxx
> wouldn't be recognized as belonging to the same connection.
OK, makes sense.
> The same problem exists with your current patch, packets from port
> 5060 to the same destination won't be recognized as belonging to the
> connection that sent the REGISTER and thus won't be able to modify the
> timeout or unregister.
>
> Basically we would need three-legged connections to handle this
> situation correctly.
Just to clarify: the actual source port on a given device will be
EITHER a high-numbered port (Cisco) or 5060 (others). I have not come
across any devices that send from a "mix" of source ports, e.g. 49xxx
for REGISTER and then 5060 for INVITE.
From what I have seen, subsequent SIP requests from the Cisco phone
are indeed getting associated with the original connection. My phone
is logging into two different SIP accounts, and each account seems to
use its own unique UDP source port for all control traffic (both
expecting replies on 5060).
If Netfilter adds support for three-legged connections, will the third
leg show up in the tuplehash so I don't have to track it in the "help"
structure?
On 15.11.2010 17:46, Kevin Cernekee wrote:
> On Mon, Nov 15, 2010 at 2:15 AM, Patrick McHardy <[email protected]> wrote:
>> The problem in doing this is that further packets from port 49xxx
>> wouldn't be recognized as belonging to the same connection.
>
> OK, makes sense.
>
>> The same problem exists with your current patch, packets from port
>> 5060 to the same destination won't be recognized as belonging to the
>> connection that sent the REGISTER and thus won't be able to modify the
>> timeout or unregister.
>>
>> Basically we would need three-legged connections to handle this
>> situation correctly.
>
> Just to clarify: the actual source port on a given device will be
> EITHER a high-numbered port (Cisco) or 5060 (others). I have not come
> across any devices that send from a "mix" of source ports, e.g. 49xxx
> for REGISTER and then 5060 for INVITE.
>
> From what I have seen, subsequent SIP requests from the Cisco phone
> are indeed getting associated with the original connection. My phone
> is logging into two different SIP accounts, and each account seems to
> use its own unique UDP source port for all control traffic (both
> expecting replies on 5060).
Could you provide a binary tcpdump (-w file -s0) of registration
and a subsequent call please?
> If Netfilter adds support for three-legged connections, will the third
> leg show up in the tuplehash so I don't have to track it in the "help"
> structure?
Yes, basically by default all connections would only have a single
tuplehash. The lookup would look up the tuple based on the packet
and, if not found, reverse it and retry the lookup. When the tuples are
asymmetric (NAT and ICMP/ICMPv6), a second one would be stored in the
ct_extend area and added to the hash table as usual. For the
SIP case, we could simply add a third one in a ct_extend area.
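To make the lookup idea concrete, here is a toy userspace sketch, not the conntrack implementation: a linear table stands in for the hash, each connection stores only its original-direction tuple, and all struct and function names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct tuple {
	uint32_t src, dst;
	uint16_t sport, dport;
};

/* Toy connection: only the original-direction tuple is stored. */
struct conn {
	struct tuple orig;
	int id;
};

static int tuple_equal(const struct tuple *a, const struct tuple *b)
{
	return a->src == b->src && a->dst == b->dst &&
	       a->sport == b->sport && a->dport == b->dport;
}

static struct tuple tuple_reverse(const struct tuple *t)
{
	struct tuple r = { t->dst, t->src, t->dport, t->sport };
	return r;
}

/* Look up the packet's tuple; if not found, reverse it and retry. */
static const struct conn *conn_lookup(const struct conn *table, size_t n,
				      const struct tuple *pkt)
{
	struct tuple rev;
	size_t i;

	assert(table != NULL && pkt != NULL);

	for (i = 0; i < n; i++)
		if (tuple_equal(&table[i].orig, pkt))
			return &table[i];

	rev = tuple_reverse(pkt);
	for (i = 0; i < n; i++)
		if (tuple_equal(&table[i].orig, &rev))
			return &table[i];

	return NULL;
}
```

In this model a reply addressed to the phone's port 5060 matches neither the stored tuple nor its reverse, which is the gap a third tuple would fill.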
Unfortunately I wasn't able to find my old patch so far.
On Mon, Nov 15, 2010 at 8:58 AM, Patrick McHardy <[email protected]> wrote:
> Could you provide a binary tcpdump (-w file -s0) of registration
> and a subsequent call please?
These were captured running my "v2" patch, since the phone cannot
register without it:
http://lkml.org/lkml/2010/11/14/181
goodreg.pcap - good registration
goodcall.pcap - call initiation + termination
This is all that happens in the absence of the "Cisco patch":
badreg.pcap - bad registration
Phone's LAN IP is 192.168.0.28
Public IP (as seen by the SIP proxy) is 111.222.33.222
SIP proxy is sip.iptel.org = 213.192.59.75