Hi there!
I don't know where I should mail this, so maybe it fits on LKML.
I have a problem with one of my programs, which streams data out over UDP as
multicast.
The program creates a thread. In the thread it creates a socket and sets the
multicast options:
[code]
if ((sendfd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP)) < 0) {
    perror("socket(sendfd)");
    return -3;
}
ret = 1472; // max send buffer for udp (disable fragmentation?)
if (setsockopt(sendfd, SOL_SOCKET, SO_SNDBUF, (char *)&ret, sizeof(ret)) != 0) {
    perror("setsockopt(SO_SNDBUF)");
}
{
    struct ip_mreq imr; //XXX zero out
    imr.imr_multiaddr.s_addr = inet_addr(td->sendip);
    imr.imr_interface.s_addr = 0;
    if (setsockopt(sendfd, IPPROTO_IP, IP_ADD_MEMBERSHIP, &imr,
                   sizeof(struct ip_mreq)) < 0)
        perror("IP_ADD_MEMBERSHIP");
}
[/code]
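A side note on the multicast setup above: IP_ADD_MEMBERSHIP joins a group for receiving, and a socket that only sends does not need it. The sender-side options are IP_MULTICAST_TTL and IP_MULTICAST_LOOP. The following is only a minimal sketch of a send-only setup. The helper name open_mcast_sender is made up for illustration and is not part of the original program, and nothing in this thread shows that these options are related to the hang.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the original program): open a UDP socket
 * configured for sending multicast only. Returns the fd, or -1 on error. */
int open_mcast_sender(int loop_enabled)
{
    int fd = socket(AF_INET, SOCK_DGRAM, IPPROTO_UDP);
    if (fd < 0) {
        perror("socket");
        return -1;
    }

    /* TTL 1 keeps the datagrams on the local network segment. */
    unsigned char ttl = 1;
    if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_TTL, &ttl, sizeof(ttl)) < 0)
        perror("setsockopt(IP_MULTICAST_TTL)");

    /* Controls whether our own datagrams are looped back to receivers
     * on the same host. */
    unsigned char loop = loop_enabled ? 1 : 0;
    if (setsockopt(fd, IPPROTO_IP, IP_MULTICAST_LOOP, &loop, sizeof(loop)) < 0)
        perror("setsockopt(IP_MULTICAST_LOOP)");

    return fd;
}
```

With the receiving client on the same machine, as in this report, loopback has to stay enabled, so the call would be open_mcast_sender(1).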
Then it reads data from a file and sends it out with sendto:
[code]
// NOTE: the read() result (0 at EOF, -1 on error) is not checked before use
ret = read(readfd, sendbuf, sizeof(sendbuf));
ret = sendto(sendfd, sendbuf, ret, 0, (struct sockaddr *)&td->udp,
             sizeof(td->udp)); // send the stream out
if (ret < 0) {
    perror("sendto(sendfd)");
} else {
    sent_bytes += ret;
}
[/code]
On the other side is a client (I started it on the same machine) which reads data
from the connected socket and writes it to a file, which is a pipe.
But sometimes the sendto hangs for a few seconds or longer. The strace output
of the thread:
# strace -p 23620
Process 23620 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
sendto(8, "\0\1\0\31z\30\243\206(\341\216)#\212I#\206H\341\226)%\212"..., 1316, 0, {sa_family=AF_INET, sin_port=htons(22333), sin_addr=inet_addr("224.0.0.1")}, 16 <unfinished ...>
Process 23620 detached
and the kernel task output (sysrq + t):
streamserver S D6564C80 0 23995 23994 (NOTLB)
d1fcfbc8 00200082 00000000 d6564c80 c1574800 c0320214 00000001 dff9e200
00b3af60 df8d1c00 c3010010 000002a6 2764d1d8 00000c50 cefb09f8 d1fce000
7fffffff d1fcfc28 7fffffff c0378e02 dffef680 00000000 00000010 c04e3d50
Call Trace:
[<c0320214>] ip_local_deliver+0xdf/0x20b
[<c0378e02>] schedule_timeout+0xb1/0xb3
[<c031108e>] netif_receive_skb+0x167/0x194
[<c030c01f>] sock_wait_for_wmem+0xad/0xc5
[<c0115a27>] autoremove_wake_function+0x0/0x43
[<c0115a27>] autoremove_wake_function+0x0/0x43
[<c030c0bb>] sock_alloc_send_pskb+0x84/0x1cb
[<c030c21b>] sock_alloc_send_skb+0x19/0x21
[<c0324605>] ip_append_data+0x63d/0x6e5
[<c01870be>] do_get_write_access+0x246/0x5bc
[<c0323f23>] ip_generic_getfrag+0x0/0xa5
[<c033ff01>] udp_sendmsg+0x2d9/0x70a
[<c0181c8d>] __ext3_journal_stop+0x24/0x4a
[<c03476d2>] inet_sendmsg+0x4a/0x62
[<c03097cf>] sock_sendmsg+0x86/0xb2
[<c012dcb1>] __generic_file_aio_read+0x1cc/0x1fe
[<c012da1b>] file_read_actor+0x0/0xca
[<c025683a>] opost_block+0xc8/0x16e
[<c0227663>] copy_from_user+0x34/0x61
[<c030aa0b>] sys_sendto+0xc7/0xe2
[<c0157601>] __pollwait+0x0/0xc0
[<c0157c89>] sys_select+0x220/0x493
[<c030b1d0>] sys_socketcall+0x17e/0x249
[<c0103d63>] syscall_call+0x7/0xb
If I run "arp" while it hangs, the sendto continues...
I also think that it continues if the kernel receives a network packet.
Can this be a locking bug?
Regards,
Martin
PS: I'm sorry if this is the wrong mailing list!
--
MyExcuse:
monitor VLF leakage
Martin Zwickel <[email protected]>
Research & Development
TechnoTrend AG <http://www.technotrend.de>
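A note on the trace above: the thread is sleeping in sock_wait_for_wmem, which is where a blocking send waits for send-buffer space. SO_SNDBUF limits how much unsent data the socket may have queued. It does not control IP fragmentation. On Linux, socket(7) documents that the kernel doubles the requested value and enforces a minimum, so the effective size may differ from the 1472 requested in the code. Whether this explains the hang is not established in this thread. The following is only a sketch for reading the value back, and the helper name effective_sndbuf is made up for illustration.

```c
#include <netinet/in.h>
#include <stdio.h>
#include <sys/socket.h>

/* Hypothetical helper: request a send buffer size and return the value
 * the kernel actually applied, or -1 on error. */
int effective_sndbuf(int fd, int requested)
{
    if (setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &requested,
                   sizeof(requested)) < 0) {
        perror("setsockopt(SO_SNDBUF)");
        return -1;
    }

    /* Read the option back; the kernel may have adjusted the request. */
    int actual = 0;
    socklen_t len = sizeof(actual);
    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &actual, &len) < 0) {
        perror("getsockopt(SO_SNDBUF)");
        return -1;
    }
    return actual;
}
```

Printing effective_sndbuf(sendfd, 1472) once at startup would show how much room the socket really has for queued 1316-byte datagrams.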
If I use MSG_DONTWAIT with sendto, I get many "Resource temporarily
unavailable" errors:
sendto(sendfd): Resource temporarily unavailable
But isn't UDP supposed to not block?
Martin
On Wednesday 23 June 2004 12:56, Martin Zwickel wrote:
> if I use MSG_DONTWAIT with sendto, I get temporarily unavailable resources
> (many!):
>
> sendto(sendfd): Resource temporarily unavailable
>
> but isn't udp supposed to not block?
Think about what happens if you try to spew
UDP packets continuously:
while(1)
    sendto(...);
They will pile up in the queue, and eventually it will fill up.
Then the kernel may either drop the excess packets silently
or return EAGAIN to you.
--
vda
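The point above, that the queue fills up and the kernel then returns EAGAIN, suggests that a non-blocking sender should wait for buffer space instead of spinning. The following is a minimal sketch of that idea using poll() with POLLOUT. The helper name send_dgram_wait and its timeout parameter are assumptions for illustration, not code from the original program.

```c
#include <errno.h>
#include <poll.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: send one datagram without blocking indefinitely.
 * On EAGAIN, wait up to timeout_ms for send-buffer space, then retry once.
 * Returns bytes sent, or -1 with errno set (EAGAIN if still full). */
ssize_t send_dgram_wait(int fd, const void *buf, size_t len,
                        const struct sockaddr *to, socklen_t tolen,
                        int timeout_ms)
{
    ssize_t n = sendto(fd, buf, len, MSG_DONTWAIT, to, tolen);
    if (n >= 0 || (errno != EAGAIN && errno != EWOULDBLOCK))
        return n;               /* sent, or failed for another reason */

    struct pollfd pfd = { .fd = fd, .events = POLLOUT };
    int ready = poll(&pfd, 1, timeout_ms);
    if (ready < 0)
        return -1;              /* poll failed; errno set by poll */
    if (ready == 0) {
        errno = EAGAIN;         /* timed out, buffer still full */
        return -1;
    }
    return sendto(fd, buf, len, MSG_DONTWAIT, to, tolen);
}
```

A bounded timeout means a stall shows up as an error the program can log, rather than as a thread stuck inside sendto.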
On Wed, 23 Jun 2004 13:34:57 +0300
Denis Vlasenko <[email protected]> bubbled:
> On Wednesday 23 June 2004 12:56, Martin Zwickel wrote:
> > if I use MSG_DONTWAIT with sendto, I get temporarily unavailable resources
> > (many!):
> >
> > sendto(sendfd): Resource temporarily unavailable
> >
> > but isn't udp supposed to not block?
>
> Think about what will happen if you will try to spew
> udp packets continuously:
>
> while(1)
> sendto(...);
>
> They will pile up in queue and eventually it will fill up.
> Then kernel may either drop excess packets silently
> or return you EAGAIN.
Yes, but why does the kernel not send out the queue? (I don't know if the queue
is empty or full when my sendto stops.)
Without MSG_DONTWAIT, sendto waits endlessly. But for what?
Normally the kernel should put the queued packets on the wire and accept new
ones, or did I misunderstand this?
My program sends out many UDP packets, and sometimes it just stops until the
kernel receives a network packet or I access the local network (with the arp
command).
So if I run arp in an endless loop (while :; do arp; done), sendto runs smoothly.
To me it smells like a bug ;)
Martin
On Wednesday 23 June 2004 15:00, Martin Zwickel wrote:
> On Wed, 23 Jun 2004 13:34:57 +0300
>
> Denis Vlasenko <[email protected]> bubbled:
> > On Wednesday 23 June 2004 12:56, Martin Zwickel wrote:
> > > if I use MSG_DONTWAIT with sendto, I get temporarily unavailable
> > > resources (many!):
> > >
> > > sendto(sendfd): Resource temporarily unavailable
> > >
> > > but isn't udp supposed to not block?
> >
> > Think about what will happen if you will try to spew
> > udp packets continuously:
> >
> > while(1)
> > sendto(...);
> >
> > They will pile up in queue and eventually it will fill up.
> > Then kernel may either drop excess packets silently
> > or return you EAGAIN.
>
> Yes, but why does the kernel not send out the queue?(I don't know if the
> queue is empty or full when my sendto stops)
> Without MSG_DONTWAIT, sendto waits endlessly. But on what?
strace, gdb and/or (SysRq-T with ksymoops) will tell you.
> Normally the kernel should put the queued packets on the line and accept
> new ones, or did I misunderstand this?
Hm, yes. What does tcpdump tell you?
> My program sends out many udp packets, and sometimes it just stops until
> the kernel receives a network packet or I access the local network(with arp
> command).
arp does not access the network. I think it just prints the current ARP cache.
> So if I run arp in an endless loop(while :; do arp; done), sendto runs
> smooth.
>
> For me it smells like a bug ;)
Possible. We need more details. Also CC the network folks :)
--
vda
On Wed, 23 Jun 2004 15:36:10 +0300
Denis Vlasenko <[email protected]> bubbled:
> > Yes, but why does the kernel not send out the queue?(I don't know if the
> > queue is empty or full when my sendto stops)
> > Without MSG_DONTWAIT, sendto waits endlessly. But on what?
>
> strace, gdb and/or (SysRq-T with ksymoops) will tell you.
As I wrote in my first mail:
# strace -p 23620
Process 23620 attached - interrupt to quit
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
--- SIGSTOP (Stopped (signal)) @ 0 (0) ---
sendto(8, "\0\1\0\31z\30\243\206(\341\216)#\212I#\206H\341\226)%\212"..., 1316, 0, {sa_family=AF_INET, sin_port=htons(22333), sin_addr=inet_addr("224.0.0.1")}, 16 <unfinished ...>
Process 23620 detached
and the kernel task output (sysrq + t):
streamserver S D6564C80 0 23995 23994 (NOTLB)
d1fcfbc8 00200082 00000000 d6564c80 c1574800 c0320214 00000001 dff9e200
00b3af60 df8d1c00 c3010010 000002a6 2764d1d8 00000c50 cefb09f8 d1fce000
7fffffff d1fcfc28 7fffffff c0378e02 dffef680 00000000 00000010 c04e3d50
Call Trace:
[<c0320214>] ip_local_deliver+0xdf/0x20b
[<c0378e02>] schedule_timeout+0xb1/0xb3
[<c031108e>] netif_receive_skb+0x167/0x194
[<c030c01f>] sock_wait_for_wmem+0xad/0xc5
[<c0115a27>] autoremove_wake_function+0x0/0x43
[<c0115a27>] autoremove_wake_function+0x0/0x43
[<c030c0bb>] sock_alloc_send_pskb+0x84/0x1cb
[<c030c21b>] sock_alloc_send_skb+0x19/0x21
[<c0324605>] ip_append_data+0x63d/0x6e5
[<c01870be>] do_get_write_access+0x246/0x5bc
[<c0323f23>] ip_generic_getfrag+0x0/0xa5
[<c033ff01>] udp_sendmsg+0x2d9/0x70a
[<c0181c8d>] __ext3_journal_stop+0x24/0x4a
[<c03476d2>] inet_sendmsg+0x4a/0x62
[<c03097cf>] sock_sendmsg+0x86/0xb2
[<c012dcb1>] __generic_file_aio_read+0x1cc/0x1fe
[<c012da1b>] file_read_actor+0x0/0xca
[<c025683a>] opost_block+0xc8/0x16e
[<c0227663>] copy_from_user+0x34/0x61
[<c030aa0b>] sys_sendto+0xc7/0xe2
[<c0157601>] __pollwait+0x0/0xc0
[<c0157c89>] sys_select+0x220/0x493
[<c030b1d0>] sys_socketcall+0x17e/0x249
[<c0103d63>] syscall_call+0x7/0xb
>
> > Normally the kernel should put the queued packets on the line and accept
> > new ones, or did I misunderstand this?
>
> Hm, yes. What does tcpdump tell you?
15:33:14.571437 IP phoebee.32797 > ALL-SYSTEMS.MCAST.NET.22333: UDP, length: 1316
15:33:14.595429 IP phoebee.32797 > ALL-SYSTEMS.MCAST.NET.22333: UDP, length: 1316
15:33:14.596445 IP phoebee.32797 > ALL-SYSTEMS.MCAST.NET.22333: UDP, length: 1316
15:33:14.597445 IP phoebee.32797 > ALL-SYSTEMS.MCAST.NET.22333: UDP, length: 1316
>
> > My program sends out many udp packets, and sometimes it just stops until
> > the kernel receives a network packet or I access the local network(with arp
> > command).
>
> arp does not access network. I think it just prints current arp cache.
Well, but it resolves the IPs to hostnames. arp contacts my nameserver, so
it does access the network :)
socket(PF_INET, SOCK_DGRAM, IPPROTO_IP) = 5
connect(5, {sa_family=AF_INET, sin_port=htons(53), sin_addr=inet_addr("192.168.0.2")}, 28) = 0
>
> > So if I run arp in an endless loop(while :; do arp; done), sendto runs
> > smooth.
> >
> > For me it smells like a bug ;)
>
> Possible. We need more details. Also CC network folks :)
Hopefully not! But then I need a bugfix for my program...
Where can I find the address of the network folks?