2007-09-11 07:45:39

by Alvin Valera

Subject: Socket owner problem?

I am currently writing a kernel module that applies a delay to
incoming packets. The module is implemented as a netfilter hook on
NF_IP_LOCAL_IN. Once the module receives a packet of interest
from the lower layer, it queues the packet (in its own queue) and
associates a kernel timer with it. Once the kernel timer expires, the
packet is propagated up to the higher layer.

The problem happens like this:
Once the socket is closed by the user-space application, there are
still packets left in the module's queue. Now, the moment the kernel
timer expires and the module propagates those packets up into the
higher layer, the system hangs.

I've been searching for ways to determine if the associated socket is
closed. This way, if my module knows that the user-space already
closed the socket, it will not propagate the packet up. Does anyone
have a solution for this problem?

Thanks!


2007-09-11 08:01:33

by David Schwartz

Subject: RE: Socket owner problem?


> The problem happens like this:
> Once the socket is closed by the user-space application, there are
> still packets left in the module's queue. Now, the moment the kernel
> timer expires and the module propagates those packets up into the
> higher layer, the system hangs.

If that were true, anyone who could send those packets to your machine would
be able to cause the system to hang too. Perhaps you are feeding the packets
back in at too high a layer.

> I've been searching for ways to determine if the associated socket is
> closed. This way, if my module knows that the user-space already
> closed the socket, it will not propagate the packet up. Does anyone
> have a solution for this problem?

What object is this queue logically associated with? If the socket, you
should probably hook 'release' so you can purge the queue when the socket is
removed.
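
As a rough illustration of the "purge on release" idea, here is a
minimal user-space sketch in C. It is only a model: the queue layout,
the integer `owner` key standing in for a socket, and the name
pq_purge_owner() are all hypothetical, and how a real module would
learn that a socket was released is not shown.

```c
#include <assert.h>
#include <stddef.h>

#define PQ_MAX 32

/* Hypothetical queue entry: a packet id plus a key naming its owner.
 * In a real module the owner would be some reference to the socket. */
struct pq_entry {
    int pkt_id;
    int owner;
};

struct pq {
    struct pq_entry slot[PQ_MAX];
    size_t len;
};

/* Drop every queued packet that belongs to `owner`, keeping the rest
 * in their original order. Returns the number of packets dropped. */
size_t pq_purge_owner(struct pq *q, int owner)
{
    size_t kept = 0, dropped = 0;

    assert(q != NULL && q->len <= PQ_MAX);
    for (size_t i = 0; i < q->len; i++) {
        if (q->slot[i].owner == owner)
            dropped++;
        else
            q->slot[kept++] = q->slot[i];
    }
    q->len = kept;
    return dropped;
}
```

Calling pq_purge_owner() from whatever release path is hooked would
leave no packet behind for a timer to deliver later. A real module
would also have to cancel the per-packet timers and free the buffers,
which this model leaves out.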

DS


2007-09-11 08:38:19

by Alvin Valera

Subject: Re: Socket owner problem?

Hi David,

Thanks for your quick reply.

> If that were true, anyone who could send those packets to your machine would
> be able to cause the system to hang too.

You're right to say that :)

> Perhaps you are feeding the packets
> back in at too high a layer.

Not really. In fact, I pass the packet back to the "lower layer" again
by calling netif_receive_skb(). Note that packets can go in a loop
here. To avoid queuing the same packets repeatedly, the module "marks"
them the first time they are queued. Marked packets are simply
NF_ACCEPT'ed by the module hook and therefore are propagated up the
netfilter chain.
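
The marking scheme can be modelled in user space roughly as follows.
This is a sketch under assumptions, not the module's code: struct pkt,
the verdict enum, delay_hook() and delay_reinject_oldest() are
hypothetical stand-ins for struct sk_buff, the netfilter verdicts, the
hook function and the timer-driven call to netif_receive_skb(). It
also assumes the hook takes ownership of a packet it queues.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for netfilter verdicts. */
enum verdict { VERDICT_ACCEPT, VERDICT_STOLEN };

/* Hypothetical packet; a real module would work on struct sk_buff. */
struct pkt {
    bool delayed_once;  /* the "mark": set the first time we queue it */
    bool of_interest;   /* whether the module wants to delay it */
};

#define QUEUE_MAX 16
static struct pkt *delay_queue[QUEUE_MAX];
static size_t delay_queue_len;

/* Models the hook: unmarked packets of interest are marked and queued,
 * everything else (including marked packets) is accepted. */
enum verdict delay_hook(struct pkt *p)
{
    assert(p != NULL);
    if (!p->of_interest || p->delayed_once)
        return VERDICT_ACCEPT;
    if (delay_queue_len == QUEUE_MAX)
        return VERDICT_ACCEPT;  /* queue full: pass through undelayed */
    p->delayed_once = true;
    delay_queue[delay_queue_len++] = p;
    return VERDICT_STOLEN;
}

/* Models timer expiry: dequeue the oldest packet and run it through
 * the hook again, as re-injection at the lower layer would. */
enum verdict delay_reinject_oldest(void)
{
    assert(delay_queue_len > 0);
    struct pkt *p = delay_queue[0];
    for (size_t i = 1; i < delay_queue_len; i++)
        delay_queue[i - 1] = delay_queue[i];
    delay_queue_len--;
    return delay_hook(p);
}
```

On the first pass a packet of interest is stolen and queued; on the
second pass the mark makes the hook accept it, so it is never queued
twice.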

> What object is this queue logically associated with? If the socket, you
> should probably hook 'release' so you can purge the queue when the socket is
> removed.

The queue is not associated with the socket. It is independent and is
meant just for the module to use for queuing packets that are supposed
to be delayed. But for each packet in this queue, there is an
associated kernel timer. Once this timer expires, the associated
packet is fed into the netif_receive_skb().
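
The queue-plus-timer behaviour can be sketched in user space like
this. It is a simplified model with hypothetical names (struct dq,
dq_add(), dq_expire()): a plain tick counter replaces kernel timers,
packet ids replace buffers, and counter wraparound is ignored.

```c
#include <assert.h>
#include <stddef.h>

#define DQ_MAX 32

/* Hypothetical queue entry: a packet id plus the tick at which its
 * (modelled) timer expires. */
struct dq_entry {
    int pkt_id;
    unsigned long expires;
};

struct dq {
    struct dq_entry slot[DQ_MAX];
    size_t len;
};

/* Queue a packet to be released `delay` ticks after `now`.
 * Returns 0 on success, -1 if the queue is full. */
int dq_add(struct dq *q, int pkt_id, unsigned long now, unsigned long delay)
{
    assert(q != NULL);
    if (q->len == DQ_MAX)
        return -1;
    q->slot[q->len].pkt_id = pkt_id;
    q->slot[q->len].expires = now + delay;
    q->len++;
    return 0;
}

/* Release every packet whose timer has expired by `now`. Released ids
 * are written to out[] (at most out_cap of them); packets that are not
 * yet due, or that do not fit in out[], stay queued.
 * Returns the number of packets released. */
size_t dq_expire(struct dq *q, unsigned long now, int *out, size_t out_cap)
{
    size_t kept = 0, delivered = 0;

    assert(q != NULL && out != NULL);
    for (size_t i = 0; i < q->len; i++) {
        if (q->slot[i].expires <= now && delivered < out_cap)
            out[delivered++] = q->slot[i].pkt_id;
        else
            q->slot[kept++] = q->slot[i];
    }
    q->len = kept;
    return delivered;
}
```

In this model nothing about the queue depends on a socket: a packet
leaves the queue only when its own expiry tick is reached, whether or
not anyone is still listening for it.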


AV

2007-09-11 09:28:20

by David Schwartz

Subject: RE: Socket owner problem?


> Hi David,
>
> Thanks for your quick reply.
>
> > If that were true, anyone who could send those packets to your
> > machine would
> > be able to cause the system to hang too.
>
> You're right to say that :)
>
> > Perhaps you are feeding the packets
> > back in at too high a layer.
>
> Not really. In fact, I pass the packet back to the "lower layer" again
> by calling netif_receive_skb(). Note that packets can go in a loop
> here. To avoid queuing the same packets repeatedly, the module "marks"
> them the first time they are queued. Marked packets are simply
> NF_ACCEPT'ed by the module hook and therefore are propagated up the
> netfilter chain.

So then there is no reason there should be any problem if the packets are
fed after the socket is destroyed. I would try to figure out why something
that should not give you a problem is giving you a problem.

> > What object is this queue logically associated with? If the socket, you
> > should probably hook 'release' so you can purge the queue when
> > the socket is
> > removed.

> The queue is not associated with the socket. It is independent and is
> meant just for the module to use for queuing packets that are supposed
> to be delayed. But for each packet in this queue, there is an
> associated kernel timer. Once this timer expires, the associated
> packet is fed into the netif_receive_skb().

So what exactly goes wrong then? This approach sounds bulletproof.

DS