Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1759598AbYAJWgR (ORCPT ); Thu, 10 Jan 2008 17:36:17 -0500 Received: (majordomo@vger.kernel.org) by vger.kernel.org id S1759135AbYAJWf6 (ORCPT ); Thu, 10 Jan 2008 17:35:58 -0500 Received: from netops-testserver-3-out.sgi.com ([192.48.171.28]:42544 "EHLO relay.sgi.com" rhost-flags-OK-OK-OK-FAIL) by vger.kernel.org with ESMTP id S1756136AbYAJWf5 (ORCPT ); Thu, 10 Jan 2008 17:35:57 -0500 Date: Thu, 10 Jan 2008 16:35:55 -0600 (CST) From: Brent Casavant Reply-To: Brent Casavant To: Herbert Xu cc: penguin-kernel@i-love.sakura.ne.jp, netdev@vger.kernel.org, davem@davemloft.net, linux-kernel@vger.kernel.org Subject: Re: AF_UNIX MSG_PEEK bug? In-Reply-To: Message-ID: References: User-Agent: Alpine 1.00 (BSF 882 2007-12-20) Organization: "Silicon Graphics, Inc." MIME-Version: 1.0 Content-Type: TEXT/PLAIN; charset=US-ASCII Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 2922 Lines: 95 Here's what I think is a better patch. Or maybe just simpler. However, I'm still unsure what the effect of this patch on file descriptor passing might be. Reading the prior code, and the parallel portions/comments in unix_dgram_recvmsg(), it looks like there's been a lot of uncertainty as to how file descriptor passing should be handled durning MSG_PEEK operations. To quote: /* It is questionable: on PEEK we could: - do not return fds - good, but too simple 8) - return fds, and do not return them on read (old strategy, apparently wrong) - clone fds (I chose it for now, it is the most universal solution) POSIX 1003.1g does not actually define this clearly at all. POSIX 1003.1g doesn't define a lot of things clearly however! */ With this patch, passed file descriptors are ignored during MSG_PEEK. This is essentially the first case in the comment above. What I can't seem to figure out is why this is incorrect. I suspect there's some history here that I can't find via Google, mailing list archives, or revision logs. So, that said, here's a cleaner patch. It's still not ready for application until the file descriptor passing is better understood. Thanks, Brent diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c index 060bba4..6d6cdb4 100644 --- a/net/unix/af_unix.c +++ b/net/unix/af_unix.c @@ -1750,6 +1750,8 @@ static int unix_stream_recvmsg(struct ki int target; int err = 0; long timeo; + struct sk_buff *skb; + struct sk_buff_head peek_stack; err = -EINVAL; if (sk->sk_state != TCP_ESTABLISHED) @@ -1759,6 +1761,9 @@ static int unix_stream_recvmsg(struct ki if (flags&MSG_OOB) goto out; + if (flags & MSG_PEEK) + skb_queue_head_init(&peek_stack); + target = sock_rcvlowat(sk, flags&MSG_WAITALL, size); timeo = sock_rcvtimeo(sk, flags&MSG_DONTWAIT); @@ -1778,7 +1783,6 @@ static int unix_stream_recvmsg(struct ki do { int chunk; - struct sk_buff *skb; unix_state_lock(sk); skb = skb_dequeue(&sk->sk_receive_queue); @@ -1864,19 +1868,14 @@ static int unix_stream_recvmsg(struct ki if (siocb->scm->fp) break; - } - else - { - /* It is questionable, see note in unix_dgram_recvmsg. - */ - if (UNIXCB(skb).fp) - siocb->scm->fp = scm_fp_dup(UNIXCB(skb).fp); + } else + __skb_queue_head(&peek_stack, skb); + } while (size); - /* put message back and return */ + /* Push all peeked skbs back onto receive queue */ + if (flags & MSG_PEEK) + while ((skb = __skb_dequeue(&peek_stack))) skb_queue_head(&sk->sk_receive_queue, skb); - break; - } - } while (size); mutex_unlock(&u->readlock); scm_recv(sock, msg, siocb->scm, flags); -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/