From: Rainer Weikusat
To: Eric Dumazet
Cc: rweikusat@mobileactivedefense.com, Dmitry Vyukov, Benjamin LaHaise,
    "David S. Miller", Hannes Frederic Sowa, Al Viro, David Howells,
    Ying Xue, "Eric W. Biederman", netdev, LKML, syzkaller,
    Kostya Serebryany, Alexander Potapenko, Sasha Levin
Subject: Re: use-after-free in sock_wake_async
In-Reply-To: (Eric Dumazet's message of "Tue, 24 Nov 2015 17:18:27 -0800")
References: <87poyzj7j2.fsf@doppelsaurus.mobileactivedefense.com>
    <87io4qevdp.fsf@doppelsaurus.mobileactivedefense.com>
Date: Wed, 25 Nov 2015 16:43:13 +0000
Message-ID: <87io4q3u8u.fsf@doppelsaurus.mobileactivedefense.com>

Eric Dumazet writes:
> On Tue, Nov 24, 2015 at 5:10 PM, Rainer Weikusat wrote:

[...]

>> It's also easy to verify: Swap the unix_state_lock and
>> other->sk_data_ready and see if the issue still occurs. Right now (this
>> may change after I had some sleep as it's pretty late for me), I don't
>> think there's another local fix: The ->sk_data_ready accesses a
>> pointer after the lock taken by the code which will clear and
>> then later free it was released.
>
> It seems that :
>
> int sock_wake_async(struct socket *sock, int how, int band)
>
> should really be changed to
>
> int sock_wake_async(struct socket_wq *wq, int how, int band)
>
> So that RCU rules (already present) apply safely.
>
> sk->sk_socket is inherently racy (that is : racy without using
> sk_callback_lock rwlock )

The comment above sock_wake_async states that

/* This function may be called only under socket lock or callback_lock or rcu_lock */

In this case, it's called via sk_wake_async (include/net/sock.h) which
is - in turn - called via sock_def_readable (the 'default' data ready
routine, net/core/sock.c) which looks like this:

static void sock_def_readable(struct sock *sk)
{
	struct socket_wq *wq;

	rcu_read_lock();
	wq = rcu_dereference(sk->sk_wq);
	if (wq_has_sleeper(wq))
		wake_up_interruptible_sync_poll(&wq->wait, POLLIN | POLLPRI |
						POLLRDNORM | POLLRDBAND);
	sk_wake_async(sk, SOCK_WAKE_WAITD, POLL_IN);
	rcu_read_unlock();
}

and should thus satisfy the constraint documented by the comment (I
didn't verify if the comment is actually correct, though).

Further - sorry about that - I think changing code in "half of the
network stack" in order to avoid calling a certain routine which will
only ever do something in case someone's using signal-driven I/O with an
already acquired lock held is a terrifying idea. Because of this, I
propose the following alternate patch which should also solve the
problem by ensuring that the ->sk_data_ready activity happens before
unix_release_sock/sock_release get a chance to clear or free anything
which will be needed.

In case this demonstrably causes other issues, a more complicated
alternate idea (still restricting itself to changes to the af_unix code)
would be to move the socket_wq structure to a dummy struct socket
allocated by unix_release_sock and freed by the destructor.
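To illustrate the ordering argument outside the kernel, here is a
minimal userspace sketch. It is not kernel code and makes several
simplifying assumptions: all demo_* names are hypothetical, a pthread
mutex stands in for unix_state_lock, and a heap-allocated counter stands
in for the structure that ->sk_data_ready touches. The only point is
that a notification issued before the unlock cannot overlap with a
release path which clears the pointer under the same lock and frees the
object afterwards.

```c
/* Userspace sketch only; build with: cc -pthread demo.c */
#include <pthread.h>
#include <stdlib.h>

/* Hypothetical stand-in for the object ->sk_data_ready dereferences. */
struct demo_wq {
	int wakeups;
};

/* Hypothetical stand-in for the receiving socket. */
struct demo_sock {
	pthread_mutex_t state_lock;	/* plays the role of unix_state_lock */
	struct demo_wq *wq;		/* cleared, then freed, on release */
	int queued;			/* messages queued so far */
};

/* Caller must hold state_lock and must have checked sk->wq. */
static void demo_data_ready(struct demo_sock *sk)
{
	sk->wq->wakeups++;
}

/* Sender side: returns 1 if delivered, 0 if the peer was released. */
static int demo_send(struct demo_sock *other)
{
	int alive;

	pthread_mutex_lock(&other->state_lock);
	alive = other->wq != NULL;
	if (alive) {
		other->queued++;
		demo_data_ready(other);			/* notify first ... */
	}
	pthread_mutex_unlock(&other->state_lock);	/* ... unlock second */
	return alive;
}

/* Release side: clear under the lock, free only after dropping it. */
static int demo_release(struct demo_sock *sk)
{
	struct demo_wq *wq;
	int wakeups;

	pthread_mutex_lock(&sk->state_lock);
	wq = sk->wq;
	sk->wq = NULL;
	pthread_mutex_unlock(&sk->state_lock);

	/* No sender can reach wq any more: it is private to this thread. */
	wakeups = wq ? wq->wakeups : 0;
	free(wq);
	return wakeups;
}

/* Keeps sending until the peer is gone (bounded to stay finite). */
static void *demo_sender(void *arg)
{
	int i;

	for (i = 0; i < 100000 && demo_send(arg); i++)
		;
	return NULL;
}

/* Races one sender against a release; returns -1 on any mismatch. */
static int demo_run(void)
{
	struct demo_sock sk = { .wq = NULL, .queued = 0 };
	pthread_t sender;
	int wakeups;

	pthread_mutex_init(&sk.state_lock, NULL);
	sk.wq = calloc(1, sizeof(*sk.wq));
	if (!sk.wq || pthread_create(&sender, NULL, demo_sender, &sk)) {
		free(sk.wq);
		pthread_mutex_destroy(&sk.state_lock);
		return -1;
	}
	wakeups = demo_release(&sk);
	pthread_join(sender, NULL);
	pthread_mutex_destroy(&sk.state_lock);
	return wakeups == sk.queued ? wakeups : -1;
}
```

Calling demo_run() from a main() should return a non-negative count:
every queued message got exactly one notification and none of them
touched freed memory. If demo_send were changed to unlock before calling
demo_data_ready, the release could free the counter in between, which is
the analogue of the reported use-after-free.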
---
diff --git a/net/unix/af_unix.c b/net/unix/af_unix.c
index 4e95bdf..5c87ea6 100644
--- a/net/unix/af_unix.c
+++ b/net/unix/af_unix.c
@@ -1754,8 +1754,8 @@ restart_locked:
 	skb_queue_tail(&other->sk_receive_queue, skb);
 	if (max_level > unix_sk(other)->recursion_level)
 		unix_sk(other)->recursion_level = max_level;
-	unix_state_unlock(other);
 	other->sk_data_ready(other);
+	unix_state_unlock(other);
 	sock_put(other);
 	scm_destroy(&scm);
 	return len;
@@ -1860,8 +1860,8 @@ static int unix_stream_sendmsg(struct socket *sock, struct msghdr *msg,
 		skb_queue_tail(&other->sk_receive_queue, skb);
 		if (max_level > unix_sk(other)->recursion_level)
 			unix_sk(other)->recursion_level = max_level;
-		unix_state_unlock(other);
 		other->sk_data_ready(other);
+		unix_state_unlock(other);
 		sent += size;
 	}
 