Date: Mon, 9 Nov 2020 18:31:34 -0300
From: Thadeu Lima de Souza Cascardo
To: Jakub Kicinski
Cc: Kleber Sacilotto de Souza, Eric Dumazet, netdev@vger.kernel.org,
    Gerrit Renker, "David S. Miller", "Gustavo A. R. Silva",
    "Alexander A. Klimov", Kees Cook, Alexey Kodanev,
    dccp@vger.kernel.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH 1/2] dccp: ccid: move timers to struct dccp_sock
Message-ID: <20201109213134.GR595944@mussarela>
References: <20201013171849.236025-1-kleber.souza@canonical.com>
    <20201013171849.236025-2-kleber.souza@canonical.com>
    <20201016153016.04bffc1e@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
    <20201109114828.GP595944@mussarela>
    <20201109094938.45b230c9@kicinski-fedora-pc1c0hjn.dhcp.thefacebook.com>
    <20201109210909.GQ595944@mussarela>
    <20201109131554.5f65b2fa@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
In-Reply-To: <20201109131554.5f65b2fa@kicinski-fedora-PC1C0HJN.hsd1.ca.comcast.net>
X-Mailing-List: linux-kernel@vger.kernel.org

On Mon, Nov 09, 2020 at 01:15:54PM -0800, Jakub Kicinski wrote:
> On Mon, 9 Nov 2020 18:09:09 -0300 Thadeu Lima de Souza Cascardo wrote:
> > On Mon, Nov 09, 2020 at 09:49:38AM -0800, Jakub Kicinski wrote:
> > > On Mon, 9 Nov 2020 08:48:28 -0300 Thadeu Lima de Souza Cascardo wrote:
> > > > On Fri, Oct 16, 2020 at 03:30:16PM -0700, Jakub Kicinski wrote:
> > > > > On Tue, 13 Oct 2020 19:18:48 +0200 Kleber Sacilotto de Souza wrote:
> > > > > > From: Thadeu Lima de Souza Cascardo
> > > > > >
> > > > > > When dccps_hc_tx_ccid is freed, ccid timers may still trigger. The reason
> > > > > > del_timer_sync can't be used is because this relies on keeping a reference
> > > > > > to struct sock. But as we keep a pointer to dccps_hc_tx_ccid and free that
> > > > > > during disconnect, the timer should really belong to struct dccp_sock.
> > > > > >
> > > > > > This addresses CVE-2020-16119.
> > > > > >
> > > > > > Fixes: 839a6094140a (net: dccp: Convert timers to use timer_setup())
> > > > > > Signed-off-by: Thadeu Lima de Souza Cascardo
> > > > > > Signed-off-by: Kleber Sacilotto de Souza
> > > > >
> > > > > I've been mulling over this fix.
> > > > >
> > > > > The layering violation really doesn't sit well.
> > > > >
> > > > > We're reusing the timer object. What if we are really unlucky, the
> > > > > timer fires and gets blocked by a cosmic ray just as it's about to try to
> > > > > lock the socket, then user manages to reconnect, and timer starts
> > > > > again. Potentially with a different CCID algo altogether?
> > > > >
> > > > > Is disconnect ever called under the BH lock? Maybe plumb a bool
> > > > > argument through to ccid*_hc_tx_exit() and do a sk_stop_timer_sync()
> > > > > when called from disconnect()?
> > > > >
> > > > > Or do refcounting on ccid_priv so that the timer holds both the socket
> > > > > and the priv?
> > > >
> > > > Sorry for the late response. I was on vacation, then came back and spent a
> > > > couple of days testing this further, and had to switch to other tasks.
> > > >
> > > > So, while testing this, I had to resort to tricks like having a very small
> > > > expire and enqueuing on a different CPU. Then, after some minutes, I hit a UAF.
> > > > That's with or without the first or the second patch.
> > > >
> > > > I also tried to refcount ccid instead of the socket, keeping the timer on the
> > > > ccid, but that still hit the UAF, and that's when I had to switch tasks.
> > >
> > > Hm, not instead, as well. I think trying to cancel the timer _sync from
> > > the disconnect path would be the simplest solution, tho.
> >
> > I don't think so.
> > On other paths, we would still have the possibility that:
> >
> > CPU1: timer expires and is about to run
> > CPU2: calls stop_timer (which does not stop anything) and frees ccid
> > CPU1: timer runs and uses freed ccid
> >
> > And those paths, IIUC, may be run under a SoftIRQ on the receive path, so would
> > not be able to call stop_timer_sync.
>
> Which paths are those (my memory of this code is waning)? I thought
> disconnect is only called from the user space side (shutdown syscall).
> The only other way to terminate the connection is to close the socket,
> which Eric already fixed by postponing the destruction of ccid in that
> case.

dccp_v4_do_rcv
  -> dccp_rcv_established
  -> dccp_parse_options
  -> dccp_feat_parse_options
  -> dccp_feat_handle_nn_established
  -> dccp_feat_activate
  -> __dccp_feat_activate
  -> dccp_hdlr_ccid
  -> ccid_hc_tx_delete

>
> > > > Oh, and in the meantime, I found one or two other fixes that we
> > > > should apply, will send them shortly.
> > > >
> > > > But I would argue that we should apply the revert as it addresses the
> > > > CVE, without really regressing the other UAF, as I argued. Does that
> > > > make sense?
> > >
> > > We can - it's always a little strange to go from one bug to a different
> > > without a fix - but the justification being that while the previous UAF
> > > required a race condition the new one is actually worse because it can
> > > be triggered reliably?
> >
> > Well, I am arguing here that commit 2677d20677314101293e6da0094ede7b5526d2b1
> > ("dccp: don't free ccid2_hc_tx_sock struct in dccp_disconnect()") doesn't
> > really fix anything. Whenever ccid_hc_tx_delete is called, that UAF might
> > happen, because the timer might trigger right after we free the ccid struct.
> >
> > And, yes, on the other hand, we can reliably launch the DoS attack that is
> > fixed by the revert of that commit.
>
> OK.
>
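
For readers following the CPU1/CPU2 interleaving above, here is a minimal userspace sketch of the refcounting idea raised earlier in the thread (the timer holding its own reference to the ccid private state). It is only an illustration under stated assumptions: it is not the DCCP code, not a tested kernel fix, and the thread itself reports that one refcounting attempt still hit the UAF. All names (ccid_priv_get, ccid_priv_put, timer_arm, timer_fire, replay_interleaving, live_objects) are hypothetical, and the interleaving is replayed sequentially in one thread rather than raced on real CPUs.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

/* Hypothetical stand-in for the per-connection CCID private state. */
struct ccid_priv {
	atomic_int refcnt;   /* one ref for the socket, one per pending timer */
	int timer_runs;      /* how many times the timer callback has run */
};

/* Number of ccid_priv objects allocated and not yet freed (for checking). */
static int live_objects;

static struct ccid_priv *ccid_priv_new(void)
{
	struct ccid_priv *p = malloc(sizeof(*p));

	if (!p)
		return NULL;
	atomic_init(&p->refcnt, 1);  /* reference owned by the socket */
	p->timer_runs = 0;
	live_objects++;
	return p;
}

static void ccid_priv_get(struct ccid_priv *p)
{
	assert(atomic_load(&p->refcnt) > 0);
	atomic_fetch_add(&p->refcnt, 1);
}

/* Drop one reference; free the object when the last one goes away. */
static void ccid_priv_put(struct ccid_priv *p)
{
	if (atomic_fetch_sub(&p->refcnt, 1) == 1) {
		live_objects--;
		free(p);
	}
}

/* Arming the timer takes a reference on behalf of the callback. */
static void timer_arm(struct ccid_priv *p)
{
	ccid_priv_get(p);
}

/* Timer callback: touches the state, then drops the timer's reference.
 * Returns the run count observed before the reference is dropped. */
static int timer_fire(struct ccid_priv *p)
{
	int runs = ++p->timer_runs;

	ccid_priv_put(p);
	return runs;
}

struct replay_result {
	int alive_after_delete;  /* live objects after CPU2 "frees" the ccid */
	int alive_after_fire;    /* live objects after CPU1's callback ran */
	int timer_runs;          /* callback runs seen by CPU1 */
};

/* Replays the CPU1/CPU2 ordering from the mail, one step at a time. */
static struct replay_result replay_interleaving(void)
{
	struct replay_result r = { -1, -1, -1 };
	struct ccid_priv *p = ccid_priv_new();

	if (!p)
		return r;
	timer_arm(p);
	/* CPU1: timer has expired and is about to run (it holds a ref). */
	/* CPU2: a non-sync stop cancels nothing; drop the socket's ref. */
	ccid_priv_put(p);
	r.alive_after_delete = live_objects;
	/* CPU1: callback runs on still-valid memory, drops the last ref. */
	r.timer_runs = timer_fire(p);
	r.alive_after_fire = live_objects;
	return r;
}
```

In this sketch the object outlives the "delete" on CPU2 because the pending timer still holds a reference, and it is released only after the callback finishes. Whether the same ownership rule can be applied cleanly to the real ccid and socket lifetimes is exactly what the thread leaves open.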