Date: Fri, 20 Mar 2020 22:33:17 -0300
From: Marcelo Ricardo Leitner
To: Qiujun Huang
Cc: "David S. Miller", vyasevich@gmail.com, nhorman@tuxdriver.com, Jakub Kicinski, linux-sctp@vger.kernel.org, netdev, LKML, anenbupt@gmail.com
Subject: Re: [PATCH v3] sctp: fix refcount bug in sctp_wfree
Message-ID: <20200321013317.GF3756@localhost.localdomain>
References: <20200320110959.2114-1-hqjagain@gmail.com> <20200320185204.GB3828@localhost.localdomain> <20200321010246.GC3828@localhost.localdomain>
X-Mailing-List: linux-kernel@vger.kernel.org

On Sat, Mar 21, 2020 at 09:23:54AM +0800, Qiujun Huang wrote:
> On Sat, Mar 21, 2020 at 9:02 AM Marcelo Ricardo Leitner wrote:
> >
> > On Sat, Mar 21, 2020 at 07:53:29AM +0800, Qiujun Huang wrote:
> > ...
> > > > > So, sctp_wfree was not called to destroy SKB)
> > > > >
> > > > > then migrate happened
> > > > >
> > > > > sctp_for_each_tx_datachunk(
> > > > > sctp_clear_owner_w);
> > > > > sctp_assoc_migrate();
> > > > > sctp_for_each_tx_datachunk(
> > > > > sctp_set_owner_w);
> > > > > SKB was not in the outq, and was not changed to newsk
> > > >
> > > > The real fix is to fix the migration to the new socket, though the
> > > > situation on which it is happening is still not clear.
> > > >
> > > > The 2nd sendto() call on the reproducer is sending 212992 bytes on a
> > > > single call. That's usually the whole sndbuf size, and will cause
> > > > fragmentation to happen. That means the datamsg will contain several
> > > > skbs. But still, the sacked chunks should be freed if needed while the
> > > > remaining ones will be left on the queues that they are.
> > >
> > > in sctp_sendmsg_to_asoc
> > > datamsg holds his chunk result in that the sacked chunks can't be freed
> >
> > Right! Now I see it, thanks.
> > In the end, it's not a locking race condition. It's just not iterating
> > on the lists properly.
> >
> > >
> > > list_for_each_entry(chunk, &datamsg->chunks, frag_list) {
> > > 	sctp_chunk_hold(chunk);
> > > 	sctp_set_owner_w(chunk);
> > > 	chunk->transport = transport;
> > > }
> > >
> > > any ideas to handle it?
> >
> > sctp_for_each_tx_datachunk() needs to be aware of this situation.
> > Instead of iterating directly/only over the chunk list, it should
> > iterate over the datamsgs instead. Something like the below (just
> > compile tested).
> >
> > Then, the old socket will be free to die regardless of the new one.
> > Otherwise, if this association gets stuck on retransmissions or so,
> > the old socket would not be freed till then.
> >
> > diff --git a/net/sctp/socket.c b/net/sctp/socket.c
> > index fed26a1e9518..85c742310d26 100644
> > --- a/net/sctp/socket.c
> > +++ b/net/sctp/socket.c
> > @@ -151,9 +151,10 @@ static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
> >  				       void (*cb)(struct sctp_chunk *))
> >
> >  {
> > +	struct sctp_datamsg *msg, *prev_msg = NULL;
> >  	struct sctp_outq *q = &asoc->outqueue;
> >  	struct sctp_transport *t;
> > -	struct sctp_chunk *chunk;
> > +	struct sctp_chunk *chunk, *c;

I forgot to swap some lines here to follow reverse Christmas tree
style, btw.

> >
> >  	list_for_each_entry(t, &asoc->peer.transport_addr_list, transports)
> >  		list_for_each_entry(chunk, &t->transmitted, transmitted_list)
> > @@ -162,8 +163,14 @@ static void sctp_for_each_tx_datachunk(struct sctp_association *asoc,
> >  	list_for_each_entry(chunk, &q->retransmit, transmitted_list)
> >  		cb(chunk);
> >
> > -	list_for_each_entry(chunk, &q->sacked, transmitted_list)
> > -		cb(chunk);
> > +	list_for_each_entry(chunk, &q->sacked, transmitted_list) {
> > +		msg = chunk->msg;
> > +		if (msg == prev_msg)
> > +			continue;
> > +		list_for_each_entry(c, &msg->chunks, frag_list)
> > +			cb(c);
> > +		prev_msg = msg;
> > +	}
> >
>
> great. I'll trigger a syzbot test. Thanks.

Mind that it may need to be handled on the other lists as well. I didn't
check them :]

>
> >
> >  	list_for_each_entry(chunk, &q->abandoned, transmitted_list)
> >  		cb(chunk);
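
For readers who want to see the iteration idea from the diff in isolation,
here is a minimal userspace sketch. It is not kernel code and is not taken
from the patch: struct chunk, struct msg, walk_queue_only() and
walk_via_msgs() are hypothetical names, the lists are plain singly linked
pointers rather than the kernel's list_head, and "owner" is just an integer
standing in for the socket a chunk is charged to. It only models the point
made in the thread: a fragment that belongs to a datamsg but is not on the
queue being walked is missed by a queue-only walk, and is reached when the
walk goes through each queued chunk's message instead.

```c
#include <assert.h>
#include <stddef.h>

/* Simplified stand-ins for illustration; NOT the kernel's structures. */
struct msg;

struct chunk {
	struct msg *msg;            /* message this fragment belongs to */
	struct chunk *next_frag;    /* next fragment of the same message */
	struct chunk *next_queued;  /* next chunk on the queue, NULL at the end */
	int owner;                  /* id of the socket charged for this chunk */
};

struct msg {
	struct chunk *frags;        /* every fragment, queued or not */
};

/* Visit only the chunks that are linked on the queue. */
static int walk_queue_only(struct chunk *queue, int new_owner)
{
	int visited = 0;

	for (struct chunk *c = queue; c != NULL; c = c->next_queued) {
		c->owner = new_owner;
		visited++;
	}
	return visited;
}

/*
 * For each queued chunk, visit every fragment of its message. A message is
 * skipped when the previous queued chunk belonged to the same message.
 */
static int walk_via_msgs(struct chunk *queue, int new_owner)
{
	struct msg *prev_msg = NULL;
	int visited = 0;

	for (struct chunk *c = queue; c != NULL; c = c->next_queued) {
		struct msg *m = c->msg;

		assert(m != NULL);  /* this model requires every chunk to have a message */
		if (m == prev_msg)
			continue;
		for (struct chunk *f = m->frags; f != NULL; f = f->next_frag) {
			f->owner = new_owner;
			visited++;
		}
		prev_msg = m;
	}
	return visited;
}
```

With three fragments of one message, of which only the first two are linked
on the queue, walk_queue_only() updates two chunks and leaves the third with
its old owner, while walk_via_msgs() updates all three exactly once. Note
that the prev_msg check only deduplicates when chunks of the same message
are adjacent on the queue; in this model a repeated visit would merely
rewrite the same owner, and the sketch says nothing about whether repeated
visits are safe for the real callbacks.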