Received: by 2002:ac0:a5a6:0:0:0:0:0 with SMTP id m35-v6csp4977822imm; Tue, 11 Sep 2018 22:38:52 -0700 (PDT) X-Google-Smtp-Source: ANB0VdbHPpwkGNwOEshKBDsDVvlT/nONSrjJ8mFBWhHQqVbbRdemxk81X2b6cms30JIlOTjYk7Z6 X-Received: by 2002:a17:902:d808:: with SMTP id a8-v6mr243572plz.68.1536730732745; Tue, 11 Sep 2018 22:38:52 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1536730732; cv=none; d=google.com; s=arc-20160816; b=x0AZr1JyvtpACDrxN9FgqN9VtL5O6OF4+8Eah8VdNarDSxcKdWmsB0VCl72c9iy+tA vsyYyRrwRIBnTBeL1iP/CFO1AmOfe4B5pEgp7GqHInwoflYwSovLbqcZyVv9bUad25WY YFrSjFRIBQ0y9OOmjvo5xKNcxGvwDspMLREc8JFrEsOJguFpwTJvJwd9ol2XXUZukc/A WmfR+A3O2YY5Vgducv/DoPzFDj8bDTSPe9DM8rXrZqvv/6diGfu/8PRmSy8Do+0bl6Dr M/HNqFu8BgyJRKEfYND+r/WoSmVv+57WVq2rfGNujqi0FLXIl6+5PQMD48dHlWsVgyHi ts0g== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:user-agent:in-reply-to :content-disposition:mime-version:references:message-id:subject:cc :to:from:date; bh=fc9lrNM93hzbJe2A5/TttLQkyCdIAruzuXUOEPHbZfk=; b=eWPf4nC5e5+oSKgxMe3igbdWRaoCD18PUKtBajD/aR7wHCOsIDVO3jHb/W2rjUZXIq d1QH+E3SPFxqiVWf1F6jCfflICtLVsfLgaft4X2gG1T5YL09/5H4Y0lg7k+LYVCYdJOO qTEjnQw83U+3RC0J2CTWC9QZDzizcThemi95FeMhoPTwz++0L304S6Nbf8FNOCzuSziv fNjVeTI2PXcUbrDg8a8ivwGhurMN6dfoijHeOCqnK3q7ZGzmQwCJzPZneecdcK/qKyPp G3UiIkowvefRWglgMi1tP+DlW4oS60Cvdf6zznYzSCOa0xiQkquhNbrAOtzZ9hZkNut1 Qx5w== ARC-Authentication-Results: i=1; mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id ca2-v6si24258935plb.305.2018.09.11.22.38.37; Tue, 11 Sep 2018 22:38:52 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1727287AbeILKju (ORCPT + 99 others); Wed, 12 Sep 2018 06:39:50 -0400 Received: from nautica.notk.org ([91.121.71.147]:41477 "EHLO nautica.notk.org" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726138AbeILKju (ORCPT ); Wed, 12 Sep 2018 06:39:50 -0400 Received: by nautica.notk.org (Postfix, from userid 1001) id F040BC009; Wed, 12 Sep 2018 07:36:57 +0200 (CEST) Date: Wed, 12 Sep 2018 07:36:42 +0200 From: Dominique Martinet To: Doron Roberts-Kedes , Tom Herbert , Dave Watson Cc: "David S. Miller" , netdev@vger.kernel.org, linux-kernel@vger.kernel.org Subject: Re: [PATCH v2] kcm: remove any offset before parsing messages Message-ID: <20180912053642.GA2912@nautica> References: <1536657703-27577-1-git-send-email-asmadeus@codewreck.org> MIME-Version: 1.0 Content-Type: text/plain; charset=utf-8 Content-Disposition: inline In-Reply-To: <1536657703-27577-1-git-send-email-asmadeus@codewreck.org> User-Agent: Mutt/1.5.21 (2010-09-15) Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Dominique Martinet wrote on Tue, Sep 11, 2018: > Hmm, while trying to benchmark this, I sometimes got hangs in > kcm_wait_data() for the last packet somehow? > The sender program was done (exited (zombie) so I assumed the sender > socket flushed), but the receiver was in kcm_wait_data in kcm_recvmsg > indicating it parsed a header but there was no skb to peek at? > But the sock is locked so this shouldn't be racy... > > I can get it fairly often with this patch and small messages with an > offset, but I think it's just because the pull changes some timing - I > can't hit it with just the clone, and I can hit it with a pull without > clone as well.... And I don't see how pulling a cloned skb can impact > the original socket, but I'm a bit fuzzy on this. This is weird, I cannot reproduce at all without that pull, even if I add another delay there instead of the pull, so it's not just timing... I was confusing kcm_recvmsg and kcm_rcv_strparser but I still think there is a race in kcm_wait_data between the peek and the wait. I have confirmed on a hang that the sock's sk_receive_queue is not empty so skb_peek would return something at this point, but it is waiting in kcm_wait_data's sk_wait_data.. And no event would be coming at this point since the sender is done. kcm_wait_data does have the socket lock but the enqueuing counterpart (kcm_queue_rcv_skb) is apparently called without lock most of the time (at least, sk->sk_lock.owned is not set most of the times) ; but if I add a lock_sock() here it can deadlock when a kfree_skb triggers kcm_rcv_ready if that kfree_skb was done with the lock... ugh. I thought the mux rx lock would be taken all the time but that appear not to be the case either, and even if it were I don't see how to make sk_wait_data with another lock in the first place. So meh, I'm not sure why the pull messes up with this, I'm tempted to say this is an old problem but I'm quite annoyed I do not seem to be able to reproduce it. I'm really out of time now so unless someone cares it'll wait for a few more weeks. (As an unrelated note, wrt to the patch, it might be nice to add a new kcm socket option so users could say "my bpf program handles offsets, don't bother with this") -- Dominique