Subject: Re: [PATCH net-next v4] skb_expand_head() adjust skb->truesize incorrectly
From: Vasily Averin
To: Eric Dumazet, Christoph Paasch, "David S. Miller"
Cc: Hideaki YOSHIFUJI, David Ahern, Jakub Kicinski, netdev, linux-kernel@vger.kernel.org, kernel@openvz.org, Alexey Kuznetsov, Julian Wiedmann
Date: Thu, 2 Sep 2021 10:33:09 +0300
Message-ID: <2984f16b-7f20-e72d-1661-b942fdc4ff9b@virtuozzo.com>
In-Reply-To: <8a183782-f4b9-e12a-55d1-c4a3c4078369@virtuozzo.com>
References: <67740366-7f1b-c953-dfe1-d2085297bdf3@gmail.com> <8a183782-f4b9-e12a-55d1-c4a3c4078369@virtuozzo.com>

On 9/2/21 10:13 AM, Vasily Averin wrote:
> On 9/2/21 7:48 AM, Eric Dumazet wrote:
>> On 9/1/21 9:32 PM, Eric Dumazet wrote:
>>> I think you missed netem case, in particular
>>> skb_orphan_partial() which I already pointed out.
>>>
>>> You can setup a stack of virtual devices (tunnels),
>>> with a qdisc on them, before ip6_xmit() is finally called...
>>>
>>> Socket might have been closed already.
>>>
>>> To test your patch, you could force a skb_orphan_partial() at the beginning
>>> of skb_expand_head() (extending code coverage)
>>
>> To clarify :
>>
>> It is ok to 'downgrade' an skb->destructor having a ref on sk->sk_wmem_alloc to
>> something owning a ref on sk->refcnt.
>>
>> But the opposite operation (ref on sk->sk_refcnt --> ref on sk->sk_wmem_alloc) is not safe.
>
> Could you please explain in more detail, since I still have a completely opposite point of view?
>
> Every sk referenced in an skb has sk_wmem_alloc > 0.
> It is set to 1 in sk_alloc() and decremented right before the last __sk_free(),
> inside sk_free(), sock_wfree() and __sock_wfree().
>
> So it is safe to adjust skb->sk->sk_wmem_alloc,
> because a live skb keeps a reference to a live sk, and the latter keeps sk_wmem_alloc > 0.
>
> So any destructor that uses sk->sk_refcnt will already have sk_wmem_alloc > 0,
> because the last sock_put() calls sk_free().
>
> However, now I'm not sure about the reverse direction.
> skb_set_owner_w() checks !sk_fullsock(sk) and calls sock_hold(sk);
> if sk->sk_refcnt can be 0 here (i.e. after the old destructor has run inside skb_orphan())
> -- it can trigger the problem you pointed out:
> "refcount_add() will trigger a warning (panic under KASAN)".
>
> Could you please explain where I'm wrong?

To clarify:

I agree it is unsafe to call the following sequence on a live skb:

	skb_orphan(skb)
	adjust(skb->sk->sk_wmem_alloc)

for two reasons:
1) the old destructor can decrease sk_wmem_alloc to zero and free the sk;
2) if !sk_fullsock(sk), the old destructor can call sock_put() and release
   the last sk->sk_refcnt reference. In this case sock_hold() will trigger
   a warning.

Case 1) can be handled: we can adjust sk_wmem_alloc before skb_orphan().
However, I do not yet see how to handle case 2).

Thank you,
	Vasily Averin
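
For illustration only, the two cases above can be replayed in a small userspace model. This is a rough sketch, not kernel code: struct toy_sock, struct toy_skb and every toy_* helper are hypothetical stand-ins, the counters are plain ints rather than refcount_t, and "freed" is just a flag. It assumes the socket has already been closed, so that the skb holds the last charge on wmem_alloc (case 1) or the last plain reference (case 2).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct sock: only the two counters discussed above. */
struct toy_sock {
	int refcnt;     /* models sk->sk_refcnt */
	int wmem_alloc; /* models sk->sk_wmem_alloc */
	bool freed;     /* set once the model considers the socket freed */
};

struct toy_skb {
	struct toy_sock *sk;
	int truesize;
	void (*destructor)(struct toy_skb *skb);
};

/* Models a sock_wfree()-style destructor: uncharge, free sk on zero. */
static void toy_wfree(struct toy_skb *skb)
{
	skb->sk->wmem_alloc -= skb->truesize;
	if (skb->sk->wmem_alloc == 0)
		skb->sk->freed = true;
}

/* Models a destructor that only drops a plain reference (simplified). */
static void toy_put(struct toy_skb *skb)
{
	assert(skb->sk->refcnt > 0);
	if (--skb->sk->refcnt == 0)
		skb->sk->freed = true;
}

/* Models skb_orphan(): run the destructor, then detach the owner. */
static void toy_orphan(struct toy_skb *skb)
{
	if (skb->destructor)
		skb->destructor(skb);
	skb->destructor = NULL;
	skb->sk = NULL;
}

/* Models sock_hold(): returns false where the kernel would warn. */
static bool toy_hold(struct toy_sock *sk)
{
	if (sk->refcnt == 0)
		return false;
	sk->refcnt++;
	return true;
}

/* Case 1, unsafe order: orphan first, then adjust. */
static bool orphan_then_adjust(struct toy_skb *skb, int delta)
{
	struct toy_sock *sk = skb->sk;

	toy_orphan(skb);
	if (sk->freed)
		return false; /* adjusting now would touch a freed socket */
	sk->wmem_alloc += delta;
	return true;
}

/* Case 1, proposed order: adjust first, then orphan. */
static bool adjust_then_orphan(struct toy_skb *skb, int delta)
{
	struct toy_sock *sk = skb->sk;

	sk->wmem_alloc += delta; /* keeps the count above zero across orphan */
	toy_orphan(skb);
	return !sk->freed;
}

/* Case 2: re-taking a plain reference after the old destructor ran. */
static bool orphan_then_hold(struct toy_skb *skb)
{
	struct toy_sock *sk = skb->sk;

	toy_orphan(skb);
	return toy_hold(sk);
}
```

In this model, with wmem_alloc equal to the skb's truesize, orphan_then_adjust() reports the socket as freed, while adjust_then_orphan() leaves only the added delta charged and the socket alive. orphan_then_hold() fails when the destructor dropped the last reference, and the model offers no reordering that avoids it, which matches the open question about case 2).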