Subject: Re: [RFC PATCH net-next V2 0/6] XDP rx handler
From: Jason Wang <jasowang@redhat.com>
To: Alexei Starovoitov
Cc: netdev@vger.kernel.org, linux-kernel@vger.kernel.org, ast@kernel.org,
 daniel@iogearbox.net, jbrouer@redhat.com, mst@redhat.com
Date: Tue, 14 Aug 2018 15:59:01 +0800
Message-ID: <5de3d14f-f21a-c806-51f4-b5efd7d809b7@redhat.com>
In-Reply-To: <20180814003253.fkgl6lyklc7fclvq@ast-mbp>
References: <1534130250-5302-1-git-send-email-jasowang@redhat.com>
 <20180814003253.fkgl6lyklc7fclvq@ast-mbp>

On 2018-08-14 08:32, Alexei Starovoitov wrote:
> On Mon, Aug 13, 2018 at 11:17:24AM +0800, Jason Wang wrote:
>> Hi:
>>
>> This series tries to implement XDP support for rx handlers. This
>> would be useful for doing native XDP on stacked devices like
>> macvlan, bridge or even bond.
>>
>> The idea is simple: let the stacked device register an XDP rx
>> handler. When the driver returns XDP_PASS, it calls a new helper,
>> xdp_do_pass(), which tries to pass the XDP buff to the XDP rx
>> handler directly. The XDP rx handler may then decide how to
>> proceed: it can consume the buff, ask the driver to drop the
>> packet, or ask the driver to fall back to the normal skb path.
>>
>> A sample XDP rx handler was implemented for macvlan, and virtio-net
>> (mergeable buffer case) was converted to call xdp_do_pass() as an
>> example. For ease of comparison, generic XDP support for rx
>> handlers was also implemented.
>>
>> Compared to skb mode XDP on macvlan, native XDP on macvlan
>> (XDP_DROP) shows about 83% improvement.
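
To make the flow described above concrete, here is a minimal
driver-side sketch. Only the helper name xdp_do_pass() comes from the
series; the XDP_RX_HANDLER_* return-code names, the helper's exact
signature and the two driver helpers are assumptions for illustration:

	enum xdp_rx_handler_result {
		XDP_RX_HANDLER_CONSUMED, /* rx handler now owns the buffer */
		XDP_RX_HANDLER_DROP,     /* rx handler asks the driver to drop it */
		XDP_RX_HANDLER_FALLBACK, /* continue on the normal skb path */
	};

	/* Driver rx path, after the driver's own XDP program returned
	 * XDP_PASS. */
	static void rx_after_xdp_pass(struct net_device *dev,
				      struct xdp_buff *xdp)
	{
		switch (xdp_do_pass(dev, xdp)) {
		case XDP_RX_HANDLER_CONSUMED:
			/* e.g. macvlan ran its program and kept the packet */
			break;
		case XDP_RX_HANDLER_DROP:
			drv_recycle_xdp_buff(dev, xdp);	/* hypothetical helper */
			break;
		case XDP_RX_HANDLER_FALLBACK:
		default:
			drv_build_skb_and_rx(dev, xdp);	/* hypothetical helper */
			break;
		}
	}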
> I'm missing the motivation for this.
> It seems performance of such solution is ~1M packet per second.

Note that this was measured with virtio-net, which is rather slow.

> What would be a real life use case for such feature ?

I did another run on top of a 10G mlx4 NIC with macvlan:

XDP_DROP on mlx4:    14.0 Mpps
XDP_DROP on macvlan: 10.05 Mpps

Perf shows that macvlan_hash_lookup() and the indirect call to
macvlan_handle_xdp() are the reasons for the drop. I think the numbers
are acceptable, and we can try more optimizations on top.

So the real-life use case is to have a fast XDP path for rx handler
based devices:

- For containers, we can run XDP on macvlan (~70% of wire speed).
  This allows container-specific policies.

- For VMs, we can implement a macvtap XDP rx handler on top. This
  allows us to forward packets to the VM without building an skb in
  the macvtap setup.

- The idea could be reused by other rx handler based devices like
  bridge; we may get an XDP fast-forwarding path for bridge.

> Another concern is that XDP users expect to get line rate performance
> and native XDP delivers it. 'generic XDP' is a fallback only
> mechanism to operate on NICs that don't have native XDP yet.

So I can replace the generic XDP TX routine with a native one for
macvlan.

> Toshiaki's veth XDP work fits XDP philosophy and allows
> high speed networking to be done inside containers after veth.
> It's trying to get to line rate inside container.

This is one of the goals of this series as well. I agree the veth XDP
work looks pretty good, but I believe it only works for a specific
setup, since it depends on XDP_REDIRECT, which is supported by only a
few drivers (and there is no VF driver support). And to make it work
for an end user, the XDP program still needs logic like a hash (map)
lookup to determine the destination veth; see the sketch after this
message.

> This XDP rx handler stuff is destined to stay at 1Mpps speeds forever
> and the users will get confused with forever slow modes of XDP.
>
> Please explain the problem you're trying to solve.
> "look, here I can do XDP on top of macvlan" is not an explanation of
> the problem.
>
> Thanks
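
As a reference for the veth/XDP_REDIRECT point above, the per-packet
destination lookup looks roughly like the following. This is a minimal
sketch using the standard devmap plus the bpf_redirect_map() helper;
the map name, its size, and the fixed key (where a real program would
hash the packet headers) are illustrative only, and the devmap must be
populated with destination ifindexes from user space:

	#include <linux/bpf.h>
	#include "bpf_helpers.h" /* SEC(), bpf_redirect_map();
				  * from tools/testing/selftests/bpf */

	/* ifindexes of the destination veth devices, filled in from
	 * user space */
	struct bpf_map_def SEC("maps") tx_ports = {
		.type        = BPF_MAP_TYPE_DEVMAP,
		.key_size    = sizeof(__u32),
		.value_size  = sizeof(__u32),
		.max_entries = 64,
	};

	SEC("xdp")
	int xdp_pick_dest(struct xdp_md *ctx)
	{
		/* A real program would hash the packet headers to pick
		 * the destination; a fixed key keeps the sketch short. */
		__u32 key = 0;

		/* Redirect to tx_ports[key]; aborts if the entry is
		 * missing. */
		return bpf_redirect_map(&tx_ports, key, 0);
	}

	char _license[] SEC("license") = "GPL";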