Received: by 2002:ab2:788f:0:b0:1ee:8f2e:70ae with SMTP id b15csp524151lqi; Thu, 7 Mar 2024 04:35:10 -0800 (PST) X-Forwarded-Encrypted: i=2; AJvYcCVxJuXeEplIx2KRsQvweSL8QndQloIQExj8TnE+tKmKnXFcPU/6g4e21A6uLjnM93zJLV+e+xoOd6q5Sl8lnIuSqvajwjqWHlJnuNr6ig== X-Google-Smtp-Source: AGHT+IHeoa9F25VFaiYum+epb39T1/BKVkzsNgDptylir6k1dwijqz1t3zM4stxg/gRmnrXV3/F1 X-Received: by 2002:a54:419a:0:b0:3c2:c8a:fcf with SMTP id 26-20020a54419a000000b003c20c8a0fcfmr7415659oiy.54.1709814910262; Thu, 07 Mar 2024 04:35:10 -0800 (PST) Return-Path: Received: from sy.mirrors.kernel.org (sy.mirrors.kernel.org. [147.75.48.161]) by mx.google.com with ESMTPS id i15-20020a63220f000000b005d849f5352csi13851621pgi.795.2024.03.07.04.35.09 for (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Thu, 07 Mar 2024 04:35:10 -0800 (PST) Received-SPF: pass (google.com: domain of linux-kernel+bounces-95515-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.48.161 as permitted sender) client-ip=147.75.48.161; Authentication-Results: mx.google.com; arc=fail (body hash mismatch); spf=pass (google.com: domain of linux-kernel+bounces-95515-linux.lists.archive=gmail.com@vger.kernel.org designates 147.75.48.161 as permitted sender) smtp.mailfrom="linux-kernel+bounces-95515-linux.lists.archive=gmail.com@vger.kernel.org"; dmarc=fail (p=QUARANTINE sp=QUARANTINE dis=QUARANTINE) header.from=huawei.com Received: from smtp.subspace.kernel.org (wormhole.subspace.kernel.org [52.25.139.140]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by sy.mirrors.kernel.org (Postfix) with ESMTPS id 119C0B23BF5 for ; Thu, 7 Mar 2024 12:15:59 +0000 (UTC) Received: from localhost.localdomain (localhost.localdomain [127.0.0.1]) by smtp.subspace.kernel.org (Postfix) with ESMTP id EE99512B140; Thu, 7 Mar 2024 12:15:39 +0000 (UTC) Received: from szxga08-in.huawei.com (szxga08-in.huawei.com [45.249.212.255]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id CC444129A7D; Thu, 7 Mar 2024 12:15:36 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=45.249.212.255 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709813739; cv=none; b=KPTA8dK7cCg52Bmpo5MmX+2VioQPz3kiPv+WhJu8tFCWRgsXL+BwiwKYFzO59ypyyTGwSI3Ip/Xnk7cxB/XSdQsJ0TcQS7N28uxpcg+qU1A5ewvzFk2MbbWKYCY4eBfw4O4JE637bhSlMaDnkq50c8nfDWd+fnW4SpmZKYVL3PU= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1709813739; c=relaxed/simple; bh=33Uzy2N/rqj/7YQgQGuWOf/5Tv1toMJF0Qi0AcbhFw0=; h=Subject:To:CC:References:From:Message-ID:Date:MIME-Version: In-Reply-To:Content-Type; b=daHia+g5WZ7eODuIVAll5m8QwdGKqk00OLE0IvF8P6SZV7ETbWPIs2RfpKSz6iT7ggoEThqt1pPvRqiq3lSg0uJPztCHDmqMQnoSFsMn4ChBBZ9SznujZovIg76PZ12yj/2cv0iSKnwXhoarG6SjrIUj7nymYXGgpU8EkHdlcF8= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com; spf=pass smtp.mailfrom=huawei.com; arc=none smtp.client-ip=45.249.212.255 Authentication-Results: smtp.subspace.kernel.org; dmarc=pass (p=quarantine dis=none) header.from=huawei.com Authentication-Results: smtp.subspace.kernel.org; spf=pass smtp.mailfrom=huawei.com Received: from mail.maildlp.com (unknown [172.19.163.174]) by szxga08-in.huawei.com (SkyGuard) with ESMTP id 4Tr7Vw2yQsz1Q9QH; Thu, 7 Mar 2024 20:13:28 +0800 (CST) Received: from dggpemm500005.china.huawei.com (unknown [7.185.36.74]) by mail.maildlp.com (Postfix) with ESMTPS id 7C1EA14059C; Thu, 7 Mar 2024 20:15:27 +0800 (CST) Received: from [10.69.30.204] (10.69.30.204) by dggpemm500005.china.huawei.com (7.185.36.74) with Microsoft SMTP Server (version=TLS1_2, cipher=TLS_ECDHE_RSA_WITH_AES_256_GCM_SHA384) id 15.1.2507.35; Thu, 7 Mar 2024 20:15:26 +0800 Subject: Re: [RFC PATCH net-next v6 05/15] netdev: support binding dma-buf to netdevice To: Mina Almasry CC: , , , , , , , , , , , , , "David S. Miller" , Eric Dumazet , Jakub Kicinski , Paolo Abeni , Jonathan Corbet , Richard Henderson , Ivan Kokshaysky , Matt Turner , Thomas Bogendoerfer , "James E.J. Bottomley" , Helge Deller , Andreas Larsson , Jesper Dangaard Brouer , Ilias Apalodimas , Steven Rostedt , Masami Hiramatsu , Mathieu Desnoyers , Arnd Bergmann , Alexei Starovoitov , Daniel Borkmann , Andrii Nakryiko , Martin KaFai Lau , Eduard Zingerman , Song Liu , Yonghong Song , John Fastabend , KP Singh , Stanislav Fomichev , Hao Luo , Jiri Olsa , David Ahern , Willem de Bruijn , Shuah Khan , Sumit Semwal , =?UTF-8?Q?Christian_K=c3=b6nig?= , Pavel Begunkov , David Wei , Jason Gunthorpe , Shailend Chand , Harshitha Ramamurthy , Shakeel Butt , Jeroen de Borst , Praveen Kaligineedi , Willem de Bruijn , Kaiyuan Zhang References: <20240305020153.2787423-1-almasrymina@google.com> <20240305020153.2787423-6-almasrymina@google.com> From: Yunsheng Lin Message-ID: <76266a89-8ec1-6a4c-716b-da422f0b2cd5@huawei.com> Date: Thu, 7 Mar 2024 20:15:26 +0800 User-Agent: Mozilla/5.0 (Windows NT 10.0; WOW64; rv:52.0) Gecko/20100101 Thunderbird/52.2.0 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 In-Reply-To: Content-Type: text/plain; charset="utf-8" Content-Language: en-US Content-Transfer-Encoding: 7bit X-ClientProxiedBy: dggems705-chm.china.huawei.com (10.3.19.182) To dggpemm500005.china.huawei.com (7.185.36.74) On 2024/3/7 6:10, Mina Almasry wrote: .. >>>>> +static int netdev_restart_rx_queue(struct net_device *dev, int rxq_idx) >>>>> +{ >>>>> + void *new_mem; >>>>> + void *old_mem; >>>>> + int err; >>>>> + >>>>> + if (!dev || !dev->netdev_ops) >>>>> + return -EINVAL; >>>>> + >>>>> + if (!dev->netdev_ops->ndo_queue_stop || >>>>> + !dev->netdev_ops->ndo_queue_mem_free || >>>>> + !dev->netdev_ops->ndo_queue_mem_alloc || >>>>> + !dev->netdev_ops->ndo_queue_start) >>>>> + return -EOPNOTSUPP; >>>>> + >>>>> + new_mem = dev->netdev_ops->ndo_queue_mem_alloc(dev, rxq_idx); >>>>> + if (!new_mem) >>>>> + return -ENOMEM; >>>>> + >>>>> + err = dev->netdev_ops->ndo_queue_stop(dev, rxq_idx, &old_mem); >>>>> + if (err) >>>>> + goto err_free_new_mem; >>>>> + >>>>> + err = dev->netdev_ops->ndo_queue_start(dev, rxq_idx, new_mem); >>>>> + if (err) >>>>> + goto err_start_queue; >>>>> + >>>>> + dev->netdev_ops->ndo_queue_mem_free(dev, old_mem); >>>>> + >>>>> + return 0; >>>>> + >>>>> +err_start_queue: >>>>> + dev->netdev_ops->ndo_queue_start(dev, rxq_idx, old_mem); >>>> >>>> It might worth mentioning why queue start with old_mem will always >>>> success here as the return value seems to be ignored here. >>>> >>> >>> So the old queue, we stopped it, and if we fail to bring up the new >>> queue, then we want to start the old queue back up to get the queue >>> back to a workable state. >>> >>> I don't see what we can do to recover if restarting the old queue >>> fails. Seems like it should be a requirement that the driver tries as >>> much as possible to keep the old queue restartable. >> >> Is it possible that we may have the 'old_mem' leaking if the driver >> fails to restart the old queue? how does the driver handle the >> firmware cmd failure for ndo_queue_start()? it seems a little >> tricky to implement it. >> > > I'm not sure what we can do to meaningfully recover from failure to > restarting the old queue, except log it so the error is visible. In > theory because we have not modifying any queue configurations > restarting it would be straight forward, but since it's dealing with > hardware then any failures are possible. Yes, we may need to have a clear semantics of how should the driver implement the above interface, for example if the driver should free the memory when fail to start a queue or the driver should restart the queue when fail to stop a queue? Otherwise we may have different driver implementing different behavior. From the disscusion you mentioned below, does it make senses to modeling rdma subsystem by using create_queue/modify_queue/destroy_queue semantics instead? > >>> >>> I can improve this by at least logging or warning if restarting the >>> old queue fails. >> >> Also the semantics of the above function seems odd that it is not >> only restarting rx queue, but also freeing and allocating memory >> despite the name only suggests 'restart', I am a litte afraid that >> it may conflict with future usecae when user only need the >> 'restart' part, perhaps rename it to a more appropriate name. >> > > Oh, what we want here is just the 'restart' part. However, Jakub > mandates that if you restart a queue (or a driver), you do it like > this, hence the slightly more complicated implementation. > > https://patchwork.kernel.org/project/netdevbpf/patch/20231106024413.2801438-13-almasrymina@google.com/#25590262 > https://lore.kernel.org/netdev/20230815171638.4c057dcd@kernel.org/ Thanks for the link. I like david's idea of "a more generic design where H/W queues are created and destroyed - e.g., queues unique to a process which makes the cleanup so much easier." , but it seems it is a lot of work for networking to implement that for now. >