Received: by 2002:a25:8b91:0:0:0:0:0 with SMTP id j17csp4083307ybl; Mon, 27 Jan 2020 16:18:21 -0800 (PST) X-Google-Smtp-Source: APXvYqzGj6ijlOVkgz+OVKmVgLr5QYPd/PvNrcyf6wPl6BOU/oodfeae3wyJ2hG+O8ekpjzXqF1+ X-Received: by 2002:aca:1708:: with SMTP id j8mr1204937oii.166.1580170701533; Mon, 27 Jan 2020 16:18:21 -0800 (PST) ARC-Seal: i=1; a=rsa-sha256; t=1580170701; cv=none; d=google.com; s=arc-20160816; b=hPtv4ogjiKDd0twCCe3h83MMvBku5snZV2qbNd85Ik8iOR3Iybq4hZzClBMbLK2S/h wqbt2U2TRSe+UVfTwvtHFLaYZEjHEL8noY5Nq0b2fwwtIdxMm5JqV4bqPDXc2xEoCldu 6al60YaeT44sSqBqvDalG8R+p4xaHdvqh9/OTkoIqdRkkEeMDBPRvnEsgJNoUvEAlw0A LSIBoGLaRU34dPuBH5eNXkh/RpjROsOfj5O/FoaW4ZHgIJiZKwV0p1IDzWwHa2uQAQsL Qo7uG0GUBNybN+6oOSp9WaPlaVd4lkTHK+kvnqxx7RkspgsTwYnrhXFyFBUnhf723p7Q t6lw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding:mime-version :references:in-reply-to:message-id:date:subject:cc:to:from :dkim-signature; bh=N6+HRSgh5NtNpvQxQs/5HFHZ5OA3rO0AVcskLic05y0=; b=JiqyyrBYfFDa2tYD4Fg5M/fi7Cl33Vr1cb4rBQ7XULbQa3+LdG+dkEwXasXtA6YBM6 qke5ZqacSRh3wIvf9Q/r8jdnNU312mlyA6y+NFzAKzggEK+GEulsMjg3bsPBgvbCtLlS sc8Opr9HuzkjAG18PktZcXMzp+xYRLEJwJJJs29pSRamU14mIoMwBPOyT4Nl48tySRAB BkRkMIhD2EhK/NVzfAN9e2fzxbhepzC45cZpM2zm1E9Fu5h3BMPAxjGcQFs/nP/tMrBU 5cLtz4e9YWPjIMLFi/Ew093r0L3VQBe0AKAf9vMjEiq77tR+nlEtttTVGE8omNLy+9AI sJ1A== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="nO/vop58"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id q26si4160823oij.38.2020.01.27.16.18.00; Mon, 27 Jan 2020 16:18:21 -0800 (PST) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@gmail.com header.s=20161025 header.b="nO/vop58"; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org; dmarc=pass (p=NONE sp=QUARANTINE dis=NONE) header.from=gmail.com Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S1726612AbgA1AQh (ORCPT + 99 others); Mon, 27 Jan 2020 19:16:37 -0500 Received: from mail-wr1-f68.google.com ([209.85.221.68]:40752 "EHLO mail-wr1-f68.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1726164AbgA1AQh (ORCPT ); Mon, 27 Jan 2020 19:16:37 -0500 Received: by mail-wr1-f68.google.com with SMTP id c14so13939244wrn.7; Mon, 27 Jan 2020 16:16:35 -0800 (PST) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=N6+HRSgh5NtNpvQxQs/5HFHZ5OA3rO0AVcskLic05y0=; b=nO/vop58ef4AEzY3z3aCWmEFpkVvaxoRW++lw/uTnGovXQm2a+DCZU+NgN1H6Lrx84 1bRw0WGZepOpSZ/He2ymqzEqoiAyP9xupEa+6NTfZVDRHFawjpOe0MtKr1mUGDmyTYxI s01w8iugJXmcSHJHXFeVBNUfiLFqmoIK1AQlROjq+DHImoZbQmrGbvWlAz9oresuy3f/ sWLxOq1yA1M9iS15+KdKbFPC7MTaLCfroF0KM7ecuEopY1CuisRP3+ao2mLc4AgwGt+z 0YDZVwZKd+HzTPWqY95JmbVQ6fR7oJbMHluBTzFfdWlXGmVlEB3PNLtO+FCLbRFIOpvb hFJA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=N6+HRSgh5NtNpvQxQs/5HFHZ5OA3rO0AVcskLic05y0=; b=BuauZ4PNj7yR8EScXNlqbwdx/hztsKZ9yqrqcfj/KGf8B8HkA9bgtKBkztiRB7lJrS OC28vRUxaGqg6zIPWVjlE7K3LivCDhjE0a3Xsv6BVEoQNcv7cbyNEdRFeGJvcQiVv5zL w1P2xq6JgtdWdV5Z0m1hQDEEsZKItMgnenAmkq6WNVsVAtRc2P0i0x3yeQ5kYHyXxgII 5sq6o6s5RANzajUYWcT6H0YjHhlJpr5ScEaiC6qRADV/E9mLfxWjuTE7tpI6elIsrXPf rEb+KBW/haR04mUHo7LWYzl9E71Dvg5SU2cOmDaN0U7oMALer7G4HikTOGqtcIcSj2cT VHZw== X-Gm-Message-State: APjAAAXovjV6dZJBKjoVNbd68naoFlN4ZniMLQTtk1zpCrFD9zhXzfnI coBIyywDGYvDGeafrCRrGAQ= X-Received: by 2002:adf:9c8a:: with SMTP id d10mr25202163wre.156.1580170594577; Mon, 27 Jan 2020 16:16:34 -0800 (PST) Received: from localhost.localdomain ([109.126.145.157]) by smtp.gmail.com with ESMTPSA id z21sm638426wml.5.2020.01.27.16.16.32 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 27 Jan 2020 16:16:34 -0800 (PST) From: Pavel Begunkov To: Jens Axboe , io-uring@vger.kernel.org, linux-kernel@vger.kernel.org Cc: Daurnimator Subject: [PATCH v2 2/2] io_uring: add io-wq workqueue sharing Date: Tue, 28 Jan 2020 03:15:48 +0300 Message-Id: X-Mailer: git-send-email 2.24.0 In-Reply-To: References: MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org If IORING_SETUP_ATTACH_WQ is set, it expects wq_fd in io_uring_params to be a valid io_uring fd io-wq of which will be shared with the newly created io_uring instance. If the flag is set but it can't share io-wq, it fails. This allows creation of "sibling" io_urings, where we prefer to keep the SQ/CQ private, but want to share the async backend to minimize the amount of overhead associated with having multiple rings that belong to the same backend. Reported-by: Jens Axboe Reported-by: Daurnimator Signed-off-by: Pavel Begunkov --- fs/io_uring.c | 68 +++++++++++++++++++++++++++-------- include/uapi/linux/io_uring.h | 4 ++- 2 files changed, 57 insertions(+), 15 deletions(-) diff --git a/fs/io_uring.c b/fs/io_uring.c index 5ec6428933c3..de1cb3135721 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -5692,11 +5692,56 @@ static void io_get_work(struct io_wq_work *work) refcount_inc(&req->refs); } +static int io_init_wq_offload(struct io_ring_ctx *ctx, + struct io_uring_params *p) +{ + struct io_wq_data data; + struct fd f; + struct io_ring_ctx *ctx_attach; + unsigned int concurrency; + int ret = 0; + + data.user = ctx->user; + data.get_work = io_get_work; + data.put_work = io_put_work; + + if (!(p->flags & IORING_SETUP_ATTACH_WQ)) { + /* Do QD, or 4 * CPUS, whatever is smallest */ + concurrency = min(ctx->sq_entries, 4 * num_online_cpus()); + + ctx->io_wq = io_wq_create(concurrency, &data); + if (IS_ERR(ctx->io_wq)) { + ret = PTR_ERR(ctx->io_wq); + ctx->io_wq = NULL; + } + return ret; + } + + f = fdget(p->wq_fd); + if (!f.file) + return -EBADF; + + if (f.file->f_op != &io_uring_fops) { + ret = -EOPNOTSUPP; + goto out_fput; + } + + ctx_attach = f.file->private_data; + /* @io_wq is protected by holding the fd */ + if (!io_wq_get(ctx_attach->io_wq, &data)) { + ret = -EINVAL; + goto out_fput; + } + + ctx->io_wq = ctx_attach->io_wq; +out_fput: + fdput(f); + return ret; +} + static int io_sq_offload_start(struct io_ring_ctx *ctx, struct io_uring_params *p) { - struct io_wq_data data; - unsigned concurrency; int ret; init_waitqueue_head(&ctx->sqo_wait); @@ -5740,18 +5785,9 @@ static int io_sq_offload_start(struct io_ring_ctx *ctx, goto err; } - data.user = ctx->user; - data.get_work = io_get_work; - data.put_work = io_put_work; - - /* Do QD, or 4 * CPUS, whatever is smallest */ - concurrency = min(ctx->sq_entries, 4 * num_online_cpus()); - ctx->io_wq = io_wq_create(concurrency, &data); - if (IS_ERR(ctx->io_wq)) { - ret = PTR_ERR(ctx->io_wq); - ctx->io_wq = NULL; + ret = io_init_wq_offload(ctx, p); + if (ret) goto err; - } return 0; err: @@ -6577,7 +6613,11 @@ static long io_uring_setup(u32 entries, struct io_uring_params __user *params) if (p.flags & ~(IORING_SETUP_IOPOLL | IORING_SETUP_SQPOLL | IORING_SETUP_SQ_AFF | IORING_SETUP_CQSIZE | - IORING_SETUP_CLAMP)) + IORING_SETUP_CLAMP | IORING_SETUP_ATTACH_WQ)) + return -EINVAL; + + /* wq_fd isn't valid without ATTACH_WQ being set */ + if (!(p.flags & IORING_SETUP_ATTACH_WQ) && p.wq_fd) return -EINVAL; ret = io_uring_create(entries, &p); diff --git a/include/uapi/linux/io_uring.h b/include/uapi/linux/io_uring.h index 9988e82f858b..e067b92af5ad 100644 --- a/include/uapi/linux/io_uring.h +++ b/include/uapi/linux/io_uring.h @@ -75,6 +75,7 @@ enum { #define IORING_SETUP_SQ_AFF (1U << 2) /* sq_thread_cpu is valid */ #define IORING_SETUP_CQSIZE (1U << 3) /* app defines CQ size */ #define IORING_SETUP_CLAMP (1U << 4) /* clamp SQ/CQ ring sizes */ +#define IORING_SETUP_ATTACH_WQ (1U << 5) /* attach to existing wq */ enum { IORING_OP_NOP, @@ -183,7 +184,8 @@ struct io_uring_params { __u32 sq_thread_cpu; __u32 sq_thread_idle; __u32 features; - __u32 resv[4]; + __u32 wq_fd; + __u32 resv[3]; struct io_sqring_offsets sq_off; struct io_cqring_offsets cq_off; }; -- 2.24.0