Received: by 2002:a25:d7c1:0:0:0:0:0 with SMTP id o184csp4124698ybg; Fri, 25 Oct 2019 13:43:19 -0700 (PDT) X-Google-Smtp-Source: APXvYqxpyOH6x+e48aZV0BoME0oVlpMFszi+TyE/1Vg6UqoUISIli0unZeDpxQwgl2KtnnUI1yLE X-Received: by 2002:a17:906:80c1:: with SMTP id a1mr5385956ejx.37.1572036199588; Fri, 25 Oct 2019 13:43:19 -0700 (PDT) ARC-Seal: i=1; a=rsa-sha256; t=1572036199; cv=none; d=google.com; s=arc-20160816; b=GCemtUJn1+E7YOi5jTuTMQ4I5Dr4i4yEne1d7HOEw1YvfctczL3kjoo5DLC11F1Y+7 kGqY2NemxTImauZDKj10v9Ci65DnzeZCxNAH6l60R5yk1km1jNnP6ZeWeJH3amAwYoDn SXfy9w6AGS99O3oFnXRlVxxPeghnGBbN934zlYE35AZ5EFqNN/F3PaqcvQ1LtZsZXTLr QelwhbL7xaSleG85whiVy9sZNylTQzOsJbf9tbOlizF4t6GY8m5LtMVqk9nmx/LJmY11 1/riAsUASWzhs4N2fVRmm1z/fbx43At9+8CybmkEfQthKk0WM0lmeB/r+3njI9ErXQI6 laNw== ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=google.com; s=arc-20160816; h=list-id:precedence:sender:content-transfer-encoding :content-language:in-reply-to:mime-version:user-agent:date :message-id:from:references:to:subject:dkim-signature; bh=epIZn6+Wd30ec+WMFBH2/DUm6JPwiv9ffXe7aBYJZg0=; b=kQU1DTRDVqQuimnOX5Gh+xlMSVK1VL5xzVngboCqWxH3CutOeopDapp+KCdzy+k+qv kBjrhfnRY029MJJBaTjSvxcq5L04Hya2Af8uEmoX8CzA3JmZE8qPHaJt+ryThjpBmxvK /8zR2P03jM/THfamIYoqa1SSs031H8hpin1VRY5eE69GUmdIhcAAJvDoTz/itb+1pmNs wgRDqUB/nZSqVnOMo8f5qfg31SPclWGS40jGoYPvTwJNdt3FWJI/oJEHUssgp11Uotu2 Z98h2oMQVWNKnsWyS/Y5+o/zjaW21EL533yu5C4cuvVt0ySrvZifa4u1rdNydV2xL1bm qFYw== ARC-Authentication-Results: i=1; mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=DgVFp0vM; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Return-Path: Received: from vger.kernel.org (vger.kernel.org. [209.132.180.67]) by mx.google.com with ESMTP id gz7si1890950ejb.271.2019.10.25.13.42.55; Fri, 25 Oct 2019 13:43:19 -0700 (PDT) Received-SPF: pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) client-ip=209.132.180.67; Authentication-Results: mx.google.com; dkim=pass header.i=@kernel-dk.20150623.gappssmtp.com header.s=20150623 header.b=DgVFp0vM; spf=pass (google.com: best guess record for domain of linux-kernel-owner@vger.kernel.org designates 209.132.180.67 as permitted sender) smtp.mailfrom=linux-kernel-owner@vger.kernel.org Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S2503829AbfJYQDZ (ORCPT + 99 others); Fri, 25 Oct 2019 12:03:25 -0400 Received: from mail-io1-f66.google.com ([209.85.166.66]:34542 "EHLO mail-io1-f66.google.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S2503806AbfJYQDZ (ORCPT ); Fri, 25 Oct 2019 12:03:25 -0400 Received: by mail-io1-f66.google.com with SMTP id q1so3031342ion.1 for ; Fri, 25 Oct 2019 09:03:25 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=kernel-dk.20150623.gappssmtp.com; s=20150623; h=subject:to:references:from:message-id:date:user-agent:mime-version :in-reply-to:content-language:content-transfer-encoding; bh=epIZn6+Wd30ec+WMFBH2/DUm6JPwiv9ffXe7aBYJZg0=; b=DgVFp0vMHDg+9Ok10nyLdQenddZHB7FXCw6Phi4otowCPa3Wt/J0x7vj92lnjgshQH GQsNXJxAxMQ2gLi6vad0xrN6GHWz9qPZUUUJn6pmjNsbnKMUz75mJ1wO6dnLGgh2D4p4 bT9kR2zBcigfYI4VlCwjwoe6JSLPEAUPcmoFnhx1NL9Sz75GCAzfKdOqLxS9TYWkzjR1 1AeLarLvedliotQLMmGygVKiKMEv8MPeVOvaN9dkU2huCAt7YyHOncKEA/79byV4eo2+ 10MUPTtCbka69ZfsSnPmZhwh+ui0x3clFEQsbMHa6kPYjNXCm3tHYVHgtej+kfVsrKWr Qq5A== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:subject:to:references:from:message-id:date :user-agent:mime-version:in-reply-to:content-language :content-transfer-encoding; bh=epIZn6+Wd30ec+WMFBH2/DUm6JPwiv9ffXe7aBYJZg0=; b=lN5MKy/bvZ8iaoytNvQ8PBnvkXkw1xLz+TnhYfFqfSPcmqeNuLQoz3d83OVUMinZBf kZ/ZUu0lefumXyvEfiX4mfDnuhPfkUV6FuxK5wcNDRYaxxhGGrydI0FVHGh/oc/jwIuP lZWXfljUBI26yIKOdPXreCyolDxmiBwXobwC+jAfFIOQ/2Rmt7OqOkSw80XpyL9wVVbO jaz4tSpbISkbxcUR4pS6d2CCBJqdY0I0H7xUg/It3T22lfoIJDl3Fi6gCECWbqD3smny zG0tooiYZw/1XgEwrgcpG7J8DxRPed8//go+vBiAVcXU2X+HruANEJD+hMZDoHw7b6/T 6JSg== X-Gm-Message-State: APjAAAW/eH20uRyJVEf4o2DoefWZaWQcrMo0XrJF7aVypRYJru8mPztj hPz3kaqG5DRerGbZWGl/6AZxt+Qg9am5gg== X-Received: by 2002:a6b:b54a:: with SMTP id e71mr4579239iof.132.1572019403834; Fri, 25 Oct 2019 09:03:23 -0700 (PDT) Received: from [192.168.1.159] ([65.144.74.34]) by smtp.gmail.com with ESMTPSA id o25sm379302ili.51.2019.10.25.09.03.22 (version=TLS1_2 cipher=ECDHE-RSA-AES128-GCM-SHA256 bits=128/128); Fri, 25 Oct 2019 09:03:22 -0700 (PDT) Subject: Re: [BUG] io_uring: defer logic based on shared data To: Pavel Begunkov , linux-block@vger.kernel.org, linux-kernel@vger.kernel.org References: <5badf1c0-9a7d-0950-2943-ff8db33e0929@gmail.com> From: Jens Axboe Message-ID: Date: Fri, 25 Oct 2019 10:03:21 -0600 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:60.0) Gecko/20100101 Thunderbird/60.9.0 MIME-Version: 1.0 In-Reply-To: <5badf1c0-9a7d-0950-2943-ff8db33e0929@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-kernel@vger.kernel.org On 10/25/19 3:55 AM, Pavel Begunkov wrote: > I found 2 problems with __io_sequence_defer(). > > 1. it uses @sq_dropped, but doesn't consider @cq_overflow > 2. @sq_dropped and @cq_overflow are write-shared with userspace, so > it can be maliciously changed. > > see sent liburing test (test/defer *_hung()), which left an unkillable > process for me OK, how about the below. I'll split this in two, as it's really two separate fixes. diff --git a/fs/io_uring.c b/fs/io_uring.c index 5d10984381cf..5d9d960c1c17 100644 --- a/fs/io_uring.c +++ b/fs/io_uring.c @@ -191,6 +191,7 @@ struct io_ring_ctx { unsigned sq_entries; unsigned sq_mask; unsigned sq_thread_idle; + unsigned cached_sq_dropped; struct io_uring_sqe *sq_sqes; struct list_head defer_list; @@ -208,6 +209,7 @@ struct io_ring_ctx { struct { unsigned cached_cq_tail; + unsigned cached_cq_overflow; unsigned cq_entries; unsigned cq_mask; struct wait_queue_head cq_wait; @@ -419,7 +421,8 @@ static struct io_ring_ctx *io_ring_ctx_alloc(struct io_uring_params *p) static inline bool __io_sequence_defer(struct io_ring_ctx *ctx, struct io_kiocb *req) { - return req->sequence != ctx->cached_cq_tail + ctx->rings->sq_dropped; + return req->sequence != ctx->cached_cq_tail + ctx->cached_sq_dropped + + ctx->cached_cq_overflow; } static inline bool io_sequence_defer(struct io_ring_ctx *ctx, @@ -590,9 +593,8 @@ static void io_cqring_fill_event(struct io_ring_ctx *ctx, u64 ki_user_data, WRITE_ONCE(cqe->res, res); WRITE_ONCE(cqe->flags, 0); } else { - unsigned overflow = READ_ONCE(ctx->rings->cq_overflow); - - WRITE_ONCE(ctx->rings->cq_overflow, overflow + 1); + ctx->cached_cq_overflow++; + WRITE_ONCE(ctx->rings->cq_overflow, ctx->cached_cq_overflow); } } @@ -2601,7 +2603,8 @@ static bool io_get_sqring(struct io_ring_ctx *ctx, struct sqe_submit *s) /* drop invalid entries */ ctx->cached_sq_head++; - rings->sq_dropped++; + ctx->cached_sq_dropped++; + WRITE_ONCE(rings->sq_dropped, ctx->cached_sq_dropped); return false; } @@ -2685,6 +2688,7 @@ static int io_sq_thread(void *data) timeout = inflight = 0; while (!kthread_should_park()) { + unsigned prev_cq, cur_cq; bool mm_fault = false; unsigned int to_submit; @@ -2767,8 +2771,12 @@ static int io_sq_thread(void *data) } to_submit = min(to_submit, ctx->sq_entries); + prev_cq = ctx->cached_cq_tail + ctx->cached_cq_overflow; inflight += io_submit_sqes(ctx, to_submit, cur_mm != NULL, mm_fault); + cur_cq = ctx->cached_cq_tail + ctx->cached_cq_overflow; + if ((ctx->flags & IORING_SETUP_IOPOLL) && (prev_cq != cur_cq)) + inflight -= cur_cq - prev_cq; /* Commit SQ ring head once we've consumed all SQEs */ io_commit_sqring(ctx); -- Jens Axboe