Subject: Re: [PATCH BUGFIX 1/3] null_blk: set a separate timer for each command
To: Paolo Valente, Matias Bjørling, Arianna Avanzini
References: <1446474673-2566-1-git-send-email-paolo.valente@unimore.it>
 <1446474673-2566-2-git-send-email-paolo.valente@unimore.it>
CC: Akinobu Mita, "Luis R. Rodriguez", Ming Lei, Mike Krinkin
From: Jens Axboe
Message-ID: <56378BEE.2070907@fb.com>
Date: Mon, 2 Nov 2015 09:14:38 -0700
In-Reply-To: <1446474673-2566-2-git-send-email-paolo.valente@unimore.it>
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org

On 11/02/2015 07:31 AM, Paolo Valente wrote:
> For the Timer IRQ mode (i.e., when command completions are delayed),
> there is one timer for each CPU. Each of these timers
> . has a completion queue associated with it, containing all the
>   command completions to be executed when the timer fires;
> . is set, and a new completion-to-execute is inserted into its
>   completion queue, every time the dispatch code for a new command
>   happens to be executed on the CPU related to the timer.
>
> This implies that, if the dispatch of a new command happens to be
> executed on a CPU whose timer has already been set, but has not yet
> fired, then the timer is set again, to the completion time of the
> newly arrived command. When the timer eventually fires, all its queued
> completions are executed.
>
> This way of handling delayed command completions entails the following
> problem: if more than one command completion is inserted into the
> queue of a timer before the timer fires, then the expiration time for
> the timer is moved forward every time each of these completions is
> enqueued. As a consequence, only the last completion enqueued enjoys a
> correct execution time, while all previous completions are unjustly
> delayed until the last completion is executed (and at that time they
> are executed all together).
>
> Specifically, if all the above completions are enqueued almost at the
> same time, then the problem is negligible. On the opposite end, if
> every completion is enqueued a while after the previous completion was
> enqueued (in the extreme case, it is enqueued only right before the
> timer would have expired), then every enqueued completion, except for
> the last one, experiences an inflated delay, proportional to the number
> of completions enqueued after it. In the end, commands, and thus I/O
> requests, may be completed at an arbitrarily lower rate than the
> desired one.
>
> This commit addresses this issue by replacing per-CPU timers with
> per-command timers, i.e., by associating an individual timer with each
> command.

Functionally the patch looks fine. My only worry is that a timer per
command would be an unnecessary slowdown compared to pushing one timer
forward. The problem should be fixable by still doing that, just
maintaining next-expire instead. Maybe something that would still
roughly be precise enough, while still getting some completion batching
going?
Or maybe that would be slower, and the individual timers are still
better. Comments?

--
Jens Axboe
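
A rough way to compare the three timer strategies discussed in this
thread is a small Python model. This is a simulation sketch under
stated assumptions, not null_blk code: the function names, the
single-CPU view, the constant per-command delay, and the slack
parameter are all hypothetical, and next_expire_timer() is only one
possible reading of the "maintaining next-expire" suggestion.

```python
from collections import deque


def pushed_timer(arrivals, delay):
    """Model of the per-CPU scheme: one timer, re-armed on every enqueue.

    Returns (arrival, completion) pairs. Arrivals must be sorted.
    """
    done, queue, expires = [], [], None
    for t in arrivals:
        # If the timer expired before this arrival, it fires and drains the queue.
        if expires is not None and expires <= t:
            done.extend((a, expires) for a in queue)
            queue = []
        queue.append(t)
        expires = t + delay  # expiration is pushed forward to the newest command
    # Final firing for whatever is still queued.
    done.extend((a, expires) for a in queue)
    return done


def per_command_timers(arrivals, delay):
    """Model of the patch: every command gets its own timer."""
    return [(a, a + delay) for a in arrivals]


def next_expire_timer(arrivals, delay, slack=0):
    """Hypothetical reading of the 'next-expire' idea: one timer armed for the
    oldest pending deadline plus an assumed slack; when it fires, every
    command whose deadline has passed is completed in one batch.
    """
    done, pending = [], deque(arrivals)  # FIFO; constant delay keeps deadlines sorted
    while pending:
        fire = pending[0] + delay + slack
        while pending and pending[0] + delay <= fire:
            done.append((pending.popleft(), fire))
    return done


def latencies(completions):
    """Time each command waited between arrival and completion."""
    return [finished - arrived for arrived, finished in completions]
```

With arrivals at 0, 9, 18 and 27 and a delay of 10, the pushed timer
completes all four commands at time 37, so latencies(pushed_timer(...))
is [37, 28, 19, 10], while the other two models give 10 for every
command. A non-zero slack in next_expire_timer() trades up to that much
extra latency for batching of commands that arrive close together.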