Message-ID: <55493097.6040007@fb.com>
Date: Tue, 5 May 2015 15:05:27 -0600
From: Jens Axboe
To: Jeff Moyer
Subject: Re: [PATCH v2] Support for write stream IDs
References: <1430856181-19568-1-git-send-email-axboe@fb.com>

On 05/05/2015 02:51 PM, Jeff Moyer wrote:
> Jens Axboe writes:
>
>> Hi,
>>
>> Changes since the last posting:
>>
>> - Added a specific per-file fadvise setting. POSIX_FADV_STREAMID sets
>>   the inode and file stream ID, POSIX_FADV_STREAMID_FILE sets just the
>>   file stream ID.
>>
>> - Addressed review comments.
>>
>> I've since run some testing with write streams. Test case was a RocksDB
>> overwrite benchmark, using 3 billion keys of 400B in size (numbers set
>> use the full size of the device). WAL/LOG was assigned to stream 1, and
>> each RocksDB compaction level used a separate stream. With streams
>> enabled, user write to device writes (write amplification) was at 2.33.
>> Without streams, the write amplification was 3.05. That is roughly 20%
>> less written NAND, and the streams test subsequently also had 20%
>> higher throughput.
>>
>> Unless there are any grave concerns here, I'd like to merge this for
>> 4.2.
>
> I have a few concerns. You've added POSIX_FADV_* definitions that do
> not exist in the SUS/POSIX spec. Do we care? We (poor reviewers) still
> have no idea what the driver side of this will look like. Do streams
> need to be opened and closed? Is that going to be handled transparently
> by the kernel, or exposed to userspace? If in the kernel, where in the
> kernel? You've also added a user-visible api without cc-ing linux-api.

Whether this should be fadvise, fcntl, or something else, that's the
primary review concern.

The driver side depends on the driver! The kernel patches deal only with
ensuring that the stream information gets passed down. If the device
requires explicit stream open/close actions, then that needs to be
handled on the side. There's no reason to include the kernel in that,
the kernel doesn't care.

> My preference would be to wait for the spec to finalize before pushing
> in changes that depend on it.

I think you are mixing up the write streams with the NVMe proposal. The
two aren't necessarily connected, and the kernel parts don't really care
what the NVMe proposal ends up looking like. It's just an interface to
assign an ID, and a transport for passing that ID down to a driver.

--
Jens Axboe
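
The write-amplification figures quoted above can be checked with a little arithmetic. What follows is a minimal sketch, assuming write amplification means device (NAND) writes divided by user writes and that both benchmark runs issued the same volume of user writes. The function name is hypothetical and is not part of the patch set.

```python
def nand_write_reduction(wa_with_streams: float, wa_without_streams: float) -> float:
    """Fraction of NAND writes saved for an identical volume of user writes.

    Assumes write amplification = device writes / user writes, so with the
    user writes held equal the device writes scale directly with the factor.
    """
    if wa_with_streams <= 0 or wa_without_streams <= 0:
        raise ValueError("write amplification factors must be positive")
    return 1.0 - wa_with_streams / wa_without_streams


# Figures from the quoted RocksDB overwrite benchmark.
reduction = nand_write_reduction(2.33, 3.05)
print(f"{reduction:.1%} less NAND written")
```

Under these assumptions the result is a little under a quarter, which is in line with the "roughly 20%" stated in the message.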