Date: Tue, 25 Sep 2018 17:49:10 +1000
From: Dave Chinner
To: Jens Axboe
Cc: Christopher Lameter, Christoph Hellwig, Vitaly Kuznetsov, Ming Lei,
    linux-block, linux-mm, Linux FS Devel, "open list:XFS FILESYSTEM",
    Dave Chinner, Linux Kernel Mailing List, Ming Lei
Subject: Re: block: DMA alignment of IO buffer allocated from slab
Message-ID: <20180925074910.GB31060@dastard>
In-Reply-To: <1a5b255f-682e-783a-7f99-9d02e39c4af2@kernel.dk>

On Mon, Sep 24, 2018 at 12:09:37PM -0600, Jens Axboe wrote:
> On 9/24/18 12:00 PM, Christopher Lameter wrote:
> > On Mon, 24 Sep 2018, Jens Axboe wrote:
> >
> >> The situation is making me a little uncomfortable, though. If we export
> >> such a setting, we really should be honoring it...

That's what I said up front, but you replied to this with:

| I think this is all crazy talk. We've never done this, [...]

Now I'm not sure what you are saying we should do....

> > Various subsystems create custom slab arrays with their particular
> > alignment requirement for these allocations.
>
> Oh yeah, I think the solution is basic enough for XFS, for instance.
> They just have to error on the side of being cautious, by going full
> sector alignment for memory...

How does the filesystem find out about hardware alignment
requirements? Isn't probing through the block device to find out
about the request queue configurations considered a layering
violation?

What if sector alignment is not sufficient? And how would this work
if we start supporting sector sizes larger than page size? (which the
XFS buffer cache supports just fine, even if nothing else in Linux
does).

But even ignoring sector size > page size, implementing this
requires a bunch of new slab caches, especially for 64k page
machines because XFS supports sector sizes up to 32k. And every
other filesystem that uses sector sized buffers (e.g. HFS) would
have to do the same thing. Seems somewhat wasteful to require
everyone to implement their own aligned sector slab cache...

Perhaps we should take the filesystem out of this completely - maybe
the block layer could provide a generic "sector heap" and have all
filesystems that use sector sized buffers allocate from it.
e.g. something like

	mem = bdev_alloc_sector_buffer(bdev, sector_size)

That way we don't have to rely on filesystems knowing anything about
the alignment limitations of the devices or assumptions about DMA
to work correctly...

Cheers,

Dave.
-- 
Dave Chinner
david@fromorbit.com
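
A userspace sketch of the "sector heap" idea from the mail above, in C.
Only the name bdev_alloc_sector_buffer() and its (bdev, sector_size)
arguments come from the mail. struct bdev_model, its dma_alignment_mask
field, bdev_free_sector_buffer() and the use of posix_memalign() are
assumptions made for illustration; this is not an existing kernel
interface, and a real version would presumably sit on slab caches owned
by the block layer.

```c
/* Userspace model only: illustrates the alignment logic, not kernel code. */
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L
#endif
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical stand-in for a block device; not the kernel's struct. */
struct bdev_model {
	size_t dma_alignment_mask;	/* assumed: required alignment minus one */
};

static int is_pow2(size_t n)
{
	return n != 0 && (n & (n - 1)) == 0;
}

/*
 * Allocate one sector-sized buffer that satisfies both sector alignment
 * and the device's DMA alignment, so the caller needs to know neither.
 * Returns NULL on invalid input or allocation failure.
 */
void *bdev_alloc_sector_buffer(const struct bdev_model *bdev,
			       size_t sector_size)
{
	size_t dma_align, align;
	void *mem = NULL;

	if (!bdev || !is_pow2(sector_size))
		return NULL;

	dma_align = bdev->dma_alignment_mask + 1;
	if (!is_pow2(dma_align))
		return NULL;

	/* Both are powers of two, so the larger one satisfies both. */
	align = sector_size > dma_align ? sector_size : dma_align;

	/* posix_memalign() requires a multiple of sizeof(void *). */
	if (align < sizeof(void *))
		align = sizeof(void *);

	if (posix_memalign(&mem, align, sector_size) != 0)
		return NULL;

	/* Postcondition: the buffer honours the combined alignment. */
	assert(((uintptr_t)mem & (align - 1)) == 0);
	return mem;
}

/* Hypothetical counterpart that returns a buffer to the "heap". */
void bdev_free_sector_buffer(void *mem)
{
	free(mem);
}
```

In this model a filesystem would call bdev_alloc_sector_buffer(&dev, 4096)
and get back memory aligned to whichever is stricter, the sector size or
the device's DMA requirement, without probing the request queue itself.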