From: Yury Norov
CC: Catalin Marinas
Date: Wed, 7 Dec 2016 00:24:40 +0530
Subject: [Question] New mmap64 syscall?
Message-ID: <20161206185440.GA4654@yury-N73SV>
X-Mailing-List: linux-kernel@vger.kernel.org

Hi all,

(Sorry if there is a similar discussion and I missed it. I didn't find
anything on LKML from the last half year.)

In the aarch64/ilp32 discussion Catalin wondered why we don't pass the
offset in mmap() as a 64-bit value (in 2 registers if needed).
Looking at the kernel code I found that there's no generic interface for
it. But almost all architectures provide their own implementations, like
this:

SYSCALL_DEFINE6(mips_mmap, unsigned long, addr, unsigned long, len,
        unsigned long, prot, unsigned long, flags, unsigned long,
        fd, off_t, offset)
{
        unsigned long result;

        result = -EINVAL;
        if (offset & ~PAGE_MASK)
                goto out;

        result = sys_mmap_pgoff(addr, len, prot, flags, fd,
                                offset >> PAGE_SHIFT);

out:
        return result;
}

On the glibc side things are even worse. There's no mmap()
implementation that allows passing a 64-bit offset on a 32-bit
architecture. mmap64(), which is supposed to do this, is simply broken:

void *
__mmap64 (void *addr, size_t len, int prot, int flags, int fd, off64_t offset)
{
  [...]
  void *result;
  result = (void *)
    INLINE_SYSCALL (mmap2, 6, addr,
                    len, prot, flags, fd,
                    (off_t) (offset >> page_shift));
  return result;
}

It explicitly declares offset as a 64-bit value, but casts it to 32-bit
before passing it to the kernel, which looks wrong to me. Even if the
arch has a 64-bit off_t, like aarch64/ilp32, the truncation will still
take place because the offset is passed in a single register, which is
32-bit.

I see 3 solutions for my problem:
1. Reuse the aarch64/lp64 mmap code for ilp32 in glibc, but wrap offset
   with the SYSCALL_LL64() macro, which converts the offset to a pair
   for 32-bit ports. This is a simple but local solution, and most
   probably it's enough.
2. Add a new flag to mmap, like MAP_OFFSET_IN_PAIR. This will also work.
   The problem here is that there are too many arches that implement
   their own custom sys_mmap2(). And, of course, this type of flag
   looks ugly.
3. Introduce a new mmap64() syscall like this:
   sys_mmap64(void *addr, size_t len, int prot, int flags, int fd, struct off_pair *off);
   (The pointer is here because otherwise we would have 7 args if we
   simply passed off_hi and off_lo in registers.)
   With a new 64-bit interface we can deprecate mmap2() and generalize
   all implementations in the kernel.

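To make the truncation concrete, here is a minimal sketch in C. It is only a model, not glibc or kernel code: it assumes mmap2() counts the offset in 4 KiB units and that the syscall argument is 32 bits wide. The helper names and the lo/hi field layout of struct off_pair are hypothetical (the proposal above names the struct but not its fields).

```c
#include <assert.h>
#include <stdint.h>

/* Assumption for illustration: mmap2() takes the offset in 4 KiB units. */
#define MMAP2_SHIFT 12

/* Models "(off_t) (offset >> page_shift)" when the argument slot is 32-bit. */
static uint32_t pgoff_as_passed(uint64_t offset)
{
    /* mmap2() can only express offsets that are multiples of 4 KiB. */
    assert((offset & ((1u << MMAP2_SHIFT) - 1)) == 0);
    return (uint32_t)(offset >> MMAP2_SHIFT);
}

/* The byte offset the kernel would reconstruct from that 32-bit value. */
static uint64_t offset_as_seen(uint32_t pgoff)
{
    return (uint64_t)pgoff << MMAP2_SHIFT;
}

/* Hypothetical layout: the two 32-bit halves of a 64-bit file offset. */
struct off_pair {
    uint32_t lo;
    uint32_t hi;
};

/* Split a 64-bit offset into two register-sized halves. */
static struct off_pair split_offset(uint64_t offset)
{
    struct off_pair p = { (uint32_t)offset, (uint32_t)(offset >> 32) };
    return p;
}

/* Rebuild the 64-bit offset on the receiving side. */
static uint64_t join_offset(struct off_pair p)
{
    return ((uint64_t)p.hi << 32) | p.lo;
}
```

Under these assumptions an offset of 2^43 survives the round trip through pgoff_as_passed() and offset_as_seen(), while 2^44 silently comes back as 0. The split/join pair keeps the full 64 bits, which is the property options 1 and 3 rely on.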
I think we can discuss it because 64-bit is the default size for off_t
in all new 32-bit architectures, so a generic solution may make sense.

The last question here is how important it is to support offsets bigger
than 2^44 on 32-bit machines in practice. It may matter for ARM64
servers, which look like the main aarch64/ilp32 users. If not, we can
leave things as they are and do nothing.

Yury

On Mon, Dec 05, 2016 at 05:12:43PM +0000, Catalin Marinas wrote:
> On Fri, Oct 21, 2016 at 11:33:10PM +0300, Yury Norov wrote:
> > off_t is passed in register pair just like in aarch32.
> > In this patch corresponding aarch32 handlers are shared to
> > ilp32 code.
> [...]
> > +/*
> > + * Note: off_4k (w5) is always in units of 4K. If we can't do the
> > + * requested offset because it is not page-aligned, we return -EINVAL.
> > + */
> > +ENTRY(compat_sys_mmap2_wrapper)
> > +#if PAGE_SHIFT > 12
> > +        tst     w5, #~PAGE_MASK >> 12
> > +        b.ne    1f
> > +        lsr     w5, w5, #PAGE_SHIFT - 12
> > +#endif
> > +        b       sys_mmap_pgoff
> > +1:      mov     x0, #-EINVAL
> > +        ret
> > +ENDPROC(compat_sys_mmap2_wrapper)
>
> For compat sys_mmap2, the pgoff argument is in multiples of 4K. This was
> traditionally used for architectures where off_t is 32-bit to allow
> mapping files to 2^44.
>
> Since off_t is 64-bit with AArch64/ILP32, should we just pass the off_t
> as a 64-bit value in two different registers (w5 and w6)?
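For readers who do not read AArch64 assembly, the offset handling in the quoted wrapper can be modelled in C roughly as follows. This is a sketch under assumptions, not the kernel's code: page_shift is passed as a runtime parameter instead of the compile-time PAGE_SHIFT, and the function names are hypothetical.

```c
#include <assert.h>
#include <errno.h>
#include <stdint.h>

/*
 * Convert an mmap2() offset given in 4 KiB units into units of the
 * kernel page size. Returns -EINVAL if the offset is not page-aligned.
 */
static int64_t mmap2_pgoff(uint32_t off_4k, unsigned page_shift)
{
    assert(page_shift >= 12 && page_shift < 32);

    /* Mirrors the "#if PAGE_SHIFT > 12" block of the wrapper. */
    if (page_shift > 12) {
        /* Same value as "~PAGE_MASK >> 12": the sub-page bits of off_4k. */
        uint32_t sub_page_bits = (1u << (page_shift - 12)) - 1;

        if (off_4k & sub_page_bits)
            return -EINVAL;             /* tst + b.ne 1f */
        off_4k >>= page_shift - 12;     /* lsr w5, w5, #PAGE_SHIFT - 12 */
    }
    return off_4k;                      /* handed on to sys_mmap_pgoff */
}

/* Largest byte offset a 32-bit count of 4 KiB units can express. */
static uint64_t mmap2_max_offset(void)
{
    return (uint64_t)UINT32_MAX << 12;
}
```

With 64 KiB pages (page_shift of 16), an off_4k of 16 maps to page 1, while an off_4k of 1 is rejected because it points into the middle of a page. mmap2_max_offset() evaluates to 2^44 - 4096, which is where the 2^44 limit discussed in this thread comes from.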