From: Andy Lutomirski
Date: Mon, 22 Apr 2019 17:54:35 -0700
Subject: Re: [PATCH] x86_64: uninline TASK_SIZE
To: Alexey Dobriyan
Cc: Ingo Molnar, "H. Peter Anvin", Thomas Gleixner, Ingo Molnar, Borislav Petkov, LKML, X86 ML, Andy Lutomirski, Peter Zijlstra, Linus Torvalds, Al Viro

On Mon, Apr 22, 2019 at 3:09 PM Alexey Dobriyan wrote:
>
> On Mon, Apr 22, 2019 at 07:30:40AM -0700, Andy Lutomirski wrote:
> >
> >
> > > On Apr 22, 2019, at 3:34 AM, Ingo Molnar wrote:
> > >
> > >
> > > * Alexey Dobriyan wrote:
> > >
> > >>>>> +++ b/arch/x86/kernel/task_size_64.c
> > >>>>> @@ -0,0 +1,9 @@
> > >>>>> +#include
> > >>>>> +#include
> > >>>>> +#include
> > >>>>> +
> > >>>>> +unsigned long _task_size(void)
> > >>>>> +{
> > >>>>> + return test_thread_flag(TIF_ADDR32) ? IA32_PAGE_OFFSET : TASK_SIZE_MAX;
> > >>>>> +}
> > >>>>> +EXPORT_SYMBOL(_task_size);
> > >>>>
> > >>>> Good idea - but instead of adding yet another compilation unit, why not
> > >>>> stick _task_size() into arch/x86/kernel/process_64.c, which is the
> > >>>> canonical place for process management related arch functions?
> > >>>>
> > >>>> Thanks,
> > >>>>
> > >>>> Ingo
> > >>>
> > >>> Better yet... since TIF_ADDR32 isn't something that changes randomly,
> > >>> perhaps this should be a separate variable?
> > >>
> > >> Maybe. I only thought about putting every 32-bit related flag under
> > >> CONFIG_COMPAT to further eradicate bloat (and force everyone else to
> > >> keep an eye on it, ha-ha).
> > >
> > > Basically TIF_ADDR32 is only set for a task if set_personality_ia32() is
> > > called, which function is called in the following circumstances:
> > >
> > >  - arch/x86/ia32/ia32_aout.c:load_aout_binary()
> > >
> > >    This is in exec(), when a new binary is loaded for the current task,
> > >    via search_binary_handler() and exec_binprm(). Ordering is
> > >    synchronous, AFAICS there can be no race between TASK_SIZE users and
> > >    the set_personality_ia32() call which is always for the current task.
> > >
> > >  - in COMPAT_SET_PERSONALITY(), which through macro detours ends up being
> > >    in SET_PERSONALITY2(), which is used in fs/compat_binfmt_elf.c's
> > >    load_elf_binary(), used in a similar fashion in exec() as the AOUT
> > >    case above. One particular macro detour of note is that
> > >    fs/compat_binfmt_elf.c #includes fs/binfmt_elf.c and re-defines the
> > >    personality setting method to map to set_personality_ia32().
> > >
> > > When set_personality_ia32() is called then TIF_ADDR32 is set
> > > unconditionally, without any Kconfig variations.
> > >
> > > TIF_ADDR32 is cleared:
> > >
> > >  - In set_personality_64bit(), when a 64-bit binary is loaded via
> > >    fs/binfmt_elf.c.
> > >
> > >  - It also defaults to clear in the init task, which is inherited by the
> > >    initial kernel threads and any user-space task they might end up
> > >    executing.
> > >
> > > So the conclusion is that IMO we can safely put TASK_SIZE into a new
> > > thread_info()->task_size field, and:
> > >
> > >  - change ->task_size to the 32-bit address space in
> > >    set_personality_ia32()
> > >
> > >  - change ->task_size to the 64-bit address space in the init task and in
> > >    set_personality_64bit().
> > >
> > > This should cover it I think, unless I missed something.
> > >
> >
> > Are there really enough TASK_SIZE users to justify any of this?
>
> Saving 2KB on a defconfig is quite a lot.

Saving 2kB of text by adding 8 bytes to thread_info seems rather dubious to me.
You only need 256 tasks before you lose. My not-particularly-loaded laptop has 865 tasks right now.

As a general principle, the mere existence of TIF_ADDR32 is a bug. The value of that flag is *wrong* under the 32-bit variant of CRIU. How about instead making some more progress toward getting rid of dubious TASK_SIZE users? I'm working on a little series to get rid of most of them.

Meanwhile: it sure looks like a large fraction of the users are confused as to whether TASK_SIZE is the highest user address or the lowest non-user address.
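
For illustration only, here is a minimal userspace sketch of the two points in this reply. It is not kernel code and not part of the patch. The function names are hypothetical, and "2kB" is assumed to mean 2048 bytes, which is what the 256-task figure implies. The first helper works out the break-even task count; the other two show how the same limit value gives different answers depending on whether it is read as the lowest non-user address or the highest user address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper: number of tasks at which the extra per-task
 * bytes add up to the text that was saved. Beyond this count the
 * per-task field costs more memory than the uninlining saved.
 */
uint64_t break_even_tasks(uint64_t text_saved_bytes, uint64_t bytes_per_task)
{
    assert(bytes_per_task != 0);
    return text_saved_bytes / bytes_per_task;
}

/* Reading 1: limit is the lowest non-user address (exclusive bound). */
bool is_user_addr_exclusive(uint64_t addr, uint64_t limit)
{
    return addr < limit;
}

/* Reading 2: limit is the highest user address (inclusive bound). */
bool is_user_addr_inclusive(uint64_t addr, uint64_t limit)
{
    return addr <= limit;
}
```

With these definitions, break_even_tasks(2048, 8) evaluates to 256, and the two address checks disagree only when addr equals limit, which is exactly the off-by-one that callers mixing the two readings would hit.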