Date: Thu, 1 Feb 2018 14:40:26 +0100
From: Michal Hocko
To: Anshuman Khandual
Cc: Michael Ellerman, "akpm@linux-foundation.org",
 mm-commits@vger.kernel.org, linux-kernel@vger.kernel.org,
 linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-next@vger.kernel.org, sfr@canb.auug.org.au, broonie@kernel.org,
 Kees Cook, Linus Torvalds
Subject: Re: ppc elf_map breakage with MAP_FIXED_NOREPLACE
Message-ID: <20180201134026.GK21609@dhcp22.suse.cz>
References: <20180126140415.GD5027@dhcp22.suse.cz>
 <15da8c87-e6db-13aa-01c8-a913656bfdb6@linux.vnet.ibm.com>
 <6db9b33d-fd46-c529-b357-3397926f0733@linux.vnet.ibm.com>
 <20180129132235.GE21609@dhcp22.suse.cz>
 <87k1w081e7.fsf@concordia.ellerman.id.au>
 <20180130094205.GS21609@dhcp22.suse.cz>
 <5eccdc1b-6a10-b48a-c63f-295f69473d97@linux.vnet.ibm.com>
 <20180131131937.GA6740@dhcp22.suse.cz>
 <20180201131007.GJ21609@dhcp22.suse.cz>
MIME-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
In-Reply-To: <20180201131007.GJ21609@dhcp22.suse.cz>
User-Agent: Mutt/1.9.2 (2017-12-15)
Sender: linux-kernel-owner@vger.kernel.org
Precedence: bulk
List-ID:
X-Mailing-List: linux-kernel@vger.kernel.org

On Thu 01-02-18 14:10:07, Michal Hocko wrote:
> [CC Kees and Linus - for your background, we are talking about failures
> http://lkml.kernel.org/r/20180107090229.GB24862@dhcp22.suse.cz
> introduced by http://lkml.kernel.org/r/20171213092550.2774-3-mhocko@kernel.org
> Debugging has shown that load_elf_binary tries to map elf segment over
> an existing brk - see below.]
>
> On Thu 01-02-18 08:43:34, Anshuman Khandual wrote:
> [...]
> > [ 9.295990] vma c000001fc8137c80 start 0000000010030000 end 0000000010040000
> > next c000001fc81378c0 prev c000001fc8137680 mm c000001fc8108200
> > prot 8000000000000104 anon_vma (null) vm_ops (null)
> > pgoff 1003 file (null) private_data (null)
> > flags: 0x100073(read|write|mayread|maywrite|mayexec|account)
> > [ 9.296351] CPU: 47 PID: 7537 Comm: sed Not tainted 4.14.0-00006-g4bd92fe-dirty #162
> > [ 9.296450] Call Trace:
> > [ 9.296482] [c000001fc70db9b0] [c000000000b180e0] dump_stack+0xb0/0xf0 (unreliable)
> > [ 9.296588] [c000001fc70db9f0] [c0000000002db0b8] do_brk_flags+0x2d8/0x440
> > [ 9.296674] [c000001fc70dbac0] [c0000000002db4d0] vm_brk_flags+0x80/0x130
> > [ 9.296751] [c000001fc70dbb20] [c0000000003d2998] set_brk+0x80/0xe8
> > [ 9.296824] [c000001fc70dbb60] [c0000000003d2518] load_elf_binary+0x12f8/0x1580
> > [ 9.296910] [c000001fc70dbc80] [c00000000035d9e0] search_binary_handler+0xd0/0x270
> > [ 9.296999] [c000001fc70dbd10] [c00000000035f938] do_execveat_common.isra.31+0x658/0x890
> > [ 9.297089] [c000001fc70dbdf0] [c00000000035ff80] SyS_execve+0x40/0x50
> > [ 9.297162] [c000001fc70dbe30] [c00000000000b220] system_call+0x58/0x6c
> >
> > But coming back to when it failed with MAP_FIXED_NOREPLACE, looking into ELF
> > section details (readelf -aW /usr/bin/sed), there was a PT_LOAD segment with
> > p_memsz > p_filesz which might be causing set_brk() to be called.
> >
> >
> > Type Offset VirtAddr PhysAddr FileSiz MemSiz Flg Align
> > ...
> > LOAD 0x020328 0x0000000010030328 0x0000000010030328 0x000384 0x0094a0 RW 0x10000
> >
> > which can be confirmed by just dumping elf_brk/elf_bss for this particular
> > instance. (elf_brk > elf_bss)
>
> Hmm, interesting. So the above is not a regular brk. The check has been
> added in 2001 by "v2.4.10.1 -> v2.4.10.2" but the changelog is not
> revealing at all.
>
> Btw. my /bin/ls also has MemSiz>FileSiz
> LOAD 0x01ade0 0x000000000061ade0 0x000000000061ade0 0x00079c 0x001520 RW 0x200000
> 113: 000000000061b57c 0 NOTYPE GLOBAL DEFAULT ABS __bss_start
>
> and do not see any problem. So this is more likely a problem of elf_brk
> being placed at a wrong address. But I am desperately lost in this code
> so I might be completely off.

Thanks a lot to Michael Matz for his background. He has pointed me to the
following two segments from your binary[1]

  LOAD 0x0000000000000000 0x0000000010000000 0x0000000010000000 0x0000000000013a8c 0x0000000000013a8c R E 10000
  LOAD 0x000000000001fd40 0x000000001002fd40 0x000000001002fd40 0x00000000000002c0 0x00000000000005e8 RW  10000
  LOAD 0x0000000000020328 0x0000000010030328 0x0000000010030328 0x0000000000000384 0x00000000000094a0 RW  10000

That binary has two RW LOAD segments, the first crosses a page border
into the second

  0x1002fd40 (LOAD2-vaddr) + 0x5e8 (LOAD2-memlen) == 0x10030328 (LOAD3-vaddr)

He says
: This is actually an artifact of RELRO mechanism. The first RW mapping
: will be remapped as RO after relocations are applied (to increase
: security).
: Well, to be honest, normal relro binaries also don't have more than
: two LOAD segments, so whatever RHEL did to their compilation options,
: it's something in addition to just relro (which can be detected by
: having a GNU_RELRO program header)
: But it definitely has something to do with relro, it's just not the
: whole story yet.
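
As a rough illustration of the arithmetic above (this is not kernel code), the following Python sketch checks which of the three LOAD segments from the readelf output end up sharing a page. The names Load, page_range and shared_pages are hypothetical helpers made up for this example, and the 64K page size is an assumption taken from the 0x10000 segment alignment.

```python
from dataclasses import dataclass

# Assumption: 64K pages, matching the 0x10000 alignment of the LOAD segments.
PAGE_SIZE = 0x10000


@dataclass
class Load:
    """Hypothetical stand-in for the relevant PT_LOAD program header fields."""
    vaddr: int
    filesz: int
    memsz: int
    flags: str


def page_start(addr, page_size=PAGE_SIZE):
    """Round addr down to a page boundary."""
    return addr & ~(page_size - 1)


def page_end(addr, page_size=PAGE_SIZE):
    """Round addr up to a page boundary."""
    return (addr + page_size - 1) & ~(page_size - 1)


def page_range(seg):
    """Page-aligned [start, end) footprint of a segment, based on p_memsz."""
    return page_start(seg.vaddr), page_end(seg.vaddr + seg.memsz)


def shared_pages(a, b):
    """Return the page-aligned range both segments touch, or None."""
    lo = max(page_range(a)[0], page_range(b)[0])
    hi = min(page_range(a)[1], page_range(b)[1])
    return (lo, hi) if lo < hi else None


# Values copied from the readelf output quoted above.
segments = [
    Load(0x10000000, 0x13A8C, 0x13A8C, "R E"),
    Load(0x1002FD40, 0x2C0, 0x5E8, "RW"),
    Load(0x10030328, 0x384, 0x94A0, "RW"),
]

if __name__ == "__main__":
    for i in range(len(segments) - 1):
        overlap = shared_pages(segments[i], segments[i + 1])
        if overlap is None:
            print(f"LOAD{i + 1} and LOAD{i + 2}: no shared page")
        else:
            print(f"LOAD{i + 1} and LOAD{i + 2}: share "
                  f"{overlap[0]:#x}-{overlap[1]:#x}")
```

Under these assumptions the two RW segments share the page range 0x10030000-0x10040000, which is the same range as the vma shown in the quoted dump, while the R E segment shares no page with its neighbour.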

I am still trying to wrap my head around all this, but it smells rather
dubious to map different segments over the same page. Is this something
that might happen widely and therefore MAP_FIXED_NOREPLACE is a no-go
when loading ELF segments? Or is this a special case we can detect?
Or am I completely off?

[1] http://lkml.kernel.org/r/96458c0a-e273-3fb9-a33b-f6f2d536f90b%40linux.vnet.ibm.com
-- 
Michal Hocko
SUSE Labs