From: Kees Cook
Date: Tue, 4 Apr 2017 15:59:54 -0700
Subject: Re: sudo x86info -a => kernel BUG at mm/usercopy.c:78!
To: Linus Torvalds
Cc: Tommi Rantala, Dave Jones, Linux-MM, LKML, Laura Abbott, Ingo Molnar, Josh Poimboeuf, Mark Rutland, Eric Biggers
References: <20170330194143.cbracica3w3ijrcx@codemonkey.org.uk> <20170331171724.nm22iqiellfsvj5z@codemonkey.org.uk>

On Tue, Apr 4, 2017 at 3:55 PM, Linus Torvalds wrote:
> On Tue, Apr 4, 2017 at 3:37 PM, Kees Cook wrote:
>>
>> For one of my systems, I see something like this:
>>
>> 00000000-00000fff : reserved
>> 00001000-0008efff : System RAM
>> 0008f000-0008ffff : reserved
>> 00090000-0009f7ff : System RAM
>> 0009f800-0009ffff : reserved
>
> That's fairly normal.
>
>> I note that there are two "System RAM" areas below 0x100000.
>
> Yes. Traditionally the area from about 4k to 640kB is RAM. With a
> random smattering of BIOS areas.
>
>> * On x86, access has to be given to the first megabyte of ram because that area
>> * contains BIOS code and data regions used by X and dosemu and similar apps.
>
> Right. Traditionally, dosemu did one big mmap of the 1MB area to just
> get all the BIOS data in one go.
>
>> This means that it allows reads into even System RAM below 0x100000,
>> but I think that's a mistake.
>
> What you think is a "mistake" is how /dev/mem has always worked.
>
> /dev/mem gave access to all the memory of the system. That's LITERALLY
> the whole point of it. There was no "BIOS area" or anything else. It
> was access to physical memory.
>
> We've added limits to it, but those limits came later, and they came
> with the caveat that lots of programs used /dev/mem in various ways.
>
> Nobody was crazy enough to read /dev/mem one byte at a time trying to
> follow BIOS tables. No, the traditional way was to just map (or read)
> large chunks of it, and then follow the tables in the result. The
> easiest way was to just do the whole low 1MB.
>
> There's no "mistake" here. The only thing that is mistaken is you
> thinking that we can redefine reality and change history.

I'm not trying to rewrite history. :) I'm trying to understand the
requirements for how the 1MB area was used, which you've explained the
history of now. (Thank you!)

> I already explained what the likely fix is: make devmem_is_allowed()
> return a ternary value, so that those things that *do* read the BIOS
> area can just continue to do so, but they see zeroes for the parts
> that the kernel has taken over.

Sounds good to me. I'll go work on that.

-Kees

-- 
Kees Cook
Pixel Security
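
For illustration only, here is a rough user-space sketch of the ternary
idea quoted above. It is not the kernel code and not a proposed patch.
The enum, the sketch_* function names, and the region table are all
hypothetical; the table just mirrors the resource list quoted earlier
in this thread. Treating everything at or above 1MB as denied, and
treating unlisted low addresses as allowed, are simplifying assumptions.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical three-way result; these names are not the kernel's. */
enum devmem_access {
	DEVMEM_DENY  = 0,	/* reject the access outright */
	DEVMEM_ALLOW = 1,	/* reader sees the real contents */
	DEVMEM_ZERO  = 2,	/* access succeeds, reader sees zeroes */
};

struct region {
	unsigned long start, end;	/* inclusive bounds */
	int is_system_ram;
};

/* Low-memory map copied from the example system quoted in the thread. */
static const struct region low_map[] = {
	{ 0x00000000, 0x00000fff, 0 },	/* reserved */
	{ 0x00001000, 0x0008efff, 1 },	/* System RAM */
	{ 0x0008f000, 0x0008ffff, 0 },	/* reserved */
	{ 0x00090000, 0x0009f7ff, 1 },	/* System RAM */
	{ 0x0009f800, 0x0009ffff, 0 },	/* reserved */
};

enum devmem_access sketch_devmem_is_allowed(unsigned long addr)
{
	size_t i;

	/* Assumption: this sketch only models the low 1MB. */
	if (addr >= 0x100000)
		return DEVMEM_DENY;

	for (i = 0; i < sizeof(low_map) / sizeof(low_map[0]); i++) {
		if (addr >= low_map[i].start && addr <= low_map[i].end)
			return low_map[i].is_system_ram ? DEVMEM_ZERO
							: DEVMEM_ALLOW;
	}

	/* Assumption: low addresses not in the table are not RAM. */
	return DEVMEM_ALLOW;
}

/*
 * Copy len bytes starting at addr out of a fake memory image, which the
 * caller must make at least addr + len bytes long. Returns the number of
 * bytes produced, or -1 if any byte in the range is denied.
 */
long sketch_read(const unsigned char *image, unsigned long addr,
		 unsigned char *dst, size_t len)
{
	size_t i;

	assert(image != NULL && dst != NULL);

	for (i = 0; i < len; i++) {
		switch (sketch_devmem_is_allowed(addr + i)) {
		case DEVMEM_DENY:
			return -1;
		case DEVMEM_ZERO:
			dst[i] = 0;	/* hide RAM contents, keep the read working */
			break;
		case DEVMEM_ALLOW:
			dst[i] = image[addr + i];
			break;
		}
	}
	return (long)len;
}
```

With something shaped like this, a tool that reads the whole low 1MB in
one go would still get a full-length result: the reserved ranges carry
their real bytes and the System RAM ranges read back as zeroes, instead
of the read failing partway through.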