Return-Path: Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932488AbcDTNZh (ORCPT ); Wed, 20 Apr 2016 09:25:37 -0400 Received: from mx1.redhat.com ([209.132.183.28]:35622 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S1752899AbcDTNZg (ORCPT ); Wed, 20 Apr 2016 09:25:36 -0400 Subject: Re: [PATCH] lib: Always NUL terminate ucs2_as_utf8 To: Chris Wilson , Peter Jones References: <1461141427-16361-1-git-send-email-chris@chris-wilson.co.uk> Cc: intel-gfx@lists.freedesktop.org, Matt Fleming , Jason Andryuk , Matthew Garrett , linux-efi@vger.kernel.org, linux-kernel@vger.kernel.org From: Laszlo Ersek Message-ID: <5717834C.6070802@redhat.com> Date: Wed, 20 Apr 2016 15:25:32 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:38.0) Gecko/20100101 Thunderbird/38.7.2 MIME-Version: 1.0 In-Reply-To: <1461141427-16361-1-git-send-email-chris@chris-wilson.co.uk> Content-Type: text/plain; charset=windows-1252 Content-Transfer-Encoding: 7bit Sender: linux-kernel-owner@vger.kernel.org List-ID: X-Mailing-List: linux-kernel@vger.kernel.org Content-Length: 5848 Lines: 148 On 04/20/16 10:37, Chris Wilson wrote: > If the caller, in this case efivarfs_callback(), only provides sufficent > room for the expanded utf8 and not enough to include the terminating NUL > byte, that NUL byte is skipped. When the caller then interprets it as a > string, it may then read from past its allocated memory: > > [ 170.605647] WARNING: kmemcheck: Caught 8-bit read from uninitialized memory (ffff8804079ae786) > [ 170.605677] 436f6e4f757400004c44322d35363062663538612d316530642d346437652d39 > [ 170.606037] i i i i i i u u u u u u u u u u u u u u u u u u u u u u u u u u > [ 170.606236] ^ > [ 170.606243] RIP: 0010:[] [] efivar_variable_is_removable+0xaf/0xf0 > [ 170.606346] RSP: 0018:ffff880408e73c20 EFLAGS: 00010206 > [ 170.606352] RAX: 0000000000000000 RBX: 0000000000000006 RCX: 0000000000000006 > [ 170.606359] RDX: 0000000000000000 RSI: 0000000000000074 RDI: ffff880408e73c30 > [ 170.606365] RBP: ffff880408e73c80 R08: 0000000000000006 R09: 000000000000008c > [ 170.606371] R10: 0000000000000006 R11: 0000000000000000 R12: ffffffff8166ed20 > [ 170.606378] R13: 11d293ca8be4df61 R14: ffffffff81773834 R15: ffff8804079ae780 > [ 170.606385] FS: 0000000000000000(0000) GS:ffff88041ca00000(0000) knlGS:0000000000000000 > [ 170.606392] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033 > [ 170.606399] CR2: ffff880409cbe4c0 CR3: 00000004085fd000 CR4: 00000000001406f0 > [ 170.606405] [] efivarfs_callback+0xf8/0x275 > [ 170.606418] [] efivar_init+0x248/0x2e0 > [ 170.606440] [] efivarfs_fill_super+0xb4/0xf0 > [ 170.606452] [] mount_single+0x87/0xb0 > [ 170.606463] [] efivarfs_mount+0x13/0x20 > [ 170.606475] [] mount_fs+0x10/0x90 > [ 170.606497] [] vfs_kern_mount+0x62/0x100 > [ 170.606508] [] do_mount+0x1e0/0xcd0 > [ 170.606519] [] SyS_mount+0x8f/0xd0 > [ 170.606530] [] entry_SYSCALL_64_fastpath+0x17/0x93 > [ 170.606542] [] 0xffffffffffffffff > > Cc: Matt Fleming > Cc: Jason Andryuk > Cc: Matthew Garrett > Cc: Laszlo Ersek > Cc: Peter Jones > Cc: linux-efi@vger.kernel.org > Cc: linux-kernel@vger.kernel.org > --- > drivers/firmware/efi/vars.c | 2 +- > lib/ucs2_string.c | 15 ++++++++++----- > 2 files changed, 11 insertions(+), 6 deletions(-) > > diff --git a/drivers/firmware/efi/vars.c b/drivers/firmware/efi/vars.c > index 0ac594c0a234..8dd503bac35d 100644 > --- a/drivers/firmware/efi/vars.c > +++ b/drivers/firmware/efi/vars.c > @@ -166,7 +166,7 @@ validate_ascii_string(efi_char16_t *var_name, int match, u8 *buffer, > > struct variable_validate { > efi_guid_t vendor; > - char *name; > + const char *name; > bool (*validate)(efi_char16_t *var_name, int match, u8 *data, > unsigned long len); > }; > diff --git a/lib/ucs2_string.c b/lib/ucs2_string.c > index f0b323abb4c6..fb8d03966656 100644 > --- a/lib/ucs2_string.c > +++ b/lib/ucs2_string.c > @@ -85,29 +85,34 @@ ucs2_as_utf8(u8 *dest, const ucs2_char_t *src, unsigned long maxlength) > unsigned long j = 0; > unsigned long limit = ucs2_strnlen(src, maxlength); > > - for (i = 0; maxlength && i < limit; i++) { > + if (maxlength == 0) > + return 0; > + > + for (i = 0; i < limit; i++) { > u16 c = src[i]; > > if (c >= 0x800) { > - if (maxlength < 3) > + if (maxlength <= 3) > break; > maxlength -= 3; > dest[j++] = 0xe0 | (c & 0xf000) >> 12; > dest[j++] = 0x80 | (c & 0x0fc0) >> 6; > dest[j++] = 0x80 | (c & 0x003f); > } else if (c >= 0x80) { > - if (maxlength < 2) > + if (maxlength <= 2) > break; > maxlength -= 2; > dest[j++] = 0xc0 | (c & 0x7c0) >> 6; > dest[j++] = 0x80 | (c & 0x03f); > } else { > + if (maxlength <= 1) > + break; > maxlength -= 1; > dest[j++] = c & 0x7f; > } > } > - if (maxlength) > - dest[j] = '\0'; > + dest[j] = '\0'; > + > return j; > } > EXPORT_SYMBOL(ucs2_as_utf8); > I apologize for following up in "waves". I'd like us to consider how this patch works together with the rest of the code. IIUC, ucs2_utf8size() returns the number of non-NUL bytes that are required to store the transcoded string. And, in "fs/efivarfs/super.c", function efivarfs_callback(), we have: len = ucs2_utf8size(entry->var.VariableName); /* name, plus '-', plus GUID, plus NUL*/ name = kmalloc(len + 1 + EFI_VARIABLE_GUID_LEN + 1, GFP_KERNEL); if (!name) goto fail; ucs2_as_utf8(name, entry->var.VariableName, len); So, "len" does not include the room for the terminating NUL-byte here. When "len" is passed to ucs2_as_utf8(), with the proposed patch applied, a NUL byte will be produced in "name", but it will be at the price of a genuine character from the input variable name. Because, "len" has been computed for the exact storage need, and now the equality in one of the permissive inequalities (introduced by this patch) will terminate the transcoding early. Assume that VariableName is (u16[]){ 'a', 'b', '\0' }. For this input, ucs2_strlen() returns 2, and ucs2_utf8size() also returns 2. Right? If so, then: - len = 2 - "maxlen" at entry to ucs2_as_utf8() is also 2 - "limit" in ucs2_as_utf8() will be initialized to 2 as well So we run the loop body twice. The first iteration will lower "maxlength" to 1 (last branch). In the second iteration, (maxlength<=1) will match, and instead of transferring the character 'b', we'll set dest[1] to '\0'. In other words, we get the NUL-terminated string "a" as output. I think this breaks the subsequent comparisons against known variable names. Am I wrong? Thanks Laszlo