From: Greg Kroah-Hartman
To: linux-kernel@vger.kernel.org
Cc: Greg Kroah-Hartman, stable@vger.kernel.org, Mingye Wang, Jan Kara
Subject: [PATCH 4.9 85/95] udf: Fix leak of UTF-16 surrogates into encoded strings
Date: Sun, 22 Apr 2018 15:53:54 +0200
Message-Id: <20180422135213.907799808@linuxfoundation.org>
In-Reply-To: <20180422135210.432103639@linuxfoundation.org>
References: <20180422135210.432103639@linuxfoundation.org>

4.9-stable review patch.  If anyone has any objections, please let me know.

------------------

From: Jan Kara

commit 44f06ba8297c7e9dfd0e49b40cbe119113cca094 upstream.

OSTA UDF specification does not mention whether the CS0 charset in case
of two bytes per character encoding should be treated in UTF-16 or
UCS-2. The sample code in the standard does not treat UTF-16 surrogates
in any special way but on systems such as Windows which work in UTF-16
internally, filenames would be treated as being in UTF-16 effectively.
In Linux it is more difficult to handle characters outside of Base
Multilingual plane (beyond 0xffff) as NLS framework works with 2-byte
characters only. Just make sure we don't leak UTF-16 surrogates into the
resulting string when loading names from the filesystem for now.

CC: stable@vger.kernel.org # >= v4.6
Reported-by: Mingye Wang
Signed-off-by: Jan Kara
Signed-off-by: Greg Kroah-Hartman

---
 fs/udf/unicode.c |    6 ++++++
 1 file changed, 6 insertions(+)

--- a/fs/udf/unicode.c
+++ b/fs/udf/unicode.c
@@ -28,6 +28,9 @@
 
 #include "udf_sb.h"
 
+#define SURROGATE_MASK 0xfffff800
+#define SURROGATE_PAIR 0x0000d800
+
 static int udf_uni2char_utf8(wchar_t uni,
 			     unsigned char *out,
 			     int boundlen)
@@ -37,6 +40,9 @@ static int udf_uni2char_utf8(wchar_t uni
 	if (boundlen <= 0)
 		return -ENAMETOOLONG;
 
+	if ((uni & SURROGATE_MASK) == SURROGATE_PAIR)
+		return -EINVAL;
+
 	if (uni < 0x80) {
 		out[u_len++] = (unsigned char)uni;
 	} else if (uni < 0x800) {
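
For reviewers who want to check the mask arithmetic outside the kernel, here is a minimal userspace sketch. It is not the kernel function: is_utf16_surrogate() and sketch_uni2char_utf8() are hypothetical names, uint32_t stands in for wchar_t, and the encoder body past the surrogate check is an assumed, simplified BMP-only UTF-8 encoder. Only the two #define values and the check itself are taken from the patch.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Constants copied from the patch. */
#define SURROGATE_MASK 0xfffff800
#define SURROGATE_PAIR 0x0000d800

/*
 * Hypothetical helper: clearing the low 11 bits leaves 0xd800 exactly
 * for the 2048 code units 0xd800..0xdfff (high and low surrogates).
 */
static bool is_utf16_surrogate(uint32_t uni)
{
	return (uni & SURROGATE_MASK) == SURROGATE_PAIR;
}

/*
 * Hypothetical, simplified encoder (an assumption, not the code in
 * fs/udf/unicode.c). Returns the number of bytes written, or a
 * negative errno value.
 */
static int sketch_uni2char_utf8(uint32_t uni, unsigned char *out, int boundlen)
{
	int u_len = 0;

	assert(out != NULL);

	if (boundlen <= 0)
		return -ENAMETOOLONG;

	/* The check added by the patch: refuse lone surrogates. */
	if (is_utf16_surrogate(uni))
		return -EINVAL;

	if (uni < 0x80) {
		out[u_len++] = (unsigned char)uni;
	} else if (uni < 0x800) {
		if (boundlen < 2)
			return -ENAMETOOLONG;
		out[u_len++] = (unsigned char)(0xc0 | (uni >> 6));
		out[u_len++] = (unsigned char)(0x80 | (uni & 0x3f));
	} else if (uni < 0x10000) {
		if (boundlen < 3)
			return -ENAMETOOLONG;
		out[u_len++] = (unsigned char)(0xe0 | (uni >> 12));
		out[u_len++] = (unsigned char)(0x80 | ((uni >> 6) & 0x3f));
		out[u_len++] = (unsigned char)(0x80 | (uni & 0x3f));
	} else {
		/* Beyond the BMP: out of scope for this sketch. */
		return -EINVAL;
	}

	return u_len;
}
```

With these definitions, 0xd7ff and 0xe000 pass the check while everything from 0xd800 through 0xdfff is rejected with -EINVAL before any byte is written, which matches the intent described in the changelog.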