From: lampahome
Date: Tue, 3 Mar 2020 09:48:50 +0800
Subject: Re: why do we need utf8 normalization when compare name?
To: Aleksa Sarai
Cc: linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org
In-Reply-To: <20200302104741.b5lypijqlbpq5lgz@yavin>
References: <20200302103754.nsvtne2vvduug77e@yavin> <20200302104741.b5lypijqlbpq5lgz@yavin>

> Sorry, a better example would've been "ñ" (U+00F1). You can also
> represent it as "n" (U+006E) followed by "◌̃" (U+0303 -- "combining
> tilde"). Both forms are defined by Unicode to be canonically equivalent
> so it would be incorrect to treat the two Unicode strings differently
> (that isn't quite the case for "Å").

So utf8-normalize will convert both "ñ" (U+00F1) and "n" (U+006E)
followed by "◌̃" (U+0303) to the same UTF-8 byte sequence, right?
Then the names are compared byte by byte.
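
To make the question concrete, here is a minimal userspace sketch of the normalize-then-compare idea, using Python's unicodedata module. It is not the kernel code: the helper name names_equal and the choice of NFD as the normalization form are assumptions made only for illustration.

```python
import unicodedata


def names_equal(a: str, b: str, form: str = "NFD") -> bool:
    """Hypothetical helper: normalize both names, then compare their UTF-8 bytes.

    The normalization form is an assumption of this sketch; any single
    form applied to both sides gives the same answer for canonically
    equivalent strings.
    """
    na = unicodedata.normalize(form, a).encode("utf-8")
    nb = unicodedata.normalize(form, b).encode("utf-8")
    return na == nb


precomposed = "\u00f1"   # "ñ" as a single code point
decomposed = "n\u0303"   # "n" followed by the combining tilde

# Without normalization the two spellings have different UTF-8 bytes.
print(precomposed.encode("utf-8"))   # b'\xc3\xb1'
print(decomposed.encode("utf-8"))    # b'n\xcc\x83'
print(precomposed == decomposed)     # False

# After normalization both sides are the same byte sequence.
print(names_equal(precomposed, decomposed))   # True
```

So a plain byte-by-byte compare of the raw names would treat the two spellings as different files, while comparing the normalized forms treats them as the same name.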