From: Nicolas Pitre
To: Greg Kroah-Hartman
Cc: Dave Mielke, Samuel Thibault, Adam Borowski, Alan Cox,
    linux-kernel@vger.kernel.org, linux-console@vger.kernel.org
Subject: [PATCH v3 0/3] have the vt console preserve unicode characters
Date: Tue, 26 Jun 2018 23:56:39 -0400
Message-Id: <20180627035642.8561-1-nicolas.pitre@linaro.org>
X-Mailing-List: linux-kernel@vger.kernel.org

The vt code translates UTF-8 strings into glyph index values and stores
those glyph values in the screen buffer. Because there can only be at
most 512 glyphs at the moment, it is impossible to represent most
unicode characters, in which case a default glyph (often '?') is
displayed instead. The original unicode value is then lost.

The 512-glyph limitation is inherent to text-mode VGA displays after
which the core console code was modelled. This also means that the
/dev/vcs* devices only provide user space with glyph index values, and
then user applications must get hold of the unicode-to-glyph table the
kernel is using in order to back-translate those into actual characters.
It is not possible to get back the original unicode value when multiple
unicode characters map to the same glyph, especially for the vast
majority that maps to the default replacement glyph.

Users of /dev/vcs* shouldn't have to be restricted to a narrow unicode
space from lossy screen content because of that. This is especially true
for accessibility applications such as BRLTTY that rely on /dev/vcs to
render screen content onto braille terminals.

It was also argued that the VGA-centric glyph buffer should eventually
go entirely. The current design made sense when hardware was slow and
managing the screen directly into the VGA memory made a difference
(i.e. 25 years ago).
Modern console display drivers no longer have to be limited to 512
glyphs. Quoting Alan Cox:

|The only driver that it suits is the VGA text mode driver, which at
|2GHz+ is going to be fast enough whatever format you convert from. We
|have the memory, the processor power and the fact almost all our
|displays are bitmapped (or more complex still) all in favour of
|throwing away that limit.

This patch series introduces unicode screen support to the core console
code with /dev/vcs* as a first user. Memory is allocated, and possible
CPU overhead introduced, only if /dev/vcsu is read at least once. For
now both the glyph and unicode buffers are maintained in parallel to
allow for a smooth transition.

I'm a prime user of this new /dev/vcsu interface, as is the BRLTTY
maintainer Dave Mielke, who implemented support for it in BRLTTY. There
is therefore a vested interest in maintaining this feature as necessary.
It has also received extensive testing at this point.

This is also available on top of v4.18-rc2 here:

  git://git.linaro.org/people/nicolas.pitre/linux vt-unicode

Changes from v2:

- Dropped patch #4 as it was useful only for initial debugging and it
  attracted all the review comments so far -- actually more than the
  patch is worth.

- Added Adam Borowski's ACK.

Changes from v1:

- Rebased to v4.18-rc1.

- Dropped first patch (now in mainline as commit 4b4ecd9cb8).

- Removed a printk instance from an error path easily triggerable
  from user space.

- Minor cleanup.

Diffstat:

 drivers/tty/vt/vc_screen.c     |  90 ++++++++--
 drivers/tty/vt/vt.c            | 308 +++++++++++++++++++++++++++++++--
 include/linux/console_struct.h |   2 +
 include/linux/selection.h      |   5 +
 4 files changed, 380 insertions(+), 25 deletions(-)
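
For illustration only, and not part of the series: the loss described
in the cover letter can be sketched with a toy Python model. The table
contents, glyph indices and function names below are all hypothetical;
real console fonts and the kernel's actual unicode-to-glyph mapping
differ. The sketch only shows why back-translating glyph indices cannot
recover the original characters, while a unicode buffer kept in
parallel can.

```python
# Toy model, NOT kernel code. All names and table entries are made up
# for illustration.

REPLACEMENT_GLYPH = 0  # hypothetical index of the default '?' glyph

# Hypothetical unicode-to-glyph table. A real console font has at most
# 512 glyphs; this one has three.
UNICODE_TO_GLYPH = {
    ord("?"): 0,
    ord("A"): 1,
    ord("-"): 2,
    0x2500: 2,  # BOX DRAWINGS LIGHT HORIZONTAL shares the '-' glyph
}


def to_glyphs(text):
    """Store only glyph indices, as a glyph-based screen buffer does."""
    return [UNICODE_TO_GLYPH.get(ord(ch), REPLACEMENT_GLYPH) for ch in text]


def back_translate(glyphs):
    """Best-effort inverse mapping, as a glyph-only reader must do."""
    glyph_to_unicode = {}
    for code_point, glyph in UNICODE_TO_GLYPH.items():
        # Several code points may share one glyph; keep the first seen.
        glyph_to_unicode.setdefault(glyph, code_point)
    return "".join(chr(glyph_to_unicode[g]) for g in glyphs)


def to_unicode_buffer(text):
    """Keep the code points themselves, as a parallel unicode buffer would."""
    return [ord(ch) for ch in text]


written = "A\u2500\u2603"  # 'A', a box-drawing line, a snowman

glyph_screen = to_glyphs(written)
unicode_screen = to_unicode_buffer(written)

# The glyph path collapses U+2500 into '-' and U+2603 into '?'.
print(back_translate(glyph_screen) == written)            # False
# The unicode path round-trips exactly.
print("".join(map(chr, unicode_screen)) == written)       # True
```

In this model the snowman falls back to the replacement glyph and the
box-drawing character is indistinguishable from '-', which mirrors the
two failure cases named above: unmapped characters, and several
characters sharing one glyph.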