Return-Path:
Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand
	id S261779AbUCDKFX (ORCPT );
	Thu, 4 Mar 2004 05:05:23 -0500
Received: (majordomo@vger.kernel.org) by vger.kernel.org
	id S261787AbUCDKFX (ORCPT );
	Thu, 4 Mar 2004 05:05:23 -0500
Received: from havoc.gtf.org ([216.162.42.101]:18066 "EHLO gtf.org")
	by vger.kernel.org with ESMTP id S261779AbUCDKFF (ORCPT );
	Thu, 4 Mar 2004 05:05:05 -0500
Date: Thu, 4 Mar 2004 05:05:03 -0500
From: David Eger
To: linux-kernel@vger.kernel.org
Subject: [PATCH] UTF-8ifying the kernel source
Message-ID: <20040304100503.GA13970@havoc.gtf.org>
Mime-Version: 1.0
Content-Type: text/plain; charset=us-ascii
Content-Disposition: inline
User-Agent: Mutt/1.4.1i
Sender: linux-kernel-owner@vger.kernel.org
X-Mailing-List: linux-kernel@vger.kernel.org
Content-Length: 3557
Lines: 84

http://www.yak.net/random/linux-2.6.3-utf8-cleanup-auto.diff.bz2

Here you find the first of several patches to convert the kernel source
from ISO Latin-1 to UTF-8.  I'm working on the files that didn't
auto-convert easily; comments welcome ;-)

First, some statistics!  In Linux 2.6.3, there are:

15860 clean 7-bit ASCII files
  274 text files are not 7-bit clean
   38 of these 274 files are not auto-convertible -- either they are
      not ISO Latin-1 or the high octets appear within the actual
      code (not comments).

This first patch applies to help files, documentation, and comments
which are trivially correct ISO Latin-1 => UTF-8 conversions.  The work
I have left to do is summarized below.

--dte

Un-needed/wrong non-ASCII characters (these fixes will form patch 2)
====================================================================
drivers/video/amifb.c - +- sign?
Documentation/i2c/i2c-protocol - NBSP, but why?
arch/i386/kernel/cpu/cyrix.c - NBSP, but why?
arch/v850/kernel/as85ep1.ld - WTF? comments in some random charset...
drivers/char/ftape/lowlevel/fdc-isr.c - WTF? shit in the comments
include/asm-m68k/atarihw.h - 0x94 - "cancel character"?
include/asm-m68k/atariints.h - 0x94 - "cancel character"?
include/linux/802_11.h - why the non-standard dash?
scripts/docproc.c - why the bizarre spelling for specific?
fs/ext2/xattr.c - bad ASCII art
fs/ext3/xattr.c - bad ASCII art
fs/afs/vlclient.h - a degrees sign, but why?

Box-drawing ASCII art (these fixes will form patch 3)
=====================================================
Documentation/networking/tms380tr.txt - DOS-style ASCII art
arch/arm/nwfpe/fpopcode.h - line-drawing characters

C strings - (what to do?)
=========================
arch/ppc/platforms/proc_rtas.c - a C string containing "degrees"
arch/ppc64/kernel/rtas-proc.c - a C string containing "degrees"
drivers/macintosh/therm_adt7467.c - degrees, MODULE_PARAM_DESC(), and a C string
drivers/mtd/chips/cfi_probe.c - C strings
drivers/net/wireless/netwave_cs.c - C strings
drivers/scsi/dc395x.c - C strings

Other - (i'd convert it, but...)
================================
drivers/pci/pci.ids - I don't know what program processes this...
drivers/ieee1394/oui.db - I don't know what program processes this...

Machine / charset specific shite - (does anything need to be done?)
===================================================================
arch/m68k/hp300/hp300map.map - maps to "char"s.. grr
drivers/char/defkeymap.map - a map file... maps to "char"s.. grr
drivers/char/qtronixmap.c_shipped - maps to "char"s.. grr
drivers/char/qtronixmap.map - maps to "char"s.. grr
drivers/tc/lk201-map.c_shipped - maps to "char"s.. grr
drivers/tc/lk201-map.map - maps to "char"s.. grr
drivers/acorn/char/defkeymap-l7200.c - maps to "char"s.. grr
arch/s390/kernel/ebcdic.c - comments on a keymap table
drivers/video/console/font_8x16.c - comments on a keymap table
drivers/video/console/font_8x8.c - comments on a keymap table
drivers/video/console/font_pearl_8x8.c - comments on a keymap table
drivers/s390/ebcdic.c - comments on a keymap table

Noise from userland (this I won't be touching)
==============================================
Documentation/networking/ethertap.txt - random crap cat'd from /dev/tap0
Documentation/s390/Debugging390.txt - weird gdb output

-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/
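
For anyone wanting to reproduce the statistics above on their own tree, here is a minimal sketch of how the buckets might be computed. It is not the script used for the patch. The names classify() and latin1_to_utf8() are made up for illustration, and the rule "a byte in 0x80-0x9F means the file is probably not ISO Latin-1" is an assumption suggested by the 0x94 entries in the list, not a rule stated in the post. It does not try to tell comments apart from code or C strings; that still needs a human.

```python
def classify(data: bytes) -> str:
    """Sort raw file contents into one of four illustrative buckets."""
    # Every byte below 0x80: plain 7-bit ASCII, nothing to convert.
    if all(b < 0x80 for b in data):
        return "ascii"
    # Already valid UTF-8: converting again would double-encode it.
    try:
        data.decode("utf-8")
        return "utf-8"
    except UnicodeDecodeError:
        pass
    # Assumption: 0x80-0x9F are C1 control codes in ISO Latin-1, so a
    # text file containing them is probably in some other charset.
    if any(0x80 <= b <= 0x9F for b in data):
        return "suspect"
    return "latin-1"


def latin1_to_utf8(data: bytes) -> bytes:
    """Re-encode ISO Latin-1 bytes as UTF-8 (every Latin-1 byte decodes)."""
    return data.decode("latin-1").encode("utf-8")


if __name__ == "__main__":
    samples = [b"int main(void)", b"caf\xe9", b"\x94 cancel"]
    for sample in samples:
        print(sample, classify(sample))
```

Only files that land in the "latin-1" bucket would be candidates for a blind latin1_to_utf8() pass; anything marked "suspect" belongs on a by-hand list like the one above.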