From: Brendan Higgins
Date: Fri, 8 Oct 2021 14:15:25 -0700
Subject: Re: [PATCH] kunit: tool: continue past invalid utf-8 output
To: Daniel Latypov
Cc: davidgow@google.com, linux-kernel@vger.kernel.org, kunit-dev@googlegroups.com, linux-kselftest@vger.kernel.org, skhan@linuxfoundation.org
In-Reply-To: <20211008210752.1109785-1-dlatypov@google.com>

On Fri, Oct 8, 2021 at 2:08 PM Daniel Latypov wrote:
>
> kunit.py currently crashes and fails to parse kernel output if it's not
> fully valid utf-8.
>
> This can come from memory corruption or just inadvertently printing
> out binary data as strings.
>
> E.g. adding this line into a kunit test
>   pr_info("\x80")
> will cause this exception
>   UnicodeDecodeError: 'utf-8' codec can't decode byte 0x80 in position 1961: invalid start byte
>
> We can tell Python how to handle errors, see
> https://docs.python.org/3/library/codecs.html#error-handlers
>
> Unfortunately, it doesn't seem like there's a way to specify this in
> just one location, so we need to repeat ourselves quite a bit.
>
> Specify `errors='backslashreplace'` so we instead:
> * print out the offending byte as '\x80'
> * try and continue parsing the output.
> * as long as the TAP lines themselves are valid, we're fine.
>
> Signed-off-by: Daniel Latypov

Thanks for fixing this!

Reviewed-by: Brendan Higgins
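
For readers following the thread, here is a minimal sketch of the behavior the commit message describes. It is not taken from the patch: `decode_kernel_output` is a hypothetical helper and the sample bytes are made up for illustration. It only shows how the standard `errors='backslashreplace'` handler differs from the default strict decoding.

```python
# Illustrative only: decode_kernel_output is a hypothetical helper,
# not a function from kunit.py.
def decode_kernel_output(raw: bytes) -> str:
    # 'backslashreplace' turns each undecodable byte into a \xNN escape
    # instead of raising UnicodeDecodeError.
    return raw.decode('utf-8', errors='backslashreplace')


# Made-up sample: two TAP-style result lines with a stray 0x80 byte between.
raw = b'ok 1 - example_test\n\x80\nok 2 - other_test\n'

try:
    raw.decode('utf-8')  # default is errors='strict'
except UnicodeDecodeError as e:
    print('strict decode failed:', e.reason)

# With the handler, decoding succeeds and the well-formed lines survive.
for line in decode_kernel_output(raw).splitlines():
    print(line)
```

The same `errors` keyword is also accepted by the built-in `open()` and by `subprocess.Popen`, which is presumably why the option has to be repeated wherever output is decoded; which call sites the patch actually touches is not shown in this message.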