2017-07-27 16:10:43

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 00/16] MIPS: Miscellaneous fixes related to Android Mips emulator

From: Aleksandar Markovic <[email protected]>

v3->v4:

- patches on MADDF/MSUBF tagged as "stable #4.7+"
- patches on MAX/MIN/MAXA/MINA tagged as "stable #4.3+"
- logic in the patch on NaN handling in MADDF/MSUBF simplified
- improved code formatting in several patches
- fixed spelling mistakes and improved otherwise code comments
- fixed multiple typos and spelling mistakes in commit messages
- ammended commit messages in several instances
- patch 16 moved 2 places ahead in patch order
- rebased to the latest kernel code

v2->v3:

- added a patch that fixes clobber lists in vdso fallback cases
- added 6 patches related to MIN/MINA/MAX/MAXA issues
- added 6 patches related to MADDF/MSUBDF issues
- enhanced logic and comments in patch on multitouch
- fixed a number of minor spelling and format mistakes in code
comments and commit messages
- several patches removed since they got integrated into the tree
- order of patches changed to better reflect similarity
- rebased to the latest kernel code

v1->v2:

- the patch on PREF usage in memcpy dropped as not needed
- updated recipient lists using get_maintainer.pl
- rebased to the latest kernel code

This series contains an assortment of changes necessary for proper
operation of Android emulator for Mips. However, we think that wider
kernel community may benefit from them too.

Aleksandar Markovic (10):
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs
zero
MIPS: math-emu: <MAX|MIN>.<D|S>: Fix cases of both inputs negative
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with
opposite signs
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of both infinite inputs
MIPS: math-emu: MINA.<D|S>: Fix some cases of infinity and zero inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix NaN propagation
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of infinite inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of zero inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Clean up "maddf_flags"
enumeration

Douglas Leung (2):
MIPS: math-emu: <MADDF|MSUBF>.S: Fix accuracy (32-bit case)
MIPS: math-emu: <MADDF|MSUBF>.D: Fix accuracy (64-bit case)

Goran Ferenc (1):
MIPS: VDSO: Fix clobber lists in fallback code paths

Lingfeng Yang (1):
input: goldfish: Fix multitouch event handling

Miodrag Dinic (2):
tty: goldfish: Use streaming DMA for r/w operations on Ranchu
platforms
tty: goldfish: Implement support for kernel 'earlycon' parameter

arch/mips/math-emu/dp_fmax.c | 84 ++++++++---
arch/mips/math-emu/dp_fmin.c | 86 ++++++++---
arch/mips/math-emu/dp_maddf.c | 246 +++++++++++++++++++------------
arch/mips/math-emu/ieee754int.h | 4 +
arch/mips/math-emu/ieee754sp.h | 4 +
arch/mips/math-emu/sp_fmax.c | 84 ++++++++---
arch/mips/math-emu/sp_fmin.c | 86 ++++++++---
arch/mips/math-emu/sp_maddf.c | 229 ++++++++++++++--------------
arch/mips/vdso/gettimeofday.c | 6 +-
drivers/input/keyboard/goldfish_events.c | 35 ++++-
drivers/tty/Kconfig | 3 +
drivers/tty/goldfish.c | 145 ++++++++++++++++--
12 files changed, 698 insertions(+), 314 deletions(-)

--
2.7.4


2017-07-27 16:10:52

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 01/16] input: goldfish: Fix multitouch event handling

From: Lingfeng Yang <[email protected]>

Register Goldfish Events device properly as a multitouch device,
and send SYN_REPORT events in appropriate cases only.

If SYN_REPORT event is sent on every single multitouch event, it
breaks the multitouch support. The multitouch interaction becomes
janky and user has to click 2-3 times to do stuff (also, notification
bars are randomly activating when not clicking). If these SYN_REPORT
events are suppressed, multitouch works fine, plus the events have a
protocol that looks nice.

In addition, Goldfish Events device needs to be registered as a
multitouch device by issuing input_mt_init_slots. Otherwise,
input_handle_abs_event in drivers/input/input.c will silently drop
all ABS_MT_SLOT events, causing touches with more than one finger
not to work properly.

Signed-off-by: Lingfeng Yang <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
drivers/input/keyboard/goldfish_events.c | 35 +++++++++++++++++++++++++++++++-
1 file changed, 34 insertions(+), 1 deletion(-)

diff --git a/drivers/input/keyboard/goldfish_events.c b/drivers/input/keyboard/goldfish_events.c
index f6e643b..bc3e8b3 100644
--- a/drivers/input/keyboard/goldfish_events.c
+++ b/drivers/input/keyboard/goldfish_events.c
@@ -17,6 +17,7 @@
#include <linux/interrupt.h>
#include <linux/types.h>
#include <linux/input.h>
+#include <linux/input/mt.h>
#include <linux/kernel.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
@@ -24,6 +25,8 @@
#include <linux/io.h>
#include <linux/acpi.h>

+#define GOLDFISH_MAX_FINGERS 5
+
enum {
REG_READ = 0x00,
REG_SET_PAGE = 0x00,
@@ -52,7 +55,22 @@ static irqreturn_t events_interrupt(int irq, void *dev_id)
value = __raw_readl(edev->addr + REG_READ);

input_event(edev->input, type, code, value);
- input_sync(edev->input);
+
+ /*
+ * Send an extra (EV_SYN, SYN_REPORT, 0x0) event only if a key
+ * was pressed. Some keyboard device drivers may only send the
+ * EV_KEY event and not the EV_SYN event.
+ *
+ * Note that sending an extra SYN_REPORT is not necessary nor
+ * correct protocol with other devices such as touchscreens,
+ * which will send their own SYN_REPORTs when sufficient event
+ * event information has been collected (for example, in case
+ * touchscreens, when pressure and X/Y coordinates have been
+ * received). Hence, we will only send this extra SYN_REPORT
+ * if type == EV_KEY.
+ */
+ if (type == EV_KEY)
+ input_sync(edev->input);
return IRQ_HANDLED;
}

@@ -155,6 +173,21 @@ static int events_probe(struct platform_device *pdev)
input_dev->name = edev->name;
input_dev->id.bustype = BUS_HOST;

+ /*
+ * Set the Goldfish Device to be multitouch.
+ *
+ * In the Ranchu kernel, there is multitouch-specific code for
+ * handling ABS_MT_SLOT events (see file drivers/input/input.c,
+ * function input_handle_abs_event). If we do not issue
+ * input_mt_init_slots, the kernel will filter out needed
+ * ABS_MT_SLOT events when we touch the screen in more than one
+ * place, preventing multitouch with more than one finger from
+ * working.
+ */
+ error = input_mt_init_slots(input_dev, GOLDFISH_MAX_FINGERS, 0);
+ if (error)
+ return error;
+
events_import_bits(edev, input_dev->evbit, EV_SYN, EV_MAX);
events_import_bits(edev, input_dev->keybit, EV_KEY, KEY_MAX);
events_import_bits(edev, input_dev->relbit, EV_REL, REL_MAX);
--
2.7.4

2017-07-27 16:11:11

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 02/16] tty: goldfish: Use streaming DMA for r/w operations on Ranchu platforms

From: Miodrag Dinic <[email protected]>

Implement tty r/w operations using streaming DMA.

Goldfish tty for Ranchu platforms has been modified to use
streaming DMA mappings for read/write operations. This change
eliminates the need for snooping through the TLB in QEMU using
cpu_get_phys_page_debug() which does not guarantee that it will
return the valid va -> pa mapping.

The streaming DMA mapping is implemented using dma_map_single() per
transfer, while dma_unmap_single() is used for unmapping right after
the DMA transfer.

Using DMA API is the proper way for handling r/w transfers and
makes this driver more portable, thus effectively eliminating
the need for virt_to_page() and page_to_phys() conversions.

This change does not affect the old style Goldfish tty behaviour
which is still used by the Goldfish emulator. Version register has
been added and probed to see which platform is running this driver.
Reading from the new GOLDFISH_TTY_VERSION register using the Goldfish
emulator will return 0 and driver will work with virtual addresses.
Whereas if run on Ranchu it returns 1, and thus DMA is used.

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
drivers/tty/goldfish.c | 119 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 108 insertions(+), 11 deletions(-)

diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
index 996bd47..acd50fa 100644
--- a/drivers/tty/goldfish.c
+++ b/drivers/tty/goldfish.c
@@ -22,6 +22,8 @@
#include <linux/io.h>
#include <linux/module.h>
#include <linux/goldfish.h>
+#include <linux/mm.h>
+#include <linux/dma-mapping.h>

enum {
GOLDFISH_TTY_PUT_CHAR = 0x00,
@@ -32,6 +34,8 @@ enum {
GOLDFISH_TTY_DATA_LEN = 0x14,
GOLDFISH_TTY_DATA_PTR_HIGH = 0x18,

+ GOLDFISH_TTY_VERSION = 0x20,
+
GOLDFISH_TTY_CMD_INT_DISABLE = 0,
GOLDFISH_TTY_CMD_INT_ENABLE = 1,
GOLDFISH_TTY_CMD_WRITE_BUFFER = 2,
@@ -45,6 +49,8 @@ struct goldfish_tty {
u32 irq;
int opencount;
struct console console;
+ u32 version;
+ struct device *dev;
};

static DEFINE_MUTEX(goldfish_tty_lock);
@@ -53,24 +59,94 @@ static u32 goldfish_tty_line_count = 8;
static u32 goldfish_tty_current_line_count;
static struct goldfish_tty *goldfish_ttys;

-static void goldfish_tty_do_write(int line, const char *buf, unsigned count)
+static inline void do_rw_io(struct goldfish_tty *qtty,
+ unsigned long address,
+ unsigned int count,
+ int is_write)
{
unsigned long irq_flags;
- struct goldfish_tty *qtty = &goldfish_ttys[line];
void __iomem *base = qtty->base;
+
spin_lock_irqsave(&qtty->lock, irq_flags);
- gf_write_ptr(buf, base + GOLDFISH_TTY_DATA_PTR,
+ gf_write_ptr((void *)address, base + GOLDFISH_TTY_DATA_PTR,
base + GOLDFISH_TTY_DATA_PTR_HIGH);
writel(count, base + GOLDFISH_TTY_DATA_LEN);
- writel(GOLDFISH_TTY_CMD_WRITE_BUFFER, base + GOLDFISH_TTY_CMD);
+
+ if (is_write)
+ writel(GOLDFISH_TTY_CMD_WRITE_BUFFER, base + GOLDFISH_TTY_CMD);
+ else
+ writel(GOLDFISH_TTY_CMD_READ_BUFFER, base + GOLDFISH_TTY_CMD);
+
spin_unlock_irqrestore(&qtty->lock, irq_flags);
}

+static inline void goldfish_tty_rw(struct goldfish_tty *qtty,
+ unsigned long addr,
+ unsigned int count,
+ int is_write)
+{
+ dma_addr_t dma_handle;
+ enum dma_data_direction dma_dir;
+
+ dma_dir = (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
+ if (qtty->version) {
+ /*
+ * Goldfish TTY for Ranchu platform uses
+ * physical addresses and DMA for read/write operations
+ */
+ unsigned long addr_end = addr + count;
+
+ while (addr < addr_end) {
+ unsigned long pg_end = (addr & PAGE_MASK) + PAGE_SIZE;
+ unsigned long next =
+ pg_end < addr_end ? pg_end : addr_end;
+ unsigned long avail = next - addr;
+
+ /*
+ * Map the buffer's virtual address to the DMA address
+ * so the buffer can be accessed by the device.
+ */
+ dma_handle = dma_map_single(qtty->dev,
+ (void *)addr, avail, dma_dir);
+
+ if (dma_mapping_error(qtty->dev, dma_handle)) {
+ dev_err(qtty->dev, "tty: DMA mapping error.\n");
+ return;
+ }
+ do_rw_io(qtty, dma_handle, avail, is_write);
+
+ /*
+ * Unmap the previously mapped region after
+ * the completion of the read/write operation.
+ */
+ dma_unmap_single(qtty->dev, dma_handle, avail, dma_dir);
+
+ addr += avail;
+ }
+ } else {
+ /*
+ * Old style Goldfish TTY used on the Goldfish platform
+ * uses virtual addresses.
+ */
+ do_rw_io(qtty, addr, count, is_write);
+ }
+
+}
+
+static void goldfish_tty_do_write(int line, const char *buf,
+ unsigned int count)
+{
+ struct goldfish_tty *qtty = &goldfish_ttys[line];
+ unsigned long address = (unsigned long)(void *)buf;
+
+ goldfish_tty_rw(qtty, address, count, 1);
+}
+
static irqreturn_t goldfish_tty_interrupt(int irq, void *dev_id)
{
struct goldfish_tty *qtty = dev_id;
void __iomem *base = qtty->base;
- unsigned long irq_flags;
+ unsigned long address;
unsigned char *buf;
u32 count;

@@ -79,12 +155,10 @@ static irqreturn_t goldfish_tty_interrupt(int irq, void *dev_id)
return IRQ_NONE;

count = tty_prepare_flip_string(&qtty->port, &buf, count);
- spin_lock_irqsave(&qtty->lock, irq_flags);
- gf_write_ptr(buf, base + GOLDFISH_TTY_DATA_PTR,
- base + GOLDFISH_TTY_DATA_PTR_HIGH);
- writel(count, base + GOLDFISH_TTY_DATA_LEN);
- writel(GOLDFISH_TTY_CMD_READ_BUFFER, base + GOLDFISH_TTY_CMD);
- spin_unlock_irqrestore(&qtty->lock, irq_flags);
+
+ address = (unsigned long)(void *)buf;
+ goldfish_tty_rw(qtty, address, count, 0);
+
tty_schedule_flip(&qtty->port);
return IRQ_HANDLED;
}
@@ -271,6 +345,29 @@ static int goldfish_tty_probe(struct platform_device *pdev)
qtty->port.ops = &goldfish_port_ops;
qtty->base = base;
qtty->irq = irq;
+ qtty->dev = &pdev->dev;
+
+ /* Goldfish TTY device used by the Goldfish emulator
+ * should identify itself with 0, forcing the driver
+ * to use virtual addresses. Goldfish TTY device
+ * on Ranchu emulator (qemu2) returns 1 here and
+ * driver will use physical addresses.
+ */
+ qtty->version = readl(base + GOLDFISH_TTY_VERSION);
+
+ /* Goldfish TTY device on Ranchu emulator (qemu2)
+ * will use DMA for read/write IO operations.
+ */
+ if (qtty->version > 0) {
+ /* Initialize dma_mask to 32-bits.
+ */
+ if (!pdev->dev.dma_mask)
+ pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
+ if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(32))) {
+ dev_err(&pdev->dev, "No suitable DMA available.\n");
+ goto err_create_driver_failed;
+ }
+ }

writel(GOLDFISH_TTY_CMD_INT_DISABLE, base + GOLDFISH_TTY_CMD);

--
2.7.4

2017-07-27 16:11:19

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 03/16] tty: goldfish: Implement support for kernel 'earlycon' parameter

From: Miodrag Dinic <[email protected]>

Add early console functionality to the Goldfish tty driver.

When 'earlycon' kernel command line parameter is used with no options,
the early console is determined by the 'stdout-path' property in device
tree's 'chosen' node. This is illustrated in the following device tree
source example:

Device tree example:

chosen {
stdout-path = "/goldfish_tty@1f004000";
};

goldfish_tty@1f004000 {
interrupts = <0xc>;
reg = <0x1f004000 0x0 0x1000>;
compatible = "google,goldfish-tty", "generic,goldfish-tty";
};

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
drivers/tty/Kconfig | 3 +++
drivers/tty/goldfish.c | 26 ++++++++++++++++++++++++++
2 files changed, 29 insertions(+)

diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig
index 9510305..873e0ba 100644
--- a/drivers/tty/Kconfig
+++ b/drivers/tty/Kconfig
@@ -392,6 +392,9 @@ config PPC_EARLY_DEBUG_EHV_BC_HANDLE
config GOLDFISH_TTY
tristate "Goldfish TTY Driver"
depends on GOLDFISH
+ select SERIAL_CORE
+ select SERIAL_CORE_CONSOLE
+ select SERIAL_EARLYCON
help
Console and system TTY driver for the Goldfish virtual platform.

diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
index acd50fa..22b7ad5 100644
--- a/drivers/tty/goldfish.c
+++ b/drivers/tty/goldfish.c
@@ -1,6 +1,7 @@
/*
* Copyright (C) 2007 Google, Inc.
* Copyright (C) 2012 Intel, Inc.
+ * Copyright (C) 2017 Imagination Technologies Ltd.
*
* This software is licensed under the terms of the GNU General Public
* License version 2, as published by the Free Software Foundation, and
@@ -24,6 +25,7 @@
#include <linux/goldfish.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>
+#include <linux/serial_core.h>

enum {
GOLDFISH_TTY_PUT_CHAR = 0x00,
@@ -427,6 +429,30 @@ static int goldfish_tty_remove(struct platform_device *pdev)
return 0;
}

+static void gf_early_console_putchar(struct uart_port *port, int ch)
+{
+ __raw_writel(ch, port->membase);
+}
+
+static void gf_early_write(struct console *con, const char *s, unsigned int n)
+{
+ struct earlycon_device *dev = con->data;
+
+ uart_console_write(&dev->port, s, n, gf_early_console_putchar);
+}
+
+static int __init gf_earlycon_setup(struct earlycon_device *device,
+ const char *opt)
+{
+ if (!device->port.membase)
+ return -ENODEV;
+
+ device->con->write = gf_early_write;
+ return 0;
+}
+
+OF_EARLYCON_DECLARE(early_gf_tty, "google,goldfish-tty", gf_earlycon_setup);
+
static const struct of_device_id goldfish_tty_of_match[] = {
{ .compatible = "google,goldfish-tty", },
{},
--
2.7.4

2017-07-27 16:11:28

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 04/16] MIPS: VDSO: Fix clobber lists in fallback code paths

From: Goran Ferenc <[email protected]>

Extend clobber lists to include all GP registers.

Fixes: 0b523a85e134 ("MIPS: VDSO: Add implementation of gettimeofday() fallback")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/vdso/gettimeofday.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/arch/mips/vdso/gettimeofday.c b/arch/mips/vdso/gettimeofday.c
index 974276e..e2690d7 100644
--- a/arch/mips/vdso/gettimeofday.c
+++ b/arch/mips/vdso/gettimeofday.c
@@ -35,7 +35,8 @@ static __always_inline long gettimeofday_fallback(struct timeval *_tv,
" syscall\n"
: "=r" (ret), "=r" (error)
: "r" (tv), "r" (tz), "r" (nr)
- : "memory");
+ : "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13",
+ "$14", "$15", "$24", "$25", "hi", "lo", "memory");

return error ? -ret : ret;
}
@@ -55,7 +56,8 @@ static __always_inline long clock_gettime_fallback(clockid_t _clkid,
" syscall\n"
: "=r" (ret), "=r" (error)
: "r" (clkid), "r" (ts), "r" (nr)
- : "memory");
+ : "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13",
+ "$14", "$15", "$24", "$25", "hi", "lo", "memory");

return error ? -ret : ret;
}
--
2.7.4

2017-07-27 16:11:47

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 05/16] MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation

From: Aleksandar Markovic <[email protected]>

Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S> fd,fs,ft, if both
inputs are quiet NaNs. The <MAX|MAXA|MIN|MINA>.<D|S> specifications
state that the returned value in such cases should be the quiet NaN
contained in register fs.

A relevant example:

MAX.S fd,fs,ft:
If fs contains qNaN1, and ft contains qNaN2, fd is going to contain
qNaN1 (without this patch, it used to contain qNaN2).

Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.3+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 32 ++++++++++++++++++++++++++++----
arch/mips/math-emu/dp_fmin.c | 32 ++++++++++++++++++++++++++++----
arch/mips/math-emu/sp_fmax.c | 32 ++++++++++++++++++++++++++++----
arch/mips/math-emu/sp_fmin.c | 32 ++++++++++++++++++++++++++++----
4 files changed, 112 insertions(+), 16 deletions(-)

diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index fd71b8d..41bd6ed 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -47,14 +47,26 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,14 +159,26 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index c1072b0..53fb8c9 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -47,14 +47,26 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,14 +159,26 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index 4d00084..d0d73c32 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -47,14 +47,26 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,14 +159,26 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 4eb1bb9..011692e 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -47,14 +47,26 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,14 +159,26 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);

- /* numbers are preferred to NaNs */
+ /*
+ * Quiet NaN handling
+ */
+
+ /*
+ * The case of both inputs quiet NaNs
+ */
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
+ /*
+ * The cases of exactly one input quiet NaN (numbers
+ * are here preferred as returned values to NaNs)
+ */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
--
2.7.4

2017-07-27 16:11:56

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 06/16] MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs zero

From: Aleksandar Markovic <[email protected]>

Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
are zeros. The right behavior in such cases is stated in instruction
reference manual and is as follows:

fs ft MAX MIN MAXA MINA
---------------------------------------------
0 0 0 0 0 0
0 -0 0 -0 0 -0
-0 0 0 -0 0 -0
-0 -0 -0 -0 -0 -0

Prior to this patch, some of the above cases were yielding correct
results. However, for the sake of code consistency, all such cases
are rewritten in this patch.

A relevant example:

MAX.S fd,fs,ft:
If fs contains +0.0, and ft contains -0.0, fd is going to contain
+0.0 (without this patch, it used to contain -0.0).

Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.3+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 8 ++------
arch/mips/math-emu/dp_fmin.c | 8 ++------
arch/mips/math-emu/sp_fmax.c | 8 ++------
arch/mips/math-emu/sp_fmin.c | 8 ++------
4 files changed, 8 insertions(+), 24 deletions(-)

diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 41bd6ed..31f091a 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -92,9 +92,7 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
return ys ? x : y;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs & ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
@@ -204,9 +202,7 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
return y;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs & ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 53fb8c9..e607d55 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -92,9 +92,7 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
return ys ? y : x;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs | ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
@@ -204,9 +202,7 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
return y;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs | ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index d0d73c32..3ca5b20 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -92,9 +92,7 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
return ys ? x : y;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs & ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
@@ -204,9 +202,7 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
return y;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs & ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 011692e..c982647 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -92,9 +92,7 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
return ys ? y : x;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs | ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
@@ -204,9 +202,7 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
return y;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs | ys);

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
--
2.7.4

2017-07-27 16:12:03

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 07/16] MIPS: math-emu: <MAX|MIN>.<D|S>: Fix cases of both inputs negative

From: Aleksandar Markovic <[email protected]>

Fix the value returned by <MAX|MIN>.<D|S>, if both inputs are negative
normal fp numbers. The previous logic did not take into account that
if both inputs have the same sign, there should be separate treatment
of the cases when both inputs are negative and when both inputs are
positive.

A relevant example:

MAX.S fd,fs,ft:
If fs contains -5.0, and ft contains -7.0, fd is going to contain
-5.0 (without this patch, it used to contain -7.0).

Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.3+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 32 ++++++++++++++++++++++++--------
arch/mips/math-emu/dp_fmin.c | 32 ++++++++++++++++++++++++--------
arch/mips/math-emu/sp_fmax.c | 32 ++++++++++++++++++++++++--------
arch/mips/math-emu/sp_fmin.c | 32 ++++++++++++++++++++++++--------
4 files changed, 96 insertions(+), 32 deletions(-)

diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 31f091a..0b53c78 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -116,16 +116,32 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
else if (xs < ys)
return x;

- /* Compare exponent */
- if (xe > ye)
- return x;
- else if (xe < ye)
- return y;
+ /* Signs of inputs are equal, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ }

- /* Compare mantissa */
+ /* Signs and exponents of inputs are equal, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal signs and exponents */
+ if (xm <= ym)
+ return y;
+ return x;
+ }
+ /* Inputs are both negative, with equal signs and exponents */
if (xm <= ym)
- return y;
- return x;
+ return x;
+ return y;
}

union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index e607d55..099e6bd 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -116,16 +116,32 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
else if (xs < ys)
return y;

- /* Compare exponent */
- if (xe > ye)
- return y;
- else if (xe < ye)
- return x;
+ /* Signs of inputs are the same, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ }

- /* Compare mantissa */
+ /* Signs and exponents of inputs are equal, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal signs and exponents */
+ if (xm <= ym)
+ return x;
+ return y;
+ }
+ /* Inputs are both negative, with equal signs and exponents */
if (xm <= ym)
- return x;
- return y;
+ return y;
+ return x;
}

union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index 3ca5b20..7efa772 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -116,16 +116,32 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
else if (xs < ys)
return x;

- /* Compare exponent */
- if (xe > ye)
- return x;
- else if (xe < ye)
- return y;
+ /* Signs of inputs are equal, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ }

- /* Compare mantissa */
+ /* Signs and exponents of inputs are equal, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal signs and exponents */
+ if (xm <= ym)
+ return y;
+ return x;
+ }
+ /* Inputs are both negative, with equal signs and exponents */
if (xm <= ym)
- return y;
- return x;
+ return x;
+ return y;
}

union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index c982647..e2c5543 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -116,16 +116,32 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
else if (xs < ys)
return y;

- /* Compare exponent */
- if (xe > ye)
- return y;
- else if (xe < ye)
- return x;
+ /* Signs of inputs are the same, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ }

- /* Compare mantissa */
+ /* Signs and exponents of inputs are equal, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal signs and exponents */
+ if (xm <= ym)
+ return x;
+ return y;
+ }
+ /* Inputs are both negative, with equal signs and exponents */
if (xm <= ym)
- return x;
- return y;
+ return y;
+ return x;
}

union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
--
2.7.4

2017-07-27 16:12:10

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 08/16] MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with opposite signs

From: Aleksandar Markovic <[email protected]>

Fix the value returned by <MAXA|MINA>.<D|S>, if the inputs are normal
fp numbers of the same absolute value, but opposite signs.

A relevant example:

MAXA.S fd,fs,ft:
If fs contains -3.0, and ft contains +3.0, fd is going to contain
+3.0 (without this patch, it used to contain -3.0).

Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.3+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 8 ++++++--
arch/mips/math-emu/dp_fmin.c | 6 +++++-
arch/mips/math-emu/sp_fmax.c | 8 ++++++--
arch/mips/math-emu/sp_fmin.c | 6 +++++-
4 files changed, 22 insertions(+), 6 deletions(-)

diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 0b53c78..81d12bf 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -243,7 +243,11 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
return y;

/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
return y;
- return x;
+ else if (xm > ym)
+ return x;
+ else if (xs == 0)
+ return x;
+ return y;
}
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 099e6bd..4574f04 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -243,7 +243,11 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
return x;

/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
+ return x;
+ else if (xm > ym)
+ return y;
+ else if (xs == 1)
return x;
return y;
}
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index 7efa772..fb41497 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -243,7 +243,11 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
return y;

/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
return y;
- return x;
+ else if (xm > ym)
+ return x;
+ else if (xs == 0)
+ return x;
+ return y;
}
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index e2c5543..7915b94 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -243,7 +243,11 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
return x;

/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
+ return x;
+ else if (xm > ym)
+ return y;
+ else if (xs == 1)
return x;
return y;
}
--
2.7.4

2017-07-27 16:12:19

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 09/16] MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of both infinite inputs

From: Aleksandar Markovic <[email protected]>

Fix the value returned by <MAXA|MINA>.<D|S> fd,fs,ft, if both inputs
are infinite. The previous implementation returned always the value
contained in ft in such cases. The correct behavior is specified
in Mips instruction set manual and is as follows:

fs ft MAXA MINA
---------------------------------
inf inf inf inf
inf -inf inf -inf
-inf inf inf -inf
-inf -inf -inf -inf

A relevant example:

MAXA.S fd,fs,ft:
If fs contains +inf, and ft contains -inf, fd is going to contain
+inf (without this patch, it used to contain -inf).

Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.3+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 4 +++-
arch/mips/math-emu/dp_fmin.c | 4 +++-
arch/mips/math-emu/sp_fmax.c | 4 +++-
arch/mips/math-emu/sp_fmin.c | 4 +++-
4 files changed, 12 insertions(+), 4 deletions(-)

diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 81d12bf..5bec64f 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -202,6 +202,9 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754dp_inf(xs & ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -209,7 +212,6 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;

- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 4574f04..2495bd7 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -202,6 +202,9 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754dp_inf(xs | ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -209,7 +212,6 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;

- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index fb41497..74a5a00 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -202,6 +202,9 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754sp_inf(xs & ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -209,7 +212,6 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;

- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 7915b94..42ec431 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -202,6 +202,9 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754sp_inf(xs | ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -209,7 +212,6 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;

- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
--
2.7.4

2017-07-27 16:12:24

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 10/16] MIPS: math-emu: MINA.<D|S>: Fix some cases of infinity and zero inputs

From: Aleksandar Markovic <[email protected]>

Fix following special cases for MINA>.<D|S>:

- if one of the inputs is zero, and the other is subnormal, normal,
or infinity, the value of the former should be returned (that is,
a zero).
- if one of the inputs is infinity, and the other input is normal,
or subnormal, the value of the latter should be returned.

The previous implementation's logic for such cases was incorrect - it
appears as if it implements MAXA, and not MINA instruction.

A relevant example:

MINA.S fd,fs,ft:
If fs contains 100.0, and ft contains 0.0, fd is going to contain
0.0 (without this patch, it used to contain 100.0).

Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.3+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_fmin.c | 4 ++--
arch/mips/math-emu/sp_fmin.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 2495bd7..a287b23 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -210,14 +210,14 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
- return x;
+ return y;

case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
- return y;
+ return x;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
return ieee754dp_zero(xs | ys);
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 42ec431..c51385f 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -210,14 +210,14 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
- return x;
+ return y;

case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
- return y;
+ return x;

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
return ieee754sp_zero(xs | ys);
--
2.7.4

2017-07-27 16:12:31

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 11/16] MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix NaN propagation

From: Aleksandar Markovic <[email protected]>

Fix the cases of <MADDF|MSUBF>.<D|S> when any of three inputs is any
NaN. Correct behavior of <MADDF|MSUBF>.<D|S> fd, fs, ft is following:

- if any of inputs is sNaN, return a sNaN using following rules: if
only one input is sNaN, return that one; if more than one input is
sNaN, order of precedence for return value is fd, fs, ft
- if no input is sNaN, but at least one of inputs is qNaN, return a
qNaN using following rules: if only one input is qNaN, return that
one; if more than one input is qNaN, order of precedence for
return value is fd, fs, ft

The previous code contained correct handling of some above cases, but
not all. Also, such handling was scattered into various cases of
"switch (CLPAIR(xc, yc))" statement, and elsewhere. With this patch,
this logic is placed in one place, and "switch (CLPAIR(xc, yc))" is
significantly simplified.

A relevant example:

MADDF.S fd,fs,ft:
If fs contains qNaN1, ft contains qNaN2, and fd contains qNaN3, fd
is going to contain qNaN3 (without this patch, it used to contain
qNaN1).

Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.7+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_maddf.c | 66 +++++++++++++------------------------------
arch/mips/math-emu/sp_maddf.c | 66 ++++++++++++++-----------------------------
2 files changed, 41 insertions(+), 91 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index caa62f2..8b1bd42 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -48,52 +48,34 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,

ieee754_clearcx();

- switch (zc) {
- case IEEE754_CLASS_SNAN:
- ieee754_setcx(IEEE754_INVALID_OPERATION);
+ /*
+ * Handle the cases when at least one of x, y or z is a NaN.
+ * Order of precedence is sNaN, qNaN and z, x, y.
+ */
+ if (zc == IEEE754_CLASS_SNAN)
return ieee754dp_nanxcpt(z);
- case IEEE754_CLASS_DNORM:
- DPDNORMZ;
- /* QNAN and ZERO cases are handled separately below */
- }
-
- switch (CLPAIR(xc, yc)) {
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
- return ieee754dp_nanxcpt(y);
-
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
+ if (xc == IEEE754_CLASS_SNAN)
return ieee754dp_nanxcpt(x);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
+ if (yc == IEEE754_CLASS_SNAN)
+ return ieee754dp_nanxcpt(y);
+ if (zc == IEEE754_CLASS_QNAN)
+ return z;
+ if (xc == IEEE754_CLASS_QNAN)
+ return x;
+ if (yc == IEEE754_CLASS_QNAN)
return y;

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
- return x;
+ if (zc == IEEE754_CLASS_DNORM)
+ DPDNORMZ;
+ /* ZERO z cases are handled separately below */

+ switch (CLPAIR(xc, yc)) {

/*
* Infinity handling
*/
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
ieee754_setcx(IEEE754_INVALID_OPERATION);
return ieee754dp_indef();

@@ -102,8 +84,6 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
return ieee754dp_inf(xs ^ ys);

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
@@ -120,25 +100,19 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
DPDNORMX;

case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
DPDNORMY;
break;

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
DPDNORMX;
break;

case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
/* fall through to real computations */
}
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index c91d5e5..6cdaa2a 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -48,51 +48,35 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,

ieee754_clearcx();

- switch (zc) {
- case IEEE754_CLASS_SNAN:
- ieee754_setcx(IEEE754_INVALID_OPERATION);
+ /*
+ * Handle the cases when at least one of x, y or z is a NaN.
+ * Order of precedence is sNaN, qNaN and z, x, y.
+ */
+ if (zc == IEEE754_CLASS_SNAN)
return ieee754sp_nanxcpt(z);
- case IEEE754_CLASS_DNORM:
- SPDNORMZ;
- /* QNAN and ZERO cases are handled separately below */
- }
-
- switch (CLPAIR(xc, yc)) {
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
+ if (xc == IEEE754_CLASS_SNAN)
+ return ieee754sp_nanxcpt(x);
+ if (yc == IEEE754_CLASS_SNAN)
return ieee754sp_nanxcpt(y);
+ if (zc == IEEE754_CLASS_QNAN)
+ return z;
+ if (xc == IEEE754_CLASS_QNAN)
+ return x;
+ if (yc == IEEE754_CLASS_QNAN)
+ return y;

- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
- return ieee754sp_nanxcpt(x);
+ if (zc == IEEE754_CLASS_DNORM)
+ SPDNORMZ;
+ /* ZERO z cases are handled separately below */

- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
- return y;
+ switch (CLPAIR(xc, yc)) {

- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
- return x;

/*
* Infinity handling
*/
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
ieee754_setcx(IEEE754_INVALID_OPERATION);
return ieee754sp_indef();

@@ -101,8 +85,6 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
return ieee754sp_inf(xs ^ ys);

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
@@ -119,25 +101,19 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
SPDNORMX;

case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
SPDNORMY;
break;

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
SPDNORMX;
break;

case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
/* fall through to real computations */
}
--
2.7.4

2017-07-27 16:12:37

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 12/16] MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of infinite inputs

From: Aleksandar Markovic <[email protected]>

Fix the cases of <MADDF|MSUBF>.<D|S> when any of two multiplicands is
infinity. The correct behavior in such cases is affected by the nature
of third input. Cases of addition of infinities with opposite signs
and subtraction of infinities with same signs may arise and must be
handles separately. Also, the value od flags argument (that determines
whether the instruction is MADDF or MSUBF) affects the outcome.

Relevant examples:

MADDF.S fd,fs,ft:
If fs contains +inf, ft contains +inf, and fd contains -inf, fd is
going to contain indef (without this patch, it used to contain
-inf).

MSUBF.S fd,fs,ft:
If fs contains +inf, ft contains 1.0, and fd contains +0.0, fd is
going to contain -inf (without this patch, it used to contain +inf).

Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")

Signed-off-by: Douglas Leung <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.7+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_maddf.c | 22 +++++++++++++++++++++-
arch/mips/math-emu/sp_maddf.c | 22 +++++++++++++++++++++-
2 files changed, 42 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 8b1bd42..557a0a1 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -84,7 +84,27 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- return ieee754dp_inf(xs ^ ys);
+ if ((zc == IEEE754_CLASS_INF) &&
+ ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
+ ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ /*
+ * Cases of addition of infinities with opposite signs
+ * or subtraction of infinities with same signs.
+ */
+ ieee754_setcx(IEEE754_INVALID_OPERATION);
+ return ieee754dp_indef();
+ }
+ /*
+ * z is here either not an infinity, or an infinity having the
+ * same sign as product (x*y) (in case of MADDF.D instruction)
+ * or product -(x*y) (in MSUBF.D case). The result must be an
+ * infinity, and its sign is determined only by the value of
+ * (flags & maddf_negate_product) and the signs of x and y.
+ */
+ if (flags & maddf_negate_product)
+ return ieee754dp_inf(1 ^ (xs ^ ys));
+ else
+ return ieee754dp_inf(xs ^ ys);

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 6cdaa2a..0d8d25f 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -85,7 +85,27 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- return ieee754sp_inf(xs ^ ys);
+ if ((zc == IEEE754_CLASS_INF) &&
+ ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
+ ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ /*
+ * Cases of addition of infinities with opposite signs
+ * or subtraction of infinities with same signs.
+ */
+ ieee754_setcx(IEEE754_INVALID_OPERATION);
+ return ieee754sp_indef();
+ }
+ /*
+ * z is here either not an infinity, or an infinity having the
+ * same sign as product (x*y) (in case of MADDF.D instruction)
+ * or product -(x*y) (in MSUBF.D case). The result must be an
+ * infinity, and its sign is determined only by the value of
+ * (flags & maddf_negate_product) and the signs of x and y.
+ */
+ if (flags & maddf_negate_product)
+ return ieee754sp_inf(1 ^ (xs ^ ys));
+ else
+ return ieee754sp_inf(xs ^ ys);

case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
--
2.7.4

2017-07-27 16:12:44

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 14/16] MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Clean up "maddf_flags" enumeration

From: Aleksandar Markovic <[email protected]>

Fix definition and usage of "maddf_flags" enumeration. Avoid duplicate
definition and apply more common capitalization.

This patch does not change any scenario. It just makes MADDF and
MSUBF emulation code more readable and easier to maintain, and
hopefully prevents future bugs as well.

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.7+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_maddf.c | 19 ++++++++-----------
arch/mips/math-emu/ieee754int.h | 4 ++++
arch/mips/math-emu/sp_maddf.c | 19 ++++++++-----------
3 files changed, 20 insertions(+), 22 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index c38fe1b..e799fc8 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -14,9 +14,6 @@

#include "ieee754dp.h"

-enum maddf_flags {
- maddf_negate_product = 1 << 0,
-};

static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
union ieee754dp y, enum maddf_flags flags)
@@ -85,8 +82,8 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
if ((zc == IEEE754_CLASS_INF) &&
- ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
- ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ ((!(flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))) ||
+ ((flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))))) {
/*
* Cases of addition of infinities with opposite signs
* or subtraction of infinities with same signs.
@@ -99,9 +96,9 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
* same sign as product (x*y) (in case of MADDF.D instruction)
* or product -(x*y) (in MSUBF.D case). The result must be an
* infinity, and its sign is determined only by the value of
- * (flags & maddf_negate_product) and the signs of x and y.
+ * (flags & MADDF_NEGATE_PRODUCT) and the signs of x and y.
*/
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
return ieee754dp_inf(1 ^ (xs ^ ys));
else
return ieee754dp_inf(xs ^ ys);
@@ -115,9 +112,9 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
return ieee754dp_inf(zs);
if (zc == IEEE754_CLASS_ZERO) {
/* Handle cases +0 + (-0) and similar ones. */
- if ((!(flags & maddf_negate_product)
+ if ((!(flags & MADDF_NEGATE_PRODUCT)
&& (zs == (xs ^ ys))) ||
- ((flags & maddf_negate_product)
+ ((flags & MADDF_NEGATE_PRODUCT)
&& (zs != (xs ^ ys))))
/*
* Cases of addition of zeros of equal signs
@@ -167,7 +164,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,

re = xe + ye;
rs = xs ^ ys;
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
rs ^= 1;

/* shunt to top of word */
@@ -291,5 +288,5 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
union ieee754dp y)
{
- return _dp_maddf(z, x, y, maddf_negate_product);
+ return _dp_maddf(z, x, y, MADDF_NEGATE_PRODUCT);
}
diff --git a/arch/mips/math-emu/ieee754int.h b/arch/mips/math-emu/ieee754int.h
index 8bc2f69..dd2071f 100644
--- a/arch/mips/math-emu/ieee754int.h
+++ b/arch/mips/math-emu/ieee754int.h
@@ -26,6 +26,10 @@

#define CLPAIR(x, y) ((x)*6+(y))

+enum maddf_flags {
+ MADDF_NEGATE_PRODUCT = 1 << 0,
+};
+
static inline void ieee754_clearcx(void)
{
ieee754_csr.cx = 0;
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 4241ec1..07f5a9b 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -14,9 +14,6 @@

#include "ieee754sp.h"

-enum maddf_flags {
- maddf_negate_product = 1 << 0,
-};

static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
union ieee754sp y, enum maddf_flags flags)
@@ -86,8 +83,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
if ((zc == IEEE754_CLASS_INF) &&
- ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
- ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ ((!(flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))) ||
+ ((flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))))) {
/*
* Cases of addition of infinities with opposite signs
* or subtraction of infinities with same signs.
@@ -100,9 +97,9 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
* same sign as product (x*y) (in case of MADDF.D instruction)
* or product -(x*y) (in MSUBF.D case). The result must be an
* infinity, and its sign is determined only by the value of
- * (flags & maddf_negate_product) and the signs of x and y.
+ * (flags & MADDF_NEGATE_PRODUCT) and the signs of x and y.
*/
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
return ieee754sp_inf(1 ^ (xs ^ ys));
else
return ieee754sp_inf(xs ^ ys);
@@ -116,9 +113,9 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
return ieee754sp_inf(zs);
if (zc == IEEE754_CLASS_ZERO) {
/* Handle cases +0 + (-0) and similar ones. */
- if ((!(flags & maddf_negate_product)
+ if ((!(flags & MADDF_NEGATE_PRODUCT)
&& (zs == (xs ^ ys))) ||
- ((flags & maddf_negate_product)
+ ((flags & MADDF_NEGATE_PRODUCT)
&& (zs != (xs ^ ys))))
/*
* Cases of addition of zeros of equal signs
@@ -170,7 +167,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,

re = xe + ye;
rs = xs ^ ys;
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
rs ^= 1;

/* shunt to top of word */
@@ -287,5 +284,5 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
union ieee754sp y)
{
- return _sp_maddf(z, x, y, maddf_negate_product);
+ return _sp_maddf(z, x, y, MADDF_NEGATE_PRODUCT);
}
--
2.7.4

2017-07-27 16:12:47

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 15/16] MIPS: math-emu: <MADDF|MSUBF>.S: Fix accuracy (32-bit case)

From: Douglas Leung <[email protected]>

Implement fused multiply-add with correct accuracy.

Fused multiply-add operation has better accuracy than respective
sequential execution of multiply and add operations applied on the
same inputs. This is because accuracy errors accumulate in latter
case.

This patch implements fused multiply-add with the same accuracy
as it is implemented in hardware, using 64-bit intermediate
calculations.

One test case example (raw bits) that this patch fixes:

MADDF.S fd,fs,ft:
fd = 0x22575225
fs = ft = 0x3727c5ac

Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")

Signed-off-by: Douglas Leung <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.7+
---
arch/mips/math-emu/ieee754sp.h | 4 ++
arch/mips/math-emu/sp_maddf.c | 116 ++++++++++++++++-------------------------
2 files changed, 50 insertions(+), 70 deletions(-)

diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index 8476067..0f63e42 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -45,6 +45,10 @@ static inline int ieee754sp_finite(union ieee754sp x)
return SPBEXP(x) != SP_EMAX + 1 + SP_EBIAS;
}

+/* 64 bit right shift with rounding */
+#define XSPSRS64(v, rs) \
+ (((rs) >= 64) ? ((v) != 0) : ((v) >> (rs)) | ((v) << (64-(rs)) != 0))
+
/* 3bit extended single precision sticky right shift */
#define XSPSRS(v, rs) \
((rs > (SP_FBITS+3))?1:((v) >> (rs)) | ((v) << (32-(rs)) != 0))
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 07f5a9b..7195fe7 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -21,14 +21,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
int re;
int rs;
unsigned rm;
- unsigned short lxm;
- unsigned short hxm;
- unsigned short lym;
- unsigned short hym;
- unsigned lrm;
- unsigned hrm;
- unsigned t;
- unsigned at;
+ uint64_t rm64;
+ uint64_t zm64;
int s;

COMPXSP;
@@ -170,108 +164,90 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
if (flags & MADDF_NEGATE_PRODUCT)
rs ^= 1;

- /* shunt to top of word */
- xm <<= 32 - (SP_FBITS + 1);
- ym <<= 32 - (SP_FBITS + 1);
+ /* Multiple 24 bit xm and ym to give 48 bit results */
+ rm64 = (uint64_t)xm * ym;

- /*
- * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
- */
- lxm = xm & 0xffff;
- hxm = xm >> 16;
- lym = ym & 0xffff;
- hym = ym >> 16;
-
- lrm = lxm * lym; /* 16 * 16 => 32 */
- hrm = hxm * hym; /* 16 * 16 => 32 */
-
- t = lxm * hym; /* 16 * 16 => 32 */
- at = lrm + (t << 16);
- hrm += at < lrm;
- lrm = at;
- hrm = hrm + (t >> 16);
-
- t = hxm * lym; /* 16 * 16 => 32 */
- at = lrm + (t << 16);
- hrm += at < lrm;
- lrm = at;
- hrm = hrm + (t >> 16);
-
- rm = hrm | (lrm != 0);
+ /* Shunt to top of word */
+ rm64 = rm64 << 16;

- /*
- * Sticky shift down to normal rounding precision.
- */
- if ((int) rm < 0) {
- rm = (rm >> (32 - (SP_FBITS + 1 + 3))) |
- ((rm << (SP_FBITS + 1 + 3)) != 0);
+ /* Put explicit bit at bit 62 if necessary */
+ if ((int64_t) rm64 < 0) {
+ rm64 = rm64 >> 1;
re++;
- } else {
- rm = (rm >> (32 - (SP_FBITS + 1 + 3 + 1))) |
- ((rm << (SP_FBITS + 1 + 3 + 1)) != 0);
}
- assert(rm & (SP_HIDDEN_BIT << 3));

- if (zc == IEEE754_CLASS_ZERO)
- return ieee754sp_format(rs, re, rm);
-
- /* And now the addition */
+ assert(rm64 & (1 << 62));

- assert(zm & SP_HIDDEN_BIT);
+ if (zc == IEEE754_CLASS_ZERO) {
+ /*
+ * Move explicit bit from bit 62 to bit 26 since the
+ * ieee754sp_format code expects the mantissa to be
+ * 27 bits wide (24 + 3 rounding bits).
+ */
+ rm = XSPSRS64(rm64, (62 - 26));
+ return ieee754sp_format(rs, re, rm);
+ }

- /*
- * Provide guard,round and stick bit space.
- */
- zm <<= 3;
+ /* Move explicit bit from bit 23 to bit 62 */
+ zm64 = (uint64_t)zm << (62 - 23);
+ assert(zm64 & (1 << 62));

+ /* Make the exponents the same */
if (ze > re) {
/*
* Have to shift r fraction right to align.
*/
s = ze - re;
- rm = XSPSRS(rm, s);
+ rm64 = XSPSRS64(rm64, s);
re += s;
} else if (re > ze) {
/*
* Have to shift z fraction right to align.
*/
s = re - ze;
- zm = XSPSRS(zm, s);
+ zm64 = XSPSRS64(zm64, s);
ze += s;
}
assert(ze == re);
assert(ze <= SP_EMAX);

+ /* Do the addition */
if (zs == rs) {
/*
- * Generate 28 bit result of adding two 27 bit numbers
- * leaving result in zm, zs and ze.
+ * Generate 64 bit result by adding two 63 bit numbers
+ * leaving result in zm64, zs and ze.
*/
- zm = zm + rm;
-
- if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
- zm = XSPSRS1(zm);
+ zm64 = zm64 + rm64;
+ if ((int64_t)zm64 < 0) { /* carry out */
+ zm64 = XSPSRS1(zm64);
ze++;
}
} else {
- if (zm >= rm) {
- zm = zm - rm;
+ if (zm64 >= rm64) {
+ zm64 = zm64 - rm64;
} else {
- zm = rm - zm;
+ zm64 = rm64 - zm64;
zs = rs;
}
- if (zm == 0)
+ if (zm64 == 0)
return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);

/*
- * Normalize in extended single precision
+ * Put explicit bit at bit 62 if necessary.
*/
- while ((zm >> (SP_MBITS + 3)) == 0) {
- zm <<= 1;
+ while ((zm64 >> 62) == 0) {
+ zm64 <<= 1;
ze--;
}
-
}
+
+ /*
+ * Move explicit bit from bit 62 to bit 26 since the
+ * ieee754sp_format code expects the mantissa to be
+ * 27 bits wide (24 + 3 rounding bits).
+ */
+ zm = XSPSRS64(zm64, (62 - 26));
+
return ieee754sp_format(zs, ze, zm);
}

--
2.7.4

2017-07-27 16:12:52

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 16/16] MIPS: math-emu: <MADDF|MSUBF>.D: Fix accuracy (64-bit case)

From: Douglas Leung <[email protected]>

Implement fused multiply-add with correct accuracy.

Fused multiply-add operation has better accuracy than respective
sequential execution of multiply and add operations applied on the
same inputs. This is because accuracy errors accumulate in latter
case.

This patch implements fused multiply-add with the same accuracy
as it is implemented in hardware, using 128-bit intermediate
calculations.

One test case example (raw bits) that this patch fixes:

MADDF.D fd,fs,ft:
fd = 0x00000ca000000000
fs = ft = 0x3f40624dd2f1a9fc

Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")

Signed-off-by: Douglas Leung <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.7+
---
arch/mips/math-emu/dp_maddf.c | 133 +++++++++++++++++++++++++++++-------------
1 file changed, 94 insertions(+), 39 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index e799fc8..e0d9be5 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -15,18 +15,44 @@
#include "ieee754dp.h"


+/* 128 bits shift right logical with rounding. */
+void srl128(u64 *hptr, u64 *lptr, int count)
+{
+ u64 low;
+
+ if (count >= 128) {
+ *lptr = *hptr != 0 || *lptr != 0;
+ *hptr = 0;
+ } else if (count >= 64) {
+ if (count == 64) {
+ *lptr = *hptr | (*lptr != 0);
+ } else {
+ low = *lptr;
+ *lptr = *hptr >> (count - 64);
+ *lptr |= (*hptr << (128 - count)) != 0 || low != 0;
+ }
+ *hptr = 0;
+ } else {
+ low = *lptr;
+ *lptr = low >> count | *hptr << (64 - count);
+ *lptr |= (low << (64 - count)) != 0;
+ *hptr = *hptr >> count;
+ }
+}
+
static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
union ieee754dp y, enum maddf_flags flags)
{
int re;
int rs;
- u64 rm;
unsigned lxm;
unsigned hxm;
unsigned lym;
unsigned hym;
u64 lrm;
u64 hrm;
+ u64 lzm;
+ u64 hzm;
u64 t;
u64 at;
int s;
@@ -172,7 +198,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
ym <<= 64 - (DP_FBITS + 1);

/*
- * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
+ * Multiply 64 bits xm and ym to give 128 bits result in hrm:lrm.
*/

/* 32 * 32 => 64 */
@@ -202,81 +228,110 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,

hrm = hrm + (t >> 32);

- rm = hrm | (lrm != 0);
-
- /*
- * Sticky shift down to normal rounding precision.
- */
- if ((s64) rm < 0) {
- rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
- ((rm << (DP_FBITS + 1 + 3)) != 0);
+ /* Put explicit bit at bit 126 if necessary */
+ if ((int64_t)hrm < 0) {
+ lrm = (hrm << 63) | (lrm >> 1);
+ hrm = hrm >> 1;
re++;
- } else {
- rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
- ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
}
- assert(rm & (DP_HIDDEN_BIT << 3));

- if (zc == IEEE754_CLASS_ZERO)
- return ieee754dp_format(rs, re, rm);
+ assert(hrm & (1 << 62));

- /* And now the addition */
- assert(zm & DP_HIDDEN_BIT);
+ if (zc == IEEE754_CLASS_ZERO) {
+ /*
+ * Move explicit bit from bit 126 to bit 55 since the
+ * ieee754dp_format code expects the mantissa to be
+ * 56 bits wide (53 + 3 rounding bits).
+ */
+ srl128(&hrm, &lrm, (126 - 55));
+ return ieee754dp_format(rs, re, lrm);
+ }

- /*
- * Provide guard,round and stick bit space.
- */
- zm <<= 3;
+ /* Move explicit bit from bit 52 to bit 126 */
+ lzm = 0;
+ hzm = zm << 10;
+ assert(hzm & (1 << 62));

+ /* Make the exponents the same */
if (ze > re) {
/*
* Have to shift y fraction right to align.
*/
s = ze - re;
- rm = XDPSRS(rm, s);
+ srl128(&hrm, &lrm, s);
re += s;
} else if (re > ze) {
/*
* Have to shift x fraction right to align.
*/
s = re - ze;
- zm = XDPSRS(zm, s);
+ srl128(&hzm, &lzm, s);
ze += s;
}
assert(ze == re);
assert(ze <= DP_EMAX);

+ /* Do the addition */
if (zs == rs) {
/*
- * Generate 28 bit result of adding two 27 bit numbers
- * leaving result in xm, xs and xe.
+ * Generate 128 bit result by adding two 127 bit numbers
+ * leaving result in hzm:lzm, zs and ze.
*/
- zm = zm + rm;
-
- if (zm >> (DP_FBITS + 1 + 3)) { /* carry out */
- zm = XDPSRS1(zm);
+ hzm = hzm + hrm + (lzm > (lzm + lrm));
+ lzm = lzm + lrm;
+ if ((int64_t)hzm < 0) { /* carry out */
+ srl128(&hzm, &lzm, 1);
ze++;
}
} else {
- if (zm >= rm) {
- zm = zm - rm;
+ if (hzm > hrm || (hzm == hrm && lzm >= lrm)) {
+ hzm = hzm - hrm - (lzm < lrm);
+ lzm = lzm - lrm;
} else {
- zm = rm - zm;
+ hzm = hrm - hzm - (lrm < lzm);
+ lzm = lrm - lzm;
zs = rs;
}
- if (zm == 0)
+ if (lzm == 0 && hzm == 0)
return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);

/*
- * Normalize to rounding precision.
+ * Put explicit bit at bit 126 if necessary.
*/
- while ((zm >> (DP_FBITS + 3)) == 0) {
- zm <<= 1;
- ze--;
+ if (hzm == 0) {
+ /* left shift by 63 or 64 bits */
+ if ((int64_t)lzm < 0) {
+ /* MSB of lzm is the explicit bit */
+ hzm = lzm >> 1;
+ lzm = lzm << 63;
+ ze -= 63;
+ } else {
+ hzm = lzm;
+ lzm = 0;
+ ze -= 64;
+ }
+ }
+
+ t = 0;
+ while ((hzm >> (62 - t)) == 0)
+ t++;
+
+ assert(t <= 62);
+ if (t) {
+ hzm = hzm << t | lzm >> (64 - t);
+ lzm = lzm << t;
+ ze -= t;
}
}

- return ieee754dp_format(zs, ze, zm);
+ /*
+ * Move explicit bit from bit 126 to bit 55 since the
+ * ieee754dp_format code expects the mantissa to be
+ * 56 bits wide (53 + 3 rounding bits).
+ */
+ srl128(&hzm, &lzm, (126 - 55));
+
+ return ieee754dp_format(zs, ze, lzm);
}

union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
--
2.7.4

2017-07-27 16:13:32

by Aleksandar Markovic

[permalink] [raw]
Subject: [PATCH v4 13/16] MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of zero inputs

From: Aleksandar Markovic <[email protected]>

Fix the cases of <MADDF|MSUBF>.<D|S> when any of two multiplicands is
+0 or -0, and the third input is also +0 or -0. Depending on the signs
of inputs, certain special cases must be handled.

A relevant example:

MADDF.S fd,fs,ft:
If fs contains +0.0, ft contains -0.0, and fd contains 0.0, fd is
going to contain +0.0 (without this patch, it used to contain -0.0).

Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")

Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
Cc: <[email protected]> # 4.7+
Reviewed-by: James Hogan <[email protected]>
---
arch/mips/math-emu/dp_maddf.c | 18 +++++++++++++++++-
arch/mips/math-emu/sp_maddf.c | 18 +++++++++++++++++-
2 files changed, 34 insertions(+), 2 deletions(-)

diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 557a0a1..c38fe1b 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -113,7 +113,23 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
- /* Multiplication is 0 so just return z */
+ if (zc == IEEE754_CLASS_ZERO) {
+ /* Handle cases +0 + (-0) and similar ones. */
+ if ((!(flags & maddf_negate_product)
+ && (zs == (xs ^ ys))) ||
+ ((flags & maddf_negate_product)
+ && (zs != (xs ^ ys))))
+ /*
+ * Cases of addition of zeros of equal signs
+ * or subtraction of zeroes of opposite signs.
+ * The sign of the resulting zero is in any
+ * such case determined only by the sign of z.
+ */
+ return z;
+
+ return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
+ }
+ /* x*y is here 0, and z is not 0, so just return z */
return z;

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 0d8d25f..4241ec1 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -114,7 +114,23 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
- /* Multiplication is 0 so just return z */
+ if (zc == IEEE754_CLASS_ZERO) {
+ /* Handle cases +0 + (-0) and similar ones. */
+ if ((!(flags & maddf_negate_product)
+ && (zs == (xs ^ ys))) ||
+ ((flags & maddf_negate_product)
+ && (zs != (xs ^ ys))))
+ /*
+ * Cases of addition of zeros of equal signs
+ * or subtraction of zeroes of opposite signs.
+ * The sign of the resulting zero is in any
+ * such case determined only by the sign of z.
+ */
+ return z;
+
+ return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
+ }
+ /* x*y is here 0, and z is not 0, so just return z */
return z;

case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
--
2.7.4

2017-08-07 10:09:38

by Ralf Baechle

[permalink] [raw]
Subject: Re: [PATCH v4 05/16] MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation

On Thu, Jul 27, 2017 at 06:08:48PM +0200, Aleksandar Markovic wrote:

This is a mindbogglingly big series of FP fixes. There's always a risk
associated with fixes so I'd prefer if they could wait for 4.14. For now
I'm going to apply them to my 4.14 branch but let me know if you think they
are more urgent than that?

Ralf