From: Aleksandar Markovic <[email protected]>
v2->v3:
- added a patch that fixes clobber lists in vdso fallback cases
- added 6 patches related to MIN/MINA/MAX/MAXA issues
- added 6 patches related to MADDF/MSUBF issues
- enhanced logic and comments in patch on multitouch
- fixed a number of minor spelling and format mistakes in code
comments and commit messages
- several patches removed since they got integrated into the tree
- order of patches changed to better reflect similarity
- rebased to the latest kernel code
v1->v2:
- the patch on PREF usage in memcpy dropped as not needed
- updated recipient lists using get_maintainer.pl
- rebased to the latest kernel code
This series contains an assortment of changes necessary for proper
operation of the Android emulator for MIPS. However, we think that the
wider kernel community may benefit from them too.
Aleksandar Markovic (10):
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation
MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix cases of both inputs
zero
MIPS: math-emu: <MAX|MIN>.<D|S>: Fix cases of both inputs negative
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with
opposite signs
MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of both infinite inputs
MIPS: math-emu: MINA.<D|S>: Fix some cases of infinity and zero inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix NaN propagation
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of infinite inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of zero inputs
MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Clean up maddf_flags enumeration
Douglas Leung (2):
MIPS: math-emu: <MADDF|MSUBF>.S: Fix accuracy (32-bit case)
MIPS: math-emu: <MADDF|MSUBF>.D: Fix accuracy (64-bit case)
Goran Ferenc (1):
MIPS: VDSO: Fix clobber lists in fallback code paths
Lingfeng Yang (1):
input: goldfish: Fix multitouch event handling
Miodrag Dinic (2):
tty: goldfish: Use streaming DMA for r/w operations on Ranchu
platforms
tty: goldfish: Implement support for kernel 'earlycon' parameter
arch/mips/math-emu/dp_fmax.c | 61 +++++---
arch/mips/math-emu/dp_fmin.c | 63 +++++---
arch/mips/math-emu/dp_maddf.c | 237 +++++++++++++++++++------------
arch/mips/math-emu/ieee754int.h | 4 +
arch/mips/math-emu/ieee754sp.h | 4 +
arch/mips/math-emu/sp_fmax.c | 61 +++++---
arch/mips/math-emu/sp_fmin.c | 62 +++++---
arch/mips/math-emu/sp_maddf.c | 221 +++++++++++++---------------
arch/mips/vdso/gettimeofday.c | 6 +-
drivers/input/keyboard/goldfish_events.c | 35 ++++-
drivers/tty/Kconfig | 3 +
drivers/tty/goldfish.c | 145 +++++++++++++++++--
12 files changed, 596 insertions(+), 306 deletions(-)
--
2.7.4
From: Lingfeng Yang <[email protected]>
Register Goldfish Events device properly as a multitouch device,
and send SYN_REPORT event in appropriate cases only.
If SYN_REPORT is sent on every single multitouch event, it breaks
multitouch: touch input becomes janky, the user has to click 2-3
times for an action to register, and notification bars are activated
at random when not clicking. If these SYN_REPORT events are
suppressed, multitouch works fine, and the resulting event protocol
is cleaner.
In addition, the Goldfish Events device needs to be registered as a
multitouch device by issuing input_mt_init_slots. Otherwise,
input_handle_abs_event in drivers/input/input.c will silently drop
all ABS_MT_SLOT events, causing touches with more than one finger
not to work properly.
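The effect of the SYN_REPORT change can be modelled in user space. The
following is only a sketch: struct ev, count_sync_old() and
count_sync_new() are hypothetical names made up for illustration and are
not part of the driver. The event type values are the ones defined in
linux/input-event-codes.h.

```c
#include <stddef.h>

/* Event type values as defined in <linux/input-event-codes.h>. */
enum { EV_SYN = 0x00, EV_KEY = 0x01, EV_ABS = 0x03 };

/* Hypothetical container for one event read from the device. */
struct ev {
	int type;
	int code;
	int value;
};

/* Old behaviour: the driver adds one SYN_REPORT per forwarded event. */
static size_t count_sync_old(const struct ev *evs, size_t n)
{
	(void)evs;
	return n;
}

/* New behaviour: the driver adds a SYN_REPORT only after EV_KEY events. */
static size_t count_sync_new(const struct ev *evs, size_t n)
{
	size_t i, syncs = 0;

	for (i = 0; i < n; i++)
		if (evs[i].type == EV_KEY)
			syncs++;
	return syncs;
}
```

For a touch frame made of ABS_MT_* events followed by the device's own
EV_SYN, the old scheme injects a SYN_REPORT after every event, while the
new one injects none and leaves frame delimiting to the device.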
Signed-off-by: Lingfeng Yang <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
drivers/input/keyboard/goldfish_events.c | 35 +++++++++++++++++++++++++++++++-
1 file changed, 34 insertions(+), 1 deletion(-)
diff --git a/drivers/input/keyboard/goldfish_events.c b/drivers/input/keyboard/goldfish_events.c
index f6e643b..bc3e8b3 100644
--- a/drivers/input/keyboard/goldfish_events.c
+++ b/drivers/input/keyboard/goldfish_events.c
@@ -17,6 +17,7 @@
#include <linux/interrupt.h>
#include <linux/types.h>
#include <linux/input.h>
+#include <linux/input/mt.h>
#include <linux/kernel.h>
#include <linux/platform_device.h>
#include <linux/slab.h>
@@ -24,6 +25,8 @@
#include <linux/io.h>
#include <linux/acpi.h>
+#define GOLDFISH_MAX_FINGERS 5
+
enum {
REG_READ = 0x00,
REG_SET_PAGE = 0x00,
@@ -52,7 +55,22 @@ static irqreturn_t events_interrupt(int irq, void *dev_id)
value = __raw_readl(edev->addr + REG_READ);
input_event(edev->input, type, code, value);
- input_sync(edev->input);
+
+ /*
+ * Send an extra (EV_SYN, SYN_REPORT, 0x0) event only if a key
+ * was pressed. Some keyboard device drivers may only send the
+ * EV_KEY event and not the EV_SYN event.
+ *
+ * Note that sending an extra SYN_REPORT is neither necessary nor
+ * correct protocol with other devices such as touchscreens,
+ * which will send their own SYN_REPORTs when sufficient
+ * event information has been collected (for example, in case of
+ * touchscreens, when pressure and X/Y coordinates have been
+ * received). Hence, we will only send this extra SYN_REPORT
+ * if type == EV_KEY.
+ */
+ if (type == EV_KEY)
+ input_sync(edev->input);
return IRQ_HANDLED;
}
@@ -155,6 +173,21 @@ static int events_probe(struct platform_device *pdev)
input_dev->name = edev->name;
input_dev->id.bustype = BUS_HOST;
+ /*
+ * Set the Goldfish Device to be multitouch.
+ *
+ * In the Ranchu kernel, there is multitouch-specific code for
+ * handling ABS_MT_SLOT events (see file drivers/input/input.c,
+ * function input_handle_abs_event). If we do not issue
+ * input_mt_init_slots, the kernel will filter out needed
+ * ABS_MT_SLOT events when we touch the screen in more than one
+ * place, preventing multitouch with more than one finger from
+ * working.
+ */
+ error = input_mt_init_slots(input_dev, GOLDFISH_MAX_FINGERS, 0);
+ if (error)
+ return error;
+
events_import_bits(edev, input_dev->evbit, EV_SYN, EV_MAX);
events_import_bits(edev, input_dev->keybit, EV_KEY, KEY_MAX);
events_import_bits(edev, input_dev->relbit, EV_REL, REL_MAX);
--
2.7.4
From: Miodrag Dinic <[email protected]>
Implement tty r/w operations using streaming DMA.
Goldfish tty for Ranchu platforms has been modified to use
streaming DMA mappings for read/write operations. This change
eliminates the need for snooping through the TLB in QEMU using
cpu_get_phys_page_debug() which does not guarantee that it will
return the valid va -> pa mapping.
The streaming DMA mapping is implemented using dma_map_single() per
transfer, while dma_unmap_single() is used for unmapping right after
the DMA transfer.
Using the DMA API is the proper way of handling r/w transfers and
makes this driver more portable, thus effectively eliminating
the need for virt_to_page() and page_to_phys() conversions.
This change does not affect the old-style Goldfish tty behaviour,
which is still used by the Goldfish emulator. A version register has
been added and is probed to see which platform is running this driver.
Reading the new GOLDFISH_TTY_VERSION register on the Goldfish
emulator returns 0, and the driver works with virtual addresses.
On Ranchu it returns 1, and thus DMA is used.
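In the DMA path, each transfer is split so that no chunk crosses a page
boundary, and each chunk is mapped, transferred and unmapped on its own.
The splitting arithmetic can be sketched in user space as follows;
split_by_page(), SKETCH_PAGE_SIZE and SKETCH_PAGE_MASK are hypothetical
names, and a 4 KiB page size is assumed purely for illustration.

```c
#include <stddef.h>
#include <stdint.h>

/* Assumed page size for this sketch only. */
#define SKETCH_PAGE_SIZE 4096UL
#define SKETCH_PAGE_MASK (~(SKETCH_PAGE_SIZE - 1))

/*
 * Split [addr, addr + count) into chunks that never cross a page
 * boundary. Chunk lengths are stored in out[]; at most max_out chunks
 * are produced. Returns the number of chunks written.
 */
static size_t split_by_page(uintptr_t addr, size_t count,
			    size_t *out, size_t max_out)
{
	uintptr_t end = addr + count;
	size_t n = 0;

	while (addr < end && n < max_out) {
		uintptr_t pg_end = (addr & SKETCH_PAGE_MASK) + SKETCH_PAGE_SIZE;
		uintptr_t next = pg_end < end ? pg_end : end;

		/* The driver would map, transfer and unmap this chunk here. */
		out[n++] = (size_t)(next - addr);
		addr = next;
	}
	return n;
}
```

A 0x30-byte buffer starting at 0x1ff0 is split into a 0x10-byte chunk up
to the page boundary and a 0x20-byte chunk on the next page.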
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
drivers/tty/goldfish.c | 119 ++++++++++++++++++++++++++++++++++++++++++++-----
1 file changed, 108 insertions(+), 11 deletions(-)
diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
index 996bd47..acd50fa 100644
--- a/drivers/tty/goldfish.c
+++ b/drivers/tty/goldfish.c
@@ -22,6 +22,8 @@
#include <linux/io.h>
#include <linux/module.h>
#include <linux/goldfish.h>
+#include <linux/mm.h>
+#include <linux/dma-mapping.h>
enum {
GOLDFISH_TTY_PUT_CHAR = 0x00,
@@ -32,6 +34,8 @@ enum {
GOLDFISH_TTY_DATA_LEN = 0x14,
GOLDFISH_TTY_DATA_PTR_HIGH = 0x18,
+ GOLDFISH_TTY_VERSION = 0x20,
+
GOLDFISH_TTY_CMD_INT_DISABLE = 0,
GOLDFISH_TTY_CMD_INT_ENABLE = 1,
GOLDFISH_TTY_CMD_WRITE_BUFFER = 2,
@@ -45,6 +49,8 @@ struct goldfish_tty {
u32 irq;
int opencount;
struct console console;
+ u32 version;
+ struct device *dev;
};
static DEFINE_MUTEX(goldfish_tty_lock);
@@ -53,24 +59,94 @@ static u32 goldfish_tty_line_count = 8;
static u32 goldfish_tty_current_line_count;
static struct goldfish_tty *goldfish_ttys;
-static void goldfish_tty_do_write(int line, const char *buf, unsigned count)
+static inline void do_rw_io(struct goldfish_tty *qtty,
+ unsigned long address,
+ unsigned int count,
+ int is_write)
{
unsigned long irq_flags;
- struct goldfish_tty *qtty = &goldfish_ttys[line];
void __iomem *base = qtty->base;
+
spin_lock_irqsave(&qtty->lock, irq_flags);
- gf_write_ptr(buf, base + GOLDFISH_TTY_DATA_PTR,
+ gf_write_ptr((void *)address, base + GOLDFISH_TTY_DATA_PTR,
base + GOLDFISH_TTY_DATA_PTR_HIGH);
writel(count, base + GOLDFISH_TTY_DATA_LEN);
- writel(GOLDFISH_TTY_CMD_WRITE_BUFFER, base + GOLDFISH_TTY_CMD);
+
+ if (is_write)
+ writel(GOLDFISH_TTY_CMD_WRITE_BUFFER, base + GOLDFISH_TTY_CMD);
+ else
+ writel(GOLDFISH_TTY_CMD_READ_BUFFER, base + GOLDFISH_TTY_CMD);
+
spin_unlock_irqrestore(&qtty->lock, irq_flags);
}
+static inline void goldfish_tty_rw(struct goldfish_tty *qtty,
+ unsigned long addr,
+ unsigned int count,
+ int is_write)
+{
+ dma_addr_t dma_handle;
+ enum dma_data_direction dma_dir;
+
+ dma_dir = (is_write ? DMA_TO_DEVICE : DMA_FROM_DEVICE);
+ if (qtty->version) {
+ /*
+ * Goldfish TTY for Ranchu platform uses
+ * physical addresses and DMA for read/write operations
+ */
+ unsigned long addr_end = addr + count;
+
+ while (addr < addr_end) {
+ unsigned long pg_end = (addr & PAGE_MASK) + PAGE_SIZE;
+ unsigned long next =
+ pg_end < addr_end ? pg_end : addr_end;
+ unsigned long avail = next - addr;
+
+ /*
+ * Map the buffer's virtual address to the DMA address
+ * so the buffer can be accessed by the device.
+ */
+ dma_handle = dma_map_single(qtty->dev,
+ (void *)addr, avail, dma_dir);
+
+ if (dma_mapping_error(qtty->dev, dma_handle)) {
+ dev_err(qtty->dev, "tty: DMA mapping error.\n");
+ return;
+ }
+ do_rw_io(qtty, dma_handle, avail, is_write);
+
+ /*
+ * Unmap the previously mapped region after
+ * the completion of the read/write operation.
+ */
+ dma_unmap_single(qtty->dev, dma_handle, avail, dma_dir);
+
+ addr += avail;
+ }
+ } else {
+ /*
+ * Old style Goldfish TTY used on the Goldfish platform
+ * uses virtual addresses.
+ */
+ do_rw_io(qtty, addr, count, is_write);
+ }
+
+}
+
+static void goldfish_tty_do_write(int line, const char *buf,
+ unsigned int count)
+{
+ struct goldfish_tty *qtty = &goldfish_ttys[line];
+ unsigned long address = (unsigned long)(void *)buf;
+
+ goldfish_tty_rw(qtty, address, count, 1);
+}
+
static irqreturn_t goldfish_tty_interrupt(int irq, void *dev_id)
{
struct goldfish_tty *qtty = dev_id;
void __iomem *base = qtty->base;
- unsigned long irq_flags;
+ unsigned long address;
unsigned char *buf;
u32 count;
@@ -79,12 +155,10 @@ static irqreturn_t goldfish_tty_interrupt(int irq, void *dev_id)
return IRQ_NONE;
count = tty_prepare_flip_string(&qtty->port, &buf, count);
- spin_lock_irqsave(&qtty->lock, irq_flags);
- gf_write_ptr(buf, base + GOLDFISH_TTY_DATA_PTR,
- base + GOLDFISH_TTY_DATA_PTR_HIGH);
- writel(count, base + GOLDFISH_TTY_DATA_LEN);
- writel(GOLDFISH_TTY_CMD_READ_BUFFER, base + GOLDFISH_TTY_CMD);
- spin_unlock_irqrestore(&qtty->lock, irq_flags);
+
+ address = (unsigned long)(void *)buf;
+ goldfish_tty_rw(qtty, address, count, 0);
+
tty_schedule_flip(&qtty->port);
return IRQ_HANDLED;
}
@@ -271,6 +345,29 @@ static int goldfish_tty_probe(struct platform_device *pdev)
qtty->port.ops = &goldfish_port_ops;
qtty->base = base;
qtty->irq = irq;
+ qtty->dev = &pdev->dev;
+
+ /* Goldfish TTY device used by the Goldfish emulator
+ * should identify itself with 0, forcing the driver
+ * to use virtual addresses. Goldfish TTY device
+ * on Ranchu emulator (qemu2) returns 1 here and
+ * driver will use physical addresses.
+ */
+ qtty->version = readl(base + GOLDFISH_TTY_VERSION);
+
+ /* Goldfish TTY device on Ranchu emulator (qemu2)
+ * will use DMA for read/write IO operations.
+ */
+ if (qtty->version > 0) {
+ /* Initialize dma_mask to 32-bits.
+ */
+ if (!pdev->dev.dma_mask)
+ pdev->dev.dma_mask = &pdev->dev.coherent_dma_mask;
+ if (dma_set_mask(&pdev->dev, DMA_BIT_MASK(32))) {
+ dev_err(&pdev->dev, "No suitable DMA available.\n");
+ goto err_create_driver_failed;
+ }
+ }
writel(GOLDFISH_TTY_CMD_INT_DISABLE, base + GOLDFISH_TTY_CMD);
--
2.7.4
From: Miodrag Dinic <[email protected]>
Add early console functionality to the Goldfish tty driver.
When the 'earlycon' kernel command line parameter is used with no
options, the early console is determined by the 'stdout-path' property
in the device tree's 'chosen' node. This is illustrated in the following
device tree source example:
chosen {
stdout-path = "/goldfish_tty@1f004000";
};
goldfish_tty@1f004000 {
interrupts = <0xc>;
reg = <0x1f004000 0x0 0x1000>;
compatible = "google,goldfish-tty", "generic,goldfish-tty";
};
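The early console writes go through uart_console_write(), which calls the
supplied putchar callback for every character and emits a carriage return
before each newline. A user-space model of that loop is sketched below;
struct fake_port, fake_putchar() and fake_console_write() are hypothetical
stand-ins, with a memory buffer playing the role of the device's PUT_CHAR
register.

```c
#include <stddef.h>

/* Hypothetical stand-in for the device: collects written characters. */
struct fake_port {
	char out[64];
	size_t len;
};

/* Stand-in for the per-character register write. */
static void fake_putchar(struct fake_port *p, int ch)
{
	if (p->len < sizeof(p->out))
		p->out[p->len++] = (char)ch;
}

/* Same loop shape as uart_console_write(): '\r' precedes each '\n'. */
static void fake_console_write(struct fake_port *p, const char *s, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {
		if (s[i] == '\n')
			fake_putchar(p, '\r');
		fake_putchar(p, s[i]);
	}
}
```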
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
drivers/tty/Kconfig | 3 +++
drivers/tty/goldfish.c | 26 ++++++++++++++++++++++++++
2 files changed, 29 insertions(+)
diff --git a/drivers/tty/Kconfig b/drivers/tty/Kconfig
index 9510305..873e0ba 100644
--- a/drivers/tty/Kconfig
+++ b/drivers/tty/Kconfig
@@ -392,6 +392,9 @@ config PPC_EARLY_DEBUG_EHV_BC_HANDLE
config GOLDFISH_TTY
tristate "Goldfish TTY Driver"
depends on GOLDFISH
+ select SERIAL_CORE
+ select SERIAL_CORE_CONSOLE
+ select SERIAL_EARLYCON
help
Console and system TTY driver for the Goldfish virtual platform.
diff --git a/drivers/tty/goldfish.c b/drivers/tty/goldfish.c
index acd50fa..22b7ad5 100644
--- a/drivers/tty/goldfish.c
+++ b/drivers/tty/goldfish.c
@@ -1,6 +1,7 @@
/*
* Copyright (C) 2007 Google, Inc.
* Copyright (C) 2012 Intel, Inc.
+ * Copyright (C) 2017 Imagination Technologies Ltd.
*
* This software is licensed under the terms of the GNU General Public
* License version 2, as published by the Free Software Foundation, and
@@ -24,6 +25,7 @@
#include <linux/goldfish.h>
#include <linux/mm.h>
#include <linux/dma-mapping.h>
+#include <linux/serial_core.h>
enum {
GOLDFISH_TTY_PUT_CHAR = 0x00,
@@ -427,6 +429,30 @@ static int goldfish_tty_remove(struct platform_device *pdev)
return 0;
}
+static void gf_early_console_putchar(struct uart_port *port, int ch)
+{
+ __raw_writel(ch, port->membase);
+}
+
+static void gf_early_write(struct console *con, const char *s, unsigned int n)
+{
+ struct earlycon_device *dev = con->data;
+
+ uart_console_write(&dev->port, s, n, gf_early_console_putchar);
+}
+
+static int __init gf_earlycon_setup(struct earlycon_device *device,
+ const char *opt)
+{
+ if (!device->port.membase)
+ return -ENODEV;
+
+ device->con->write = gf_early_write;
+ return 0;
+}
+
+OF_EARLYCON_DECLARE(early_gf_tty, "google,goldfish-tty", gf_earlycon_setup);
+
static const struct of_device_id goldfish_tty_of_match[] = {
{ .compatible = "google,goldfish-tty", },
{},
--
2.7.4
From: Goran Ferenc <[email protected]>
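Extend the clobber lists of the inline-assembly syscalls in the VDSO
fallback code paths (gettimeofday_fallback and clock_gettime_fallback)
to include GP registers $1, $3, $8-$15, $24 and $25, as well as hi
and lo.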
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
arch/mips/vdso/gettimeofday.c | 6 ++++--
1 file changed, 4 insertions(+), 2 deletions(-)
diff --git a/arch/mips/vdso/gettimeofday.c b/arch/mips/vdso/gettimeofday.c
index 974276e..e2690d7 100644
--- a/arch/mips/vdso/gettimeofday.c
+++ b/arch/mips/vdso/gettimeofday.c
@@ -35,7 +35,8 @@ static __always_inline long gettimeofday_fallback(struct timeval *_tv,
" syscall\n"
: "=r" (ret), "=r" (error)
: "r" (tv), "r" (tz), "r" (nr)
- : "memory");
+ : "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13",
+ "$14", "$15", "$24", "$25", "hi", "lo", "memory");
return error ? -ret : ret;
}
@@ -55,7 +56,8 @@ static __always_inline long clock_gettime_fallback(clockid_t _clkid,
" syscall\n"
: "=r" (ret), "=r" (error)
: "r" (clkid), "r" (ts), "r" (nr)
- : "memory");
+ : "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13",
+ "$14", "$15", "$24", "$25", "hi", "lo", "memory");
return error ? -ret : ret;
}
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
are quiet NaNs. The specifications of <MAX|MAXA|MIN|MINA>.<D|S> state
that the returned value in such cases should be the quiet NaN
contained in register fs.
The relevant example:
MAX.S fd,fs,ft:
If fs contains qNaN1, and ft contains qNaN2, fd is going to contain
qNaN1 (without this patch, it used to contain qNaN2).
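The quiet NaN selection rule can be modelled on raw single-precision bit
patterns. This is a sketch only: is_qnan_s() and max_s_qnan_case() are
hypothetical helpers, the IEEE 754-2008 NaN encoding (quiet bit set) is
assumed, and signaling NaNs are deliberately left out.

```c
#include <stdint.h>

/* Assumes IEEE 754-2008 encoding: exponent all ones, quiet bit 22 set. */
static int is_qnan_s(uint32_t bits)
{
	return (bits & 0x7fc00000u) == 0x7fc00000u;
}

/*
 * Quiet NaN handling only. Returns 1 and stores the result in *fd when
 * at least one input is a quiet NaN; returns 0 when the result has to
 * be decided by an ordinary numeric comparison.
 */
static int max_s_qnan_case(uint32_t fs, uint32_t ft, uint32_t *fd)
{
	int fs_nan = is_qnan_s(fs);
	int ft_nan = is_qnan_s(ft);

	if (fs_nan && ft_nan)
		*fd = fs;	/* both quiet NaNs: the NaN in fs wins */
	else if (fs_nan)
		*fd = ft;	/* numbers are preferred to NaNs */
	else if (ft_nan)
		*fd = fs;
	else
		return 0;
	return 1;
}
```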
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 8 ++++++--
arch/mips/math-emu/dp_fmin.c | 8 ++++++--
arch/mips/math-emu/sp_fmax.c | 8 ++++++--
arch/mips/math-emu/sp_fmin.c | 8 ++++++--
4 files changed, 24 insertions(+), 8 deletions(-)
diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index fd71b8d..567fc33 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -47,6 +47,9 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -54,7 +57,6 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,6 +149,9 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -154,7 +159,6 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index c1072b0..77f7ca9 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -47,6 +47,9 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -54,7 +57,6 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,6 +149,9 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754dp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -154,7 +159,6 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index 4d00084..d46e8e4 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -47,6 +47,9 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -54,7 +57,6 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,6 +149,9 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -154,7 +159,6 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 4eb1bb9..b528c4b 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -47,6 +47,9 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -54,7 +57,6 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
@@ -147,6 +149,9 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
return ieee754sp_nanxcpt(x);
+ case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
+ return x;
+
/* numbers are preferred to NaNs */
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
@@ -154,7 +159,6 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
return x;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
are zeros. The right behavior in such cases is stated in the instruction
reference manual and is as follows:

  fs   ft   MAX   MIN   MAXA   MINA
  ---------------------------------
   0    0     0     0      0      0
   0   -0     0    -0      0     -0
  -0    0     0    -0      0     -0
  -0   -0    -0    -0     -0     -0

The relevant example:
MAX.S fd,fs,ft:
If fs contains +0, and ft contains -0, fd is going to contain 0
(without this patch, it used to contain -0).
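The sign of the result for two zero inputs reduces to a bitwise operation
on the input sign bits, which is what the new ieee754{dp,sp}_zero() calls
rely on. A minimal sketch, with hypothetical helper names max_zero_sign()
and min_zero_sign() (1 means negative):

```c
/* MAX and MAXA: the result is -0 only if both inputs are -0. */
static int max_zero_sign(int xs, int ys)
{
	return xs & ys;
}

/* MIN and MINA: the result is -0 if at least one input is -0. */
static int min_zero_sign(int xs, int ys)
{
	return xs | ys;
}
```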
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 8 ++------
arch/mips/math-emu/dp_fmin.c | 8 ++------
arch/mips/math-emu/sp_fmax.c | 8 ++------
arch/mips/math-emu/sp_fmin.c | 8 ++------
4 files changed, 8 insertions(+), 24 deletions(-)
diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 567fc33..9517572 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -82,9 +82,7 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
return ys ? x : y;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs & ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
@@ -184,9 +182,7 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
return y;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs & ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 77f7ca9..7069320 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -82,9 +82,7 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
return ys ? y : x;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs | ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
@@ -184,9 +182,7 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
return y;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754dp_zero(1);
+ return ieee754dp_zero(xs | ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
DPDNORMX;
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index d46e8e4..d72111a 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -82,9 +82,7 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
return ys ? x : y;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs & ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
@@ -184,9 +182,7 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
return y;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs & ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index b528c4b..61ff9c6 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -82,9 +82,7 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
return ys ? y : x;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs | ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
@@ -184,9 +182,7 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
return y;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
- if (xs == ys)
- return x;
- return ieee754sp_zero(1);
+ return ieee754sp_zero(xs | ys);
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
SPDNORMX;
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the value returned by <MAX|MIN>.<D|S>, if both inputs are negative
normal fp numbers. The previous logic did not take into account that
if both inputs have the same sign, there should be separate treatment
of the cases when both inputs are negative and when both inputs are
positive.
The relevant example:
MAX.S fd,fs,ft:
If fs contains -5, and ft contains -7, fd is going to contain -5
(without this patch, it used to contain -7).
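The corrected ordering for normal numbers can be illustrated with a
user-space model that compares sign, then exponent, then mantissa, and
inverts the exponent and mantissa comparisons when both inputs are
negative. max_normal_s() is a hypothetical name, the sketch assumes that
float is IEEE 754 binary32, and it covers normal numbers only.

```c
#include <stdint.h>
#include <string.h>

/* Model of MAX.S for two normal numbers; no NaN/zero/infinity handling. */
static float max_normal_s(float x, float y)
{
	uint32_t xb, yb, xs, ys, xe, ye, xm, ym;

	memcpy(&xb, &x, sizeof(xb));
	memcpy(&yb, &y, sizeof(yb));
	xs = xb >> 31;
	ys = yb >> 31;
	xe = (xb >> 23) & 0xff;
	ye = (yb >> 23) & 0xff;
	xm = xb & 0x7fffff;
	ym = yb & 0x7fffff;

	/* Different signs: the positive input is the maximum. */
	if (xs != ys)
		return xs ? y : x;

	/* Same sign, different exponents. */
	if (xe != ye) {
		if (xs == 0)
			return xe > ye ? x : y;	/* both positive */
		return xe > ye ? y : x;		/* both negative */
	}

	/* Same sign and exponent: compare mantissas. */
	if (xs == 0)
		return xm <= ym ? y : x;	/* both positive */
	return xm <= ym ? x : y;		/* both negative */
}
```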
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
arch/mips/math-emu/dp_fmax.c | 33 +++++++++++++++++++++++++--------
arch/mips/math-emu/dp_fmin.c | 33 +++++++++++++++++++++++++--------
arch/mips/math-emu/sp_fmax.c | 33 +++++++++++++++++++++++++--------
arch/mips/math-emu/sp_fmin.c | 32 +++++++++++++++++++++++++-------
4 files changed, 100 insertions(+), 31 deletions(-)
diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 9517572..a0175cc 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -106,16 +106,33 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
else if (xs < ys)
return x;
- /* Compare exponent */
- if (xe > ye)
- return x;
- else if (xe < ye)
- return y;
+ /* Signs of inputs are the same, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ }
- /* Compare mantissa */
- if (xm <= ym)
+ /* Signs and exponents of inputs are the same, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal exponents */
+ if (xm <= ym)
+ return y;
+ return x;
+ } else {
+ /* Inputs are both negative, with equal exponents */
+ if (xm <= ym)
+ return x;
return y;
- return x;
+ }
}
union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 7069320..074a858 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -106,16 +106,33 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
else if (xs < ys)
return y;
- /* Compare exponent */
- if (xe > ye)
- return y;
- else if (xe < ye)
- return x;
+ /* Signs of inputs are the same, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ }
- /* Compare mantissa */
- if (xm <= ym)
+ /* Signs and exponents of inputs are the same, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal exponents */
+ if (xm <= ym)
+ return x;
+ return y;
+ } else {
+ /* Inputs are both negative, with equal exponents */
+ if (xm <= ym)
+ return y;
return x;
- return y;
+ }
}
union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index d72111a..15825db 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -106,16 +106,33 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
else if (xs < ys)
return x;
- /* Compare exponent */
- if (xe > ye)
- return x;
- else if (xe < ye)
- return y;
+ /* Signs of inputs are the same, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ }
- /* Compare mantissa */
- if (xm <= ym)
+ /* Signs and exponents of inputs are the same, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal exponents */
+ if (xm <= ym)
+ return y;
+ return x;
+ } else {
+ /* Inputs are both negative, with equal exponents */
+ if (xm <= ym)
+ return x;
return y;
- return x;
+ }
}
union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 61ff9c6..f1418f7 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -106,16 +106,34 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
else if (xs < ys)
return y;
- /* Compare exponent */
- if (xe > ye)
+ /* Signs of inputs are the same, let's compare exponents */
+ if (xs == 0) {
+ /* Inputs are both positive */
+ if (xe > ye)
+ return y;
+ else if (xe < ye)
+ return x;
+ } else {
+ /* Inputs are both negative */
+ if (xe > ye)
+ return x;
+ else if (xe < ye)
+ return y;
+ }
+
+ /* Signs and exponents of inputs are the same, let's compare mantissas */
+ if (xs == 0) {
+ /* Inputs are both positive, with equal exponents */
+ if (xm <= ym)
+ return x;
return y;
- else if (xe < ye)
+ } else {
+ /* Inputs are both negative, with equal exponents */
+ if (xm <= ym)
+ return y;
return x;
+ }
- /* Compare mantissa */
- if (xm <= ym)
- return x;
- return y;
}
union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the value returned by <MAXA|MINA>.<D|S>, if inputs are normal fp
numbers of the same absolute value, but opposite signs.
The relevant example:
MAXA.S fd,fs,ft:
If fs contains -3, and ft contains +3, fd is going to contain +3
(without this patch, it used to contain -3).
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
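A minimal sketch of the tie-breaking rule described above, written against
host floats rather than the emulator's unpacked sign/exponent/mantissa
fields. The helper names maxa_sketch() and mina_sketch() are hypothetical,
and NaN, zero, and infinity inputs are deliberately left out.

```c
/* Hypothetical helpers, not kernel code: select by magnitude on host floats. */
static float abs_sketch(float v)
{
	return v < 0.0f ? -v : v;
}

/* MAXA-style selection: larger magnitude wins; on a tie prefer the positive input. */
static float maxa_sketch(float fs, float ft)
{
	float afs = abs_sketch(fs), aft = abs_sketch(ft);

	if (afs > aft)
		return fs;
	if (afs < aft)
		return ft;
	return (fs < 0.0f) ? ft : fs;
}

/* MINA-style selection: smaller magnitude wins; on a tie prefer the negative input. */
static float mina_sketch(float fs, float ft)
{
	float afs = abs_sketch(fs), aft = abs_sketch(ft);

	if (afs < aft)
		return fs;
	if (afs > aft)
		return ft;
	return (fs < 0.0f) ? fs : ft;
}
```

With fs = -3 and ft = +3, maxa_sketch() picks +3 and mina_sketch() picks -3,
regardless of operand order.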
arch/mips/math-emu/dp_fmax.c | 8 ++++++--
arch/mips/math-emu/dp_fmin.c | 6 +++++-
arch/mips/math-emu/sp_fmax.c | 8 ++++++--
arch/mips/math-emu/sp_fmin.c | 6 +++++-
4 files changed, 22 insertions(+), 6 deletions(-)
diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index a0175cc..860b43f9 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -224,7 +224,11 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
return y;
/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
return y;
- return x;
+ else if (xm > ym)
+ return x;
+ else if (xs == 0)
+ return x;
+ return y;
}
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 074a858..73d85e4 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -224,7 +224,11 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
return x;
/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
+ return x;
+ else if (xm > ym)
+ return y;
+ else if (xs == 1)
return x;
return y;
}
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index 15825db..fec7f64 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -224,7 +224,11 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
return y;
/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
return y;
- return x;
+ else if (xm > ym)
+ return x;
+ else if (xs == 0)
+ return x;
+ return y;
}
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index f1418f7..74780bc 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -225,7 +225,11 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
return x;
/* Compare mantissa */
- if (xm <= ym)
+ if (xm < ym)
+ return x;
+ else if (xm > ym)
+ return y;
+ else if (xs == 1)
return x;
return y;
}
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the value returned by <MAXA|MINA>.<D|S> fd,fs,ft, if both inputs
are infinite. In such cases, the previous implementation always returned
the value contained in ft. The correct behavior is specified
in the MIPS instruction set manual and is as follows:
    fs     ft     MAXA   MINA
    ---------------------------------
    inf    inf    inf    inf
    inf   -inf    inf   -inf
   -inf    inf    inf   -inf
   -inf   -inf   -inf   -inf
The relevant example:
MAXA.S fd,fs,ft:
If fs contains +inf, and ft contains -inf, fd is going to contain
+inf (without this patch, it used to contain -inf).
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
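The table above reduces to a bitwise rule on the two sign bits, which is
what the "xs & ys" and "xs | ys" expressions in this patch compute. A small
sketch, assuming the usual convention that sign bit 0 means positive and 1
means negative; the function names are hypothetical.

```c
/* Sign of MAXA(inf, inf): negative only when both inputs are negative. */
static int maxa_inf_sign(int xs, int ys)
{
	return xs & ys;
}

/* Sign of MINA(inf, inf): negative when at least one input is negative. */
static int mina_inf_sign(int xs, int ys)
{
	return xs | ys;
}
```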
arch/mips/math-emu/dp_fmax.c | 4 +++-
arch/mips/math-emu/dp_fmin.c | 4 +++-
arch/mips/math-emu/sp_fmax.c | 4 +++-
arch/mips/math-emu/sp_fmin.c | 4 +++-
4 files changed, 12 insertions(+), 4 deletions(-)
diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
index 860b43f9..5459643 100644
--- a/arch/mips/math-emu/dp_fmax.c
+++ b/arch/mips/math-emu/dp_fmax.c
@@ -183,6 +183,9 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754dp_inf(xs & ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -190,7 +193,6 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index 73d85e4..d4cd243 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -183,6 +183,9 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754dp_inf(xs | ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -190,7 +193,6 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
index fec7f64..528a90b 100644
--- a/arch/mips/math-emu/sp_fmax.c
+++ b/arch/mips/math-emu/sp_fmax.c
@@ -183,6 +183,9 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754sp_inf(xs & ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -190,7 +193,6 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 74780bc..5f1d650 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -184,6 +184,9 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
/*
* Infinity and zero handling
*/
+ case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
+ return ieee754sp_inf(xs | ys);
+
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
@@ -191,7 +194,6 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
return x;
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the following special cases for MINA.<D|S>:
- if one of the inputs is zero, and the other is subnormal, normal,
or infinity, the value of the former should be returned (that is,
a zero).
- if one of the inputs is infinity, and the other input is normal,
or subnormal, the value of the latter should be returned.
The previous implementation's logic for such cases was incorrect: it
appears to implement the MAXA instruction rather than MINA.
The relevant example:
MINA.S fd,fs,ft:
If fs contains 100.0, and ft contains 0.0, fd is going to contain
0.0 (without this patch, it used to contain 100.0).
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
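A sketch of the class-only decisions listed above, for illustration. The
enum and the function name are hypothetical stand-ins for the emulator's
IEEE754_CLASS_* values and CLPAIR() switch; pairs that the class alone
cannot decide (for example normal versus subnormal) are reported as
undecided.

```c
/* Hypothetical operand classes; NaN classes are handled elsewhere. */
enum cls_sketch { CLS_ZERO, CLS_DNORM, CLS_NORM, CLS_INF };

/*
 * Returns 0 if MINA should return x, 1 if it should return y, and -1 if
 * the classes alone do not decide the result.
 */
static int mina_pick_by_class(enum cls_sketch xc, enum cls_sketch yc)
{
	/* An infinite x never has the smaller magnitude against a finite y. */
	if (xc == CLS_INF && yc != CLS_INF)
		return 1;
	/* A zero y always has the smaller magnitude against a nonzero x. */
	if (yc == CLS_ZERO && xc != CLS_ZERO)
		return 1;
	if (yc == CLS_INF && xc != CLS_INF)
		return 0;
	if (xc == CLS_ZERO && yc != CLS_ZERO)
		return 0;
	return -1;
}
```

For the example above (fs = 100.0, ft = 0.0) the classes are (NORM, ZERO),
so the sketch selects y, that is, the zero.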
arch/mips/math-emu/dp_fmin.c | 4 ++--
arch/mips/math-emu/sp_fmin.c | 4 ++--
2 files changed, 4 insertions(+), 4 deletions(-)
diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
index d4cd243..1e9ee3d 100644
--- a/arch/mips/math-emu/dp_fmin.c
+++ b/arch/mips/math-emu/dp_fmin.c
@@ -191,14 +191,14 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
- return x;
+ return y;
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
- return y;
+ return x;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
return ieee754dp_zero(xs | ys);
diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
index 5f1d650..685ce75 100644
--- a/arch/mips/math-emu/sp_fmin.c
+++ b/arch/mips/math-emu/sp_fmin.c
@@ -192,14 +192,14 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
- return x;
+ return y;
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
- return y;
+ return x;
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
return ieee754sp_zero(xs | ys);
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the cases of <MADDF|MSUBF>.<D|S> when any of the three inputs is a
NaN. The correct behavior of <MADDF|MSUBF>.<D|S> fd, fs, ft is as follows:
- if any of the inputs is an sNaN, return an sNaN using the following
rules: if only one input is an sNaN, return that one; if more than one
input is an sNaN, the order of precedence for the return value is fd, fs, ft
- if no input is an sNaN, but at least one of the inputs is a qNaN, return
a qNaN using the following rules: if only one input is a qNaN, return that
one; if more than one input is a qNaN, the order of precedence for the
return value is fd, fs, ft
The previous code handled some of the above cases, but not all.
Also, that handling was scattered across various cases of the
"switch (CLPAIR(xc, yc))" statement and elsewhere. With this patch,
the logic is gathered in one place, and "switch (CLPAIR(xc, yc))" is
significantly simplified.
The relevant example:
MADDF.S fd,fs,ft:
If fs contains qNaN1, ft contains qNaN2, and fd contains qNaN3, fd
is going to contain qNaN3 (without this patch, it used to contain
qNaN1).
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
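The precedence rules above can be sketched as a pure selection function.
The enums and the name maddf_nan_pick() are hypothetical; z, x and y stand
for fd, fs and ft as in the emulator's argument naming. The sketch reports
only which operand is chosen and does not model raising the Invalid
Operation exception for sNaN inputs.

```c
enum nan_kind { NOT_NAN, QNAN, SNAN };
enum nan_pick { PICK_NONE, PICK_Z, PICK_X, PICK_Y };

/* Any sNaN beats any qNaN; within each kind the order is z (fd), x (fs), y (ft). */
static enum nan_pick maddf_nan_pick(enum nan_kind zc, enum nan_kind xc,
				    enum nan_kind yc)
{
	if (zc == SNAN)
		return PICK_Z;
	if (xc == SNAN)
		return PICK_X;
	if (yc == SNAN)
		return PICK_Y;
	if (zc == QNAN)
		return PICK_Z;
	if (xc == QNAN)
		return PICK_X;
	if (yc == QNAN)
		return PICK_Y;
	return PICK_NONE;
}
```

In the example above all three inputs are qNaNs, so the sketch selects z,
that is, the value already in fd.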
arch/mips/math-emu/dp_maddf.c | 71 ++++++++++++++-----------------------------
arch/mips/math-emu/sp_maddf.c | 69 ++++++++++++++---------------------------
2 files changed, 46 insertions(+), 94 deletions(-)
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index caa62f2..4f2e783 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -48,52 +48,35 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
ieee754_clearcx();
- switch (zc) {
- case IEEE754_CLASS_SNAN:
- ieee754_setcx(IEEE754_INVALID_OPERATION);
- return ieee754dp_nanxcpt(z);
- case IEEE754_CLASS_DNORM:
- DPDNORMZ;
- /* QNAN and ZERO cases are handled separately below */
- }
-
- switch (CLPAIR(xc, yc)) {
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
- return ieee754dp_nanxcpt(y);
-
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
- return ieee754dp_nanxcpt(x);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
+ /* handle the cases when at least one of x, y or z is a NaN */
+ if (((xc == IEEE754_CLASS_SNAN) || (xc == IEEE754_CLASS_QNAN)) ||
+ ((yc == IEEE754_CLASS_SNAN) || (yc == IEEE754_CLASS_QNAN)) ||
+ ((zc == IEEE754_CLASS_SNAN) || (zc == IEEE754_CLASS_QNAN))) {
+ /* order of precedence is z, x, y */
+ if (zc == IEEE754_CLASS_SNAN)
+ return ieee754dp_nanxcpt(z);
+ if (xc == IEEE754_CLASS_SNAN)
+ return ieee754dp_nanxcpt(x);
+ if (yc == IEEE754_CLASS_SNAN)
+ return ieee754dp_nanxcpt(y);
+ if (zc == IEEE754_CLASS_QNAN)
+ return z;
+ if (xc == IEEE754_CLASS_QNAN)
+ return x;
return y;
+ }
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
- return x;
+ if (zc == IEEE754_CLASS_DNORM)
+ DPDNORMZ;
+ /* ZERO z cases are handled separately below */
+ switch (CLPAIR(xc, yc)) {
/*
* Infinity handling
*/
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
ieee754_setcx(IEEE754_INVALID_OPERATION);
return ieee754dp_indef();
@@ -102,8 +85,6 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
return ieee754dp_inf(xs ^ ys);
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
@@ -120,25 +101,19 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
DPDNORMX;
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
DPDNORMY;
break;
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
DPDNORMX;
break;
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
/* fall through to real computations */
}
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index c91d5e5..9fd2035 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -48,51 +48,36 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
ieee754_clearcx();
- switch (zc) {
- case IEEE754_CLASS_SNAN:
- ieee754_setcx(IEEE754_INVALID_OPERATION);
- return ieee754sp_nanxcpt(z);
- case IEEE754_CLASS_DNORM:
- SPDNORMZ;
- /* QNAN and ZERO cases are handled separately below */
+ /* handle the cases when at least one of x, y or z is a NaN */
+ if (((xc == IEEE754_CLASS_SNAN) || (xc == IEEE754_CLASS_QNAN)) ||
+ ((yc == IEEE754_CLASS_SNAN) || (yc == IEEE754_CLASS_QNAN)) ||
+ ((zc == IEEE754_CLASS_SNAN) || (zc == IEEE754_CLASS_QNAN))) {
+ /* order of precedence is z, x, y */
+ if (zc == IEEE754_CLASS_SNAN)
+ return ieee754sp_nanxcpt(z);
+ if (xc == IEEE754_CLASS_SNAN)
+ return ieee754sp_nanxcpt(x);
+ if (yc == IEEE754_CLASS_SNAN)
+ return ieee754sp_nanxcpt(y);
+ if (zc == IEEE754_CLASS_QNAN)
+ return z;
+ if (xc == IEEE754_CLASS_QNAN)
+ return x;
+ return y;
}
+ if (zc == IEEE754_CLASS_DNORM)
+ SPDNORMZ;
+ /* ZERO z cases are handled separately below */
+
switch (CLPAIR(xc, yc)) {
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
- return ieee754sp_nanxcpt(y);
-
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
- return ieee754sp_nanxcpt(x);
-
- case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
- return y;
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
- case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
- return x;
/*
* Infinity handling
*/
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
ieee754_setcx(IEEE754_INVALID_OPERATION);
return ieee754sp_indef();
@@ -101,8 +86,6 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
return ieee754sp_inf(xs ^ ys);
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
@@ -119,25 +102,19 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
SPDNORMX;
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
SPDNORMY;
break;
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
SPDNORMX;
break;
case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
- if (zc == IEEE754_CLASS_QNAN)
- return z;
- else if (zc == IEEE754_CLASS_INF)
+ if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
/* fall through to real computations */
}
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the cases of <MADDF|MSUBF>.<D|S> when either of the two multiplicands
is infinity. The correct behavior in such cases depends on the nature
of the third input. Cases of addition of infinities with opposite signs
and subtraction of infinities with the same sign may arise and must be
handled separately. Also, the value of the flags argument (which determines
whether the instruction is MADDF or MSUBF) affects the outcome.
The relevant examples:
MADDF.S fd,fs,ft:
If fs contains +inf, ft contains +inf, and fd contains -inf, fd is
going to contain indef (without this patch, it used to contain
-inf).
MSUBF.S fd,fs,ft:
If fs contains +inf, ft contains 1.0, and fd contains +0.0, fd is
going to contain -inf (without this patch, it used to contain +inf).
Signed-off-by: Douglas Leung <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
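A compact sketch of the sign logic described above, assuming the product
x * y is already known to be infinite and no input is a NaN. The function
name and the -1 return convention are hypothetical; "negate" stands for
(flags & maddf_negate_product), that is, the MSUBF case.

```c
/*
 * Returns -1 when the operation is invalid (inf - inf), otherwise the
 * sign bit (0 = positive, 1 = negative) of the infinite result.
 */
static int maddf_inf_result(int z_is_inf, int zs, int xs, int ys, int negate)
{
	/* Sign of the product after the optional negation. */
	int ps = xs ^ ys ^ (negate ? 1 : 0);

	/* Infinite addend with the opposite sign: the sum is undefined. */
	if (z_is_inf && zs != ps)
		return -1;
	return ps;
}
```

Both examples above follow from it: MADDF with fs = ft = +inf and fd = -inf
is invalid, and MSUBF with fs = +inf, ft = 1.0 and fd = +0.0 gives -inf.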
arch/mips/math-emu/dp_maddf.c | 21 ++++++++++++++++++++-
arch/mips/math-emu/sp_maddf.c | 21 ++++++++++++++++++++-
2 files changed, 40 insertions(+), 2 deletions(-)
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 4f2e783..45f815d 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -85,7 +85,26 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- return ieee754dp_inf(xs ^ ys);
+ if ((zc == IEEE754_CLASS_INF) &&
+ ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
+ ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ /*
+ * Cases of addition of infinities with opposite signs
+ * or subtraction of infinities with same signs.
+ */
+ ieee754_setcx(IEEE754_INVALID_OPERATION);
+ return ieee754dp_indef();
+ }
+ /*
+ * z is here either not infinity, or infinity of the same sign
+ * as maddf_negate_product * x * y. So, the result must be
+ * infinity, and its sign is determined only by the value of
+ * (flags & maddf_negate_product) and the signs of x and y.
+ */
+ if (flags & maddf_negate_product)
+ return ieee754dp_inf(1 ^ (xs ^ ys));
+ else
+ return ieee754dp_inf(xs ^ ys);
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 9fd2035..76856d7 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -86,7 +86,26 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
- return ieee754sp_inf(xs ^ ys);
+ if ((zc == IEEE754_CLASS_INF) &&
+ ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
+ ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ /*
+ * Cases of addition of infinities with opposite signs
+ * or subtraction of infinities with same signs.
+ */
+ ieee754_setcx(IEEE754_INVALID_OPERATION);
+ return ieee754sp_indef();
+ }
+ /*
+ * z is here either not infinity, or infinity of the same sign
+ * as maddf_negate_product * x * y. So, the result must be
+ * infinity, and its sign is determined only by the value of
+ * (flags & maddf_negate_product) and the signs of x and y.
+ */
+ if (flags & maddf_negate_product)
+ return ieee754sp_inf(1 ^ (xs ^ ys));
+ else
+ return ieee754sp_inf(xs ^ ys);
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
--
2.7.4
From: Aleksandar Markovic <[email protected]>
Fix the cases of <MADDF|MSUBF>.<D|S> when either of the two multiplicands
is +0 or -0, and the third input is also +0 or -0. Depending on the signs
of the inputs, certain special cases must be handled.
The relevant example:
MADDF.S fd,fs,ft:
If fs contains +0.0, ft contains -0.0, and fd contains 0.0, fd is
going to contain +0.0 (without this patch, it used to contain -0.0).
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
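A sketch of the zero-sign rule this patch adds, assuming the product is an
exact zero and the addend z is also a zero. The function name is
hypothetical; "round_down" stands for (ieee754_csr.rm == FPU_CSR_RD) and
"negate" for (flags & maddf_negate_product).

```c
/* Returns the sign bit (0 = +0, 1 = -0) of the zero result. */
static int maddf_zero_sign(int zs, int xs, int ys, int negate, int round_down)
{
	/* Sign of the zero product after the optional negation. */
	int ps = xs ^ ys ^ (negate ? 1 : 0);

	/* Zeros of the same sign add up to a zero of that sign. */
	if (zs == ps)
		return zs;
	/* Zeros of opposite signs: +0, except -0 when rounding toward -inf. */
	return round_down ? 1 : 0;
}
```

For the example above (fs = +0.0, ft = -0.0, fd = +0.0, MADDF) the product
is -0 and the addend is +0, so the result is +0.0 unless the rounding mode
is round-down.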
arch/mips/math-emu/dp_maddf.c | 8 ++++++++
arch/mips/math-emu/sp_maddf.c | 8 ++++++++
2 files changed, 16 insertions(+)
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 45f815d..b8b2c17 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -113,6 +113,14 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
if (zc == IEEE754_CLASS_INF)
return ieee754dp_inf(zs);
+ /* Handle cases +0 + (-0) and similar ones. */
+ if (zc == IEEE754_CLASS_ZERO) {
+ if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
+ ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
+ return z;
+ else
+ return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
+ }
/* Multiplication is 0 so just return z */
return z;
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index 76856d7..cb8597b 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -114,6 +114,14 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
if (zc == IEEE754_CLASS_INF)
return ieee754sp_inf(zs);
+ /* Handle cases +0 + (-0) and similar ones. */
+ if (zc == IEEE754_CLASS_ZERO) {
+ if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
+ ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
+ return z;
+ else
+ return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
+ }
/* Multiplication is 0 so just return z */
return z;
--
2.7.4
From: Douglas Leung <[email protected]>
Implement fused multiply-add with correct accuracy.
A fused multiply-add operation has better accuracy than the respective
sequential execution of multiply and add operations applied to the
same inputs. This is because rounding errors accumulate in the latter
case.
This patch implements fused multiply-add with the same accuracy
as it is implemented in hardware, using 64-bit intermediate
calculations.
One test case example (raw bits) that this patch fixes:
MADDF.S fd,fs,ft:
fd = 0x22575225
fs = ft = 0x3727c5ac
Signed-off-by: Douglas Leung <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
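A function-style sketch of the sticky ("with rounding") right shift that
the XSPSRS64 macro in this patch performs: any bit shifted out is ORed into
bit 0 so that later rounding still sees that the value was inexact. The
function name is hypothetical, and the explicit rs == 0 guard is an
addition of this sketch (the callers in this patch always pass a nonzero
shift count).

```c
#include <stdint.h>

/* 64-bit logical right shift that preserves lost bits as a sticky bit 0. */
static uint64_t sticky_shift_right64(uint64_t v, unsigned int rs)
{
	if (rs == 0)
		return v;		/* nothing shifted out */
	if (rs >= 64)
		return v != 0;		/* everything shifted out */
	return (v >> rs) | ((v << (64 - rs)) != 0);
}
```

Shifting binary 1001 right by two gives binary 11: the surviving "10" plus
a sticky bit for the discarded "01".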
arch/mips/math-emu/ieee754sp.h | 4 ++
arch/mips/math-emu/sp_maddf.c | 116 ++++++++++++++++-------------------------
2 files changed, 50 insertions(+), 70 deletions(-)
diff --git a/arch/mips/math-emu/ieee754sp.h b/arch/mips/math-emu/ieee754sp.h
index 8476067..0f63e42 100644
--- a/arch/mips/math-emu/ieee754sp.h
+++ b/arch/mips/math-emu/ieee754sp.h
@@ -45,6 +45,10 @@ static inline int ieee754sp_finite(union ieee754sp x)
return SPBEXP(x) != SP_EMAX + 1 + SP_EBIAS;
}
+/* 64 bit right shift with rounding */
+#define XSPSRS64(v, rs) \
+ (((rs) >= 64) ? ((v) != 0) : ((v) >> (rs)) | ((v) << (64-(rs)) != 0))
+
/* 3bit extended single precision sticky right shift */
#define XSPSRS(v, rs) \
((rs > (SP_FBITS+3))?1:((v) >> (rs)) | ((v) << (32-(rs)) != 0))
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index cb8597b..b380189 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -24,14 +24,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
int re;
int rs;
unsigned rm;
- unsigned short lxm;
- unsigned short hxm;
- unsigned short lym;
- unsigned short hym;
- unsigned lrm;
- unsigned hrm;
- unsigned t;
- unsigned at;
+ uint64_t rm64;
+ uint64_t zm64;
int s;
COMPXSP;
@@ -165,108 +159,90 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
if (flags & maddf_negate_product)
rs ^= 1;
- /* shunt to top of word */
- xm <<= 32 - (SP_FBITS + 1);
- ym <<= 32 - (SP_FBITS + 1);
+ /* Multiply 24 bit xm and ym to give 48 bit results */
+ rm64 = (uint64_t)xm * ym;
- /*
- * Multiply 32 bits xm, ym to give high 32 bits rm with stickness.
- */
- lxm = xm & 0xffff;
- hxm = xm >> 16;
- lym = ym & 0xffff;
- hym = ym >> 16;
-
- lrm = lxm * lym; /* 16 * 16 => 32 */
- hrm = hxm * hym; /* 16 * 16 => 32 */
-
- t = lxm * hym; /* 16 * 16 => 32 */
- at = lrm + (t << 16);
- hrm += at < lrm;
- lrm = at;
- hrm = hrm + (t >> 16);
-
- t = hxm * lym; /* 16 * 16 => 32 */
- at = lrm + (t << 16);
- hrm += at < lrm;
- lrm = at;
- hrm = hrm + (t >> 16);
-
- rm = hrm | (lrm != 0);
+ /* Shunt to top of word */
+ rm64 = rm64 << 16;
- /*
- * Sticky shift down to normal rounding precision.
- */
- if ((int) rm < 0) {
- rm = (rm >> (32 - (SP_FBITS + 1 + 3))) |
- ((rm << (SP_FBITS + 1 + 3)) != 0);
+ /* Put explicit bit at bit 62 if necessary */
+ if ((int64_t) rm64 < 0) {
+ rm64 = rm64 >> 1;
re++;
- } else {
- rm = (rm >> (32 - (SP_FBITS + 1 + 3 + 1))) |
- ((rm << (SP_FBITS + 1 + 3 + 1)) != 0);
}
- assert(rm & (SP_HIDDEN_BIT << 3));
- if (zc == IEEE754_CLASS_ZERO)
- return ieee754sp_format(rs, re, rm);
-
- /* And now the addition */
+ assert(rm64 & (1ULL << 62));
- assert(zm & SP_HIDDEN_BIT);
+ if (zc == IEEE754_CLASS_ZERO) {
+ /*
+ * Move explicit bit from bit 62 to bit 26 since the
+ * ieee754sp_format code expects the mantissa to be
+ * 27 bits wide (24 + 3 rounding bits).
+ */
+ rm = XSPSRS64(rm64, (62 - 26));
+ return ieee754sp_format(rs, re, rm);
+ }
- /*
- * Provide guard,round and stick bit space.
- */
- zm <<= 3;
+ /* Move explicit bit from bit 23 to bit 62 */
+ zm64 = (uint64_t)zm << (62 - 23);
+ assert(zm64 & (1ULL << 62));
+ /* Make the exponents the same */
if (ze > re) {
/*
* Have to shift r fraction right to align.
*/
s = ze - re;
- rm = XSPSRS(rm, s);
+ rm64 = XSPSRS64(rm64, s);
re += s;
} else if (re > ze) {
/*
* Have to shift z fraction right to align.
*/
s = re - ze;
- zm = XSPSRS(zm, s);
+ zm64 = XSPSRS64(zm64, s);
ze += s;
}
assert(ze == re);
assert(ze <= SP_EMAX);
+ /* Do the addition */
if (zs == rs) {
/*
- * Generate 28 bit result of adding two 27 bit numbers
- * leaving result in zm, zs and ze.
+ * Generate 64 bit result by adding two 63 bit numbers
+ * leaving result in zm64, zs and ze.
*/
- zm = zm + rm;
-
- if (zm >> (SP_FBITS + 1 + 3)) { /* carry out */
- zm = XSPSRS1(zm);
+ zm64 = zm64 + rm64;
+ if ((int64_t)zm64 < 0) { /* carry out */
+ zm64 = XSPSRS1(zm64);
ze++;
}
} else {
- if (zm >= rm) {
- zm = zm - rm;
+ if (zm64 >= rm64) {
+ zm64 = zm64 - rm64;
} else {
- zm = rm - zm;
+ zm64 = rm64 - zm64;
zs = rs;
}
- if (zm == 0)
+ if (zm64 == 0)
return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
/*
- * Normalize in extended single precision
+ * Put explicit bit at bit 62 if necessary.
*/
- while ((zm >> (SP_MBITS + 3)) == 0) {
- zm <<= 1;
+ while ((zm64 >> 62) == 0) {
+ zm64 <<= 1;
ze--;
}
-
}
+
+ /*
+ * Move explicit bit from bit 62 to bit 26 since the
+ * ieee754sp_format code expects the mantissa to be
+ * 27 bits wide (24 + 3 rounding bits).
+ */
+ zm = XSPSRS64(zm64, (62 - 26));
+
return ieee754sp_format(zs, ze, zm);
}
--
2.7.4
From: Douglas Leung <[email protected]>
Implement fused multiply-add with correct accuracy.
A fused multiply-add operation has better accuracy than the respective
sequential execution of multiply and add operations applied to the
same inputs. This is because rounding errors accumulate in the latter
case.
This patch implements fused multiply-add with the same accuracy
as it is implemented in hardware, using 128-bit intermediate
calculations.
One test case example (raw bits) that this patch fixes:
MADDF.D fd,fs,ft:
fd = 0x00000ca000000000
fs = ft = 0x3f40624dd2f1a9fc
Signed-off-by: Douglas Leung <[email protected]>
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
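A standalone sketch of the 64 x 64 -> 128 bit multiplication that the
double-precision path relies on, built from four 32 x 32 -> 64 partial
products. This is the textbook decomposition, arranged slightly differently
from the lxm/hxm/lym/hym sequence in dp_maddf.c; the function name and the
use of <stdint.h> types are assumptions made for illustration.

```c
#include <stdint.h>

/* Full 128-bit product of x and y, returned as *hi:*lo. */
static void mul64x64_128(uint64_t x, uint64_t y, uint64_t *hi, uint64_t *lo)
{
	uint64_t lx = (uint32_t)x, hx = x >> 32;
	uint64_t ly = (uint32_t)y, hy = y >> 32;

	uint64_t ll = lx * ly;	/* contributes at bit 0 */
	uint64_t lh = lx * hy;	/* contributes at bit 32 */
	uint64_t hl = hx * ly;	/* contributes at bit 32 */
	uint64_t hh = hx * hy;	/* contributes at bit 64 */

	/* Sum of everything landing in bits 32..63; cannot overflow 64 bits. */
	uint64_t mid = (ll >> 32) + (uint32_t)lh + (uint32_t)hl;

	*lo = (mid << 32) | (uint32_t)ll;
	*hi = hh + (lh >> 32) + (hl >> 32) + (mid >> 32);
}
```

Keeping both halves of the product, instead of collapsing the low half into
a single sticky bit before the addition, is what lets the addend be
combined with the exact product.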
arch/mips/math-emu/dp_maddf.c | 130 +++++++++++++++++++++++++++++-------------
1 file changed, 91 insertions(+), 39 deletions(-)
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index b8b2c17..68b55c8 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -18,18 +18,43 @@ enum maddf_flags {
maddf_negate_product = 1 << 0,
};
+/* 128 bits shift right logical with rounding. */
+static void srl128(u64 *hptr, u64 *lptr, int count)
+{
+ u64 low;
+ if (count >= 128) {
+ *lptr = *hptr != 0 || *lptr != 0;
+ *hptr = 0;
+ } else if (count >= 64) {
+ if (count == 64) {
+ *lptr = *hptr | (*lptr != 0);
+ } else {
+ low = *lptr;
+ *lptr = *hptr >> (count - 64);
+ *lptr |= (*hptr << (128 - count)) != 0 || low != 0;
+ }
+ *hptr = 0;
+ } else {
+ low = *lptr;
+ *lptr = low >> count | *hptr << (64 - count);
+ *lptr |= (low << (64 - count)) != 0;
+ *hptr = *hptr >> count;
+ }
+}
+
static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
union ieee754dp y, enum maddf_flags flags)
{
int re;
int rs;
- u64 rm;
unsigned lxm;
unsigned hxm;
unsigned lym;
unsigned hym;
u64 lrm;
u64 hrm;
+ u64 lzm;
+ u64 hzm;
u64 t;
u64 at;
int s;
@@ -167,7 +192,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
ym <<= 64 - (DP_FBITS + 1);
/*
- * Multiply 64 bits xm, ym to give high 64 bits rm with stickness.
+ * Multiply 64 bits xm and ym to give 128 bits result in hrm:lrm.
*/
/* 32 * 32 => 64 */
@@ -197,81 +222,108 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
hrm = hrm + (t >> 32);
- rm = hrm | (lrm != 0);
-
- /*
- * Sticky shift down to normal rounding precision.
- */
- if ((s64) rm < 0) {
- rm = (rm >> (64 - (DP_FBITS + 1 + 3))) |
- ((rm << (DP_FBITS + 1 + 3)) != 0);
+ /* Put explicit bit at bit 126 if necessary */
+ if ((int64_t)hrm < 0) {
+ lrm = (hrm << 63) | (lrm >> 1);
+ hrm = hrm >> 1;
re++;
- } else {
- rm = (rm >> (64 - (DP_FBITS + 1 + 3 + 1))) |
- ((rm << (DP_FBITS + 1 + 3 + 1)) != 0);
}
- assert(rm & (DP_HIDDEN_BIT << 3));
- if (zc == IEEE754_CLASS_ZERO)
- return ieee754dp_format(rs, re, rm);
+ assert(hrm & (1ULL << 62));
- /* And now the addition */
- assert(zm & DP_HIDDEN_BIT);
+ if (zc == IEEE754_CLASS_ZERO) {
+ /*
+ * Move explicit bit from bit 126 to bit 55 since the
+ * ieee754dp_format code expects the mantissa to be
+ * 56 bits wide (53 + 3 rounding bits).
+ */
+ srl128(&hrm, &lrm, (126 - 55));
+ return ieee754dp_format(rs, re, lrm);
+ }
- /*
- * Provide guard,round and stick bit space.
- */
- zm <<= 3;
+ /* Move explicit bit from bit 52 to bit 126 */
+ lzm = 0;
+ hzm = zm << 10;
+ assert(hzm & (1ULL << 62));
+ /* Make the exponents the same */
if (ze > re) {
/*
* Have to shift y fraction right to align.
*/
s = ze - re;
- rm = XDPSRS(rm, s);
+ srl128(&hrm, &lrm, s);
re += s;
} else if (re > ze) {
/*
* Have to shift x fraction right to align.
*/
s = re - ze;
- zm = XDPSRS(zm, s);
+ srl128(&hzm, &lzm, s);
ze += s;
}
assert(ze == re);
assert(ze <= DP_EMAX);
+ /* Do the addition */
if (zs == rs) {
/*
- * Generate 28 bit result of adding two 27 bit numbers
- * leaving result in xm, xs and xe.
+ * Generate 128 bit result by adding two 127 bit numbers
+ * leaving result in hzm:lzm, zs and ze.
*/
- zm = zm + rm;
-
- if (zm >> (DP_FBITS + 1 + 3)) { /* carry out */
- zm = XDPSRS1(zm);
+ hzm = hzm + hrm + (lzm > (lzm + lrm));
+ lzm = lzm + lrm;
+ if ((int64_t)hzm < 0) { /* carry out */
+ srl128(&hzm, &lzm, 1);
ze++;
}
} else {
- if (zm >= rm) {
- zm = zm - rm;
+ if (hzm > hrm || (hzm == hrm && lzm >= lrm)) {
+ hzm = hzm - hrm - (lzm < lrm);
+ lzm = lzm - lrm;
} else {
- zm = rm - zm;
+ hzm = hrm - hzm - (lrm < lzm);
+ lzm = lrm - lzm;
zs = rs;
}
- if (zm == 0)
+ if (lzm == 0 && hzm == 0)
return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
/*
- * Normalize to rounding precision.
+ * Put explicit bit at bit 126 if necessary.
*/
- while ((zm >> (DP_FBITS + 3)) == 0) {
- zm <<= 1;
- ze--;
+ if (hzm == 0) {
+ /* left shift by 63 or 64 bits */
+ if ((int64_t)lzm < 0) {
+ /* MSB of lzm is the explicit bit */
+ hzm = lzm >> 1;
+ lzm = lzm << 63;
+ ze -= 63;
+ } else {
+ hzm = lzm;
+ lzm = 0;
+ ze -= 64;
+ }
+ }
+ t = 0;
+ while ((hzm >> (62 - t)) == 0) t++;
+
+ assert(t <= 62);
+ if (t) {
+ hzm = hzm << t | lzm >> (64 - t);
+ lzm = lzm << t;
+ ze -= t;
}
}
- return ieee754dp_format(zs, ze, zm);
+ /*
+ * Move explicit bit from bit 126 to bit 55 since the
+ * ieee754dp_format code expects the mantissa to be
+ * 56 bits wide (53 + 3 rounding bits).
+ */
+ srl128(&hzm, &lzm, (126 - 55));
+
+ return ieee754dp_format(zs, ze, lzm);
}
union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
--
2.7.4
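The 128-bit arithmetic in the double-precision patch can be tried outside the kernel. The following is a sketch only, under stated assumptions: struct u128, srl128_sticky() and add128() are hypothetical names, the value is held as two 64-bit halves as in the patch (hrm:lrm, hzm:lzm), and a shift count of zero is handled explicitly, which the patch itself never needs.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-in for a 128-bit value held as two 64-bit halves. */
struct u128 {
	uint64_t hi;
	uint64_t lo;
};

/*
 * Logical right shift by count bits. Any nonzero bit shifted out is
 * ORed into bit 0 of the result (sticky bit), so a later rounding step
 * can still tell that the value was inexact.
 */
static struct u128 srl128_sticky(struct u128 v, unsigned int count)
{
	struct u128 r = { 0, 0 };
	uint64_t lost;

	if (count == 0)
		return v;
	if (count >= 128) {
		r.lo = (v.hi | v.lo) != 0;
	} else if (count == 64) {
		r.lo = v.hi | (v.lo != 0);
	} else if (count > 64) {
		lost = (v.hi << (128 - count)) | v.lo;
		r.lo = (v.hi >> (count - 64)) | (lost != 0);
	} else {
		lost = v.lo << (64 - count);
		r.hi = v.hi >> count;
		r.lo = (v.lo >> count) | (v.hi << (64 - count)) | (lost != 0);
	}
	return r;
}

/*
 * 128-bit addition. The carry out of the low half is detected through
 * unsigned wraparound: the sum is smaller than an operand only if it
 * wrapped.
 */
static struct u128 add128(struct u128 a, struct u128 b)
{
	struct u128 r;

	r.lo = a.lo + b.lo;
	r.hi = a.hi + b.hi + (r.lo < a.lo);
	return r;
}
```

Shifting by (126 - 55) moves an explicit bit from bit 126 of the pair down to bit 55 of the low half, which is the position the patch comment says ieee754dp_format expects.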
From: Aleksandar Markovic <[email protected]>
Fix definition and usage of maddf_flags enumeration. Avoid duplicate
definition and apply more common capitalization.
This patch does not change any scenario. It just makes MADDF and MSUBF
emulation code more readable and easier to maintain, and hopefully
also prevents future bugs.
Signed-off-by: Miodrag Dinic <[email protected]>
Signed-off-by: Goran Ferenc <[email protected]>
Signed-off-by: Aleksandar Markovic <[email protected]>
---
arch/mips/math-emu/dp_maddf.c | 19 ++++++++-----------
arch/mips/math-emu/ieee754int.h | 4 ++++
arch/mips/math-emu/sp_maddf.c | 19 ++++++++-----------
3 files changed, 20 insertions(+), 22 deletions(-)
diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
index 68b55c8..d36f01a 100644
--- a/arch/mips/math-emu/dp_maddf.c
+++ b/arch/mips/math-emu/dp_maddf.c
@@ -14,9 +14,6 @@
#include "ieee754dp.h"
-enum maddf_flags {
- maddf_negate_product = 1 << 0,
-};
/* 128 bits shift right logical with rounding. */
void srl128(u64 *hptr, u64 *lptr, int count)
@@ -111,8 +108,8 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
if ((zc == IEEE754_CLASS_INF) &&
- ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
- ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ ((!(flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))) ||
+ ((flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))))) {
/*
* Cases of addition of infinities with opposite signs
* or subtraction of infinities with same signs.
@@ -124,9 +121,9 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
* z is here either not infinity, or infinity of the same sign
* as maddf_negate_product * x * y. So, the result must be
* infinity, and its sign is determined only by the value of
- * (flags & maddf_negate_product) and the signs of x and y.
+ * (flags & MADDF_NEGATE_PRODUCT) and the signs of x and y.
*/
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
return ieee754dp_inf(1 ^ (xs ^ ys));
else
return ieee754dp_inf(xs ^ ys);
@@ -140,8 +137,8 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
return ieee754dp_inf(zs);
/* Handle cases +0 + (-0) and similar ones. */
if (zc == IEEE754_CLASS_ZERO) {
- if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
- ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
+ if ((!(flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))) ||
+ ((flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))))
return z;
else
return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
@@ -184,7 +181,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
re = xe + ye;
rs = xs ^ ys;
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
rs ^= 1;
/* shunt to top of word */
@@ -335,5 +332,5 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
union ieee754dp y)
{
- return _dp_maddf(z, x, y, maddf_negate_product);
+ return _dp_maddf(z, x, y, MADDF_NEGATE_PRODUCT);
}
diff --git a/arch/mips/math-emu/ieee754int.h b/arch/mips/math-emu/ieee754int.h
index 8bc2f69..dd2071f 100644
--- a/arch/mips/math-emu/ieee754int.h
+++ b/arch/mips/math-emu/ieee754int.h
@@ -26,6 +26,10 @@
#define CLPAIR(x, y) ((x)*6+(y))
+enum maddf_flags {
+ MADDF_NEGATE_PRODUCT = 1 << 0,
+};
+
static inline void ieee754_clearcx(void)
{
ieee754_csr.cx = 0;
diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
index b380189..715cc47 100644
--- a/arch/mips/math-emu/sp_maddf.c
+++ b/arch/mips/math-emu/sp_maddf.c
@@ -14,9 +14,6 @@
#include "ieee754sp.h"
-enum maddf_flags {
- maddf_negate_product = 1 << 0,
-};
static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
union ieee754sp y, enum maddf_flags flags)
@@ -81,8 +78,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
if ((zc == IEEE754_CLASS_INF) &&
- ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
- ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
+ ((!(flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))) ||
+ ((flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))))) {
/*
* Cases of addition of infinities with opposite signs
* or subtraction of infinities with same signs.
@@ -94,9 +91,9 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
* z is here either not infinity, or infinity of the same sign
* as maddf_negate_product * x * y. So, the result must be
* infinity, and its sign is determined only by the value of
- * (flags & maddf_negate_product) and the signs of x and y.
+ * (flags & MADDF_NEGATE_PRODUCT) and the signs of x and y.
*/
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
return ieee754sp_inf(1 ^ (xs ^ ys));
else
return ieee754sp_inf(xs ^ ys);
@@ -110,8 +107,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
return ieee754sp_inf(zs);
/* Handle cases +0 + (-0) and similar ones. */
if (zc == IEEE754_CLASS_ZERO) {
- if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
- ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
+ if ((!(flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))) ||
+ ((flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))))
return z;
else
return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
@@ -156,7 +153,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
re = xe + ye;
rs = xs ^ ys;
- if (flags & maddf_negate_product)
+ if (flags & MADDF_NEGATE_PRODUCT)
rs ^= 1;
/* Multiple 24 bit xm and ym to give 48 bit results */
@@ -255,5 +252,5 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
union ieee754sp y)
{
- return _sp_maddf(z, x, y, maddf_negate_product);
+ return _sp_maddf(z, x, y, MADDF_NEGATE_PRODUCT);
}
--
2.7.4
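As an illustration of what the flag controls, the sign logic touched by this rename can be condensed into two small helpers. This is a sketch, not code from the patch: product_sign() and is_invalid_inf_sum() are hypothetical names, and the second helper assumes that both z and the product are already known to be infinite.

```c
#include <assert.h>

enum maddf_flags {
	MADDF_NEGATE_PRODUCT = 1 << 0,
};

/* Sign of the product term: xs ^ ys, flipped for the MSUBF case. */
static int product_sign(int xs, int ys, enum maddf_flags flags)
{
	int rs = xs ^ ys;

	if (flags & MADDF_NEGATE_PRODUCT)
		rs ^= 1;
	return rs;
}

/*
 * With z and the product both infinite, the operation is invalid when
 * the two infinities that get added have opposite signs.
 */
static int is_invalid_inf_sum(int zs, int xs, int ys, enum maddf_flags flags)
{
	return zs != product_sign(xs, ys, flags);
}
```

Folding the flag into the product sign first gives the same truth table as the two-armed condition in the diff: without the flag the case is invalid when zs != (xs ^ ys), with the flag when zs == (xs ^ ys).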
On Fri, Jul 21, 2017 at 04:09:02PM +0200, Aleksandar Markovic wrote:
> From: Goran Ferenc <[email protected]>
>
> Extend clobber lists to include all GP registers.
>
Consider adding:
Fixes: 0b523a85e134 ("MIPS: VDSO: Add implementation of gettimeofday() fallback")
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/vdso/gettimeofday.c | 6 ++++--
> 1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/arch/mips/vdso/gettimeofday.c b/arch/mips/vdso/gettimeofday.c
> index 974276e..e2690d7 100644
> --- a/arch/mips/vdso/gettimeofday.c
> +++ b/arch/mips/vdso/gettimeofday.c
> @@ -35,7 +35,8 @@ static __always_inline long gettimeofday_fallback(struct timeval *_tv,
> " syscall\n"
> : "=r" (ret), "=r" (error)
> : "r" (tv), "r" (tz), "r" (nr)
> - : "memory");
> + : "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13",
> + "$14", "$15", "$24", "$25", "hi", "lo", "memory");
>
> return error ? -ret : ret;
> }
> @@ -55,7 +56,8 @@ static __always_inline long clock_gettime_fallback(clockid_t _clkid,
> " syscall\n"
> : "=r" (ret), "=r" (error)
> : "r" (clkid), "r" (ts), "r" (nr)
> - : "memory");
> + : "$1", "$3", "$8", "$9", "$10", "$11", "$12", "$13",
> + "$14", "$15", "$24", "$25", "hi", "lo", "memory");
>
> return error ? -ret : ret;
> }
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:03PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
> are quiet NaNs. The specifications of <MAX|MAXA|MIN|MINA>.<D|S> state
> that the returned value in such cases should be the quiet NaN
> contained in register fs.
>
> The relevant example:
>
> MAX.S fd,fs,ft:
> If fs contains qNaN1, and ft contains qNaN2, fd is going to contain
> qNaN1 (without this patch, it used to contain qNaN2).
>
Consider adding:
Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Consider adding:
Cc: <[email protected]> # 4.3+
> ---
> arch/mips/math-emu/dp_fmax.c | 8 ++++++--
> arch/mips/math-emu/dp_fmin.c | 8 ++++++--
> arch/mips/math-emu/sp_fmax.c | 8 ++++++--
> arch/mips/math-emu/sp_fmin.c | 8 ++++++--
> 4 files changed, 24 insertions(+), 8 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> index fd71b8d..567fc33 100644
> --- a/arch/mips/math-emu/dp_fmax.c
> +++ b/arch/mips/math-emu/dp_fmax.c
> @@ -47,6 +47,9 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754dp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
couldn't the above...
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -54,7 +57,6 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
... go somewhere around here and fall through to the existing return x
case?
and same below of course.
Otherwise:
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> @@ -147,6 +149,9 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754dp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -154,7 +159,6 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
> index c1072b0..77f7ca9 100644
> --- a/arch/mips/math-emu/dp_fmin.c
> +++ b/arch/mips/math-emu/dp_fmin.c
> @@ -47,6 +47,9 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754dp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -54,7 +57,6 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> @@ -147,6 +149,9 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754dp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -154,7 +159,6 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
> index 4d00084..d46e8e4 100644
> --- a/arch/mips/math-emu/sp_fmax.c
> +++ b/arch/mips/math-emu/sp_fmax.c
> @@ -47,6 +47,9 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754sp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -54,7 +57,6 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> @@ -147,6 +149,9 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754sp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -154,7 +159,6 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
> index 4eb1bb9..b528c4b 100644
> --- a/arch/mips/math-emu/sp_fmin.c
> +++ b/arch/mips/math-emu/sp_fmin.c
> @@ -47,6 +47,9 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754sp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -54,7 +57,6 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> @@ -147,6 +149,9 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> return ieee754sp_nanxcpt(x);
>
> + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> + return x;
> +
> /* numbers are preferred to NaNs */
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> @@ -154,7 +159,6 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> --
> 2.7.4
>
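The NaN selection rule from the commit message can be sketched on raw single-precision bit patterns. The sketch assumes both NaNs are quiet (signaling NaNs, which the emulator handles earlier via ieee754sp_nanxcpt, are not modeled), and the helper names is_nan32() and pick_for_nan_inputs() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* NaN: exponent field all ones and a nonzero fraction. */
static int is_nan32(uint32_t bits)
{
	return (bits & 0x7f800000u) == 0x7f800000u &&
	       (bits & 0x007fffffu) != 0;
}

/*
 * Returns 1 and stores the result in *fd if at least one input is a
 * NaN, 0 otherwise. Two NaNs yield the one in fs; a single NaN yields
 * the other (numeric) operand.
 */
static int pick_for_nan_inputs(uint32_t fs, uint32_t ft, uint32_t *fd)
{
	if (is_nan32(fs) && is_nan32(ft)) {
		*fd = fs;
		return 1;
	}
	if (is_nan32(fs)) {
		*fd = ft;
		return 1;
	}
	if (is_nan32(ft)) {
		*fd = fs;
		return 1;
	}
	return 0;
}
```

Using distinct payloads for the two NaNs makes it visible which operand was propagated.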
On Fri, Jul 21, 2017 at 04:09:04PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
> are zeros. The right behavior in such cases is stated in the instruction
> reference manual and is as follows:
>
>   fs    ft    MAX    MIN    MAXA   MINA
>  ---------------------------------------
>    0     0     0      0      0      0
>    0    -0     0     -0      0     -0
>   -0     0     0     -0      0     -0
>   -0    -0    -0     -0     -0     -0
To be fair, I think the min behaviour was already technically correct.
When the values matched it returned that value, and when they didn't
match it returned -0, so max could have just been fixed to return
ieee754*p_zero(0), but it's fine IMO to rewrite both like you have.
>
> The relevant example:
>
> MAX.S fd,fs,ft:
> If fs contains +0, and ft contains -0, fd is going to contain 0
> (without this patch, it used to contain -0).
>
Consider adding Fixes and Cc stable tags, as with the other patch.
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Otherwise
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_fmax.c | 8 ++------
> arch/mips/math-emu/dp_fmin.c | 8 ++------
> arch/mips/math-emu/sp_fmax.c | 8 ++------
> arch/mips/math-emu/sp_fmin.c | 8 ++------
> 4 files changed, 8 insertions(+), 24 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> index 567fc33..9517572 100644
> --- a/arch/mips/math-emu/dp_fmax.c
> +++ b/arch/mips/math-emu/dp_fmax.c
> @@ -82,9 +82,7 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
> return ys ? x : y;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754dp_zero(1);
> + return ieee754dp_zero(xs & ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> DPDNORMX;
> @@ -184,9 +182,7 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> return y;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754dp_zero(1);
> + return ieee754dp_zero(xs & ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> DPDNORMX;
> diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
> index 77f7ca9..7069320 100644
> --- a/arch/mips/math-emu/dp_fmin.c
> +++ b/arch/mips/math-emu/dp_fmin.c
> @@ -82,9 +82,7 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
> return ys ? y : x;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754dp_zero(1);
> + return ieee754dp_zero(xs | ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> DPDNORMX;
> @@ -184,9 +182,7 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> return y;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754dp_zero(1);
> + return ieee754dp_zero(xs | ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> DPDNORMX;
> diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
> index d46e8e4..d72111a 100644
> --- a/arch/mips/math-emu/sp_fmax.c
> +++ b/arch/mips/math-emu/sp_fmax.c
> @@ -82,9 +82,7 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
> return ys ? x : y;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754sp_zero(1);
> + return ieee754sp_zero(xs & ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> SPDNORMX;
> @@ -184,9 +182,7 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> return y;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754sp_zero(1);
> + return ieee754sp_zero(xs & ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> SPDNORMX;
> diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
> index b528c4b..61ff9c6 100644
> --- a/arch/mips/math-emu/sp_fmin.c
> +++ b/arch/mips/math-emu/sp_fmin.c
> @@ -82,9 +82,7 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
> return ys ? y : x;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754sp_zero(1);
> + return ieee754sp_zero(xs | ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> SPDNORMX;
> @@ -184,9 +182,7 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> return y;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> - if (xs == ys)
> - return x;
> - return ieee754sp_zero(1);
> + return ieee754sp_zero(xs | ys);
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_DNORM):
> SPDNORMX;
> --
> 2.7.4
>
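The sign rules for two zero inputs reduce to one bitwise operation each, which can be checked against the table in the commit message. In this sketch the sign is 0 for +0 and 1 for -0; max_zero_sign(), min_zero_sign() and old_zero_sign() are hypothetical names, the last one modelling the code that the patch removes.

```c
#include <assert.h>

/* MAX and MAXA: the result is -0 only if both inputs are -0. */
static int max_zero_sign(int xs, int ys)
{
	return xs & ys;
}

/* MIN and MINA: the result is -0 if either input is -0. */
static int min_zero_sign(int xs, int ys)
{
	return xs | ys;
}

/* Removed logic: equal signs pass through, mixed signs give -0. */
static int old_zero_sign(int xs, int ys)
{
	return xs == ys ? xs : 1;
}
```

The old logic matches min_zero_sign() for all four input pairs, which is the point made in the review above; it differs from max_zero_sign() only for mixed signs.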
On Fri, Jul 21, 2017 at 04:09:05PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the value returned by <MAX|MIN>.<D|S>, if both inputs are negative
> normal fp numbers. The previous logic did not take into account that
> if both inputs have the same sign, there should be separate treatment
> of the cases when both inputs are negative and when both inputs are
> positive.
>
> The relevant example:
>
> MAX.S fd,fs,ft:
> If fs contains -5, and ft contains -7, fd is going to contain -5
> (without this patch, it used to contain -7).
ouch!
>
same fixes/stable comment as for previous min/max patches
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_fmax.c | 33 +++++++++++++++++++++++++--------
> arch/mips/math-emu/dp_fmin.c | 33 +++++++++++++++++++++++++--------
> arch/mips/math-emu/sp_fmax.c | 33 +++++++++++++++++++++++++--------
> arch/mips/math-emu/sp_fmin.c | 32 +++++++++++++++++++++++++-------
> 4 files changed, 100 insertions(+), 31 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> index 9517572..a0175cc 100644
> --- a/arch/mips/math-emu/dp_fmax.c
> +++ b/arch/mips/math-emu/dp_fmax.c
> @@ -106,16 +106,33 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
> else if (xs < ys)
> return x;
>
> - /* Compare exponent */
> - if (xe > ye)
> - return x;
> - else if (xe < ye)
> - return y;
> + /* Signs of inputs are the same, let's compare exponents */
> + if (xs == 0) {
> + /* Inputs are both positive */
> + if (xe > ye)
> + return x;
> + else if (xe < ye)
> + return y;
> + } else {
> + /* Inputs are both negative */
> + if (xe > ye)
> + return y;
> + else if (xe < ye)
> + return x;
> + }
>
> - /* Compare mantissa */
> - if (xm <= ym)
> + /* Signs and exponents of inputs are the same, let's compare mantissas */
> + if (xs == 0) {
> + /* Inputs are both positive, with equal exponents */
> + if (xm <= ym)
> + return y;
> + return x;
> + } else {
> + /* Inputs are both negative, with equal exponents */
> + if (xm <= ym)
> + return x;
> return y;
> - return x;
> + }
> }
>
> union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
> index 7069320..074a858 100644
> --- a/arch/mips/math-emu/dp_fmin.c
> +++ b/arch/mips/math-emu/dp_fmin.c
> @@ -106,16 +106,33 @@ union ieee754dp ieee754dp_fmin(union ieee754dp x, union ieee754dp y)
> else if (xs < ys)
> return y;
>
> - /* Compare exponent */
> - if (xe > ye)
> - return y;
> - else if (xe < ye)
> - return x;
> + /* Signs of inputs are the same, let's compare exponents */
> + if (xs == 0) {
> + /* Inputs are both positive */
> + if (xe > ye)
> + return y;
> + else if (xe < ye)
> + return x;
> + } else {
> + /* Inputs are both negative */
> + if (xe > ye)
> + return x;
> + else if (xe < ye)
> + return y;
> + }
>
> - /* Compare mantissa */
> - if (xm <= ym)
> + /* Signs and exponents of inputs are the same, let's compare mantissas */
> + if (xs == 0) {
> + /* Inputs are both positive, with equal exponents */
> + if (xm <= ym)
> + return x;
> + return y;
> + } else {
> + /* Inputs are both negative, with equal exponents */
> + if (xm <= ym)
> + return y;
> return x;
> - return y;
> + }
> }
>
> union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
> index d72111a..15825db 100644
> --- a/arch/mips/math-emu/sp_fmax.c
> +++ b/arch/mips/math-emu/sp_fmax.c
> @@ -106,16 +106,33 @@ union ieee754sp ieee754sp_fmax(union ieee754sp x, union ieee754sp y)
> else if (xs < ys)
> return x;
>
> - /* Compare exponent */
> - if (xe > ye)
> - return x;
> - else if (xe < ye)
> - return y;
> + /* Signs of inputs are the same, let's compare exponents */
> + if (xs == 0) {
> + /* Inputs are both positive */
> + if (xe > ye)
> + return x;
> + else if (xe < ye)
> + return y;
> + } else {
> + /* Inputs are both negative */
> + if (xe > ye)
> + return y;
> + else if (xe < ye)
> + return x;
> + }
>
> - /* Compare mantissa */
> - if (xm <= ym)
> + /* Signs and exponents of inputs are the same, let's compare mantissas */
> + if (xs == 0) {
> + /* Inputs are both positive, with equal exponents */
> + if (xm <= ym)
> + return y;
> + return x;
> + } else {
> + /* Inputs are both negative, with equal exponents */
> + if (xm <= ym)
> + return x;
> return y;
> - return x;
> + }
> }
>
> union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
> index 61ff9c6..f1418f7 100644
> --- a/arch/mips/math-emu/sp_fmin.c
> +++ b/arch/mips/math-emu/sp_fmin.c
> @@ -106,16 +106,34 @@ union ieee754sp ieee754sp_fmin(union ieee754sp x, union ieee754sp y)
> else if (xs < ys)
> return y;
>
> - /* Compare exponent */
> - if (xe > ye)
> + /* Signs of inputs are the same, let's compare exponents */
> + if (xs == 0) {
> + /* Inputs are both positive */
> + if (xe > ye)
> + return y;
> + else if (xe < ye)
> + return x;
> + } else {
> + /* Inputs are both negative */
> + if (xe > ye)
> + return x;
> + else if (xe < ye)
> + return y;
> + }
> +
> + /* Signs and exponents of inputs are the same, let's compare mantissas */
> + if (xs == 0) {
> + /* Inputs are both positive, with equal exponents */
> + if (xm <= ym)
> + return x;
> return y;
> - else if (xe < ye)
> + } else {
> + /* Inputs are both negative, with equal exponents */
> + if (xm <= ym)
> + return y;
> return x;
> + }
>
> - /* Compare mantissa */
> - if (xm <= ym)
> - return x;
> - return y;
> }
>
> union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> --
> 2.7.4
>
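To see why the sign has to be taken into account when comparing exponents and mantissas, the fixed MAX logic can be sketched on raw single-precision bit patterns. This is a simplified illustration for normal numbers only (no NaN, infinity, zero or denormal handling), and max_normal_bits() is a hypothetical name.

```c
#include <assert.h>
#include <stdint.h>

/* MAX of two normal single-precision values given as raw bits. */
static uint32_t max_normal_bits(uint32_t x, uint32_t y)
{
	unsigned int xs = x >> 31, ys = y >> 31;
	unsigned int xe = (x >> 23) & 0xff, ye = (y >> 23) & 0xff;
	uint32_t xm = x & 0x007fffffu, ym = y & 0x007fffffu;

	/* Different signs: the positive input is the maximum. */
	if (xs > ys)
		return y;
	if (xs < ys)
		return x;

	if (xs == 0) {
		/* Both positive: larger magnitude wins. */
		if (xe > ye)
			return x;
		if (xe < ye)
			return y;
		return xm <= ym ? y : x;
	}

	/* Both negative: smaller magnitude wins. */
	if (xe > ye)
		return y;
	if (xe < ye)
		return x;
	return xm <= ym ? x : y;
}
```

With -5.0f (0xC0A00000) and -7.0f (0xC0E00000) the exponents are equal, so the example from the commit message is decided by the mantissa comparison in the negative branch.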
On Fri, Jul 21, 2017 at 04:09:06PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the value returned by <MAXA|MINA>.<D|S>, if inputs are normal fp
> numbers of the same absolute value, but opposite signs.
>
> The relevant example:
>
> MAXA.S fd,fs,ft:
> If fs contains -3, and ft contains +3, fd is going to contain +3
> (without this patch, it used to contain -3).
I think it's worth mentioning also that for MINA.*, it returns the
negative one when the absolute values are equal (The phrase "For equal
absolute values, returns the smallest positive argument" in the manual
is a bit ambiguous IMO, so I ended up checking what I6500 did).
>
Usual fixes/stable thing.
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Thanks
James
> ---
> arch/mips/math-emu/dp_fmax.c | 8 ++++++--
> arch/mips/math-emu/dp_fmin.c | 6 +++++-
> arch/mips/math-emu/sp_fmax.c | 8 ++++++--
> arch/mips/math-emu/sp_fmin.c | 6 +++++-
> 4 files changed, 22 insertions(+), 6 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> index a0175cc..860b43f9 100644
> --- a/arch/mips/math-emu/dp_fmax.c
> +++ b/arch/mips/math-emu/dp_fmax.c
> @@ -224,7 +224,11 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> return y;
>
> /* Compare mantissa */
> - if (xm <= ym)
> + if (xm < ym)
> return y;
> - return x;
> + else if (xm > ym)
> + return x;
> + else if (xs == 0)
> + return x;
> + return y;
> }
> diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
> index 074a858..73d85e4 100644
> --- a/arch/mips/math-emu/dp_fmin.c
> +++ b/arch/mips/math-emu/dp_fmin.c
> @@ -224,7 +224,11 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> return x;
>
> /* Compare mantissa */
> - if (xm <= ym)
> + if (xm < ym)
> + return x;
> + else if (xm > ym)
> + return y;
> + else if (xs == 1)
> return x;
> return y;
> }
> diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
> index 15825db..fec7f64 100644
> --- a/arch/mips/math-emu/sp_fmax.c
> +++ b/arch/mips/math-emu/sp_fmax.c
> @@ -224,7 +224,11 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> return y;
>
> /* Compare mantissa */
> - if (xm <= ym)
> + if (xm < ym)
> return y;
> - return x;
> + else if (xm > ym)
> + return x;
> + else if (xs == 0)
> + return x;
> + return y;
> }
> diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
> index f1418f7..74780bc 100644
> --- a/arch/mips/math-emu/sp_fmin.c
> +++ b/arch/mips/math-emu/sp_fmin.c
> @@ -225,7 +225,11 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> return x;
>
> /* Compare mantissa */
> - if (xm <= ym)
> + if (xm < ym)
> + return x;
> + else if (xm > ym)
> + return y;
> + else if (xs == 1)
> return x;
> return y;
> }
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:07PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the value returned by <MAXA|MINA>.<D|S> fd,fs,ft, if both inputs
> are infinite. The previous implementation always returned the value
> contained in ft in such cases. The correct behavior is specified
> in the MIPS instruction set manual and is as follows:
>
>    fs      ft      MAXA    MINA
>   ---------------------------------
>    inf     inf     inf     inf
>    inf    -inf     inf    -inf
>   -inf     inf     inf    -inf
>   -inf    -inf    -inf    -inf
>
> The relevant example:
>
> MAXA.S fd,fs,ft:
> If fs contains +inf, and ft contains -inf, fd is going to contain
> +inf (without this patch, it used to contain -inf).
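
The table above reduces to a single bit operation on the two sign bits,
which is what the patch passes to ieee754{dp,sp}_inf(). A minimal sketch,
assuming the math-emu convention that a sign of 0 means positive and 1
means negative; the two function names are hypothetical:

```c
/* Sign bits: 0 = positive, 1 = negative (assumed convention). */

/* MAXA of two infinities is negative only if both inputs are negative. */
static int maxa_inf_sign(int xs, int ys)
{
	return xs & ys;
}

/* MINA of two infinities is negative if either input is negative. */
static int mina_inf_sign(int xs, int ys)
{
	return xs | ys;
}
```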
>
Same Fixes/stable thing
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_fmax.c | 4 +++-
> arch/mips/math-emu/dp_fmin.c | 4 +++-
> arch/mips/math-emu/sp_fmax.c | 4 +++-
> arch/mips/math-emu/sp_fmin.c | 4 +++-
> 4 files changed, 12 insertions(+), 4 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> index 860b43f9..5459643 100644
> --- a/arch/mips/math-emu/dp_fmax.c
> +++ b/arch/mips/math-emu/dp_fmax.c
> @@ -183,6 +183,9 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> /*
> * Infinity and zero handling
> */
> + case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> + return ieee754dp_inf(xs & ys);
> +
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> @@ -190,7 +193,6 @@ union ieee754dp ieee754dp_fmaxa(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
> index 73d85e4..d4cd243 100644
> --- a/arch/mips/math-emu/dp_fmin.c
> +++ b/arch/mips/math-emu/dp_fmin.c
> @@ -183,6 +183,9 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> /*
> * Infinity and zero handling
> */
> + case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> + return ieee754dp_inf(xs | ys);
> +
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> @@ -190,7 +193,6 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> diff --git a/arch/mips/math-emu/sp_fmax.c b/arch/mips/math-emu/sp_fmax.c
> index fec7f64..528a90b 100644
> --- a/arch/mips/math-emu/sp_fmax.c
> +++ b/arch/mips/math-emu/sp_fmax.c
> @@ -183,6 +183,9 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> /*
> * Infinity and zero handling
> */
> + case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> + return ieee754sp_inf(xs & ys);
> +
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> @@ -190,7 +193,6 @@ union ieee754sp ieee754sp_fmaxa(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
> index 74780bc..5f1d650 100644
> --- a/arch/mips/math-emu/sp_fmin.c
> +++ b/arch/mips/math-emu/sp_fmin.c
> @@ -184,6 +184,9 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> /*
> * Infinity and zero handling
> */
> + case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> + return ieee754sp_inf(xs | ys);
> +
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> @@ -191,7 +194,6 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> return x;
>
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:08PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the following special cases for MINA.<D|S>:
>
> - if one of the inputs is zero, and the other is subnormal, normal,
> or infinity, the value of the former should be returned (that is,
> a zero).
> - if one of the inputs is infinity, and the other input is normal,
> or subnormal, the value of the latter should be returned.
>
> The previous implementation's logic for such cases was incorrect - it
> appears as if it implements MAXA, and not MINA instruction.
>
> The relevant example:
>
> MINA.S fd,fs,ft:
> If fs contains 100.0, and ft contains 0.0, fd is going to contain
> 0.0 (without this patch, it used to contain 100.0).
Another ouch!
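
One way to read the two rules in the commit message is as a ranking by
magnitude class: zero is below any finite nonzero value, which is below
infinity, and MINA returns the lower-ranked operand. The sketch below
only illustrates that reading; the enum, magnitude_rank() and
mina_special() are hypothetical names, not math-emu identifiers.

```c
/* Hypothetical operand classes, NaNs excluded. */
enum mag_cls { MC_ZERO, MC_DNORM, MC_NORM, MC_INF };

static int magnitude_rank(enum mag_cls c)
{
	if (c == MC_ZERO)
		return 0;
	if (c == MC_INF)
		return 2;
	return 1;	/* normal and subnormal values */
}

/*
 * Returns 'x' or 'y' when the classes alone decide which operand MINA
 * returns, or '?' when the operands must be compared further (equal
 * rank, including the zero/zero and inf/inf pairs).
 */
static char mina_special(enum mag_cls xc, enum mag_cls yc)
{
	int rx = magnitude_rank(xc), ry = magnitude_rank(yc);

	if (rx < ry)
		return 'x';
	if (rx > ry)
		return 'y';
	return '?';
}
```

For the example in the commit message (fs = 100.0, ft = 0.0) the classes
are (NORM, ZERO), so the second operand is selected.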
>
Fixes/stable
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_fmin.c | 4 ++--
> arch/mips/math-emu/sp_fmin.c | 4 ++--
> 2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_fmin.c b/arch/mips/math-emu/dp_fmin.c
> index d4cd243..1e9ee3d 100644
> --- a/arch/mips/math-emu/dp_fmin.c
> +++ b/arch/mips/math-emu/dp_fmin.c
> @@ -191,14 +191,14 @@ union ieee754dp ieee754dp_fmina(union ieee754dp x, union ieee754dp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> - return x;
> + return y;
>
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
> - return y;
> + return x;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> return ieee754dp_zero(xs | ys);
> diff --git a/arch/mips/math-emu/sp_fmin.c b/arch/mips/math-emu/sp_fmin.c
> index 5f1d650..685ce75 100644
> --- a/arch/mips/math-emu/sp_fmin.c
> +++ b/arch/mips/math-emu/sp_fmin.c
> @@ -192,14 +192,14 @@ union ieee754sp ieee754sp_fmina(union ieee754sp x, union ieee754sp y)
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> - return x;
> + return y;
>
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_DNORM):
> - return y;
> + return x;
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> return ieee754sp_zero(xs | ys);
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:09PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the cases of <MADDF|MSUBF>.<D|S> when any of the three inputs is a
> NaN. The correct behavior of <MADDF|MSUBF>.<D|S> fd, fs, ft is as follows:
>
> - if any of inputs is sNaN, return a sNaN using following rules: if
> only one input is sNaN, return that one; if more than one input is
> sNaN, order of precedence for return value is fd, fs, ft
> - if no input is sNaN, but at least one of inputs is qNaN, return a
> qNaN using following rules: if only one input is qNaN, return that
> one; if more than one input is qNaN, order of precedence for
> return value is fd, fs, ft
>
> The previous code contained handling of some above cases, but not all.
> Also, such handling was scattered into various cases of
> "switch (CLPAIR(xc, yc))" statement and elsewhere. With this patch,
> this logic is placed in one place, and "switch (CLPAIR(xc, yc))" is
> significantly simplified.
>
> The relevant example:
>
> MADDF.S fd,fs,ft:
> If fs contains qNaN1, ft contains qNaN2, and fd contains qNaN3, fd
> is going to contain qNaN3 (without this patch, it used to contain
> qNaN1).
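
The precedence rules above can be written as one flat chain of checks.
The following is a sketch of that ordering only, assuming z, x and y
stand for fd, fs and ft; the enum and maddf_nan_source() are hypothetical
names, and raising the invalid-operation exception for sNaN inputs is not
modelled.

```c
/* Hypothetical classification: anything that is not a NaN is NC_OTHER. */
enum nan_cls { NC_OTHER, NC_QNAN, NC_SNAN };

/*
 * Returns 'z', 'x' or 'y' to name the operand whose NaN is propagated,
 * or 0 if none of the operands is a NaN. Any sNaN beats any qNaN; within
 * each kind the order of precedence is z (fd), x (fs), y (ft).
 */
static char maddf_nan_source(enum nan_cls zc, enum nan_cls xc,
			     enum nan_cls yc)
{
	if (zc == NC_SNAN)
		return 'z';
	if (xc == NC_SNAN)
		return 'x';
	if (yc == NC_SNAN)
		return 'y';
	if (zc == NC_QNAN)
		return 'z';
	if (xc == NC_QNAN)
		return 'x';
	if (yc == NC_QNAN)
		return 'y';
	return 0;
}
```

With three qNaN inputs this selects z, matching the qNaN3 result in the
example above.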
>
Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
If backported, I suspect commits:
6162051e87f6 ("MIPS: math-emu: Unify ieee754sp_m{add,sub}f")
and
d728f6709bcc ("MIPS: math-emu: Unify ieee754dp_m{add,sub}f")
in 4.7 will require manual backporting between 4.3 and 4.7 (due to
separation of maddf/msubf before that point), so I suppose tagging
stable 4.7+ and backporting is best (assuming you consider this fix
worth backporting).
> ---
> arch/mips/math-emu/dp_maddf.c | 71 ++++++++++++++-----------------------------
> arch/mips/math-emu/sp_maddf.c | 69 ++++++++++++++---------------------------
> 2 files changed, 46 insertions(+), 94 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
> index caa62f2..4f2e783 100644
> --- a/arch/mips/math-emu/dp_maddf.c
> +++ b/arch/mips/math-emu/dp_maddf.c
> @@ -48,52 +48,35 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
>
> ieee754_clearcx();
>
> - switch (zc) {
> - case IEEE754_CLASS_SNAN:
> - ieee754_setcx(IEEE754_INVALID_OPERATION);
> - return ieee754dp_nanxcpt(z);
> - case IEEE754_CLASS_DNORM:
> - DPDNORMZ;
> - /* QNAN and ZERO cases are handled separately below */
> - }
> -
> - switch (CLPAIR(xc, yc)) {
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
> - return ieee754dp_nanxcpt(y);
> -
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> - return ieee754dp_nanxcpt(x);
> -
> - case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> + /* handle the cases when at least one of x, y or z is a NaN */
> + if (((xc == IEEE754_CLASS_SNAN) || (xc == IEEE754_CLASS_QNAN)) ||
> + ((yc == IEEE754_CLASS_SNAN) || (yc == IEEE754_CLASS_QNAN)) ||
> + ((zc == IEEE754_CLASS_SNAN) || (zc == IEEE754_CLASS_QNAN))) {
This condition basically covers all of the cases below. Any particular
reason not to skip it ...
> + /* order of precedence is z, x, y */
> + if (zc == IEEE754_CLASS_SNAN)
> + return ieee754dp_nanxcpt(z);
> + if (xc == IEEE754_CLASS_SNAN)
> + return ieee754dp_nanxcpt(x);
> + if (yc == IEEE754_CLASS_SNAN)
> + return ieee754dp_nanxcpt(y);
> + if (zc == IEEE754_CLASS_QNAN)
> + return z;
> + if (xc == IEEE754_CLASS_QNAN)
> + return x;
> return y;
... and make this return conditional on (yc == IEEE754_CLASS_QNAN)?
Same for sp_maddf.c too.
Otherwise:
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> + }
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
> - return x;
> + if (zc == IEEE754_CLASS_DNORM)
> + DPDNORMZ;
> + /* ZERO z cases are handled separately below */
>
> + switch (CLPAIR(xc, yc)) {
>
> /*
> * Infinity handling
> */
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> ieee754_setcx(IEEE754_INVALID_OPERATION);
> return ieee754dp_indef();
>
> @@ -102,8 +85,6 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> return ieee754dp_inf(xs ^ ys);
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> @@ -120,25 +101,19 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> DPDNORMX;
>
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> - else if (zc == IEEE754_CLASS_INF)
> + if (zc == IEEE754_CLASS_INF)
> return ieee754dp_inf(zs);
> DPDNORMY;
> break;
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> - else if (zc == IEEE754_CLASS_INF)
> + if (zc == IEEE754_CLASS_INF)
> return ieee754dp_inf(zs);
> DPDNORMX;
> break;
>
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> - else if (zc == IEEE754_CLASS_INF)
> + if (zc == IEEE754_CLASS_INF)
> return ieee754dp_inf(zs);
> /* fall through to real computations */
> }
> diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
> index c91d5e5..9fd2035 100644
> --- a/arch/mips/math-emu/sp_maddf.c
> +++ b/arch/mips/math-emu/sp_maddf.c
> @@ -48,51 +48,36 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
>
> ieee754_clearcx();
>
> - switch (zc) {
> - case IEEE754_CLASS_SNAN:
> - ieee754_setcx(IEEE754_INVALID_OPERATION);
> - return ieee754sp_nanxcpt(z);
> - case IEEE754_CLASS_DNORM:
> - SPDNORMZ;
> - /* QNAN and ZERO cases are handled separately below */
> + /* handle the cases when at least one of x, y or z is a NaN */
> + if (((xc == IEEE754_CLASS_SNAN) || (xc == IEEE754_CLASS_QNAN)) ||
> + ((yc == IEEE754_CLASS_SNAN) || (yc == IEEE754_CLASS_QNAN)) ||
> + ((zc == IEEE754_CLASS_SNAN) || (zc == IEEE754_CLASS_QNAN))) {
> + /* order of precedence is z, x, y */
> + if (zc == IEEE754_CLASS_SNAN)
> + return ieee754sp_nanxcpt(z);
> + if (xc == IEEE754_CLASS_SNAN)
> + return ieee754sp_nanxcpt(x);
> + if (yc == IEEE754_CLASS_SNAN)
> + return ieee754sp_nanxcpt(y);
> + if (zc == IEEE754_CLASS_QNAN)
> + return z;
> + if (xc == IEEE754_CLASS_QNAN)
> + return x;
> + return y;
> }
>
> + if (zc == IEEE754_CLASS_DNORM)
> + SPDNORMZ;
> + /* ZERO z cases are handled separately below */
> +
> switch (CLPAIR(xc, yc)) {
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_SNAN):
> - return ieee754sp_nanxcpt(y);
> -
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_SNAN):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_ZERO):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_NORM):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_DNORM):
> - case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> - return ieee754sp_nanxcpt(x);
> -
> - case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_QNAN):
> - return y;
>
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_ZERO):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_NORM):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_DNORM):
> - case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_INF):
> - return x;
>
> /*
> * Infinity handling
> */
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_INF):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> ieee754_setcx(IEEE754_INVALID_OPERATION);
> return ieee754sp_indef();
>
> @@ -101,8 +86,6 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> return ieee754sp_inf(xs ^ ys);
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> @@ -119,25 +102,19 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> SPDNORMX;
>
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_DNORM):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> - else if (zc == IEEE754_CLASS_INF)
> + if (zc == IEEE754_CLASS_INF)
> return ieee754sp_inf(zs);
> SPDNORMY;
> break;
>
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_NORM):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> - else if (zc == IEEE754_CLASS_INF)
> + if (zc == IEEE754_CLASS_INF)
> return ieee754sp_inf(zs);
> SPDNORMX;
> break;
>
> case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_NORM):
> - if (zc == IEEE754_CLASS_QNAN)
> - return z;
> - else if (zc == IEEE754_CLASS_INF)
> + if (zc == IEEE754_CLASS_INF)
> return ieee754sp_inf(zs);
> /* fall through to real computations */
> }
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:10PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the cases of <MADDF|MSUBF>.<D|S> when any of two multiplicands is
> infinity. The correct behavior in such cases is affected by the nature
> of third input. Cases of addition of infinities with opposite signs
> and subtraction of infinities with same signs may arise and must be
> handles separately. Also, the value od flags argument (that determines
s/handles/handled/?
s/od/of/
> whether the instruction is MADDF or MSUBF) affects the outcome.
>
> The relevant examples:
>
> MADDF.S fd,fs,ft:
> If fs contains +inf, ft contains +inf, and fd contains -inf, fd is
> going to contain indef (without this patch, it used to contain
> -inf).
>
> MSUBF.S fd,fs,ft:
> If fs contains +inf, ft contains 1.0, and fd contains +0.0, fd is
> going to contain -inf (without this patch, it used to contain +inf).
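
Both examples follow from comparing the sign of z with the sign of the
(possibly negated) product. A minimal sketch of that decision, assuming
sign bits of 0 for positive and 1 for negative and that at least one
multiplicand is infinite while the other is nonzero; struct inf_result
and maddf_inf_product() are hypothetical names:

```c
struct inf_result {
	int invalid;	/* 1: inf - inf, the result is the default NaN */
	int sign;	/* sign of the infinite result when !invalid */
};

/*
 * negate_product is 0 for MADDF and 1 for MSUBF. The result is invalid
 * only when z is an infinity whose sign differs from the sign of the
 * (possibly negated) infinite product.
 */
static struct inf_result maddf_inf_product(int z_is_inf, int zs, int xs,
					   int ys, int negate_product)
{
	int ps = xs ^ ys ^ (negate_product ? 1 : 0);
	struct inf_result r = { 0, ps };

	if (z_is_inf && zs != ps)
		r.invalid = 1;
	return r;
}
```

For the MADDF.S example (+inf * +inf added to -inf) this reports an
invalid operation; for the MSUBF.S example (+inf * 1.0 subtracted from
+0.0) it reports a negative infinity.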
>
Same fixes/stable notes as previous patch.
> Signed-off-by: Douglas Leung <[email protected]>
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_maddf.c | 21 ++++++++++++++++++++-
> arch/mips/math-emu/sp_maddf.c | 21 ++++++++++++++++++++-
> 2 files changed, 40 insertions(+), 2 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
> index 4f2e783..45f815d 100644
> --- a/arch/mips/math-emu/dp_maddf.c
> +++ b/arch/mips/math-emu/dp_maddf.c
> @@ -85,7 +85,26 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> - return ieee754dp_inf(xs ^ ys);
> + if ((zc == IEEE754_CLASS_INF) &&
> + ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
> + ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
> + /*
> + * Cases of addition of infinities with opposite signs
> + * or subtraction of infinities with same signs.
> + */
> + ieee754_setcx(IEEE754_INVALID_OPERATION);
> + return ieee754dp_indef();
> + }
> + /*
> + * z is here either not infinity, or infinity of the same sign
> + * as maddf_negate_product * x * y. So, the result must be
> + * infinity, and its sign is determined only by the value of
> + * (flags & maddf_negate_product) and the signs of x and y.
> + */
> + if (flags & maddf_negate_product)
> + return ieee754dp_inf(1 ^ (xs ^ ys));
> + else
> + return ieee754dp_inf(xs ^ ys);
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
> diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
> index 9fd2035..76856d7 100644
> --- a/arch/mips/math-emu/sp_maddf.c
> +++ b/arch/mips/math-emu/sp_maddf.c
> @@ -86,7 +86,26 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_NORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> - return ieee754sp_inf(xs ^ ys);
> + if ((zc == IEEE754_CLASS_INF) &&
> + ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
> + ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
> + /*
> + * Cases of addition of infinities with opposite signs
> + * or subtraction of infinities with same signs.
> + */
> + ieee754_setcx(IEEE754_INVALID_OPERATION);
> + return ieee754sp_indef();
> + }
> + /*
> + * z is here either not infinity, or infinity of the same sign
> + * as maddf_negate_product * x * y. So, the result must be
> + * infinity, and its sign is determined only by the value of
> + * (flags & maddf_negate_product) and the signs of x and y.
> + */
> + if (flags & maddf_negate_product)
> + return ieee754sp_inf(1 ^ (xs ^ ys));
> + else
> + return ieee754sp_inf(xs ^ ys);
>
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_ZERO):
> case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_NORM):
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:11PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix the cases of <MADDF|MSUBF>.<D|S> when any of two multiplicands is
> +0 or -0, and the third input is also +0 or -0. Depending on the signs
> of inputs, certain special cases must be handled.
>
> The relevant example:
>
> MADDF.S fd,fs,ft:
> If fs contains +0.0, ft contains -0.0, and fd contains 0.0, fd is
> going to contain +0.0 (without this patch, it used to contain -0.0).
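
The zero case follows the usual IEEE 754 rule for adding zeros: if both
terms have the same sign the result keeps it, otherwise the result is +0,
except in round-toward-negative mode where it is -0. A sketch of that
rule, assuming sign bits of 0 for positive and 1 for negative;
maddf_zero_sign() and its round_down parameter are hypothetical stand-ins
for the ieee754_csr.rm == FPU_CSR_RD test in the patch:

```c
/*
 * Sign of the zero result when z is zero and x * y is zero.
 * negate_product is 0 for MADDF and 1 for MSUBF.
 */
static int maddf_zero_sign(int zs, int xs, int ys, int negate_product,
			   int round_down)
{
	int ps = xs ^ ys ^ (negate_product ? 1 : 0);

	if (zs == ps)
		return zs;	/* same-signed zeros: keep z as is */
	return round_down ? 1 : 0;
}
```

For the example above (fs = +0.0, ft = -0.0, fd = +0.0, default rounding)
the product is -0 and z is +0, so the result is +0.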
>
Usual fixes/stable comments.
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Patch looks correct to me.
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_maddf.c | 8 ++++++++
> arch/mips/math-emu/sp_maddf.c | 8 ++++++++
> 2 files changed, 16 insertions(+)
>
> diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
> index 45f815d..b8b2c17 100644
> --- a/arch/mips/math-emu/dp_maddf.c
> +++ b/arch/mips/math-emu/dp_maddf.c
> @@ -113,6 +113,14 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> if (zc == IEEE754_CLASS_INF)
> return ieee754dp_inf(zs);
> + /* Handle cases +0 + (-0) and similar ones. */
> + if (zc == IEEE754_CLASS_ZERO) {
> + if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
> + ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
> + return z;
> + else
> + return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
> + }
> /* Multiplication is 0 so just return z */
> return z;
>
> diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
> index 76856d7..cb8597b 100644
> --- a/arch/mips/math-emu/sp_maddf.c
> +++ b/arch/mips/math-emu/sp_maddf.c
> @@ -114,6 +114,14 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> case CLPAIR(IEEE754_CLASS_DNORM, IEEE754_CLASS_ZERO):
> if (zc == IEEE754_CLASS_INF)
> return ieee754sp_inf(zs);
> + /* Handle cases +0 + (-0) and similar ones. */
> + if (zc == IEEE754_CLASS_ZERO) {
> + if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
> + ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
> + return z;
> + else
> + return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
> + }
> /* Multiplication is 0 so just return z */
> return z;
>
> --
> 2.7.4
>
On Fri, Jul 21, 2017 at 04:09:14PM +0200, Aleksandar Markovic wrote:
> From: Aleksandar Markovic <[email protected]>
>
> Fix definition and usage of maddf_flags enumeration. Avoid duplicate
> definition and apply more common capitalization.
>
> This patch does not change any scenario. It just makes MADDF and MSUBF
> emulation code more readable and easier to maintain, and hopefully
> also prevents future bugs.
>
> Signed-off-by: Miodrag Dinic <[email protected]>
> Signed-off-by: Goran Ferenc <[email protected]>
> Signed-off-by: Aleksandar Markovic <[email protected]>
Reviewed-by: James Hogan <[email protected]>
Cheers
James
> ---
> arch/mips/math-emu/dp_maddf.c | 19 ++++++++-----------
> arch/mips/math-emu/ieee754int.h | 4 ++++
> arch/mips/math-emu/sp_maddf.c | 19 ++++++++-----------
> 3 files changed, 20 insertions(+), 22 deletions(-)
>
> diff --git a/arch/mips/math-emu/dp_maddf.c b/arch/mips/math-emu/dp_maddf.c
> index 68b55c8..d36f01a 100644
> --- a/arch/mips/math-emu/dp_maddf.c
> +++ b/arch/mips/math-emu/dp_maddf.c
> @@ -14,9 +14,6 @@
>
> #include "ieee754dp.h"
>
> -enum maddf_flags {
> - maddf_negate_product = 1 << 0,
> -};
>
> /* 128 bits shift right logical with rounding. */
> void srl128(u64 *hptr, u64 *lptr, int count)
> @@ -111,8 +108,8 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> if ((zc == IEEE754_CLASS_INF) &&
> - ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
> - ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
> + ((!(flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))) ||
> + ((flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))))) {
> /*
> * Cases of addition of infinities with opposite signs
> * or subtraction of infinities with same signs.
> @@ -124,9 +121,9 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> * z is here either not infinity, or infinity of the same sign
> * as maddf_negate_product * x * y. So, the result must be
> * infinity, and its sign is determined only by the value of
> - * (flags & maddf_negate_product) and the signs of x and y.
> + * (flags & MADDF_NEGATE_PRODUCT) and the signs of x and y.
> */
> - if (flags & maddf_negate_product)
> + if (flags & MADDF_NEGATE_PRODUCT)
> return ieee754dp_inf(1 ^ (xs ^ ys));
> else
> return ieee754dp_inf(xs ^ ys);
> @@ -140,8 +137,8 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
> return ieee754dp_inf(zs);
> /* Handle cases +0 + (-0) and similar ones. */
> if (zc == IEEE754_CLASS_ZERO) {
> - if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
> - ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
> + if ((!(flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))) ||
> + ((flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))))
> return z;
> else
> return ieee754dp_zero(ieee754_csr.rm == FPU_CSR_RD);
> @@ -184,7 +181,7 @@ static union ieee754dp _dp_maddf(union ieee754dp z, union ieee754dp x,
>
> re = xe + ye;
> rs = xs ^ ys;
> - if (flags & maddf_negate_product)
> + if (flags & MADDF_NEGATE_PRODUCT)
> rs ^= 1;
>
> /* shunt to top of word */
> @@ -335,5 +332,5 @@ union ieee754dp ieee754dp_maddf(union ieee754dp z, union ieee754dp x,
> union ieee754dp ieee754dp_msubf(union ieee754dp z, union ieee754dp x,
> union ieee754dp y)
> {
> - return _dp_maddf(z, x, y, maddf_negate_product);
> + return _dp_maddf(z, x, y, MADDF_NEGATE_PRODUCT);
> }
> diff --git a/arch/mips/math-emu/ieee754int.h b/arch/mips/math-emu/ieee754int.h
> index 8bc2f69..dd2071f 100644
> --- a/arch/mips/math-emu/ieee754int.h
> +++ b/arch/mips/math-emu/ieee754int.h
> @@ -26,6 +26,10 @@
>
> #define CLPAIR(x, y) ((x)*6+(y))
>
> +enum maddf_flags {
> + MADDF_NEGATE_PRODUCT = 1 << 0,
> +};
> +
> static inline void ieee754_clearcx(void)
> {
> ieee754_csr.cx = 0;
> diff --git a/arch/mips/math-emu/sp_maddf.c b/arch/mips/math-emu/sp_maddf.c
> index b380189..715cc47 100644
> --- a/arch/mips/math-emu/sp_maddf.c
> +++ b/arch/mips/math-emu/sp_maddf.c
> @@ -14,9 +14,6 @@
>
> #include "ieee754sp.h"
>
> -enum maddf_flags {
> - maddf_negate_product = 1 << 0,
> -};
>
> static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> union ieee754sp y, enum maddf_flags flags)
> @@ -81,8 +78,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_DNORM):
> case CLPAIR(IEEE754_CLASS_INF, IEEE754_CLASS_INF):
> if ((zc == IEEE754_CLASS_INF) &&
> - ((!(flags & maddf_negate_product) && (zs != (xs ^ ys))) ||
> - ((flags & maddf_negate_product) && (zs == (xs ^ ys))))) {
> + ((!(flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))) ||
> + ((flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))))) {
> /*
> * Cases of addition of infinities with opposite signs
> * or subtraction of infinities with same signs.
> @@ -94,9 +91,9 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> * z is here either not infinity, or infinity of the same sign
> * as maddf_negate_product * x * y. So, the result must be
> * infinity, and its sign is determined only by the value of
> - * (flags & maddf_negate_product) and the signs of x and y.
> + * (flags & MADDF_NEGATE_PRODUCT) and the signs of x and y.
> */
> - if (flags & maddf_negate_product)
> + if (flags & MADDF_NEGATE_PRODUCT)
> return ieee754sp_inf(1 ^ (xs ^ ys));
> else
> return ieee754sp_inf(xs ^ ys);
> @@ -110,8 +107,8 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
> return ieee754sp_inf(zs);
> /* Handle cases +0 + (-0) and similar ones. */
> if (zc == IEEE754_CLASS_ZERO) {
> - if ((!(flags & maddf_negate_product) && (zs == (xs ^ ys))) ||
> - ((flags & maddf_negate_product) && (zs != (xs ^ ys))))
> + if ((!(flags & MADDF_NEGATE_PRODUCT) && (zs == (xs ^ ys))) ||
> + ((flags & MADDF_NEGATE_PRODUCT) && (zs != (xs ^ ys))))
> return z;
> else
> return ieee754sp_zero(ieee754_csr.rm == FPU_CSR_RD);
> @@ -156,7 +153,7 @@ static union ieee754sp _sp_maddf(union ieee754sp z, union ieee754sp x,
>
> re = xe + ye;
> rs = xs ^ ys;
> - if (flags & maddf_negate_product)
> + if (flags & MADDF_NEGATE_PRODUCT)
> rs ^= 1;
>
> /* Multiple 24 bit xm and ym to give 48 bit results */
> @@ -255,5 +252,5 @@ union ieee754sp ieee754sp_maddf(union ieee754sp z, union ieee754sp x,
> union ieee754sp ieee754sp_msubf(union ieee754sp z, union ieee754sp x,
> union ieee754sp y)
> {
> - return _sp_maddf(z, x, y, maddf_negate_product);
> + return _sp_maddf(z, x, y, MADDF_NEGATE_PRODUCT);
> }
> --
> 2.7.4
>
> _______________________________________
> From: James Hogan
> Sent: Monday, July 24, 2017 3:24 AM
> To: Aleksandar Markovic
> Cc: [email protected]; Aleksandar Markovic; Miodrag Dinic; Goran Ferenc; Douglas Leung; [email protected]; Paul Burton; Petar Jovanovic; Raghu Gandham; Ralf Baechle
> Subject: Re: [PATCH v3 11/16] MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix NaN propagation
>
> On Fri, Jul 21, 2017 at 04:09:09PM +0200, Aleksandar Markovic wrote:
> > From: Aleksandar Markovic <[email protected]>
> >
> > Fix the cases of <MADDF|MSUBF>.<D|S> when any of three inputs is any
> > NaN. Correct behavior of <MADDF|MSUBF>.<D|S> fd, fs, ft is following:
> >
> > - if any of inputs is sNaN, return a sNaN using following rules: if
> > only one input is sNaN, return that one; if more than one input is
> > sNaN, order of precedence for return value is fd, fs, ft
> > - if no input is sNaN, but at least one of inputs is qNaN, return a
> > qNaN using following rules: if only one input is qNaN, return that
> > one; if more than one input is qNaN, order of precedence for
> > return value is fd, fs, ft
> >
> > The previous code contained handling of some above cases, but not all.
> > Also, such handling was scattered into various cases of
> > "switch (CLPAIR(xc, yc))" statement and elsewhere. With this patch,
> > this logic is placed in one place, and "switch (CLPAIR(xc, yc))" is
> > significantly simplified.
> >
> > The relevant example:
> >
> > MADDF.S fd,fs,ft:
> > If fs contains qNaN1, ft contains qNaN2, and fd contains qNaN3, fd
> > is going to contain qNaN3 (without this patch, it used to contain
> > qNaN1).
> >
>
> Fixes: e24c3bec3e8e ("MIPS: math-emu: Add support for the MIPS R6 MADDF FPU instruction")
> Fixes: 83d43305a1df ("MIPS: math-emu: Add support for the MIPS R6 MSUBF FPU instruction")
>
In v4, I will add these lines to the commit messages of all MADDF/MSUBF patches in this series.
> > Signed-off-by: Miodrag Dinic <[email protected]>
> > Signed-off-by: Goran Ferenc <[email protected]>
> > Signed-off-by: Aleksandar Markovic <[email protected]>
>
> > If backported, I suspect commits:
> > 6162051e87f6 ("MIPS: math-emu: Unify ieee754sp_m{add,sub}f")
> > and
> > d728f6709bcc ("MIPS: math-emu: Unify ieee754dp_m{add,sub}f")
> > in 4.7 will require manual backporting between 4.3 and 4.7 (due to
> > separation of maddf/msubf before that point), so I suppose tagging
> > stable 4.7+ and backporting is best (assuming you consider this fix
> > worth backporting).
I am going to tag all MADDF/MSUBF patches "stable 4.7+" and all MIN/MAX/MINA/MAXA patches "stable 4.3+" in v4.
> > ---
> > arch/mips/math-emu/dp_maddf.c | 71 ++++++++++++++-----------------------------
> > arch/mips/math-emu/sp_maddf.c | 69 ++++++++++++++---------------------------
> > 2 files changed, 46 insertions(+), 94 deletions(-)
> ...
> > + /* handle the cases when at least one of x, y or z is a NaN */
> > + if (((xc == IEEE754_CLASS_SNAN) || (xc == IEEE754_CLASS_QNAN)) ||
> > + ((yc == IEEE754_CLASS_SNAN) || (yc == IEEE754_CLASS_QNAN)) ||
> > + ((zc == IEEE754_CLASS_SNAN) || (zc == IEEE754_CLASS_QNAN))) {
>
> This condition basically covers all of the cases below. Any particular
> reason not to skip it ...
> > + /* order of precedence is z, x, y */
> > + if (zc == IEEE754_CLASS_SNAN)
> > + return ieee754dp_nanxcpt(z);
> > + if (xc == IEEE754_CLASS_SNAN)
> > + return ieee754dp_nanxcpt(x);
> > + if (yc == IEEE754_CLASS_SNAN)
> > + return ieee754dp_nanxcpt(y);
> > + if (zc == IEEE754_CLASS_QNAN)
> > + return z;
> > + if (xc == IEEE754_CLASS_QNAN)
> > + return x;
> > return y;
>
> ... and make this return conditional on (yc == IEEE754_CLASS_QNAN)?
You are right. I am going to reorganize the code as you suggested in v4.
Regards,
Aleksandar
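
For reference, the reorganized NaN handling agreed on above could look roughly like the sketch below. It is only an illustration of the precedence rules from the commit message (sNaN before qNaN, and fd, fs, ft within each kind), with the outer "any NaN" test dropped and the last return made conditional on y being a qNaN. The names fp_class, struct operand and maddf_pick_nan() are hypothetical stand-ins, not kernel identifiers, and the sketch does not model anything ieee754dp_nanxcpt() does beyond selecting the operand.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel's IEEE754_CLASS_* values. */
enum fp_class { CLASS_NORM, CLASS_QNAN, CLASS_SNAN };

/* Hypothetical operand: its class plus a tag naming the register. */
struct operand {
	enum fp_class cls;
	char tag;		/* 'z' = fd, 'x' = fs, 'y' = ft */
};

/*
 * If any input is a NaN, store the operand to propagate in *out and
 * return true. Otherwise return false, meaning the caller should go on
 * to the regular arithmetic path.
 *
 * Signaling NaNs win over quiet NaNs; within each kind the order of
 * precedence is z, x, y.
 */
static bool maddf_pick_nan(struct operand z, struct operand x,
			   struct operand y, struct operand *out)
{
	assert(out != NULL);

	if (z.cls == CLASS_SNAN) { *out = z; return true; }
	if (x.cls == CLASS_SNAN) { *out = x; return true; }
	if (y.cls == CLASS_SNAN) { *out = y; return true; }
	if (z.cls == CLASS_QNAN) { *out = z; return true; }
	if (x.cls == CLASS_QNAN) { *out = x; return true; }
	if (y.cls == CLASS_QNAN) { *out = y; return true; }

	return false;	/* no NaN among the inputs */
}
```

With this shape there is no separate enclosing condition to keep in sync with the individual checks.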
>
> ________________________________________
> From: James Hogan
> Sent: Monday, July 24, 2017 3:39 AM
> To: Aleksandar Markovic
> Cc: [email protected]; Aleksandar Markovic; Douglas Leung; Miodrag Dinic; Goran Ferenc; [email protected]; Paul Burton; Petar Jovanovic; Raghu Gandham; Ralf Baechle
> Subject: Re: [PATCH v3 12/16] MIPS: math-emu: <MADDF|MSUBF>.<D|S>: Fix some cases of infinite inputs
>
> On Fri, Jul 21, 2017 at 04:09:10PM +0200, Aleksandar Markovic wrote:
> > From: Aleksandar Markovic <[email protected]>
> >
> > Fix the cases of <MADDF|MSUBF>.<D|S> when any of two multiplicands is
> > infinity. The correct behavior in such cases is affected by the nature
> > of third input. Cases of addition of infinities with opposite signs
> > and subtraction of infinities with same signs may arise and must be
> > handles separately. Also, the value od flags argument (that determines
>
> s/handles/handled/?
> s/od/of/
In v4, I am going to fix these typos and also other typos and spelling mistakes
in commit messages of all patches in this series.
Thanks,
Aleksandar
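
To make the infinity handling described in this patch easier to follow, here is a small standalone sketch of the sign logic used when x * y is infinite. It mirrors the conditions visible in the sp_maddf.c hunks quoted earlier in the thread, but the function name maddf_inf_sign() and the INF_RESULT_INVALID marker are hypothetical, and what the kernel actually returns in the invalid case is not shown in the quoted hunks, so the sketch only reports that the case was hit.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical marker for "inf + (-inf)" style invalid operations. */
#define INF_RESULT_INVALID (-1)

/*
 * Assumes the product x * y is infinite. zs, xs and ys are sign bits
 * (0 = positive, 1 = negative); negate_product plays the role of
 * (flags & MADDF_NEGATE_PRODUCT), i.e. it is set for MSUBF.
 *
 * Returns the sign bit of the infinite result, or INF_RESULT_INVALID
 * when z is an infinity whose sign is opposite to that of the
 * (possibly negated) product.
 */
static int maddf_inf_sign(bool z_is_inf, int zs, int xs, int ys,
			  bool negate_product)
{
	assert(zs >= 0 && zs <= 1);
	assert(xs >= 0 && xs <= 1);
	assert(ys >= 0 && ys <= 1);

	/* Sign of the term actually added to z. */
	int ps = xs ^ ys;

	if (negate_product)
		ps ^= 1;

	/* Addition of infinities with opposite signs. */
	if (z_is_inf && zs != ps)
		return INF_RESULT_INVALID;

	/* z is finite, or an infinity of the same sign as the term. */
	return ps;
}
```

Folding the flag into a single effective sign first is a design choice of this sketch; the quoted kernel code spells out the two flag cases separately, which is equivalent.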
>
> ________________________________________
> From: James Hogan
> Sent: Friday, July 21, 2017 8:42 AM
> To: Aleksandar Markovic
> Cc: [email protected]; Aleksandar Markovic; Miodrag Dinic; Goran Ferenc; Douglas Leung; [email protected]; Paul Burton; Petar Jovanovic; Raghu Gandham; Ralf Baechle
> Subject: Re: [PATCH v3 08/16] MIPS: math-emu: <MAXA|MINA>.<D|S>: Fix cases of input values with opposite signs
>
> On Fri, Jul 21, 2017 at 04:09:06PM +0200, Aleksandar Markovic wrote:
> > From: Aleksandar Markovic <[email protected]>
> >
> > Fix the value returned by <MAXA|MINA>.<D|S>, if inputs are normal fp
> > numbers of the same absolute value, but opposite signs.
> >
> > The relevant example:
> >
> > MAXA.S fd,fs,ft:
> > If fs contains -3, and ft contains +3, fd is going to contain +3
> > (without this patch, it used to contain -3).
>
> I think it's worth mentioning also that for MINA.*, it returns the
> negative one when the absolute values are equal (The phrase "For equal
> absolute values, returns the smallest positive argument" in the manual
> is a bit ambiguous IMO, so I ended up checking what I6500 did).
I am going to slightly rephrase the commit message to address this.
Regards,
Aleksandar
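
The tie-breaking behavior discussed above can be sketched as follows. This is only an illustration for normal, non-zero, non-NaN inputs, based on the example in the commit message (MAXA returns +3 for -3 and +3) and on James's observation that MINA returns the negative input when the absolute values are equal. The helper names maxa_sketch() and mina_sketch() are made up for this example and the code is unrelated to the math-emu implementation, which works on unpacked sign, exponent and mantissa fields.

```c
#include <assert.h>

/* Absolute value without pulling in libm; zeros are out of scope here. */
static float abs_sketch(float v)
{
	return v < 0.0f ? -v : v;
}

/* MAXA: input with the larger magnitude; on a tie, the positive one. */
static float maxa_sketch(float fs, float ft)
{
	assert(fs == fs && ft == ft);	/* NaN inputs are not modeled */

	float as = abs_sketch(fs);
	float at = abs_sketch(ft);

	if (as > at)
		return fs;
	if (as < at)
		return ft;
	/* Equal magnitudes: prefer the positive input. */
	return fs < 0.0f ? ft : fs;
}

/* MINA: input with the smaller magnitude; on a tie, the negative one. */
static float mina_sketch(float fs, float ft)
{
	assert(fs == fs && ft == ft);	/* NaN inputs are not modeled */

	float as = abs_sketch(fs);
	float at = abs_sketch(ft);

	if (as < at)
		return fs;
	if (as > at)
		return ft;
	/* Equal magnitudes: prefer the negative input. */
	return fs < 0.0f ? fs : ft;
}
```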
Hi, James,
I appreciate your thorough and expeditious review.
>
> ________________________________________
> From: James Hogan
> Sent: Friday, July 21, 2017 7:45 AM
> To: Aleksandar Markovic
> Cc: [email protected]; Aleksandar Markovic; Miodrag Dinic; Goran Ferenc; Douglas Leung; [email protected]; Paul Burton; Petar Jovanovic; Raghu Gandham; Ralf Baechle
> Subject: Re: [PATCH v3 05/16] MIPS: math-emu: <MAX|MAXA|MIN|MINA>.<D|S>: Fix quiet NaN propagation
>
> On Fri, Jul 21, 2017 at 04:09:03PM +0200, Aleksandar Markovic wrote:
> > From: Aleksandar Markovic <[email protected]>
> >
> > Fix the value returned by <MAX|MAXA|MIN|MINA>.<D|S>, if both inputs
> > are quiet NaNs. The specifications of <MAX|MAXA|MIN|MINA>.<D|S> state
> > that the returned value in such cases should be the quiet NaN
> > contained in register fs.
> >
> > The relevant example:
> >
> > MAX.S fd,fs,ft:
> > If fs contains qNaN1, and ft contains qNaN2, fd is going to contain
> > qNaN1 (without this patch, it used to contain qNaN2).
> >
>
> Consider adding:
>
> Fixes: a79f5f9ba508 ("MIPS: math-emu: Add support for the MIPS R6 MAX{, A} FPU instruction")
> Fixes: 4e9561b20e2f ("MIPS: math-emu: Add support for the MIPS R6 MIN{, A} FPU instruction")
>
Will add in v4 (for all MIN/MAX/MINA/MAXA patches).
> > Signed-off-by: Miodrag Dinic <[email protected]>
> > Signed-off-by: Goran Ferenc <[email protected]>
> > Signed-off-by: Aleksandar Markovic <[email protected]>
>
> Consider adding:
>
> Cc: <[email protected]> # 4.3+
Will add in v4 (for all MIN/MAX/MINA/MAXA patches).
> > ---
> > arch/mips/math-emu/dp_fmax.c | 8 ++++++--
> > arch/mips/math-emu/dp_fmin.c | 8 ++++++--
> > arch/mips/math-emu/sp_fmax.c | 8 ++++++--
> > arch/mips/math-emu/sp_fmin.c | 8 ++++++--
> > 4 files changed, 24 insertions(+), 8 deletions(-)
> >
> > diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> > index fd71b8d..567fc33 100644
> > --- a/arch/mips/math-emu/dp_fmax.c
> > +++ b/arch/mips/math-emu/dp_fmax.c
> > @@ -47,6 +47,9 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
> > case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> > return ieee754dp_nanxcpt(x);
> >
> > + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> > + return x;
>
> couldn't the above...
>
> > +
> > /* numbers are preferred to NaNs */
> > case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> > case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> > @@ -54,7 +57,6 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
>
> ... go somewhere around here and fall through to the existing return x
> case?
>
It could, but at the expense of code clarity and/or logical grouping of special cases,
which after this patch looks like:
. . .
|
case of both inputs qNaN
|
case of only x input qNaN
|
case of only y input qNaN
|
. . .
If you agree, I suggest keeping the code the same as currently proposed in
this patch, except that the following comments should be added in appropriate
places:
/*
* Quiet NaN handling
*/
/* The case of both inputs quiet NaNs */
. . .
/* The cases of exactly one input quiet NaN */
Unfortunately, the code segments for handling sNaN and infinity inputs do
not follow the organization that I proposed. However, I think that my proposal
for case organization is the superior one - therefore I intend to keep it in v4,
unless you tell me not to do so.
Regards,
Aleksandar
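
As an illustration of the grouping proposed above, the quiet NaN part of the selection could be laid out as in the sketch below. It is a simplified model, not the dp_fmax.c code: the real function switches on CLPAIR(xc, yc) over all class pairs, while this sketch uses plain if statements over a hypothetical two-value class enum (qnan_class) and a hypothetical helper fmax_qnan_choice() that only reports which operand would be returned.

```c
#include <assert.h>

/* Hypothetical reduced class set; sNaN is assumed handled earlier. */
enum qnan_class { CLS_NUMBER, CLS_QNAN };

/*
 * Returns 'x' or 'y' to name the operand that would be returned, or 0
 * when neither input is a quiet NaN and the numeric comparison decides.
 */
static char fmax_qnan_choice(enum qnan_class xc, enum qnan_class yc)
{
	assert(xc == CLS_NUMBER || xc == CLS_QNAN);
	assert(yc == CLS_NUMBER || yc == CLS_QNAN);

	/*
	 * Quiet NaN handling
	 */

	/* The case of both inputs quiet NaNs */
	if (xc == CLS_QNAN && yc == CLS_QNAN)
		return 'x';

	/* The cases of exactly one input quiet NaN (numbers are preferred to NaNs) */
	if (yc == CLS_QNAN)
		return 'x';
	if (xc == CLS_QNAN)
		return 'y';

	return 0;
}
```

Keeping the "both qNaN" case as its own entry, rather than falling through to the existing return x, is what preserves the both / only x / only y grouping described above.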
Hi Aleksandar,
On Mon, Jul 24, 2017 at 02:36:05PM +0100, Aleksandar Markovic wrote:
> > > diff --git a/arch/mips/math-emu/dp_fmax.c b/arch/mips/math-emu/dp_fmax.c
> > > index fd71b8d..567fc33 100644
> > > --- a/arch/mips/math-emu/dp_fmax.c
> > > +++ b/arch/mips/math-emu/dp_fmax.c
> > > @@ -47,6 +47,9 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
> > > case CLPAIR(IEEE754_CLASS_SNAN, IEEE754_CLASS_INF):
> > > return ieee754dp_nanxcpt(x);
> > >
> > > + case CLPAIR(IEEE754_CLASS_QNAN, IEEE754_CLASS_QNAN):
> > > + return x;
> >
> > couldn't the above...
> >
> > > +
> > > /* numbers are preferred to NaNs */
> > > case CLPAIR(IEEE754_CLASS_ZERO, IEEE754_CLASS_QNAN):
> > > case CLPAIR(IEEE754_CLASS_NORM, IEEE754_CLASS_QNAN):
> > > @@ -54,7 +57,6 @@ union ieee754dp ieee754dp_fmax(union ieee754dp x, union ieee754dp y)
> >
> > ... go somewhere around here and fall through to the existing return x
> > case?
> >
>
> It could, but at the expense of code clarity and/or logical grouping of special cases,
> which after this patch looks like:
>
> . . .
> |
> case of both inputs qNaN
> |
> case of only x input qNaN
> |
> case of only y input qNaN
> |
> . . .
>
> If you agree, I suggest keeping the code the same as currently proposed in
> this patch, except that the following comments should be added in appropriate
> places:
>
> /*
> * Quiet NaN handling
> */
> /* The case of both inputs quiet NaNs */
> . . .
> /* The cases of exactly one input quiet NaN */
>
> Unfortunately, the code segments for handling sNaN and infinity inputs do
> not follow the organization that I proposed. However, I think that my proposal
> for case organization is the superior one - therefore I intend to keep it in v4,
> unless you tell me not to do so.
Okay, I don't object.
Thanks
James