2004-03-21 22:02:43

by Pavel Machek

[permalink] [raw]
Subject: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> > > But as the next guy up in the tree, you do get to say what you'll take
> > > and what you won't :>
> >
> > Yeah, but I have to see it first!
>
> :> Ok. I'll start with a framework and flesh it out.

Now I have _proof_ that eye-candy is harmfull. What is see on screen is:

S U S P E N D T O D I S K

N
umber of free pages a[ ]h! (285723 != 285754)
(Press SPACE to continue)

The message I saw on screen is not in dmesg?! Don't tell me its /proc
configurable; it should not be there in the first place.

@@ -27,6 +27,9 @@
#include <linux/slab.h>
#include <linux/proc_fs.h>
#include <linux/efi.h>
+#ifdef CONFIG_SOFTWARE_SUSPEND2
+#include <linux/suspend-common.h>
+#endif

#include <asm/processor.h>
#include <asm/system.h>

You should modify header files so that they can be always included.

{
if (page_is_ram(pfn) && !(bad_ppro && page_kills_ppro(pfn))) {
ClearPageReserved(page);
+#ifdef CONFIG_SOFTWARE_SUSPEND2
+ ClearPageNosave(page);
+#endif
set_bit(PG_highmem, &page->flags);
set_page_count(page, 1);
__free_page(page);
totalhigh_pages++;
- } else
+ } else {
SetPageReserved(page);
+#ifdef CONFIG_SOFTWARE_SUSPEND2
+ SetPageNosave(page);
+#endif
+ }

Make it so that {Set,Clear}PageNosave macros are NOP w/o software
suspend, then kill the ifdef.

I picked up signal.c changes into my tree -- they look more correct
than what I have here.

--- clean.2.5/arch/ppc/kernel/signal.c 2004-02-05 01:53:57.000000000 +0100
+++ linux-mm/arch/ppc/kernel/signal.c 2004-03-21 20:21:41.000000000 +0100
...
unsigned long frame, newsp;
int signr, ret;

+ if (current->flags & PF_FREEZE) {
+ refrigerator(PF_FREEZE);
+ return 0;
+ }
+
if (!oldset)
oldset = &current->blocked;


But you probably need to check if signal_pending() here...

+++ linux-mm/drivers/acpi/sleep/proc.c 2004-03-21 20:21:41.000000000 +0100
@@ -1,7 +1,9 @@
#include <linux/proc_fs.h>
#include <linux/seq_file.h>
#include <linux/suspend.h>
+#include <linux/sched.h>
#include <linux/bcd.h>
+#include <linux/init.h>
#include <asm/uaccess.h>

#include <acpi/acpi_bus.h>
...
@@ -61,6 +66,7 @@
if (copy_from_user(str, buffer, count))
return -EFAULT;

+
/* Check for S4 bios request */
if (!strcmp(str,"4b")) {
error = acpi_suspend(4);

Try not to include blank lines. Also do not add includes. If your
ACTIVITY_XXX macros depend on sched.h / init.h, include them from
suspend.h.

Was it really neccessary to rename IOTHREAD to NOFREEZE? This way you
are changing all the users. OTOH you could submit patch to rename all
*now* and swsusp1 will continue working.

- sync_inodes_sb(sb, 0);
- DQUOT_SYNC(sb);
- lock_super(sb);
- if (sb->s_dirt && sb->s_op->write_super)
- sb->s_op->write_super(sb);
- unlock_super(sb);
- if (sb->s_op->sync_fs)
- sb->s_op->sync_fs(sb, 1);
- sync_blockdev(sb->s_bdev);
- sync_inodes_sb(sb, 1);
+#ifdef CONFIG_SOFTWARE_SUSPEND2
+ /* A safety net. During suspend, we might overwrite
+ * memory containing filesystem info. We don't then
+ * want to sync it to disk. */
+ if (likely(!(swsusp_state & FREEZE_UNREFRIGERATED)))
+#endif
+ {
+ sync_inodes_sb(sb, 0);

Can't you at least reverse the condition and put return there so that
you don't need to reindent the block? Also perhaps this should be
BUG_ON()? Noone should ever fall into safety net. Same in do_sync.

+++ linux-mm/include/asm-i386/mtrr.h 2004-03-21 20:21:42.000000000
+0100
@@ -106,4 +106,8 @@

#endif

+/* Save and restore functions for Software Suspend */
+extern int *mtrr_suspend(void);
+extern void mtrr_resume(int *ptr);
+
#endif /* _LINUX_MTRR_H */

This should be done through driver model.

+++ linux-mm/kernel/module.c 2004-03-21 20:21:42.000000000 +0100
@@ -1870,6 +1870,33 @@
return NULL;
}

+#define MODLIST_SIZE 4096
+
+void print_module_list(void)
+{
+ static char modlist[MODLIST_SIZE];
+ struct module *mod;
+ int pos = 0;
+
+ list_for_each_entry(mod, &modules, list)
+ if (mod->name)
+ pos += snprintf(modlist+pos,
MODLIST_SIZE-pos-1,
+ "%s ", mod->name);
+ printk("%s\n",modlist);
+}
+
+int print_module_list_to_buffer(char * buffer, int size)
+{
+ struct module *mod;
+ int pos = 0;
+
+ list_for_each_entry(mod, &modules, list)
+ if (mod->name)
+ pos += snprintf(buffer+pos, size-pos-1,
+ "%s ", mod->name);
+ return pos;
+}
+

This is only for debugging, right? If we have to have this one, could
you at least make one use the other?

+#if 0
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0)
+extern void ide_suspend(void);
+extern void ide_unsuspend(void);
+#endif
+#endif

Kill #if 0-ed code from there, also no LINUX_VERSION_CODE tests in
version for merging, please.

+static int processesfrozen = 0;

Missing "_" there...

+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0)
+ if (unlikely((swsusp_state & FREEZE_NEW_ACTIVITY) &&
+ strcmp(current->comm, "dosexec") &&
+ ((!(current->flags & PF_SYNCTHREAD)) ||
+ (swsusp_state & FREEZE_UNREFRIGERATED))))
+#else

Uff, testing for name of userland program? That's quite an ugly hack,
even if you only need it for 2.4...

+ struct task_struct FOR_EACH_THREAD_TASK_STRUCTS;
...
+ suspend_relinquish_console();
+
+ suspend_task = 0;
+ swsusp_state = 0;
+ STORAGE_UNSUSPEND
+
+ /*
+ * Pause the other processors so we can safely
+ * change threads' flags
+ */
+

I'd call this macro abuse. Is it because 2.4 compatibility?

if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) {
if (unlikely(in_atomic())) {
- printk(KERN_ERR "bad: scheduling while
atomic!\n");
- dump_stack();
+ if (likely(!(software_suspend_state &
SOFTWARE_SUSPEND_RUNNING))) {
+ printk(KERN_ERR "bad: scheduling while
atomic!\n");
+ dump_stack();
+ }
}
}

Were you lazy or is there some reason why scheduling while atomic is
not bad for swsusp2?

@@ -484,6 +490,10 @@
* Really, prep_compound_page() should be called from
__rmqueue_bulk(). But
* we cheat by calling it from here, in the order > 0 path. Saves a
branch
* or two.
+ *
+ * While suspending, we don't use the pcp structure. It mucks up our
+ * accounting for all the pages and necessitates calling
drain_local_pages
+ * multiple times.
*/

static struct page *buffered_rmqueue(struct zone *zone, int order,
int cold)

I believe I worked around this one by drain_local_pages(), which is
much less intrusive.

+/*
+ * Remove a single page from a list, if possible.
+ * This is split out so that Software Suspend can use it
+ * to shoot down cache pages added while suspending/resuming.
+ *
+ * Returns:
+ * 0: No page to work on (not an error).
+ * 1: Exited via free_it.
+ * 2: Exited via activate_locked.
+ * 3: Exited via keep.
*/
-static int
-shrink_list(struct list_head *page_list, unsigned int gfp_mask,
- int *max_scan, int *nr_mapped)
+
+int __free_single_cache_page(struct page * page, unsigned int
gfp_mask)
{
+ int may_enter_fs, referenced, ret=3;

Why do you need this one? It looks intrusive. If its really
neccessary (or it cleans up code), this one should be sent
separately.

[I'll have to look at new files, too...]
Pavel
--
When do you have a heart between your knees? [Johanka's followup:
and *two* hearts?]


2004-03-22 00:31:30

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

(Andrew killed from Cc, he's probably not too interested in
swsusp-only issues. Benjamin added, to remind he what he wanted in
kernel ;-)

> [I'll have to look at new files, too...]

+struct rangechain {
+ struct range * first;
+ struct range * last;
+ int size; /* size of the range ie sum (max-min+1) */
+ int allocs;
+ int frees;
+ int debug;
+ int timesusedoptimisation;
+ char * name;
+ struct range * lastaccessed, *prevtolastaccessed, *prevtoprev;
+};
+

You need some more '_'s in there (like last_accessed), and do nut put
space between * and symbol.

+struct suspend_header {
+ __u32 version_code;

It can be u32 (unless this header is included from userland).

+#define MDELAY(a) if (TEST_ACTION_STATE(SUSPEND_SLOW)) { mdelay(a); } else { do { } while (0); }

Ouch, this is not the way to do it. You need do...while around if. It
is useless inside {}s.

+#define beepOK \
+ if (TEST_ACTION_STATE(SUSPEND_BEEP)) { \
+ currentbeep += 200; \
+ kd_mksound(currentbeep,HZ/8); \
+ mdelay(150); \
+ } else { \
+ do { } while (0); \
+ }
+#define beepERR \
+ if (TEST_ACTION_STATE(SUSPEND_BEEP)) { \
+ kd_mksound(300,HZ/4); \
+ mdelay(300); \
+ } else { \
+ do { } while (0); \
+ }

Is this really usefull to someone? Again useless while(0)s.

+struct swsusp_proc_data {
+ char * filename;
+ int permissions;
+ int type;
+ union {
+ struct {
+ unsigned long * bit_vector;
+ int bit;
+ } bit;
+ struct {
+ int * variable;
+ int minimum;
+ int maximum;
+ } integer;
+ struct {
+ unsigned long * variable;
+ unsigned long minimum;
+ unsigned long maximum;
+ } ul;
+ struct {
+ char * variable;
+ int max_length;
+ int (* write_proc) (void); /* Routine
triggered by write after new value stored */
+ } string;
+ struct {
+ void * read_proc;
+ void * write_proc;
+ void * data;
+ } special;
+ } data;
+ struct list_head proc_data_list;
+};
+
+#define SWSUSP_PROC_DATA_CUSTOM 0
+#define SWSUSP_PROC_DATA_BIT 1
+#define SWSUSP_PROC_DATA_INTEGER 2
+#define SWSUSP_PROC_DATA_UL 3
+#define SWSUSP_PROC_DATA_STRING 4

Your own abstraction on top of /proc? I do not think we need any
runtime configurability.

+#ifdef CONFIG_SOFTWARE_SUSPEND_RELAXED_PROC
+#define PROC_WRITEONLY 0222
+#define PROC_READONLY 0444
+#define PROC_RW 0666
+#else
+#define PROC_WRITEONLY 0200
+#define PROC_READONLY 0400
+#define PROC_RW 0600
+#endif

..and we do certainly not want CONFIG_SECURITY_HOLE.

#define MDELAY and friends seems to be in _two_ different header
files.

+struct swsusp_plugin_ops gzip_compression_ops = {
+ .type = FILTER_PLUGIN,
+ .name = "Zlib Page Compressor",
+ .memory_needed = gzip_memory_needed,
+ .print_debug_info = gzip_print_debug_stats,
+ .save_config_info = gzip_save_config_info,
+ .load_config_info = gzip_load_config_info,
+ .write_init = gzip_write_init,
+ .write_chunk = gzip_write_chunk,
+ .write_cleanup = gzip_write_cleanup,
+ .read_init = gzip_read_init,
+ .read_chunk = gzip_read_chunk,
+ .read_cleanup = gzip_read_cleanup,
+ .ops = {
+ .filter = {
+ .expected_compression =
gzip_get_expected_compression,
+ }
+ }

+/* gzip_save_config_info
+ *
+ * Description: Save informaton needed when reloading the image at resume time.
~~~~~~~~~~
typo
+ * Arguments: Buffer: Pointer to a buffer of size PAGE_SIZE.
+ * Returns: Number of bytes used for saving our data.
+ */

+static void gzip_load_config_info(char * buffer, int size)
+{
+ if (size != 2 * sizeof(unsigned long)) {
+ printk("Huh? Size of plugin data is not right. (%d
instead of %d).\n",
+ size, 2 * sizeof(unsigned long));
+ return;
+ }

BUG_ON? Otherwise, what about returing failure back to original code?

+++ software-suspend-core-2.0/kernel/power/Internals 2004-01-30
17:22:58.000000000 +1300
@@ -0,0 +1,364 @@
+ Software Suspend 2.0 Internal Documentation.
+ Version 1
+

Internal or not, documentation should go under Documentation, I
believe.

+ o We need to get as close as we can to an atomic copy of the
data.

We need to get "as close"? I hope it really _is_ atomic.

+ In 2.4, the dosexec thread (Win4Lin) is treated specially. It
does not
+ handle us even pretending to send it a signal. This is
worked-around by
+ us adjusting the can_schedule() macro in schedule.c to stop the
task from
+ being scheduled during suspend. Ugly, but it works. The 2.6
version of
+ Win4Lin has been made compatible.

Aha, so dosexec is some binary module, not userland program. Sorry for
my previous comment.

+ Writing the image requires memory, of course, and at this point we have
+ also not yet suspended the drivers. To avoid the possibility of remaining
+ activity corrupting the image, we allocate a special memory pool. Calls
+ to __alloc_pages and __free_pages_ok are then diverted to use our memory
+ pool. Pages in the memory pool are saved as part of pageset1 regardless of
+ whether or not they are used.

This is pretty ugly.

+ Having saved pageset2 pages, we can safely overwrite their contents with
+ the atomic copy of pageset1. This is how we manage to overcome the half of
+ memory limitation. Pageset2 is normally far larger than pageset1, and
+ pageset1 is normally much smaller than half of the memory, with the result
+ that pageset2 pages can be safely overwritten with the atomic copy of
+ pageset1. This is where we need to be careful about syncing, however.
+ Pageset2 will probably contain filesystem meta data. If this is overwritten
+ with pageset1 and then a sync occurs, the filesystem will be corrupted -
+ at least until resume time and another sync of the restored data. Since
+ there is a possibility that the user might not resume or (may it never be!)
+ that suspend might oops, we do our utmost to avoid syncing filesystems after
+ copying pageset1.

Ugly... We have not only to not-sync, but avoid any writeback, or any
reading altogether, right? And writes might be done from memory
managment paths..

+ struct list_head writer_list;
+ };
+
+ STORAGE_AVAILABLE is
diff -ruN post-version-specific/kernel/power/io.c software-suspend-core-2.0/kernel/power/io.c
--- post-version-specific/kernel/power/io.c 1970-01-01 12:00:00.000000000 +1200

Eh? Documentation seems to end in the middle of the line.

+#ifdef CONFIG_SOFTWARE_SUSPEND_DEBUG
+ sh->param2 = swsusp_debug_state;
+#endif

Try to avoid #ifdefs... This one really is not neccessary.

+/* TODO: Handle page protection when saving and loading pages. */

Can you elaborate?

+/* __read_primary_suspend_image
+ *
+ * Description: Test for the existence of an image and attempt to load it.
+ * Returns: Int. Zero if image found and pageset1 successfully loaded.
+ * Error if no image found or loaded.
+ */

If you are doing so nice comments already, perhaps you should use linuxdoc?

+ case -EINVAL: /* non fatal error */
+ software_suspend_state &= ~3;
+ MDELAY(1000);
+ return error;
+ break;

While break after error?

+/*
+ * Copyright (c) 2000-2002 Marc Alexander Lehmann <[email protected]>
+ *
+ * Redistribution and use in source and binary forms, with or without modifica-
+ * tion, are permitted provided that the following conditions are met:
+ *
+ * 1. Redistributions of source code must retain the above copyright notice,
+ * this list of conditions and the following disclaimer.
+ *
+ * 2. Redistributions in binary form must reproduce the above copyright
+ * notice, this list of conditions and the following disclaimer in the
+ * documentation and/or other materials provided with the distribution.
+ *
+ * 3. The name of the author may not be used to endorse or promote products
+ * derived from this software without specific prior written permission.

lzf compression should go under /lib, not under kernel/power, and
probably should go in separately.

This looks like BSD with advertising clause. I do not think you are
allowed to link this with kernel. It does not follow kernel coding style.

+static unsigned long lzf_memory_needed(void)
+{
+ if (DISABLED)
+ return 0;
+
+ return PAGE_SIZE * 2 + (1<<HLOG)*sizeof(char *);
+}

Perhaps disabling of plugins should be done on other level, so that if
(DISABLED) is not needed at begining of each function?

+static __init int lzf_load(void)
+{
+ int result;
+
+ printk("Software Suspend LZF Compression Driver v1.0\n");
+ if (!(result = swsusp_register_plugin(&lzf_compression_ops))) {
+ swsusp_register_procfile(&expected_compression_proc_data);
+ swsusp_register_procfile(&disable_compression_proc_data);
+ }
+ return result;
+}
+
+__initcall(lzf_load);

Hmm, so yes, you add your own abstraction on the top of /proc. Ouch.
And what about module unload? [Anyway __initcall() should probably be
module_init()].

swsusp2 not only has its own /proc abstraction and plugin system, it
also has its own memory alocator :-((((.

+/*
+ * workspace_size
+ *
+ * Description:
+ * Returns the number of bytes of RAM needed for this
+ * code to do its work. (Used when calculating whether
+ * we have enough memory to be able to suspend & resume).
+ *
+ */
+static unsigned long nullwriter_memory_needed(void)

Is it workspace_size or memory_needed? (null example driver).

collision_cache... is that attempt to speed up suspending, that is
making it more complicated?

+char* origrangesname = "original addresses";
+char* destrangesname = "destination addresses";
+char* allocdrangesname = "allocated addresses";
+
+/* set_chain_names
+ *
+ * Description: Set the chain names for a pagedir. (For debugging).
+ * Arguments: struct pagedir: The pagedir on which we want to set the names.
+ */
+
+void set_chain_names(struct pagedir * p)
+{
+ p->origranges.name = origrangesname;
+ p->destranges.name = destrangesname;
+ p->allocdranges.name = allocdrangesname;
+}

Why not simply p->origranges.name = "original addresses"?

+ int alternative_order = 999;

Why not 666? :-).

+ /* Get a single grabbed page for swsusp's use */
+ /* We only look at pages of the desired order */
+ /* Code could be added to split a higher order) */

Missing (.

+#define MAX_ATTEMPTS 3
+extern int freeze_processes(int no_progress);
+
+static int attempt_to_freeze(void)

Why are you doing 3 attempts? I thought this should work...

+ * Versions:
+ * 1: /proc/sys/kernel/swsusp the only tuning interface
+ * 2: Initial version of this file
+ * 3: Removed KDB parameter.
+ * Added checkpage parameter (for checking checksum of a page over time).

No changelogs at beginings of files, if possible... ... aha, that's
exported to userland? You are seriously overdesigning this.

+ * Another complicating factor is that freeing memory requires the processes
+ * to not be frozen, but at the end of freeing memory, they need to be frozen
+ * so that we can be sure we actually have eaten enough memory. This is why
+ * freezing and freeing are in the one file. The freezer is not called from
+ * the main logic, but indirectly, via the code for eating memory. The eat
+ * memory logic is iterative, first freezing processes and checking the stats,
+ * then (if necessary) unfreezing them and eating more memory until it looks
+ * like the criteria are met (at which point processes are frozen & stats
+ * checked again).

What if processes eat all the memory you freed just after unfreezing?
This could result in livelock...

+ do { } while (sync_buffers(NODEV, 1));
+ do { } while (fsync_dev(NODEV));

You can kill 'do { }'...

+/**
+ * register_resume_notifier - Register function to be called at resume time
+ * @nb: Info about notifier function to be called
+ *
+ * Registers a function with the list of functions
+ * to be called at resume time.
+ *
+ * Currently always returns zero, as notifier_chain_register
+ * always returns zero.
+ */

So you do know how to use linuxdoc, good. What is resume notifier good
for when we have driver model?

+++ software-suspend-core-2.0/kernel/power/range.c 2004-01-30 17:22:58.000000000 +1300
@@ -0,0 +1,845 @@
+/* pm_disk routines for manipulating ranges.
+ *
+ * (C) 2003, Nigel Cunningham.
+ *

pm_disk? And this needs GPL.

Ranges... nice trick to make it use slightly less memory, but it seems
pretty complex. You are using linklist of extents, anyway; why not
simplify it with linklist of pages?

+ if (!range) {
+ printk("Error! put_range called with NULL range.\n");
+ return;

BUG_ON? No, just kill the test, it will oops cleanly, anyway.

+ do { } while (!IS_ERR(ioinfo_cleanup_one()));

kill do { }.

+/*
+ *
+ */
+
+int parse_signature(char * header, int restore)

?

Why are you trying to parse signatures for pmdisk/swsusp1? 13
signatures is certainly impressive..

+ if (thisdevice == resume_device) {
+ printnolog(SUSPEND_IO, SUSPEND_VERBOSE, 0, "Resume root device %x", thisdevice);
+ RESUME_BDEV(i) = resume_block_device;
+ continue;
+ }
+
+ if (thisdevice == header_device) {
+ printnolog(SUSPEND_IO, SUSPEND_VERBOSE, 0, "Resume header device %x", thisdevice);
+ RESUME_BDEV(i) = header_block_device;
+ continue;
+ }

It uses multiple devices for suspend? Hm, I never managed to do
that...

+ while ((*thischar != ':') && ((thischar - commandline) < 250) && (*thischar))
+ thischar++;
+
+ if (*thischar == ':') {
+ colon = thischar;
+ *colon = 0;
+ thischar++;
+ }
+
+ while ((*thischar != '@') && ((thischar - commandline) < 250)
&& (*thischar))
+ thischar++;


strchr()?

+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0)
+/* Make disk drivers accept operations, again */
+void storage_unsuspend(void)
+{
+#ifdef CONFIG_SCSI
+ struct pm_dev *dev = NULL;
+
+ while ((dev = pm_find(PM_SCSI_DEV, dev)))
+ pm_send(dev, PM_RESUME, (void *) 0);
+
+#endif
+#ifdef CONFIG_BLK_DEV_IDE
+ ide_disk_unsuspend(0);
+#endif
+}
+
+void storage_suspend(void)
+{
+#ifdef CONFIG_SCSI
+ struct pm_dev *dev = NULL;
+
+ while ((dev = pm_find(PM_SCSI_DEV, dev)))
+ pm_send(dev, PM_SUSPEND, (void *) 3);
+
+#endif
+#ifdef CONFIG_BLK_DEV_IDE
+ do_suspend_sync();
+
+ ide_disk_suspend();
+#endif
+}
+#endif

This should not be needed in 2.6...

+#define RESUME_PHASE1 1 /* Called from interrupts disabled */
+#define RESUME_PHASE2 2 /* Called with interrupts enabled */
+#define RESUME_ALL_PHASES (RESUME_PHASE1 | RESUME_PHASE2)
+
+void drivers_resume(int flags)
+{
+#if LINUX_VERSION_CODE < KERNEL_VERSION(2,5,0)
+ if(flags & RESUME_PHASE2) {
+#ifdef CONFIG_BLK_DEV_HD
+ do_reset_hd(); /* Kill all controller
state */
+#endif
+ }
+ if (flags & RESUME_PHASE1) {
+#ifdef CONFIG_BLK_DEV_IDE
+ ide_disk_unsuspend(1);
+#endif
+#ifdef CONFIG_SCSI
+ {
+ struct pm_dev *dev = NULL;
+
+ while ((dev = pm_find(PM_SCSI_DEV, dev)))
+ pm_send(dev, PM_RESUME, (void *) 0);
+ }
+#endif
+#ifdef CONFIG_BLK_DEV_MD
+ md_autostart_arrays();
+#endif
+ }

There were some problems with phases etc... I had to modify it, and it
became cleaner, too.

+int suspend_snprintf(char * buffer, int buffer_size, const char *fmt,
...)
+{

Ouch, own snprintf version?!

+ SNPRINTF("Please include the following information in bug reports:\n");
+ SNPRINTF("- SWSUSP core : %s\n", swsusp_version);
+ SNPRINTF("- Kernel Version : %s\n", UTS_RELEASE);
+ SNPRINTF("- Version spec. : %s\n", SWSUSP_VERSION_SPECIFIC_REVISION_STRING);
+ SNPRINTF("- Compiler vers. : %d.%d\n", __GNUC__,
__GNUC_MINOR__);

Could this be compressed into less lines/characters? Copying this by
hand is pretty tiring excersize.

+/* kstat_save
+ * Save the contents of the kstat array so that
+ * our labours are hidden from vmstat
+ */
+static int kstat_save(void)

What hack is this? You don't want load average to go up after resume?

+void start_kswsuspd(void * data)
+{
+ down(&kswsuspd_access);
+ init_completion(&work_complete);
+ work_data = data;
+ wake_up_interruptible(&swsusp_wait);
+ wait_for_completion(&work_complete);
+ up(&kswsuspd_access);
+}

swsusp1 gets away without new thread... As long as you don't use
sysrq-G, you are always called from process context. And that means
cleaner code....

+__setup("resume2=", resume_setup);
+__setup("swsusp_act=", swsusp_act_setup);
+#ifdef CONFIG_SOFTWARE_SUSPEND_DEBUG
+__setup("swsusp_dbg=", swsusp_dbg_setup);
+__setup("swsusp_lvl=", swsusp_lvl_setup);
+#endif
+__setup("noresume2", noresume_setup);

It should take over swsusp1 parameters. We do not want 3 different
mechanisms *in one running kernel*.

--- post-version-specific/kernel/power/ui.c 1970-01-01
12:00:00.000000000 +1200
+++ software-suspend-core-2.0/kernel/power/ui.c 2004-01-30
17:22:58.000000000 +1300
@@ -0,0 +1,1038 @@

...1000 lines of eye candies, with stuff like

+static void move_cursor_to(unsigned char * xy)
+{
+ char buf[10];
+
+ int len = suspend_snprintf(buf, 10, "\233%d;%dH", xy[1],
xy[0]);
+
+ cond_console_print(buf, len);
+}
+

Can we get rid of this?


+ if (TEST_RESULT_STATE(SUSPEND_ABORTED)) {
+ prepare_status(1, 1, "--- ESCAPE PRESSED AGAIN :
TRYING HARDER TO ABORT ---");
+ show_state();
+ thaw_processes();

Will not this corrupt data?

All in all, its ~12K lines of code (swsusp1 is ~2K). I guess its needs
to be way smaller to be suitable for merging.

Pavel

-- When do you have a heart between your knees? [Johanka's followup:
and *two* hearts?]

2004-03-22 21:56:10

by Nigel Cunningham

[permalink] [raw]
Subject: Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Mon, 2004-03-22 at 10:00, Pavel Machek wrote:
> Now I have _proof_ that eye-candy is harmfull. What is see on screen is:

No, that's not proof; just a bug in the code. It's not using the right
code to display the error message. I'll fix it asap.

> N
> umber of free pages a[ ]h! (285723 != 285754)
> (Press SPACE to continue)

By thw way, to get this message, you probably removed the memory pool
hooks. I'm getting the picture more and more clearly. I need to write a
series of emails explaining why each part of the changes exists. I'll
try to do that shortly.

> Was it really neccessary to rename IOTHREAD to NOFREEZE? This way you

Not really, but I felt that IOTHREAD wasn't a good description of the
way the flag is used. The name implies that it is intended for threads
used for doing I/O, but it is also used for some threads that aren't I/O
related but cannot/should not be refrigerated.

> +/* Save and restore functions for Software Suspend */
> +extern int *mtrr_suspend(void);
> +extern void mtrr_resume(int *ptr);
> +
> #endif /* _LINUX_MTRR_H */
>
> This should be done through driver model.

It is; the latest patch removes these declarations, which I missed
before.

> + suspend_task = 0;
> + swsusp_state = 0;
> + STORAGE_UNSUSPEND
> +
> + /*
> + * Pause the other processors so we can safely
> + * change threads' flags
> + */
> +
>
> I'd call this macro abuse. Is it because 2.4 compatibility?

Yes, wholy and solely.

> if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) {
> if (unlikely(in_atomic())) {
> - printk(KERN_ERR "bad: scheduling while
> atomic!\n");
> - dump_stack();
> + if (likely(!(software_suspend_state &
> SOFTWARE_SUSPEND_RUNNING))) {
> + printk(KERN_ERR "bad: scheduling while
> atomic!\n");
> + dump_stack();
> + }
> }
> }
>
> Were you lazy or is there some reason why scheduling while atomic is
> not bad for swsusp2?

I like the way you're forcing me to remember why I've done things the
way I have :>. I'll need to get look at this again and get back to you.
There is a good reason and I did try to avoid doing this. I just don't
remember the logic right now.

> I believe I worked around this one by drain_local_pages(), which is
> much less intrusive.

Again, I'll look at it again and explain why.

> Why do you need this one? It looks intrusive. If its really
> neccessary (or it cleans up code), this one should be sent
> separately.

Before not-long-ago, I had a separate routine that was essentially a
copy of this aimed at freeing a single page. I thought it was better to
properly merge it in. The point is this: Remember that we save the image
in two parts and that the first part saved consists of LRU pages. While
saving that part, we use the generic I/O routines, which will add the
page we've saved into the LRU. We, however, are trying to keep the image
consistent. We therefore don't want the page added to the LRU and thus
seek the remove it immediately. As I mentioned before, a better solution
might well be to stop it being added in the first place. I need to talk
with the mm guys about that.

Thanks for the comments. I really appreciate them.

Nigel

--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-22 23:18:15

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> > Now I have _proof_ that eye-candy is harmfull. What is see on screen is:
>
> No, that's not proof; just a bug in the code. It's not using the right
> code to display the error message. I'll fix it asap.

:-)

I'd really like eye-candy code to be gone. Its long, and its not worth
the trouble.

> > N
> > umber of free pages a[ ]h! (285723 != 285754)
> > (Press SPACE to continue)
>
> By thw way, to get this message, you probably removed the memory pool
> hooks. I'm getting the picture more and more clearly. I need to write a
> series of emails explaining why each part of the changes exists. I'll
> try to do that shortly.

No, this was actually with unmodified swsusp2. [I'm not sure if
highmem was enabled at that point. I do not think it was.]

> > Was it really neccessary to rename IOTHREAD to NOFREEZE? This way you
>
> Not really, but I felt that IOTHREAD wasn't a good description of the
> way the flag is used. The name implies that it is intended for threads
> used for doing I/O, but it is also used for some threads that aren't I/O
> related but cannot/should not be refrigerated.

I agree that NOFREEZE is better name. Try submitting separate patch to
rename it; if it gets rejected, modify swsusp2 to use IOTHREAD name...

> > if (likely(!(current->state & (TASK_DEAD | TASK_ZOMBIE)))) {
> > if (unlikely(in_atomic())) {
> > - printk(KERN_ERR "bad: scheduling while
> > atomic!\n");
> > - dump_stack();
> > + if (likely(!(software_suspend_state &
> > SOFTWARE_SUSPEND_RUNNING))) {
> > + printk(KERN_ERR "bad: scheduling while
> > atomic!\n");
> > + dump_stack();
> > + }
> > }
> > }
> >
> > Were you lazy or is there some reason why scheduling while atomic is
> > not bad for swsusp2?
>
> I like the way you're forcing me to remember why I've done things the
> way I have :>. I'll need to get look at this again and get back to you.
> There is a good reason and I did try to avoid doing this. I just don't
> remember the logic right now.

Not enough comments, then :-). [I wish I had followed swsusp2
development more closely, but I guess its too late for that by now.]

> Thanks for the comments. I really appreciate them.

I'm looking forward to better swsusp2.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-23 09:53:33

by Jonathan Sambrook

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

At 00:17 on Tue 23/03/04, [email protected] masquerading as 'Pavel Machek' wrote:
> Hi!
>
> > > Now I have _proof_ that eye-candy is harmfull. What is see on screen is:
> >
> > No, that's not proof; just a bug in the code. It's not using the right
> > code to display the error message. I'll fix it asap.
>
> :-)
>
> I'd really like eye-candy code to be gone. Its long, and its not worth
> the trouble.

Pejorative comments aside Pavel, it is valued feedback for end-user
re-assurance. It has also helped swsusp2 get streets ahead of the other
implementations by aiding end-user feedback on a wide range of hardware,
Which would appear to be well worth it? If it's not broke (and is
valuble) why rip it out?

Regards,
Jonathan


--


.__
|..|__ _____ _____ ___ ____ _ ____
|;;|: \ /;:;:;\ /;:;:;\ /;:;\ /;;;;\/;\/\/;;;;\
| Y \ Y Y \ Y Y \ Y \ | ___ \ |\_| ___ \
|___| /__|_| /__|_| /__| / o \_____/ | \___ /
\/ \/ \/ \/ \/ _/ /
\__/

hmmnsoft's FreeShade : window shading, for Windows, for free

http://www.hmmn.org/FreeShade

2004-03-23 15:25:58

by Micha

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Tue, Mar 23, 2004 at 09:53:18AM +0000, Jonathan Sambrook wrote:
> At 00:17 on Tue 23/03/04, [email protected] masquerading as 'Pavel Machek' wrote:
> > Hi!
> >
> > > > Now I have _proof_ that eye-candy is harmfull. What is see on screen is:
> > >
> > > No, that's not proof; just a bug in the code. It's not using the right
> > > code to display the error message. I'll fix it asap.
> >
> > :-)
> >
> > I'd really like eye-candy code to be gone. Its long, and its not worth
> > the trouble.
>
> Pejorative comments aside Pavel, it is valued feedback for end-user
> re-assurance. It has also helped swsusp2 get streets ahead of the other
> implementations by aiding end-user feedback on a wide range of hardware,
> Which would appear to be well worth it? If it's not broke (and is
> valuble) why rip it out?
>

Some kind of progress report is quite useful.

Hibernating when you are in a hurry, its quite nice to see if you have
the time to let hibernation finish before sticking the laptop in the
pack and be sure that it actually turned itself off.

> Regards,
> Jonathan
>
>
> --
>
>
> .__
> |..|__ _____ _____ ___ ____ _ ____
> |;;|: \ /;:;:;\ /;:;:;\ /;:;\ /;;;;\/;\/\/;;;;\
> | Y \ Y Y \ Y Y \ Y \ | ___ \ |\_| ___ \
> |___| /__|_| /__|_| /__| / o \_____/ | \___ /
> \/ \/ \/ \/ \/ _/ /
> \__/
>
> hmmnsoft's FreeShade : window shading, for Windows, for free
>
> http://www.hmmn.org/FreeShade
>
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> swsusp-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/swsusp-devel
>
> +++++++++++++++++++++++++++++++++++++++++++
> This Mail Was Scanned By Mail-seCure System
> at the Tel-Aviv University CC.
>

2004-03-23 21:50:21

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> > > > Now I have _proof_ that eye-candy is harmfull. What is see on screen is:
> > >
> > > No, that's not proof; just a bug in the code. It's not using the right
> > > code to display the error message. I'll fix it asap.
> >
> > :-)
> >
> > I'd really like eye-candy code to be gone. Its long, and its not worth
> > the trouble.
>
> Pejorative comments aside Pavel, it is valued feedback for end-user
> re-assurance. It has also helped swsusp2 get streets ahead of the other
> implementations by aiding end-user feedback on a wide range of hardware,
> Which would appear to be well worth it? If it's not broke (and is
> valuble) why rip it out?

Its 1000 lines. If it is not broken now, it will be broken in 2.8, and
because it is in mainline, it will be up to linus to fix it.

Oh and it is enough confusing that it confuses me. Some messages end
in dmesg, some do not. User feedback can be done with much less code,
and also slightly less confusing for the user, see swsusp1. [We have
to switch to another console, anyway; and printing dots is easy.]

Okay, we should probably make suspend more quiet, I can see users
badly confused by those hdX: spinning down (etc) messages.

Also, in your model, where do messages printk()-ed from drivers during
suspend/resume end up? Corrupting screen? Lost from sight and only
accessible from dmesg? I believe driver messages *are* important, and
do not see how they could coexist with eye-candy.

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-23 22:08:56

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Wed, 2004-03-24 at 09:47, Pavel Machek wrote:
> Its 1000 lines. If it is not broken now, it will be broken in 2.8, and
> because it is in mainline, it will be up to linus to fix it.

Of the 1161 lines in ui.c, there are probably 200 lines of comments (the
header is 77 lines). Of what remains, I agree that there could be could
and should be some pruning before merging. Much of the debugging code
can be removed from the merged version and kept as a separate patch for
if/when its needed. I'm also of a mind to not include the original
text-mode 'nice display' and just use the Bootsplash support.

Of course there's also the point that you're assuming I'm going to
disappear into the wild blue yonder after it's merged. That assumption
has no basis from my perspective.

> Oh and it is enough confusing that it confuses me. Some messages end
> in dmesg, some do not. User feedback can be done with much less code,
> and also slightly less confusing for the user, see swsusp1. [We have
> to switch to another console, anyway; and printing dots is easy.]

As I said above, much of the code was from debugging and can be removed.
Nevertheless, the interface is not that confusing:

prepare_status: Set the title above the progress bar and optionally
reset the progress bar to 0. If the nice display is off, just print the
message.

update_status: Update the progress bar percentage.

print_nolog: Display a message without normally sticking it in the logs.
(Exception: if log everything is on). Used for the more verbose messages
(eg 1567/5624) so that detail can be seen without it cluttering the
logs.

print_log: Display a message and log it.

For print_nolog and print_log, we also take account of user definable
settings specifying what sections we want to see output from and how
much output we want, in deciding whether to print the message.

> Okay, we should probably make suspend more quiet, I can see users
> badly confused by those hdX: spinning down (etc) messages.

That's fine, but those messages aren't related to my code.

> Also, in your model, where do messages printk()-ed from drivers during
> suspend/resume end up? Corrupting screen? Lost from sight and only
> accessible from dmesg? I believe driver messages *are* important, and
> do not see how they could coexist with eye-candy.

They do go in the logs. An important exception though (which applies to
all implementations): messages displayed after the atomic copy is made
(while suspending) or before the kernel is copied back (resuming) as
lost because the printk buffer is overwritten,

Nigel.
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-23 22:18:26

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> if/when its needed. I'm also of a mind to not include the original
> text-mode 'nice display' and just use the Bootsplash support.

Yes, that would be great. [Bootsplash is in mainline but is in vendor
kernels => great, vendor kernels get splashscreen with progress bar
and vanilla kernel just plain old dots].

> Of course there's also the point that you're assuming I'm going to
> disappear into the wild blue yonder after it's merged. That assumption
> has no basis from my perspective.

Ok, sorry about that one.

> > Oh and it is enough confusing that it confuses me. Some messages end
> > in dmesg, some do not. User feedback can be done with much less code,
> > and also slightly less confusing for the user, see swsusp1. [We have
> > to switch to another console, anyway; and printing dots is easy.]
>
> As I said above, much of the code was from debugging and can be removed.
> Nevertheless, the interface is not that confusing:

If it slims down a lot, great. If debugging code is gone, and progress
bar depends on bootsplash, we are left with something that looks like
output from swsusp1, right? And there should be little need to
configure it. Good.

> > Okay, we should probably make suspend more quiet, I can see users
> > badly confused by those hdX: spinning down (etc) messages.
>
> That's fine, but those messages aren't related to my code.

Yes, thats _my_ todo.

> > Also, in your model, where do messages printk()-ed from drivers during
> > suspend/resume end up? Corrupting screen? Lost from sight and only
> > accessible from dmesg? I believe driver messages *are* important, and
> > do not see how they could coexist with eye-candy.
>
> They do go in the logs. An important exception though (which applies to
> all implementations): messages displayed after the atomic copy is made
> (while suspending) or before the kernel is copied back (resuming) as
> lost because the printk buffer is overwritten,

Yes, but with eye candy its a bit more of a problem because progress
bar updating could obscure them.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-23 23:11:17

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004 09:08:52 +1200, Nigel Cunningham <[email protected]> wrote:
> I'm also of a mind to not include the original
> text-mode 'nice display' and just use the Bootsplash support.

Which I would not agree with as this is what I use ;)

However could this be moved into a module or userland thus keeping everyone happy.

Michael

2004-03-23 23:17:49

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> <[email protected]> wrote:
> > I'm also of a mind to not include the original
> >text-mode 'nice display' and just use the Bootsplash support.
>
> Which I would not agree with as this is what I use ;)
>
> However could this be moved into a module or userland thus keeping everyone
> happy.

Unfortunately, it can not be sanely moved to userland.

We probably could live with hooks for bootsplash (as long as they are
very few lines, code-wise), and you probably could hook your favourite
text-mode eye-candy there.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-23 23:28:19

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Tuesday 23 March 2004 04:47 pm, Pavel Machek wrote:
> Hi!
>
<skip>

> Also, in your model, where do messages printk()-ed from drivers during
> suspend/resume end up? Corrupting screen? Lost from sight and only
> accessible from dmesg? I believe driver messages *are* important, and
> do not see how they could coexist with eye-candy.
>
Well, unless these are error messages that prevent machine from suspending/
resuming they are really just another form of eye-candy, nothing more,
nothing less.

--
Dmitry

2004-03-23 23:32:52

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> > Also, in your model, where do messages printk()-ed from drivers during
> > suspend/resume end up? Corrupting screen? Lost from sight and only
> > accessible from dmesg? I believe driver messages *are* important, and
> > do not see how they could coexist with eye-candy.
> >
> Well, unless these are error messages that prevent machine from suspending/
> resuming they are really just another form of eye-candy, nothing more,
> nothing less.

Well, I'd hate

Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472

to be obscured by progress bar.

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-23 23:36:50

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Wed, 2004-03-24 at 11:17, Pavel Machek wrote:
> > > I'm also of a mind to not include the original
> > >text-mode 'nice display' and just use the Bootsplash support.
> >
> > Which I would not agree with as this is what I use ;)

I could always make the ui a plugin too - along the lines of what
Michael is suggesting. That would be really simple: separate out the
bootsplash code and the text mode code into new files, perhaps in a ui
directory and adjust the Makefile & config.in accordingly.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-23 23:38:58

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Wed, 2004-03-24 at 10:17, Pavel Machek wrote:
> Yes, but with eye candy its a bit more of a problem because progress
> bar updating could obscure them.

Remember though that I said that's a mistake I need to fix.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-23 23:41:37

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi yet again.

On Wed, 2004-03-24 at 11:32, Pavel Machek wrote:
> Well, I'd hate
>
> Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
> Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472
>
> to be obscured by progress bar.

So why aren't you arguing against bootsplash too? That definitely
obscures such an error :> Of course we could argue that such an error
shouldn't happen and/or will be obvious via other means (assuming it
indicates hardware failure).

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-23 23:46:13

by Dumitru Ciobarcianu

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 2004-03-24 at 10:36 +1200, Nigel Cunningham wrote:

> I could always make the ui a plugin too - along the lines of what
> Michael is suggesting. That would be really simple: separate out the
> bootsplash code and the text mode code into new files, perhaps in a ui
> directory and adjust the Makefile & config.in accordingly.


Speaking from an swsusp user (not developer) this is the best thing
imho. Some people just love eye candy :)



--
Dumitru "Eyecandy Lover" C.


Attachments:
signature.asc (189.00 B)
This is a digitally signed message part

2004-03-23 23:45:22

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> > Well, I'd hate
> >
> > Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
> > Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472
> >
> > to be obscured by progress bar.
>
> So why aren't you arguing against bootsplash too? That definitely
> obscures such an error :> Of course we could argue that such an error
> shouldn't happen and/or will be obvious via other means (assuming it
> indicates hardware failure).

Of course I *am* against bootsplash. Unfortunately I've probably lost
that war already. But at least it is not in -linus tree (and that's
what I use anyway) => I gave up with bootsplash-equivalents, as long
as they don't come to linus.

[And I believe Linus would shoot down bootsplash-like code, anyway.]

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-23 23:52:42

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Wow. That cc list is growing. I'm going to trim it since those guys are
subscribed to LKML or swsusp-devel anyway.

On Wed, 2004-03-24 at 11:45, Dumitru Ciobarcianu wrote:
> Speaking from an swsusp user (not developer) this is the best thing
> imho. Some people just love eye candy :)

I think it's really important that we have some eye candy. This is going
to be one of the most visible parts of the kernel, and for some people,
it's going to be the thing they stare at most of the time when they
start up/shut down. If we want to help the cause of Linux on the
desktop, we want it to look good. As long as its within reasonable
bounds, a small increase in kernel size is worth it, in my opinion.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-24 00:05:58

by Joel Jaeggli

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004, Nigel Cunningham wrote:

> Wow. That cc list is growing. I'm going to trim it since those guys are
> subscribed to LKML or swsusp-devel anyway.
>
> On Wed, 2004-03-24 at 11:45, Dumitru Ciobarcianu wrote:
> > Speaking from an swsusp user (not developer) this is the best thing
> > imho. Some people just love eye candy :)
>
> I think it's really important that we have some eye candy. This is going
> to be one of the most visible parts of the kernel, and for some people,
> it's going to be the thing they stare at most of the time when they
> start up/shut down. If we want to help the cause of Linux on the
> desktop, we want it to look good. As long as its within reasonable
> bounds, a small increase in kernel size is worth it, in my opinion.

I'd settle for just having something that worked, in a released kernel.
the patch load I'm carrying around to make most of the hardware in my vaio
work is more of a shambling mound than a source of stability and
goodness...

> Nigel
>

--
--------------------------------------------------------------------------
Joel Jaeggli Unix Consulting [email protected]
GPG Key Fingerprint: 5C6E 0104 BAF0 40B0 5BD3 C38B F000 35AB B67F 56B2


2004-03-24 00:10:24

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Wed, 2004-03-24 at 12:05, Joel Jaeggli wrote:
> I'd settle for just having something that worked, in a released kernel.
> the patch load I'm carrying around to make most of the hardware in my vaio
> work is more of a shambling mound than a source of stability and
> goodness...

Yes, but most people want it to look good while it works. You're
exceptional!

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-24 03:16:39

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004 00:44:49 +0100, Pavel Machek <[email protected]> wrote:

> Hi!
>
>> > Well, I'd hate
>> >
>> > Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
>> > Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472
>> >
>> > to be obscured by progress bar.
>>
>> So why aren't you arguing against bootsplash too? That definitely
>> obscures such an error :> Of course we could argue that such an error
>> shouldn't happen and/or will be obvious via other means (assuming it
>> indicates hardware failure).
>
> Of course I *am* against bootsplash. Unfortunately I've probably lost
> that war already. But at least it is not in -linus tree (and that's
> what I use anyway) => I gave up with bootsplash-equivalents, as long
> as they don't come to linus.
>
> [And I believe Linus would shoot down bootsplash-like code, anyway.]
>
> Pavel

Solution: Auto switch to non-swsusp VT on error showing the error message.

Michael

2004-03-24 04:53:14

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Tuesday 23 March 2004 06:32 pm, Pavel Machek wrote:
> Hi!
>
> > > Also, in your model, where do messages printk()-ed from drivers during
> > > suspend/resume end up? Corrupting screen? Lost from sight and only
> > > accessible from dmesg? I believe driver messages *are* important, and
> > > do not see how they could coexist with eye-candy.
> > >
> > Well, unless these are error messages that prevent machine from suspending/
> > resuming they are really just another form of eye-candy, nothing more,
> > nothing less.
>
> Well, I'd hate
>
> Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
> Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472
>
> to be obscured by progress bar.
>
> Pavel

OK, so you have an error condition on your CD. Does it prevent suspend from
completing? If not then I probably would not care about it till later when
I see it in syslog... I remember that the one thing that Pat complained
most often is your love for "panic"ing instead of trying to recover. In that
case you understandably want as many preceding messages as possible left
intact on the screen to diagnose the problem. If error recovery is ever done
then some eye-candy probably won't hurt.

Also, if we leave swsusp and suspending alone for a second, I could have
'top' running on console overwriting the very same messages. Should we ban
top?

If error reporting is so important for a box it probably:
- has many more means to notify operator;
- has all eye-candy turned off;
- is not getting suspended anyway.
IOW it's most likely a server running 24/7.

--
Dmitry

2004-03-24 06:04:51

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Wed, 2004-03-24 at 16:52, Dmitry Torokhov wrote:
> OK, so you have an error condition on your CD. Does it prevent suspend from
> completing? If not then I probably would not care about it till later when
> I see it in syslog... I remember that the one thing that Pat complained
> most often is your love for "panic"ing instead of trying to recover. In that
> case you understandably want as many preceding messages as possible left
> intact on the screen to diagnose the problem. If error recovery is ever done
> then some eye-candy probably won't hurt.

Suspend2 is capable of aborting (there are four panics for dire
situations; for the record swsusp.c has 14). I haven't tried to simulate
a media error, so I'm not perfectly sure it would abort okay, but it
wouldn't take too much to fix any issues.

Regards,

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-24 06:26:21

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004 17:04:58 +1200, Nigel Cunningham <[email protected]> wrote:

> Hi.
>
> On Wed, 2004-03-24 at 16:52, Dmitry Torokhov wrote:
>> OK, so you have an error condition on your CD. Does it prevent suspend from
>> completing? If not then I probably would not care about it till later when
>> I see it in syslog... I remember that the one thing that Pat complained
>> most often is your love for "panic"ing instead of trying to recover. In that

As easy, as clumsy.

>> case you understandably want as many preceding messages as possible left
>> intact on the screen to diagnose the problem. If error recovery is ever done
>> then some eye-candy probably won't hurt.

Error messages should be handled on a seperate VT eliminating the issue.

>
> Suspend2 is capable of aborting (there are four panics for dire
> situations; for the record swsusp.c has 14). I haven't tried to simulate
> a media error, so I'm not perfectly sure it would abort okay,

I'll do more testing there in due course.

How are bad blocks on a swap partition handled by the vm?

> but it
> wouldn't take too much to fix any issues.

Which reminds me of the "failed to read a chunk" message, the guys who reported
it got all quiet after telling them to do more badblocks testing without diskcaching or
using dd to write random data and read them back, so likely was caused by
media problems.

Here we need more detailed error messages including the driver output and the
screen should be switched to a text VT so messages are visible. Also as the
error will cause resume to fail the system should be halted in this case.

IMO seperate message VT will eliminate all interference issues and further modularization
by keeping the eye candy seperate.

Regards
Michael

2004-03-24 07:11:18

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Wed, 2004-03-24 at 18:22, Michael Frank wrote:
> Error messages should be handled on a seperate VT eliminating the issue.

While I definitely like the idea, I'm not sure that's feasible; as Pavel
pointed out, Suspend doesn't generate all the error messages that might
possibly appear. Maybe I'm just ignorant.. I'll take a look when I get
the change.

> How are bad blocks on a swap partition handled by the vm?

Good question :>

> Which reminds me of the "failed to read a chunk" message, the guys who reported
> it got all quiet after telling them to do more badblocks testing without diskcaching or
> using dd to write random data and read them back, so likely was caused by
> media problems.

I'll reserve judgement there...

> Here we need more detailed error messages including the driver output and the
> screen should be switched to a text VT so messages are visible. Also as the
> error will cause resume to fail the system should be halted in this case.
>
> IMO seperate message VT will eliminate all interference issues and further modularization
> by keeping the eye candy seperate.
>
> Regards
> Michael
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-24 07:36:29

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004 17:46:28 +1200, Nigel Cunningham <[email protected]> wrote:

> Hi.
>
> On Wed, 2004-03-24 at 18:22, Michael Frank wrote:
>> Error messages should be handled on a seperate VT eliminating the issue.
>
> While I definitely like the idea, I'm not sure that's feasible; as Pavel
> pointed out, Suspend doesn't generate all the error messages that might
> possibly appear. Maybe I'm just ignorant.. I'll take a look when I get
> the change.

printk is central and could do the switch when swsusp is active

2004-03-24 09:32:33

by Karol Kozimor

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Thus wrote Michael Frank:
> Which reminds me of the "failed to read a chunk" message, the guys who
> reported
> it got all quiet after telling them to do more badblocks testing without
> diskcaching or
> using dd to write random data and read them back, so likely was caused by
> media problems.

I'm not so sure, at least in my case. Sure, badblocks /dev/hda1 reports an
access beyond end, but neither badblocks /dev/hda1 $SIZEOF_HDA1 nor SMART
do. Anyway, the alleged bad blocks are at the end of a 400 MB partition, so
unless swsusp allocated swap randomly, there's hardly any chance I could
hit them with 256 MB RAM and LZF on. But then, this failure was a single
event in my case, while others reported some regularity.
Best regards,

--
Karol 'sziwan' Kozimor
[email protected]

2004-03-24 10:19:14

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> >>So why aren't you arguing against bootsplash too? That definitely
> >>obscures such an error :> Of course we could argue that such an error
> >>shouldn't happen and/or will be obvious via other means (assuming it
> >>indicates hardware failure).
> >
> >Of course I *am* against bootsplash. Unfortunately I've probably lost
> >that war already. But at least it is not in -linus tree (and that's
> >what I use anyway) => I gave up with bootsplash-equivalents, as long
> >as they don't come to linus.
> >
> >[And I believe Linus would shoot down bootsplash-like code, anyway.]
>
> Solution: Auto switch to non-swsusp VT on error showing the error message.

Hmm, at that point you loose context, like now you know what error
happened, but do not know at which phase of suspend. That's pretty bad
too.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-24 10:25:26

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On ?t 23-03-04 23:52:58, Dmitry Torokhov wrote:
> On Tuesday 23 March 2004 06:32 pm, Pavel Machek wrote:
> > Well, I'd hate
> >
> > Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
> > Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472
> >
> > to be obscured by progress bar.
>
> OK, so you have an error condition on your CD. Does it prevent suspend from
> completing? If not then I probably would not care about it till later when
> I see it in syslog... I remember that the one thing that Pat
> complained

Except when it is not in syslog, because it was after atomic copy or
before atomic copy back. swsusp is pretty unique in this respect.

> most often is your love for "panic"ing instead of trying to recover. In that
> case you understandably want as many preceding messages as possible left
> intact on the screen to diagnose the problem. If error recovery is ever done
> then some eye-candy probably won't hurt.

Except when error recovery does not work.

> Also, if we leave swsusp and suspending alone for a second, I could have
> 'top' running on console overwriting the very same messages. Should we ban
> top?

Its bad idea to run top when kernel messages are not redirected
somewhere. Unfortunately eye-candy makes that choice for you, and does
the wrong thing automatically.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-24 10:27:09

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> >On Wed, 2004-03-24 at 18:22, Michael Frank wrote:
> >>Error messages should be handled on a seperate VT eliminating the issue.
> >
> >While I definitely like the idea, I'm not sure that's feasible; as Pavel
> >pointed out, Suspend doesn't generate all the error messages that might
> >possibly appear. Maybe I'm just ignorant.. I'll take a look when I get
> >the change.
>
> printk is central and could do the switch when swsusp is active

You *could* do it, but it is bad idea. You don't want to patch
printk.c, that driver printk could be done from interrupt (and you
can't switch consoles at that point), you loose context, etc. What
about doing the simple thing, maybe hack with CRs and be done with
that?

If someone wants more eye candy, they have to patch their kernel with
bootsplash.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-24 12:48:43

by Dmitry Torokhov

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wednesday 24 March 2004 05:22 am, Pavel Machek wrote:
> On ?t 23-03-04 23:52:58, Dmitry Torokhov wrote:
> > On Tuesday 23 March 2004 06:32 pm, Pavel Machek wrote:
> > > Well, I'd hate
> > >
> > > Nov 10 00:37:51 amd kernel: Buffer I/O error on device sr0, logical block 842340
> > > Nov 10 00:37:53 amd kernel: end_request: I/O error, dev sr0, sector 6738472
> > >
> > > to be obscured by progress bar.
> >
> > OK, so you have an error condition on your CD. Does it prevent suspend from
> > completing? If not then I probably would not care about it till later when
> > I see it in syslog... I remember that the one thing that Pat
> > complained
>
> Except when it is not in syslog, because it was after atomic copy or
> before atomic copy back. swsusp is pretty unique in this respect.
>

And I would consider an error that happens after atomic copy critical. If
this happens all attempts to draw progress bar, etc. should be stopped and
suspend should probably abort as well.

What happens if swsusp1 gets such an error during atomic phase? Will it
continue or abort? If it continues I would imagine user not noticing the
message it it flashes the split second before the box powers off.

--
Dmitry

2004-03-24 14:19:22

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004 10:32:31 +0100, Karol Kozimor <[email protected]> wrote:

> Thus wrote Michael Frank:
>> Which reminds me of the "failed to read a chunk" message, the guys who
>> reported
>> it got all quiet after telling them to do more badblocks testing without
>> diskcaching or
>> using dd to write random data and read them back, so likely was caused by
>> media problems.
>
> I'm not so sure, at least in my case. Sure, badblocks /dev/hda1 reports an
> access beyond end, but neither badblocks /dev/hda1 $SIZEOF_HDA1 nor SMART
> do. Anyway, the alleged bad blocks are at the end of a 400 MB partition, so
> unless swsusp allocated swap randomly, there's hardly any chance I could
> hit them with 256 MB RAM and LZF on. But then, this failure was a single
> event in my case, while others reported some regularity.
> Best regards,
>

Badblocks error reading beyond the end of the partition is irrelevant,
it is a primitive bug somewhere unrelated to media condition.

Also Badblocks without disabling drive cache is _utterly_useless_.

It will not be a bare swsusp bug, I would have hit that in 20K+ cycles
since using LZF and a thousand or so of other 2.4 users would have
hit it too.

Please help indentify the actual problem by running some decent tests.

Regards
Michael

2004-03-24 15:18:34

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> > Except when it is not in syslog, because it was after atomic copy or
> > before atomic copy back. swsusp is pretty unique in this respect.
> >
>
> And I would consider an error that happens after atomic copy critical. If
> this happens all attempts to draw progress bar, etc. should be stopped and
> suspend should probably abort as well.

Well, not all printks() are errors this hard. And at some points, it
is no longer possible to abort (after pagedir is on disk, its okay to
panic (machine will resume normally after that), but its not okay to
simply return. You could fix signature then return, but it would be hard).

> What happens if swsusp1 gets such an error during atomic phase? Will it
> continue or abort? If it continues I would imagine user not noticing the
> message it it flashes the split second before the box powers off.

It continues. Fortunately powerdown takes enough time on most machines
that messages can be readed ... if you pay attetion.
Pavel
--
Horseback riding is like software...
...vgf orggre jura vgf serr.

2004-03-24 20:23:16

by Jonathan Sambrook

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

At 16:18 on Wed 24/03/04, [email protected] masquerading as 'Pavel Machek' wrote:
> Hi!
>
> > > Except when it is not in syslog, because it was after atomic copy or
> > > before atomic copy back. swsusp is pretty unique in this respect.
> > >
> >
> > And I would consider an error that happens after atomic copy critical. If
> > this happens all attempts to draw progress bar, etc. should be stopped and
> > suspend should probably abort as well.
>
> Well, not all printks() are errors this hard. And at some points, it
> is no longer possible to abort (after pagedir is on disk, its okay to
> panic (machine will resume normally after that), but its not okay to
> simply return. You could fix signature then return, but it would be hard).
>
> > What happens if swsusp1 gets such an error during atomic phase? Will it
> > continue or abort? If it continues I would imagine user not noticing the
> > message it it flashes the split second before the box powers off.
>
> It continues. Fortunately powerdown takes enough time on most machines
> that messages can be readed ...

> if you pay attetion.

Which is not a normal usage scenario.

Common scenario:

suspend machine then go home/to bed


BTW Pavel I'm not arguing that the code has to stay in without
modification
(e.g. massively stripped down or whatever). But this is one place where
kernel code is a lot closer to Joe Enduser's awareness than is usually
the case, so IMUO, the priorities shift.

That said, this discussion seems to be necessary to refine what is
necessary from what has been creatively evolved.

Jonathan

> Pavel
> --
> Horseback riding is like software...
> ...vgf orggre jura vgf serr.
> -
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to [email protected]
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/

--

Jonathan Sambrook
Software Developer
Designer Servers


Attachments:
(No filename) (1.98 kB)
(No filename) (189.00 B)
Download all attachments

2004-03-24 21:10:40

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Wed, 24 Mar 2004 20:22:59 +0000, Jonathan Sambrook <[email protected]> wrote:

> At 16:18 on Wed 24/03/04, [email protected] masquerading as 'Pavel Machek' wrote:
>> Hi!
>>
>> > > Except when it is not in syslog, because it was after atomic copy or
>> > > before atomic copy back. swsusp is pretty unique in this respect.
>> > >
>> >
>> > And I would consider an error that happens after atomic copy critical. If
>> > this happens all attempts to draw progress bar, etc. should be stopped and
>> > suspend should probably abort as well.

swsusp2 aborts and reads back the kernel and process data, it recovers even
when hitting ESC just before powerdown.

>>
>> Well, not all printks() are errors this hard. And at some points, it
>> is no longer possible to abort (after pagedir is on disk, its okay to
>> panic (machine will resume normally after that), but its not okay to
>> simply return. You could fix signature then return, but it would be hard).

It is _utterly_ unacceptable to fail during suspend. You might as well take
a sledge hammer to kill the box!

Suspend is a mechanism to suspend the system transparently and
_NOT_EVER_ impairing the system. There can be NO_COMPROMISE and
NO_EXCUSE. I walk out of my office suspending the machine and resuming it
in front of my client it can't ever fail, or am I an idiot to advocate linux?

If I would be willing to accept failure I would not spend my time here and
utilize M$'s incarnation of an architectural idiocy.

>>
>> > What happens if swsusp1 gets such an error during atomic phase? Will it
>> > continue or abort? If it continues I would imagine user not noticing the
>> > message it it flashes the split second before the box powers off.
>>
>> It continues. Fortunately powerdown takes enough time on most machines
>> that messages can be readed ...
>
>> if you pay attetion.
>
> Which is not a normal usage scenario.

Right, because nobody pays attention. If you step on the brake pedal driving
your car these brakes better work or you may be well worse of than dead
... just in case you ran over a kid...

>
> Common scenario:
>
> suspend machine then go home/to bed

Worse, I suspend 3 networked machines from a notebook via ssh,
tear out the network cable from the notebook and run out or go to
bed. NFS crossmounts still make me trouble at times...

>
>
> BTW Pavel I'm not arguing that the code has to stay in without
> modification
> (e.g. massively stripped down or whatever). But this is one place where
> kernel code is a lot closer to Joe Enduser's awareness than is usually
> the case, so IMUO, the priorities shift.
>

First we have to agree on what the objectives are, the code will follow.

And eventually one has to combine the superiority of linux with the
simplicity/primitivity of M$, which I am btw still recovering from.

Michael

2004-03-24 22:05:36

by Markus Gaugusch

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Mar 24, Pavel Machek <[email protected]> wrote:

> On ?t 23-03-04 23:52:58, Dmitry Torokhov wrote:
> > most often is your love for "panic"ing instead of trying to recover. In that
> > case you understandably want as many preceding messages as possible left
> > intact on the screen to diagnose the problem. If error recovery is ever done
> > then some eye-candy probably won't hurt.
>
> Except when error recovery does not work.
The most valuable tool of all swsusp developers and testers was and is the
serial console. If there are errors that haven't been found yet by anyone
(and a lot already have been found and fixed!), it's time to dig out the
console cable, record it and report it. That's what happend all the time
in the last year or so.
In all cases, I never could do much with the text on the screen. Apart
from being hard to paste into emails, it also scrolled away in most cases
(or worse: everything was just hanging and I had to increase debug level
which produced even more output that wouldn't fit on screen). By the way -
higher debug levels disable nice output automatically (AFAIR).

> > 'top' running on console overwriting the very same messages. Should we ban
> > top?
>
> Its bad idea to run top when kernel messages are not redirected
> somewhere. Unfortunately eye-candy makes that choice for you, and does
> the wrong thing automatically.
> Pavel
Did I say redirect to serial console or what? ;-)

Markus

--
__________________ /"\
Markus Gaugusch \ / ASCII Ribbon Campaign
markus(at)gaugusch.at X Against HTML Mail
/ \

2004-03-24 22:26:03

by Nigel Cunningham

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi.

On Thu, 2004-03-25 at 10:05, Markus Gaugusch wrote:
> By the way -
> higher debug levels disable nice output automatically (AFAIR).

That's right.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-24 22:50:29

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

May I request that you leave the authors headers intact when quoting. Thank you.

On Wed, 24 Mar 2004 11:17:04 +0100, Pavel Machek <[email protected]> wrote:

>> >>So why aren't you arguing against bootsplash too? That definitely
>> >>obscures such an error :> Of course we could argue that such an error
>> >>shouldn't happen and/or will be obvious via other means (assuming it
>> >>indicates hardware failure).
>> >
>> >Of course I *am* against bootsplash. Unfortunately I've probably lost
>> >that war already. But at least it is not in -linus tree (and that's
>> >what I use anyway) => I gave up with bootsplash-equivalents, as long
>> >as they don't come to linus.
>> >
>> >[And I believe Linus would shoot down bootsplash-like code, anyway.]

Why? Joe consumer wants it.

As to the ever growing size of the kernel, there could be a official addons/tools
tree with non-core functions maintained by a seperate maintainer. Things like
debuggers, profiling or (swsusp) debug support could go there as well...

>>
>> Solution: Auto switch to non-swsusp VT on error showing the error message.
>
> Hmm, at that point you loose context, like now you know what error
> happened, but do not know at which phase of suspend. That's pretty bad
> too.

Right, Good idea! Just print always "ugly" swsusp context on a text VT - plus any
error messages - and switch over to this VT in printk when not in interrupt
context. 10 lines of code or so in printk ;)

Michael





2004-03-24 23:24:16

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On ÄŒt 25-03-04 06:46:12, Michael Frank wrote:
> May I request that you leave the authors headers intact when quoting. Thank
> you.

As you wish.

> On Wed, 24 Mar 2004 11:17:04 +0100, Pavel Machek <[email protected]> wrote:
>
> >>>>So why aren't you arguing against bootsplash too? That definitely
> >>>>obscures such an error :> Of course we could argue that such an error
> >>>>shouldn't happen and/or will be obvious via other means (assuming it
> >>>>indicates hardware failure).
> >>>
> >>>Of course I *am* against bootsplash. Unfortunately I've probably lost
> >>>that war already. But at least it is not in -linus tree (and that's
> >>>what I use anyway) => I gave up with bootsplash-equivalents, as long
> >>>as they don't come to linus.
> >>>
> >>>[And I believe Linus would shoot down bootsplash-like code, anyway.]
>
> Why? Joe consumer wants it.
> As to the ever growing size of the kernel, there could be a official
> addons/tools
> tree with non-core functions maintained by a seperate maintainer. Things
> like
> debuggers, profiling or (swsusp) debug support could go there as
> well...

Yes, having -nice patch with bootsplashes, translated kernel messages,
and swsusp eye-candy would work for me. Feel free to maintain it.

> >>Solution: Auto switch to non-swsusp VT on error showing the error message.
> >
> >Hmm, at that point you loose context, like now you know what error
> >happened, but do not know at which phase of suspend. That's pretty bad
> >too.
>
> Right, Good idea! Just print always "ugly" swsusp context on a text VT -
> plus any
> error messages - and switch over to this VT in printk when not in interrupt
> context. 10 lines of code or so in printk ;)

You see, 10 lines in printk is probably good enough reason not to
include that patch in kernel, because its "too ugly". Plus it does not
work if printk _was_ from interrupt context.

swsusp really should not have patch any code outside kernel/power.

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 00:11:28

by Jonathan Sambrook

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

At 11:26 on Wed 24/03/04, [email protected] masquerading as 'Pavel Machek' wrote:
> Hi!
>
> > >On Wed, 2004-03-24 at 18:22, Michael Frank wrote:
> > >>Error messages should be handled on a seperate VT eliminating the issue.
> > >
> > >While I definitely like the idea, I'm not sure that's feasible; as Pavel
> > >pointed out, Suspend doesn't generate all the error messages that might
> > >possibly appear. Maybe I'm just ignorant.. I'll take a look when I get
> > >the change.
> >
> > printk is central and could do the switch when swsusp is active
>
> You *could* do it, but it is bad idea. You don't want to patch
> printk.c, that driver printk could be done from interrupt (and you
> can't switch consoles at that point), you loose context, etc. What
> about doing the simple thing, maybe hack with CRs and be done with
> that?

What about getting rid of the progress bar and just print out at
swsusp2 log-level 9? We're used to scrolling boot messages after all.
(I do currently use bootsplash, but I have it underneath my boot
messages.)

Jonathan

>
> If someone wants more eye candy, they have to patch their kernel with
> bootsplash.
> Pavel
> --
> When do you have a heart between your knees?
> [Johanka's followup: and *two* hearts?]
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> swsusp-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/swsusp-devel

--


.__
|..|__ _____ _____ ___ ____ _ ____
|;;|: \ /;:;:;\ /;:;:;\ /;:;\ /;;;;\/;\/\/;;;;\
| Y \ Y Y \ Y Y \ Y \ | ___ \ |\_| ___ \
|___| /__|_| /__|_| /__| / o \_____/ | \___ /
\/ \/ \/ \/ \/ _/ /
\__/

hmmnsoft's FreeShade : window shading, for Windows, for free

http://www.hmmn.org/FreeShade

2004-03-25 00:18:12

by Michael Frank

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

On Thu, 25 Mar 2004 00:23:38 +0100, Pavel Machek <[email protected]> wrote:

> On ÄŒt 25-03-04 06:46:12, Michael Frank wrote:
>> May I request that you leave the authors headers intact when quoting. Thank
>> you.
>
> As you wish.
>
>> On Wed, 24 Mar 2004 11:17:04 +0100, Pavel Machek <[email protected]> wrote:
>>
>> >>>>So why aren't you arguing against bootsplash too? That definitely
>> >>>>obscures such an error :> Of course we could argue that such an error
>> >>>>shouldn't happen and/or will be obvious via other means (assuming it
>> >>>>indicates hardware failure).
>> >>>
>> >>>Of course I *am* against bootsplash. Unfortunately I've probably lost
>> >>>that war already. But at least it is not in -linus tree (and that's
>> >>>what I use anyway) => I gave up with bootsplash-equivalents, as long
>> >>>as they don't come to linus.
>> >>>
>> >>>[And I believe Linus would shoot down bootsplash-like code, anyway.]
>>
>> Why? Joe consumer wants it.
>> As to the ever growing size of the kernel, there could be a official
>> addons/tools
>> tree with non-core functions maintained by a seperate maintainer. Things
>> like
>> debuggers, profiling or (swsusp) debug support could go there as
>> well...
>
> Yes, having -nice patch with bootsplashes, translated kernel messages,
> and swsusp eye-candy would work for me.

If a -nice _tree_ is useful, your guys just have to launch it. Gosh this could reduce
arguments about what goes into the kernel and save Linus and Andrew lots of work.

> Feel free to maintain it.

Busy enough with testing, actually far too busy for being on a volunteer basis ;)

I am sure that better qualified and properly supported/sponsored individuals
will queue up as long as it is an _official_ -nice tree with the good purpose
of centralizing useful non-core functions :)

>
>> >>Solution: Auto switch to non-swsusp VT on error showing the error message.
>> >
>> >Hmm, at that point you loose context, like now you know what error
>> >happened, but do not know at which phase of suspend. That's pretty bad
>> >too.
>>
>> Right, Good idea! Just print always "ugly" swsusp context on a text VT -
>> plus any
>> error messages - and switch over to this VT in printk when not in interrupt
>> context. 10 lines of code or so in printk ;)
>
> You see, 10 lines in printk is probably good enough reason not to
> include that patch in kernel, because its "too ugly".

Pretty does not count above, Ugly does not count here, Functionality always does.
Besides that patch might be in the -nice tree.

> Plus it does not work if printk _was_ from interrupt context.

Kernel knows when in interrupt context and can defer switching.

>
> swsusp really should not have patch any code outside kernel/power.

Which is extremely ideal, but one thing at the time...

Michael

2004-03-25 00:28:45

by Pavel Machek

[permalink] [raw]
Subject: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On ÄŒt 25-03-04 07:56:14, Michael Frank wrote:
> On Thu, 25 Mar 2004 00:23:38 +0100, Pavel Machek <[email protected]> wrote:
> >On ÄŒt 25-03-04 06:46:12, Michael Frank wrote:
> >>On Wed, 24 Mar 2004 11:17:04 +0100, Pavel Machek <[email protected]> wrote:

> >Yes, having -nice patch with bootsplashes, translated kernel messages,
> >and swsusp eye-candy would work for me.
>
> If a -nice _tree_ is useful, your guys just have to launch it. Gosh this
> could reduce
> arguments about what goes into the kernel and save Linus and Andrew lots of
> work.

Yes.

> >Feel free to maintain it.
>
> Busy enough with testing, actually far too busy for being on a volunteer
> basis ;)
>
> I am sure that better qualified and properly supported/sponsored individuals
> will queue up as long as it is an _official_ -nice tree with the good
> purpose
> of centralizing useful non-core functions :)

I'd say that having official -anything tree is an oxymoron (is -ac
tree official? is -mm tree official?), but yes, I hope someone picks
this up

> >You see, 10 lines in printk is probably good enough reason not to
> >include that patch in kernel, because its "too ugly".
>
> Pretty does not count above, Ugly does not count here, Functionality always
> does.
> Besides that patch might be in the -nice tree.

Prettyness *does* count in -linus tree. -nice tree is likely to have
different criteria.

> >swsusp really should not have patch any code outside kernel/power.
>
> Which is extremely ideal, but one thing at the time...

Okay, lets not please add more of outside changes (for -linus merge).

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 00:38:23

by Karol Kozimor

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Thus wrote Michael Frank:
> Badblocks error reading beyond the end of the partition is irrelevant,
> it is a primitive bug somewhere unrelated to media condition.
>
> Also Badblocks without disabling drive cache is _utterly_useless_.
>
> It will not be a bare swsusp bug, I would have hit that in 20K+ cycles
> since using LZF and a thousand or so of other 2.4 users would have
> hit it too.
>
> Please help indentify the actual problem by running some decent tests.

I did my testing after hdparm -W0.
Best regards,

--
Karol 'sziwan' Kozimor
[email protected]

2004-03-25 00:54:45

by Michael Frank

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Thu, 25 Mar 2004 01:23:02 +0100, Pavel Machek <[email protected]> wrote:

> On ÄŒt 25-03-04 07:56:14, Michael Frank wrote:
>> On Thu, 25 Mar 2004 00:23:38 +0100, Pavel Machek <[email protected]> wrote:
>> >On ÄŒt 25-03-04 06:46:12, Michael Frank wrote:
>> >>On Wed, 24 Mar 2004 11:17:04 +0100, Pavel Machek <[email protected]> wrote:
>
>> >Yes, having -nice patch with bootsplashes, translated kernel messages,
>> >and swsusp eye-candy would work for me.
>>
>> If a -nice _tree_ is useful, your guys just have to launch it. Gosh this
>> could reduce
>> arguments about what goes into the kernel and save Linus and Andrew lots of
>> work.
>
> Yes.
>
>> >Feel free to maintain it.
>>
>> Busy enough with testing, actually far too busy for being on a volunteer
>> basis ;)
>>
>> I am sure that better qualified and properly supported/sponsored individuals
>> will queue up as long as it is an _official_ -nice tree with the good
>> purpose
>> of centralizing useful non-core functions :)
>
> I'd say that having official -anything tree is an oxymoron (is -ac
> tree official? is -mm tree official?), but yes, I hope someone picks
> this up

-mm or -ac are "private trees", albeit very credible and at least -mm is
experimental.

Linux is now 10 times as big as it was a few years ago, OK tools are better,
but there is so much important work like PM waiting.

If Linus or Andrew and peers said, OK bootsplash, or kgdb or whatever
is nice "enough", but I do not have time to deal with it, put it into the -nice tree,
it would be "official" enough.

And if for that matter all of swsusp would end up in _the_ -nice tree
(not crippled of course), that would be (_speaking_for_myself)
(sorry Nigel, should you disagree) also OK.

Then after all, -nice can then be much nicer than -Linus without many fights
and accomplishing more than ever ;).

>
>> >You see, 10 lines in printk is probably good enough reason not to
>> >include that patch in kernel, because its "too ugly".
>>
>> Pretty does not count above, Ugly does not count here, Functionality always
>> does.
>> Besides that patch might be in the -nice tree.
>
> Prettyness *does* count in -linus tree. -nice tree is likely to have
> different criteria.

As this may be another religious war, I'll give up.

>
>> >swsusp really should not have patch any code outside kernel/power.
>>
>> Which is extremely ideal, but one thing at the time...
>
> Okay, lets not please add more of outside changes (for -linus merge).

Fine by me as long as it works.

Guess Nigel will come up with a spec soon and then it has to be decided
what functions you want in -Linus.

I'll be traveling a few days and may be slower to respond.

Michael

2004-03-25 01:41:30

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On ÄŒt 25-03-04 08:50:27, Michael Frank wrote:
> On Thu, 25 Mar 2004 01:23:02 +0100, Pavel Machek <[email protected]> wrote:
>
> >On ÄŒt 25-03-04 07:56:14, Michael Frank wrote:
> >>On Thu, 25 Mar 2004 00:23:38 +0100, Pavel Machek <[email protected]> wrote:

> >>I am sure that better qualified and properly supported/sponsored
> >>individuals
> >>will queue up as long as it is an _official_ -nice tree with the good
> >>purpose
> >>of centralizing useful non-core functions :)
> >
> >I'd say that having official -anything tree is an oxymoron (is -ac
> >tree official? is -mm tree official?), but yes, I hope someone picks
> >this up
>
> -mm or -ac are "private trees", albeit very credible and at least -mm is
> experimental.

Having credible "private" -nice tree would be enough, I guess.

> >>Which is extremely ideal, but one thing at the time...
> >
> >Okay, lets not please add more of outside changes (for -linus merge).
>
> Fine by me as long as it works.
>
> Guess Nigel will come up with a spec soon and then it has to be decided
> what functions you want in -Linus.

My priorities are

* highmem support (there are notebooks with 2GB ram; I have one too
close to me)
[I have hacky patch for this for swsusp1; at least its short]

* smp support (HT notebooks are going to be more common, I'm afraid)

Important but not at price of modifying too many files outside
kernel/power

* refrigerator should work (but if you have NFS server mounted, and
its down, you are on your own)

* even if all memory is used, it should be possible to suspend

"If it is very non-intrusive it might go in"

* esc interrupts

* I'd say that one compression method should be enough for everyone

Features I'd prefer not to see in -linus kernel

* splashscreen

* /proc configuration
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 21:38:57

by Nigel Cunningham

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

By the way, here's an example where having the /proc interface is a good
thing: which do you use? zip compression, lzf compression or no
compression? Until recently I always used lzf compression. I just
upgraded my laptop's hard drive, and found I wasn't getting the
performance improvements in suspending I expected. It turns out that the
CPU is now the limiting factor. Because I had the /proc interface, I
could easily adjust the debug settings to show me throughput and then
try a couple of suspend cycles with compression enabled and with it
disabled. Without the /proc interface, I would have had to have
recompiled the kernel to switch settings. (I didn't try gzip because I
knew it wasn't going to be a contender for me).

Regards,

Nigel

On Thu, 2004-03-25 at 19:57, Matthias Wieser wrote:
> On Thursday 25 March 2004 02:41, Pavel Machek wrote:
> > * I'd say that one compression method should be enough for everyone
> No:
>
> Fast Compression algorithm: lzw
> Good Compression algorithm: gzip
>
> That is the reason, 2 algorithm may exist and have their reason.
>
> Ciao, Matthias
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-25 22:08:32

by Pavel Machek

[permalink] [raw]
Subject: Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]

Hi!

> BTW Pavel I'm not arguing that the code has to stay in without
> modification
> (e.g. massively stripped down or whatever). But this is one place where
> kernel code is a lot closer to Joe Enduser's awareness than is usually
> the case, so IMUO, the priorities shift.

If joe user can survive seeing kernel messages while booting, he
should survive swsusp1-style messages, too....
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 22:18:00

by Pavel Machek

[permalink] [raw]
Subject: swsusp is not reliable. Face it. [was Re: [Swsusp-devel] Re: swsusp problems]

Hi!

> >>Well, not all printks() are errors this hard. And at some points, it
> >>is no longer possible to abort (after pagedir is on disk, its okay to
> >>panic (machine will resume normally after that), but its not okay to
> >>simply return. You could fix signature then return, but it would be hard).
>
> It is _utterly_ unacceptable to fail during suspend. You might as well take
> a sledge hammer to kill the box!

Okay, so bring me the sledgehammer.

> Suspend is a mechanism to suspend the system transparently and
> _NOT_EVER_ impairing the system. There can be NO_COMPROMISE and
> NO_EXCUSE. I walk out of my office suspending the machine and resuming it
> in front of my client it can't ever fail, or am I an idiot to advocate
> linux?
>
> If I would be willing to accept failure I would not spend my time here and
> utilize M$'s incarnation of an architectural idiocy.

You are wrong.

swsusp1 fails your test, swsusp2 fails your test, and pmdisk fails it,
too. If half of memory is used by kmalloc(), there's no sane way to
make suspend-to-disk working. And swsusp[12] does not. Granted, half
of memory kmalloc-ed is unusual situation, but it can theoreticaly
happen. Try mem=8M or something.

So stop spreading nonsense about "NO COMPROMISES" and stop comparing
me to Mickey$oft. Thanks.

> Right, because nobody pays attention. If you step on the brake pedal driving
> your car these brakes better work or you may be well worse of than dead
> ... just in case you ran over a kid...

So don't put swsusp into car-braking system.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 22:29:06

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!

> By the way, here's an example where having the /proc interface is a good
> thing: which do you use? zip compression, lzf compression or no
> compression? Until recently I always used lzf compression. I just

We should select one, and drop the others.

gzip is useless for almost everyone -> gets little testing -> is
probably broken.

> upgraded my laptop's hard drive, and found I wasn't getting the
> performance improvements in suspending I expected. It turns out that the
> CPU is now the limiting factor. Because I had the /proc interface, I
> could easily adjust the debug settings to show me throughput and then
> try a couple of suspend cycles with compression enabled and with it
> disabled. Without the /proc interface, I would have had to have
> recompiled the kernel to switch settings. (I didn't try gzip because I
> knew it wasn't going to be a contender for me).

Kernel could automagically select the right one.. But I'd prefer for
only "non compressed" part to reach mainline for 2.6. Feature freeze
was few months ago, and "adding possibility to compress swsusp data"
does not sound like a bugfix to me...
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 22:36:13

by Nigel Cunningham

[permalink] [raw]
Subject: Re: swsusp is not reliable. Face it. [was Re: [Swsusp-devel] Re: swsusp problems]

Howdy.

On Fri, 2004-03-26 at 10:13, Pavel Machek wrote:
> swsusp1 fails your test, swsusp2 fails your test, and pmdisk fails it,
> too. If half of memory is used by kmalloc(), there's no sane way to
> make suspend-to-disk working. And swsusp[12] does not. Granted, half
> of memory kmalloc-ed is unusual situation, but it can theoreticaly
> happen. Try mem=8M or something.

Of course if you do have 8M memory, you're not going to care about
suspending to disk anyway :>. Note too that suspend2 will eat memory
until it can suspend. It doesn't livelock because it grabs the memory it
frees immediately and if it can't free enough, it gives up and exits
cleanly. You'll know almost instantly if your suspend is going to
succeed or fail: once you start seeing the image written, the only thing
that will stop it is media/hardware failure or user intervention.

> > Right, because nobody pays attention. If you step on the brake pedal driving
> > your car these brakes better work or you may be well worse of than dead
> > ... just in case you ran over a kid...
>
> So don't put swsusp into car-braking system.

LOL! Or M$!

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-25 22:42:45

by Nigel Cunningham

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi.

On Fri, 2004-03-26 at 10:27, Pavel Machek wrote:
> We should select one, and drop the others.
>
> gzip is useless for almost everyone -> gets little testing -> is
> probably broken.

Nope. Not broken. I used it yesterday in testing some changes you
suggested. I asked about removing it a while back and people said 'No!
We're using it!"

> > upgraded my laptop's hard drive, and found I wasn't getting the
> > performance improvements in suspending I expected. It turns out that the
> > CPU is now the limiting factor. Because I had the /proc interface, I
> > could easily adjust the debug settings to show me throughput and then
> > try a couple of suspend cycles with compression enabled and with it
> > disabled. Without the /proc interface, I would have had to have
> > recompiled the kernel to switch settings. (I didn't try gzip because I
> > knew it wasn't going to be a contender for me).
>
> Kernel could automagically select the right one.. But I'd prefer for

How? It would need to know the drive throughput, the compression
throughput for each driver... I'm sure you're not suggesting some sort
of utility run during kernel configuration/compilation. (And if you
were, that would assume the computer it was being compiled on was the
computer the kernel would be run on).

> only "non compressed" part to reach mainline for 2.6. Feature freeze
> was few months ago, and "adding possibility to compress swsusp data"
> does not sound like a bugfix to me...

Feature freeze is always half unfrozen anyway. 2.4 just gained XFS! I'm
sure 2.6 would be fine gaining a permutation of suspend with Highmem
support, swap file support, support for multiple swap devices, support
for bootsplash, support for compression and/or encryption and support
for SMP/HT. (Did I forget anything? :>).

Regards,

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-25 22:57:02

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp is not reliable. Face it. [was Re: [Swsusp-devel] Re: swsusp problems]

Hi!

> > swsusp1 fails your test, swsusp2 fails your test, and pmdisk fails it,
> > too. If half of memory is used by kmalloc(), there's no sane way to
> > make suspend-to-disk working. And swsusp[12] does not. Granted, half
> > of memory kmalloc-ed is unusual situation, but it can theoreticaly
> > happen. Try mem=8M or something.
>
> Of course if you do have 8M memory, you're not going to care about
> suspending to disk anyway :>. Note too that suspend2 will eat memory
> until it can suspend. It doesn't livelock because it grabs the memory it
> frees immediately and if it can't free enough, it gives up and exits
> cleanly. You'll know almost instantly if your suspend is going to
> succeed or fail: once you start seeing the image written, the only thing
> that will stop it is media/hardware failure or user intervention.

Yep, swsusp2 will

a) either fail and exit cleanly

b) or suspend to disk and powerdown

. And that's correct behaviour. Michael apparently wants suspend that
always suspends, and never refuses, but not even swsusp2 can do
that.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-25 23:07:43

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!

> > only "non compressed" part to reach mainline for 2.6. Feature freeze
> > was few months ago, and "adding possibility to compress swsusp data"
> > does not sound like a bugfix to me...
>
> Feature freeze is always half unfrozen anyway. 2.4 just gained XFS!
> I'm

XFS probably has 10+ people working on it, full time, and it still
required quite a lot of pushing.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-26 06:04:18

by Michael Frank

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Thu, 25 Mar 2004 23:27:45 +0100, Pavel Machek <[email protected]> wrote:

> Hi!
>
>> By the way, here's an example where having the /proc interface is a good
>> thing: which do you use? zip compression, lzf compression or no
>> compression? Until recently I always used lzf compression. I just
>
> We should select one, and drop the others.
>
> gzip is useless for almost everyone -> gets little testing -> is
> probably broken.
>
>> upgraded my laptop's hard drive, and found I wasn't getting the
>> performance improvements in suspending I expected. It turns out that the
>> CPU is now the limiting factor. Because I had the /proc interface, I
>> could easily adjust the debug settings to show me throughput and then
>> try a couple of suspend cycles with compression enabled and with it
>> disabled. Without the /proc interface, I would have had to have
>> recompiled the kernel to switch settings. (I didn't try gzip because I
>> knew it wasn't going to be a contender for me).
>

/proc is needed a lot

- enable escape
- select reboot mode
which is essential for multibooting. We use it all the time to
boot various installations of Linux
- select compression none, lzw or gzip
none is used when disk faster that cpu-limited lzf
lzf is used when cpu is fast enough to compress to disk
Fast CPU can do 100MB/s+ to 50MB/s drives
gzip is used by some who care about image size eg flash users
- keep image mode (when compiled in)
which is used for embedded kiosks for example
Boeing requested and uses it
- default console level
Controls console messages or nice display
- access debug info header
This is needed to analyze swsusp2 performance
- access resume parameters
Saves a reboot when changing parameters
- activate
swsusp2 activation independent of apci, apm
- last result
info on why swsusp2 did not suspend such as out of
memory or swap or freezing failure
- access version info
could be dropped in mainline
- access interface version info
could be dropped in mainline

Michael

2004-03-26 06:06:15

by Michael Frank

[permalink] [raw]
Subject: Re: swsusp is not reliable. Face it. [was Re: [Swsusp-devel] Re: swsusp problems]

You forgot to leave the header ;)

On Thu, 25 Mar 2004 23:13:48 +0100, Pavel Machek <[email protected]> wrote:

>
>> Suspend is a mechanism to suspend the system transparently and
>> _NOT_EVER_ impairing the system. There can be NO_COMPROMISE and
>> NO_EXCUSE. I walk out of my office suspending the machine and resuming it
>> in front of my client it can't ever fail, or am I an idiot to advocate
>> linux?
>>
>> If I would be willing to accept failure I would not spend my time here and
>> utilize M$'s incarnation of an architectural idiocy.
>
> You are wrong.
>
> swsusp1 fails your test, swsusp2 fails your test, and pmdisk fails it,
> too. If half of memory is used by kmalloc(), there's no sane way to
> make suspend-to-disk working. And swsusp[12] does not. Granted, half
> of memory kmalloc-ed is unusual situation, but it can theoreticaly
> happen. Try mem=8M or something.

No, I am not!

mem=8M won't boot into a usable system. mem=~11M will not suspend and
swsusp2 will exit gracefully and this is tested.

So swsusp2 does _not_ fail. You still have a usable system instead of a
paniced system you seem to like to accept.

Regards
Michael

2004-03-26 06:04:10

by Michael Frank

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Thu, 25 Mar 2004 23:54:53 +0100, Pavel Machek <[email protected]> wrote:

> Hi!
>
>> > only "non compressed" part to reach mainline for 2.6. Feature freeze
>> > was few months ago, and "adding possibility to compress swsusp data"
>> > does not sound like a bugfix to me...
>>
>> Feature freeze is always half unfrozen anyway. 2.4 just gained XFS!
>> I'm
>
> XFS probably has 10+ people working on it, full time, and it still
> required quite a lot of pushing.

We could push harder. Should we ask all swsusp2 users using
compression to announce themselves on LKML? ;)

Michael

2004-03-26 09:59:43

by Pavel Machek

[permalink] [raw]
Subject: Re: swsusp is not reliable. Face it. [was Re: [Swsusp-devel] Re: swsusp problems]

Hi!

On P? 26-03-04 13:59:55, Michael Frank wrote:
> On Thu, 25 Mar 2004 23:13:48 +0100, Pavel Machek <[email protected]> wrote:
> >>Suspend is a mechanism to suspend the system transparently and
> >>_NOT_EVER_ impairing the system. There can be NO_COMPROMISE and
> >>NO_EXCUSE. I walk out of my office suspending the machine and resuming it
> >>in front of my client it can't ever fail, or am I an idiot to advocate
> >>linux?
> >>
> >>If I would be willing to accept failure I would not spend my time here and
> >>utilize M$'s incarnation of an architectural idiocy.
> >
> >You are wrong.
> >
> >swsusp1 fails your test, swsusp2 fails your test, and pmdisk fails it,
> >too. If half of memory is used by kmalloc(), there's no sane way to
> >make suspend-to-disk working. And swsusp[12] does not. Granted, half
> >of memory kmalloc-ed is unusual situation, but it can theoreticaly
> >happen. Try mem=8M or something.
>
> No, I am not!
>
> mem=8M won't boot into a usable system. mem=~11M will not suspend and
> swsusp2 will exit gracefully and this is tested.
>
> So swsusp2 does _not_ fail. You still have a usable system instead of a
> paniced system you seem to like to accept.

If swsusp1 panics system, that's a bug. I'm not accepting that one.

Refusing to suspend (I'd call it "fail to suspend") is bad but is not
a bug. Do we understand each other now?
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-26 10:22:20

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!

> >>upgraded my laptop's hard drive, and found I wasn't getting the
> >>performance improvements in suspending I expected. It turns out that the
> >>CPU is now the limiting factor. Because I had the /proc interface, I
> >>could easily adjust the debug settings to show me throughput and then
> >>try a couple of suspend cycles with compression enabled and with it
> >>disabled. Without the /proc interface, I would have had to have
> >>recompiled the kernel to switch settings. (I didn't try gzip because I
> >>knew it wasn't going to be a contender for me).
> >
>
> /proc is needed a lot
>
> - enable escape
> - select reboot mode
> which is essential for multibooting. We use it all the time to
> boot various installations of Linux

Perhaps reboot() can have parameter for that.

> - select compression none, lzw or gzip
> none is used when disk faster that cpu-limited lzf
> lzf is used when cpu is fast enough to compress to disk
> Fast CPU can do 100MB/s+ to 50MB/s drives
> gzip is used by some who care about image size eg flash users

If you are doing "resume=swap:<something>", why not "resume=lzw-swap:something"?

> - keep image mode (when compiled in)
> which is used for embedded kiosks for example
> Boeing requested and uses it

Boeing can keep external patch, this seems like pretty dangerous
feature for joe user. And it should not be selected at /proc, but as
command line parameter.

> - default console level
> Controls console messages or nice display
> - access debug info header
> This is needed to analyze swsusp2 performance
> - access resume parameters
> Saves a reboot when changing parameters
> - activate
> swsusp2 activation independent of apci, apm

Should not be needed. There's reboot() syscall to do that.

> - last result
> info on why swsusp2 did not suspend such as out of
> memory or swap or freezing failure

That should be return value of reboot() syscall.

Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-26 10:25:44

by Michael Frank

[permalink] [raw]
Subject: Re: swsusp is not reliable. Face it. [was Re: [Swsusp-devel] Re: swsusp problems]

On Fri, 26 Mar 2004 10:59:29 +0100, Pavel Machek <[email protected]> wrote:

> Hi!
>
> On P? 26-03-04 13:59:55, Michael Frank wrote:
>> On Thu, 25 Mar 2004 23:13:48 +0100, Pavel Machek <[email protected]> wrote:
>> >>Suspend is a mechanism to suspend the system transparently and
>> >>_NOT_EVER_ impairing the system. There can be NO_COMPROMISE and
>> >>NO_EXCUSE. I walk out of my office suspending the machine and resuming it
>> >>in front of my client it can't ever fail, or am I an idiot to advocate
>> >>linux?
>> >>
>> >>If I would be willing to accept failure I would not spend my time here and
>> >>utilize M$'s incarnation of an architectural idiocy.
>> >
>> >You are wrong.
>> >
>> >swsusp1 fails your test, swsusp2 fails your test, and pmdisk fails it,
>> >too. If half of memory is used by kmalloc(), there's no sane way to
>> >make suspend-to-disk working. And swsusp[12] does not. Granted, half
>> >of memory kmalloc-ed is unusual situation, but it can theoreticaly
>> >happen. Try mem=8M or something.
>>
>> No, I am not!
>>
>> mem=8M won't boot into a usable system. mem=~11M will not suspend and
>> swsusp2 will exit gracefully and this is tested.
>>
>> So swsusp2 does _not_ fail. You still have a usable system instead of a
>> paniced system you seem to like to accept.
>
> If swsusp1 panics system, that's a bug. I'm not accepting that one.

OK,

>
> Refusing to suspend (I'd call it "fail to suspend") is bad but is not
> a bug.

Right, and in case of swsusp2, it generally can be avoided by proper
sizing of swap.

> Do we understand each other now?

Yes

Michael

2004-03-26 10:50:38

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!

> >> > only "non compressed" part to reach mainline for 2.6. Feature freeze
> >> > was few months ago, and "adding possibility to compress swsusp data"
> >> > does not sound like a bugfix to me...
> >>
> >> Feature freeze is always half unfrozen anyway. 2.4 just gained XFS!
> >> I'm
> >
> > XFS probably has 10+ people working on it, full time, and it still
> > required quite a lot of pushing.
>
> We could push harder. Should we ask all swsusp2 users using
> compression to announce themselves on LKML? ;)

Hmm, I've one such troll announcing into my inbox, sending stuff like
below. I hope he's the only one.

Subject: Microsoft RULES!!!
To: Pavel Machek <[email protected]>
...

So stop spreading nonsense about "NO COMPROMISES" and
stop comparing
me to Mickey$oft. Thanks.

---------------------

Such a CUNT!!! Microsoft do very well engineered
software, something Linux can't sadly achive in ages
if still emply LOOSERS LIKE U.

When a Microsoft engineer see a people like u, simply
laugh!!!

U MORON. U ARE SUCH A STUPID CUNT, NOW ALL THE LIST
KNOW ABOUT U, WE ALL LAUGHT!

and

> So swsusp2 does _not_ fail. You still have a usable
system instead of a
> paniced system you seem to like to accept.

If swsusp1 panics system, that's a bug. I'm not
accepting that one.

Refusing to suspend (I'd call it "fail to suspend") is
bad but is not
a bug. Do we understand each other now?

-------------

U know, I used to call u your name this days: STUPID,
IDIOT, MORON, but that's not enough.

Your last sentence related with your CRAPPY, LURID,
DANGEROUS swsusp1 qualify u as what u are.

A WORST GEEK, A LOOSER, A STUPID CUNT: U WONT EVER
ADMIT YOUR PROGRAM CRASH AND THAT IS A BUG :-((((



--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-26 13:21:24

by Michael Frank

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Fri, 26 Mar 2004 11:50:23 +0100, Pavel Machek <[email protected]> wrote:

> Hi!
>
>> >> > only "non compressed" part to reach mainline for 2.6. Feature freeze
>> >> > was few months ago, and "adding possibility to compress swsusp data"
>> >> > does not sound like a bugfix to me...
>> >>
>> >> Feature freeze is always half unfrozen anyway. 2.4 just gained XFS!
>> >> I'm
>> >
>> > XFS probably has 10+ people working on it, full time, and it still
>> > required quite a lot of pushing.
>>
>> We could push harder. Should we ask all swsusp2 users using
>> compression to announce themselves on LKML? ;)
>
> Hmm, I've one such troll announcing into my inbox, sending stuff like
> below. I hope he's the only one.

Is not what I had in mind. Please accept that people are benefitially
using compression, both lzf and gzip.

The individual in question has poor taste, I would never call anyone
my favorite dessert ;)

Michael

>
> Subject: Microsoft RULES!!!
> To: Pavel Machek <[email protected]>
> ...
>
> So stop spreading nonsense about "NO COMPROMISES" and
> stop comparing
> me to Mickey$oft. Thanks.
>
> ---------------------
>
> Such a CUNT!!! Microsoft do very well engineered
> software, something Linux can't sadly achive in ages
> if still emply LOOSERS LIKE U.
>
> When a Microsoft engineer see a people like u, simply
> laugh!!!
>
> U MORON. U ARE SUCH A STUPID CUNT, NOW ALL THE LIST
> KNOW ABOUT U, WE ALL LAUGHT!
>
> and
>
>> So swsusp2 does _not_ fail. You still have a usable
> system instead of a
>> paniced system you seem to like to accept.
>
> If swsusp1 panics system, that's a bug. I'm not
> accepting that one.
>
> Refusing to suspend (I'd call it "fail to suspend") is
> bad but is not
> a bug. Do we understand each other now?
>
> -------------
>
> U know, I used to call u your name this days: STUPID,
> IDIOT, MORON, but that's not enough.
>
> Your last sentence related with your CRAPPY, LURID,
> DANGEROUS swsusp1 qualify u as what u are.
>
> A WORST GEEK, A LOOSER, A STUPID CUNT: U WONT EVER
> ADMIT YOUR PROGRAM CRASH AND THAT IS A BUG :-((((
>
>
>

2004-03-26 21:31:19

by Nigel Cunningham

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Good morning.

On Fri, 2004-03-26 at 22:22, Pavel Machek wrote:
> > /proc is needed a lot
> >
> > - enable escape
> > - select reboot mode
> > which is essential for multibooting. We use it all the time to
> > boot various installations of Linux
>
> Perhaps reboot() can have parameter for that.

Sounds feasible. Who maintains the package with reboot/shutdown and so
on?

> > - select compression none, lzw or gzip
> > none is used when disk faster that cpu-limited lzf
> > lzf is used when cpu is fast enough to compress to disk
> > Fast CPU can do 100MB/s+ to 50MB/s drives
> > gzip is used by some who care about image size eg flash users
>
> If you are doing "resume=swap:<something>", why not "resume=lzw-swap:something"?

Because it's ugly? resume= is supposed to specify where the image's
header is found, nothing more. More than that, though, doing this still
doesn't solve the issue of how to enable/disable a compressor (or
encryption when such a plugin appears) after booting. (Yes, I know - you
don't want that much flexibility).

> > - keep image mode (when compiled in)
> > which is used for embedded kiosks for example
> > Boeing requested and uses it
>
> Boeing can keep external patch, this seems like pretty dangerous
> feature for joe user. And it should not be selected at /proc, but as
> command line parameter.

It relies upon both a compile time CONFIG option (because yes, it is
rarely used) and a /proc entry. The proc entry is necessary so that the
image can be updated and set up in the first place.

> > - default console level
> > Controls console messages or nice display
> > - access debug info header
> > This is needed to analyze swsusp2 performance
> > - access resume parameters
> > Saves a reboot when changing parameters
> > - activate
> > swsusp2 activation independent of apci, apm
>
> Should not be needed. There's reboot() syscall to do that.

That's fine once we get one implementation. For now, I've been trying to
play nicely with swsusp and pmdisk. That's why I used resume2= and its
also why I supported 13 headers; I needed to recognise pmdisk and swsusp
headers so that I could know to ignore them (I tried leaving swsusp
first in the boot order, and it paniced when I'd suspended from suspend2
because it didn't recognise the header format).

> > - last result
> > info on why swsusp2 did not suspend such as out of
> > memory or swap or freezing failure
>
> That should be return value of reboot() syscall.

Could be.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-26 22:22:52

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!

> Good morning.

Its around midnight here :-).

> > > /proc is needed a lot
> > >
> > > - enable escape
> > > - select reboot mode
> > > which is essential for multibooting. We use it all the time to
> > > boot various installations of Linux
> >
> > Perhaps reboot() can have parameter for that.
>
> Sounds feasible. Who maintains the package with reboot/shutdown and so
> on?

On my system it says:

AUTHOR
Miquel van Smoorenburg, [email protected]

There was even patch to do shutdown -z, or something like that... And
binary that calls it should be easy, too.

> > > - select compression none, lzw or gzip
> > > none is used when disk faster that cpu-limited lzf
> > > lzf is used when cpu is fast enough to compress to disk
> > > Fast CPU can do 100MB/s+ to 50MB/s drives
> > > gzip is used by some who care about image size eg flash users
> >
> > If you are doing "resume=swap:<something>", why not "resume=lzw-swap:something"?
>
> Because it's ugly? resume= is supposed to specify where the image's
> header is found, nothing more. More than that, though, doing this still
> doesn't solve the issue of how to enable/disable a compressor (or
> encryption when such a plugin appears) after booting. (Yes, I know - you
> don't want that much flexibility).

You are right, that would be ugly. How is encryption supposed to work,
kernel asks you to type in a key?

Okay, when there's more than one output plugin, some kind of file
interface is probably good.

> > > - default console level
> > > Controls console messages or nice display
> > > - access debug info header
> > > This is needed to analyze swsusp2 performance
> > > - access resume parameters
> > > Saves a reboot when changing parameters
> > > - activate
> > > swsusp2 activation independent of apci, apm
> >
> > Should not be needed. There's reboot() syscall to do that.
>
> That's fine once we get one implementation. For now, I've been trying to
> play nicely with swsusp and pmdisk. That's why I used resume2= and its
> also why I supported 13 headers; I needed to recognise pmdisk and swsusp
> headers so that I could know to ignore them (I tried leaving swsusp
> first in the boot order, and it paniced when I'd suspended from suspend2
> because it didn't recognise the header format).

Okay, we really should have only one implementation.
Pavel

--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-27 02:21:39

by Micha

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Thu, Mar 25, 2004 at 11:27:45PM +0100, Pavel Machek wrote:
> Hi!
>
> > By the way, here's an example where having the /proc interface is a good
> > thing: which do you use? zip compression, lzf compression or no
> > compression? Until recently I always used lzf compression. I just
>
> We should select one, and drop the others.
>
> gzip is useless for almost everyone -> gets little testing -> is
> probably broken.
>
> > upgraded my laptop's hard drive, and found I wasn't getting the
> > performance improvements in suspending I expected. It turns out that the
> > CPU is now the limiting factor. Because I had the /proc interface, I
> > could easily adjust the debug settings to show me throughput and then
> > try a couple of suspend cycles with compression enabled and with it
> > disabled. Without the /proc interface, I would have had to have
> > recompiled the kernel to switch settings. (I didn't try gzip because I
> > knew it wasn't going to be a contender for me).
>
> Kernel could automagically select the right one.. But I'd prefer for
> only "non compressed" part to reach mainline for 2.6. Feature freeze

That would mean that most people around would want to patch their
kernel considering the speed increase and the saved space (which
converts to needing less swap) considering most people get 30%-50%
compression rate, which translates to quite a bit with laptops with 1G
ram being no too uncommon these days.

You shouldn't be to extreme in you eagerness to trim things. You should
always take in mind what would be good for the user, not only the
developer.

> was few months ago, and "adding possibility to compress swsusp data"
> does not sound like a bugfix to me...
> Pavel
> --
> When do you have a heart between your knees?
> [Johanka's followup: and *two* hearts?]
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> swsusp-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/swsusp-devel
>
> +++++++++++++++++++++++++++++++++++++++++++
> This Mail Was Scanned By Mail-seCure System
> at the Tel-Aviv University CC.
>

2004-03-27 03:08:23

by Nigel Cunningham

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Howdy again.

On Sat, 2004-03-27 at 10:22, Pavel Machek wrote:
> You are right, that would be ugly. How is encryption supposed to work,
> kernel asks you to type in a key?

I haven't thought about the specifics there. Perhaps the plugin prompts
for one, or perhaps it takes a lilo parameter? Since the resume time
command line gets forgotten with the switch back to the original kernel,
I don't think that would introduce any insecurity. Anyway, not being a
security expert, I'll leave that issue to whoever cares enough to make
an encryption plugin.

Nigel
--
Nigel Cunningham
C/- Westminster Presbyterian Church Belconnen
61 Templeton Street, Cook, ACT 2614.
+61 (2) 6251 7727(wk); +61 (2) 6253 0250 (home)

Evolution (n): A hypothetical process whereby infinitely improbable events occur
with alarming frequency, order arises from chaos, and no one is given credit.

2004-03-27 03:37:57

by Luke-Jr

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Saturday 27 March 2004 02:08 am, Nigel Cunningham wrote:
> On Sat, 2004-03-27 at 10:22, Pavel Machek wrote:
> > You are right, that would be ugly. How is encryption supposed to work,
> > kernel asks you to type in a key?
>
> I haven't thought about the specifics there. Perhaps the plugin prompts
> for one, or perhaps it takes a lilo parameter?
The only purpose I can think of for encryption would be so someone can't grab
the HD and boot it on another PC or read the image directly.
Unless I'm missing something, that would imply that the key would need to be
generated from a hardware profile (only creatable by root) somehow to
restrict its readability to that one system.

2004-03-27 04:29:02

by Micha

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Sat, Mar 27, 2004 at 03:37:48AM +0000, Luke-Jr wrote:
> On Saturday 27 March 2004 02:08 am, Nigel Cunningham wrote:
> > On Sat, 2004-03-27 at 10:22, Pavel Machek wrote:
> > > You are right, that would be ugly. How is encryption supposed to work,
> > > kernel asks you to type in a key?
> >
> > I haven't thought about the specifics there. Perhaps the plugin prompts
> > for one, or perhaps it takes a lilo parameter?
> The only purpose I can think of for encryption would be so someone can't grab
> the HD and boot it on another PC or read the image directly.
> Unless I'm missing something, that would imply that the key would need to be
> generated from a hardware profile (only creatable by root) somehow to
> restrict its readability to that one system.
>

Actually it would be very unlikely that grabbing the hard disk would
enable to boot on another machine since you are restoring all the
context/modules etc. The grabber would need an identical system, and
even then I doubt it would work (I don't know how flexible linux and
the hardware are in this respect.

Its more a question of grabbing you entire computer and getting access
to you hard disk, including encrypted partitions. In this case you
would want to request a key from the user and not use a hardware
related key.

>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> swsusp-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/swsusp-devel
>
> +++++++++++++++++++++++++++++++++++++++++++
> This Mail Was Scanned By Mail-seCure System
> at the Tel-Aviv University CC.
>

2004-03-27 04:42:46

by Luke-Jr

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Saturday 27 March 2004 04:28 am, Micha Feigin wrote:
> Actually it would be very unlikely that grabbing the hard disk would
> enable to boot on another machine since you are restoring all the
> context/modules etc. The grabber would need an identical system, and
> even then I doubt it would work (I don't know how flexible linux and
> the hardware are in this respect.
But a different system *could* be used to analyze the content of the partition
were it stolen.
>
> Its more a question of grabbing you entire computer and getting access
> to you hard disk, including encrypted partitions. In this case you
> would want to request a key from the user and not use a hardware
> related key.
hardware-related is probably better than an argument, at least.

2004-03-27 14:50:55

by Jamie Lokier

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Nigel Cunningham wrote:
> > Kernel could automagically select the right one..
>
> How?

Run two parallel tasks: 1. write pages to disk by queuing the I/Os;
2. compress unqueued pages.

The first task won't use much CPU because it's always waiting for disk
DMAs to complete. While it sleeps, the second task runs.
Alternatively this can be implemented using polling for I/O
completions in the second task, if that's easier.

The first task should keep the I/O queue full enough to sustain
writing, but not much fuller than that. Either a fixed queue length
will be fine, or it is easy to adjust the queue length dynamically by
enlarging it if any I/O completions occur when the queue is empty.

The second task consumes uncompressed pages and makes available
compressed pages.

When the first task wants to queue more pages for I/O, it first checks
the compressed-page list. If there are any, they are queued,
otherwise it consumes uncompressed pages.

This will automatically converge on an optimal balance between
compressed and uncompressed page writing, provided the disk is using
DMA, which they do on all modern system.

This is actually better than fully enabling or disabling compression.
Even on a slow CPU, the fastest strategy is to compress x% of the
pages. E.g. if the CPU can compress 1 page in the time it takes to
write 3 pages, you will suspend fastest by compressing about 30% of
all pages.

-- Jamie

2004-03-27 19:50:29

by Micha

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Sat, Mar 27, 2004 at 04:40:47AM +0000, Luke-Jr wrote:
> On Saturday 27 March 2004 04:28 am, Micha Feigin wrote:
> > Actually it would be very unlikely that grabbing the hard disk would
> > enable to boot on another machine since you are restoring all the
> > context/modules etc. The grabber would need an identical system, and
> > even then I doubt it would work (I don't know how flexible linux and
> > the hardware are in this respect.
> But a different system *could* be used to analyze the content of the partition
> were it stolen.

It could be relevant so that you can access the resume data only on the
same computer so that access to encrypted partitions couldn't be gained
by looking at the suspend data, but I don't know how encrypted
partitions work, so can't say anything about that.

> >
> > Its more a question of grabbing you entire computer and getting access
> > to you hard disk, including encrypted partitions. In this case you
> > would want to request a key from the user and not use a hardware
> > related key.
> hardware-related is probably better than an argument, at least.

If the key is given at resume command line and this is properly
forgotten when the resumed kernel kicks in then a user key will also
probably be ok.

It wouldn't do much good getting a safer method then the way the
encrypted partition works in the first place.

I think the difference would be user convenience more then security.

>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> swsusp-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/swsusp-devel
>
> +++++++++++++++++++++++++++++++++++++++++++
> This Mail Was Scanned By Mail-seCure System
> at the Tel-Aviv University CC.
>

2004-03-27 20:03:38

by Luke-Jr

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Saturday 27 March 2004 07:50 pm, Micha Feigin wrote:
> If the key is given at resume command line and this is properly
> forgotten when the resumed kernel kicks in then a user key will also
> probably be ok.
The resume command line is usually stored on the same disk as the image in a
configuration file.

2004-03-27 21:07:43

by Michael Frank

[permalink] [raw]
Subject: Paranoia is fun [Was Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]]

This thread mutates fast :)

On Sat, 27 Mar 2004 20:03:35 +0000, Luke-Jr <[email protected]> wrote:

> On Saturday 27 March 2004 07:50 pm, Micha Feigin wrote:
>> If the key is given at resume command line and this is properly
>> forgotten when the resumed kernel kicks in then a user key will also
>> probably be ok.
> The resume command line is usually stored on the same disk as the image in a
> configuration file.
>

... so one really would not want to put the key there.

Each and every shortcut is unsafe as it somwhere has to store the
full key and could be reverse engineered and broken "easily"
relative to breaking the key.

Guess Micha meant to edit the resume command line prior to
boot, which would work at this time.

The only "safe" way is to enter the key when prompted
For references Google for cryptoswap, loop-aes, cryptoapi

Also resuming kernel md5 checksum should flow into the key to
prevent some schlaphut replacing the kernel. (I know that
it would be hard to make addresses match, but still easier
than breaking the key). So, this is really important.

It was discussed to pass the resume command line on to the
resumed kernel for config, in which case the key should be
stripped prior to doing so.

Michael

P.S.

I say "safe" because it is safe only as long as noone can observe
key entry or touch the machine to install a (keyboard) bug...

BTW, When did we look last into our keyboards and are we
sure there are no spare chips (bugs) planted in our machines ;)

Well, perhaps we need linux computers implanted into our teeth
so that we can be "more" safe. The key could be transmitted
by tongue using (morse) code however (unauthorized) third party
objects must be prevented from entering the mouth to prevent spying.

Well, should I mention what could be hidden _on_ those bloody chips.

Have a nice day.

2004-03-27 21:30:04

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!

On So 27-03-04 03:37:48, Luke-Jr wrote:
> On Saturday 27 March 2004 02:08 am, Nigel Cunningham wrote:
> > On Sat, 2004-03-27 at 10:22, Pavel Machek wrote:
> > > You are right, that would be ugly. How is encryption supposed to work,
> > > kernel asks you to type in a key?
> >
> > I haven't thought about the specifics there. Perhaps the plugin prompts
> > for one, or perhaps it takes a lilo parameter?
> The only purpose I can think of for encryption would be so someone can't grab
> the HD and boot it on another PC or read the image directly.
> Unless I'm missing something, that would imply that the key would need to be
> generated from a hardware profile (only creatable by root) somehow to
> restrict its readability to that one system.

Hmm, I do not see how hardware hash helps. When I can steal your hdd,
there's good chance I can steal whole machine, too.
Pavel
--
When do you have a heart between your knees?
[Johanka's followup: and *two* hearts?]

2004-03-27 21:40:41

by Luke-Jr

[permalink] [raw]
Subject: Re: Paranoia is fun [Was Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]]

On Saturday 27 March 2004 09:01 pm, Michael Frank wrote:
> ... so one really would not want to put the key there.
Right.
>
> Each and every shortcut is unsafe as it somwhere has to store the
> full key and could be reverse engineered and broken "easily"
> relative to breaking the key.
If the key is based on hardware details that only root can obtain, it would
require at least having the time to boot the victim computer into another OS
to create the key. If the key is also dependant on details of the running
kernel, it could be even harder to crack it.

Password-based encryption might be wanted for certain cases, but I think most
cases would do fine to prevent the image from being used for anything except
restoring on the original system. That way, it would be significantly more
difficult for someone to gain access to the memory that could be used for
other encrypted things (such as GPG key generation).

2004-03-28 00:27:19

by Micha

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

On Sat, Mar 27, 2004 at 10:29:46PM +0100, Pavel Machek wrote:
> Hi!
>
> On So 27-03-04 03:37:48, Luke-Jr wrote:
> > On Saturday 27 March 2004 02:08 am, Nigel Cunningham wrote:
> > > On Sat, 2004-03-27 at 10:22, Pavel Machek wrote:
> > > > You are right, that would be ugly. How is encryption supposed to work,
> > > > kernel asks you to type in a key?
> > >
> > > I haven't thought about the specifics there. Perhaps the plugin prompts
> > > for one, or perhaps it takes a lilo parameter?
> > The only purpose I can think of for encryption would be so someone can't grab
> > the HD and boot it on another PC or read the image directly.
> > Unless I'm missing something, that would imply that the key would need to be
> > generated from a hardware profile (only creatable by root) somehow to
> > restrict its readability to that one system.
>
> Hmm, I do not see how hardware hash helps. When I can steal your hdd,
> there's good chance I can steal whole machine, too.
> Pavel

Which is probably much more probable. I have to admit I haven't heard
much of hard disk theft, but quite a bit of laptop theft, and when
swsusp comes to mind in this respect I would think the laptop thingy is
more of a problem.

> --
> When do you have a heart between your knees?
> [Johanka's followup: and *two* hearts?]
>
>
> -------------------------------------------------------
> This SF.Net email is sponsored by: IBM Linux Tutorials
> Free Linux tutorial presented by Daniel Robbins, President and CEO of
> GenToo technologies. Learn everything from fundamentals to system
> administration.http://ads.osdn.com/?ad_id=1470&alloc_id=3638&op=click
> _______________________________________________
> swsusp-devel mailing list
> [email protected]
> https://lists.sourceforge.net/lists/listinfo/swsusp-devel
>
> +++++++++++++++++++++++++++++++++++++++++++
> This Mail Was Scanned By Mail-seCure System
> at the Tel-Aviv University CC.
>

2004-03-29 12:56:30

by Pavel Machek

[permalink] [raw]
Subject: Re: -nice tree [was Re: [Swsusp-devel] Re: swsusp problems [was Re: Your opinion on the merge?]]

Hi!
> > Kernel could automagically select the right one.. But I'd prefer for
> > only "non compressed" part to reach mainline for 2.6. Feature freeze
...
> You shouldn't be to extreme in you eagerness to trim things. You should
> always take in mind what would be good for the user, not only the
> developer.

We are trying to merge into stable series. "Good for developer"
takes priority here. And likely patching compression in is easier
than patching swsusp2+compression....
--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms

2004-03-29 13:00:42

by Pavel Machek

[permalink] [raw]
Subject: fast compressed fs (was Re: -nice tree)

Hi!

> > How?
>
> Run two parallel tasks: 1. write pages to disk by queuing the I/Os;
> 2. compress unqueued pages.
>
> The first task won't use much CPU because it's always waiting for disk
> DMAs to complete. While it sleeps, the second task runs.
> Alternatively this can be implemented using polling for I/O
> completions in the second task, if that's easier.
>
> The first task should keep the I/O queue full enough to sustain
> writing, but not much fuller than that. Either a fixed queue length
> will be fine, or it is easy to adjust the queue length dynamically by
> enlarging it if any I/O completions occur when the queue is empty.
>
> The second task consumes uncompressed pages and makes available
> compressed pages.
>
> When the first task wants to queue more pages for I/O, it first checks
> the compressed-page list. If there are any, they are queued,
> otherwise it consumes uncompressed pages.
>
> This will automatically converge on an optimal balance between
> compressed and uncompressed page writing, provided the disk is using
> DMA, which they do on all modern system.
>
> This is actually better than fully enabling or disabling compression.

Very clever. This even could be used for a filesystem to make writes faster...
Hmm, compressed fs and *faster* than normal... sounds good.

--
64 bytes from 195.113.31.123: icmp_seq=28 ttl=51 time=448769.1 ms