From: Eric Dumazet
To: Evgeniy Polyakov
Cc: Johann Borck, Ulrich Drepper, lkml, David Miller, Andrew Morton, netdev, Zach Brown, Christoph Hellwig, Chase Venters
Subject: Re: [take19 1/4] kevent: Core files.
Date: Tue, 17 Oct 2006 15:19:36 +0200

On Tuesday 17 October 2006 12:39, Evgeniy Polyakov wrote:
> I can add such a notification, but its existence _is_ the broken design.
> After such a condition happens, all new events will disappear from the
> mapped buffer (although they are still accessible through the usual queue).
>
> While writing this I have come to an idea of how to improve the case of
> the size of the mapped buffer - we can make it of limited size, and when
> it is full, a bit will be set in the shared area and obviously no new
> events can be added there; but when the user commits some events from that
> buffer (i.e. tells the kernel that the appropriate kevents can be freed or
> requeued according to their flags), new ready events from the ready queue
> can be copied into the mapped buffer.
>
> It still does not solve (and I do insist that it is broken behaviour)
> the case where the kernel is going to generate an infinite number of
> events for one requested by userspace (as in the case of generating a new
> 'data_has_arrived' event when a new byte has been received).

The behavior is not broken. It is quite useful and works 99.9999% of the time.

I was trying to suggest this, but you missed my point. You don't want to use
a bit, but a full 32-bit sequence counter. A program may handle XXX.XXX
handles, but use a 4096-entry ring buffer 'only'.

The user program keeps a local copy of a special word named
'ring_buffer_full_counter'.

Each time the kernel cannot queue an event in the ring buffer, it increments
the 'ring_buffer_full_counter' (exported to the user app in the mmap view).

When the user application notices that the kernel changed
'ring_buffer_full_counter', it does a full scan of all file handles
(preferably using poll() to get all the relevant info in one syscall):

	do {
		if ((fd = read_event_from_mmap()) >= 0) {
			handle_event(fd);
			continue;
		}
		/* ring buffer is empty, check if we missed some events */
		if (unlikely(mmap->ring_buffer_full_counter != my_ring_buffer_full_counter)) {
			my_ring_buffer_full_counter = mmap->ring_buffer_full_counter;
			/* slow path */
			/* can use a big poll() for example, or just a loop without poll() */
			for_all_file_desc_do() {
				/* check if some event/data is waiting on THIS fd */
			}
		} else
			syscall_wait_for_one_available_kevent(queue);
	} while (1);

This is how a program can recover. If the ring buffer has a reasonable size,
this kind of event should not happen very frequently.
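To make the scheme concrete, here is a minimal user-space sketch of that
loop in C. The mmap'ed header layout (struct kevent_mmap_hdr) and the
helpers read_event_from_mmap(), handle_event(), rescan_all_fds() and
wait_for_one_kevent() are hypothetical names chosen for illustration; they
are not part of the actual kevent ABI:

	#include <stdint.h>

	/* Assumed layout of the mmap'ed shared area: the kernel bumps
	 * ring_buffer_full_counter each time it fails to queue an event
	 * because the ring is full. */
	struct kevent_mmap_hdr {
		volatile uint32_t ring_buffer_full_counter;
		/* ... ring entries would follow ... */
	};

	/* Hypothetical application hooks. */
	extern int read_event_from_mmap(struct kevent_mmap_hdr *hdr); /* ready fd, or -1 */
	extern void handle_event(int fd);
	extern void rescan_all_fds(void);      /* slow path: scan every fd */
	extern void wait_for_one_kevent(void); /* blocking syscall wrapper */

	static uint32_t my_ring_buffer_full_counter;

	void event_loop(struct kevent_mmap_hdr *hdr)
	{
		for (;;) {
			int fd = read_event_from_mmap(hdr);
			if (fd >= 0) {
				handle_event(fd);
				continue;
			}
			/* Ring is empty: did the kernel drop events while
			 * the ring was full since we last looked? */
			uint32_t seq = hdr->ring_buffer_full_counter;
			if (seq != my_ring_buffer_full_counter) {
				my_ring_buffer_full_counter = seq;
				rescan_all_fds();   /* recover missed events */
			} else {
				wait_for_one_kevent();
			}
		}
	}

Presumably the advantage of a 32-bit counter over a single bit is that the
user side never has to clear it: it only compares against its local copy,
so there is no set/clear race with the kernel.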
If it does (because events continue to fill the ring buffer during recovery
and it might hit FULL again), maybe a smart program is able to resize the
ring buffer, and start using it after yet another recovery pass. If not, we
don't care, because a big poll() gives us many ready file descriptors in one
syscall, and maybe this is much better than kevent/epoll when XX.XXX events
are ready.

Eric
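P.S. For completeness, the "big poll()" slow path could look like the
sketch below; the fds[]/nfds table is assumed bookkeeping maintained by the
application as descriptors are opened and closed:

	#include <poll.h>

	extern void handle_event(int fd);

	/* One entry per tracked descriptor, .events = POLLIN (assumed). */
	extern struct pollfd fds[];
	extern nfds_t nfds;

	void rescan_all_fds(void)
	{
		/* Timeout 0: collect everything that is ready right now,
		 * in a single syscall, without blocking. */
		int n = poll(fds, nfds, 0);

		for (nfds_t i = 0; n > 0 && i < nfds; i++) {
			if (fds[i].revents) {
				handle_event(fds[i].fd);
				n--;
			}
		}
	}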