On 2013-02-16 14:53:43 +1030, Rusty Russell wrote:
> Andrew Morton <[email protected]> writes:
> > On Fri, 15 Feb 2013 18:13:04 -0500
> > Johannes Weiner <[email protected]> wrote:
> >> I dunno. The byte vector might not be optimal but its worst cases
> >> seem more attractive, is just as extensible, and dead simple to use.
> >
> > But I think "which pages from this 4TB file are in core" will not be an
> > uncommon usage, and writing a gig of memory to find three pages is just
> > awful.
>
> Actually, I don't know of any usage for this call.
[months later, catching up]
I do. Postgres could really use something like that for making saner
assumptions about the cost of doing an index/heap scan. Postgres doesn't
use mmap(), and mmap()ing larger files into memory isn't all that cheap
(32bit...), so having fincore would be nice.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services
On Wed, May 29, 2013 at 04:53:12PM +0200, Andres Freund wrote:
> On 2013-02-16 14:53:43 +1030, Rusty Russell wrote:
> > Andrew Morton <[email protected]> writes:
> > > On Fri, 15 Feb 2013 18:13:04 -0500
> > > Johannes Weiner <[email protected]> wrote:
> > >> I dunno. The byte vector might not be optimal but its worst cases
> > >> seem more attractive, is just as extensible, and dead simple to use.
> > >
> > > But I think "which pages from this 4TB file are in core" will not be an
> > > uncommon usage, and writing a gig of memory to find three pages is just
> > > awful.
> >
> > Actually, I don't know of any usage for this call.
>
> [months later, catching up]
>
> I do. Postgres' could really use something like that for making saner
> assumptions about the cost of doing an index/heap scan. postgres doesn't
> use mmap() and mmaping larger files into memory isn't all that cheap
> (32bit...) so having fincore would be nice.
How much of the areas you want to use it against are usually cached?
I.e. are those 4TB files with 3 cached pages?
I do wonder if we should just have two separate interfaces. Ugly, but
I don't really see how the two requirements (dense but many holes
vs. huge sparse areas) could be acceptably met with one interface.
On 2013-05-29 13:32:23 -0400, Johannes Weiner wrote:
> On Wed, May 29, 2013 at 04:53:12PM +0200, Andres Freund wrote:
> > On 2013-02-16 14:53:43 +1030, Rusty Russell wrote:
> > > Andrew Morton <[email protected]> writes:
> > > > On Fri, 15 Feb 2013 18:13:04 -0500
> > > > Johannes Weiner <[email protected]> wrote:
> > > >> I dunno. The byte vector might not be optimal but its worst cases
> > > >> seem more attractive, is just as extensible, and dead simple to use.
> > > >
> > > > But I think "which pages from this 4TB file are in core" will not be an
> > > > uncommon usage, and writing a gig of memory to find three pages is just
> > > > awful.
> > >
> > > Actually, I don't know of any usage for this call.
> >
> > [months later, catching up]
> >
> > I do. Postgres' could really use something like that for making saner
> > assumptions about the cost of doing an index/heap scan. postgres doesn't
> > use mmap() and mmaping larger files into memory isn't all that cheap
> > (32bit...) so having fincore would be nice.
> How much of the areas you want to use it against is usually cached?
> I.e. are those 4TB files with 3 cached pages?
Hard to say in general. The point is exactly that we don't know. If
there's nothing of a large index in memory and we estimate that we want
20% of a table, we sure won't do an index scan. If it's all in memory?
Different story.
For that use case it's not actually important that we get a 100% accurate
result, although, from my limited understanding, I don't really see that
helping much.
(Yes, there are some problems with cache warming here)
> I do wonder if we should just have two separate interfaces. Ugly, but
> I don't really see how the two requirements (dense but many holes
> vs. huge sparse areas) could be acceptably met with one interface.
The difference would be how the information is encoded, right? I'm not
sure how the passed-in memory could be sized in some run-length-encoded
scheme. What I could imagine is specifying the granularity we want
information at, but that's probably too specific.
Greetings,
Andres Freund
--
Andres Freund http://www.2ndQuadrant.com/
PostgreSQL Development, 24x7 Support, Training & Services