[udig-devel] Re: [Geotools-devel] RFC: paged access to Features
Chris Holmes
cholmes at openplans.org
Wed Aug 2 13:49:00 PDT 2006
Looks like the proper param names are:
startPosition and maxRecords
So I'd recommend using those.
I'll work on getting a change request, as we're OGC members now.
Chris
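
For illustration, here is a minimal sketch of how a fromIndex/pageSize pair
from the proposal below could be carried under the names startPosition and
maxRecords. PagingParams and its methods are hypothetical, not an existing
GeoTools or OGC class, and a 1-based startPosition is an assumption.

```java
/** Hypothetical paging parameters; not an existing GeoTools or OGC class. */
final class PagingParams {
    /** Position of the first record to return; assumed to be 1-based. */
    final int startPosition;
    /** Maximum number of records to return in one response. */
    final int maxRecords;

    PagingParams(int startPosition, int maxRecords) {
        if (startPosition < 1) {
            throw new IllegalArgumentException("startPosition must be >= 1");
        }
        if (maxRecords < 0) {
            throw new IllegalArgumentException("maxRecords must be >= 0");
        }
        this.startPosition = startPosition;
        this.maxRecords = maxRecords;
    }

    /** Converts a zero-based fromIndex/pageSize pair to the proposed names. */
    static PagingParams fromPage(int fromIndex, int pageSize) {
        return new PagingParams(fromIndex + 1, pageSize);
    }

    /** Renders the parameters as key-value pairs for a request URL. */
    String toKvp() {
        return "startPosition=" + startPosition + "&maxRecords=" + maxRecords;
    }
}
```

With these assumptions, PagingParams.fromPage(20, 10).toKvp() asks for ten
records starting at the twenty-first one.
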
Javier de la Torre wrote:
> Yes! :)
>
> +10
>
> And if uDig supports it then that would be great... I will try to
> contact the people from gvSIG and see if they can work on supporting
> this too.
>
> If you then want to submit it to OGC, I can provide you with some
> scenarios where this is a MUST, especially if what you want to do is
> analysis with the data.
>
> Thanks.
>
> Javier.
>
> On 01/08/2006, at 22:40, Chris Holmes wrote:
>
>> I know Javier would love to have this feature exposed in WFS.
>>
>> I'm more or less +1. I'd say check out the C-SW 2 spec if you haven't
>> already as I believe they do paging, and you should use the same names
>> as they do. We could probably expose through WFS as an extension of
>> some sort, and submit a change request as well.
>>
>> Chris
>>
>> Gabriel Roldán wrote:
>>> Jesse and I were ranting about adding to the GeoTools DataStore API
>>> the ability to be asked for a page of data/features, rather than
>>> the whole dataset.
>>> That would immensely improve (and actually allow) some common
>>> scenarios like presenting a result set in tabular form.
>>> On the client side (uDig in this case, but could be JSF, swing,
>>> whatever), it is easy to, for example, implement a lazy List, as long
>>> as the underlying data api allows for pagination.
>>> Doing so in geotools would be actually easy!
>>> No API change would be needed beyond adding two fields to Query:
>>> fromIndex and pageSize. OrderBy is already present in the Filter spec.
>>> Here is the conversation; I hope to get some comments.
>>> Gabriel
>>> -----------------------
>>> [18:40:00] Gabriel Roldán was thinking about what would be needed on
>>> the geotools front in order to make TableView lazily loaded. On the
>>> uDig front it is easy; the problem would be the insufficient geotools
>>> Data api
>>> [18:40:50] … I thought the DataStore api could be augmented in a
>>> similar fashion as the catalog queries are done, aka, you can request
>>> by "pages" of data
>>> [18:40:56] Jesse Eichar I think Jody envisioned using the FeatureList
>>> API
>>> [18:41:01] … you can get all the fids.
>>> [18:41:16] … then using the FeatureList API load up the features by fid
>>> [18:42:27] Gabriel Roldán I mean, on the uDig side you can use
>>> FeatureList for sure. The lazy loading strategy would be transparent
>>> for uDig, but still the datastore api should be more friendly
>>> [18:42:35] … getting all the fids could still be a pain
>>> [18:43:02] Jesse Eichar I see.
>>> [18:43:12] Gabriel Roldán an approach that's proven to work is
>>> getting the feature/element count and then request by pages of a user
>>> defined size
>>> [18:43:42] … (proven to work == I'm doing it on the catalog
>>> implementation)
>>> [18:43:54] Jesse Eichar I can't think of any problems with that right
>>> off.
>>> [18:44:25] … Where would that fit in?
>>> [18:44:34] … WE have a feature collection
>>> [18:44:38] … feature list..
>>> [18:44:54] … would you add a method to feature store?
>>> [18:45:06] … New type of query?
>>> [18:45:10] Gabriel Roldán you should treat FeatureList as a normal
>>> list, use the get(int) or iterator() as normal
>>> [18:45:38] … the featurelist impl uses a paging strategy to retrieve
>>> content on the back
>>> [18:46:19] … provided that you pass it the Query and the list size in
>>> the constructor, for example
>>> [18:46:58] Jesse Eichar Makes sense. Would you add a new method in
>>> FeatureSource? getFeatures(Query, pageSize)
>>> [18:46:57] … ?
>>> [18:47:18] … or make a new type of Query that has that information.
>>> [18:47:29] Gabriel Roldán getFeature(Query, startIndex, pageSize)
>>> [18:48:26] Jesse Eichar I think I prefer that too. But could be hard
>>> to get it to fly because it will eventually require negotiation with
>>> GeoAPI.
>>> [18:49:11] Gabriel Roldán Well, could encapsulate it in Query as
>>> well, after all Query _is_ a parameter object
>>> [18:50:14] … and I guess something like that could certainly be in
>>> future versions of the wfs spec at least, since they already recognized
>>> the problem and designed it for catalog 2.0
>>> [18:50:57] Jesse Eichar I didn't know that.
>>> [18:53:52] Gabriel Roldán now, implementing getFeatures(Query, from,
>>> size) or whatever would have its implications. Some RDBMS backed
>>> datastores could manage it easily I guess
>>> [18:54:32] Jesse Eichar It does.
>>> [18:54:34] … WFS for example
>>> [18:55:16] … The problem is that things aren't inherently ordered
>>> [18:55:46] … so index 3 could (at least theoretically) be a different
>>> item between calls.
>>> [18:56:31] … WFS 1.1 I think has some sort-by functionality
>>> but 1.0 doesn't
>>> [18:57:49] … For that one it makes sense to get all fids in the
>>> query. Shapefile and other file based ones I think will be ok.
>>> [18:58:13] Gabriel Roldán just 1'
>>> [18:58:39] Jesse Eichar sure
>>> [19:00:34] Gabriel Roldán sorry, had a phone call
>>> [19:00:42] Jesse Eichar np
>>> [19:00:44] Gabriel Roldán you're completely right
>>> [19:01:10] … so a requirement would be an order being explicitly set
>>> in the Query
>>> [19:01:37] Jesse Eichar For WFS we could obtain all fids and manage
>>> the paging on the client, at least until 1.1.
>>> [19:01:39] Gabriel Roldán what I'm doing in catalog is ordering by ID
>>> if the Query has no orderBy
>>> [19:02:05] Jesse Eichar we can order by fid if not specified.
>>> [19:02:14] … seems reasonable.
>>> [19:02:19] Gabriel Roldán problem with fids is that getting two
>>> million fids could still be quite killer
>>> [19:03:07] Jesse Eichar I know it. I'm open to suggestions...
>>> [19:03:09] Gabriel Roldán which raises another concern I was
>>> thinking about
>>> [19:03:41] … I know we've defined feature ids to be String so as to be
>>> friendly with the WFS spec
>>> [19:03:58] … still it doesn't make much sense on the pure java side of
>>> things
>>> [19:04:10] … I would like to see FID as an interface
>>> [19:04:33] … so implementors could optimize as needed, instead of
>>> creating millions of strings by prepending the feature type name, etc
>>> [19:05:07] … but that's another concern, I tend to ramble :P
>>> [19:05:16] Jesse Eichar :D You best jump on the FM discussion with
>>> that. I don't think that's going to happen too soon.
>>> [19:05:34] Gabriel Roldán yeah, I guess so
>>> [19:06:11] Jesse Eichar But back to the point. I'm completely in
>>> agreement with you on the Paging requirement.
>>> [19:06:30] … I'd be happy to do some of the implementations for you.
>>> [19:06:49] Gabriel Roldán cool, that kind of strategy works just
>>> great for presenting huge amounts of tabular data in other domains
>>> [19:06:56] … so it should work for us too
>>> [19:07:19] Jesse Eichar I think it has to be done. It's impossible to
>>> deal with this amount of data otherwise.
>>> [19:07:47] Gabriel Roldán I like the idea that the FeatureSource
>>> interface doesn't need to be touched
>>> [19:07:53] … just Query
>>> [19:08:19] Jesse Eichar It'll be much quicker and easier to get it
>>> integrated with geotools that way. Pretty clean too.
>>> [19:08:19] Gabriel Roldán we already have order by in Filter, so just
>>> from and page size are needed
>>> [19:09:12] … note that FeatureCollection.size() still should return
>>> the whole query size (aka, hits), and not the page size
>>> [19:09:31] Jesse Eichar Yes.
>>> [19:09:47] Gabriel Roldán in that case I'm wondering what's the
>>> easiest way of knowing when you're done fetching content
>>> [19:09:51] Jesse Eichar and get() can get any feature, not just those
>>> in the current page (for FeatureList)
>>> [19:09:57] Gabriel Roldán other than by requiring client code to use a
>>> counter
>>> [19:10:02] … sure
>>> [19:10:32] … I did that for presenting catalog results using Java
>>> Server Faces and it works great
>>> [19:10:57] … a custom list impl that queries the required page of
>>> data if the index isn't on the current page
>>> [19:11:38] Jesse Eichar Do you cache the fetched features so you
>>> don't have to get them more than once? or get a fresh copy each time?
>>> [19:12:36] Gabriel Roldán in the catalog case I fetch the whole page
>>> onto memory. In our case I guess we could be even smarter and
>>> maintain the streamed nature of stuff even on the pages
>>> [19:13:12] … not sure if I'm explaining me well enough
>>> [19:13:25] Jesse Eichar You're doing fine
>>> [19:16:02] Gabriel Roldán cool, do you mind if I post this to the list?
>>> [19:16:15] Jesse Eichar Not at all
>>> [19:16:26] Gabriel Roldán better said, do you think it is something
>>> that's worth posting?
>>> [19:16:37] Jesse Eichar haha
>>> [19:16:44] … Yeah I think it should be.
>>> [19:16:56] … People will have comments for sure.
>>> [19:17:34] Gabriel Roldán nice, forgot to jump on the geoserver irc
>>> meeting, ugh
>>> [19:17:49] Jesse Eichar shoot me too
>>
>> --Chris Holmes
>> The Open Planning Project
>> http://topp.openplans.org
>
>
>
--
Chris Holmes
The Open Planning Project
http://topp.openplans.org
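
The lazy list described in the chat above (get the total count once, then
fetch a page whenever the requested index falls outside the current page)
could be sketched roughly as follows. PageSource and LazyPagedList are
hypothetical names, not GeoTools API. The sketch assumes the source returns
rows in a stable order, for example ordered by feature id when the query
sets no order, and it keeps only one page in memory.

```java
import java.util.AbstractList;
import java.util.Collections;
import java.util.List;

/** Hypothetical paged back end; a real one would wrap a Query and a FeatureSource. */
interface PageSource<T> {
    /** Total number of items matched by the query (the "hits"). */
    int count();

    /** Returns up to pageSize items starting at the zero-based fromIndex. */
    List<T> fetch(int fromIndex, int pageSize);
}

/** Read-only list that loads one page at a time from a PageSource. */
final class LazyPagedList<T> extends AbstractList<T> {
    private final PageSource<T> source;
    private final int pageSize;
    private final int size;
    private int pageStart = -1;                 // -1 means no page loaded yet
    private List<T> page = Collections.emptyList();

    LazyPagedList(PageSource<T> source, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        this.source = source;
        this.pageSize = pageSize;
        this.size = source.count();             // whole query size, not page size
    }

    @Override
    public T get(int index) {
        if (index < 0 || index >= size) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        boolean onCurrentPage =
                pageStart >= 0 && index >= pageStart && index < pageStart + page.size();
        if (!onCurrentPage) {
            // Align the request to a page boundary and replace the cached page.
            pageStart = (index / pageSize) * pageSize;
            page = source.fetch(pageStart, pageSize);
        }
        return page.get(index - pageStart);
    }

    @Override
    public int size() {
        return size;
    }
}
```

Client code can then treat the list as a normal List and use get(int) or
iterator(); a table view that only renders visible rows triggers a fetch
only when it scrolls onto a new page.
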