[udig-devel] Re: [Geotools-devel] RFC: paged access to Features

Chris Holmes cholmes at openplans.org
Wed Aug 2 13:49:00 PDT 2006


Looks like the proper param names are:
  startPosition and maxRecords

So I'd recommend using those.
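
For illustration, here is a minimal sketch of how those two names could sit on a Query-like parameter object. Everything in it is hypothetical: PagedQuery and PagingSketch are made-up names, not GeoTools classes, startPosition is assumed to be 1-based (as I understand C-SW GetRecords to define it), and features are stood in for by their fid strings. It also falls back to ordering by fid when no order is given, so that a given position refers to the same item between calls.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.List;

/** Hypothetical parameter object; NOT the actual GeoTools Query class. */
final class PagedQuery {
    final int startPosition;          // assumed 1-based position of the first record
    final int maxRecords;             // maximum number of records in the page
    final Comparator<String> orderBy; // null means "no explicit order requested"

    PagedQuery(int startPosition, int maxRecords, Comparator<String> orderBy) {
        if (startPosition < 1) {
            throw new IllegalArgumentException("startPosition must be >= 1");
        }
        if (maxRecords < 0) {
            throw new IllegalArgumentException("maxRecords must be >= 0");
        }
        this.startPosition = startPosition;
        this.maxRecords = maxRecords;
        this.orderBy = orderBy;
    }
}

final class PagingSketch {
    /** Returns one page of fids; sorts by fid when the query has no order. */
    static List<String> page(List<String> fids, PagedQuery q) {
        Comparator<String> order =
                q.orderBy != null ? q.orderBy : Comparator.<String>naturalOrder();
        List<String> sorted = new ArrayList<>(fids);
        sorted.sort(order);
        // Clamp both ends so a page past the end is simply empty.
        int from = Math.min(q.startPosition - 1, sorted.size());
        int to = (int) Math.min((long) from + q.maxRecords, sorted.size());
        return new ArrayList<>(sorted.subList(from, to));
    }

    public static void main(String[] args) {
        List<String> fids =
                Arrays.asList("roads.3", "roads.1", "roads.5", "roads.2", "roads.4");
        System.out.println(page(fids, new PagedQuery(1, 2, null)));
        System.out.println(page(fids, new PagedQuery(3, 2, null)));
        System.out.println(page(fids, new PagedQuery(5, 2, null)));
    }
}
```

A real datastore would push the ordering and the offset down to the backend rather than sorting in memory; the in-memory sort is only there to keep the sketch self-contained.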

I'll work on getting a change request, as we're OGC members now.

Chris

Javier de la Torre wrote:
> Yes! :)
> 
> +10
> 
> And if uDig supports it then that would be great... I will try to 
> contact the people from gvSIG and see if they can work on supporting 
> this too.
> 
> If you then want to submit it to OGC, I can provide you with some 
> scenarios where this is a MUST, especially if what you want to do is 
> analysis with the data.
> 
> Thanks.
> 
> Javier.
> 
> On 01/08/2006, at 22:40, Chris Holmes wrote:
> 
>> I know Javier would love to have this feature exposed in WFS.
>>
>> I'm more or less +1.  I'd say check out the C-SW 2 spec if you haven't 
>> already as I believe they do paging, and you should use the same names 
>> as they do.  We could probably expose through WFS as an extension of 
>> some sort, and submit a change request as well.
>>
>> Chris
>>
>> Gabriel Roldán wrote:
>>> Jesse and I were ranting about adding to the GeoTools DataStore API 
>>> the ability to be asked for a page of data/features, rather than 
>>> the whole dataset.
>>> That would immensely improve (and actually allow) some common 
>>> scenarios like presenting a result set in tabular form.
>>> On the client side (uDig in this case, but could be JSF, swing, 
>>> whatever), it is easy to, for example, implement a lazy List, as long 
>>> as the underlying data api allows for pagination.
>>> Doing so in geotools would be actually easy!
>>> No API change would be needed beyond adding two fields to Query:
>>> fromIndex and pageSize. OrderBy is already present in the Filter spec.
>>> here is the conversation, hope to get some comments.
>>> Gabriel
>>> -----------------------
>>> [18:40:00]Gabriel Roldán was thinking what should be needed on the 
>>> geotools front in order to make TableView lazily loaded. On the uDig 
>>> front it is easy, problem would be with the insufficient geotools 
>>> Data api
>>> [18:40:50] … I thought the DataStore api could be augmented in a 
>>> similar fashion as the catalog queries are done, aka, you can request 
>>> by "pages" of data
>>> [18:40:56] Jesse Eichar I think Jody envisioned using the FeatureList 
>>> API
>>> [18:41:01] … you can get all the fids.
>>> [18:41:16] … then using the FeatureList API load up the features by fid
>>> [18:42:27] Gabriel Roldán I mean, on the uDig side you can use 
>>> FeatureList for sure. The lazy loading strategy would be transparent 
>>> for uDig, but still the datastore api should be more friendly
>>> [18:42:35] … getting all the fids could still be a pain
>>> [18:43:02] Jesse Eichar I see.
>>> [18:43:12] Gabriel Roldán an approach that's proven to work is 
>>> getting the feature/element count and then request by pages of a user 
>>> defined size
>>> [18:43:42] … (proven to work == I'm doing it on the catalog 
>>> implementation)
>>> [18:43:54] Jesse Eichar I can't think of any problems with that right 
>>> off.
>>> [18:44:25] … Where would that fit in?
>>> [18:44:34] … WE have a feature collection
>>> [18:44:38] … feature list..
>>> [18:44:54] … would you add a method to feature store?
>>> [18:45:06] … New type of query?
>>> [18:45:10] Gabriel Roldán you should treat FeatureList as a normal 
>>> list, use the get(int) or iterator() as normal
>>> [18:45:38] … the featurelist impl uses a paging strategy to retrieve 
>>> content on the back
>>> [18:46:19] … provided that you pass it the Query and the list size in 
>>> the constructor, for example
>>> [18:46:58] Jesse Eichar Makes sense.  Would you add a new method in 
>>> FeatureSource?  getFeatures(Query, pageSize)
>>> [18:46:57] … ?
>>> [18:47:18] … or make a new type of Query that has that information.
>>> [18:47:29] Gabriel Roldán getFeature(Query, startIndex, pageSize)
>>> [18:48:26] Jesse Eichar I think I prefer that too.  But could be hard 
>>> to get it to fly because it will eventually require negotiation with 
>>> GeoAPI.
>>> [18:49:11] Gabriel Roldán Well, could encapsulate it in Query as 
>>> well, after all Query _is_ a parameter object
>>> [18:50:14] … and I guess something like that could certainly be in 
>>> future versions of the WFS spec at least, since they already 
>>> recognized the problem and designed it for catalog 2.0
>>> [18:50:57] Jesse Eichar I didn't know that.
>>> [18:53:52] Gabriel Roldán now, implementing getFeatures(Query, from, 
>>> size) or whatever would have its implications. Some RDBMS backed 
>>> datastores could manage it easily I guess
>>> [18:54:32] Jesse Eichar It does.
>>> [18:54:34] … WFS for example
>>> [18:55:16] … The problem is that things aren't inherently ordered
>>> [18:55:46] … so index 3 could (at least theoretically) be a different 
>>> item between calls.
>>> [18:56:31] … WFS 1.1 I think has some sort-by functionality 
>>> but 1.0 doesn't
>>> [18:57:49] … For that one it makes sense to get all fids in the 
>>> query.  Shapefile and other file based ones I think will be ok.
>>> [18:58:13] Gabriel Roldán just 1'
>>> [18:58:39] Jesse Eichar sure
>>> [19:00:34] Gabriel Roldán sorry, had a phone call
>>> [19:00:42] Jesse Eichar np
>>> [19:00:44] Gabriel Roldán you're completely right
>>> [19:01:10] … so a requirement would be an order being explicitly set 
>>> in the Query
>>> [19:01:37] Jesse Eichar For WFS we could obtain all fids and manage 
>>> the paging on the client, at least until 1.1.
>>> [19:01:39] Gabriel Roldán what I'm doing in catalog is ordering by ID 
>>> if the Query has no orderBy
>>> [19:02:05] Jesse Eichar we can order by fid if not specified.
>>> [19:02:14] … seems reasonable.
>>> [19:02:19] Gabriel Roldán problem with fids is that getting two 
>>> million fids could still be quite killer
>>> [19:03:07] Jesse Eichar I know it.  I'm open to suggestions...
>>> [19:03:09] Gabriel Roldán which raises another concern I was 
>>> thinking about
>>> [19:03:41] … I know we've defined feature ids to be String as to be 
>>> friendly with the WFS spec
>>> [19:03:58] … still it doesn't make much sense on the pure java side 
>>> of things
>>> [19:04:10] … I would like to see FID as an interface
>>> [19:04:33] … so implementors could optimize as needed, instead of 
>>> creating millions of strings by prepending the feature type name, etc
>>> [19:05:07] … but that's another concern, I tend to ramble :P
>>> [19:05:16] Jesse Eichar :D You best jump on the FM discussion with 
>>> that.  I don't think that's going to happen too soon.
>>> [19:05:34] Gabriel Roldán yeah, I guess so
>>> [19:06:11] Jesse Eichar But back to the point.  I'm completely in 
>>> agreement with you on the Paging requirement.
>>> [19:06:30] … I'd be happy to do some of the implementations for you.
>>> [19:06:49] Gabriel Roldán cool, that kind of strategy works just 
>>> great for presenting huge amounts of tabular data in other domains
>>> [19:06:56] … so it should work for us too
>>> [19:07:19] Jesse Eichar I think it has to be done.  It's impossible to 
>>> deal with this amount of data otherwise.
>>> [19:07:47] Gabriel Roldán I like the idea that the FeatureSource 
>>> interface doesn't need to be touched
>>> [19:07:53] … just Query
>>> [19:08:19] Jesse Eichar It'll be much quicker and easier to get it 
>>> integrated with geotools that way.  Pretty clean too.
>>> [19:08:19] Gabriel Roldán we already have order by in Filter, so just 
>>> from and page size are needed
>>> [19:09:12] … note that FeatureCollection.size() still should return 
>>> the whole query size (aka, hits), and not the page size
>>> [19:09:31] Jesse Eichar Yes.
>>> [19:09:47] Gabriel Roldán in that case I'm wondering what's the 
>>> easiest way of knowing when you're done fetching content
>>> [19:09:51] Jesse Eichar and get() can get any feature, not just those 
>>> in the current page (for FeatureList)
>>> [19:09:57] Gabriel Roldán other than by requiring client code to use a 
>>> counter
>>> [19:10:02] … sure
>>> [19:10:32] … I did that for presenting catalog results using Java 
>>> Server Faces and works great
>>> [19:10:57] … a custom list impl that queries the required page of 
>>> data if the index isn't on the current page
>>> [19:11:38] Jesse Eichar Do you cache the fetched features so you 
>>> don't have to get them more than once?  or get a fresh copy each time?
>>> [19:12:36] Gabriel Roldán in the catalog case I fetch the whole page 
>>> onto memory. In our case I guess we could be even smarter and 
>>> maintain the streamed nature of stuff even on the pages
>>> [19:13:12] … not sure if I'm explaining myself well enough
>>> [19:13:25] Jesse Eichar You're doing fine
>>> [19:16:02] Gabriel Roldán cool, do you mind if I post this to the list?
>>> [19:16:15] Jesse Eichar Not at all
>>> [19:16:26] Gabriel Roldán better said, do you think it is something 
>>> that's worth posting?
>>> [19:16:37] Jesse Eichar haha
>>> [19:16:44] … Yeah I think it should be.
>>> [19:16:56] … People will have comments for sure.
>>> [19:17:34] Gabriel Roldán nice, forgot to jump on the geoserver irc 
>>> meeting, uhg
>>> [19:17:49] Jesse Eichar shoot me too
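
The "custom list impl that queries the required page of data if the index isn't on the current page" described above can be sketched roughly as follows. This is only an illustration under stated assumptions: PageFetcher and LazyPagedList are hypothetical names, not GeoTools or uDig API, the fetcher stands in for whatever paged DataStore call ends up existing, and it is assumed to return results in a stable order. As in the conversation, size() reports the total hit count rather than the page size, and get() can reach any index, not just the current page.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback standing in for a paged DataStore query. */
interface PageFetcher<T> {
    /** Returns up to pageSize items starting at the 0-based fromIndex. */
    List<T> fetch(int fromIndex, int pageSize);
}

/** Read-only list that keeps a single page in memory at a time. */
final class LazyPagedList<T> extends AbstractList<T> {
    private final PageFetcher<T> fetcher;
    private final int totalSize;
    private final int pageSize;
    private int currentPageStart = -1;            // -1 means no page loaded yet
    private List<T> currentPage = new ArrayList<>();
    int fetchCount = 0;                           // exposed only for the demo

    LazyPagedList(PageFetcher<T> fetcher, int totalSize, int pageSize) {
        if (pageSize < 1) {
            throw new IllegalArgumentException("pageSize must be >= 1");
        }
        this.fetcher = fetcher;
        this.totalSize = totalSize;
        this.pageSize = pageSize;
    }

    @Override
    public int size() {
        return totalSize; // total hits, not the size of the current page
    }

    @Override
    public T get(int index) {
        if (index < 0 || index >= totalSize) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        int pageStart = (index / pageSize) * pageSize;
        if (pageStart != currentPageStart) {
            // Requested index is outside the loaded page: fetch its page.
            currentPage = fetcher.fetch(pageStart, pageSize);
            currentPageStart = pageStart;
            fetchCount++;
        }
        return currentPage.get(index - pageStart);
    }
}

final class LazyPagedListDemo {
    public static void main(String[] args) {
        List<String> backing = new ArrayList<>();
        for (int i = 0; i < 10; i++) {
            backing.add("feature." + i);
        }
        LazyPagedList<String> list = new LazyPagedList<>(
                (from, size) -> new ArrayList<>(
                        backing.subList(from, Math.min(from + size, backing.size()))),
                backing.size(), 4);
        System.out.println(list.size());
        System.out.println(list.get(0) + " " + list.get(3));
        System.out.println(list.get(9));
        System.out.println("fetches: " + list.fetchCount);
    }
}
```

This version answers the caching question the simple way: it holds exactly one page and refetches when the index moves off it, so jumping back and forth between pages costs a fetch each time.
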
>>
>> --Chris Holmes
>> The Open Planning Project
>> http://topp.openplans.org
>> <cholmes.vcf>
> 
> 
> 

-- 
Chris Holmes
The Open Planning Project
http://topp.openplans.org