[udig-devel] Re: [Geotools-devel] RFC: paged access to Features

Chris Holmes cholmes at openplans.org
Tue Aug 1 13:40:40 PDT 2006


I know Javier would love to have this feature exposed in WFS.

I'm more or less +1.  I'd say check out the CS-W 2 spec if you haven't 
already as I believe they do paging, and you should use the same names 
as they do.  We could probably expose through WFS as an extension of 
some sort, and submit a change request as well.

Chris

Gabriel Roldán wrote:
> Jesse and I were ranting about adding to the GeoTools DataStore API the 
> ability to request a page of data/features, rather than the whole 
> dataset.
> That would immensely improve (and actually enable) some common scenarios like 
> presenting a result set in tabular form.
> 
> On the client side (uDig in this case, but could be JSF, swing, whatever), it 
> is easy to, for example, implement a lazy List, as long as the underlying 
> data api allows for pagination.
> 
> Doing so in geotools would be actually easy!
> 
> No API change would be needed beyond adding two fields to Query:
> fromIndex and pageSize. OrderBy is already present in the Filter spec.
> 
> here is the conversation, hope to get some comments.
> 
> Gabriel
> -----------------------
> [18:40:00] Gabriel Roldán was thinking about what would be needed on the 
> geotools front in order to make TableView lazily loaded. On the uDig front it 
> is easy, the problem would be with the insufficient geotools Data api
> [18:40:50] … I thought the DataStore api could be augmented in a similar 
> fashion as the catalog queries are done, aka, you can request by "pages" of 
> data
> [18:40:56] Jesse Eichar I think Jody envisioned using the FeatureList API
> [18:41:01] … you can get all the fids.
> [18:41:16] … then using the FeatureList API load up the features by fid
> [18:42:27] Gabriel Roldán I mean, on the uDig side you can use FeatureList for 
> sure. The lazy loading strategy would be transparent for uDig, but still the 
> datastore api should be more friendly
> [18:42:35] … getting all the fids could still be a pain
> [18:43:02] Jesse Eichar I see.
> [18:43:12] Gabriel Roldán an approach that's proven to work is getting the 
> feature/element count and then request by pages of a user defined size
> [18:43:42] … (proven to work == I'm doing it on the catalog implementation)
> [18:43:54] Jesse Eichar I can't think of any problems with that right off.
> [18:44:25] … Where would that fit in?
> [18:44:34] … WE have a feature collection
> [18:44:38] … feature list..
> [18:44:54] … would you add a method to feature store? 
> [18:45:06] … New type of query?
> [18:45:10] Gabriel Roldán you should treat FeatureList as a normal list, use 
> the get(int) or iterator() as normal
> [18:45:38] … the featurelist impl uses a paging strategy to retrieve content 
> on the back
> [18:46:19] … provided that you pass it the Query and the list size in the 
> constructor, for example
> [18:46:58] Jesse Eichar Makes sense.  Would you add a new method in 
> FeatureSource?  getFeatures(Query, pageSize)
> [18:46:57] … ?
> [18:47:18] … or make a new type of Query that has that information.
> [18:47:29] Gabriel Roldán getFeatures(Query, startIndex, pageSize)
> [18:48:26] Jesse Eichar I think I prefer that too.  But could be hard to get 
> it to fly because it will eventually require negotiation with GeoAPI.
> [18:49:11] Gabriel Roldán Well, could encapsulate it in Query as well, after 
> all Query _is_ a parameter object
> [18:50:14] … and I guess something like that could certainly be in future 
> versions of the WFS spec at least, since they already recognized the problem and 
> designed it for catalog 2.0
> [18:50:57] Jesse Eichar I didn't know that.
> [18:53:52] Gabriel Roldán now, implementing getFeatures(Query, from, size) or 
> whatever would have its implications. Some RDBMS backed datastores could 
> manage it easily I guess
> [18:54:32] Jesse Eichar It does.  
> [18:54:34] … WFS for example
> [18:55:16] … The problem is that things aren't inherently ordered
> [18:55:46] … so index 3 could (at least theoretically) be a different item 
> between calls.
> [18:56:31] … WFS 1.1 has some sort-by functionality I think, but 1.0 
> doesn't
> [18:57:49] … For that one it makes sense to get all fids in the query.  
> Shapefile and other file based ones I think will be ok.
> [18:58:13] Gabriel Roldán just 1'
> [18:58:39] Jesse Eichar sure
> [19:00:34] Gabriel Roldán sorry, had a phone call
> [19:00:42] Jesse Eichar np
> [19:00:44] Gabriel Roldán you're completely right
> [19:01:10] … so a requirement would be an order being explicitly set in the 
> Query
> [19:01:37] Jesse Eichar For WFS we could obtain all fids and manage the paging 
> on the client, at least until 1.1. 
> [19:01:39] Gabriel Roldán what I'm doing in catalog is ordering by ID if the 
> Query has no orderBy
> [19:02:05] Jesse Eichar we can order by fid if not specified.
> [19:02:14] … seems reasonable.
> [19:02:19] Gabriel Roldán problem with fids is that getting two million fids 
> could still be quite killer
> [19:03:07] Jesse Eichar I know it.  I'm open to suggestions...  
> [19:03:09] Gabriel Roldán which raises another concern I was thinking about
> [19:03:41] … I know we've defined feature ids to be String as to be friendly 
> with the WFS spec
> [19:03:58] … still it doesn't make much sense on the pure java side of things
> [19:04:10] … I would like to see FID as an interface
> [19:04:33] … so implementors could optimize as needed, instead of creating 
> millions of strings by prepending the feature type name, etc
> [19:05:07] … but that's another concern, I tend to ramble :P
> [19:05:16] Jesse Eichar :D You best jump on the FM discussion with that.  I 
> don't think that's going to happen too soon.
> [19:05:34] Gabriel Roldán yeah, I guess so
> [19:06:11] Jesse Eichar But back to the point.  I'm completely in agreement 
> with you on the Paging requirement.
> [19:06:30] … I'd be happy to do some of the implementations for you.
> [19:06:49] Gabriel Roldán cool, that kind of strategy works just great for 
> presenting huge amounts of tabular data in other domains
> [19:06:56] … so it should work for us too
> [19:07:19] Jesse Eichar I think it has to be done.  It's impossible to deal 
> with this amount of data otherwise.
> [19:07:47] Gabriel Roldán I like the idea that the FeatureSource interface 
> doesn't need to be touched
> [19:07:53] … just Query
> [19:08:19] Jesse Eichar It'll be much quicker and easier to get it integrated 
> with geotools that way.  Pretty clean too.  
> [19:08:19] Gabriel Roldán we already have order by in Filter, so just from and 
> page size are needed
> [19:09:12] … note that FeatureCollection.size() still should return the whole 
> query size (aka, hits), and not the page size
> [19:09:31] Jesse Eichar Yes.
> [19:09:47] Gabriel Roldán in that case I'm wondering what's the easiest way of 
> knowing when you're done fetching content
> [19:09:51] Jesse Eichar and get() can get any feature, not just those in the 
> current page (for FeatureList)
> [19:09:57] Gabriel Roldán other than by requiring client code to use a counter
> [19:10:02] … sure
> [19:10:32] … I did that for presenting catalog results using Java Server Faces 
> and works great
> [19:10:57] … a custom list impl that queries the required page of data if the 
> index isn't on the current page
> [19:11:38] Jesse Eichar Do you cache the fetched features so you don't have to 
> get them more than once?  or get a fresh copy each time?
> [19:12:36] Gabriel Roldán in the catalog case I fetch the whole page onto 
> memory. In our case I guess we could be even smarter and maintain the 
> streamed nature of stuff even on the pages
> [19:13:12] … not sure if I'm explaining myself well enough
> [19:13:25] Jesse Eichar You're doing fine
> [19:16:02] Gabriel Roldán cool, do you mind if I post this to the list?
> [19:16:15] Jesse Eichar Not at all
> [19:16:26] Gabriel Roldán better said, do you think it is something that is 
> worth posting?
> [19:16:37] Jesse Eichar haha
> [19:16:44] … Yeah I think it should be.
> [19:16:56] … People will have comments for sure.
> [19:17:34] Gabriel Roldán nice, forgot to jump on the geoserver irc meeting, 
> ugh
> [19:17:49] Jesse Eichar shoot me too
> 
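
For reference, the lazy list described in the chat above (a custom list that 
queries the required page when the index isn't on the current page) could look 
roughly like the sketch below. This is only an illustration under assumptions: 
PageFetcher, LazyPagedList and fetchCount are made-up names, not GeoTools API. 
The fetcher is assumed to return items in a stable order (for example ordered 
by fid when the query has no explicit order), and size() reports the whole 
query size (hits), not the page size.

```java
import java.util.AbstractList;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical callback: returns up to pageSize items starting at fromIndex, in a stable order. */
interface PageFetcher<T> {
    List<T> fetch(int fromIndex, int pageSize);
}

/** Read-only list that keeps a single page in memory and fetches another one on demand. */
class LazyPagedList<T> extends AbstractList<T> {
    private final PageFetcher<T> fetcher;
    private final int totalSize;              // whole query size ("hits"), not the page size
    private final int pageSize;
    private int pageStart = -1;               // index of the first item of the cached page
    private List<T> page = new ArrayList<>(); // currently cached page
    int fetchCount = 0;                       // illustration only: number of backend requests

    LazyPagedList(PageFetcher<T> fetcher, int totalSize, int pageSize) {
        if (totalSize < 0) {
            throw new IllegalArgumentException("totalSize must not be negative");
        }
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        this.fetcher = fetcher;
        this.totalSize = totalSize;
        this.pageSize = pageSize;
    }

    @Override
    public T get(int index) {
        if (index < 0 || index >= totalSize) {
            throw new IndexOutOfBoundsException("index " + index);
        }
        // Start index of the page that contains the requested item.
        int wantedStart = (index / pageSize) * pageSize;
        if (wantedStart != pageStart) {
            // Not on the cached page: ask the backend for the page we need.
            page = fetcher.fetch(wantedStart, pageSize);
            pageStart = wantedStart;
            fetchCount++;
        }
        return page.get(index - pageStart);
    }

    @Override
    public int size() {
        return totalSize;
    }
}
```

Client code would treat this as a normal list, using get(int) or iterator(); 
a backend request only happens when the index falls outside the cached page. 
Keeping just one page is the simplest choice here, a real implementation might 
cache several pages or stream within a page.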

-- 
Chris Holmes
The Open Planning Project
http://topp.openplans.org