All posts in the topic Idea for a new Open Govt Project - HTML -> API service
Summary
- There are 3 posts — by 2 authors — in this topic.
- Latest post made by Glen Barnes at 2009 Jul 10 11:43 NZST
I'm thinking that we could build an HTML -> Web Service tool that would normalise government data from local councils that at the moment is only accessible via HTML. My use case is this:
- Rateable values are available on most council websites (http://www.horizons.govt.nz/default.aspx?mode=0&pageid=37&ratingsid=11220/048.00&HorizonsRatesGrid_current_page= )
- The data is Crown Copyright - http://www.horizons.govt.nz/default.aspx?pageid=340
- Every council seems to have it in a slightly different format but the basics are the same (Valuation Number, Address, Rating Period, Legal Description, Area, Capital Value, Land Value)
If we could build a standard REST interface "http://api.open.org.nz/council/rates/11220/111.00AA " and build scrapers in behind the scenes to gather the data it could be quite a useful service. I'm sure there is quite a bit of other information that would be useful and is also stored in HTML on the council sites. Although we might not be able to get 100% of the councils we could get a fair chunk of them done with relatively little work.
Does anything like this exist already?
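To make the "same basics, different format" idea concrete, here is a rough Python sketch of what a normalised record and the path handling for the proposed URL might look like. The field names come from the list in the post; the class name, the field types, the `parse_rates_path` helper and the regular expression are all assumptions made up for illustration, not an agreed schema.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class RatesRecord:
    """One normalised rating record; types are guesses, not a council spec."""
    valuation_number: str
    address: str
    rating_period: str
    legal_description: str
    area: str
    capital_value: int
    land_value: int


# Hypothetical path layout modelled on the example URL in the post:
# /council/rates/<valuation number>, e.g. /council/rates/11220/111.00AA
_RATES_PATH = re.compile(r"^/council/rates/(?P<valuation>\d+/\d+\.\d+[A-Z]*)$")


def parse_rates_path(path: str) -> Optional[str]:
    """Return the valuation number from a request path, or None if it does not match."""
    match = _RATES_PATH.match(path.strip())
    return match.group("valuation") if match else None
```

Each council scraper would then only need to fill in a `RatesRecord`, whatever the council's HTML looks like.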
On 9/07/2009, at 2:37 PM, Glen Barnes wrote:
> Although we might not be able to get 100% of the
> councils we could get a fair chunk of them done with relatively little
> work.
You're reading my mind again, Glen :) I can easily do the scrapage
(nearly two decades in the Perl mines weren't misspent!). How often
does the data change?
The data doesn't change that often (the councils are required to
update Rating Valuations at least every 3 years I think) but people
can challenge their valuations and mistakes can be fixed so the
document may not be static for the whole period. Ideally we could get
an update log from councils but obviously this cannot happen within
the services currently provided by councils so this virtual API is a
first step and still very useful to others. I would expect the
service to work something like this:
- Request comes in for record 'nnnn/nnnn.nn'
- We check our cache and see if we have a current version (current
being within the RV period and no older than, say, 1 month). This will give
us a reasonable cache while not hitting council servers too often.
- If we have a cached version then we serve it up
- If we don't have a cached version then we scrape it and return the
value
- If we don't have a scraper for that council then we return some form
of error document/HTTP response
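The steps above could be sketched roughly like this in Python. This is only an illustration under some assumptions: `RatesService`, `NoScraperError`, the council key and the in-memory dict cache are all hypothetical names and choices, each scraper is assumed to be a plain function taking a valuation number and returning a dict, and the sketch only applies the one-month age limit (the RV period check is left out).

```python
import time
from typing import Callable, Dict, Tuple

MAX_AGE_SECONDS = 30 * 24 * 60 * 60  # "say 1 month", taken as 30 days

Scraper = Callable[[str], dict]  # valuation number -> scraped record


class NoScraperError(Exception):
    """Raised when no scraper exists for the requested council."""


class RatesService:
    def __init__(self, scrapers: Dict[str, Scraper], clock=time.time):
        self.scrapers = scrapers
        self.clock = clock  # injectable so cache expiry can be tested
        self.cache: Dict[Tuple[str, str], Tuple[float, dict]] = {}

    def get(self, council: str, valuation_number: str) -> dict:
        key = (council, valuation_number)
        now = self.clock()

        # Serve from the cache if the stored copy is still current.
        cached = self.cache.get(key)
        if cached is not None and now - cached[0] <= MAX_AGE_SECONDS:
            return cached[1]

        # No current copy: scrape, or report that this council is unsupported.
        scraper = self.scrapers.get(council)
        if scraper is None:
            raise NoScraperError(council)
        record = scraper(valuation_number)
        self.cache[key] = (now, record)
        return record
```

A web layer in front of this would presumably turn `NoScraperError` into whatever error document/HTTP response gets agreed on.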
We can keep track of stats such as how many requests for each council,
how many we can fulfil, etc. If/When the councils contact us to see
WTF we are up to then we offer to help them provide this service
themselves and help define the API.
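The stats side could be as small as a pair of counters. A minimal sketch, assuming we only care about requests and successful responses per council; `RequestStats` and its method names are made up for illustration.

```python
from collections import Counter


class RequestStats:
    """Counts requests per council and how many of them we could fulfil."""

    def __init__(self):
        self.requested = Counter()
        self.fulfilled = Counter()

    def record(self, council: str, ok: bool) -> None:
        self.requested[council] += 1
        if ok:
            self.fulfilled[council] += 1

    def fulfilment_rate(self, council: str) -> float:
        """Share of requests for this council that we answered (0.0 if none seen)."""
        total = self.requested[council]
        return self.fulfilled[council] / total if total else 0.0
```

Numbers like these would give us something concrete to show a council when they get in touch.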