OU Facebook Apps, Reprise

I was at a meeting yesterday looking at rebooting the OU’s Facebook strategy. With a bit of luck, this means that we’ll be doing another push on the OU Facebook apps that were developed several years ago now and which I still believe provide a sound basis for a range of community building and social learning support services (Course Profiles – A Facebook Application for Open University Students and Alumni).

The apps were largely developed out of time and in stolen time, and it seems that things are likely to continue in this way (which is both a plus – freeing us from constraints of interminable committees wanting to plan strategies rather than jfdi, and a minus – @liamgh is the only person we trust with the code which means any maintenance falls to him ;-)

For those who don’t remember the apps we developed, there were two: Course Profiles, which allowed students to declare the courses they had taken, were taking and intended to take, and then provided a range of services around that information (find friends on a course, find a study buddy, link to course information or course related OpenLearn resources, get course recommendations); and My OU Story, where students could maintain a “status diary” about their progress on a course, along with a mood indicator so they could track their mood over a course, and other app users could add supportive comments. (I’d be surprised if anyone in the Student Services retention project has even heard about this project, but looking at some of the peer support that has gone on within the context of that app, I’d argue it might be contributing to retention…)

Course Profiles quickly attracted several thousand users following the initial push just after it was first launched, so it evidently served a need then that presumably still exists today, i.e. a badging mechanism for celebrating course achievements and declaring future study intentions. One thing that might be worth looking at is the rate at which early adopters of Course Profiles have continued to update it, and report on the extent to which their original “future study” intentions converted to actual course registrations.

There’s also going to be a push on growing the number of fans on the official OU profile page. I’m not sure what plan @stuartbrown has for growing the numbers (for the task appears to have fallen to him…;-) but with a bit of luck the apps as well as the fan page will get highlighted through some of the official communication channels.

We also had a bit of discussion around other potential apps. Something I’d quite like to see would be a gallery app pulling images from the various flickr groups that have popped up around the T189 Digital Photography short course. Alumni of that group are already pretty active, and have just launched their first online exhibition, so if we could provide a channel that increases the audience for their show, and if they’re happy for us to amplify it via an OU Facebook app, that might be quite a fun thing to try as a community building app… (For more about the background to the exhibition, see Inspiring Learners; also see the T189 Graduates’ Exhibition).

(I also wonder if a similar gallery style app might work to showcase some of the games that students on T151 Digital worlds manage to create, all with their permission of course…)

Someone (I forget who) also suggested a “Share on Facebook” button within the gallery environment students use to build their portfolio whilst they take T189 (with sharing limited, of course, to photographs that a student had uploaded themselves). This would amplify a student’s work and progress on a course to their Facebook friends, and provide their friends with a glimpse of what sorts of activities are involved in this particular OU course.

One thing I never even half managed to convince anybody of was the importance of the data collected by the Course Profiles app in particular, though I did have a go at a few quick’n’dirty takes on this, such as OU Course Profiles Facebook App – Treemaps, Hierarchical Course Clusters from Course Profiles App and Tinkering with Google Charts (which started to consider what a course team dashboard view might look like). I was mulling this over again last night, and the following uses came to mind if we started to reconcile Course Profiles with institutional data (something we were always wary of, but anyway – here’s the thinking…;-)

- predictors and conversion rates: I’m not sure if Liam is logging when/how users change their status updates, but it’d be useful to know what percentage of users are updating their Course Profiles (e.g. from ‘currently taking’ to ‘taken’ courses, or more interestingly ‘intend to take’ to taking) and whether an “intend to take” course declaration is a good predictor of whether students do actually take a course. There’s an obvious quick win here for a possibly intrusive marketing campaign chasing folk who’ve declared an ‘intend to take’ course but don’t appear to have followed up on it;

- predicting course sizes: with several thousand users, does the sample of users on Course Profiles predict future course enrollment numbers? As far as I know, no-one in planning ever came to us asking to peek at our data to explore this. Nor did any more than a couple of Course Chairs ever seem to think it was interesting that we had stated intentions about course pathways, and that for new courses in particular we might be able to spot whether students were signing up for a course based on a pathway the course team was hoping for?

- retention: is the retention rate of students on a course who are on Facebook with Course Profiles and/or My OU Story different to the retention rate across the course as a whole? Does having declared ‘intend to take’ courses on Course Profiles correlate with a student’s likelihood of completing an award?

- course planning and recommendation: on the one hand, courses appear to have natural numbers; on the other, working out what courses to take in what order for a particular degree given various factors (such as courses already taken, course exclusions etc) can be a confusing affair. At the moment, I believe a rule based support tool is being explored to help with course recommendations, but how well do those suggestions compare with a simple clustering based on Course Profiles data?
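As a rough illustration of the first bullet, here is a minimal sketch of how a conversion rate might be computed. It assumes a log of status changes exists at all (I don't know that it does), and the record shapes and field names below are made up for the example:

```javascript
// Hypothetical log of status changes: one record per change a user makes
// to a course on their profile. Field names are assumptions for illustration.
const statusLog = [
  { user: "u1", course: "T189", from: "intend", to: "taking" },
  { user: "u2", course: "T189", from: "intend", to: "taking" },
  { user: "u3", course: "T151", from: "taking", to: "taken" }
];

// Hypothetical snapshot of declarations still sitting at "intend to take".
const stillIntending = [
  { user: "u4", course: "T189" },
  { user: "u5", course: "T151" }
];

// Fraction of "intend to take" declarations that became "taking".
function intendConversionRate(log, pending) {
  const converted = log.filter(function (r) {
    return r.from === "intend" && r.to === "taking";
  }).length;
  const total = converted + pending.length;
  return total === 0 ? 0 : converted / total;
}

const rate = intendConversionRate(statusLog, stillIntending);
console.log(rate); // 0.5
```

Comparing that figure against actual registrations would need the institutional data mentioned above, which this sketch does not touch.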

PS Just in passing, it’s worth noting that as with other groups who’ve used Facebook to mount campaigns against unpopular corporate decisions, OU students are no different… Open University curbs Tesco ‘clubcard degree’ scheme .

Maintaining a Google Calendar from a Google Spreadsheet, Reprise

In the post Updating Google Calendars from a Google Spreadsheet, I described a recipe for adding events to a Google Calendar from a Google Spreadsheet using Google Apps Script. After a quick chat with the person who was compiling a spreadsheet they wanted to use to populate a set of calendars, I revisited the script to make a few tweaks and hopefully increase its usability.

So here’s a glimpse of the spreadsheet they’re using to list dates for various campaigns and channels where related activity might occur. Firstly, we have some columns relating to the event or activity, and the dates on which they occur. The first column (added to calendar) is a control switch that identifies that the calendar details have been updated for that event:

Within the spreadsheet, I set the two date columns to have the Date type (from the Tools menu, set the Data Validation option to Date). I’m not sure how the spreadsheet is (correctly) identifying the US date format (MM-DD-YY) – maybe from a US timezone as a global setting for the spreadsheet?

As well as various other admin columns, there are columns relating to whether or not a channel will be used to support a particular event:

From what I could ascertain, the way the spreadsheet is supposed to work goes along the lines of: someone adds details of an event and the associated channels for the event to the spreadsheet. “Add” in a channel column says that event is to be added to that channel calendar. When the updating script is run, for each event it checks the control column A to see that an event hasn’t been added to the various channel calendars, and if it hasn’t checks the channel columns; if a channel column is set to “Add” the event details are added to that event calendar.
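The decision logic in that paragraph can be sketched as a small pure function. This is only an illustration of the rule as I understand it: the row is a plain object rather than a spreadsheet row, and the property names are assumptions:

```javascript
// Decide which channel calendars an event row should be added to.
// `row` is a plain object here; in the real sheet these would be cell values.
function calendarsToUpdate(row, channels) {
  if (row.added === "Added") {
    return []; // control column says this event has already been processed
  }
  return channels.filter(function (channel) {
    return row[channel] === "Add";
  });
}

const channels = ["broadcast", "itunes", "youtube"];
const fresh = { added: "", broadcast: "Add", itunes: "", youtube: "Add" };
const done = { added: "Added", broadcast: "Add", itunes: "", youtube: "" };

console.log(calendarsToUpdate(fresh, channels)); // [ 'broadcast', 'youtube' ]
console.log(calendarsToUpdate(done, channels));  // []
```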

So – how do I need to modify the original script? Firstly, the original script uses the default calendar. In this case, we need a separate calendar for each channel, so in Google Apps I created one calendar per channel:

We can grab a calendar by name from a spreadsheet apps script using a call of the form:

 var cal_broadcast=CalendarApp.openByName("broadcastDemo");

When the script runs, we need to grab the appropriate range of cells from the spreadsheet to see which calendars to update. For testing purposes, I only grabbed a few rows…

var startRow = 2;  // First row of data to process
var numRows = 4;   // Number of rows to process
var dataRange = sheet.getRange(startRow, 1, numRows, 26);
var data = dataRange.getValues();

for (var i in data) {
  var row = data[i];
  var title = row[1];
  var desc=row[15];
  var added = row[col_added];  //Check to see if details for this event have been added to the calendar(s)
  var tstart = row[2]; //start time - I have defined the column in the spreadsheet as a Date type
  var tstop = row[3]; //end time - I have defined the column in the spreadsheet as a Date type
  var broadcast=row[col_broadcast]; // is this event one to "Add" to the broadcast calendar?
  var itunes=row[col_itunes]; // is this event one to "Add" to the itunes calendar? etc
  var youtube=row[col_youtube];
  if (added!="Added") { //the calendar(s) have not been updated for this event
    if (broadcast=="Add") {
      cal_broadcast.createEvent(title, tstart,tstop, {description:desc}); //add the event to the "broadcast" calendar
    }
    if (itunes=="Add"){
      cal_itunes.createEvent(title, tstart,tstop, {description:desc});
    }
    // etc for each channel
    var v = parseInt(i, 10)+startRow; // sheet row number: data[0] sits on sheet row startRow (row 1 holds the headers)
    sheet.getRange(v, 1, 1, 1).setValue("Added"); //set the fact that we have updated the calendars for this event
  }
}

In order to identify which columns to use for the broadcast, itunes, etc. channels, I went defensive. The following bit of code comes before the previous snippet; what it does is look at each column heading and then set the column number for each channel appropriately based on its name. (I should probably use a similar technique to identify the start/stop dates.) This approach accommodates later changes to the spreadsheet, such as the insertion of additional columns or the reordering of columns:

var ss=SpreadsheetApp.getActiveSpreadsheet();
var sheet=SpreadsheetApp.setActiveSheet(ss.getSheets()[0]); //need a routine to set active sheet by name?
//go defensive
var col_added, col_broadcast, col_itunes, col_youtube; // zero-based indexes into each data row
var maxcols=sheet.getMaxColumns();
for (var j=1;j<=maxcols;j++){
  var header= sheet.getRange(1, j, 1, 1).getValue();
  switch(header){
   case "Added to Google (Y/N/Hold)": col_added=j-1; break;
   case "Broadcast": col_broadcast=j-1; break;
   case "iTunes": col_itunes=j-1; break;
   case "YouTube": col_youtube=j-1; break;
   default:
  }
}
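The same defensive lookup can be written once and reused for any column, including the start/stop dates. The following is a sketch rather than the script I actually ran: it works on a plain array of headings, and the "Start date" and "End date" headings are assumptions, not the real names in the sheet:

```javascript
// Map each wanted header name to its zero-based column index.
// `headers` is the first spreadsheet row as a plain array, for example
// what sheet.getRange(1, 1, 1, sheet.getMaxColumns()).getValues()[0] returns.
function findColumns(headers, wanted) {
  const found = {};
  Object.keys(wanted).forEach(function (key) {
    const index = headers.indexOf(wanted[key]);
    if (index === -1) {
      throw new Error("Missing column heading: " + wanted[key]);
    }
    found[key] = index;
  });
  return found;
}

// "Start date" and "End date" are assumed headings, not taken from the real sheet.
const headerRow = ["Added to Google (Y/N/Hold)", "Event", "Start date", "End date", "Broadcast", "iTunes"];
const cols = findColumns(headerRow, {
  added: "Added to Google (Y/N/Hold)",
  start: "Start date",
  stop: "End date",
  broadcast: "Broadcast",
  itunes: "iTunes"
});
console.log(cols.broadcast); // 4
```

Throwing on a missing heading is a design choice: it stops the run before anything is written to a calendar using a wrong column.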

Running the combined function thus searches the spreadsheet for the appropriate channel columns and control column, checks the control column for each event entry to ensure that the event hasn’t been added to the selected calendars, and then adds the event to the appropriate channel calendars if required.

Playing with the script, it seemed a little bit clunky, so I tweaked it to update the channel cells with the word “Added” if it had been set to Add, and the calendar had been updated:

if (broadcast=="Add") {
  cal_broadcast.createEvent(title, tstart,tstop, {description:desc});
  dataRange.getCell(parseInt(i)+1,col_broadcast+1).setValue('Added'); // Replace "Add" with "Added"; +1 is offset for sheet numbering
}

It also struck me that if the setting for a channel was changed to “Add” after that event had been marked as added, that channel’s calendar would never get updated. So I created a variant of the updating function that would just run on a per column basis and update a calendar entry for an event if it was set to “Add”, rather than checking the control column:

function caltestAddtoCal_broadcast(){ caltestAddtoCal("broadcast"); }
function caltestAddtoCal(addCal){
  //...
  if (addCal!="") {
    if ((addCal=="broadcast")&&(broadcast=="Add")) {
      cal_broadcast.createEvent(title, tstart,tstop, {description:desc});
      dataRange.getCell(parseInt(i)+1,col_broadcast+1).setValue('Added'); //+1 is offset for sheet numbering
    }
  // ...
  }
}

What this means is that a channel controller can update entries in their calendar by running the script just for that channel and adding “Add” to any event they want adding to the calendar, the list of “Added” entries showing which events have already been added to that calendar:

Having doodled a script that sort of works, it’s now time to hack it around so it looks a little more elegant. Which means refactoring… sigh… and another reprise in a day or two, I guess…?!
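One possible direction for that refactoring, sketched under assumptions: replace the per-channel if-blocks with a single table of calendars keyed by channel name. The function name, the column map and the fake calendar objects are all hypothetical; the fakes stand in for real calendars so the logic can be tried outside Apps Script:

```javascript
// One table instead of one if-block per channel. The calendar objects are
// stand-ins that only need a createEvent(title, start, stop, options) method.
function addRowToCalendars(row, cols, calendars, onlyChannel) {
  const updated = [];
  Object.keys(calendars).forEach(function (channel) {
    if (onlyChannel && onlyChannel !== channel) { return; }
    if (row[cols[channel]] !== "Add") { return; }
    calendars[channel].createEvent(row[cols.title], row[cols.start], row[cols.stop],
      { description: row[cols.desc] });
    row[cols[channel]] = "Added"; // mirrors replacing "Add" with "Added" in the sheet
    updated.push(channel);
  });
  return updated;
}

// Fake calendars that record what they were asked to create.
function fakeCalendar() {
  const events = [];
  return { events: events, createEvent: function (title) { events.push(title); } };
}

const cals = { broadcast: fakeCalendar(), itunes: fakeCalendar() };
const colIndex = { title: 0, start: 1, stop: 2, desc: 3, broadcast: 4, itunes: 5 };
const eventRow = ["Launch", new Date(2010, 2, 8), new Date(2010, 2, 9), "Demo", "Add", "Add"];

console.log(addRowToCalendars(eventRow, colIndex, cals, "broadcast")); // [ 'broadcast' ]
console.log(eventRow[4], eventRow[5]); // Added Add
```

Passing a channel name gives the per-channel behaviour; leaving it out processes every channel marked “Add”.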

Feed Aggregation, Truncation and Post Labeling With Google Spreadsheets and Yahoo Pipes

Got another query via Twitter today for a Yahoo Pipe that is oft requested – something that will aggregate a number of feeds and prefix the title of each with a slug identifying the appropriate source blog.

So here’s one possible way of doing that.

Firstly, I’m going to create a helper pipe that will truncate the feed from a specified pipe to include a particular number of items from the feed and then annotate the title with a slug of text that identifies the blog: (Advisory: Truncate and Prefix).

The next step is to build a “control panel”, a place where we list the feeds we want to aggregate, the number of items we want to truncate, and the slug text. I’m going to use a Google spreadsheet.

We can now create a second pipe (Advisory: Spreadsheet fed feed aggregator) that will pull in the list of feeds as a CSV file from the spreadsheet, for each feed grab the feed contents, then truncate them and badge them as required using the helper pipe:

To keep things tidy, we can sort the posts so they appear in the traditional reverse chronological order.
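For anyone who prefers code to pipework, the same steps (truncate, prefix, merge, sort) can be sketched in a few lines. This is not what the pipes do internally, just an equivalent in plain JavaScript; the item shape, the example URLs and the "[slug] " prefix format are assumptions:

```javascript
// Keep the first `limit` items of a feed and prefix each title with a slug.
function truncateAndPrefix(items, limit, slug) {
  return items.slice(0, limit).map(function (item) {
    return { title: "[" + slug + "] " + item.title, published: item.published };
  });
}

// `sources` plays the role of the spreadsheet rows; `feeds` maps a feed URL
// to items that have already been fetched (fetching is left out of the sketch).
function aggregate(sources, feeds) {
  const merged = [];
  sources.forEach(function (source) {
    const items = feeds[source.url] || [];
    truncateAndPrefix(items, source.limit, source.slug).forEach(function (item) {
      merged.push(item);
    });
  });
  // Newest first, as in a conventional blog feed.
  return merged.sort(function (a, b) { return b.published - a.published; });
}

const feeds = {
  "http://example.com/a.xml": [
    { title: "A2", published: new Date("2010-03-10T00:00:00Z") },
    { title: "A1", published: new Date("2010-03-01T00:00:00Z") }
  ],
  "http://example.com/b.xml": [
    { title: "B1", published: new Date("2010-03-05T00:00:00Z") }
  ]
};
const sources = [
  { url: "http://example.com/a.xml", limit: 1, slug: "Blog A" },
  { url: "http://example.com/b.xml", limit: 5, slug: "Blog B" }
];

console.log(aggregate(sources, feeds).map(function (i) { return i.title; }));
// [ '[Blog A] A2', '[Blog B] B1' ]
```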

PS Hmmm… it might be more useful to be able to limit the feed items by another criterion, such as all posts in the last two weeks? If so, this sort of helper pipe would do the trick (Advisory: Recent Posts and Prefix):
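In code terms, limiting by recency rather than by count is just a date filter. A minimal sketch, assuming each item carries a `published` Date (the item shape is my assumption, not something the pipe exposes):

```javascript
// Keep only items published within `days` days before `now`.
function recentItems(items, days, now) {
  const cutoff = now.getTime() - days * 24 * 60 * 60 * 1000;
  return items.filter(function (item) {
    return item.published.getTime() >= cutoff;
  });
}

const now = new Date("2010-03-15T00:00:00Z");
const posts = [
  { title: "Fresh", published: new Date("2010-03-10T00:00:00Z") },
  { title: "Stale", published: new Date("2010-02-01T00:00:00Z") }
];
console.log(recentItems(posts, 14, now).map(function (p) { return p.title; })); // [ 'Fresh' ]
```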

HTH:-)

The University Expert Press Room – COP15

Chatting just now to @paulafeery, I learned about something that completely passed me by at the time – the OU COP15 Press Room.

Built on WordPress (yay!-) using the Studiopress Lifestyle theme, the site provided a single point of access to content and several OU academics with relevant expertise in the area in order to “support” journalists writing around issues raised over the course of the COP15 Climate Talks last year.

The site makes good use of categories to partition content into several areas (each, of course, with its own feed:-) So for example, there are categories for News, Research and Opinion, the latest items from which are also highlighted on the front page:

The site also syndicated a feed from an OU Audioboo site where OU academics were posting audio commentaries on related matters:

I don’t think there was a COP15 channel on the OU Boxee TV channel though, although there was an OU COP15 YouTube playlist:

(It strikes me that it might have been good to put a playlist player in an obvious or obviously linked to place on the COP15 press room front page? I also wonder how we might best guarantee OU exposure from any video material we publish and what sort of form it needs to be in, and under what sort of licensing conditions, in order for news outlets to run with it? e.g. How the Ian Tomlinson G20 video spread The Guardian brand across the media, Video Journalism and Interactive Documentaries and to a lesser extent The OU on iPlayer (Err? Sort of not…).)

Anyway, this thematic press room seems like a great idea to me – though I’d have also liked to see a place for 200-500 word CC (attribution) licensed explanatory posts of the sort that could be used to populate breakout factual explanation boxes (with attribution) in feature articles, for example.

Compared to the traditional press release site (which apparently serves as much as an OU timeline/corporate memory device as anything, something that hadn’t occurred to me before…) this topical press room offers another perspective on the whole “social media press release” thang (e.g. Social Media Releases and the University Press Office).

If you want to look back over the COP15 Press Room, you can find it here: OU COP15 Press Room

PS If I was as diligent as Martin Belam at this sort of critique, I’d probably have done a comparison of the OU Press Room site and example output as appearing on the Guardian COP15 topic page:

or the BBC COP15 topic page:

in order to see what sorts of content fit there might be going from copy on the OU Press Room to the material that is typically published on news media sites. If the content doesn’t fit, no-one will re-use it, right?

Maybe next time?!;-) (If you know of such a comparative critique, please post a link back to here or add a comment below;-)

Grabbing Google Calendar Event Details into a Spreadsheet

A comment to Updating Google Calendars from a Google Spreadsheet, where I showed how to get events described in a Google spreadsheet into a Google Calendar wondered if the other way around is possible? … It would be great for logging hours for projects or declaring working hours ;-)

Yes:-) Twenty seconds ago, this spreadsheet was empty:

A quick run of a script and I populated it from my default calendar using Google Apps script. Once again, all I did was peek through the docs and pull out the fragments I needed: Here’s the script:

function caltest3(){
  //http://www.google.com/google-d-s/scripts/class_calendar.html#getEvents
  // The code below will retrieve events between 2 dates for the user's default calendar and
  // display the events in the current spreadsheet
  var cal = CalendarApp.getDefaultCalendar();
  var sheet = SpreadsheetApp.getActiveSheet();

  var events = cal.getEvents(new Date("March 8, 2010"), new Date("March 14, 2010"));
  for (var i=0;i<events.length;i++) {
    //http://www.google.com/google-d-s/scripts/class_calendarevent.html
    var details=[[events[i].getTitle(), events[i].getDescription(), events[i].getStartTime(), events[i].getEndTime()]];
    var row=i+1;
    var range=sheet.getRange(row,1,1,4);
    range.setValues(details);
  }
}
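The script above writes one row per call to setValues. A possible variation, sketched here with stand-in event objects so it can be run outside Apps Script, is to build all the rows first and write them in one go; `eventsToRows` and `fakeEvent` are hypothetical helper names:

```javascript
// Turn calendar events into a 2D array of rows. Each event only needs the
// four getters used in the script above, so plain objects can stand in for
// calendar event objects when trying the function outside Apps Script.
function eventsToRows(events) {
  return events.map(function (event) {
    return [event.getTitle(), event.getDescription(), event.getStartTime(), event.getEndTime()];
  });
}

function fakeEvent(title, desc, start, end) {
  return {
    getTitle: function () { return title; },
    getDescription: function () { return desc; },
    getStartTime: function () { return start; },
    getEndTime: function () { return end; }
  };
}

const rows = eventsToRows([
  fakeEvent("Meeting", "Planning", new Date(2010, 2, 8, 9), new Date(2010, 2, 8, 10)),
  fakeEvent("Workshop", "", new Date(2010, 2, 9, 13), new Date(2010, 2, 9, 17))
]);
console.log(rows.length, rows[0].length); // 2 4

// Inside Apps Script the whole block could then be written in one call:
// if (rows.length > 0) { sheet.getRange(1, 1, rows.length, 4).setValues(rows); }
```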

So now not only can we use a spreadsheet as a database we can also use a calendar in a similar way, and if necessary, link them all together?

Updating Google Calendars from a Google Spreadsheet

I got a request today along the lines of:

We’re in the process of creating a master calendar of events spreadsheet relevant to [various things]. These [various things] will all then have their own Google calendar so they can be looked at individually, embedded etc and everyone could of course have access to all and view them all via their personal Google calendar, turn different calendars on or off, sync with Outlook etc. etc.

X said “wouldn’t it be great if we made the master spreadsheet with Google docs and it could somehow automate and complete the calendars”.

Sigh…;-) So – is it possible?

I’ve only had a quick play so far with Google Apps script, but yes, it seems to be possible…

Take one spreadsheet, liberally sprinkled with event name, description, start and end times, an optional location, and maybe even a tag or two (not shown):

The time related columns I specified as a date type using the “Data Validation…” form from the Tools menu:

Now take one Google apps script:

function caltest1() {
  var sheet = SpreadsheetApp.getActiveSheet();
  var startRow = 2;  // First row of data to process
  var numRows = 2;   // Number of rows to process
  var dataRange = sheet.getRange(startRow, 1, numRows, 5);
  var data = dataRange.getValues();
  var cal = CalendarApp.getDefaultCalendar();
  for (var i in data) {
    var row = data[i];
    var title = row[0];  // First column
    var desc = row[1];       // Second column
    var tstart = row[2];     // Start time (Date-typed column)
    var tstop = row[3];      // End time (Date-typed column)
    var loc = row[4];        // Optional location
    //cal.createEvent(title, new Date("March 3, 2010 08:00:00"), new Date("March 3, 2010 09:00:00"), {description:desc,location:loc});
    cal.createEvent(title, tstart, tstop, {description:desc,location:loc});
  }
}

(This was largely a copy and paste from Sending emails from a Spreadsheet Tutorial, which I’d half skimmed a week or two ago and seemed to remember contained a howto for bulk mailing from a spreadsheet, and the Apps script Calendar class documentation.)

And here’s the result of running the function:

The email tutorial adds a bit of gloss that allows a further column to contain state information about whether an email has already been sent; we could do something similar to specify whether or not an event has been automatically added to the calendar, and if not, add it when the function is run.

Because it can be a pain having to go into the script editor to run the function, it’s easier to just create a menu option for it:

Here’s how:

function onOpen() {
  var ss = SpreadsheetApp.getActiveSpreadsheet();
  // functionName must match the name of a function defined in this script project
  var menuEntries = [ {name: "Shove stuff in calendar", functionName: "caltest2"} ];
  ss.addMenu("OUseful", menuEntries);
}

I had a little play to see if I could trivially get an RSS feed into the spreadsheet using an =importFeed() formula, and use the details from that to populate the calendar, but for some reason the feed importer function didn’t appear to be working?:-( When I tried using CSV data from a Yahoo RSS2CSV proxy pipe via a =importData() formula, the test function I’d written didn’t appear to recognise the date format…

PS Arghh… the test formula assumes a Date type is being passed to it… Doh!

Hack round importFeed still not working by grabbing a CSV version of the feed into the spreadsheet:
=importdata("http://pipes.yahoo.com/ouseful/proxy?_render=csv&url=http%3A%2F%2Fopen2.net%2Ffeeds%2Frss_schedule.xml")

Tweak the calendar event creation formula:

cal.createEvent(title, new Date(tstart), new Date(tstop), {description:desc});
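Wrapping the values in new Date() copes with both cases, but it will quietly produce an invalid date if a cell holds something unparseable. A slightly more defensive sketch (the `toDate` helper is hypothetical, not part of the script I ran):

```javascript
// Accept either a Date (from a Date-typed cell) or a date string (from
// imported CSV) and always hand back a valid Date, or fail loudly.
function toDate(value) {
  const date = value instanceof Date ? value : new Date(value);
  if (isNaN(date.getTime())) {
    throw new Error("Unrecognised date value: " + value);
  }
  return date;
}

console.log(toDate("March 3, 2010 08:00:00").getFullYear()); // 2010
console.log(toDate(new Date(2010, 2, 3)).getMonth());        // 2
```

The event creation line would then read cal.createEvent(title, toDate(tstart), toDate(tstop), {description:desc}).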

Run the function:

Heh heh :-)

PS it’s also possible to move content from a Google Calendar to a Google spreadsheet, as Grabbing Google Calendar Event Details into a Spreadsheet shows…

PPS it strikes me that the spreadsheets2calendar route provides one way of generating an iCal feed from a list of event times held in a spreadsheet, by popping the events into a Calendar and then making the most of its output formats? A bit like using Yahoo pipes as a quick’n’easy KML generator?

Grabbing the JSON Description of a Yahoo Pipe from the Pipe Itself

In a series of recent posts (The Yahoo Pipes Documentation Project – Initial Thoughts, Grabbing JSON Data from One Web Page and Displaying it in Another, Starting to Think About a Yahoo Pipes Code Generator), I’ve started exploring some of the various ingredients that might be involved in documenting the structure of a Yahoo Pipe and potentially generating some program code that will then implement a particular pipe.

One problem I’d come across was how to actually obtain the abstract description of a pipe. I’d found an appropriate Javascript object within an open Pipes editor, but getting that data out was a little laborious…

…and then came a comment on one of the posts from Paul Daniel/@hapdaniel, pointing me to a pipe that included a little trick he was aware of: a trick for grabbing the description of a pipe from a pipe’s pipe.info feed (e.g. http://pipes.yahoo.com/pipes/pipe.info?_out=json&_id=eed5e097836289dfb4e8586220b18e0e).

Paul used something akin to this YPDP pipe’s internals pipe to grab the data from the info feed of a specified pipe (the URL of which has the form http://pipes.yahoo.com/pipes/pipe.info?_id=PIPE_ID) using YQL:

http://query.yahooapis.com/v1/public/yql?url=http%3A%2F%2Fpipes.yahoo.com%2Fpipes%2Fpipe.info%3F_out%3Djson%26_id%3D44d4492a582d616bffda237d461c5ef4&q=select+PIPE.working+from+json+where+url%3D%40url&format=json

It’s just as easy to grab the JSON feed from YQL, e.g. using a query of the form:
select PIPE.working from json where url="http://pipes.yahoo.com/pipes/pipe.info?_out=json&_id=44d4492a582d616bffda237d461c5ef4". The pipe id is the id of the pipe you want the description of.
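Since the only thing that changes between queries is the pipe id, the URLs can be generated. The sketch below only builds the strings and makes no request; the function names are mine, and the query is inlined in the q parameter rather than passed through a separate url parameter as in the example URL above:

```javascript
// Build the pipe.info URL for a pipe id, then wrap it in a YQL query URL.
function pipeInfoUrl(pipeId) {
  return "http://pipes.yahoo.com/pipes/pipe.info?_out=json&_id=" + encodeURIComponent(pipeId);
}

function yqlUrlForPipe(pipeId) {
  const query = 'select PIPE.working from json where url="' + pipeInfoUrl(pipeId) + '"';
  return "http://query.yahooapis.com/v1/public/yql?q=" + encodeURIComponent(query) + "&format=json";
}

console.log(pipeInfoUrl("44d4492a582d616bffda237d461c5ef4"));
console.log(yqlUrlForPipe("44d4492a582d616bffda237d461c5ef4"));
```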

If you have a Yahoo account, you can try this for yourself in the YQL developer console:

We can then grab the JSON feed either from YQL or the YPDP pipe’s internals pipe into a web page and run whatever we want from it.

So for example, the demo service I have set up at http://ouseful.open.ac.uk/ypdp/pipefed.php will take an id argument containing the id of a pipe, and display a crude textual description of it. Like this:

So what’s next on the “to do” list? Firstly, I want to tidy up – and further unpack – the “documentation” that the above routine produces. Secondly, there’s the longer term goal of producing the code generator. If anyone fancies attacking that problem, you can get hold of the JSON description of a pipe from its ID using either the YPDP internals pipe or the YQL query that are shown above.

So What Is It About Linked Data that Makes it Linked Data™?

If you’ve been to any conferences lately where Linked Data has been on the agenda, you’ll probably have seen the four principles of Linked Data (I grabbed the following from Wikipedia…)

1. Use URIs to identify things.
2. Use HTTP URIs so that these things can be referred to and looked up (“dereference”) by people and user agents.
3. Provide useful information (i.e., a structured description — metadata) about the thing when its URI is dereferenced.
4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.

Wot, no RDF? ;-)

Anyway – here’s my take on what we have… building on my Parliamentary Committees Treemap, I thought I’d do something similar for the US 111th Congress Committees to produce something like this map for the House:

US 111th Congress committees

I reused an algorithm I’d used to produce the UK Parliamentary committee maps:

- grab the list of committees;
- for each committee, grab the membership list for that committee.

That is, I want to annotate one dataset with richer information from another one; I want to link different bits of data together…
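Stripped of any particular data source, that two-step algorithm looks something like the following sketch. The two lookup functions are passed in, and the ids and names in the stub data are placeholders invented for the example:

```javascript
// Annotate each committee with its membership list. The two lookup
// functions are passed in, so the linking logic does not care where the
// data comes from (an API, a pipe, or the in-memory stubs used here).
function annotateCommittees(listCommittees, listMembers) {
  return listCommittees().map(function (committee) {
    return {
      id: committee.id,
      name: committee.name,
      members: listMembers(committee.id)
    };
  });
}

// Placeholder data for illustration only.
const committeeData = [
  { id: "C1", name: "First committee" },
  { id: "C2", name: "Second committee" }
];
const memberData = { C1: [{ id: "M1", name: "Member One" }] };

const annotated = annotateCommittees(
  function () { return committeeData; },
  function (committeeId) { return memberData[committeeId] || []; }
);
console.log(annotated[0].members[0].name); // Member One
console.log(annotated[1].members.length);  // 0
```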

The “endpoint” I used to make the queries for the Congress committee map was the New York Times Congress API.

The quickest way (for me) to get the data was to use a couple of Yahoo Pipes. Firstly, here’s one that will get a list of committee members from a 111th Congress House committee given its committee code (it’s left as an exercise for the reader to generalise this pipe to also accept chamber and congress number arguments ;-)

I get the data using a URL. Here’s what one looks like:
http://api.nytimes.com/svc/politics/v3/us/legislative/congress/111/house/committees/HSAG.xml?api-key=MY_KEY

So given a committee code, I can get a list of members. Here’s what a single member’s record looks like:

rank_in_party: 5
name: Neil Abercrombie
begin_date: 2009-01-07
id: A000014
party: D

If I wanted to annotate these details further, there is also a list of House members that returns records of the form:

id: A000014
api_uri: http://api.nytimes.com/svc/politics/v3/us/legislative/congress/members/A000014.json
first_name: Neil
middle_name: null
last_name: Abercrombie
party: D
seniority: 22
state: HI
district: 1
missed_votes_pct: 12.81
votes_with_party_pct: 98.27

I can grab a single member record using a URL of the form:
http://api.nytimes.com/svc/politics/{version}/us/legislative/congress/members/{member-id}[.response-format]?api-key=MY_KEY

Now, where can I get a list of committees?

From a URL like this one
http://api.nytimes.com/svc/politics/v3/us/legislative/congress/111/house/committees.xml?api-key=MY_KEY

The data returned has the form:

chair: P000258
url: http://agriculture.house.gov/
name: Committee on Agriculture
id: HSAG

Here’s how I grab the committee listing and then augment each committee with its members:

Although I don’t directly have an identifier in the form of a URL for the membership list of a committee, I know how to generate one given a committee ID and a pattern that creates the URL for a committee’s membership list from that ID. The pattern generalises around the chamber (House or Senate) and Congress number as well:
http://api.nytimes.com/svc/politics/{version}/us/legislative/congress/{congress-number}/{chamber}/committees[/committee-id][.response-format]?api-key=MY_KEY
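Filling in that pattern from local context is a one-function job. A minimal sketch: the property names on the context object are my own, and the defaults (v3, xml) are assumptions based on the example URLs used earlier in this post:

```javascript
// Fill in the URL pattern from local context. The default values are assumptions.
function committeeUrl(context, committeeId) {
  const version = context.version || "v3";
  const format = context.format || "xml";
  let path = "http://api.nytimes.com/svc/politics/" + version +
    "/us/legislative/congress/" + context.congress + "/" + context.chamber + "/committees";
  if (committeeId) {
    path += "/" + encodeURIComponent(committeeId); // omit the id to get the committee list
  }
  return path + "." + format + "?api-key=" + encodeURIComponent(context.apiKey);
}

const ctx = { congress: 111, chamber: "house", apiKey: "MY_KEY" };
console.log(committeeUrl(ctx));
// http://api.nytimes.com/svc/politics/v3/us/legislative/congress/111/house/committees.xml?api-key=MY_KEY
console.log(committeeUrl(ctx, "HSAG"));
// http://api.nytimes.com/svc/politics/v3/us/legislative/congress/111/house/committees/HSAG.xml?api-key=MY_KEY
```

Keeping congress number and chamber in an explicit context object means they travel with every call rather than being buried inside it.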

So I think this counts as linkable data, and we might even call it linked data. If I work within a closed system, like the pipes environment, then using “local” identifiers, such as committee ID, chamber and congress number, I can generate a URL style identifier that works as a web address.

But can we call the above approach a Linked Data™ approach?

1. Use URIs to identify things.
This works for the committee membership lists, the list of committees and individual members, if required.

2. Use HTTP URIs so that these things can be referred to and looked up (“dereference”) by people and user agents.
Almost – at the moment the views are XML or JSON (no human readable HTML), but at least in the committee list there’s a link to a human audience web page.

3. Provide useful information (i.e., a structured description — metadata) about the thing when its URI is dereferenced.
The members’ records are useful, and the committee records do describe the name of the committee, along with its identifier. But the info that makes committee records uniquely identifiable exists “above” the individual committee record (e.g. the congress number and the chamber). In a closed pipes environment, such as the one described above, if we can propagate the context (committee id, chamber, congress number), we can uniquely identify resources using dereferenceable HTTP URIs (i.e. things that work as web addresses) using a URI pattern and local context.

4. Include links to other, related URIs in the exposed data to improve discovery of other related information on the Web.
Yes, we have some of that…

So, the starter for ten: do we have an example of Linked Data™ here? Note there is no RDF and no SPARQL endpoint exposed to me as a user. But I’ve had to use connective tissue to annotate one HTTP URI identified resource (the committee list) with results from a family of other HTTP URI identified resources (the membership lists). I could have gone further and annotated each member record with data from the “member’s info” family of HTTP URIs.

The “top level” pipe is a “linking query”. If I had constructed it slightly differently, I could have passed in a chamber and congress number and it would have:
- constructed an HTTP URI to look up a list of committees for that chamber in that Congress; (this was a given in the pipe shown above);
- grabbed the list of committees;
- annotated them with membership lists.

As it is, the pipe contains “assumed” context (the congress number and chamber), as well as the elephant in the room assumption – that I’m making queries on the NYT Congress API.

On reflection, this is perhaps bad practice. The congress number and chamber are hidden assumptions within the pipe. The URL pattern that the NYT Congress API defines explicitly identifies mutable elements/parameters:

http://api.nytimes.com/svc/politics/{version}/us/legislative/congress/{congress-number}/{chamber}/committees[/committee-id][.response-format]?api-key={your-API-key}

Which suggests that maybe best practice would be to pass local context data via user parameters throughout the pipework to guarantee a shared local context within child pipes?
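By way of a rough sketch of what passing the context around explicitly might look like outside of Pipes: the function below fills in the URL pattern quoted above from a single context object. The function name, the shape of the context object and the example values (v3, 111, senate, the committee id) are my own placeholders for illustration, not anything the API defines.

```javascript
// Build an NYT Congress API committee URI from explicit local context.
// The path template follows the URL pattern quoted above; the function name
// and the fields of `context` are illustrative assumptions.
function committeeUri(context) {
  var required = ["version", "congressNumber", "chamber", "apiKey"];
  required.forEach(function (key) {
    if (!context[key]) {
      throw new Error("Missing context value: " + key);
    }
  });

  var uri = "http://api.nytimes.com/svc/politics/" +
    encodeURIComponent(context.version) +
    "/us/legislative/congress/" +
    encodeURIComponent(context.congressNumber) + "/" +
    encodeURIComponent(context.chamber) + "/committees";

  // Optional parts of the pattern: [/committee-id][.response-format]
  if (context.committeeId) {
    uri += "/" + encodeURIComponent(context.committeeId);
  }
  if (context.responseFormat) {
    uri += "." + encodeURIComponent(context.responseFormat);
  }
  return uri + "?api-key=" + encodeURIComponent(context.apiKey);
}

// One shared context object (placeholder values) for the parent pipe...
var context = { version: "v3", congressNumber: 111, chamber: "senate",
                responseFormat: "json", apiKey: "YOUR-API-KEY" };
var listUri = committeeUri(context);

// ...and the same context plus a (hypothetical) committee id for a child pipe.
var childUri = committeeUri(Object.assign({}, context, { committeeId: "XXXX" }));
```

The point is that the child call gets its congress number and chamber from the same object as the parent, rather than from an assumption baked into the pipe.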

So where am I coming from with all this?

I’m happy to admit that I can see how it’s really handy having universal, unique URIs that resolve to web pages or other web content. But I also think that local identifiers can fulfil the same role if you can guarantee the context as in a Yahoo Pipe or a spreadsheet (e.g. Using Data From Linked Data Datastores the Easy Way (i.e. in a spreadsheet, via a formula)).

So for example, in the OU we have course codes which can play a very powerful role in linking resources together (e.g. OU Course Codes – A Web 2.OU Crown Jewel). I’ve tended to use the phrase “pivot point” to describe the sorts of linking I do around tags, or course codes, or the committee identifiers described in this post and then show how we can use these local or partial identifiers to access resources on other websites that use similar pivot points (or “keys”). (ISBNs are a great one for this, as ISBN Playground shows.)

If Linked Data™ zealots continue to talk about Linked Data solely in terms of RDF and SPARQL, I think they will put off a lot of folk who are really excited about the idea of trying to build services across distributed (linkable) datasets… IMVHO, of course…

My name’s Tony Hirst, I like linking things together, but RDF and SPARQL just don’t cut it for me…

PS this is relevant too: Does ‘Linked Data’ need human readable URIs?

PPS Have you taken my poll yet? Getting Started with data.gov.uk… or not…

Getting Started with data.gov.uk… or not…

Go to any of the data.gov.uk SPARQL endpoints (that’s geeky techie scary speak for places where you can run geeky techie datastore query language queries and get back what looks to the eye like a whole jumble of confusing Radical Dance Faction lyrics [in joke;-0]) and you see a search box, of sorts… Like this one on the front of the finance datastore:

So, pop pickers:

One thing that I think would make the SPARQL page easier to use would be a list of links, displayed down the left hand side, each of which would launch one of the last 10 or so queries that had run in a reasonable time and returned more than no results – so n00bs like me could at least have a chance at seeing what a successful query looked like. Appreciating that some folk might want to keep their query secret (more on this another day…;-), there should probably be a ‘tick this box to keep your query out of the demo queries listing’ option when folk submit a query.
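To make the selection rule concrete, here’s a minimal sketch of how such a listing might be picked from a query log. Everything about the log is an assumption on my part: the field names (query, ranAt, durationMs, resultCount, keepPrivate) are made up for illustration and don’t come from data.gov.uk.

```javascript
// Pick the most recent queries worth showing as examples.
// ASSUMPTION: each log entry is a hypothetical record of the form
// {query, ranAt, durationMs, resultCount, keepPrivate}.
function demoQueries(log, maxDurationMs, limit) {
  return log
    .filter(function (entry) {
      return !entry.keepPrivate &&            // respect the opt-out tick box
        entry.resultCount > 0 &&              // "returned more than no results"
        entry.durationMs <= maxDurationMs;    // "ran in a reasonable time"
    })
    .sort(function (a, b) { return b.ranAt - a.ranAt; }) // newest first
    .slice(0, limit);
}

// Example with made-up log entries (ranAt is just a timestamp-like number).
var sampleLog = [
  { query: "SELECT * WHERE { ?s ?p ?o } LIMIT 10", ranAt: 1, durationMs: 120, resultCount: 10, keepPrivate: false },
  { query: "slow query", ranAt: 2, durationMs: 90000, resultCount: 5, keepPrivate: false },
  { query: "empty query", ranAt: 3, durationMs: 50, resultCount: 0, keepPrivate: false },
  { query: "secret query", ranAt: 4, durationMs: 50, resultCount: 3, keepPrivate: true },
  { query: "SELECT DISTINCT ?type WHERE { ?s a ?type }", ranAt: 5, durationMs: 300, resultCount: 42, keepPrivate: false }
];
var examples = demoQueries(sampleLog, 5000, 10);
```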

(A more adventurous solution, but one that I’m not suggesting at the moment, might allow folk who have run a query from the SPARQL page on the data.gov.uk site to “share this query” to a database of (shared) queries. Or if you’ve logged in to the site, there may be an option of saving it as a private query.)

That is all…

PS if you have some interesting SPARQL queries, please feel free to share them below or e.g. via the link on here: Bookmarking and Sharing Open Data Queries.

PPS from @iand “shouldnt that post link to the similar http://tw.rpi.edu/weblog/2009/10/23/probing-the-sparql-endpoint-of-datagovuk/“; and here’s one from @gothwin: /location /location /location – exploring Ordnance Survey Linked Data.

PPPS for anyone who took the middle way in the vote, then if there are any example queries in the comments to this post, do they help you get started at all? If you voted “what are you talking about?” please add a comment below about what you think data.gov.uk, Linked Data and SPARQL might be, and what you’d like to be able to do with them…

Parliamentary Committees Treemap

Can’t sleep, can’t do “proper work”, so I’ve taken the day off as holiday to play out some more of the thoughts and momentum that cropped up during Dev8D… Like this treemap of parliamentary committee membership (as of 1/2/10).

Parliamentary committees treemap

Here’s the tale…;-)

Whilst resisting playing with the new Guardian Politics API (and absolutely dreading the consequences any other topic specific APIs they open up might have on my time!) I did have a quick peek at it to see whether or not it had any details about Parliamentary committees. It didn’t, but just looking upped my desire enough to pop over to the TheyWorkForYou API to see if they had any appropriate calls: what I wanted was a list of committees, and the membership thereof… (if you’ve read Council Committee Treemaps From OpenlyLocal you’ll maybe guess why!;-)

And it seems they do – getCommittee will “[f]etch the members of the committee that match [a specified] name – if more than one committee matches, return their names” and for a specified committee “[r]eturn the members of the committee as they were on [a specified] date”.

What I wanted was to be able to “select all” committees, which there’s no explicit switch for; but a search on the letter ‘e’ did turn up quite a comprehensive list, so as long as they’re using a free text search, I guess this has the same effect!

So I can get a list of committees, and now I want the members of each one. Pipework, I think:-)

First up, fetch a list of the committees from the TheyWorkForYou API. Then, for each committee, generate the URL that will pull back the committee membership on a particular date.

For each committee, we now annotate the committee item with the membership of the committee and do a little tidying.

(In terms of API calls, this pipe makes N+1 calls to the TheyWorkForYou API, where the first call returns N committees.)
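For anyone who’d rather see the pipe as code, here’s a rough sketch of the same N+1 pattern. It is not what the pipe literally does: the URL shape, the output=js parameter and the committees and members field names are my guesses at the form of the API, and fetchJson is a hypothetical helper (kept synchronous here for simplicity) that the caller has to supply.

```javascript
// Sketch of the pipe: one call for the committee list, then one call per
// committee for its membership (N+1 calls in total).
// ASSUMPTIONS: the URL shape, the `output=js` parameter and the `committees`
// and `members` field names are guesses; fetchJson is a hypothetical,
// caller-supplied function that takes a URL and returns parsed JSON.
function getCommitteeUrl(apiKey, name, date) {
  var url = "http://www.theyworkforyou.com/api/getCommittee" +
    "?key=" + encodeURIComponent(apiKey) +
    "&output=js" +
    "&name=" + encodeURIComponent(name);
  if (date) {
    url += "&date=" + encodeURIComponent(date);
  }
  return url;
}

function annotateCommittees(apiKey, searchTerm, date, fetchJson) {
  // Call 1: a free-text search that returns the names of matching committees.
  var listing = fetchJson(getCommitteeUrl(apiKey, searchTerm));
  // Calls 2..N+1: look up each committee's members on the given date.
  return listing.committees.map(function (committee) {
    var detail = fetchJson(getCommitteeUrl(apiKey, committee.name, date));
    return { name: committee.name, members: detail.members || [] };
  });
}
```

Passing the fetcher in as a parameter means the annotation logic can be tried out against canned data before spending any real API calls.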

This pipe gives a JSON feed containing a list of committees annotated with the names and IDs of committee members. The structure is not totally dissimilar to the structure I used for the Openlylocal committees, so I should be able to reuse that code if I can make the representations match.

So how do they differ? In no particular order, the differences are:
- the Openlylocal committee representation elaborates each member with a .party attribute which does not exist in the TWFY member lists, as well as using .first_name and .last_name rather than .name to name the committee member;
- the TheyWorkForYou list does not have an .id attribute for each committee, which the Openlylocal committee list does have, and which is required for the treemap tooltips to work.

First task, then, is to annotate the committee membership list with each member’s party. Another TheyWorkForYou API call (http://www.theyworkforyou.com/api/docs/getMPsInfo) returns a list of MPs and their party.

It’s easy enough to elaborate the committee list with this information – create a party lookup table and then annotate as required… While we’re at it, we can do a fudge with the name and id…;-)

// Build a lookup table from person_id to party
var partyLookup={};
for (var i=0; i<mpDetails.length; i++){
  partyLookup[mpDetails[i].person_id]=mpDetails[i].party;
}

for (i=0; i<mysocMPcommittees.value.items.length; i++){
  var currComm=mysocMPcommittees.value.items[i];
  // TWFY committees have no id, so fudge one for the treemap tooltips
  currComm.id="cid"+i;

  for (var j=0; j<currComm.members.length; j++){
    var member=currComm.members[j];
    member.party=partyLookup[member.person_id];
    // Fudge the name to match the Openlylocal first_name/last_name fields
    member.first_name="";
    member.last_name=member.name;
  }
}

It was then easy enough to just plug this into the code I’d used to display the Openlylocal treemap. You can see it here.
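I’m not reproducing the display code here, but as a rough idea of the sort of reshaping a treemap needs, the sketch below groups each annotated committee’s members by party to give a committee > party > member hierarchy. The node shape ({id, name, size, children}) is an illustrative assumption of mine, not the format of any particular treemap library or of the Openlylocal code.

```javascript
// Group each committee's members by party to give a
// committee > party > member hierarchy.
// ASSUMPTION: the output node shape ({id, name, size, children}) is
// illustrative only; adapt it to whatever the treemap code expects.
function committeeTree(committees) {
  return {
    id: "root",
    name: "Committees",
    children: committees.map(function (comm) {
      var byParty = Object.create(null); // plain map, no inherited keys
      comm.members.forEach(function (member) {
        var party = member.party || "Unknown";
        (byParty[party] = byParty[party] || []).push(member);
      });
      return {
        id: comm.id,
        name: comm.name,
        size: comm.members.length,
        children: Object.keys(byParty).map(function (party) {
          return {
            id: comm.id + "-" + party,
            name: party,
            size: byParty[party].length,
            children: byParty[party].map(function (m) {
              return { id: comm.id + "-" + m.person_id, name: m.last_name, size: 1 };
            })
          };
        })
      };
    })
  };
}
```

Members whose person_id has no entry in the party lookup end up under “Unknown” rather than disappearing from the map.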

Looking over the TheyWorkForYou API, I guess it would be possible to do a similar sort of thing to map the parties of people who spoke in debates? (I’m not sure if there’s data on votes/divisions with a party breakdown? If so, that would be good to plot too, with hierarchy based on: division, ayes/nays, party, so we could see defections, as well as relative sizes of the votes?)
