eduTwitterin’

Jane’s list of “100+ (E-)Learning Professionals to follow on Twitter” (which includes yours truly, Martin and Grainne from the OpenU :-) has been doing the rounds today, so in partial response to Tony Karrer asking “is there an equivalent to OPML import for twitter for those of us who don’t want to go through the list and add people one at a time?”, I took an alternative route to achieving a similar effect (tracking those 100+ e-learning professionals’ tweets) and put together a Yahoo pipe to produce an aggregated feed – Jane’s edutwitterers pipe

Scrape the page and create a semblance of a feed of the edutwitterers:

Tidy the feed up a bit and make sure we only include items that link to valid twitter RSS feed URLs (note that the title could do with a little more tidying up…) – the regular expression for the link creates the feed URL for each edutwitterer:

Replace each item in the edutwitterers feed with the tweets from that person:
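For anyone who prefers code to pipework, the three steps above sketch out roughly like this in Python. (This is a hypothetical sketch only: the markup of the list page and the twitter.com user_timeline RSS URL pattern are assumptions on my part, not something the pipe itself guarantees.)

```python
import re

# Hypothetical Python sketch of the pipe's three steps. The page markup
# and the twitter.com user_timeline RSS URL pattern are assumptions,
# not verified details.

def extract_usernames(html):
    """Step 1: scrape Twitter usernames out of the page's profile links."""
    return sorted(set(re.findall(r"twitter\.com/(\w+)", html)))

def feed_url(username):
    """Step 2: turn a username into its (assumed) Twitter RSS feed URL."""
    return "http://twitter.com/statuses/user_timeline/%s.rss" % username

def aggregate(rss_documents):
    """Step 3: merge the <item> elements from each per-user feed."""
    items = []
    for rss in rss_documents:
        items.extend(re.findall(r"<item>.*?</item>", rss, re.S))
    return items

# Toy example against a fragment of the sort of markup the page might use:
page = '<ol><li><a href="http://twitter.com/psychemedia">@psychemedia</a></li></ol>'
print([feed_url(u) for u in extract_usernames(page)])
# -> ['http://twitter.com/statuses/user_timeline/psychemedia.rss']
```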

From the pipe, subscribe to the aggregated edutwitterers’ feed.

Note, however, that the aggregated feed is a bit slow – it takes time to pull out the tweets for each edutwitterer, and there is the potential for feeds being cached all over the place (by Yahoo pipes, by your browser, or by whatever you happen to view the pipe’s output feed in, etc. etc.)

A more efficient route might be to produce an OPML feed containing links to each edutwitterer’s RSS feed, and then view this as a stream in a Grazr widget.
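For reference, an OPML reading list of this sort is just a flat XML file with one outline element per feed; a minimal sketch (the username and feed URL here are placeholders):

```xml
<?xml version="1.0" encoding="UTF-8"?>
<opml version="1.1">
  <head><title>edutwitterers</title></head>
  <body>
    <outline type="rss" text="psychemedia"
             xmlUrl="http://twitter.com/statuses/user_timeline/psychemedia.rss"/>
    <!-- ...one outline element per edutwitterer... -->
  </body>
</opml>
```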

Creating the OPML file is left as an exercise for the reader (!) – if you do create one, please post a link as a comment or trackback… ;-) Here are three ways I can think of for creating such a file:

  1. add the feed URL for each edutwitterer as a separate feed in a Grazr reading list (How to create a Grazr (OPML) reading list). If you don’t like/trust Grazr, try OPML Manager;
  2. build a screenscraper to scrape the usernames and then create an output OPML file automatically;
  3. view source of Jane’s original edutwitterers page, cut out the table that lists the edutwitterers, paste the text into a text editor and work some regular expression ‘search and replace’ magic (if you do this, how about posting your recipe/regular expressions somewhere?! ;-)
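By way of illustration, option 2 might look something like the following in Python. (Again, just a sketch: the page markup and the old-style twitter.com RSS URL pattern are assumptions, so treat it as a starting point rather than a working scraper.)

```python
import re
from xml.sax.saxutils import quoteattr

# Sketch of option 2: scrape the usernames, then emit an OPML reading list.
# The feed URL pattern (twitter.com user_timeline RSS) is an assumption.

def usernames_from_html(html):
    """Pull Twitter usernames out of profile links in the page source."""
    return sorted(set(re.findall(r"twitter\.com/(\w+)", html)))

def opml_for(usernames):
    """Build an OPML reading list, one outline element per username."""
    lines = [
        '<?xml version="1.0" encoding="UTF-8"?>',
        '<opml version="1.1">',
        "  <head><title>edutwitterers</title></head>",
        "  <body>",
    ]
    for u in usernames:
        feed = "http://twitter.com/statuses/user_timeline/%s.rss" % u
        # quoteattr adds the surrounding quotes and escapes XML specials
        lines.append('    <outline type="rss" text=%s xmlUrl=%s/>'
                     % (quoteattr(u), quoteattr(feed)))
    lines.extend(["  </body>", "</opml>"])
    return "\n".join(lines)

print(opml_for(["psychemedia", "mweller"]))
```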

Enough – time to start reading Presentation Zen

Author: Tony Hirst

I'm a Senior Lecturer at The Open University, with an interest in #opendata policy and practice, as well as general web tinkering...

9 thoughts on “eduTwitterin’”

  1. I thought the pipe looked interesting. I found a new person to follow in under 1 minute. I’ll check it out again.

    The main point for me is your initiative and the idea to piggy-back Jane’s nifty idea. Thanks.
    Elaine

  2. Awesome use of Pipes, I tried to follow some of your ideas, and managed to hack together an OPML file of all the feeds:

    http://spreadsheets.google.com/pub?key=p05Ar3f-0bXkUCzYBS8oxfQ&output=txt&gid=1

    Use right click -> save as -> edutwitters.xml

    (you can also visit the url and save the page, make sure to save in plain text format)

    Please note, this has not been tested much, use with caution.

    GDocs’ importXML function was used to scrape the URLs; if the original page is updated, it *should* update itself as well.

    The spreadsheet can be viewed & copied if anyone is interested, or notices things that can be fixed:

    http://spreadsheets.google.com/ccc?key=p05Ar3f-0bXkUCzYBS8oxfQ&hl=en

    Here is the result of importing to Google Reader, which may be easier for some people to import: http://www.google.com/reader/public/subscriptions/user/13323617217207974436/label/EduTwitters

    Thanks for the interesting and challenging topic, hope this helps some people =)

  3. Hi Mike
    That’s a really neat use of google spreadsheets :-) I’m assuming the output sheet has cells constructed from cell items in the input sheet?

    What query did you use for the importXML function?

    I wonder whether it would be possible to import the data from the table on Jane’s HTML page using the “=importHtml(URL, element, index)” construct (e.g. as described in http://googlesystem.blogspot.com/2007/09/google-spreadsheets-lets-you-import.html or http://cybernetnews.com/2008/06/13/myfive-google-spreadsheet-functions-you-wont-find-in-microsoft-office/ )?

    I think I’m going to have to look at Google spreadsheets a little more closely as an intermediate step in the data scraping and making available stakes… Thanks for the recipe :-)

  4. @Tony
    Thanks, I’m sure the same thing could be done with many tools, but my experience with Google Docs is much greater than with Yahoo Pipes or anything else (but this post helped me learn more about Pipes).

    Yep, you are right: the first sheet pulls in the list of Twitter usernames, and then it’s a question of preparing the OPML format.

    The function used was importXML, which accepts an XPath query: =importxml(“http://c4lpt.co.uk/socialmedia/edutwitter.html”, “//ol//a[contains(text(),’@’)]/@href”)

    For me XPath is easier to use to find the specific elements, but importHTML should definitely work as well.

    The spreadsheet ccc? link can be used to see the full document, that’s one of the great things about tools like this – instant open source code.
    It wasn’t specifically mentioned in the original post but the same is true for Yahoo Pipes, anyone can visit http://pipes.yahoo.com/ouseful/edutwitterers and see the source code (this helped me a lot when reading the explanation).

    Google spreadsheets doesn’t offer all the flexibility of Pipes, but having a lot of experience with Excel and similar programs the interface is easier for me to pick up.
    Glad to find a case where it can be put to good use :)

  5. Mike

    Can I use you as my XSL/XPath buddy then? ;-) I *really* have to hack around with that stuff to get it to work!

    Just by the by, if you are happier with XSL/XPath, but can see value in the pipes approach, have you tried looking at http://pipes.deri.org/

    “Inspired by Yahoo’s Pipes, DERI Web Data Pipes implement a generalization which can also deal with formats such as RDF (RDFa), Microformats and generic XML.

    DERI Pipes are Open Source Software, and as such they can be easily extended and applied in use cases where a local deployment is needed.

    DERI Pipes provides a rich web GUI where pipes can be graphically edited, debugged and invoked. The execution engine is also available as a standalone JAR, which is ideal for embedded use.

    DERI Pipes, in general, produce as output streams of data (e.g. XML, RDF, JSON) that can be used by applications. However, when invoked by a normal browser, they will provide an end user GUI for the user to enter parameter values and browse the results”

    That URL again: http://pipes.deri.org/

Comments are closed.