I was impressed to learn that Yahoo Pipes lets average Joes and geeks alike remix information from across the web through RSS and JSON. It has a powerful, rich web GUI editor that lets users create new feeds without writing code. It was clearly designed for the Web 2.0 age, where other data sources speak RSS or JSON. Well-formed HTML, which is a form of XML, seems harder to work with in Yahoo Pipes. I know how to write my own regular-expression parsing code in Java, so perhaps someone can show me how something similar can be done in Yahoo Pipes. Beyond my special use case, Yahoo Pipes is still a very powerful tool for many other Web 2.0 mash-ups, and I would definitely come back if a new idea comes to mind.

Dapper is a screen-scraping site that lets users extract information from a web page. It also has a powerful GUI that lets users scrape data off a web page without writing code or even geeky stuff like regular expressions or XPath. Dapper is really good at extracting a particular piece of information: you can just select a link, a paragraph, or any other block that interests you on a page. This is a very powerful way to select several fields from a complex HTML page. However, it isn’t very intelligent at recognizing a long list. For a list, the user may have to select each item individually, which can be very tedious and error-prone. Even though it is practically useless for my specific use case, I still respect the creator(s) for building such a wonderful tool for some people.

All in all, both Yahoo Pipes and Dapper attempt to bring some of the power of information processing to the masses and enable creative data remixing. At the same time, building powerful GUIs also limits users’ expressiveness. This is a trade-off that the designers of these tools made in order to bring usability up to a level that the average Joe/Jane can actually understand and learn quickly. Those who need more can write their own code to do what they want.
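For the long-list case that neither tool handled well for me, the "write your own code" route might look roughly like the Java sketch below. It is only a minimal illustration, assuming the page is well formed and the items of interest sit in flat (non-nested) `<li>` elements. The class name `ListItemScraper`, the method `extractItems`, and the sample HTML are all made up for this example, and a regular expression like this is not a general HTML parser.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: pulls the text of every <li> item out of an HTML string.
class ListItemScraper {
    // Matches <li ...>content</li>. The lazy (.*?) stops at the first closing tag,
    // and DOTALL lets an item span several lines. Nested lists are NOT handled.
    private static final Pattern LIST_ITEM =
        Pattern.compile("<li\\b[^>]*>(.*?)</li>", Pattern.CASE_INSENSITIVE | Pattern.DOTALL);

    // Matches any remaining tag inside an item, such as <a href="..."> or </a>.
    private static final Pattern TAG = Pattern.compile("<[^>]+>");

    static List<String> extractItems(String html) {
        List<String> items = new ArrayList<>();
        Matcher matcher = LIST_ITEM.matcher(html);
        while (matcher.find()) {
            // Strip inner markup and surrounding whitespace, keeping only the text.
            String text = TAG.matcher(matcher.group(1)).replaceAll("").trim();
            items.add(text);
        }
        return items;
    }

    public static void main(String[] args) {
        // Made-up sample input standing in for a real page.
        String html = "<ul>\n"
            + "<li><a href=\"/a\">First post</a></li>\n"
            + "<li class=\"x\">Second post</li>\n"
            + "</ul>";
        for (String item : extractItems(html)) {
            System.out.println(item);
        }
    }
}
```

The point of the sketch is that a single loop picks up every item in the list, however long it is, which is exactly the part that is tedious to do by clicking in a GUI.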


  1. Thank you for this nice article. I use Pipes for both professional and personal research: it’s easy to remix information and create custom filters… when RSS feeds are available.
    That’s why Dapper is useful.
    Actually, I’m trying to integrate Dapper into Yahoo, but I still haven’t got it working. Strange, because Yahoo has bought Dapper without integrating the tool into Pipes.

  2. An interesting thing to know about Dapper is that you can parameterize your own way of scraping the page using the template created by Dapper:
    For example, I used it to extract from pages of comments, and since the website has the same structure, I can easily set it up for any page of that website I want…
    Via the test option

