Stereonaut!

Archive for the ‘feed’ Category

Perl on the NYMag

with 3 comments

Reading my morning load of feed news, I spotted an interesting bit on the New York Magazine travel feed:

[Image: Picture 18]

Someone is clearly doing something wrong there, and I'd say it's in their usage of XML::Feed (that's my guess at what they are using, since the generator tag is omitted): either they are not dereferencing that hash reference, or they are forcibly stringifying it.

Of course that comes from the feed itself:

[Image: Picture 20]

It's awesome, and of course nothing new, that big publications and organizations use Perl to process and offer data to users, but I always find it super cool when they make it so obvious :)

Written by David Moreno

August 10th, 2009 at 9:21 am

Categorized in: feed, perl, planeta linux


Feedbag now using feedvalidator

with one comment

There's a very special case that I hadn't spotted in Feedbag. Of the different methods that Feedbag uses to discover the feed at a given URL, the very first one is a lookup in a table of "known" content types. If the alleged feed is served with any of the following content types, Feedbag just returns that same URL, since it assumes it's the feed:

@content_types = [
'application/x.atom+xml',
'application/atom+xml',
'application/xml',
'text/xml',
'application/rss+xml',
'application/rdf+xml',
]
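A minimal sketch of that lookup, assuming the check is made against the raw Content-Type header of the response. The constant and method names here are hypothetical and this is not Feedbag's actual code; stripping parameters such as a charset is my own assumption about what a robust check would need:

```ruby
# Hypothetical sketch, not Feedbag's actual implementation.
FEED_CONTENT_TYPES = [
  'application/x.atom+xml',
  'application/atom+xml',
  'application/xml',
  'text/xml',
  'application/rss+xml',
  'application/rdf+xml',
].freeze

# Returns true when a raw Content-Type header value names a known feed type.
# Parameters such as "; charset=utf-8" are dropped before comparing
# (an assumption; the post does not say how parameters are handled).
def feed_content_type?(header)
  return false if header.nil?

  media_type = header.split(';').first.to_s.strip.downcase
  FEED_CONTENT_TYPES.include?(media_type)
end
```

With a helper like this, a URL served as "application/rss+xml; charset=UTF-8" would be returned as-is, while one served as "text/html" would fall through to the next discovery method.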

However, what happens if the feed is not served with any of those but is still a valid feed? Feedbag wouldn't recognize the URL as the feed itself and would start parsing the HTML instead, which is time-consuming (and unneeded after all). Because of this, between the content type lookup and the HTML parsing, I've added W3C feed validation using the nice feedvalidator gem. However, since this would result in an extra dependency, I've left it optional: if the gem is available, Feedbag uses it; otherwise, it skips validation and goes straight to parsing the HTML.
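The optional-dependency idea can be sketched roughly like this. Both method names are hypothetical, and the validator is passed in as a plain callable because the post does not show how Feedbag actually calls the feedvalidator gem:

```ruby
# Hypothetical helper: tries to load a library and reports whether it worked,
# instead of letting a missing gem abort the program.
def optional_require(name)
  require name
  true
rescue LoadError
  false
end

# The order described above: content type lookup first, then validation
# (only when a validator is available), then HTML parsing as the fallback.
# `validator` is any callable that takes a URL and returns true or false;
# it stands in for the real gem, whose API is not shown in the post.
def discovery_step(url, content_type_ok:, validator: nil)
  return :content_type if content_type_ok
  return :validated if validator && validator.call(url)

  :parse_html
end
```

When the library cannot be loaded, the caller simply leaves `validator` as nil and the flow goes from the content type lookup directly to HTML parsing.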

You can see the patch in this commit.

Written by David Moreno

February 11th, 2009 at 4:00 pm

Categorized in: feed, feedbag, planeta linux


Quick feed aggregation with Vitacilina

without comments

Vitacilina, ¡ah, qué buena medicina! ("Vitacilina, ah, what good medicine!")

A few months ago, maybe more than a year, I started hacking on Vitacilina, which was meant to be the replacement for Planet in all the countries Planeta Linux supports. I was doing well; I even hosted the code on Google Code back then. Later I forgot about it, although I've always wanted to replace Planet with some homebrew solution for the Planeta Linux community. Anyway, that hasn't happened yet. However, I did start using Vitacilina for my own needs in a local sandbox for my employer, and it worked pretty well. I've been hacking it to fit very specific requirements, though.

Anyway, I thought it was a good moment to release it publicly, since it was just sitting there hidden. I didn't include the changes I made for my employer (they were very specific to our products), but I did clean it up and write some documentation.

Now, what exactly is Vitacilina? Well, it's a feed aggregator. It's written in Perl (it's a Perl module), it uses YAML to get its list of feeds and names, and it uses Template Toolkit to format and dump the output. It was efficient for me because it made creating dumps very easy:

use Vitacilina;

my $v = Vitacilina->new(
  config => "config.yml",
  template => "template.tt",
  output => "output.html",
);
$v->render;

And that's it. I used to create YAML files on the fly to create new Vitacilina objects and render them according to some data.

The config file would look something like this:

http://myserver.com/myfeed:
  name: Some Cool Feed

http://feeds.feedburner.com/InfinitePigTheorem:
  name: David Moreno

And the template file:

 [% FOREACH p IN data %]
  <a href="[% p.permalink %]">[% p.title %]</a>
   by <a href="[% p.channelUrl %]">[% p.author %]</a>

 [% END %]

In that way, it's very simple, quick and easy to do aggregations. I just love TT, why wouldn't I? :-)
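For readers more at home in Ruby than in Perl, the same config-plus-template shape can be sketched with the standard library's YAML and ERB. This is only an analogy and not Vitacilina itself: nothing is fetched, the entries are placeholders built from the config, and the field names simply mirror the ones used in the template above.

```ruby
require 'yaml'
require 'erb'

# Same shape as the config file above: feed URL => { name: ... }.
CONFIG_YAML = <<~YML
  http://myserver.com/myfeed:
    name: Some Cool Feed
  http://feeds.feedburner.com/InfinitePigTheorem:
    name: David Moreno
YML

# ERB counterpart of the Template Toolkit loop above.
TEMPLATE = <<~TPL
  <% data.each do |entry| %>
  <a href="<%= entry[:permalink] %>"><%= entry[:title] %></a>
   by <a href="<%= entry[:channelUrl] %>"><%= entry[:author] %></a>
  <% end %>
TPL

# Renders the template with `data` visible to it as a local variable.
def render(data)
  ERB.new(TEMPLATE).result(binding)
end

feeds = YAML.safe_load(CONFIG_YAML)

# Placeholder entries: a real aggregator would fetch and parse each feed here.
entries = feeds.map do |url, info|
  { permalink: url, title: 'Placeholder title', channelUrl: url, author: info['name'] }
end

html = render(entries)
puts html
```

The point of the sketch is the division of labor: the YAML file only lists sources, and the template only decides how each entry is printed.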

So go grab Vitacilina at CPAN. Also, the Git repo is at github.com/damog/vitacilina.

However… I started to hack on a similar but more ambitious project called rFeed, which is more of a framework than a simple library, and that is why I stopped further Vitacilina development. I'll talk about rFeed later when the time comes.

Written by David Moreno

January 29th, 2009 at 9:36 pm

Categorized in: feed, in-english, perl, planeta linux


Introducing Feedbag: Feed auto-discovery Ruby library/tool

without comments

Last week, I spent some time building a feed auto-discovery tool that I actually liked, to use in Ruby for another project I'm building, rFeed. I liked CPAN's Feed::Find, and at some point I made a wrapper class to run a Perl script using that module; however, I wasn't happy mixing it all together. So, Feedbag was born:

>> require "rubygems"
=> true
>> require "feedbag"
=> true
>> Feedbag.find "log.damog.net"
=> ["http://feeds.feedburner.com/TeoremaDelCerdoInfinito",
 "http://log.damog.net/comments/feed/"]
>> planet_feeds = Feedbag.find("planet.debian.org")
[ ... ]
>> planet_feeds.first(3)
=> ["http://planet.debian.org/rss10.xml",
 "http://planet.debian.org/rss20.xml",
 "http://planet.debian.org/atom.xml"]
>> planet_feeds.size
=> 104
>>

It makes smart use of relative and absolute bases, hrefs, links, content types, etc. It is also a single Ruby file, so you can grab it and use it in your application. Plus, it only requires Hpricot as a dependency. It can find all feeds linked on a web page, and it returns the most important ones at the beginning of the resulting array (see the example above with Planet Debian).
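To illustrate what handling relative and absolute hrefs involves, here is a minimal sketch using Ruby's standard URI library. The `absolutize` helper is hypothetical and not part of Feedbag; it only shows how hrefs found on a page can be resolved against that page's URL:

```ruby
require 'uri'

# Hypothetical helper, not Feedbag's actual code: resolves every href found
# on a page against the page's own URL, so relative and absolute links both
# come out as absolute URLs. Duplicates are dropped, first occurrence wins.
def absolutize(page_url, hrefs)
  hrefs.map { |href| URI.join(page_url, href).to_s }.uniq
end

puts absolutize('http://planet.debian.org/',
                ['rss20.xml', '/atom.xml', 'http://example.org/feed'])
```

Relative paths are resolved against the page's directory, paths starting with a slash against the host, and hrefs that are already absolute pass through unchanged.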

Synopsis, README and a brief tutorial have been placed at axiombox.com/feedbag. You can also take a look at the git repo, hosted on GitHub.

Written by admin

December 30th, 2008 at 6:58 pm

Categorized in: feed, feedbag, in-english, planet-debian, planeta linux, ruby, web

