Followers Scraping
For what it’s worth, I thought I’d post the Ruby script that lets me scrape my followers from Tumblr. Of course, you can get the tumblrs you are following considerably more easily, since there are formatting tags for that. But for some reason unknown to man I wanted to list all the people who are following me, rather than the other way around. (Perhaps because I’m astonished and grateful that anyone would want to do so; perhaps because I’m just awkward.)
So, here it is. You should be able to save your followers page to a file, then run this script passing the name of the file, and it should spew out HTML code to paste into your tumblr.
It uses _why’s wonderful Hpricot library to parse the HTML. My understanding is that if you have RubyGems and the hpricot gem installed (gem install hpricot), the require lines at the top of the script should find it for you automagically, but I wonder how well that works out in practice. Sorry about that.
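If you would rather find out whether Hpricot is actually there before running anything, a check along these lines might do. It is only a sketch: library_available? is a helper name I made up for illustration, not part of RubyGems or Hpricot.

```ruby
# Needed on old Rubies so that require can see installed gems;
# harmless on newer ones, which load RubyGems by default.
require 'rubygems'

# Hypothetical helper: true if `name` can be required, false if not.
def library_available?(name)
  require name
  true
rescue LoadError
  false
end

unless library_available?("hpricot")
  warn "Hpricot isn't installed; try: gem install hpricot"
end
```

If the warning appears, installing the gem and trying again is the obvious next step.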
#!/usr/bin/env ruby
require 'rubygems'
require 'hpricot'

# Parse the saved followers page and print one HTML link per follower.
def main(filename)
  begin
    doc = Hpricot(File.read(filename))
  rescue => eurgh
    $stderr.puts "file read failed: #{eurgh}"
    return
  end

  doc.search("//a").each do |a|
    puts modlink(a) if testlink(a)
  end
end

# A follower link is any <a> that wraps an avatar image.
def testlink(a)
  a.search("img[@class='avatar']").any?
end

# Build the replacement link, swapping the 128 avatar for the 24 one.
def modlink(a)
  href = a.attributes["href"]
  src = a.search("img").first.attributes["src"]

  # Point the locally saved image back at Tumblr's copy.
  src = src.gsub(/followers_files/, "http://data.tumblr.com")
  src = src.gsub(/128\..*/, "24.gif")
  src = "images/default_avatar_24.gif" if src == "http://data.tumblr.com/default_avatar_24.gif"

  # class, not style: 'avatar' is a class name, not a CSS declaration.
  "<a class='avatar' href='#{href}'><img src='#{src}' /></a>"
end

if ARGV[0].nil?
  $stderr.puts("Usage: followparse.rb <saved followers page>")
else
  main(ARGV[0])
end
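The fiddliest part of the script is the rewriting of the avatar's src in modlink, so here is that step on its own, without Hpricot. Treat it as a rough sketch: small_avatar is a name I have made up for the purpose, and the file names are invented examples of what a saved followers page might contain.

```ruby
# Standalone sketch of the src rewriting done in modlink.
# small_avatar is a hypothetical name; the inputs below are invented.
def small_avatar(src)
  # The saved page points at a local folder; point back at Tumblr instead.
  src = src.gsub(/followers_files/, "http://data.tumblr.com")
  # Replace everything from "128." onwards with the 24 version's suffix.
  src = src.gsub(/128\..*/, "24.gif")
  # The default avatar gets a relative path rather than the Tumblr URL.
  src = "images/default_avatar_24.gif" if src == "http://data.tumblr.com/default_avatar_24.gif"
  src
end

puts small_avatar("followers_files/avatar_abc123_128.png")
# prints http://data.tumblr.com/avatar_abc123_24.gif
puts small_avatar("followers_files/default_avatar_128.gif")
# prints images/default_avatar_24.gif
```

Note that every avatar ends up with a .gif extension whatever it started with, which is what the original gsub does too.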
~ by shadowfirebird on May 21, 2008.
Posted in intangible, meta, programming
