This week a friend asked me whether I knew of any RSS services for movie listings. I looked at Moviefone, Fandango and Yahoo Movies, and none of them offered one. We came across a Perl script on the Yahoo Hacks page that uses a Yahoo Movies theater id to screenscrape info from their pages. Two problems came up: a) the script was not meant to be used as a web page, since it dumps its info into a file and cannot take arguments as a cgi-bin script, and b) Yahoo changed their pages and page formats, so the regular expressions no longer worked. After fixing both, well, there really wasn’t anything left of the original.
Since the script uses the XML::RSS::SimpleGen module, I looked that up and found that printing to the screen is as easy as using rss_as_string in place of its rss_save. By adding the CGI module, I was able to pull in a $tid variable from the web, with proper sanitizing thanks to the Perl HTML::Scrubber module. That was the easy part.
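To make those two changes concrete, here is a rough sketch of the same idea: take the theater id from the query string, refuse anything that isn't a plain number, and return the feed as a string instead of saving it to a file. This is not the Perl script itself. It is a Python stand-in that uses only the standard library, and the function names (clean_tid, rss_as_text), the digits-only rule and the example.com link are all my own assumptions for illustration.

```python
import re
import xml.etree.ElementTree as ET
from urllib.parse import parse_qs


def clean_tid(query_string, default="2764"):
    """Return the tid query parameter if it is a short run of ASCII digits.

    Assumption: theater ids are purely numeric, like 2764. Anything else
    (markup, empty value, missing parameter) falls back to the default.
    """
    values = parse_qs(query_string).get("tid", [])
    if values and re.fullmatch(r"[0-9]{1,10}", values[0]):
        return values[0]
    return default


def rss_as_text(title, link, items):
    """Build a minimal RSS 2.0 document and return it as a string.

    items is a list of (item_title, item_description) pairs. ElementTree
    escapes the text for us, so scraped titles cannot break the XML.
    """
    rss = ET.Element("rss", version="2.0")
    channel = ET.SubElement(rss, "channel")
    ET.SubElement(channel, "title").text = title
    ET.SubElement(channel, "link").text = link
    ET.SubElement(channel, "description").text = title
    for item_title, item_description in items:
        item = ET.SubElement(channel, "item")
        ET.SubElement(item, "title").text = item_title
        ET.SubElement(item, "description").text = item_description
    return ET.tostring(rss, encoding="unicode")
```

Run as a cgi-bin script, this would read the query string from the QUERY_STRING environment variable, print a Content-Type header followed by a blank line, and then print whatever rss_as_text returns.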
Because Yahoo Movies changed URLs and page formats, the actual screenscraping code had to be redone, but eventually I got it. The nice part is that this all uses Yahoo’s WAP interface to view things, so it works nicely for anyone on the road using a Treo or other web-enabled phone.
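For anyone who hasn't done this kind of screenscraping, the general shape is a regular expression that pulls each title and its showtimes out of the page text. The sketch below is again Python rather than the Perl I used, and the markup in SAMPLE_PAGE is invented for illustration. It is not Yahoo's actual page format, so the pattern would need rewriting against the real pages (and again whenever they change).

```python
import re
from html import unescape

# Made-up markup standing in for a simple mobile listing page.
SAMPLE_PAGE = """
<b>Movie A</b><br/>
1:00pm, 3:30pm, 7:15pm<br/>
<b>Movie B &amp; Friends</b><br/>
2:10pm, 9:45pm<br/>
"""

# A bold title, a line break, then a comma-separated run of times.
LISTING = re.compile(
    r"<b>(?P<title>[^<]+)</b><br\s*/?>\s*(?P<times>[^<]+?)\s*<br\s*/?>",
    re.IGNORECASE,
)


def parse_listings(page):
    """Return a list of (title, [showtimes]) tuples found in the page text."""
    results = []
    for match in LISTING.finditer(page):
        title = unescape(match.group("title").strip())
        times = [t.strip() for t in match.group("times").split(",") if t.strip()]
        results.append((title, times))
    return results
```

Each (title, showtimes) pair can then become one item in the feed, with the times joined into the item description.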
You can see the results at the link below. To modify this for your own use, look up your local theater at the Yahoo Movies page and find your tid value. Plug it in place of my 2764 value for Hillsborough, NJ and you’ll be able to keep track of your own movie times at that theater.
I’m really surprised Yahoo isn’t already doing this with Yahoo Movies, seeing how hard they embraced RSS this past year.