More advanced HTTP scripting: Facebook Photo Album Downloader
Friday, 4 September 2009
I wanted to download a photo album from Facebook, but of course there’s no simple API or option to do that. Searching the internet turns up a couple of options (a Firefox plugin, a Facebook app), but I couldn’t seem to find anything standalone.
This gives another great opportunity to talk about advanced HTTP scripting in PowerShell. We’ve talked about it in the past (here: http://www.leeholmes.com/blog/AdvancedHTTPASPNetScriptingWithPowerShell.aspx), but this script introduces some new techniques.
There are a couple of challenges for scripting Facebook, especially using the techniques in the earlier post.
The first is that the login sequence is secure :) The login page is served over SSL, meaning that the basic Send-TcpRequest commands we used previously won’t work. SSL requires encryption and a complex handshake. Luckily, the System.Net.WebClient class provides that for us.
The second issue is that the cookies are dynamic, and given out in two stages. You need to first connect to the main Facebook homepage, at which point you get a couple of session cookies. Then, you connect to the login page, at which point you get some login cookies. Once you have THOSE, you can script the rest of Facebook.
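As a rough illustration of that two-stage flow, here is a minimal sketch. It is not the script from this post: the function name, the URLs, the form fields, and the cookie converter are all hypothetical and supplied by the caller, and nothing here reflects Facebook's real endpoints or field names.

```powershell
## Sketch only. The URLs, form fields, and converter are supplied by the caller;
## none of them are Facebook's real endpoints or field names.
function Invoke-TwoStageLogin
{
    param(
        [Parameter(Mandatory = $true)] [string] $HomeUrl,
        [Parameter(Mandatory = $true)] [string] $LoginUrl,
        [Parameter(Mandatory = $true)] [hashtable] $FormFields,
        ## Hypothetical converter: raw Set-Cookie text in, "name=value; ..." out
        [Parameter(Mandatory = $true)] [scriptblock] $ConvertCookie
    )

    ## WebClient handles the SSL handshake when the URL is https://
    $wc = New-Object System.Net.WebClient

    ## Stage 1: the home page hands out the session cookies
    $null = $wc.DownloadString($HomeUrl)
    $sessionCookie = & $ConvertCookie $wc.ResponseHeaders["Set-Cookie"]

    ## Stage 2: post the login form, sending the session cookies back
    $form = New-Object System.Collections.Specialized.NameValueCollection
    foreach($name in $FormFields.Keys)
    {
        $form.Add($name, [string] $FormFields[$name])
    }

    $wc.Headers["Cookie"] = $sessionCookie
    $null = $wc.UploadValues($LoginUrl, $form)
    $loginCookie = & $ConvertCookie $wc.ResponseHeaders["Set-Cookie"]

    ## Both sets of cookies go back to the server on later requests
    "$sessionCookie; $loginCookie"
}
```

One thing to watch for: WebClient follows redirects on its own, so cookies set on an intermediate redirect response would not show up in ResponseHeaders. A real script may need to account for that.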
Here is Get-FacebookCookie.ps1:
    ## Get-FacebookCookie.ps1
    ## Logs into Facebook, returning the cookie required for further operations
    param($Credential)
    $Credential = Get-Credential $Credential
    ## Get initial cookies
    $result = $wc.DownloadString("http://www.facebook.com/")
    ## Login
    $wc = New-Object System.Net.WebClient
    ## Get the resulting cookie, and convert it into the form to be returned in the query string
The most complex part of that script deals with the Set-Cookie headers returned by the server. Besides the cookie itself, each header carries the cookie’s domain, expiration date, and more. Browsers use this information to enforce cookie policies, but we just want to blindly feed the cookies back to Facebook. The little match, replace, and join combination converts the “Set-Cookie” syntax into one suitable to return to the server.
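To make the match, replace, and join idea concrete, here is a small sketch. ConvertTo-CookieHeader is a hypothetical helper written for illustration, not the code from Get-FacebookCookie.ps1, and it assumes you already have one Set-Cookie header per array element.

```powershell
## Sketch only: ConvertTo-CookieHeader is a hypothetical helper, not the
## code used in Get-FacebookCookie.ps1.
function ConvertTo-CookieHeader
{
    ## Assumes one Set-Cookie header value per array element
    param([string[]] $SetCookie)

    $pairs = foreach($header in $SetCookie)
    {
        ## Match: the leading "name=value" comes before the first semicolon.
        ## Everything after it (expires, path, domain) is policy for browsers.
        if($header -match '^\s*[^=;\s]+=[^;]*')
        {
            ## Replace: trim any whitespace around the pair
            $matches[0] -replace '^\s+|\s+$',''
        }
    }

    ## Join: the Cookie request header separates pairs with "; "
    $pairs -join '; '
}

ConvertTo-CookieHeader @(
    'datr=abc123; expires=Thu, 01-Jan-2010 00:00:00 GMT; path=/; domain=.example.com',
    'lsd=XyZ; path=/'
)
## datr=abc123; lsd=XyZ
```

Note that $wc.ResponseHeaders["Set-Cookie"] returns all of the Set-Cookie headers combined into a single comma-separated string, and expiration dates contain commas too, so that string needs careful splitting before a helper like this can be applied.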
Once we’ve logged into Facebook, we download the album page, cycling through successive pages until we stop finding photos to download. Since you can derive the URL to the large size images from the thumbnails, we don’t need to do an extra request for the photo information page. If you wanted to extract photo comments as well, then you would have to.
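The paging loop can be sketched roughly like this. Everything here is an assumption made for illustration: Get-AlbumImageUrl is a hypothetical name, and the page fetcher, the thumbnail pattern, and the thumbnail-to-large rewrite are passed in by the caller rather than taken from Facebook's actual markup or URL scheme.

```powershell
## Sketch only: the fetcher, the thumbnail pattern, and the thumbnail-to-large
## rewrite are hypothetical stand-ins supplied by the caller.
function Get-AlbumImageUrl
{
    param(
        [scriptblock] $GetPage,         ## page number in, HTML text out
        [regex] $ThumbnailPattern,      ## matches one thumbnail URL per photo
        [scriptblock] $ConvertToLarge,  ## thumbnail URL in, large image URL out
        [int] $MaxPages = 100           ## safety net against endless paging
    )

    for($pageNumber = 1; $pageNumber -le $MaxPages; $pageNumber++)
    {
        $html = & $GetPage $pageNumber
        $found = $ThumbnailPattern.Matches([string] $html)

        ## Stop at the first page that has no photos on it
        if($found.Count -eq 0) { break }

        ## Derive the large image URL straight from each thumbnail URL,
        ## so no extra request per photo is needed
        foreach($match in $found)
        {
            & $ConvertToLarge $match.Value
        }
    }
}
```

In a real script, $GetPage would wrap a WebClient call that sends the login cookie, and each URL the function emits would be handed to something like $wc.DownloadFile. Taking the fetcher as a script block also makes the loop easy to try out against canned HTML before pointing it at a live site.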
Here is Get-FacebookAlbum.ps1:
    ## Get-FacebookAlbum.ps1
    ## Downloads the images attached to a Facebook photo album
    param(
        $Album = $(Read-Host "Initial album URL (for example, http://www.facebook.com/album.php?aid=12345&id=12345&ref=nf)"),
        $Prefix,
        $Cookie
    )
    $albumId = $album -replace '.*id=([\d+]+).*','$1'
    if(-not $Cookie)
    $pageNumber = 1
    $wc = New-Object System.Net.WebClient
    ## Regex for images
    "(None)"
So in under 100 lines of code, we’ve got ourselves an image downloader.


No. 1 — September 4th, 2009 at 9:10 pm
I like it! I have wanted to do something like this to create my own personal friend feed. While researching this, I ran across quite a few people that lost their accounts for "screen scraping" facebook.
No. 2 — September 6th, 2009 at 3:55 pm
I wonder how long it takes before Facebook comes and complains about this like they did in the past with me and other people for interacting with their site by crafting HTTP requests like you do and not using the APIs http://www.muscetta.com/2007/09/06/facebook-status-change-is-not-a-crime/ – even if there WASN’T an API at the time to do what I wanted to do… http://www.muscetta.com/2007/08/03/facebook-statetray/
After much complaining, they eventually implemented an API for what I wanted to do http://www.muscetta.com/2007/10/01/facebook-implemented-a-usersetstatus-api/
So maybe we’ll have APIs for photos too, at one stage.
I keep using Flickr for my photos, tho.
No. 3 — September 29th, 2009 at 1:35 pm
Excellent article. I can see powershell getting much more popular for web pentesting with capabilities like this.