Scraping the Apple App Store

February 8, 2011

iOS developers would benefit from the historical rating, review, and rank information that iTunes provides, and so should be able to download and store it easily. Unfortunately, Apple is a tad paranoid about the App Store data it exposes. We think that an app distributor (Apple included) should provide a programmatic interface (API) to its store’s data. If you want to build your own “App Store Scraper”, you will find a few hints below.

Developers normally access the App Store via Apple iTunes. iTunes behaves like a specialized browser that sends HTTP queries to a web server. The web server replies in different ways depending on whether it identifies the caller as iTunes or as a web browser. If you want to see all reviews in the UK for the application with id=xxxxxxxxx (replace xxxxxxxxx with a real application id), you should request the URL:

http://itunes.apple.com/WebObjects/MZStore.woa/wa/customerReviews?s=143444&id=xxxxxxxxx&displayable-kind=11

If you paste this URL into your browser, you won’t see as much information as you would in iTunes. You may not see anything at all, and your browser may instead ask to open iTunes. Still, the URL above is the same one iTunes visits; the only difference lies in the way iTunes sends the request. Fortunately, you can trick Apple’s server into believing you are using iTunes when you are not, by making the request via cURL, a common application on most GNU/Linux distributions that has also been ported to Windows.

1. If you are on Windows and do not have cURL installed, download it, unzip it, and add its bin directory to the PATH variable;

2. Open a terminal window (on Windows, press Win+R and type cmd);

Once you have cURL installed, on either Windows or *nix, copy and paste the following into your terminal:

curl -H "Host: itunes.apple.com" -H "Accept-Language: en-us, en;q=0.50" -H "X-Apple-Store-Front: 143444,5" -H "X-Apple-Tz: 3600" -U "iTunes/9.2.1 (Macintosh; Intel Mac OS X 10.5.8) AppleWebKit/533.16" "http://itunes.apple.com/WebObjects/MZStore.woa/wa/customerReviews?s=143444&id=xxxxxxxxx&displayable-kind=11"

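If you are prompted for a password, just press Enter: the prompt appears because curl’s -U option sets proxy credentials (the option that sets the User-Agent string is -A). You should now see the actual XML file that iTunes receives, with all the reviews.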
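
For those who would rather script the request than type it, here is a minimal Python sketch of the same idea using only the standard library. The URL, the storefront id 143444, the “,5” suffix, and the header values are taken from the command above; the function names, the default arguments, and the choice to send the iTunes string as a User-Agent header are our own assumptions for illustration. Whether Apple’s server still answers this request is not something the sketch can guarantee.

```python
import urllib.parse
import urllib.request

# Endpoint taken from the post; it may no longer respond as described.
BASE_URL = "http://itunes.apple.com/WebObjects/MZStore.woa/wa/customerReviews"


def build_reviews_request(app_id, storefront="143444"):
    """Build (but do not send) a request that mimics the headers iTunes sends.

    `app_id` is the numeric application id; `storefront` defaults to the
    storefront id used in the post. Both names are hypothetical helpers.
    """
    query = urllib.parse.urlencode(
        {"s": storefront, "id": app_id, "displayable-kind": 11}
    )
    headers = {
        # The header the server appears to key on, as in the curl command.
        "X-Apple-Store-Front": storefront + ",5",
        "Accept-Language": "en-us, en;q=0.50",
        # Assumption: the iTunes string is sent as a real User-Agent here.
        "User-Agent": "iTunes/9.2.1 (Macintosh; Intel Mac OS X 10.5.8) "
                      "AppleWebKit/533.16",
    }
    return urllib.request.Request(BASE_URL + "?" + query, headers=headers)


def fetch_reviews_xml(app_id, storefront="143444", timeout=30):
    """Send the request and return the raw response body as bytes."""
    request = build_reviews_request(app_id, storefront)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return response.read()
```

Calling fetch_reviews_xml with a real application id should return the same XML that the curl command prints, provided the server still accepts the request. Building the request separately from sending it makes it easy to inspect the URL and headers before going over the network.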

11 Comments
  1. ENB14
    January 11, 2012 11:37 pm

    Keep getting the error, -bash: syntax error near unexpected token `(‘

    As I’m not a programmer, I have no idea what’s wrong.

  2. specious
    March 24, 2012 12:25 pm

    curl -H ‘Host: itunes.apple.com’ -H ‘Accept-Language: en-us, en;q=0.50′ -H ‘X-Apple-Store-Front: 143444,5′ -H ‘X-Apple-Tz: 3600′ -U ‘: iTunes/9.2.1 (Macintosh; Intel Mac OS X 10.5.8) AppleWebKit/533.16′ ‘http://itunes.apple.com/WebObjects/MZStore.woa/wa/customerReviews?s=143444&id=xxxxxxxxx&displayable-kind=11′

    To make it fully automatic :)

  3. specious
    March 26, 2012 9:35 am

    Actually, I have it reduced to this now and it still works:

    curl -H “X-Apple-Store-Front: 143444,5″ -U “:” “http://itunes.apple.com/WebObjects/MZStore.woa/wa/customerReviews?s=143444&id=452118074&displayable-kind=11&sort=4″

    143444,5 is the ID of the iTunes U.S.A. storefront. You can see all the others in this beautiful Ruby script: http://github.com/gonzoua/random-stuff/blob/master/appstorereviews.rb

    sort=4 is to sort by most recent review :-)

    • specious
      March 26, 2012 9:44 am

      It’s unfortunate this blog’s configuration insists on transcribing the quotes into artsy characters. So beware, you will need to transcribe them into some sort of standard quotes after you copy-paste the above command for it to work ;-)

      • March 26, 2012 9:54 am

        Yep, it refuses to have normal quotes… Thanks for your comments by the way!

  4. Mr. 305
    April 10, 2012 10:42 pm

    Thanks a lot for this post. Can anybody explain what “displayable-kind” means?
    My Wireshark capture shows “displayable-kind=2″ and I would like to understand what that means.
    Thanks a lot

  5. Karthik
    November 13, 2012 3:11 am

    Is there a paid service from Apple that would let us retrieve the historical rating, review, price, and rank information above, along with other details?

  6. imediaacademyomers
    September 20, 2014 11:44 am

    That’s great, thank you.

    Can you access the search results for a search term in a similar way?

    When I use the official iTunes Search API, the results differ (mostly in the order in which they are listed) from what I get when I search directly on the iOS device. (You can verify this by searching for a term using iTunes on your computer and comparing the results with a search in the “App Store” app on your iOS device.)

    • September 20, 2014 2:59 pm

      Hi
      Thanks for your comment. I have never tried to use the search, I’m sorry. BTW, I haven’t attempted any scraping for a while (and I’m surprised to hear this still works!). Yes, AFAIK Apple serves different rankings (or information in general) depending on whether the request comes from a mobile device, from iTunes or, possibly, from the official API…
