Generates a "top sites" list from Alexa. The list is written to a text file, as a service for other tools.


File           Version  Date         Comments
AlexaList.jar  1.0      23-Oct-2012


Usage: AlexaList global/country/category


AlexaList global
Global top 500 (up to) sites.
Output file '.../results/alexa/'
Downloaded page 1. Found 777 lines. URL:;0
Downloaded page 2. Found 777 lines. URL:;1
Downloaded page 20. Found 777 lines. URL:;19

$ head .../results/alexa/


A typical Alexa list displays 25 results per page. The page link for the global list ends with a semicolon followed by the page number minus 1. For example, the link for the first page ends with ;0, the second with ;1, etc.

Country page links end like this: ;7/US (page 8 in this example).

Category page links end like this: ;1/Top/News/Breaking_News (page 2 in this example).
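
The suffix rules above can be sketched in Java as follows. This is only an illustration, not the code inside AlexaList.jar: the class and method names are hypothetical, and the base URL is deliberately left out, so the methods return just the part of the link that follows it.

```java
// Hypothetical helper (not taken from AlexaList.jar): builds only the link
// suffix described above. The base URL is not part of this sketch.
final class AlexaPageSuffix {
    private AlexaPageSuffix() {}

    /** Global list: page 1 -> ";0", page 2 -> ";1". */
    static String global(int page) {
        return ";" + zeroBased(page);
    }

    /** Country list: page 8 of "US" -> ";7/US". */
    static String country(int page, String countryCode) {
        return ";" + zeroBased(page) + "/" + countryCode;
    }

    /** Category list: page 2 of "Top/News/Breaking_News" -> ";1/Top/News/Breaking_News". */
    static String category(int page, String categoryPath) {
        return ";" + zeroBased(page) + "/" + categoryPath;
    }

    /** Pages are 1-based for the user, 0-based in the link. */
    private static int zeroBased(int page) {
        if (page < 1) {
            throw new IllegalArgumentException("page must be >= 1, got " + page);
        }
        return page - 1;
    }

    public static void main(String[] args) {
        System.out.println(global(1));                             // ;0
        System.out.println(country(8, "US"));                      // ;7/US
        System.out.println(category(2, "Top/News/Breaking_News")); // ;1/Top/News/Breaking_News
    }
}
```

Keeping the "minus 1" conversion in one place makes it harder to mix up the page number shown to the user with the index used in the link.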

Currently each relevant entry in the HTML has the following structure:

<li class="site-listing">
 <div class="count">45</div>
 <div class="desc-container">
 <a href="/siteinfo/"></h2> 
 <span class="small topsites-label"> 
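
As an illustration of how the rank could be read from this structure, here is a minimal Java sketch. It relies only on the <div class="count"> element shown above; the class name is hypothetical, and a regular expression is an assumption made for brevity, not necessarily what AlexaList.jar does.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper: collects the numbers found in <div class="count">N</div>.
final class AlexaRankScanner {
    private static final Pattern COUNT =
            Pattern.compile("<div\\s+class=\"count\">\\s*(\\d+)\\s*</div>");

    private AlexaRankScanner() {}

    /** Returns every rank found in the page, in document order. */
    static List<Integer> ranks(String html) {
        List<Integer> found = new ArrayList<>();
        Matcher m = COUNT.matcher(html);
        while (m.find()) {
            found.add(Integer.parseInt(m.group(1)));
        }
        return found;
    }

    public static void main(String[] args) {
        String html = "<li class=\"site-listing\">\n"
                + " <div class=\"count\">45</div>\n"
                + " <div class=\"desc-container\">\n";
        System.out.println(ranks(html)); // [45]
    }
}
```

Since the markup is described as the current structure only, a pattern like this would need to be revisited whenever the page layout changes.
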

As of 18-Mar-2012 there are two bugs in Alexa that are relevant to this code:
  1. Sometimes a page is missing its last (25th) item.
  2. There is no "Next" button after the 20th page (the one with the 500th item), but it is sometimes possible to reach page 21 with a direct link.
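
One way to stay robust against both bugs is sketched below in Java. This is an assumption about a reasonable approach, not the actual AlexaList.jar logic: the class name and the fetchPage callback are hypothetical stand-ins for whatever downloads and parses a single page. The loop counts pages itself instead of following a "Next" button, never asks for page 21, and accepts a page with fewer than 25 entries.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.IntFunction;

// Hypothetical sketch of a page loop that tolerates the two bugs listed above.
final class AlexaPager {
    static final int MAX_PAGES = 20; // 20 pages x 25 results = up to 500 sites

    private AlexaPager() {}

    /**
     * @param fetchPage hypothetical callback returning the sites found on a
     *                  1-based page (possibly only 24 because of bug 1)
     */
    static List<String> collect(IntFunction<List<String>> fetchPage) {
        List<String> sites = new ArrayList<>();
        // Fixed page range: does not depend on a "Next" button and skips page 21 (bug 2).
        for (int page = 1; page <= MAX_PAGES; page++) {
            List<String> entries = fetchPage.apply(page);
            if (entries.isEmpty()) {
                break; // nothing on this page, so the list has ended early
            }
            sites.addAll(entries); // a short page is accepted as-is (bug 1)
        }
        return sites;
    }
}
```

With this approach the result can contain fewer than 500 sites, which matches the "(up to)" wording in the tool's output.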