[ Team LiB ] Previous Section Next Section

Hack 86 Import Data Directly into Excel

figs/expert.giffigs/hack86.gif

Manipulate data the old-fashioned way—with a spreadsheet.

Microsoft's Excel spreadsheet program was created to analyze and manipulate data. Using Amazon's Web Services as a source of data, you can easily integrate live information into Excel spreadsheets. This example imports sales rank data for particular books and calculates the average rank.

86.1 What You Need

This hack relies on Excel's ability to perform web queries, so you'll need Excel 97 or higher. Excel's Web Queries tool transforms simple HTML tables into Excel spreadsheets.

86.2 The Code

This code uses several features of both Amazon Web Services and Excel. Once you see how it's put together, building your own queries is a snap.

86.2.1 Getting the Data

This hack starts with a standard XML/HTTP query. We want to analyze sales ranks of O'Reilly's Hacks series, so we build a standard query to retrieve those results.

http://xml.amazon.com/onca/xml3?t=insert associate tag [RETURN]
&dev-t=insert developer token&PowerSearch=publisher:O'Reilly [RETURN]
%20and%20keywords:Hack&type=heavy&mode=books&f=xml

This request uses a Power Search [Hack #9] to specify a publisher (O'Reilly) and a keyword (Hack).

86.2.2 Transforming the Data

The next task is to get the Amazon response data into a form that Excel can work with. Because Excel Web Queries rely on simple HTML, Amazon's response must be transformed. As mentioned, XSL stylesheets [Hack #83] are a quick way to make that happen.

Put the following code into a file called excel_SalesRank.xsl. This file will narrow the Amazon response to the fields needed and turn it into HTML.

<?xml version="1.0" ?> 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/
Transform">
<xsl:output method="html"/>
<xsl:template match="/">
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<body>
<table id="basic">
<tr>
    <th bgcolor="#cccccc" colspan="3">Sales Data</th>
</tr>
<tr>
    <th bgcolor="#999999">ASIN</th>
    <th bgcolor="#999999">Title</th>
    <th bgcolor="#999999">Sales Rank</th>
</tr>
<xsl:for-each select="ProductInfo/Details">
<tr>
    <td><xsl:value-of select="Asin" /></td>
    <td><xsl:value-of select="ProductName" /></td>
    <td><xsl:value-of select="SalesRank" /></td>
</tr>
</xsl:for-each>
<tr><td colspan="3"></td></tr>
<tr>
    <td bgcolor="#ffcc00" colspan="2" align="right">
        <b>Average Sales Rank</b>
    </td>
    <td bgcolor="#ffcc00">=ROUND(AVERAGE(C3:C<xsl:value-of 
select="count(ProductInfo/Details) + 2" />),0)</td>
    </tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

This file takes an AWS response and turns it into a simple HTML table. The xsl:for-each section loops through the Details node and adds a table row for each result. Once you upload this file to a publicly accessible server, you should be able to view the results of the transformation in a web browser by specifying the XSL file's URL in the request. Just modify the URL from the last step by replacing f=xml with f=http://example.com/excel_SalesRank.xsl.

http://xml.amazon.com/onca/xml3?t=insert associate tag [RETURN]
&dev-t=insert developer token &PowerSearch=publisher:O'Reilly[RETURN]
%20and%20keywords:Hack&type=heavy&mode=books[RETURN]
&f=http://example.com/excel_SalesRank.xsl 

You should see a table like the one in Figure 6-4 with the data from the previous request. Note that the last cell of the table contains an Excel function. It looks like gibberish at this point, but serves an important purpose once it's inside Excel.

Figure 6-4. HTML table of sales data
figs/amzh_0604.gif
86.2.3 Importing the Data

To glue the two applications together, we'll use an Excel Query (IQY) file. The file will hold all of the information about the query, including the URL that points to the data. Create a new file called amzn_avg_sales.iqy and add this code:

WEB
1
http://xml.amazon.com/onca/xml3?t=insert associate tag[RETURN]
&dev-t=insert developer token&PowerSearch=publisher:O'Reilly[RETURN]
%20and%20keywords:Hack&type=heavy&mode=books&f=http://example.com[RETURN]
/excel_SalesRank.xsl

The top line lets Excel know that this is a Web Query. The 1 is a Web Query version number (this will always be set to 1) followed by the URL of the AWS query that includes the XSL file. Save the file and note its location.

86.3 Running the Hack

To run the hack, double-click the amzn_avg_sales.iqy file. Excel should open, contact the URL, and populate a spreadsheet resembling Figure 6-5.

Figure 6-5. Excel spreadsheet with Amazon sales rank data
figs/amzh_0605.gif

You now have some useful data—the average sales rank of the books in an application well-suited to manipulation and data analysis. You can update the data at any time by right-clicking any of the data cells in the spreadsheet and choosing "Refresh Data."

86.4 Hacking the Hack

The tough part of this hack is knowing how to get data directly from AWS into Excel. Once inside Excel the data is available to all of the features Excel offers: calculations, graphing, user input, etc. Here are a few quick ways to extend this example further.

86.4.1 Making the Query Dynamic

Instead of limiting the data to static information built into the query inside the IQY file, you can add a bit of interactivity. Suppose we have a list of ASINs and want to know the average sales rank, but we don't have the list when we're building the IQY file. Excel offers the ability to prompt the user for information before making the Web Query.

The only change you need to make is to place the prompt information inside the Web Query URL where you'd like the user input to go. Change the URL inside amzn_avg_sales.iqy to:

http://xml.amazon.com/onca/xml3?t=insert associate tag &dev-t=insert  [RETURN]
developer token &PowerSearch=isbn:["ASINs","Enter a list of ASINs  [RETURN]
separated by pipe symbols (|)."] &type=heavy&mode=books&f=http://[RETURN]
example.com/excel_SalesRank.xsl

Now, you're prompted for a list of ASINs upon opening the file, as you can see in Figure 6-6.

Figure 6-6. ASINs Excel dialogue
figs/amzh_0606.gif

Try these if you don't have a list in mind:

1565927141|0596003595|0596002246|0596002505

This should give you the average sales rank for the ASINs you enter. It could be different every time!

86.4.2 Using Different Data

As in the previous example, changing the data you're working with is just a matter of changing the URL of the AWS query. But if you want to work with a different set of data (like the cost of each item rather than sales rank), you need to tweak the XSL stylesheet a bit.

To see how the stylesheet makes all the difference, create a new file called excel_PriceDiff.xsl and add the following code:

<?xml version="1.0" ?> 
<xsl:stylesheet version="1.0" xmlns:xsl="http://www.w3.org/1999/XSL/Transform">
<xsl:output method="html"/>
<xsl:template match="/">
<html xmlns:o="urn:schemas-microsoft-com:office:office"
xmlns:x="urn:schemas-microsoft-com:office:excel"
xmlns="http://www.w3.org/TR/REC-html40">
<body>
<table id="basic">
<tr>
    <th bgcolor="#cccccc" colspan="4">Sales Data</th>
</tr>
<tr>
    <th bgcolor="#999999">ASIN</th>
    <th bgcolor="#999999">Title</th>
    <th bgcolor="#999999">List Price</th>
    <th bgcolor="#999999">Amazon Price</th>
</tr>
<xsl:for-each select="ProductInfo/Details">
<tr>
<td>
  <xsl:value-of select="Asin" /> 
</td>
<td>
  <xsl:value-of select="ProductName" /> 
</td>
<td>
  <xsl:value-of select="ListPrice" /> 
</td>
<td>
  <xsl:value-of select="OurPrice" /> 
</td>
</tr>
</xsl:for-each>
<tr><td colspan="3"></td></tr>
<tr>
    <td bgcolor="#ffcc00" colspan="3" align="right">
        <b>Average List Price</b>
    </td>
    <td bgcolor="#ffcc00">=ROUND(AVERAGE(C3:C<xsl:value-of 
select="count(ProductInfo/Details) + 2" />),2)
    </td>
</tr>
<tr>
    <td bgcolor="#ffcc00" colspan="3" align="right">
        <b>Average Amazon Price</b>
    </td>
    <td bgcolor="#ffcc00">=ROUND(AVERAGE(D3:D<xsl:value-of 
select="count(ProductInfo/Details) + 2" />),2)
    </td>
</tr>
</table>
</body>
</html>
</xsl:template>
</xsl:stylesheet>

Upload this XSL file to a public server and note the URL. Create a new query file called amzn_price_diff.iqy and use the same code as in previous examples, but change the f= variable to the URL of the new stylesheet. Open the file and you should see a new spreadsheet with the list price and the Amazon price, as shown in Figure 6-7.

Figure 6-7. Excel spreadsheet with price data
figs/amzh_0607.gif
86.4.3 Graphing Results

Once the data is in Excel, it's easy to create graphs to get a sense of what the data means at a glance. Here's how to add a graph to the spreadsheet:

  1. Building on the last example, run the amzn_price_diff.iqy file. You should see data that includes a list of books along with the list price and Amazon price.

  2. Highlight all of the Title, List Price, and Amazon Price cells.

  3. Choose Insert Chart from the menu. This will start the chart wizard.

  4. Click "Finish."

You should now have a nice graphic representation of the data (Figure 6-8).

Figure 6-8. Graph of price data
figs/amzh_0608.gif
    [ Team LiB ] Previous Section Next Section