Matthew D. Laudato writes about software and technology

Archive for March 2013

Using REST APIs from R – Campaign statistics

with 2 comments

In my previous posts in this series, we looked at how to call REST APIs from R. Now let’s get serious and get real email campaign data from the Constant Contact APIs. To do this, we’ll write a new function to do more sophisticated parsing of the XML returned from the Constant Contact campaigns resource. The algorithm looks like this:

  • GET the campaigns collection, which provides high level XML data on all campaigns
  • Transform the raw XML into a DOM object
  • Iterate over each campaign in the DOM and make a second call to the APIs to get the detail data for that campaign, and assemble an R dataframe object that holds the data for all campaigns

To get started, here’s a simple R script that we’ll use to set up the call to our new function:

campaignXML = getURL(url = "{username}/campaigns?access_token=your_token_goes_here")
campaignDOM = xmlRoot(xmlTreeParse(campaignXML))
campaignStats = getCampaignDataframe(campaignDOM)

You should review the previous posts (here and here) if this script doesn’t make sense to you.

The heavy lifting is done by our new function, getCampaignDataframe. The code is pretty straightforward, and amounts to parsing the DOM object to extract per-campaign data:

getCampaignDataframe <- function(doc) {
 namelist <- NULL
 urllist <- NULL
 sends <- NULL
 opens <- NULL
 for (i in 1:xmlSize(doc)) {
    node <- doc[[i]]
    namelist <- c(namelist, node$children[["content"]]$children$Campaign[["Name"]]$children$text$value)
    url = node$children[["content"]]$children$Campaign$attributes[["id"]]
    url = sub ("http","https",url)
    urllist <- c(urllist, url)
    if (length(url) > 0) {
       url = paste(url,"?access_token=your_token_goes_",sep="",collapse=NULL)
       campaignDetailXML = getURL(url = url)
       campaignDetailDOM = xmlRoot(xmlTreeParse(campaignDetailXML))
       sends <- strtoi(c(sends, campaignDetailDOM[["content"]]$children$Campaign[["Sent"]]$children$text$value))
       opens <- strtoi(c(opens, campaignDetailDOM[["content"]]$children$Campaign[["Opens"]]$children$text$value))
 campaignstats = data.frame(name=namelist,url=urllist,sends=sends,opens=opens,stringsAsFactors=FALSE)

The function takes a single parameter ‘doc’, which is a DOM object containing the list of campaigns – including the URLs to get to the campaign detail. The for() loop simply iterates over the DOM and extracts the campaign name and URL. It then issues a GET request to the URL to obtain the sends and opens for that campaign. Once this is done, the data is assembled into the data frame and returned. Here is an example of the dataframe contents when I run the script against one of my Constant Contact accounts (with specific identifying data for username and campaign removed):

name                                                                url                                                                                                                                                                           sends    opens

1 Email Created 2012/11/10, 9:03 AM{username}/campaigns/{campaignid}           3             1
2 Created via API as UTF-8 v6       {username}/campaigns/{campaignid}           3             1
3 Created via API as UTF-8 v5       {username}/campaigns/{campaignid}           3             2

Given this raw data, now in a useful form in a dataframe, we can use R to help us calculate the open rate for emails. In R, this is as simple as:

openRate <- campaignStats$opens/campaignStats$sends
campaignStats$openRate <- openRate

This adds a new column called openRate to the dataframe that contains the calculated open rates for each email campaign. Admittedly this is a simple example, but at this point, you have all the tools you need to pull data into R from a REST API, and do some basic manipulations on it.

Happy Modeling!

– Matt

Written by Matthew D. Laudato

March 13, 2013 at 4:30 pm