Pull Google Analytics data into a Google Docs spreadsheet using the API

Update: a better solution

Google now presents an inbuilt library for accessing Google analytics from Google docs. You can see how to use it with a ready-made and powerful solution for template driven analytics reports and dashboards here.


This post is left here in case it is of interest to anyone. However, this is no longer the best way of accessing the Google Analytics API from Google Docs. Google have made an analytics API library available directly within docs script.


The Google Analytics Reporting API  enables you to

  • automate reporting tasks,
  • combine GA data with other information,
  • apply analysis and visualisation tools, and
  • access data you couldn’t easily reach via the Google Analytics UI.

As well as reducing repetitive work, this reduces your chance of errors, and enables you to build up more sophisticated analysis.

Advantages of using the API

The Google Analytics UI is great for exploring metrics – it’s user friendly and easy to discover features. However, for some tasks such as regular reporting and more sophisticated analysis, it is not always the sharpest tool. Using the API you can:

  • Specify a query parameters in the spreadsheet – this enables you to run exactly the same query across multiple segments, filters, dates or profiles, varying one parameter at a time as needed. In the spreadsheet you can create lists of queries to be run and analysed automatically
  • Retrieve data directly into the spreadsheet and use functions to calculate impacts, relate metrics to data from other sources, apply charting and visualisation tools, and pull highlights to the front
  • Retrieve data from disparate areas of the GA UI and separate profiles, to report in one place
  • Using the dynamic segments feature in the API you can create and apply segments on the fly – a task that is slow in the UI
  • Access data not visible through the standard UI – eg geolocation, filter on hostname

Background and other projects

For a while now I have been using a combination of:

These have saved many hours of work. When I need an updated report, I open a spreadsheet and click a button. I use these tools to run the same analysis across different periods and areas of sites, without having to remember (or re-invent) the complex regex needed to get the right groupings in a ‘mature’ URL scheme.

I wanted to build something using Google Spreadsheets that incorporate the benefits of these, and add a few features.

Prerequisites

You need a working Google Analytics account and profile to use this. Your Google account must have access to both the Analytics profile, and Google Docs.
The discussion assumes familiarity with:

Features

  • Version 3 of the Google’s reporting API to take advantage of the JSON return, and avoid uncertainty around the redirection from 2.3 to 2.4 and
  • OAuth for authorization, as recommended by Google – to avoid storing credentials in the spreadsheets.
There are two different ways of calling the script:
  • as a spreadsheet function – you can simply place a function in a cell, once the script is set up and authorised.
    • Depending on what you ask for, the function may return a single value, or an array from Google Analytics. If you ask for only one metric, the function will return a single value – simple. If you ask for multiple metrics and/or dimensions the function returns an array. In Google spreadsheets array functions are really easy to use.
    • The function is:
      getMetric(profile, metric, startdate, enddate, [dimensions], [segment], [filter], [sort], [maxresults])
    • any or all the parameters can be cell references
  • from the menu. Called this way the script uses query parameters specified in the _settings worksheet, and outputs data to a location also specified in _settings – a new or existing worksheet.
    • This results in a table of cells with the metrics in them, with no spreadsheet formulas behind the values, so no unexpected recalculation
    • Activated from a custom menu “GA”.
    • Values in the _settings sheet may also be references to other cells or formulas (for example to automatically update the target spreadsheet name depending on the date, or to change the filter from a list)

Issues

503 gateway timeout issues are frequent – I haven’t found a way to increase the timeout for a response from the GA API. I suggest just running the script again.

The script currently does not have the API registered application functions in it – I am so far unable to exhaust my grace quota for testing, so haven’t needed to use this. If you need it, you can add the values directly to the getGA function in script, or for a more elegant solution, follow Google’s example script for twitter authorization – which stores the oAuth shared secrets in script properties.

To Implement

This is offered for use with no warranty or support of any kind.

  1. Load this shared Google Docs spreadsheet with analytics script
  2. Select File|Make a copy (you have to be logged in to a Google account to do this)
  3. Update the “ids” value to one of your profile ids – should be something in the form “ga:######” – you can find the list in the query explorer, or look at the id parameter in the query string for a page in Google Analytics (at some point I will ad a feature to retrieve the list)
  4. Open the script editor (Tools|Script Editor…),
  5. Execute the function Run|getSettingGADataToSheet. This will prompt you to authorise the script to run. – select yes.
  6. Run the script again (Run|getSettingGADataToSheet). This will trigger the oAuth process for GA access – you may need to run it twice for all the authorisations. (After all this you won’t have to do this from the script environment again – from then you run it from the spreadsheet menu)
  7. Close and re-open the spreadsheet. If you don’t see the GA menu after a minute, go back to the script environment and make any edit (eg add a //comment) to the script and save. Sometimes the spreadsheet copy seems not to recognise the script until it has been edited. Close and re-open again.
  8. change the query parameters in _settings as you like, or setup sheets and use the getMetric function

Alternatively, if you want to start from a new or existing spreadsheet of your own, you can

  1. copy the script below into a new spreadsheet and
  2. (optionally – only needed if you want to invoke from a menu command) copy the table below into a worksheet called _settings, and set your parameters (especially the profile ID). The script expects to find this table in the top left of the _settings sheet, starting from A1. You can change anything in the values column as needed.
  3. pick up from step (4) above

Details of Script and Sheet

A worksheet

Called “_settings” with the following

Key Value Notes
metrics ga:entrances,ga:bounces,ga:avgTimeOnPage,ga:pageviews,ga:uniquePageviews required
dimensions ga:pagepath
filters
segment
ids ga:11111111 required – the profile id
sort -ga:pageviews
start-date 01/07/2011 required
end-date 14/01/2012 required
start-index 1 1 or more
max-results 250
targetSheetName Results20120123 sheet will be created if doesn’t exist
startRow 1 Where 1st row of data will be placed on target sheet – integer
startCol 1 Where 1st column of data will be placed on target sheet – integer
showParams TRUE Show the query parameters with the results

The script

/*
Copyright (c) 2012 Justin Gough 

http://www.peakconversion.com/

Licensed under the MIT license: http://www.opensource.org/licenses/mit-license.php
*/

var settings = getSettings();

function onOpen() {
  var ss, menuEntries;
  ss = SpreadsheetApp.getActiveSpreadsheet();
  menuEntries = [ {
    name : "Activate settings query",
    functionName : "getSettingGADataToSheet"
  } ];
  ss.addMenu("GA", menuEntries);
}

function getMetric(profile, metric, startdate, enddate, dimensions,
  segment, filter, sort, maxresults) {
  // if called with one metric returns a single value
  // if called with multiple metrics returns an array
  var params, query, returnVal, gaData;
  params = {
    "ids" : profile,
    "metrics" : metric,
    "segment" : segment,
    "dimensions" : dimensions,
    "filters" : filter,
    "start-date" : startdate,
    "end-date" : enddate,
    "sort" : sort,
    "max-results" : maxresults
  };
  query = gaQuery(params);
  gaData = query.fetchResult();
  if (gaData.error) {
    returnVal = gaData.errorMsg;
  } else {
    returnVal = gaData.rows;
  }
  return returnVal;
}

function getSettingGADataToSheet() {
  var gaData, query;
  query = gaQuery(settings);
  gaData = query.fetchResult();
  if (gaData.error) {
    Browser.msgBox(gaData.errorMsg);
  } else {
    populateSpreadsheet(gaData, query.info);
  }

  // debug function
  var gaString = Utilities.jsonStringify(gaData);
  Logger.log(gaString);
}

// -----------------------------------------------------------------
// set up request

function gaQuery(params) {
  var baseURL, startDate, endDate, returnObj, query, qryStr, qryInfo, missingParams;
  baseURL = "https://www.googleapis.com/analytics/v3/data/ga?";
  startDate = dateForURL(params["start-date"]);
  endDate = dateForURL(params["end-date"]);
  qryStr = "";
  qryInfo = [ [ "Query info", "" ] ];
  missingParams = [];
  query = {
    addElem : function(elemName, isMandatory, value) {
      if (typeof value === "undefined") {
        value = params[elemName] || "";
      }
      if (isMandatory && (value === "")) {
        missingParams.push(elemName);
      }
      // add to query info even if zero length value
      qryInfo.push([ elemName, value ]);
      // add to query string if there is an actual value
      if (value !== "") {
        if (qryStr !== "") {
          qryStr = qryStr + "&";
        }
        qryStr = qryStr + elemName + "=" + encodeURIComponent(value);
      }
      return this;
    }
  };
  query.addElem("ids", true)
    .addElem("dimensions")
    .addElem("metrics", true)
    .addElem("segment")
    .addElem("filters")
    .addElem("sort")
    .addElem("start-date", true, startDate)
    .addElem("end-date", true, endDate)
    .addElem("start-index")
    .addElem("max-results");
  if (missingParams.length) {
    returnObj =
      {
        uri : "",
        info : [],
        fetchResult : function() {
          return {
            "error" : true,
            "errorMsg" : "Mandatory parameter(s) missing or invalid. Check "
              + missingParams.join(",")
          };
        }
      };
  } else {
    returnObj = {
      uri : baseURL + qryStr,
      info : qryInfo,
      fetchResult : function() {
        var result = getGA(baseURL + qryStr);
        qryInfo.push([ "", "" ]);
        qryInfo.push([ "Result info", "" ]);
        qryInfo.push([ "Total results", result.totalResults ]);
        qryInfo.push([ "sampled", result.containsSampledData ]);
        return result;
      }
    };
  }
  return returnObj;
}

// ------------------------------------------------------------
// interaction with Analytics API
function getGA(URL) {
  var oauthConfig, requestData, result;
  oauthConfig = UrlFetchApp.addOAuthService("google");
  oauthConfig
    .setAccessTokenUrl("https://www.google.com/accounts/OAuthGetAccessToken");
  oauthConfig
    .setRequestTokenUrl("https://www.google.com/accounts/OAuthGetRequestToken?" +
    		"scope=https://www.googleapis.com/auth/analytics.readonly");
  oauthConfig
    .setAuthorizationUrl("https://www.google.com/accounts/OAuthAuthorizeToken");
  oauthConfig.setConsumerKey("anonymous");
  oauthConfig.setConsumerSecret("anonymous");
  requestData = {
    "method" : "GET",
    "oAuthServiceName" : "google",
    "oAuthUseToken" : "always",
  };
  Logger.log("doing fetch: " + URL);

  try {
    result = UrlFetchApp.fetch(URL, requestData);
    result = Utilities.jsonParse(result.getContentText());
  } catch (e) {
    if (e.message) {
      Logger.log(e.message);
      result = {
        "error" : true,
        "errorMsg" : e.message
      };
    }
  }
  Logger.log("done fetch");
  return result;
}

// ------------------------------------------------------------
// functions to read and manipulate the spreadsheet

function getSettings() {
  // returns a javascript object where each name-value pair in the
  // _settings sheet is represented by a member
  var settingSheetName, ss, sheet, numRows, cells, settingsObj, i;
  // pre-load some default settings
  settingsObj = {
    "start-index" : 1,
    "max-results" : 100,
    "startRow" : 1,
    "startCol" : 4,
    "showParams" : true
  };
  settingSheetName = "_settings";
  ss = SpreadsheetApp.getActiveSpreadsheet();
  sheet = ss.getSheetByName(settingSheetName);
  numRows = sheet.getLastRow() - 1; // data start at row 2
  cells = sheet.getRange(2, 1, numRows, 2).getValues();
  for (i = 0; i < cells.length; ++i) {
    settingsObj[cells[i][0]] = cells[i][1];
  }
  return settingsObj;
}

function populateSpreadsheet(gaData, qryInfo) {
  // gaData is the JSON result from the GA query
  // qryInfo is an array with the query parameters and result metadata
  // the rows object within the json is conveniently compatible with spreadsheet
  // setValues
  var i, rngData, sheet, rngHeadsRow, destinationRange;
  rngData = gaData.rows;
  sheet = targetSheet();
  // pull the column headers out of json and splice into beginning of data array
  rngHeadsRow = [];
  for (i = 0; i < gaData.columnHeaders.length; ++i) {
    rngHeadsRow.push(gaData.columnHeaders[i].name.replace('ga:', ''));
  }
  rngData.splice(0, 0, rngHeadsRow);
  // put the data in and set froze rows if 4 or less would be frozen (any
  // further down and we
  // are likely placing date below soemthing else)
  destinationRange =
    sheet.getRange(settings["startRow"], settings["startCol"],
      rngData.length, rngData[0].length);
  destinationRange.setValues(rngData);
  if (settings["startRow"] < 5) {
    sheet.setFrozenRows(1);
  }
  // put the query info in if settings say so
  if (settings["showParams"]) {
    destinationRange =
      sheet.getRange(settings["startRow"], settings["startCol"]
        + rngData[0].length + 1, qryInfo.length, qryInfo[0].length);
    destinationRange.setValues(qryInfo);
  }
  // format the columns based on the heads info
  for (i = 0; i < gaData.columnHeaders.length; ++i) {
    destinationRange =
      sheet.getRange(settings["startRow"] + 1, settings["startCol"] + i,
        rngData.length - 1, 1);
    switch (gaData.columnHeaders[i].dataType) {
      case "STRING":
        // do nothing - if this is a date and you set the format, strange things
        // happen
        // ga returns dates as strings like yyyymmdd
        break;
      case "INTEGER":
        destinationRange.setNumberFormat("#,##0");
        break;
      case "TIME":
        destinationRange.setNumberFormat("0.00");
        break;
    }
  }
}

function targetSheet() {
  var ss, sheetName, sheet, msg, usrCheckResult;
  ss = SpreadsheetApp.getActiveSpreadsheet();
  sheetName = settings["targetSheetName"];
  if (sheetName === "") {
    Browser
      .msgBox("The setting 'targetSheetName' was not found. A new sheet will be started");
    sheet = ss.insertSheet();
  } else {
    sheet = ss.getSheetByName(sheetName);
    if (sheet == null) {
      sheet = ss.insertSheet(sheetName);
    } else {
      // this is an existing sheet, check if there is data in it
      if (sheet.getLastRow() > 0) {
        msg =
          "The sheet "
            + sheetName
            + " exists and has data in it. "
            + "Cells in the target area will be overwritten. Cells not " 
            +	"in the target area will not. \\nProceed? \\n(If you say 'no' a new sheet "
            + "will be created instead.)";
        usrCheckResult =
          Browser.msgBox("overwrite sheet?", msg, Browser.Buttons.YES_NO);
        if (usrCheckResult === "no") {
          sheet = ss.insertSheet();
        }
      }
    }
  }
  return sheet;
}

// -------------------------------------------------------------
// utility functions for formatting for URL

function dateForURL(inDate) {
  var dTmp, sTmp, day, year, month;
  dTmp = new Date(inDate);
  if ((Object.prototype.toString.call(dTmp) === '[object Date]')
    && isFinite(dTmp)) {
    year = dTmp.getFullYear();
    month = dTmp.getMonth();
    month++; // js month is zero based
    if (month.toString().length === 1) {
      month = "0" + month;
    }
    day = dTmp.getDate();
    if (day.toString().length === 1) {
      day = "0" + day;
    }
    sTmp = year + "-" + month.toString() + "-" + day.toString();
  }
  return sTmp;

}

Feedback

Any feedback or suggestions are welcome, using the comments below or justin@peakconversion.com.

Be Sociable, Share!

16 comments to Pull Google Analytics data into a Google Docs spreadsheet using the API

  • [...] Tralasciando quelli a pagamento, il problema principale che riscontro è che spesso la presenza di un proxy impedisce a Excel di connettersi a GA. Mi sono così orientato verso una soluzione che collegasse Analytics a Google Docs, così che il passaggio dei dati sia server-server e si possano evitare intoppi. Per prima cosa sono capitato su questo post di SEOmoz, che sembra promettente ma che non ho ancora sperimentato. La soluzione più immediata – per le mie limitate esigenze del momento – è stato leggere questo post di peakconversion.com. [...]

  • Justin

    Grazie Marco!
    I’m really glad you found this useful.
    Justin

  • Nice to meet. This code is great. By the way, this is what the license of the code? GPL? MIT? I have published the code for the Google Analytics blog. And many will want to modify.

  • Justin, great stuff, thanks for putting this together!

    Found a small bug, filters are not currently getting picked up due to a typo on line 31: “filter” > “filters” (should be plural).

    • Justin

      Oops – thanks very much for spotting that Ilya! Fixed now.

      And apologies for the delay in picking up the comment – I have my spam filter sorted now so hopefully should be more responsive next time!

  • Kevin McHugh

    Hi Justin,

    Awesome tool, thanks for putting this together. After doing a debug, I am getting an error from line 209: “TypeError: Cannot call method “splice” of undefined”. Any thoughts on how to correct this? Apologies in advance, I am not an advanced coder.

  • Krischan

    Hi Justin,

    very cool stuff, thanks for sharing! As you mentioned above, I often get the 503 gateway timeout messages. But running the script again somehow doesn’t work, as the data (in this case the error message) seem to be cached for a certain period of time. Any ideas? Thanks a lot!

    • Justin

      Hi Krischan
      Very sorry for my slow reply, and thanks very much for the kind feedback!
      I have noticed this problem – it does seem to cache for a while. If you do anything that changes the URL for the API call it fetches the new URL – so changing any parameter. I haven’t tried adding a spurious parameter to the URL, but that just might work. (I don’t know if the google guys would like the suggestion though!)
      Cheers
      Justin

  • Martino Bagini

    Hey Justin. Thanks a lot for this !

    I’m unfortunately getting insufficientPermissions to perform the operation when running getSettingGAData.. I’m using GDocs with the administrator of the Google Analytics account. Can’t understand why I shouldn’t have these permissions.

    Any insights are welcome and congrats again !

    Martino

  • Oscar

    I want to see the unique visitors of a specific segment. The segment is called ‘Optinpages’, which includes all my optin pages on my site.

    But how do I do that with spreadsheets to get that data automatically?

    I already tried everthing to get this working, but I don’t get it. Here are the reference codes: https://developers.google.com/apps-script/class_analytics_v3_schema_segment

  • Fantastic script. Exactly what I required for a new project. Also, I was receiving several quote issues, but by adding: Utilities.sleep(5000); to line 155 solved the issue. This will allow it to wait 5 second each time before querying.

  • Fantastic script, thanks for putting this together. I have only one problem. Transaction metrics comes with a ‘dot’ floating sign. Spreadsheet can’t recognize them as number so can’t summarize these values. Would it be possible to round values?

    • It has been solved:
      for (i = 1; i < rngData.length; i++) {
         for (j = 1; j < rngData[i].length; j++) {
      rngData[i][j] = Math.round(Number(rngData[i][j]));
         }
       }

  • We’ve been using this script for a while but when revenue data is beeing extracted, the numbers are not in the right format, they come as a string and in addition to this when the amount is greater than 9mill, the result is even worse (ex: 1.1512202E7)
    we managed to make some changes in the script so that all numbers show in the right format (not as a string) so that you can add further formatting like decimals or type of currency
    now we got stuck into sums which are equal or above 10mill as mentioned above
    any clues?

Leave a Reply

  

  

  

You can use these HTML tags

<a href="" title=""> <abbr title=""> <acronym title=""> <b> <blockquote cite=""> <cite> <code> <del datetime=""> <em> <i> <q cite=""> <strike> <strong>