Google Analytics – using Python API


Google Analytics is a free tool for website traffic analysis, and it is one of the most widely used analytics tools. Google also provides an API for querying analytics data. There are several reasons to pull that data into your own custom analysis: for example, you might want to group certain pages into a category, follow trends, or compare traffic on a specific day or around an event.

This post covers the basic Python setup for getting started with the Google Analytics API. We generally use Google Analytics and Google Webmasters data to run our analytics. (Remember that Google does not keep Webmasters data long term and only shows the last 90 days of information.)

As root, install the Google API Python client libraries:

#pip install --upgrade google-api-python-client
...
...
#[root@localhost Downloads]# ls -lrt /usr/lib/python2.7/site-packages/
...
-rw-r--r--.  1 root root  30888 Nov 19 12:07 six.py
-rw-r--r--.  1 root root  30210 Nov 19 12:07 six.pyc
drwxr-xr-x.  2 root root    131 Nov 19 12:07 six-1.11.0.dist-info
drwxr-xr-x.  5 root root    150 Nov 19 12:07 pyasn1
drwxr-xr-x.  2 root root    147 Nov 19 12:07 pyasn1-0.3.7.dist-info
drwxr-xr-x.  2 root root   4096 Nov 19 12:07 pyasn1_modules
drwxr-xr-x.  2 root root    147 Nov 19 12:07 pyasn1_modules-0.1.5.dist-info
drwxr-xr-x.  3 root root   4096 Nov 19 12:07 oauth2client
drwxr-xr-x.  2 root root    131 Nov 19 12:07 oauth2client-4.1.2.dist-info
-rw-r--r--.  1 root root    265 Nov 19 12:07 easy-install.pth
drwxr-xr-x.  2 root root     45 Nov 19 12:07 apiclient
drwxr-xr-x.  3 root root   4096 Nov 19 12:07 googleapiclient
drwxr-xr-x.  2 root root    131 Nov 19 12:07 google_api_python_client-1.6.4.dist-info
....
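Before moving on, it can help to confirm that the libraries import correctly. The following is a minimal sketch, not part of the original setup: the helper name check_packages is hypothetical, and the package list is taken from the site-packages listing above.

```python
import importlib

# Top-level packages seen in the site-packages listing above.
PACKAGES = ["googleapiclient", "oauth2client", "six", "pyasn1"]


def check_packages(names):
    """Return a dict mapping each package name to True if it can be imported."""
    status = {}
    for name in names:
        try:
            importlib.import_module(name)
            status[name] = True
        except ImportError:
            status[name] = False
    return status


if __name__ == "__main__":
    for name, ok in sorted(check_packages(PACKAGES).items()):
        print("{0}: {1}".format(name, "OK" if ok else "MISSING"))
```

If any package is reported as MISSING, re-running the pip command above is the first thing to try.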

1. Create Credentials for your application

Create new credentials from the Google API console: https://console.developers.google.com/apis/credentials


2. Create a client_secrets.json file in the same directory as the Python script

You can download this file from the Google API console when you create new credentials.

{
  "installed": {
    "client_id": "<client id>",
    "client_secret":"<client secret>",
    "redirect_uris": ["http://localhost", "urn:ietf:wg:oauth:2.0:oob"],
    "auth_uri": "https://accounts.google.com/o/oauth2/auth",
    "token_uri": "https://accounts.google.com/o/oauth2/token"
  }
}
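A quick sanity check of the downloaded file can catch copy and paste mistakes early. This is only a sketch under the assumption that the file uses the "installed" layout shown above; the function names and the list of required keys are illustrative, not part of any Google library.

```python
import json

# Keys that appear in the "installed" section of the example file above.
REQUIRED_KEYS = ("client_id", "client_secret", "redirect_uris",
                 "auth_uri", "token_uri")


def missing_keys(secrets):
    """Return the expected keys that are absent from the "installed" section."""
    installed = secrets.get("installed", {})
    return [key for key in REQUIRED_KEYS if key not in installed]


def load_client_secrets(path="client_secrets.json"):
    """Load the file and raise ValueError if any expected key is missing."""
    with open(path) as handle:
        secrets = json.load(handle)
    missing = missing_keys(secrets)
    if missing:
        raise ValueError("client_secrets.json is missing: " + ", ".join(missing))
    return secrets
```

Calling load_client_secrets() from the directory that holds the script either returns the parsed dictionary or names the keys that are missing.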


3. Create a Python script to get started

Let’s create a Python script called ga.py. It is a simple script to get started with: it displays information about your Google Analytics account.

#!/usr/bin/python

import sys

from googleapiclient.errors import HttpError
from googleapiclient import sample_tools
from oauth2client.client import AccessTokenRefreshError

# Pass sys.argv so that command-line flags such as --noauth_local_webserver are honored.
service, flags = sample_tools.init(
      sys.argv, 'analytics', 'v3', __doc__, __file__,
      scope='https://www.googleapis.com/auth/analytics.readonly')

# List the Google Analytics accounts this user can access.
accounts = service.management().accounts().list().execute()
print(accounts)

When you run the code for the first time, you will see the following:

#./ga.py 
/usr/lib/python2.7/site-packages/oauth2client/_helpers.py:255: UserWarning: Cannot access analytics.dat: No such file or directory
  warnings.warn(_MISSING_FILE_MESSAGE.format(filename))

Your browser has been opened to visit:

    https://accounts.google.com/o/oauth2/auth?scope=https%3A%2F%2Fwww.googleapis.com%2Fauth%2Fanalytics.readonly&redirect_uri=http%3A%2F%2Flocalhost%3A8080%2F&response_type=code&client_id=333565445573-kkrg64534reftrgefwefwef2rfeadfwr342efd322345terfefwfs.apps.googleusercontent.com&access_type=offline

If your browser is on a different machine then exit and re-run this
application with the command-line parameter

  --noauth_local_webserver
...
...

#

Authorize the application from the URL above. A file named analytics.dat, which holds the credentials, will then be created on the Linux machine in the same directory as the Python script.


4. Run the code

It will print some information about your Google Analytics account.

[root@localhost Downloads]# ./ga.py 
{u'username': u'editoxr@bitarray.io', u'kind': u'analytics#accounts', u'items': [{u'kind': u'analytics#account', u'name': u'www.bitarray.io', u'created': u'2016-04-06T21:27:32.000Z', u'updated': u'2017-08-12T00:19:57.493Z', u'childLink': {u'href': u'https://www.googleapis.com/analytics/v3/management/accounts/<id>/webproperties', u'type': u'analytics#webproperties'}, u'id': u'<id>', u'selfLink': u'https://www.googleapis.com/analytics/v3/management/accounts/<id>', u'permissions': {u'effective': []}}], u'itemsPerPage': 1000, u'startIndex': 1, u'totalResults': 1}
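The raw dictionary is hard to read, so here is a small sketch of pulling out just the account ID and name. The helper summarize_accounts is a hypothetical name, and the sample dictionary is a trimmed copy of the response above (with the same "<id>" placeholder) rather than a live API result.

```python
def summarize_accounts(response):
    """Return (id, name) pairs from an accounts().list() response dictionary."""
    return [(item.get("id"), item.get("name"))
            for item in response.get("items", [])]


# Trimmed copy of the response shown above; "<id>" is the placeholder used there.
sample = {
    "kind": "analytics#accounts",
    "items": [
        {"kind": "analytics#account", "id": "<id>", "name": "www.bitarray.io"},
    ],
    "totalResults": 1,
}

for account_id, name in summarize_accounts(sample):
    print("{0}\t{1}".format(account_id, name))
```

In ga.py the same function could be called on the accounts variable instead of the sample.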

 

Please make sure you can run the script successfully before moving on to the next section.

 


Now let’s change the code to add a few more things:

Get real-time users:

#!/usr/bin/python

import sys

from googleapiclient.errors import HttpError
from googleapiclient import sample_tools
from oauth2client.client import AccessTokenRefreshError

# Pass sys.argv so that command-line flags such as --noauth_local_webserver are honored.
service, flags = sample_tools.init(
      sys.argv, 'analytics', 'v3', __doc__, __file__,
      scope='https://www.googleapis.com/auth/analytics.readonly')

# Ask the Real Time Reporting API for the number of users currently on the site.
realtime = service.data().realtime().get(ids="ga:XXXXXX", metrics="rt:activeUsers").execute()
print(realtime)

NOTE: The XXXXXX in ga:XXXXXX is the View ID (the API documentation also calls it the profile ID), NOT the Account ID or the Tracking ID! You can look up the View ID in your Analytics account.

Once you run the above code, you will get data on real-time users:

# ./ga.py 
{u'kind': u'analytics#realtimeData', u'rows': [[u'49']], u'totalResults': 1, u'totalsForAllResults': {u'rt:activeUsers': u'49'}, u'columnHeaders': [{u'dataType': u'INTEGER', u'columnType': u'METRIC', u'name': u'rt:activeUsers'}],
...
...
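The active user count comes back as a string inside the response. A minimal sketch of extracting it as an integer, assuming the response has the shape shown above (the helper name active_users is hypothetical, and the sample is a trimmed copy of that output):

```python
def active_users(response):
    """Return the rt:activeUsers total from a realtime response as an int."""
    totals = response.get("totalsForAllResults", {})
    return int(totals.get("rt:activeUsers", 0))


# Trimmed copy of the realtime response shown above.
sample = {
    "kind": "analytics#realtimeData",
    "rows": [["49"]],
    "totalResults": 1,
    "totalsForAllResults": {"rt:activeUsers": "49"},
}

print(active_users(sample))
```

Falling back to 0 when the key is absent is a design choice for this sketch; raising an error instead may suit stricter scripts.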

 

Let’s add the line below to the Python code to get data for a given date range:

print(service.data().ga().get(
      ids='ga:21434938', start_date='2017-01-01', end_date='2017-01-15',
      metrics='ga:visits', dimensions='ga:source,ga:keyword',
      sort='-ga:visits').execute())

Once you run the above code, you will get output like:

#./ga.py

u'kind': u'analytics#gaData', u'rows': [[u'google', u'(not provided)', u'154406'], [u'(direct)', u'(not set)', u'23692'], [u'yahoo', u'(not provided)', u'1257'], [u'bitarray.io', u'(not set)', u'1056'], [u'bing', u'(not provided)', u'906'], [u'facebook.com', u'(not set)', u'496'], [u'website-analytics.online', u'(not set)', u'423'], [u'l.facebook.com', u'(not set)', u'317'], [u'm.facebook.com', u'(not set)', u'295'], [u'yandex', u'(not set)', u'213'],...
...
...
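Each row in the output is a plain list whose values follow the order of the query: the dimensions first (ga:source, ga:keyword), then the metric (ga:visits). The sketch below turns such rows into dictionaries and totals the visits. The function name rows_to_records is hypothetical, and the sample rows are the first three from the output above.

```python
# Column order follows the query: dimensions first, then metrics.
COLUMNS = ("ga:source", "ga:keyword", "ga:visits")


def rows_to_records(rows, columns=COLUMNS):
    """Convert list-style rows into dictionaries keyed by column name."""
    records = []
    for row in rows:
        record = dict(zip(columns, row))
        # The API returns metric values as strings; convert visits to int.
        record["ga:visits"] = int(record["ga:visits"])
        records.append(record)
    return records


# First three rows from the output shown above.
sample_rows = [
    ["google", "(not provided)", "154406"],
    ["(direct)", "(not set)", "23692"],
    ["yahoo", "(not provided)", "1257"],
]

records = rows_to_records(sample_rows)
total_visits = sum(record["ga:visits"] for record in records)
print(total_visits)
```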

 

API docs: https://developers.google.com/resources/api-libraries/documentation/analytics/v3/python/latest/

 


How can you use the data? Why use the API when Analytics already has all the information?

We run API queries and store the data in our local database so that we can integrate it with our CMS and run our own reports.

We have some big sites, and we use this data to see trends:

  1. We have grouped our pages into several categories, and we look at trends per category (Google Analytics offers only limited options for this as of now).
  2. We have integrated analytics data with our own CMS, so we can see the data for every page (while editing it, for example).
  3. We like to write notes on each page so that we keep a history of which changes affected our rankings.

Once you store the data locally, you have many options to query and use it according to your needs. For example, if your site is hosted in multiple locations (such as Asia and North America), you might want to change content (menu links, sidebar, etc.) based on which pages are viewed most in Asia or in North America.
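Grouping pages into categories could look roughly like the sketch below. Everything in it is an assumption made for illustration: the URL prefixes, the category names, and the sample rows are made up, and it presumes the rows came from a query that used ga:pagePath as the dimension and ga:visits as the metric.

```python
from collections import defaultdict

# Hypothetical mapping from URL path prefix to category name.
CATEGORY_PREFIXES = [
    ("/python/", "python"),
    ("/linux/", "linux"),
]


def categorize(path):
    """Return the category for a page path, or "other" if no prefix matches."""
    for prefix, category in CATEGORY_PREFIXES:
        if path.startswith(prefix):
            return category
    return "other"


def visits_by_category(rows):
    """Sum visits per category from (page_path, visits) rows."""
    totals = defaultdict(int)
    for path, visits in rows:
        totals[categorize(path)] += int(visits)
    return dict(totals)


# Made-up rows in the shape the API returns: [page path, visits as a string].
sample_rows = [
    ["/python/ga-api/", "120"],
    ["/python/pandas/", "80"],
    ["/linux/cron/", "45"],
    ["/about/", "5"],
]

print(visits_by_category(sample_rows))
```

A prefix list keeps the mapping easy to edit; a CMS that already stores a category per page could supply the mapping instead.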

These are some of the ways we use the API data.
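Storing the rows locally can be sketched with Python's built-in sqlite3 module. This is not the database setup described in this post: the table name ga_visits, its columns, and the function store_rows are hypothetical. The rows and date range are taken from the date range query earlier in the post.

```python
import sqlite3


def store_rows(conn, start_date, end_date, rows):
    """Save [source, keyword, visits] rows for one reporting period."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ga_visits ("
        "start_date TEXT, end_date TEXT, source TEXT, keyword TEXT, "
        "visits INTEGER, "
        "PRIMARY KEY (start_date, end_date, source, keyword))"
    )
    # INSERT OR REPLACE lets the script be re-run for the same period safely.
    conn.executemany(
        "INSERT OR REPLACE INTO ga_visits VALUES (?, ?, ?, ?, ?)",
        [(start_date, end_date, source, keyword, int(visits))
         for source, keyword, visits in rows],
    )
    conn.commit()


# In-memory database for the example; pass a file path to keep the data.
conn = sqlite3.connect(":memory:")
store_rows(conn, "2017-01-01", "2017-01-15", [
    ["google", "(not provided)", "154406"],
    ["(direct)", "(not set)", "23692"],
])

for row in conn.execute(
        "SELECT source, visits FROM ga_visits ORDER BY visits DESC"):
    print(row)
```

Once the data is in a table like this, reports and CMS lookups become ordinary SQL queries.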
