Display PageViews and List of Popular Posts on a Static Site

Sep 6, 2019

A static site, like this one based on Hugo, cannot easily display site statistics. But there’s a solution. When the page is loaded on the client side, a Javascript pulls the relevant statistics from my server then render it to the page. The server runs a small Python program to contact Google Analytics API in order to get the statistics and return to the frontend. We can even use it to build a list of popular posts.

Since this site is built with Hugo, I’m going to use Hugo as an example.

Fetch PageViews using Javascript

I created a partial named pageviews.html which contains only the following Javascript. It’s fired when the content is ready and POST the current page’s relative URL to the API endpoint of my server, which is /google-analytics/pageviews. I have already a post on how my server is setup and how it works to handle all requests.

The logic of this script is that:

  1. When Hugo builds the site, {{ .Permalink | relURL }} gets the relative path of the page. So when this partial is injected to the html head or footer, relURL is the hard-coded relative path.
  2. The relative path is decoded as some of my posts are in Chinese and hence their URLs need to be decoded before sent to the server and Google Analytics.
  3. The fetching is only necessary when the page wants to display the pageviews. This is achieved by having a <div id='pageviews'> somewhere in the page. If it’s not found, then there’s no need to fetch this page’s statistics.
<script>
    window.addEventListener("DOMContentLoaded", function () {
        const url = "https://api.adrian-gao.com/google-analytics/pageviews";
        const relURL = "{{ .Permalink | relURL }}";
        const decodedURL = decodeURI(relURL);
        const stat = document.querySelector('#pageviews');
        if (stat) {
            fetch(url, {
                method: "post",
                headers: {
                    'Accept': 'application/json',
                    'Content-Type': 'application/json'
                },
                body: JSON.stringify({
                    "url": decodedURL
                })
            }).then(resp => resp.json()).then(json => {
                stat.innerHTML = json.views + ' views'
            })
        }
    })
</script>

Python to Query Google Analytics API

Google has a very good Python quickstart for service accounts for using the new Analytics Reporting API v4. As long as you follow through, there shouldn’t be any problem at all.

Then I just wrap it with a Flask app. This app is responsible to parse the json data posted from the frontend and uses the Analytics Reporting API to get the statistics, which is then returned to the frontend.

Now that we can know the pageviews of each post (and page), we are actually able to build a list of popular posts. The idea is simple and can be implemented very easily.

  1. We need a Python program that runs periodically to get the pageviews of all posts, for a certain time period, e.g. past 30 days.
  2. Write the data to a json file at the data/ folder.
  3. Having hugo --watch running in the background so that it watches filesystem for changes (including the data/ directory) and rebuilds as needed.
  4. Use data-templates or data-driven content to render the list of most popular posts.

A small Python program

import re
import requests
import urllib.parse
import ga   # a custom class for Analytics Reporting API v4
import json

SITEMAP_URL = "https://mingze-gao.com/sitemap.xml"
RE_PAT = "<loc>https://mingze-gao.com(/posts/.*/)</loc>"
DEST_PATH = "/path/to/mingze-gao.com/data/posts_pageviews.json"

def get_sitemap(sitemap_url):
    resp = requests.get(sitemap_url)
    if resp.status_code == 200:
        return resp.text

def decodeURL(encodedURL):
    return urllib.parse.unquote(encodedURL) 

def main():
    data = []
    bot = ga.PageViews()
    sitemap_xml_string = get_sitemap(SITEMAP_URL)
    for url in re.findall(RE_PAT, sitemap_xml_string):
        bot.get_report(pagePath=decodeURL(url), startdate='2005-01-01', enddate='today')
        title = get_page_title(url)
        data.append({'url': url, 'title': title, 'views': bot.parse_response()['ga:pageviews']})
    data_sorted = {"urls": sorted(data, key=lambda x: int(x['views']), reverse=True)}
    # write the pageviews data to the Hugo `data/` director
    with open(DEST_PATH, 'w') as fp:
        json.dump(data_sorted, fp)

if __name__ == '__main__':
    main()

You get the idea. This script runs periodically and I also fetch and store the page titles in the json file. Below is an example of what it looks like.

Example data/posts_pageviews.json

{
    "urls": [
        {
            "url": "/posts/bloomberg-bquant/",
            "title": "Bloomberg BQuant (BQNT)",
            "views": "519"
        },
        {
            "url": "/posts/computing-jackknifed-coefficient-estimates-in-sas/",
            "title": "Computing Jackknifed Coefficient Estimates in SAS using Discretionary Accruals as an Example",
            "views": "110"
        },
        {
            "url": "/posts/timetable-for-tutorial-workshop-and-consultation/",
            "title": "Timetable for Tutorial, Workshop and Consultation",
            "views": "108"
        },
        {
            "url": "/posts/posting-anywhere-anytime/",
            "title": "Posting Anywhere Anytime",
            "views": "108"
        },
        {
            "url": "/posts/minimum-variance-hedge-ratio/",
            "title": "Minimum Variance Hedge Ratio",
            "views": "79"
        }
    ]
}

Data-Driven Content at Hugo

The execution of the Python program above will modify data/posts_pageviews.json. Since we have hugo --watch in the background, Hugo will automatically recreate the site.

I write the following Shortcode so that I can embed the list in markdown, specifically _index.md the homepage.

<div class="popular-posts">
    {{ range .Site.Data.posts_pageviews }}
    <ol>
        {{ range first 10 . }}
        <li><a href="{{.url}}">{{.title}}</a> {{ .views }} views</li>
        {{ end }}
    </ol>
    {{end }}
</div>

This piece of code renders the list of popular posts on my homepage.

Now, I finally have pageviews displayed for each post, as well as a list of popular posts. Well done!

comments powered by Disqus