[Analytics Intern] How I Developed My First Report Automation for an NGO's Social Media

Henry Feng
7 min read · Oct 2, 2018


In the articles I wrote in July, I mentioned that my first task at UMASH, an NGO under the School of Public Health, was to develop the analytics structure for its two social media platforms: Facebook and Twitter. I also laid out the structure of my analysis process.

Finally, we come to the part I find most challenging and interesting: report automation. It is something I am genuinely good at and always have a lot of fun playing with.

Report Automation and Dashboarding are always a lot of fun

Intro

To start, I'd like to share some of my understanding of report automation. You might ask: when is the right time to automate a report?

To me, recurring, time-sensitive reporting is the specific type of analysis where you should consider automation. If the data is updated regularly on a fixed schedule and is used to reveal the status quo and the organization's performance, it is a perfect candidate for report automation.

In this article, I will walk through the thinking and execution process I went through to automate the reports. Honestly speaking, the final result is not 100% automated, but it is definitely more efficient than before the system was introduced. By my rough estimate, I saved the organization about 70% of the time spent on this task!

Process

The process has five steps, from designing the analytics structure to linking and automating. One by one, I will elaborate on each step with details and examples from my own automation work.

A. Design Analytics Structure

I elaborated on how I developed and designed the analytics structure for the two social media platforms in my previous article. I will just post the structure below and leave more space for the rest of the process.

B. Confirm data source, data storage and data naming

After discussing with my boss, we settled on a month as our basic reporting period. The automation is therefore triggered once a month, as soon as the previous month's data is released and ready to download. For both Facebook and Twitter, I can use the platform's insights system to download the monthly data. That is the starting point for confirming the data source.

1. Confirm data source

I tend to use the raw data instead of the processed data shown in the platform dashboard, because it grants me more flexibility to decide which metrics matter most.

For both Facebook and Twitter, I simply log into the insights platform and download what I need. For Facebook, I use 'page data' and 'post data'; for Twitter, I download the 'tweet data', all over a monthly time range.

Facebook Platform for Data Export : Page Data & Post Data
Twitter Platform for Downloading Monthly Data

2. Confirm data storage and data naming

As a product/business analyst, having some discipline around storage and naming is important. I always appreciate colleagues who sort data into a rational hierarchy with consistent, easy-to-understand file names.

For the UMASH data, I separate the files by platform (Facebook and Twitter) into two different folders. Each folder has sub-folders for overall data and post data, as in the example below: well categorized, appropriately named, and clearly dated.

With Clear Structure and Proper Naming
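
To make the naming rule concrete, here is a minimal Python sketch of the kind of helper that could enforce such a hierarchy. The root folder, sub-folder names, and file pattern are hypothetical, not the actual UMASH paths.

```python
from pathlib import Path
from datetime import date

# Hypothetical root folder; the real UMASH folder names may differ.
BASE = Path("umash_social_media")

def export_path(platform: str, level: str, month: date) -> Path:
    """Build a consistent path such as
    umash_social_media/facebook/post_data/2018-09_post_data.csv"""
    folder = BASE / platform / f"{level}_data"
    folder.mkdir(parents=True, exist_ok=True)   # create the hierarchy once
    return folder / f"{month:%Y-%m}_{level}_data.csv"

# One folder per platform, with sub-folders per data level
print(export_path("facebook", "post", date(2018, 9, 1)))
print(export_path("twitter", "tweet", date(2018, 9, 1)))
```

The point is that the naming decision is made once, in one place, so every monthly file lands where the next person expects it.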

C. Finalize the metrics

Usually, the raw data an analyst downloads contains hundreds of metrics defined by Facebook and Twitter. Deciding which metrics align with the organization is a challenging and fun task. The principle for metric selection is to always stick to the organization's goals.

I usually sort the metrics into two categories.

1. Goal-related metrics:

These are the metrics, as I just mentioned, linked directly to the organization's goals. For instance, for a social media analyst, impressions, engaged users, and reach are the basic metrics aligned with the goal of social media: reaching more online users and converting them into website visitors.

2. Observational metrics:

Selecting these metrics really depends on the analyst's experience. While working in the organization, there may be metrics worth observing because they hold real potential for improving performance. They might be ratios that need extra calculation, or metrics related to budget. Picking them is not for ad hoc use but for possible optimization of the process or the user experience. For example, I was very curious about the performance of different types of posts, so I selected some engagement metrics, such as likes, comments, and shares, when evaluating post types.
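
As an illustration of an observational metric that needs extra calculation, here is a small pandas sketch computing an engagement rate per post format. The column names and numbers are made up, not UMASH data.

```python
import pandas as pd

# Made-up post-level rows; real exports use platform-specific columns.
posts = pd.DataFrame({
    "type":    ["photo", "video", "text", "photo"],
    "reach":   [1500, 2200, 600, 900],
    "engaged": [120, 310, 25, 70],
})

# An observational ratio that needs extra calculation:
# engagement rate = engaged users / people reached, per post format.
posts["engagement_rate"] = posts["engaged"] / posts["reach"]
print(posts.groupby("type")["engagement_rate"].mean().round(3))
```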

D. Create database/data table

Based on the size of the data involved, an analyst can pick a suitable database for pre-processing and storage. UMASH is an NGO: we can't afford a dedicated SQL database, and the data is not big enough to need one anyway.

Therefore, I decided to use Google Sheets as our final database. It is easy to use and approachable, and everyone can access it with a single click, with no need for complex queries.

The logic is shown below. We have a database/data table, and I use certain methods to connect the automated reports to it; both live in the same environment.

This is the actual layout of my sheets: two sheets serve as the database/data tables and three sheets serve as reports, all included in one Google Sheets file.

Two Original Databases and Three Reports

Therefore, at the beginning of every month, all I need to do is dump the selected data columns and rows into the database sheets, and the reports in the other sheets update automatically.
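
For readers who would rather script the monthly dump than paste it by hand, here is a rough sketch using the gspread library. It assumes a Google service account has been configured; the file name, worksheet name, and columns below are hypothetical, not the actual UMASH setup.

```python
import gspread

# Authenticate with a service-account key (setup not shown here).
gc = gspread.service_account()
book = gc.open("UMASH Social Media Reports")   # one Google Sheets file
db = book.worksheet("Facebook Post DB")        # a database sheet

# Append the new month's selected columns; the report sheets
# reference this range, so they update on their own.
new_rows = [
    # date, post type, reach, likes, shares
    ["2018-09-30", "photo", 1200, 85, 40],
]
db.append_rows(new_rows, value_input_option="USER_ENTERED")
```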

E. Linking, visualizing and automating

The last mile is actually automating the report.

First, as an analyst, I pictured how the data should be presented visually and in an understandable manner. I then plotted trend, distribution, and growth charts based on the fixed data layout.

Second, once each graph is linked to a specific table, all I have to do is make sure the data in that table fills in automatically. In Google Sheets and Excel, the best tools for this are the pivot table and VLOOKUP, two of my favorites.

VLOOKUP links the report tables to the original database, and a pivot table changes every time you refresh the data. When you dump in the new data, VLOOKUP and the pivot table refresh the tables, and the graphs advance to the new month at the same time.
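
For anyone more comfortable in code than in spreadsheets, the same two tricks have rough pandas equivalents: a left merge behaves like VLOOKUP, and pd.pivot_table behaves like a pivot table. The tables below are toy data, not UMASH figures.

```python
import pandas as pd

# Toy table standing in for the Google Sheets database.
db = pd.DataFrame({
    "month":  ["2018-07", "2018-08", "2018-09"] * 2,
    "type":   ["photo"] * 3 + ["video"] * 3,
    "shares": [30, 42, 55, 12, 18, 25],
})

# "VLOOKUP": a left merge pulls the newest month out of the database.
report = pd.DataFrame({"month": ["2018-09"]})
latest = report.merge(db, on="month", how="left")

# "Pivot table": shares by month and post format; re-running this
# after a new dump picks up the new month automatically.
pivot = pd.pivot_table(db, values="shares", index="month",
                       columns="type", aggfunc="sum")
print(latest)
print(pivot)
```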

It is actually not a difficult task. Some of my sample automated reports are shown below. The first graph shows the monthly trend of certain metrics, compared year over year (the numbers are all fake, to respect the organization's data privacy). The table on the left-hand side updates automatically when I dump data into the database, using VLOOKUP to catch the newest data points, and the trend graph grows to the latest month at the same time.

The second graph showcases the power of the pivot table, whose data source is set to the database. Whenever new data is dumped in and the pivot table is refreshed, the most recent month's row is added automatically, as are the posts ranked by shares, organic reach, and engaged users. The chart on the left-hand side shows how each metric differs by post format (i.e., text, photo, and video). From the post list on the right-hand side, the user can also filter by month and see which posts performed well in a specific month on different metrics.

This also shows how I use observational metrics to improve the performance of the social media editors.
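
As a rough idea of how the first report's trend chart works, here is a small pandas/matplotlib sketch plotting one metric by month, year over year. The numbers are invented, in the same spirit as the fake figures in the article's screenshots.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Invented monthly values for one metric, two years side by side.
trend = pd.DataFrame({
    "month": range(1, 10),
    "2017":  [800, 950, 700, 1100, 1300, 900, 1000, 1200, 1150],
    "2018":  [900, 1000, 850, 1250, 1400, 1100, 1300, 1500, 1600],
})

# One metric plotted by month, compared year over year.
trend.plot(x="month", y=["2017", "2018"], marker="o",
           title="Impressions by month, year over year")
plt.show()
```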

Conclusion

The report automation for UMASH might not seem like a big deal: the tools are simple, but the thinking process is thorough and well planned. The automated reports can be used even after I graduate; I have set their time range to extend five more years. My colleagues just need to dump in the data, and the results are generated automatically. Most importantly, it is sustainable. I believe that is the power of report automation, and it is also why I love it so much. Through automation and visualization, data becomes more approachable for people who are not that familiar with it. They can actually get into the dashboard, find insights, and make decisions with a data-centric mindset.

As a next step, I will start automating Google Analytics, and I will share that experience in the near future.

If you like the article, feel free to give me 5+ claps
If you want to read more articles like this, give me 10+ claps
If you want to read articles with different topics, give me 15+ claps and leave the comment here
Thanks for reading!
