5 Tools to Help Sensory and Consumer Scientists Automate Beautiful Reports
This article is part of Aigora's "Original Content" series, which consists of our original thoughts at the intersection of consumer science and artificial intelligence!
It had happened again. Working towards another tight deadline, I was having to pick between formatting a chart and spending additional time crafting the conclusions for the report that was due in 45 minutes. The conclusions were more important - so I chose to work on them - but I knew I could do a better job on the chart. It was frustrating having to make these trade-offs and, yet again, I found myself wishing I had more time.
"If only I had more time." Have you ever caught yourself saying that? Before founding Aigora, I worked for 11 years as a market research consultant at The Institute for Perception, and I quickly lost track of how often I said it in my first few years there. As someone who loves developing and applying new analytic tools, and who seeks clear communication whenever possible, I was frustrated by the amount of time I had to divert from what I considered to be exciting work towards more mundane tasks such as data transformation, table and chart development, and report formatting. Surely there had to be a better way!
Fortunately, about five years ago I began to learn data science and, as I blogged last week, learning even a little data science can dramatically improve the lives of sensory and consumer scientists. This week I want to dig a little deeper and expand on a topic that I discussed at the recent ASTM E-18 Statistic Seminar on automation within sensory. Specifically, I want to talk about what I believe to be the most useful tools for sensory and consumer scientists who wish to spend less time on mundane tasks and more time on creative work using their category-specific expertise.
What do we mean by automated reporting?
To begin, let's define our terms. Automated reporting is a special case of reproducible research, which is the scientific ideal that researchers should be able to reproduce one another's results if they conduct the same experiment with the same methods and follow the same analysis plan. The expectation for how similar results need to be depends on the discipline, with an exact science such as physics perhaps requiring nearly identical results, but all scientific endeavors should adhere to this ideal to some degree. In practice, it can be unreasonable to expect ourselves to reproduce someone's experiment, so we are often left with what Roger Peng has called "really reproducible research" - given someone's data and analysis plan, we should be able to reproduce their results. Applying this idea to our lives as sensory and consumer scientists, given our own past data and our own past analysis plans, we should be able to reproduce our past reports.
To achieve this ideal of reproducibility for ourselves we have two options. One option is to perform breathtaking feats of memory and rely on impeccable notes to recall the steps we took from receipt of the data to the completion of the final report. The other option is to have a fully (or at least mostly) systematized process that takes us from receipt of data through a series of explicitly recorded steps to our final report. Too often in sensory and consumer science, the first option is chosen for lack of tools to implement the second option. The goal of this article, then, is to raise awareness of readily available tools that let us not only quickly reproduce our past reports but also quickly produce any similar reports by following the same automated steps.
Tool 1: RStudio - Ground zero for working in R
Throughout this article, I'm going to favor tools associated with the R programming environment. I've made this choice because:
1) R is freely available.
2) R has a vast community that has produced a cornucopia of tools for scientific research, including many tools developed explicitly for sensory scientists such as sensR and FactoMineR.
3) R plays well with Microsoft Office, as we'll see.
4) R is more statistically oriented than other freely available tools such as Python (which is oriented more towards computer science). This orientation makes it relatively natural for sensory and consumer scientists to learn R, in my experience.
Among the R-based tools, pride of place must go to RStudio and to the tools created by its scientists, such as Hadley Wickham and Yihui Xie, which interact well with RStudio. Technically speaking, you don't need RStudio to use R, and you could use the other tools described below independently of RStudio, but you would be making your life unnecessarily difficult. Plus, RStudio is free.
Note: RStudio also interfaces well with R Markdown - which itself supports a clear path to reproducible research - and it's worth being aware that R Markdown (through knitr and pandoc) provides the capability to produce reports within the Microsoft Office paradigm. Because of the emphasis on Microsoft Office products within sensory and consumer science, and because of the level of additional customization that is supported by the tools that follow, I'm going to save further discussion of R Markdown for another day.
Tool(s) 2: readxl and openxlsx - Two packages for interacting with Excel
Computer scientists and statisticians may bellyache about Excel, but I believe Excel has its place as an easy-to-work-with way to share small datasets. And perhaps more important, it's better to accept it because it's not going anywhere - Excel is just going to be part of life for sensory and consumer scientists for the foreseeable future. Regardless of your personal opinion of Excel, however, the twin packages of readxl and openxlsx are your friends.
The first package, readxl, provides a host of options for getting data out of Excel programmatically, meaning that you can run the same script to harvest your data as long as the format of your input datasets remains unchanged. And, even better, as long as you can diagnose the possible differences, you can write code that will determine such details as how many lines to skip before reading and which sheet contains the data in which you're interested.
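As a minimal sketch of that idea, the script below picks the data sheet by name and works out how many rows to skip before the header. The workbook, its sheet names, and the "Panelist" header are all hypothetical; the file is built with openxlsx only so that the example is self-contained.

```r
library(readxl)
library(openxlsx)  # used here only to build a small demo workbook

# Build a hypothetical input file: a notes sheet plus a data sheet whose
# table starts on row 4, below a title and an export note.
demo <- data.frame(
  Panelist = c(101, 102, 103),
  Product  = c("A", "B", "C"),
  Liking   = c(7, 5, 8)
)
path <- file.path(tempdir(), "liking_demo.xlsx")
wb <- createWorkbook()
addWorksheet(wb, "Notes")
writeData(wb, "Notes", "Internal notes, not data")
addWorksheet(wb, "Data")
writeData(wb, "Data", "Liking study", startRow = 1)
writeData(wb, "Data", "Exported from a hypothetical survey tool", startRow = 2)
writeData(wb, "Data", demo, startRow = 4)
saveWorkbook(wb, path, overwrite = TRUE)

# Step 1: find the sheet of interest by name rather than by position.
sheets <- excel_sheets(path)
data_sheet <- sheets[grepl("data", sheets, ignore.case = TRUE)][1]

# Step 2: read the sheet without column names and locate the header row,
# assumed here to be the first row whose first cell is "Panelist".
raw <- suppressMessages(
  read_excel(path, sheet = data_sheet, col_names = FALSE)
)
header_row <- which(raw[[1]] == "Panelist")[1]

# Step 3: re-read the sheet, skipping everything above the header row.
dat <- read_excel(path, sheet = data_sheet, skip = header_row - 1)
print(dat)
```

The same three steps should carry over to real exports as long as some marker, such as a known column name, reliably identifies the header row.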
The second package, openxlsx, provides almost unlimited freedom for writing data/results out to Excel. These options include the ability to dynamically determine formatting properties such as the font, text size and color, and background shading color. When you use openxlsx, you not only recover the time you spent formatting Excel sheets by hand the first time - you save on all of that formatting work in the future when you recycle your scripts.
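To illustrate the kind of dynamic formatting described here, the following sketch writes a small results table and shades the rows that meet a cutoff. The data, the cutoff of 6.5, the colors, and the file name are assumptions made up for the example.

```r
library(openxlsx)

# Hypothetical summary results to export.
results <- data.frame(
  Product    = c("A", "B", "C"),
  MeanLiking = c(6.8, 5.1, 7.4)
)

wb <- createWorkbook()
addWorksheet(wb, "Summary")

# Write the table with a styled header row (font, size, colors).
header_style <- createStyle(
  fontName = "Calibri", fontSize = 12, fontColour = "#FFFFFF",
  fgFill = "#1F4E79", textDecoration = "bold", halign = "center"
)
writeData(wb, "Summary", results, headerStyle = header_style)

# Decide at run time which rows to highlight. The cutoff is an assumption.
cutoff <- 6.5
high_rows <- which(results$MeanLiking >= cutoff) + 1  # +1 for the header row
highlight <- createStyle(fgFill = "#C6EFCE", fontColour = "#006100")
addStyle(wb, "Summary", style = highlight,
         rows = high_rows, cols = seq_len(ncol(results)),
         gridExpand = TRUE)

setColWidths(wb, "Summary", cols = seq_len(ncol(results)), widths = "auto")

out_xlsx <- file.path(tempdir(), "summary.xlsx")
saveWorkbook(wb, out_xlsx, overwrite = TRUE)
```

Because the highlighted rows are computed from the data, the same script formats next month's results without any manual work.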
Tool 3: officer - A package for automatically creating Word and PowerPoint documents
We now come to a package that truly blew my mind when I first used it - the officer package by David Gohel. This package allows users to custom-build Word and PowerPoint documents, optionally starting with pre-defined templates or slide masters, and to control almost every detail of the final document. Using officer, I now create reports that are 90% completed by merely running a custom R script, and I arrange most of that script from scripts I've used on other projects, perhaps with a few small adjustments as the specific project dictates. The result is that I've achieved what I long considered the holy grail of reporting - instead of spending 90% of my time on mundane activities such as formatting text, arranging tables, resizing charts, etc., I now spend 10% of my time setting up the script and 90% of my time thinking about the key takeaways and how to communicate them. And, even better, when I do have to spend time getting a chart to look just right, I now gain the benefit of that work on all future projects (or even future iterations of the current project) because the improvements I make are captured in my script and not confined to my final report.
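A minimal sketch of what such a script can look like follows. It uses the default templates that ship with officer, so the layout, master, and style names below belong to those defaults; with your own template you would pass its path to read_pptx() or read_docx() and use its names instead. The text and data are invented for illustration.

```r
library(officer)

# Hypothetical results to report.
results <- data.frame(
  Product    = c("A", "B", "C"),
  MeanLiking = c(6.8, 5.1, 7.4)
)

# PowerPoint: start from the default template, add one slide, fill its
# title and body placeholders.
pres <- read_pptx()
pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")
pres <- ph_with(pres, value = "Key findings",
                location = ph_location_type(type = "title"))
pres <- ph_with(pres,
                value = c("Product C was liked most",
                          "Product B trailed the other two"),
                location = ph_location_type(type = "body"))
out_pptx <- file.path(tempdir(), "report.pptx")
print(pres, target = out_pptx)

# Word: same idea, adding a heading, a paragraph, and a table in order.
doc <- read_docx()
doc <- body_add_par(doc, "Key findings", style = "heading 1")
doc <- body_add_par(doc, "Mean liking by product is shown below.",
                    style = "Normal")
doc <- body_add_table(doc, value = results, style = "table_template")
out_docx <- file.path(tempdir(), "report.docx")
print(doc, target = out_docx)
```

Calling layout_summary(pres) or styles_info(doc) lists the layout and style names available in whichever template is loaded.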
Tool(s) 4: ggplot2, rvg, and flextable - Three packages for making beautiful charts and tables in Word and PowerPoint
Three packages that play very well with officer are ggplot2, rvg, and flextable. The first, ggplot2, is a standard package in the so-called tidyverse that provides the ability to make layered, vector-based graphics with apparently infinite flexibility - if you can think of a chart, draw it on paper, or find an example of it in the world, you can almost certainly find a way to create it using ggplot2. And, to make your life easier, the incredibly generous R community has produced examples of charts of every possible type - access to all of these contributions is just a web search away.
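To give a flavor of the layered approach, here is a small sketch that stacks a boxplot layer, a layer of individual ratings, labels, and a theme. The liking scores are invented for illustration.

```r
library(ggplot2)

# Hypothetical liking scores on a 9-point scale.
liking <- data.frame(
  Product = rep(c("A", "B", "C"), each = 4),
  Liking  = c(7, 6, 8, 7, 5, 4, 6, 5, 8, 7, 9, 8)
)

# Each "+" adds a layer or a setting to the plot object.
p <- ggplot(liking, aes(x = Product, y = Liking)) +
  geom_boxplot(outlier.shape = NA) +        # layer 1: distribution summary
  geom_jitter(width = 0.1, height = 0) +    # layer 2: individual ratings
  labs(title = "Overall liking by product",
       x = NULL, y = "Liking (9-point scale)") +
  theme_minimal()

print(p)
```

Because the plot is an ordinary R object, it can be stored, modified later, and handed to other packages for export.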
Once your chart is made using ggplot2, you can use the rvg package together with officer to output your chart to PowerPoint as a Microsoft drawing object. That fact means both that your chart will render correctly in PowerPoint and that you will still have the opportunity to make any final adjustments manually in PowerPoint should you so desire.
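A minimal sketch of this hand-off, assuming officer's default template and an invented bar chart: the ggplot2 object is wrapped with dml() from rvg and placed into a slide placeholder with ph_with().

```r
library(officer)
library(rvg)
library(ggplot2)

# Hypothetical mean liking scores.
results <- data.frame(
  Product    = c("A", "B", "C"),
  MeanLiking = c(6.8, 5.1, 7.4)
)
p <- ggplot(results, aes(x = Product, y = MeanLiking)) +
  geom_col() +
  labs(y = "Mean liking") +
  theme_minimal()

pres <- read_pptx()
pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")
pres <- ph_with(pres, value = "Mean liking by product",
                location = ph_location_type(type = "title"))

# dml() converts the plot to editable vector shapes rather than a picture.
pres <- ph_with(pres, value = dml(ggobj = p),
                location = ph_location_type(type = "body"))

out_chart <- file.path(tempdir(), "chart.pptx")
print(pres, target = out_chart)
```

Opening the resulting file in PowerPoint and ungrouping the chart should expose its bars and labels as individual shapes.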
Wrapping up this trio is the flextable package, which brings much the same programmatic control over the formatting of tables in output to Word and PowerPoint as openxlsx does to formatting data output to Excel.
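As a sketch of that control, the example below formats a small table and writes it to both Word and PowerPoint. The data, the cutoff of 6.5, and the shading color are assumptions, and the layout names come from officer's default template.

```r
library(flextable)
library(officer)

# Hypothetical summary results.
results <- data.frame(
  Product    = c("A", "B", "C"),
  MeanLiking = c(6.8, 5.1, 7.4)
)

# Build and format the table step by step.
ft <- flextable(results)
ft <- set_header_labels(ft, MeanLiking = "Mean liking")
ft <- colformat_double(ft, j = "MeanLiking", digits = 1)
ft <- bold(ft, part = "header")
# Shade rows that meet the (assumed) cutoff, decided from the data.
ft <- bg(ft, i = ~ MeanLiking >= 6.5, bg = "#C6EFCE")
ft <- autofit(ft)

# Word output.
doc <- read_docx()
doc <- body_add_flextable(doc, value = ft)
out_table_docx <- file.path(tempdir(), "table.docx")
print(doc, target = out_table_docx)

# PowerPoint output, using the same table object.
pres <- read_pptx()
pres <- add_slide(pres, layout = "Title and Content", master = "Office Theme")
pres <- ph_with(pres, value = ft,
                location = ph_location_type(type = "body"))
out_table_pptx <- file.path(tempdir(), "table.pptx")
print(pres, target = out_table_pptx)
```

One table definition serving both formats is what makes it practical to keep Word and PowerPoint deliverables in sync.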
Tool 5: PowerPoint Designer - Design-based suggestions for bringing visual interest to your slides
The last tool on this list may surprise you as it's a native feature of PowerPoint - the Designer functionality. The reason PowerPoint Designer makes this list is that it allows us to move quickly from a report skeleton containing only title headers and perhaps some text to a beautiful slide, by merely selecting artwork or icons that complement the story we're telling. And, once we've accepted a Designer suggestion, we can still adjust the color scheme and font choices using the little-known selection pane. When I gave my recent talk at ASTM, I used the R tools described above to make slides that only had titles formatted according to an Aigora slide master template, then added images and icons that reflected the story I wanted to tell, accepted a Designer selection, and adjusted a few colors and fonts in the selection pane to match the Aigora branding in a final pass. I then combined those storytelling slides with some slides showing sample output in the form of tables, charts, and some text - also generated automatically - and the presentation was done. The entire process was (much) faster than the manual approach I might have once used, and the result was of significantly higher quality and free of the errors that might have arisen from copying and pasting.
We now arrive at the end of our whirlwind tour through my favorite tools for automated reporting. Future blog posts will go into more detail on the use of these tools - each tool easily deserves at least one post to convey the basics. Be sure to sign up for our weekly blog newsletter by signing up for an Aigora blog account, and/or follow Aigora on LinkedIn, if you'd like to be notified when additional posts go live. In the meantime, the links I've provided contain documentation - and my ASTM talk contains references - so you should be able to get started on your journey to freedom from some of the mundane aspects of our jobs as sensory and consumer scientists.