Scaled-Up Usability: Going Unmoderated

The journey had so many screens to test, that even with 90-minute interviews we struggled to get robust consistent feedback over multiple rounds of testing

This post is about how I created an unmoderated usability testing platform at the Test and Trace service.

Designers and product people may just need data to make a few quick decisions. Does this page work better? Does the form work? How can it be optimised? Do people follow on-page instructions? Critically these designs, and decisions, should be validated with users before putting them live.

However when you are working at pace, to some stakeholders and researchers the moderated testing approach can feel like a luxury, even a barrier to delivery. Also moderated testing approaches, and manual analysis of stickies, tends to fall over as soon as any need to split usability findings by different user groups.

In one recent Test and Trace project, the end-to-end journey had over 100 pages. We already tested it for 90-minutes over multiple rounds with 4-6 users.

I appreciate all the evidence and arguments around ‘5 users is enough’ to identify the most critical issues (and all the counter arguments against that too). However, in our experience on this project 5-users (multiple rounds) was still not enough to gather consistent insights as the journey was very long/complex journey that jumps between service channels. We also had some inconsistencies from adjusting focus each Sprint to include new service features/content that needed testing, and differences in opinion among the diverse demographics of users who participated too.

Workaround Methods to Scale-Up

I knew we needed to scale-up the research approach, so that we could rapidly and robustly test any element of the service and provide same day insights. Acquring a commercial off the shelf (COTS) unmoderated tool can cost tens/hundreds of thousands, we would need to scope the requirement, find the budget, seek internal approvals, go to market; basically it would take multiple months to procure a tool (if it could be done at all). An alternative solution was required urgently as the finding were needed immediately.

Using GitHub and VSCode I deployed a few changes to the same GOV.uk html prototype kit that our interaction designers were using. I designed a research scenario similar to the one we typically use for a moderated interview, adding this as a new “Start page” within the prototype. Users were invited to provide ‘dummy information’ into the forms.

I then embedded GTM (Google Tag Manager) and from here deployed a number of other conversion optimisation tools such as Google Analytics, Google Optimize (for AB Testing), Inspectlet and HotJar (for on-page feedback, session recording, heatmaps), and Qualtrics (for an exit survey).

The main technical challenge beyond installing these tools, was to deploy Cookies and Triggers correctly within GTM to ensure only users who arrived as part of a specific research usability testing activity, who had consented to be tracked, we observed through the journey.*

This was because the prototype was already being shared widely with stakeholders, and so it was important to keep research data separate for analysis purposes.

Testing at scale

[Insert anonymised video clip of user session here]

By following an unmoderated testing approach, through using our own user panels we were able to test with approximately 1 hundred users each round.

This would produce 8 to 10 hours of session replays, which could be reviewed at 2x or 4x speed. As no analytics was available in the live service, we gathered baseline data such as:

  • number of pages visited per user,
  • conversion funnels and drop outs,
  • keyboard entry data for forms,
  • heatmaps / click analysis,
  • time on task for the journey,
  • how many participants perform a ‘U turn’,
  • satisfaction metrics,
  • ease-of-use metrics,
  • contextual feedback about all pages in the journey.

Overall this new data helped to validate and prioritise many of the findings we had seen in the moderated testing. Conclusively, the digital journey is has been demonstrated to be far too long relative to the task of getting a coronavirus test. The consequence is that many users only skim-read the guidance, then abandon later due to usability other issues. Being asked ‘just one more question’ with ‘no end in sight’ and ‘no guarantee of a test’, is not supportive for users with symptoms of a potentially dangerous virus where anxiety and stress is already very high. The product people and stakeholders fed back they had more confidence in our teams’ work now they could see the whole journey had been tested with a whole lot more users.

TL;DR

The most impactful UX teams are taking a blended approach to different usability methods – combining qualitative, quantitative, analytics and also survey data. This post highlights a few workarounds that can deliver good results for projects doing “adhoc” unmoderated testing on your html prototype.


* Note aiming to do a follow up post some time about the ‘how’