How to measure design
How can one quantify something as ephemeral as design (UI & UX)? How do you measure its impact? Is there a way to do a controlled design overhaul without disrupting the user's experience? Read on.
Hi and welcome to the Corporate Waters weekly newsletter 🙌
I’m Mikhail and I'm excited to share my learnings to help you navigate the complex waters of product management, leadership, and corporate dynamics. Subscribe to unlock the full value.
In today’s paid newsletter:
(Free) My story of trying to measure design over the span of my career;
(Paid) Effective and less effective strategies in measuring design impact (with cases from my personal experience);
(Paid) An actionable framework you can implement in your company to measure the efficacy of design.
Back in 2015, I met a product designer from Booking.com at a local product event. We talked quite a bit. One thing led to another, and he asked me, “How many redesigns do you think we’ve had over the last 5 years?” “Seems like none,” I replied. “Actually, we’ve had more than a dozen. However, the transitions were gradual and barely noticeable to our users.”
“Eureka, that’s it!” I thought. You break down your large legacy design system into testable components, polish them up, and run a series of A/B tests. After a well-planned sequence of experiments, you end up with an improved UI, brimming with user delight.
Reality proved to be grim, and the very first test left me and my team dumbfounded. The results were contradictory. Identical changes showed a neutral impact on iOS, while on Android they stubbornly produced a 5% drop in our core metric for no explicable reason. Yes, we checked the test setup and the quality of the data multiple times. Moreover, we followed up with 8 additional tests that showed a similar impact.
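To make that sanity check concrete, here is a minimal sketch (in Python, using statsmodels) of the kind of per-platform split you can run before trusting a result like this. The metric, counts, and helper name are purely illustrative assumptions, not our real data or pipeline.

```python
# Hypothetical sketch: compare control vs. treatment conversion per platform
# with a two-sample z-test. All numbers below are made up for illustration.
from statsmodels.stats.proportion import proportions_ztest

def platform_check(conversions, users, label):
    """conversions/users are [control, treatment] counts for one platform."""
    _, p_value = proportions_ztest(count=conversions, nobs=users)
    control_rate = conversions[0] / users[0]
    treatment_rate = conversions[1] / users[1]
    rel_lift = treatment_rate / control_rate - 1
    print(f"{label}: relative lift={rel_lift:+.1%}, p={p_value:.3f}")

platform_check(conversions=[4_100, 4_080], users=[50_000, 50_000], label="iOS")
platform_check(conversions=[4_100, 3_895], users=[50_000, 50_000], label="Android")
```

In a setup like this, the iOS split comes out statistically flat while the Android split shows a significant drop, which is exactly the kind of divergence that should make you re-check the experiment before re-checking your design.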
Questions of how to measure design, or whether to do a full redesign, have followed me throughout my career. Having an up-to-date design and a high quality bar is a well-accepted axiom in mature product companies. Not only does it improve the user experience, it also reduces team attrition and helps with talent acquisition (it’s harder to find employees willing to work with an outdated design and tech stack).
Nevertheless, there’s a lot of analysis paralysis around the technicalities. At a certain point, every single company I join asks the same question: “How do we measure the impact of design?”
To battle this ambiguity, I’ve decided to share my experience of what worked, what was semi-successful, and what you should avoid when trying to quantify your design efforts. Moreover, I’ve tried to blend all of those insights into a practical framework you can apply at any company.
Let’s dive in ⬇️.
![](https://substackcdn.com/image/fetch/w_1456,c_limit,f_auto,q_auto:good,fl_progressive:steep/https%3A%2F%2Fsubstack-post-media.s3.amazonaws.com%2Fpublic%2Fimages%2Fb193a92f-2c4b-40a8-9ed1-30f0509a9892_1024x1024.png)
A bit of semantics
What do I mean by design?
In the context of this article, by design I mean UI changes and minor UX changes that do not dramatically alter existing user flows.
❌ What doesn’t work
Measuring the value of design through business metrics
Josh Miller from The Browser Company (the team building the Arc browser) floated the concept of “optimizing for feelings” instead of metrics in one of his interviews. However, he later sidestepped the question of how.
Frankly speaking, it’s a nice-sounding concept that conceals opinion-based decision-making. That’s not to say there’s anything wrong with this approach; being opinionated is totally fine when you’re an early-stage company.
However, once you’re big and have a loyal user base, opinions can easily become biases, and you absolutely have to measure them. No matter how fancy the name, metrics will still be one of the key guardrails for your decision-making.
Measuring the impact of single components
That’s what many teams rush to do first, and we were no exception. Our assumption was that we needed to find some sort of proof point that would show a positive impact on user behavior. Be it causality or even a correlation. Anything.
Eventually, we did find a correlation. In one of the A/B tests with a redesigned component, we noticed a slight uplift in sessions per user for a newly acquired cohort. We interpreted this as a leading indicator of a new-user retention uplift. That thin justification became our key pitch for selling design improvements within the org.
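For readers who want to reproduce this kind of cohort cut, here is a hedged sketch of the calculation: sessions per user among newly acquired users, split by experiment variant. The column names (user_id, signup_date, variant, session_id) and the function name are assumptions for illustration, not our actual schema.

```python
# Hypothetical sketch of a "sessions per new user, by variant" cut.
import pandas as pd

def sessions_per_new_user(events: pd.DataFrame, cohort_start: str) -> pd.Series:
    """Average number of distinct sessions per user, for users acquired on or
    after cohort_start, split by experiment variant.
    Assumes signup_date is stored as an ISO-formatted string or datetime."""
    new_users = events[events["signup_date"] >= cohort_start]
    per_user = new_users.groupby(["variant", "user_id"])["session_id"].nunique()
    return per_user.groupby("variant").mean()

# Usage (hypothetical): sessions_per_new_user(events_df, cohort_start="2016-03-01")
```

Keep in mind that an uplift in a cut like this is, at best, a correlation; it does not prove that the redesigned component causes better retention, which is exactly why the justification was so thin.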