A/B testing was a common theme throughout the conference, but Navin from Netflix and Hilary from Skyscanner focused their talks on it alone. It was really interesting to see two large companies that operate in very different markets putting it into practice in very similar ways.
Navin from Netflix spoke about their three-step approach:
- Hypothesis - defining the feature or improvement and the goal(s) it needs to achieve.
- Experiment - always using the current version as a control experience and running at least 2-3 variations.
- Result - picking the winner based on which version achieved the goal(s). If a variation wins, it becomes the new control experience.
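To make the three steps above a bit more concrete, here is a minimal sketch of the "Result" step in Python. It is my own illustration, not Netflix's method: the `Arm` class, the `pick_winner` function, the 5% significance threshold and the one-sided two-proportion z-test are all assumptions, and it applies no correction for testing several variations at once.

```python
from dataclasses import dataclass
from math import erf, sqrt


@dataclass
class Arm:
    """One experience in the experiment (hypothetical structure)."""
    name: str
    visitors: int
    conversions: int

    @property
    def rate(self) -> float:
        return self.conversions / self.visitors


def p_value_vs_control(control: Arm, variant: Arm) -> float:
    """One-sided two-proportion z-test: is the variant's rate higher?"""
    pooled = (control.conversions + variant.conversions) / (
        control.visitors + variant.visitors
    )
    se = sqrt(pooled * (1 - pooled) * (1 / control.visitors + 1 / variant.visitors))
    if se == 0:
        return 1.0  # no variation in the data, so no evidence either way
    z = (variant.rate - control.rate) / se
    # Upper-tail probability of the standard normal distribution.
    return 0.5 * (1 - erf(z / sqrt(2)))


def pick_winner(control: Arm, variants: list[Arm], alpha: float = 0.05) -> Arm:
    """Return the best variant that beats the control, else the control."""
    best = control
    for variant in variants:
        beats_control = p_value_vs_control(control, variant) < alpha
        if beats_control and variant.rate > best.rate:
            best = variant
    return best


control = Arm("control", visitors=10_000, conversions=1_000)
variant_a = Arm("variant-a", visitors=10_000, conversions=1_010)
variant_b = Arm("variant-b", visitors=10_000, conversions=1_200)

# The winner becomes the control for the next experiment.
new_control = pick_winner(control, [variant_a, variant_b])
print(new_control.name)
```

The numbers are made up. The point is only that the control stays in place unless a variation clearly beats it.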
He highlighted the importance of breaking goals down into smaller goals to make them easier to achieve. It was interesting to see that a page doesn’t necessarily have a single goal. On Netflix’s website, the page for a TV show has a different goal depending on the user visiting it. If they have a Netflix account, the goal is to get them to watch the TV show. If they don’t, the goal is to get them to sign up. The page template and design change to help achieve each goal.
A/B tests typically result in a success or a failure, and Hilary pointed out that both are good results. She also went on to talk about two other result types she and the team at Skyscanner have found when running tests (at one point this year they were running 150-200 tests on over 70 hypotheses!).
The first is an Invalid Result. These are usually one-sided results but it’s difficult to measure the upsides and downsides. As an example, she talked about a test that achieved all its goals but at the expense of a user experience customers and the team didn’t like. Because of this, it was never implemented.
The second alternative result is a Flailed Result. This usually happens when the team cannot make a decision and throws it out there to be tested. The results don’t provide any insight into the decision because it shouldn’t have been set up as a test in the first place.
Both of these results are bad and should be avoided, as they waste time and resources. To avoid them, she advised striking a good balance between science and sensibility.
Don’t Just Listen: Watch!
Navin raised this point and it really stuck with me. Even if users ask for something it doesn't mean that it will work. At Netflix, they test with small groups so they can listen and observe the participants. They’ve found that while users tell them they loved the new feature they tested, the test recordings show a different story. They show users struggling to convert due to too many choices. The key is to get a balance between the user tasks and the goals of the business.
Dariusz from Spotify reiterated this. He explained that the whole concept for the Running feature in the Spotify mobile app was born from watching how users ran. They were able to see that running is a repetitive movement, similar to a beat. This could be mapped to tempos and then to music. If they had only asked runners what they wanted, they would not have reached this solution.
Charlotte from Deliveroo echoed this view of observing. To improve the app their restaurants use, the team spent time observing the staff while they worked and used the app. To the observers, it looked like the staff were struggling to process orders while managing customers in the restaurant itself. But in fact, this was normal for the staff and not stressful at all.
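As a rough illustration of the idea of mapping a runner's rhythm to a tempo and then to music, here is a small Python sketch. It is purely hypothetical and not how Spotify built the feature: the `Track` class, both function names, the sample tracks and the 5% tolerance are all assumptions of mine.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """A song with a known tempo (hypothetical structure)."""
    title: str
    bpm: float


def cadence_from_steps(step_count: int, seconds: float) -> float:
    """Convert steps counted over a time window into steps per minute."""
    return step_count * 60.0 / seconds


def tracks_for_cadence(
    tracks: list[Track], cadence: float, tolerance: float = 0.05
) -> list[Track]:
    """Tracks whose tempo is within `tolerance` of the cadence, closest first.

    `tolerance` is a fraction of the cadence; 0.05 is an arbitrary choice.
    """
    limit = tolerance * cadence
    matches = [t for t in tracks if abs(t.bpm - cadence) <= limit]
    return sorted(matches, key=lambda t: abs(t.bpm - cadence))


library = [
    Track("Warm Up", 120),
    Track("Steady", 158),
    Track("Push", 165),
    Track("Sprint", 170),
]

cadence = cadence_from_steps(step_count=80, seconds=30)
playlist = tracks_for_cadence(library, cadence)
print([t.title for t in playlist])
```

Treating one step as one beat is the simplest possible mapping. A real feature would need to cope with noisy step detection and changes of pace.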
The different speakers highlighted the importance of observing their customers when using their products and asking them questions to get a true understanding of the user experience.
Jane from Moo spoke about decisions and how everyone can make better decisions for the things we make. It was interesting to see some of the biases that can cause us to make bad decisions. They are:
- Anchoring - using a reference point that we then compare our decisions to.
- Framing - the order or wording of questions that can influence the answer given.
- Confirmation bias - when your mind is already made up and you ignore information that goes against it.
- Identity protective - doing something because similar people do it.
- Illusion of control - when you think your decision has made a difference. The need to feel in control of what's happening.
- Law of small numbers - just because something has worked 3 times does not mean it will work a 4th time.
- Sunk costs - spending more money to get a result instead of pulling the plug.
To help us make better decisions, she proposed that they need to be made at the right level of the business. For example, executives should be deciding the strategy and not the colour of a button. Decisions also need diversity behind them. Studies have shown that companies with a more gender-diverse team are 15% more likely to outperform. For ethnically diverse teams the figure is even higher, at 35%. She also reiterated that we should design with data and remove opinions by testing.
Dariusz from Spotify also spoke about making decisions. They tried to get the Running feature to match the tempo perfectly every time. When they realised it wouldn’t be achievable in their timeframe, they made the hard decision to settle for 95%, which was good enough to launch with. They had been trying to perfect it when what they had was all they needed.
Become the User
Merck from Slack educated us on becoming our users. It’s easy to put yourself in someone else’s shoes but you need to amplify this. Add duress by pretending they’re having a terrible day and are stressed and under pressure. We also need to embrace the beginner’s mindset. It’s easy to miss problems because you’re familiar with the product you’re building. To get first-hand experience of end users’ problems, everyone at Slack helps with product support and Zendesk tickets. I thought this was a great way of understanding the users, as doing this helps the team get better at predicting user behaviour.
Christophe and Brian from Facebook spoke about designing for the exciting new technology of virtual reality. It’s incredible to see all of the things that need to be considered and the research they’ve carried out already.
Presence is an affordance that won't be the same for everyone. There are three types of presence:
- Self-presence - the idea that you are you in the space.
- Social presence - other people in the space with you.
- Spatial presence - the environment in the space around you.
Avatars are key to self and social presence. In VR, rather than seeing your avatar, you are your avatar. A balance needs to be found between realistic and non-realistic styles, and care needs to be taken to avoid the uncanny valley effect. If something looks real but does something unreal, it ruins the effect.
Facebook demonstrated the potential uses of VR in the future.
Hands and hand gestures are very important and help when users are communicating with each other in VR. Eyes are important too, especially eye contact. Facebook have found that even though they only need a head and hands, a torso helps to tie everything together, making it easier to see which head and hands are connected.
The environment is another important factor as it affects the way users feel. For example, small spaces can cause claustrophobia. Interestingly, small environments make users feel powerful and god-like, whereas big environments make them feel small and almost timid.
Performance is key: if the experience stutters, it quickly spoils the effect. 60fps is the ideal speed to maintain.
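To put that 60fps figure into perspective, it leaves roughly 16.7 milliseconds to produce each frame. The short Python sketch below does the arithmetic and counts frames that miss the budget. The function names and sample frame times are my own assumptions, not anything shown in the talk.

```python
def frame_budget_ms(fps: float) -> float:
    """Time available to produce one frame, in milliseconds."""
    return 1000.0 / fps


def count_slow_frames(frame_times_ms: list[float], fps: float = 60.0) -> int:
    """Count frames that took longer than the budget for the target rate."""
    budget = frame_budget_ms(fps)
    return sum(1 for t in frame_times_ms if t > budget)


# Made-up frame times in milliseconds.
samples = [10.0, 16.0, 17.0, 30.0]

print(round(frame_budget_ms(60), 2))
print(count_slow_frames(samples))
```

Any frame that takes longer than the budget is a potential stutter, which is why sustained performance matters more than the average.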
Unstable horizons, caused when flying, affect balance but can be remedied by adding a fixed frame like a cockpit. One company added a virtual nose that helped users keep their balance but went unnoticed, similar to our own physical noses.
Despite the vastly different companies and their products, it was humbling to see that there was one thing every speaker mentioned: testing. We need to keep innovating and producing new products and features, but we must test and validate them. Without that, they are purely assumptions.