Conversation-Driven Development

I've been looking for a name that describes our approach to building conversational AI. The challenge with building great AI assistants is that it's impossible to anticipate all the things your users could say. But the opportunity is that in every conversation users are telling you, in their own words, exactly what they want.

Conversation-driven development (CDD) is the process of listening to your users and using those insights to improve your AI assistant. The actions that make up CDD are:

1. Share: Give your prototype to users to test as early as possible. People will always surprise you with what they say. Too many teams spend months designing conversations that will never happen.

2. Review: Take time to read through the conversations people have with your assistant. It's helpful at every stage of a project, from prototype to production. Too many teams get caught up looking exclusively at metrics (like "what % of users express intent X?").

3. Annotate: Improve your NLU model based on messages from real conversations. Coming up with examples yourself, or generating synthetic examples with a paraphrasing approach, can help you bootstrap. But by the time you go into production, less than 10% of your data should be synthetic.

4. Test: Use whole conversations as end-to-end tests of your assistant. Professional teams don't ship applications without tests. When you go into production, you should have dozens of end-to-end tests covering the most important conversations. Use continuous integration and deployment to ship updates reliably.

5. Track: Come up with a way to identify successful conversations. For example, success could mean a user taking an action (like signing up for your service) or not taking an action (like not getting back in touch with support within 24 hours). Use that data to tag and filter conversations to understand what's working and what isn't.

6. Fix: Study conversations that went smoothly and ones that failed. Successful conversations can become tests right away. Unsuccessful conversations show you where you need more training data, or where you need to fix your code. Track the different ways your assistant fails so you know you're reducing failures over time.
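
To make steps 5 and 6 a bit more concrete, here is a minimal sketch in Python of tagging conversations as successful and counting the ways the unsuccessful ones failed. Everything in it is an assumption made for illustration: the `Conversation` record, its field names, the failure tag names, and the choice of "did not contact support again within 24 hours" as the success signal. Your own assistant will need its own definition of success.

```python
from collections import Counter
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class Conversation:
    """Hypothetical record of one conversation; field names are illustrative."""
    conversation_id: str
    # Assumed success signal: the user did NOT get back in touch within 24 hours.
    contacted_support_again: bool = False
    # Free-form tags added by whoever reviewed the conversation.
    failure_tags: List[str] = field(default_factory=list)


def is_successful(conv: Conversation) -> bool:
    """A conversation counts as successful if the user did not come back."""
    return not conv.contacted_support_again


def summarize(conversations: List[Conversation]) -> Dict[str, object]:
    """Return the success rate and a count of each failure tag."""
    if not conversations:
        return {"success_rate": 0.0, "failure_counts": Counter()}
    failures = [c for c in conversations if not is_successful(c)]
    failure_counts = Counter(tag for c in failures for tag in c.failure_tags)
    success_rate = (len(conversations) - len(failures)) / len(conversations)
    return {"success_rate": success_rate, "failure_counts": failure_counts}


# Example usage with made-up data.
reviewed = [
    Conversation("a"),
    Conversation("b", contacted_support_again=True,
                 failure_tags=["misclassified_intent"]),
    Conversation("c", contacted_support_again=True,
                 failure_tags=["misclassified_intent", "missing_feature"]),
    Conversation("d"),
]
summary = summarize(reviewed)
print(summary["success_rate"])
print(summary["failure_counts"].most_common())
```

Running a summary like this on each week's reviewed conversations is one simple way to check that the most common failure types are actually shrinking over time.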

CDD is a user-centric approach to building AI assistants and requires more than just software skills. It's also not a linear process: you will find yourself jumping between these actions. Some require a deep understanding of the domain and of the end user; others require software or data science skills. It's a collaborative process between product, design, and development that reveals what users are asking for. Over time, it ensures that your assistant adapts to what the user wants, rather than expecting the user to adapt their behaviour so the assistant doesn't break.

P.S. If you remember only one thing from this post, remember point 1: you need test users. Shipping without lots of testers has never worked, and your project won't be the exception.