On October 1st and 2nd, Valencia once again became the capital of software testing with a new edition of VLCTesting 2025, the largest software testing festival in Spain organized by the Instituto Tecnológico de Informática (ITI).
More than 1,000 attendees, both in-person and online, gathered at this 15th edition under the motto “No Test, No Gain”, sharing insights and exploring the latest trends, all with one goal: keeping software quality at the heart of organizational strategy.
LedaMC, always committed to quality
At LedaMC we could not miss this event, and part of our team headed to Valencia. Ana, Lily, Andrés and Jess took advantage of this opportunity to learn from top industry experts and contribute to the ongoing discussion about the future of testing, a future already strongly shaped by the integration of generative AI into validation and quality assurance processes.
In addition, our colleague Jesús Alonso, Service Manager at LedaMC, participated as a speaker during the online sessions on October 2nd with the talk “How to tame your chatbot: Smart testing (without losing your mind)”, where he presented a case study on testing a virtual assistant for an e-commerce platform.
The challenge of testing a generative AI chatbot
Jesús shared the experience of a project involving the testing of a multilingual LLM-based chatbot, connected to catalogue, stock, and order APIs, designed to accompany customers throughout the shopping journey.
The system posed a significant challenge: due to its generative nature, the AI’s responses weren’t always predictable, producing “hallucinations”, inconsistencies in tone, or errors in language detection.
The main problems we encountered
In practice, we encountered three major difficulties. The first was ensuring the relevance of the answers and avoiding the famous “hallucinations” of AI. A customer asking about discounts on sneakers expects a concrete answer, not a generic description of the most popular brands.
Second, managing languages properly: it wasn’t enough for the bot to detect the language of the query; it also had to adapt to language switches during the conversation and understand local idioms, essential for a global e-commerce context.
The third challenge had to do with tone. The chatbot is, in essence, the store’s salesperson, and poor virtual interaction can be as damaging as bad in-person service. For instance, we discovered that when a user was angry, the bot sometimes responded curtly and even scolded them. Adjusting those replies to be empathetic and consistent with the brand’s voice was critical.
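Checks like these can be partially automated. The sketch below is a hypothetical, simplified illustration (not the project’s actual code): it flags a reply written in the wrong language or containing a curt, scolding tone, using toy keyword heuristics in place of real language-detection and tone models.

```python
# Hypothetical sketch of automated reply checks. The language hints and
# rude-phrase list are toy examples, not the real project's heuristics.

SPANISH_HINTS = {"hola", "gracias", "pedido", "descuento"}
ENGLISH_HINTS = {"hello", "thanks", "order", "discount"}
RUDE_PHRASES = {"calm down", "that is not my problem", "you already asked"}

def detect_language(text: str) -> str:
    """Crude keyword-based language guess: 'es' or 'en'."""
    words = set(text.lower().split())
    return "es" if len(words & SPANISH_HINTS) > len(words & ENGLISH_HINTS) else "en"

def check_reply(user_message: str, reply: str) -> list[str]:
    """Return a list of issues found in a chatbot reply."""
    issues = []
    # Language check: the reply should match the user's language.
    if detect_language(reply) != detect_language(user_message):
        issues.append("language-mismatch")
    # Tone check: flag curt or scolding phrasing.
    if any(p in reply.lower() for p in RUDE_PHRASES):
        issues.append("inappropriate-tone")
    return issues

# An angry customer should never get a scolding reply:
print(check_reply("My order is late, this is unacceptable!",
                  "Calm down, you already asked about this."))
# → ['inappropriate-tone']
```

In a real pipeline, the keyword sets would be replaced by a proper language detector and a tone classifier, but the structure of the check stays the same: every reply passes through a battery of assertions before it is considered acceptable.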
Testing: a mix of human intuition and AI creativity
To deal with all this, we designed a hybrid testing strategy. On the one hand, we relied on our QA team’s expertise: well-thought-out test cases, edge scenarios, exploratory testing with different customer profiles and thorough defect documentation.
On the other hand, we leveraged generative AI, with the help of Quanter, to multiply our testing coverage. This tool allowed us to generate countless variations of questions, unusual inputs, spelling mistakes, and simulations of long conversations in several languages. While human judgment gave us control and business knowledge, AI provided creativity and volume.
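The variation-generation idea can be illustrated with a minimal sketch. This is an assumption of the general approach, not Quanter’s actual implementation: starting from one seed question, it mechanically produces noisy variants with typos, casing changes, and conversational fillers to widen coverage.

```python
# Minimal sketch of test-input variation generation (hypothetical, not Quanter).
import random

def typo(word: str, rng: random.Random) -> str:
    """Swap two adjacent characters to simulate a spelling mistake."""
    if len(word) < 2:
        return word
    i = rng.randrange(len(word) - 1)
    return word[:i] + word[i + 1] + word[i] + word[i + 2:]

def variations(question: str, n: int, seed: int = 42) -> list[str]:
    """Generate n noisy variants of a seed question, reproducibly."""
    rng = random.Random(seed)
    fillers = ["hey,", "please,", "umm", "quick question:"]
    out = []
    for _ in range(n):
        words = question.split()
        # Introduce a typo in one randomly chosen word.
        j = rng.randrange(len(words))
        words[j] = typo(words[j], rng)
        variant = " ".join(words)
        # Randomly shout-case the query or prepend a filler.
        if rng.random() < 0.5:
            variant = variant.upper()
        if rng.random() < 0.5:
            variant = f"{rng.choice(fillers)} {variant}"
        out.append(variant)
    return out

for v in variations("are there discounts on sneakers", 3):
    print(v)
```

In practice the perturbations would also include paraphrases, language switches mid-sentence, and multi-turn scripts; the point is that each seed case written by a tester fans out into many automated probes of the bot.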
What did we achieve?
Before testing, we estimated that nearly three out of ten conversations with the bot failed. After the hybrid strategy, we managed to ensure that more than 90% of typical queries were answered correctly and with the right tone. In pilots with real users, satisfaction increased by 20%, and the QA process detected 40% more defects than with manual testing alone.
In addition, we freed testers from the repetitive task of inventing new test cases so they could focus on analyzing patterns and improving the overall experience.
Lessons for those who want to tame their chatbot
If there’s one takeaway from this project, it’s that testing a generative AI system cannot rely solely on traditional methods. You need sound judgment, but also tools that help explore the unexpected, and you must think about the whole experience, not just whether the logic is well implemented.
Because we must remember that users never behave as you expect, and the best testing is the kind that anticipates that chaos. And if you can use AI to help you simulate it, even better.
Our many years of experience in software quality assurance and testing are the foundation for being able to integrate new approaches such as generative AI applied to testing. And thanks to our expert QA and Testing team, and the continued evolution of our tool Quanter, we can also help you achieve it. Tell us about your challenges and let’s see how we can overcome them together.

