How using AI is like 'having a bunch of smart interns'

‘Using AI is like having a bunch of smart interns.’ I probably stole that from someone at one of the conferences this year, but it feels like the best analogy when comparing the promise and hype around AI with the actual reality. Roughly one year ago we started our AI journey at Ingentis, and we learned a few lessons along the way.

But let’s get one thing out of the way first, because it bugs me when people talk about AI without understanding what the underlying Large Language Model actually is. An LLM is not voodoo magic and has nothing to do with ‘intelligence’ in a human sense. At its core, an LLM is a massively scaled, multi-dimensional probability model that predicts which word (= token) comes next based on the words (= tokens) that came before. Yes, it’s like an auto-complete function, but trained at internet scale.
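A toy sketch makes the ‘auto-complete at internet scale’ point concrete. This is my own illustrative example, nothing like a real LLM’s internals - it just counts, over a ten-word ‘corpus’, which token most often follows which, and then always predicts that one:

```python
from collections import Counter, defaultdict

# Toy "training": count which token follows which in a tiny corpus.
corpus = "the cat sat on the mat the cat ate the fish".split()

follow_counts = defaultdict(Counter)
for current, nxt in zip(corpus, corpus[1:]):
    follow_counts[current][nxt] += 1

def predict_next(token: str) -> str:
    # Return the most probable next token seen during "training".
    return follow_counts[token].most_common(1)[0][0]

print(predict_next("the"))  # "cat" - it follows "the" most often here
```

A real model conditions on the whole preceding context and billions of parameters instead of a single previous word, but the principle - pick a likely continuation, not a reasoned one - is the same.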

Now some might stop here and feel there is nothing more to see. However, as much as it is wrong to attribute anything like intelligence to a probabilistic network, it is just as wrong to dismiss it. Because the fact is, with probability alone one can solve perhaps 70% or 80% of most problems. In other words, AI can likely get you to a 70% solution in 30% of the time, but you will still need to invest the remaining 70% of the time to finish the last 30%. And this is why the analogy of AI being like a smart intern rings true. A good intern can prepare a presentation, draft a letter, or do internet research, but you would not rely on their work without reviewing it first.

At Ingentis we started our journey like so many other companies: with an internal company GPT. We exported and chunked our Confluence intranet in the hope of some smashing success. As you might have guessed, it wasn’t one. Just because information becomes available doesn’t mean someone needs it. Those needing the information knew where to look, and those who did not need it - well, they did not. Also, prompting for information when you knew exactly where it was only stayed fun the first few times. In retrospect it reminded me of an earlier hype around IoT, and how startups created apps to switch on lights, when in reality the light switch was a magnitude faster and more convenient. The company GPT project also lacked specific use cases. So that was a lesson learned - start with a set of validated use cases, make them successful, and then build from there. The company GPT also lacked common features like a web connector - in our experience, being able to do web searches and summarize website content is a must-have.

Our next project was a lot more specific - we built an RFC (= “Request For Comments”) copilot for our presales team together with our partners at CodeCentric. In B2B, a sale almost always involves answering an RFC questionnaire from the customer’s IT/Security/Compliance/YourFavoriteDepartmentHere before the contract is signed. Most RFCs ask roughly the same questions, phrased differently - plus the two or three odd questions that are company-specific. We built a RAG application and grounded it with our system reference guides. An interesting thing I learned during that project is the importance of the right input chunking: make the chunks too small, and the response is too narrow; make them too large, and the response becomes too generic. The resulting RAG application was indeed an unqualified success - it saves us around 50% of the time in answering those RFC questionnaires by prefilling them with answers.
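For illustration, here is a minimal sketch of the kind of fixed-size chunking involved. The window and overlap sizes are made-up placeholder values, not the ones we actually tuned:

```python
def chunk(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into overlapping character windows for a RAG index.

    Too small a window and a retrieved chunk carries only a narrow
    fragment of context; too large and it mixes topics, so answers
    drift toward the generic.
    """
    step = size - overlap
    return [text[i:i + size] for i in range(0, len(text), step)]

doc = "x" * 500  # stand-in for one reference-guide page
print([len(c) for c in chunk(doc)])  # four windows: 200, 200, 200, 50
```

Real pipelines usually split on sentence or section boundaries rather than raw character counts, but the size-vs-overlap tradeoff is the same.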

While there are lots of potential AI use cases on the market and product-support side of the business, it becomes much harder to leverage AI as a meaningful, value-adding feature in an enterprise product. Which, BTW, is something I have been hearing from a lot of CTOs over the year. The problem comes down to predictability vs. creativity. LLMs really shine at creative tasks with a high degree of freedom, but most customers value predictability and accuracy, so finding the right balance has been a challenge. In addition, LLMs do not do math well (think about it … the model has no concept of 1+1 being 2; it only ‘knows’ that there is an exceedingly high probability that the next token after a ‘1’, followed by a ‘+’, followed by a ‘1’, followed by a ‘=’, is a ‘2’) - which is yet another challenge for any kind of analytics. But where there are challenges, there are opportunities. I am not going to tip my hand just yet, but it is going to be super interesting where we end up with AI in the product.
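To make the ‘recall, not arithmetic’ point tangible, here is a deliberately crude sketch of my own - nothing like an actual model’s internals. A lookup table of memorized completions answers the sums it has ‘seen’, but computes nothing, so an unseen sum gets no answer:

```python
# Crude stand-in for memorized training patterns: completions are
# recalled, never computed.
memorized = {"1+1=": "2", "2+2=": "4", "2+3=": "5"}

def complete(prompt: str) -> str:
    # Recall the most "probable" continuation; there is no arithmetic here.
    return memorized.get(prompt, "??")

print(complete("1+1="))      # "2"  - looks like math, is pure recall
print(complete("417+588="))  # "??" - no memorized pattern, no computation
```

Real models generalize far better than a literal lookup table, of course, but the failure mode on rare or long arithmetic is exactly this one: no pattern, no answer.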

I started this journey as an AI skeptic. While I am not yet an AI fanboy, my experiences over this year have made me an AI convert. I see a lot of potential use cases where an LLM can get us to the 70% solution in 30% of the time we would otherwise have needed. And that is nothing to scoff at.