CYB Engineering Day: developing with LLMs for client solutions
At Caution Your Blast Ltd (CYB), we are facing the challenges - and exploring the possibilities - brought by artificial intelligence (AI). Not just in terms of how the technology itself works - although that is obviously vital - but also how we bring all of the knowledge in our multidisciplinary team together to make the most of this generational opportunity. We see AI holistically - to maximise its potential, we are utilising our research, design, engineering and security expertise. We know from experience that large language models (LLMs) like ChatGPT have the capacity to help our core aim - to use digital as a force for good.
LLMs are an area we have already started to develop for client solutions, which is hugely exciting for us.
One of our talented colleagues, Jen, has developed the (modestly named!) JenPT Slack plugin, which allows our colleagues at CYB to ask questions utilising LLMs in a data-safe environment. We not only see potential to use it where data needs to remain in-house and not be sent to external third-party APIs - we are already using this technology to help our clients.
You can read about a great example of how we used this thinking for one of our clients in government, and how we deployed our own LLM to solve a problem in a recent project.
To further explore this with our Engineering team, we decided to use LLMs as a topic for our most recent Engineering day. In our Engineering practice, we prioritise continual learning and staying up-to-date with emerging trends. Every quarter, we come together as a team to tackle a challenge, learn together, and discuss our findings.
The challenge
Our goal was to build an LLM application that could be deployed in our own environment, leveraging a custom dataset to generate relevant and accurate responses to questions in a data-secure environment.
How does this differ from querying a regular database with custom data, I hear you ask? Firstly, we intend to do more than search for an item by keywords, which is what a traditional database is good at. We need to generate text based on a natural-language (English) query. This requires understanding the context and semantics of the query, not just matching keywords, and that is a task LLMs are much better suited to. Consider a query emailed by a customer to a helpdesk support system: we may want to generate a draft response that the support team can use rather than crafting one from scratch.
The Engineering day kicked off with the usual introductions, setting the scene and the agenda for the day. More importantly, we gave everyone a baseline understanding by explaining some core concepts: vector databases and Hugging Face transformers. In simple terms, vector databases store data as points (vectors) in multi-dimensional space, making them excellent for similarity searches over things like words and phrases. Hugging Face, on the other hand, is a treasure trove of AI resources, offering APIs and tools for downloading and training state-of-the-art pretrained models. By combining these tools, we aimed to tackle our challenge.
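To make the idea of a similarity search concrete, here is a minimal sketch in plain Python. It is not how Milvus or Chroma work internally, and the three-dimensional vectors are made up for illustration; real embeddings come from a model and typically have hundreds of dimensions. The names `DOCUMENTS`, `cosine_similarity` and `top_k` are our own, chosen for this example.

```python
import math

def cosine_similarity(a, b):
    """Return the cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

# Hypothetical, hand-written "embeddings" (assumption: real ones come from a model).
DOCUMENTS = {
    "How do I reset my password?": [0.9, 0.1, 0.0],
    "What are your opening hours?": [0.1, 0.9, 0.1],
    "How can I change my login details?": [0.8, 0.2, 0.1],
}

def top_k(query_vector, documents, k=2):
    """Return the k document texts whose vectors are most similar to the query."""
    scored = [
        (cosine_similarity(query_vector, vector), text)
        for text, vector in documents.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [text for _, text in scored[:k]]

# A made-up query vector that sits close to the two account-related documents.
query = [0.85, 0.15, 0.05]
results = top_k(query, DOCUMENTS)
print(results)
```

The two account-related questions rank above the one about opening hours even though they share few keywords, which is the behaviour a vector database gives you at scale.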
Each member of our Engineering team got to choose which of the two problems they wanted to solve. We then rolled up our sleeves and got to work playing around with our new toys.
Vector victory
The vector team focused its efforts on Milvus and Chroma, two open source vector databases. We could have considered hosted, API-based services like those offered by Pinecone or Weaviate, but the objective was to have the whole solution deployable in our own environment to avoid data being sent to third parties. Both took a bit of time to set up, and we needed to familiarise ourselves with the configuration options, but we had more success getting Milvus running. Chroma was easy enough to set up and configure, but we saw some unusual behaviour when trying to insert data. To make the best use of the time available, we focused on Milvus. We were able to fully set up Milvus, insert embedding data and query it for relevant matching results. Objective achieved, well done Team Vector!
Transformer tussle
Team Transformer needed to understand how to use the Hugging Face tools and choose a suitable model to generate text with. Running an LLM requires a lot of compute resources, so it can’t realistically be done on a typical laptop. The first challenge was therefore picking a platform to download and run the model on. We chose Google Colab as our playground as it’s free up to a limit and perfect for our experiments.
The next challenge was to pick an open source LLM to use. Luckily, Hugging Face provides an LLM Leaderboard where you can compare all sorts of LLMs and decide which ones are most suited to your use case. We decided to try Falcon and Llama-2 and had some interesting results. We initially struggled to run both models, as they kept running out of memory. After applying a technique called quantisation to reduce the size of the Llama-2 model, it was able to run on the compute resources available and we were able to get relevant responses to our queries, like the one below.
Prompt input:
Question: What is electroencephalography?
Answer: Let's think step by step.
The Llama-2 model then produced the following response:
Step 1: Break down the word "electroencephalography" into its individual parts:
* "Electro": means related to electricity
* "Encephalo-": refers to the brain
* "-graphy": means writing or recording
So, electroencephalography can be defined as:
"A technique for recording and analysing electrical activity in the brain".
We ran out of time to apply the quantisation method to the Falcon model but we hope to do so in future to compare the outputs of both. Objective achieved, well done Team Transformer!
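For readers new to quantisation, the sketch below shows the basic idea on a handful of numbers: store each weight as a small integer plus a shared scale factor, trading a little precision for a lot less memory. This is an illustrative symmetric 8-bit scheme written in plain Python, not the method we applied on the day, and real tooling is considerably more sophisticated (for example, it typically quantises weights in groups rather than with a single scale). The function names and sample weights are our own assumptions.

```python
def quantise_int8(weights):
    """Map floats to integers in [-127, 127] using one shared scale factor."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantised = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantised, scale

def dequantise(quantised, scale):
    """Recover approximate float values from the stored integers."""
    return [q * scale for q in quantised]

# Made-up example weights (assumption: real models hold billions of these).
weights = [0.52, -1.27, 0.003, 0.9]

q, scale = quantise_int8(weights)
restored = dequantise(q, scale)

# Rounding error per weight is at most half a quantisation step.
max_error = max(abs(a - b) for a, b in zip(weights, restored))
print(q, scale, max_error)
```

Each weight now needs one byte instead of the two or four bytes used by common floating-point formats, which is why a quantised model can fit into memory that the original could not.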
At the end of the day we each presented our findings and our perspectives on how we used the different tools. We all found it a valuable experience and summarised our learnings from the day so we can improve on the next event.
Where to next?
The Engineering day was a successful and productive experience that we all very much enjoyed. It has given the team the opportunity to learn new skills that we can apply in the wild. It has also sparked some ideas for applying these techniques, which we will be developing further.
One such example is a productivity tool that allows interrogation of any text data set: the ability to upload data and ask questions of it, similar to a private version of ChatGPT where data doesn’t leave our organisation. This will give our colleagues in other disciplines at CYB access to the technology, helping them gain complex insights into data that they might otherwise only have reached with considerable effort or experience outside their discipline.
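One way such a tool could tie the two halves of the day together is to take the passages returned by a vector search and place them in the prompt sent to the model. The sketch below shows only that prompt-assembly step. It is an assumption about how the pieces might fit, not a description of anything we have built; the function name, the instruction wording and the sample passages are all hypothetical.

```python
def build_context_prompt(question, passages, max_passages=3):
    """Combine retrieved passages and a question into a single prompt string."""
    # Keep only the first few passages so the prompt stays within the model's limits.
    context = "\n".join(f"- {passage}" for passage in passages[:max_passages])
    return (
        "Use only the context below to answer the question.\n"
        f"Context:\n{context}\n"
        f"Question: {question}\n"
        "Answer:"
    )

# Hypothetical passages, standing in for results from a vector database query.
passages = [
    "Support is available 9am to 5pm on weekdays.",
    "Password resets are handled via the account page.",
]

prompt = build_context_prompt("When can I contact support?", passages)
print(prompt)
```

Because the passages come from a database we host and the model runs in our own environment, nothing in this flow needs to leave the organisation.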
Using LLM technology on the ground has also made us think about the risks of using it irresponsibly. As the industry gets to grips with it, many patterns and life cycles will emerge, and we intend to be at the forefront, sharing our thoughts and ideas with the tech community as we use LLMs to help our clients use digital as a force for good.
Exciting times ahead!