What I got right (and wrong) about trends in 2024, and daring to make bolder predictions for the year ahead
In 2023, building AI-powered applications felt full of promise, but the challenges were already starting to show. By 2024, we began experimenting with techniques to handle the hard realities of making them work in production.
Last year, I reviewed the biggest trends in AI in 2023 and made predictions for 2024. This year, instead of a timeline, I want to focus on key themes: Which trends emerged? Where did I get it wrong? And what can we expect for 2025?
If I had to summarize the AI space in 2024, it would be the "Captain, it's Wednesday" meme. The number of major releases this year was overwhelming, and I don't blame anyone in this space who is feeling exhausted toward the end of it. It's been a wild ride, and it has been hard to keep up. Let's review the key themes in the AI space and see whether I predicted them correctly last year.
Evaluations
Let's start by looking at the generative AI solutions that made it to production. There aren't many. As a survey by A16Z revealed in 2024, companies are still hesitant to deploy generative AI in customer-facing applications. Instead, they feel more confident using it for internal tasks, like document search or chatbots.
So, why aren't there more customer-facing generative AI applications in the wild? Probably because we're still figuring out how to evaluate them properly. This was one of my predictions for 2024.
Much of that evaluation involves using another LLM to judge the output of an LLM (LLM-as-a-judge). While the approach is clever, it is also imperfect: it adds cost, introduces bias, and can be unreliable.
Looking back, I expected this issue to be solved this year. However, looking at the landscape today, despite it being a major topic of discussion, we still haven't found a reliable way to evaluate generative AI solutions effectively. Although I think LLM-as-a-judge is the only way to evaluate generative AI solutions at scale, this shows how early we are in this field.
Multimodality
Although this one might have been obvious to many of you, I didn't have it on my radar for 2024. With the releases of GPT-4o, Llama 3.2, and ColPali, multimodal foundation models were a big trend in 2024. While we developers were busy figuring out how to make LLMs work in our existing pipelines, researchers were already one step ahead, building foundation models that can handle more than one modality.
"There is *absolutely no way in hell* we will ever reach human-level AI without getting machines to learn from high-bandwidth sensory inputs, such as vision." — Yann LeCun
Take PDF parsing as an example of multimodal models' usefulness beyond text-to-image tasks. The ColPali researchers avoided the difficult steps of OCR and layout extraction by using vision language models (VLMs). Systems like ColPali and ColQwen2 process PDFs as images, extracting information directly without pre-processing or chunking. It's a reminder that simpler solutions often come from changing how you frame the problem.
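For a rough sense of how these systems match a query to a page, here is the late-interaction (MaxSim) scoring step that ColBERT and ColPali share, sketched with NumPy. The shapes and the random embeddings are made up for illustration; a real system would get per-token and per-patch embeddings from the model.

```python
# Late-interaction (MaxSim) scoring, sketched with random embeddings.
# Each query-token embedding is matched to its most similar image patch,
# and the per-token maxima are summed into a page score.
import numpy as np

def maxsim_score(query_emb: np.ndarray, page_emb: np.ndarray) -> float:
    """query_emb: (n_query_tokens, dim); page_emb: (n_patches, dim)."""
    sims = query_emb @ page_emb.T          # (n_query_tokens, n_patches)
    return float(sims.max(axis=1).sum())   # best patch per token, summed

rng = np.random.default_rng(0)
query = rng.normal(size=(8, 128))    # 8 query-token embeddings (illustrative)
page = rng.normal(size=(1024, 128))  # 1024 image-patch embeddings (illustrative)
score = maxsim_score(query, page)
```

Because the page keeps one embedding per patch instead of a single vector, the model can localize which region of the page answers each query token.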
Multimodal models are a bigger shift than they might seem. Document search across PDFs is only the beginning. Multimodality in foundation models will unlock entirely new possibilities for applications across industries. With more modalities, AI is no longer just about language; it's about understanding the world.
Fine-tuning open-weight models and quantization
Open-weight models are closing the performance gap to closed models. Fine-tuning them gives you a performance boost while staying lightweight. Quantization makes these models smaller and more efficient (see also Green AI), so they can run anywhere, even on small devices. Quantization pairs well with fine-tuning, especially since fine-tuning language models is so resource-intensive (see QLoRA).
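The core idea behind quantization is simple enough to sketch in a few lines: map float weights onto a small integer grid plus a scale factor. The symmetric per-tensor 8-bit scheme below is a deliberately basic illustration; production schemes (such as the 4-bit NormalFloat that QLoRA uses) are more sophisticated.

```python
# Symmetric 8-bit weight quantization: store int8 values plus one float
# scale, cutting storage roughly 4x versus float32 at a small accuracy cost.
import numpy as np

def quantize_int8(w: np.ndarray) -> tuple[np.ndarray, float]:
    scale = float(np.abs(w).max()) / 127.0 or 1.0   # guard against all-zero w
    q = np.clip(np.round(w / scale), -127, 127).astype(np.int8)
    return q, scale

def dequantize(q: np.ndarray, scale: float) -> np.ndarray:
    return q.astype(np.float32) * scale

weights = np.array([0.5, -1.0, 0.25, 0.0], dtype=np.float32)
q, scale = quantize_int8(weights)   # int8 codes + one scale factor
restored = dequantize(q, scale)     # close to, but not exactly, the original
```

The small round-trip error is the price of the memory savings, which is why quantization-aware fine-tuning matters.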
Together, these trends make it clear that the future isn't just bigger models; it's smarter ones.
I don't think I explicitly called this one, and I only wrote a small piece on it in the second quarter of 2024. So I won't give myself a point here.
AI agents
This year, AI agents and agentic workflows gained a lot of attention, as Andrew Ng predicted at the beginning of the year. We saw LangChain and LlamaIndex incorporate agents, CrewAI gain a lot of momentum, and OpenAI come out with Swarm. This is another topic I hadn't seen coming, since I hadn't looked into it.
"I think AI agentic workflows will drive massive AI progress this year — perhaps even more than the next generation of foundation models." — Andrew Ng
Despite the huge interest in AI agents, they remain controversial. First, there is still no clear definition of an "AI agent" and its capabilities. Are AI agents just LLMs with access to tools, or do they have other specific capabilities? Second, they come with added latency and cost. I've read many comments arguing that agent systems aren't suitable for production because of this.
Still, I think we've already seen some agentic pipelines in production with lightweight workflows, such as routing user queries to specific function calls. I expect we will continue to see agents in 2025, and hopefully we will get a clearer definition and picture.
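Such a lightweight routing workflow can be sketched in a few lines. The tool names and the keyword-based `choose_tool` stub below are illustrative assumptions; in a real pipeline, that step would be one cheap LLM function-calling request, while the dispatch around it stays this simple.

```python
# A lightweight "agentic" router: pick one registered function per user
# query and call it. choose_tool() stands in for an LLM function-calling
# step; its keyword heuristic is purely illustrative.
def search_docs(query: str) -> str:
    return f"searching docs for: {query}"

def track_order(query: str) -> str:
    return f"looking up order in: {query}"

TOOLS = {"search_docs": search_docs, "track_order": track_order}

def choose_tool(query: str) -> str:
    """Stand-in for the LLM's tool-selection decision."""
    return "track_order" if "order" in query.lower() else "search_docs"

def route(query: str) -> str:
    tool_name = choose_tool(query)   # in production: one LLM call
    return TOOLS[tool_name](query)
```

One decision point and a handful of deterministic tools keeps latency and cost low, which is why this is the kind of agentic pipeline that already survives in production.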
RAG isn’t de*d and retrieval goes mainstream
Retrieval-Augmented Generation (RAG) gained significant attention in 2023 and remained a key topic in 2024, with many new variants emerging. However, it remains a subject of debate. Some argue it's becoming obsolete with long-context models, while others question whether it's even a new idea. While I think the criticism of the terminology is justified, I think the concept is here to stay (for a while at least).
Every time a new long-context model is released, some people predict it will be the end of RAG pipelines. I don't think that's going to happen. This whole discussion deserves a blog post of its own, so I won't go into depth here and will save it for another post. Let me just say that I don't think it's one or the other; they complement each other. We will most likely use long-context models together with RAG pipelines.
Also, having a database behind an application is not a new concept. The term "RAG," which refers to retrieving information from a knowledge source to enhance an LLM's output, has faced criticism. Some argue it's merely a rebranding of techniques long used in other fields, such as software engineering. While I think we will probably part with the term in the long run, the technique is here to stay.
Despite predictions of RAG's demise, retrieval remains a cornerstone of AI pipelines. While I may be biased by my work in retrieval, it felt like this topic became more mainstream in AI this year. It started with many discussions around keyword search (BM25) as a baseline for RAG pipelines and then evolved into a larger discussion around dense retrieval models, such as ColBERT or ColPali.
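That keyword-search baseline is small enough to write out. Below is a tiny BM25 scorer using the standard formula with the usual defaults (k1 = 1.5, b = 0.75); the whitespace tokenization and toy corpus are deliberately naive, which is exactly why BM25 makes a cheap, strong baseline to beat.

```python
# Tiny BM25 scorer: rank documents against a query by term frequency,
# inverse document frequency, and length normalization.
import math
from collections import Counter

def bm25_scores(query: str, docs: list[str],
                k1: float = 1.5, b: float = 0.75) -> list[float]:
    tokenized = [d.lower().split() for d in docs]
    avg_len = sum(len(t) for t in tokenized) / len(tokenized)
    n = len(docs)
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            df = sum(term in t for t in tokenized)          # document frequency
            idf = math.log((n - df + 0.5) / (df + 0.5) + 1)
            freq = tf[term]
            score += idf * freq * (k1 + 1) / (
                freq + k1 * (1 - b + b * len(tokens) / avg_len))
        scores.append(score)
    return scores

docs = ["the cat sat on the mat", "dogs chase cats", "quantum computing basics"]
ranked = bm25_scores("cat mat", docs)   # highest score for the first document
```

The naive tokenizer also shows BM25's main weakness: "cats" in the second document never matches the query term "cat", which is the gap dense retrieval models are meant to close.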
Knowledge graphs
I completely missed this topic because I'm not very familiar with it. Knowledge graphs in RAG systems (e.g., Graph RAG) were another big topic. Since all I can say about knowledge graphs at the moment is that they seem to be a powerful external knowledge source, I'll keep this section short.