How LLMs Learn — And What That Means
To set the stage, let’s think about how LLMs (Large Language Models), the predominant AI systems used for software development today, actually work.
A model is a mathematical function defined by billions of parameters. These parameters are tuned based on training data. The goal of training is to make the function good at predicting the next token, given a sequence of input tokens.
We can thus see that the knowledge embedded in the model is a complex mathematical distillation of all that it has “seen” during training.
An important implication is that we should expect the model to be biased toward whatever is well represented in its training data.
As software engineers employ AI models more and more, this leads to an important phenomenon: the code a model produces tends to reinforce what is already there, rather than what ought to be there.
A Case in Point: Java’s Rapid Evolution
Let’s take an example to make this concrete. The Java language ecosystem has been a powerhouse of innovation in recent years. Oracle and the OpenJDK community are delivering well-researched improvements to the language, tooling, and developer experience at an amazing pace. This is evident from the release cycle: the gap between Java 7 and Java 8 was nearly three years, while today meaningful features ship every six months. And, no, this has NOT been due to AI. It is simply the focus and effort of Oracle’s Java team and the wider community.

The innovations are remarkably diverse: new syntax (pattern matching), new constructs (records and, eventually, value types), new concurrency primitives (virtual threads), performance improvements (sub-millisecond GC pauses), and new build paradigms (native compilation). This diversity means capabilities have changed dramatically. What was possible a few years ago may be entirely different now, depending on your use case. The way code is written can be fundamentally different. And all of this is for the better. Why? Because, thanks to Java’s extreme backward-compatibility guarantees, you can still write code in 1990s style and the system will happily run it. But you would never want to, if you had a choice.
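To illustrate how different the new idioms look, here is a small sketch (requires JDK 21 or later) contrasting the old instanceof-and-cast style with sealed interfaces, records, and pattern matching for switch. The shapes and method names are invented for the example:

```java
// Sealed types and records (Java 17+) plus record patterns in
// switch (Java 21) replace verbose instanceof-and-cast chains.
sealed interface Shape permits Circle, Square {}
record Circle(double radius) implements Shape {}
record Square(double side) implements Shape {}

public class AreaDemo {
    // Old style: manual type checks and casts
    static double areaOld(Shape s) {
        if (s instanceof Circle) {
            Circle c = (Circle) s;
            return Math.PI * c.radius() * c.radius();
        } else if (s instanceof Square) {
            Square sq = (Square) s;
            return sq.side() * sq.side();
        }
        throw new IllegalArgumentException("unknown shape");
    }

    // New style: an exhaustive switch over a sealed hierarchy,
    // destructuring each record in its case label
    static double areaNew(Shape s) {
        return switch (s) {
            case Circle(double r) -> Math.PI * r * r;
            case Square(double side) -> side * side;
        };
    }

    public static void main(String[] args) {
        System.out.println(areaNew(new Square(3))); // 9.0
    }
}
```

Note that the compiler checks the switch for exhaustiveness against the sealed hierarchy, so there is no default branch to forget.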
However, the point of this article is to call out that, if you want to take advantage of AI’s supercharged productivity, you may not have that choice. Why? Because the new Java likely represents a minuscule fraction of the training set compared to the old Java. So an LLM’s parametric knowledge can be expected to be heavily biased toward old patterns.
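Concurrency is a good place to see this bias in action. The sketch below (JDK 21+) contrasts the fixed-thread-pool pattern that dominates older training data with the modern virtual-thread idiom; both helpers are invented for the example and return the same result:

```java
import java.util.concurrent.*;

public class ThreadDemo {
    // Old pattern, abundant in pre-2023 code: a carefully sized
    // pool of heavyweight platform threads.
    static int runOnPool(Callable<Integer> task) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(8);
        try {
            return pool.submit(task).get();
        } finally {
            pool.shutdown();
        }
    }

    // Modern pattern (Java 21): one cheap virtual thread per task,
    // with no pool sizing to reason about.
    static int runOnVirtualThread(Callable<Integer> task) throws Exception {
        try (ExecutorService ex = Executors.newVirtualThreadPerTaskExecutor()) {
            return ex.submit(task).get();
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println(runOnVirtualThread(() -> 6 * 7)); // 42
    }
}
```

A model that has mostly seen the first pattern will keep suggesting it, even where a virtual thread per task is now the simpler choice.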
The Reinforcement Trap
Now, you see the problem? New code written with AI help will continue to propagate the old patterns, reinforcing them further. In other words, if we do nothing, we are headed toward a crisis of innovation. I have to qualify here that passionate and clever people will continue to work on innovative ideas, but they will face a huge barrier in disseminating that innovation. And eventually the innovation itself could stop.
Obviously this cannot be allowed to happen. So I am not here to say that we are headed for doom, but rather to think out loud about how we might end up solving this. There is no way to avoid this problem, nor is it possible to go down with it. The only way is through.
Finding a Way Through
Let’s go back to the Java example to make the scenario more concrete. Imagine a software company of the future that runs primarily on Java. It has many Java apps in production, each dating back to a different period. A new project is underway and a new app is being built. Pre-AI, this is the time when engineers would bring in the latest and greatest. But this time they face a challenge: the LLMs they are using haven’t seen the latest patterns enough to intuitively think in them. What do they do?
I submit the following (imagined) solutions to the team of the future:
1. Their LLM provider offers a “fine-tuning” facility to embed new patterns into the model. In practice, the team would prepare a sample project with documentation and upload it for fine-tuning. The LLM provider would use this to “rewire” the model’s internal knowledge.
2. They prepare DOs and DON’Ts to instruct the LLM. Like a teacher disciplining a rebellious child, the team forces the LLM to behave correctly.
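The fine-tuning route might involve packaging curated code samples as training examples. Here is a minimal, hypothetical sketch: the JSONL prompt/completion format, the class name, and the sample pairs are all assumptions, since the actual format depends entirely on the provider:

```java
import java.util.List;

// Hypothetical sketch of preparing fine-tuning data: each curated
// pattern becomes a prompt/completion pair, emitted as one JSON
// object per line (JSONL). The format is an assumed placeholder,
// not any specific provider's real API.
public class TuningSetBuilder {
    // Escape backslashes, quotes, and newlines for embedding in JSON
    static String escape(String s) {
        return s.replace("\\", "\\\\")
                .replace("\"", "\\\"")
                .replace("\n", "\\n");
    }

    // Build one JSONL training example from a prompt/completion pair
    static String example(String prompt, String completion) {
        return "{\"prompt\": \"" + escape(prompt)
             + "\", \"completion\": \"" + escape(completion) + "\"}";
    }

    public static void main(String[] args) {
        List<String> tuningSet = List.of(
            example("Start a lightweight thread per task",
                    "Thread.startVirtualThread(task);"),
            example("Print each name in a list",
                    "names.forEach(System.out::println);"));
        tuningSet.forEach(System.out::println);
    }
}
```

In reality, the team would generate hundreds of such pairs from a curated sample project and upload the resulting file to the provider.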
I guess I have rather unimaginatively given away the point here. The second approach is not going to work; we need something like the first. We are going to need some kind of mechanism for deep rewiring with manageable effort. I am sure the LLM providers are already working hard on this. Since I have no particular expertise in the area, let me not speculate on how it will develop.
Engineers as AI Tutors
Instead, let’s focus on what this could mean for software engineering teams of the (near) future. Once such facilities become available, engineers will need to take on the role of AI “tutors.” Like all good tutors, they must curate good patterns and make them available for the model to learn from. Depending on how everything pans out, this might mean preparing just a few samples, or many samples, or something else entirely.
We can speculate about how much effort this would involve based on our historical experience training human students. Think about how much effort it took you to learn a new programming language or paradigm: a few lectures or videos and a few dozen practice examples usually suffice. The best-case scenario is that a similar effort will be enough to give an AI a genuinely new way of thinking.
To be honest, I have some doubts about the technical feasibility of this. The learning and generalisation capability of the human brain, per byte of new information, is astronomically superior to AI’s. We all know this: LLMs had to mug up terabytes of data to achieve the proficiency of a human who, on average, would have properly read barely a hundred books (which amounts to a few megabytes).
But then, as we discussed, this problem needs to be solved. It is everybody’s need, so we can hope it will get solved one way or another.
Living With 2010s Code — For Now
For now, perhaps for this year and a couple more, we might accept that LLMs are going to generate 2010s-era code by default. (If we could measure the mean age of the code in the training set, I would guess it falls somewhere in the 2010s.)
But at some point, there will be real demand for AI-generated code to adopt more modern patterns. Where will that demand come from? Likely from businesses seeking competitive advantage: fewer bugs, reduced infrastructure costs, and so on. In any case, that will be the point when our AIs need our hand to jump past their inherent limitations and truly learn new things.
Then at least some of us (if not many) will have to step up and start tutoring our AI mates. Because, at the end of the day, no AI learns as well, with so little, as we do.
Aside: if this happens, we might see the rise of a new industry, AI tutoring as a service, where a few specialised companies charge good money to upskill other people’s AIs!