Artificial intelligence (AI) has come a long way since its inception, and backpropagation is one of the most fundamental algorithms that have contributed to the development of machine learning. It is a mathematical method for training neural networks to recognize patterns in data. The history and development of the backpropagation algorithm, including the contributions of Paul Werbos, take us back to the beginnings of AI.
The concept of neural networks was first introduced in the 1940s and 1950s, but it was in the 1970s that the backpropagation algorithm was first proposed. It was later developed independently by David Rumelhart and Ronald Williams at the University of California, San Diego, together with Geoffrey Hinton, and their version was published in 1986 in a paper titled "Learning representations by back-propagating errors."
From Wikipedia, the definition of backpropagation is:
> In machine learning, backpropagation is a widely used algorithm for training feedforward artificial neural networks or other parameterized networks with differentiable nodes. It is an efficient application of the Leibniz chain rule to such networks.
The basic idea behind backpropagation is to adjust the weights of a neural network by propagating the error back through the network. This is done by comparing the network output to the expected output and adjusting the weights accordingly. The algorithm works by calculating the gradient of the loss function with respect to the weights, which is used to update the weights using gradient descent.
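To make the idea concrete, here is a minimal sketch of backpropagation for a tiny two-layer network, written in Python with NumPy. The layer sizes, learning rate, and random data are arbitrary choices for illustration, not part of any particular published method:

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy data: 4 samples with 3 features each, one target per sample
X = rng.normal(size=(4, 3))
y = rng.normal(size=(4, 1))

# Small two-layer network: 3 inputs -> 5 hidden units -> 1 output
W1 = rng.normal(scale=0.5, size=(3, 5))
W2 = rng.normal(scale=0.5, size=(5, 1))

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

lr = 0.1
losses = []
for step in range(200):
    # Forward pass: compute the network output
    h = sigmoid(X @ W1)            # hidden activations
    y_hat = h @ W2                 # network output
    losses.append(np.mean((y_hat - y) ** 2))

    # Backward pass: propagate the error back, layer by layer
    d_yhat = 2 * (y_hat - y) / len(X)     # dLoss/dOutput
    dW2 = h.T @ d_yhat                    # gradient for output weights
    d_h = d_yhat @ W2.T                   # error pushed back to the hidden layer
    dW1 = X.T @ (d_h * h * (1 - h))       # sigmoid derivative is h*(1-h)

    # Gradient-descent update
    W1 -= lr * dW1
    W2 -= lr * dW2
```

Each backward-pass line is simply the chain rule applied to the layer above it; the final two lines are the gradient-descent update the paragraph describes.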
One of the most significant contributions to the development of backpropagation came from Paul Werbos. In 1974, Werbos proposed a method for training neural networks that he called "backpropagation of errors." His method was similar to the algorithm later developed by Rumelhart, Hinton, and Williams, but he also proposed the use of a nonlinear activation function, which is now a standard feature of neural networks. However, his paper was largely ignored at the time, and it was only with the publication of the Rumelhart, Hinton, and Williams paper that backpropagation gained widespread recognition.
Werbos first introduced the concept of backpropagation in the 1970s, but it was only with the development of faster computers and larger data sets in the 1980s that it became widely adopted as a training method for neural networks. Backpropagation has since become one of the most widely used algorithms in the field of artificial intelligence.
After the publication of the backpropagation algorithm, it quickly became a popular method for training neural networks. However, it had its limitations. One of the main challenges with backpropagation was the problem of vanishing gradients, where the gradients of the loss function with respect to the weights became very small as they propagated through the network. This made it challenging to train deep neural networks.
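A quick numeric sketch of why this happens (illustrative numbers only): the derivative of the sigmoid activation never exceeds 0.25, and backpropagation multiplies one such factor per layer, so the product shrinks rapidly with depth:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# The sigmoid derivative s*(1-s) peaks at 0.25 (at z = 0)
z = 0.0
local_grad = sigmoid(z) * (1 - sigmoid(z))   # 0.25, the best possible case

# Backprop multiplies one such factor per layer (ignoring the weights here)
for depth in (5, 10, 20):
    print(depth, local_grad ** depth)
# At 20 layers the best-case factor is 0.25**20, roughly 9e-13:
# the gradient reaching the early layers has all but vanished.
```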
In recent years, many advancements in deep learning have addressed the limitations of backpropagation. These include the development of new activation functions, such as the rectified linear unit (ReLU), and the use of regularization techniques, such as dropout and weight decay. These advancements have made it possible to train deep neural networks with millions of parameters, leading to breakthroughs in areas such as computer vision, natural language processing, speech recognition and, of course, very large language models like GPT-4.
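As a sketch of why ReLU helps (with made-up values): its derivative is exactly 1 for any positive input, so the chain-rule product through active units does not decay with depth; dropout, for its part, simply zeroes a random subset of activations during training:

```python
import numpy as np

rng = np.random.default_rng(1)

def relu(z):
    return np.maximum(0.0, z)

def relu_grad(z):
    return (z > 0).astype(float)   # 1 for positive inputs, 0 otherwise

# A product of 20 ReLU derivatives for positive pre-activations stays 1.0,
# unlike the sigmoid case, where 0.25**20 is effectively zero.
z_positive = rng.uniform(0.5, 2.0, size=20)
print(np.prod(relu_grad(z_positive)))

# Dropout: randomly zero activations with probability p during training,
# scaling the survivors by 1/(1-p) ("inverted dropout")
p = 0.5
activations = relu(rng.normal(size=8))
mask = (rng.random(8) >= p) / (1 - p)
dropped = activations * mask
```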
The point of all this geeky stuff is that backpropagation is one of the most fundamental algorithms in the development of artificial intelligence and machine learning. Its history dates back to the 1970s, with Paul Werbos making a significant contribution to its development. While there have been challenges associated with backpropagation, recent advancements have made it possible to train deep neural networks and achieve breakthroughs in AI research.
Reverse neural networks
Reverse neural networks, better known as generative adversarial networks (GANs), are a fascinating development in the field of artificial intelligence. These networks, built using a similar architecture to traditional neural networks, can generate new data and even manipulate existing data in previously impossible ways.
What that means, in plain(er) English: the temporal (time) dynamic behavior described here does not come from the GAN itself but from recurrent neural networks (RNNs), with which generative models are often combined. Unlike feedforward neural networks, recurrent networks can use their internal state (memory) to process sequences of inputs. This makes them applicable to tasks such as unsegmented, connected handwriting recognition or speech recognition, and enables very sophisticated operations such as:
- Optical Character Recognition – Converting written or printed text into data.
- Speech Recognition – Converting spoken words into data or commands to be followed.
- Machine Translation – Converting your spoken or written language into another person’s language and vice versa.
- Natural Language Generation – The machine producing meaningful speech in your language.
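The internal state (memory) just mentioned is the defining feature of a recurrent architecture: each step's output depends on the current input and on a hidden state carried over from earlier steps. A minimal sketch of one recurrent cell in Python (all sizes and weights are arbitrary illustrations):

```python
import numpy as np

rng = np.random.default_rng(2)

# A single recurrent cell: input size 3, hidden size 4 (arbitrary)
Wx = rng.normal(scale=0.5, size=(3, 4))   # input -> hidden
Wh = rng.normal(scale=0.5, size=(4, 4))   # hidden -> hidden (the "memory" path)

h = np.zeros(4)                            # internal state starts empty
sequence = rng.normal(size=(5, 3))         # 5 time steps of 3-feature input

for x_t in sequence:
    # The new state mixes the current input with the carried-over state
    h = np.tanh(x_t @ Wx + h @ Wh)

print(h)   # the final state summarizes the whole sequence
```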
The idea behind reverse neural networks is to use a network to create new data that resembles existing data. This is accomplished by training the network on a set of data, such as images or audio files, and then using it to generate new data that is similar to the original set. The network can do this by learning the patterns and relationships that exist within the original data set and then using those patterns to create new data that fits within those relationships.
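Here is a deliberately tiny sketch of that adversarial training idea in Python: a two-parameter generator learns to imitate samples from a Gaussian by fooling a logistic-regression discriminator. The target distribution, learning rate, and step count are arbitrary choices for the example; real GANs use deep networks for both players:

```python
import numpy as np

rng = np.random.default_rng(3)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# "Real" data: samples from a 1-D Gaussian the generator must imitate
real_mean, real_std = 3.0, 0.5

# Generator: g(z) = a*z + b, transforming standard-normal noise
a, b = 1.0, 0.0
# Discriminator: D(x) = sigmoid(w*x + c), the probability that x is real
w, c = 0.1, 0.0

lr = 0.05
for step in range(2000):
    x_real = rng.normal(real_mean, real_std)
    z = rng.normal()
    x_fake = a * z + b

    # Discriminator step: push D(real) up and D(fake) down
    s_real = sigmoid(w * x_real + c)
    s_fake = sigmoid(w * x_fake + c)
    w += lr * ((1 - s_real) * x_real - s_fake * x_fake)
    c += lr * ((1 - s_real) - s_fake)

    # Generator step: push D(fake) up (non-saturating generator loss)
    s_fake = sigmoid(w * x_fake + c)
    grad_g = (1 - s_fake) * w        # gradient of log D(fake) w.r.t. x_fake
    a += lr * grad_g * z
    b += lr * grad_g

# After training, generated samples should cluster near the real mean
samples = a * rng.normal(size=1000) + b
print(samples.mean())
```

The two updates pull against each other: the discriminator learns to score real samples higher, and the generator shifts its output toward whatever the discriminator currently scores as real.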
Reverse neural networks have a wide range of potential applications. They can be used to generate new images or video content, which could be particularly useful in fields such as art or design. They can also be used to create realistic simulations, which could have applications in areas such as engineering.
Reverse neural networks have also been used to create realistic audio and music. One example is the development of WaveNet, a generative neural network that is capable of producing high-quality audio that sounds similar to human speech. This technology has potential applications in fields such as speech synthesis and text-to-speech conversion.
Another application of reverse neural networks is in creating deepfakes, which are manipulated videos or images that appear to be real. While the technology has the potential for harmless fun, it also has the potential to be used for malicious purposes, such as spreading fake news or creating fake evidence.
As with any new technology, there are concerns about reverse neural networks' potential risks and ethical implications. For example, there is the risk that the technology could be used to create highly realistic fake videos or images that could be used to spread disinformation or manipulate public opinion.
Overall, the development of reverse neural networks represents an exciting new frontier in the field of artificial intelligence. While there are potential risks and ethical concerns that must be addressed, the technology has the potential to revolutionize fields such as art, design, and engineering. As research in this area progresses, it will be interesting to see what new applications and developments emerge.
Is the hysteria over LLMs, and OpenAI's ChatGPT in particular, warranted? Some argue it isn't the technology that enables mischief; it's people. True, but that does not give any guidance on what to do about it. For most people, the technology is opaque, and the only way to explain it is through metaphor and analogy, but the problem is that it is so new there isn't a resonant vocabulary to deal with it yet.
I have two other concerns. First, people have been obsessed with the idea that AI will replace humans or, at least, human civilization as we know it. Whether that ever happens or not is a too-distant concern to wring our hands about. What is concerning is the expanding belief that it is already happening through LLMs. It's not. However, LLMs introduce a new peril not seen in our "narrow AI": language. Language is culture, and by synthesizing language and controlling the narrative, we are at risk of losing our culture to a machine culture, rife with hallucinations and non-facts. That is a clear and present danger.