What to expect from the next generation of chatbots: OpenAI’s GPT-5 and Meta’s Llama 3

Recently, there has been a flurry of publicity about the planned upgrades to OpenAI’s ChatGPT AI-powered chatbot and Meta’s Llama system, which powers the company’s chatbots across Facebook and Instagram.

The technology behind these systems is known as a large language model (LLM). LLMs are artificial neural networks, a type of AI loosely modelled on the human brain. They can generate general-purpose text for chatbots and perform language processing tasks such as classifying concepts, analysing data and translating text.

They acquire these abilities through an intensive process known as training, where the AI system is exposed to vast amounts of data in an effort to improve what it does. OpenAI and Meta are expected to release the newer versions of their chatbots, called GPT-5 and Llama 3 respectively, before the end of summer 2024. But how will these differ from their predecessors, and what value will they add?

Like its predecessor GPT-4, GPT-5 will be capable of understanding images and text. For instance, users will be able to ask it to describe an image, making it even more accessible to people with visual impairments.

However, GPT-5 will have superior capabilities with different languages, making it easier for non-English speakers to communicate and interact with the system. This includes greater mastery of language translation. The upgrade will also be better at interpreting the context of a dialogue and the nuances of language.

Hence, it will be able to provide more accurate information to users. For instance, the system’s improved analytical capabilities will allow it to suggest possible medical conditions from symptoms described by the user. GPT-5 will be able to process up to 50,000 words at a time, twice as many as GPT-4, making it even better equipped to handle large documents.

It will feature a higher level of emotional intelligence, allowing for more empathic interactions with users. This could be useful in a range of settings, including customer service. GPT-5 will also display a significant improvement in the accuracy of how it searches for and retrieves information, making it a more reliable source for learning.

It is said to go far beyond the functions of a typical search engine that finds and extracts relevant information from existing information repositories, towards generating new content.

GPT-5 is also expected to show higher levels of fairness and inclusion in the content it generates due to additional efforts put in by OpenAI to reduce biases in the language model.

It will be able to interact in a more intelligent manner with other devices and machines, including smart systems in the home. GPT-5 should be able to analyse and interpret data generated by these other machines and incorporate it into its responses to users. It will also be able to learn from this data, with the aim of providing more customised answers.

This could enable smarter environments at home and in the workplace. GPT-5 will be more compatible with what’s known as the Internet of Things, where devices in the home and elsewhere are connected and share information. It should also help support the concept known as Industry 5.0, where humans and machines operate interactively within the same workplace.

GPT-5 will feature stronger security protocols that make this version more robust against malicious use and mishandling. It could be used to enhance email security by helping users to recognise potential data security breaches or phishing attempts.

Overall, the upgrade from OpenAI should be more versatile and more energy efficient in its computations, and should offer a more adaptable and personalised service.

Meta’s Llama upgrade

Llama 3 is Meta’s competitor to GPT-5. It features several improvements compared to its predecessor, Llama 2. It is a more capable model that will eventually come with 400 billion parameters, compared to a maximum of 70 billion for Llama 2. In machine learning, a parameter is a variable in the AI system that can be adjusted during the training process in order to improve the system’s ability to make accurate predictions.
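
To make the idea of a parameter concrete, here is a minimal sketch in Python. It is not how Llama or GPT models are actually trained. It is a toy model with just two parameters, and the names weight, bias and train, along with the data, are made up for illustration. What it shows is the basic principle described above: the parameters start at arbitrary values and are nudged repeatedly during training so that the model's predictions get closer to the examples it is shown.

```python
# Toy model with two parameters: prediction = weight * x + bias.
# Real LLMs have billions of parameters, but each is adjusted in a
# broadly similar way: a small step in the direction that reduces error.

def train(data, steps=2000, learning_rate=0.05):
    """Fit weight and bias to (x, y) pairs using plain gradient descent."""
    weight, bias = 0.0, 0.0  # parameters start at arbitrary values
    n = len(data)
    for _ in range(steps):
        # Gradients of the mean squared error with respect to each parameter.
        grad_w = sum(2 * (weight * x + bias - y) * x for x, y in data) / n
        grad_b = sum(2 * (weight * x + bias - y) for x, y in data) / n
        # Adjust each parameter slightly to reduce the error.
        weight -= learning_rate * grad_w
        bias -= learning_rate * grad_b
    return weight, bias


# Hypothetical training examples that follow the rule y = 2x + 1.
data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]
weight, bias = train(data)
print(f"weight={weight:.3f}, bias={bias:.3f}")
```

After training, the two parameters settle close to the values that generated the examples (a weight near 2 and a bias near 1). A model with 400 billion parameters applies the same kind of adjustment across a vastly larger set of variables.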

Llama 3 will also be multimodal, which means it will be capable of processing and generating text, images and video. It will therefore be able to take an image as input and provide a detailed description of the image content. Equally, it will be able to create a new image automatically to match the user’s prompt, or text description.

It will be able to perform tasks in languages other than English and will have a larger context window than Llama 2. A context window is the amount of text that an LLM can take into account at once when it generates a response. A larger window means the model will be able to handle larger chunks of text or data in one go when it is asked to make predictions and generate responses.
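
The effect of a context window can be sketched in a few lines of Python. This is an illustration under simplifying assumptions, not the way any particular chatbot works. The function name fit_to_context_window is hypothetical, and splitting text on spaces stands in for the tokenisers that real models use, which break text into smaller units than whole words. The sketch shows that anything falling outside the window is simply not visible to the model when it generates a response.

```python
def fit_to_context_window(text, window_size):
    """Keep only the most recent `window_size` tokens of `text`.

    Splitting on whitespace is a stand-in for a real tokeniser.
    """
    if window_size <= 0:
        return []
    tokens = text.split()
    return tokens[-window_size:]


document = "the quick brown fox jumps over the lazy dog"

# A small window sees only the end of the document.
small = fit_to_context_window(document, 4)
# A larger window keeps almost all of it.
large = fit_to_context_window(document, 8)

print(small)
print(large)
```

With a window of four tokens, the model would see only the last four words of the sentence. With a window of eight, it would see everything except the first word. A larger context window therefore lets the model draw on more of a long document or conversation at once.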