ChatGPT: Unveiling the power of advanced language models

ChatGPT took the world by storm when it launched in late 2022. But where did it come from, and how does it actually work?
Tejasri Gururaj
Illustration of the ChatGPT logo.
The rise of ChatGPT and large language models.


  • Large language models are taking the world by storm.
  • They have become essential in advancing AI.
  • Let's find out how they work and what the future may hold for this technology.

The story of artificial intelligence can be said to date back to the 1950s and the British computer scientist Alan Turing. Turing was a pioneer in the field of computer science and cryptography. 

He proposed a test to determine if a machine exhibited human-like intelligent behavior. The Turing Test, named after him, was designed to challenge the notion of machine intelligence. It played a crucial role in the development of artificial intelligence (AI), as it served as a benchmark for evaluating the progress and capabilities of AI systems. 

One of the cornerstones of modern AI development is the large language model.

Language models are computational models that specifically deal with the understanding and generation of human language. They are designed to capture the statistical patterns, semantic relationships, and syntactic structures in language.
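At their simplest, "capturing statistical patterns" can be illustrated with a bigram model that counts which word follows which. This toy Python sketch (the function name and tiny corpus are illustrative) is nowhere near a modern LLM, but it shows the core idea of modeling language as probabilities:

```python
from collections import Counter, defaultdict

def train_bigram_model(corpus):
    """Count word-to-word transitions and normalize them into probabilities."""
    counts = defaultdict(Counter)
    for sentence in corpus:
        words = sentence.lower().split()
        for prev, nxt in zip(words, words[1:]):
            counts[prev][nxt] += 1
    # Convert raw counts to conditional probabilities P(next word | previous word).
    return {
        prev: {word: c / sum(followers.values()) for word, c in followers.items()}
        for prev, followers in counts.items()
    }

corpus = ["the cat sat", "the cat ran", "the dog sat"]
model = train_bigram_model(corpus)
print(model["the"])  # "cat" follows "the" twice as often as "dog" does
```

Modern language models learn vastly richer representations, but they still ultimately assign probabilities to what comes next.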

Alan Turing is often considered the father of modern computer science.

Language models are the backbone of many natural language processing (NLP) techniques. NLP is a branch of computer science concerned with giving computers the ability to understand text and spoken words in a similar way to humans.

Language models help computers to perform tasks, such as language understanding, generation, translation, and sentiment analysis. By leveraging language models, computer systems can process and generate text, engage in conversations, and perform various language-related tasks. 

These models have become essential in advancing AI capabilities and applications in areas such as virtual assistants, chatbots, content generation, and language processing tasks across industries. 

Advancements in language models, such as the development of transformer-based models like GPT (generative pre-trained transformers), have significantly enhanced the capabilities of NLP systems. These developments have caused a revolution, enabling more accurate and contextually relevant language understanding and generation.

Here we take a detailed look at the history of advanced language models and their evolution from simple chatbots to the revolutionary ChatGPT. We shall also explore their impact on various industries and the ethical considerations surrounding their use. Finally, we will examine what the future may hold for language models, and what this will mean for conversational AI and for AI in general.

NLP and AI have revolutionized chatbots.

It’s a long one, so grab a snack.

A brief history of chatbots

Natural dialog systems, or chatbots, have been around since the late 20th century. The first generation of chatbots were retrieval-based, meaning that the chatbots relied on pre-defined responses or patterns to provide answers based on specific keywords or phrases.

Many chatbots today are powered by AI and are generative-based. They use advanced NLP and machine learning (ML) to understand user input and generate contextually relevant responses. Additionally, they can engage in more dynamic and conversational interactions. This means that the conversation feels more personal and human-like.

Let's look at the evolution of chatbots from ELIZA to ChatGPT.

ELIZA

ELIZA was one of the earliest retrieval-based chatbots. Developed in 1966 by Joseph Weizenbaum at MIT, it employed techniques such as keyword matching and transforming statements into questions, allowing it to engage in basic conversation. Its aim was to simulate conversations in person-centered therapy.
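A retrieval-based bot of this kind can be sketched in a few lines. The rules below are illustrative stand-ins, not Weizenbaum's original scripts, but they show keyword matching and the statement-to-question transformation at work:

```python
import re

# ELIZA-style rules: a keyword pattern paired with a response template.
# These patterns and responses are made up for illustration.
RULES = [
    (re.compile(r"\bI am (.+)", re.IGNORECASE), "Why do you say you are {0}?"),
    (re.compile(r"\bI feel (.+)", re.IGNORECASE), "How long have you felt {0}?"),
    (re.compile(r"\bmother\b", re.IGNORECASE), "Tell me more about your family."),
]

def respond(user_input):
    """Return the response for the first matching rule, or a stock fallback."""
    for pattern, template in RULES:
        match = pattern.search(user_input)
        if match:
            return template.format(*match.groups())
    return "Please go on."

print(respond("I am feeling stuck"))  # Why do you say you are feeling stuck?
print(respond("Nice weather today"))  # Please go on.
```

Because there is no understanding behind the pattern matching, anything outside the rules falls through to a canned reply, which is why such bots break down so quickly.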

Most of its answers, however, are absurd. You can try it out here.

The evolution of chatbots.

PARRY

Next came PARRY, a chatbot designed to simulate a person with diagnosable paranoia. Developed in 1972 by Kenneth Colby at Stanford University, PARRY responded differently based on parameters such as fear, anger, and mistrust.

Jabberwacky

The next leap came in the 1980s, when British programmer Rollo Carpenter developed Jabberwacky using AI. He employed ML techniques to improve its responses over time, providing an early example of a learning chatbot. Jabberwacky aimed to generate human-like conversation by learning from previous interactions and user inputs.

You can still chat with Jabberwacky yourself.

A.L.I.C.E.

Artificial Linguistic Internet Computer Entity, or A.L.I.C.E., was the first online chatbot inspired by ELIZA. Introduced in 1995 by Richard Wallace, it was designed to simulate human-like conversations using an extensive knowledge base and pattern-matching techniques. It won the Loebner Prize, an annual competition for AI chatbots, multiple times.

Its answers are actually coherent and follow a logical order. You can try it here.

Mitsuku

Mitsuku is another prominent AI chatbot, developed by Steve Worswick during the 2000s. Winner of multiple Loebner Prizes, Mitsuku works in a similar way to A.L.I.C.E. It is a supervised training model that is still being actively developed, with the developers tweaking the rules to make the chatbot's responses more human-like.

It's now been renamed Kuki, but you can still chat with it.

Siri

Apple marked a significant milestone in 2011 by releasing Siri, the first AI-powered virtual assistant. Siri combines NLP, ML, and voice recognition to perform tasks, assist with various functions, and answer questions across Apple devices.

Apple's launch of Siri was a major milestone in the evolution of chatbots.

IBM Watson

Since Apple released Siri, multiple companies have developed AI-powered virtual assistants. IBM's Watson was unveiled in 2011 and gained attention for its ability to analyze vast amounts of data, understand natural language queries, and provide precise answers. Its powerful AI capabilities made it a versatile tool for a variety of industries, such as finance, healthcare, and research.

Google Now

Google Now, launched in 2012, was an intelligent personal assistant developed by Google. It used ML, NLP, and data analysis to provide personalized information, recommendations, and assistance to users on their mobile devices. It is now known as Google Assistant.

Cortana

Introduced by Microsoft in 2014, Cortana served as Microsoft's AI-powered virtual assistant, providing voice-based assistance, answering questions, setting reminders, and performing tasks across various Microsoft devices and services. As of 2023, Cortana is being replaced by Bing Chat AI and Windows Copilot.

Amazon Alexa

Amazon Alexa, also released in 2014, is a virtual assistant based on Ivona, a Polish speech synthesizer. It operates on devices such as the Echo smart speakers, Echo Dot, Echo Studio, and Amazon Tap speakers. It uses NLP and ML to perform tasks, control smart home devices, and provide information.

ChatGPT

ChatGPT, built on large language models (LLMs), is an AI chatbot developed by OpenAI. It is based on the GPT architecture and designed to engage in dynamic and contextually relevant conversations. With its advanced NLP capabilities, ChatGPT can provide informative and coherent responses across a range of topics, making it a versatile tool for conversational interactions.

ChatGPT is a conversational AI chatbot developed by OpenAI.

Chatbots have come a long way, thanks to NLP and ML.

NLP helps chatbots understand and interpret human language accurately, while ML enables them to learn from data, adapt to user preferences, and improve responses. This synergy has made chatbots more intelligent and better able to provide contextually relevant and personalized conversations and deliver a more natural and satisfying user experience.

Pioneering the future of AI

Founded in 2015 by titans of the tech industry, including Elon Musk, Sam Altman, Greg Brockman, Ilya Sutskever, John Schulman, and Wojciech Zaremba, OpenAI conducts AI research to develop and promote what some have referred to as friendly AI. The company has become a pioneer in the pursuit of artificial general intelligence (AGI).

OpenAI's mission is to create AGI that benefits all of humanity. Their focus is on generative models. These are ML models capable of generating data similar to their training data. They do this by learning the underlying patterns and distribution in the training data and then generating new data based on it.

One of the most significant projects undertaken by OpenAI is the development of GPTs. These are LLMs trained on vast amounts of data that can generate human-like responses, leading to applications in NLP, content generation, etc.

CEO and co-founder of OpenAI, Sam Altman.

GPT-1, released back in 2018, was developed following Google's introduction of the transformer architecture the previous year. This is a type of neural network architecture designed to handle text sequences of varying lengths.

Transformer models rely on self-attention mechanisms to capture relationships between different positions in the input sequence, allowing for parallel processing and effective modeling of long-range dependencies. This enables the model to capture the relationship between each word and other words within the same sentence.

Transformer models, like GPT, are built such that their structure gives them a more organized memory, leading to better performance. This makes them more effective and versatile than previous models like recurrent neural networks (RNNs).

Within just five years of the initial release, OpenAI released its fourth iteration, GPT-4, in March 2023.

In addition to GPT, OpenAI is involved in the development of AI systems capable of operating on physical robots. They previously had a robotics team that successfully developed a robot capable of solving a Rubik's cube. However, the team was disbanded in 2021 to prioritize progress in generative AI and LLMs.

Despite this, in March 2023, OpenAI invested in 1X, which builds robots to benefit society, potentially signaling OpenAI's renewed interest in embodied intelligence and robotics.

OpenAI also released DALL·E, a model that generates and edits images via user prompts, and Whisper, a model that converts audio to text.

Through their various projects, it is clear that OpenAI is at the forefront of AI research. The company claims that, through responsible R&D, collaboration, and democratizing access, it aims to drive progress in AI research, promote safety and ethics, and contribute to the responsible development and deployment of AI technologies.

An image generated by OpenAI's DALL-E for the input prompt, "A horse piggybacking an astronaut".

The birth of ChatGPT

The development of the GPT architecture has revolutionized the field of conversational AI. Building on the success of the architecture, OpenAI released ChatGPT, an AI chatbot based on the GPT architecture. 

ChatGPT was first launched on November 30, 2022, based on the GPT-3.5 architecture. Within three months of its launch, OpenAI's valuation grew to US $29 billion, and ChatGPT became what was at that time the fastest-growing commercial software in history.

It is known for allowing its users to fine-tune and direct a discussion toward a particular outcome. Earlier exchanges are taken into account at each turn of the conversation, enabling human-like dialogue.

ChatGPT now has over 100 million users, according to recent data.

Time taken by ChatGPT to reach 1 million users compared to other online services.

ChatGPT was first released as a freely available research preview, but due to its popularity, OpenAI has given free access to the GPT-3.5 version, while the latest GPT-4 version is for paid subscribers only.

The model has been fine-tuned using a combination of reinforcement and supervised learning from human feedback. Human AI trainers provide evaluations and rankings of various model-generated responses. This iterative training process enhances the model's conversational abilities over time.

In fact, ChatGPT made it to the cover of Time magazine. In the February 2023 issue, a screenshot of a conversation with ChatGPT was placed on the cover page, with the AI chatbot discussing the article title, 'The AI Arms Race Is Changing Everything'.

Despite its achievements, ChatGPT is not without its faults. Users have reported that hallucinations are common, meaning it confidently presents inaccurate or false information as though it were true.

ChatGPT on the cover of Time Magazine.

Stack Overflow banned ChatGPT-generated answers on its website in December 2022, noting the factually unreliable nature of ChatGPT's responses. Following this, the International Conference on Machine Learning banned LLM-generated content in submitted papers. 

Despite these and other issues, ChatGPT has paved the way for advancements in AI conversations, driving us closer to more intelligent and natural interactions with AI systems.

From GPT-1 to GPT-4

GPT models are based on transformer architecture, a breakthrough development made by Google in 2017. This architecture enabled the emergence of LLMs like BERT and XLNet, which were pre-trained transformers but not designed for generative-based applications.

"Pre-trained" refers to an ML model that has undergone training on a large dataset of examples.

Their primary focus was to excel at tasks like classification and language translation, but not in generating coherent or creative text.

In 2018, OpenAI published an article titled 'Improving Language Understanding by Generative Pre-Training', introducing the first iteration, GPT-1.

A representation of the GPT architecture.

GPT models are neural networks that incorporate a two-stage process. The first stage is an unsupervised generative "pre-training" stage in which initial parameters are set using a language modeling objective. The second step is supervised, discriminative fine-tuning, which adapts the parameters to a specific target task.

In simple terms, the first step involves training the model on a large dataset and then fine-tuning it in the second step using human input. This is discussed at length later in the article.

GPT-1 was unveiled in June 2018, featuring a 12-level, 12-headed transformer decoder architecture, allowing it to generate text based on patterns and structures learned during training. 

The model was trained on the BookCorpus dataset, which consists of 4.5 GB of text extracted from 11,000 ebooks of different genres scraped from the internet. With nearly 117 million parameters, GPT-1 had a significant capacity to learn and generate human-like content. 

The training process took approximately one month using eight GPUs. GPT-1 marked the initial step in the GPT series, setting the stage for further advancements and improvements in subsequent iterations.

The 12-level 12-headed transformer decoder architecture of GPT-1.

In February 2019, the next iteration, GPT-2, was introduced through a limited release. GPT-2 featured modifications to the normalization technique used in GPT-1 as well as a significantly larger model size, with 1.5 billion parameters. The model was trained on the WebText dataset, which consists of 40 GB of text from 8 million documents sourced from 45 million upvoted web pages on Reddit.

While GPT-1 took 1 petaflop/s-day to train its 117 million parameters, GPT-2 took 10 petaflop/s-days to train 1.5 billion parameters, a substantial increase in the compute used. (A petaflop/s-day is the number of computations performed in one day by a computer executing a thousand trillion, or 10^15, floating-point operations per second.)
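As a quick sanity check on these units, converting petaflop/s-days to raw operation counts is simple arithmetic:

```python
SECONDS_PER_DAY = 86_400
PETA = 1e15  # one petaflop/s = 10**15 floating-point operations per second

def pfs_days_to_flops(pfs_days):
    """Total floating-point operations in a given number of petaflop/s-days."""
    return pfs_days * PETA * SECONDS_PER_DAY

print(pfs_days_to_flops(1))   # one petaflop/s-day is about 8.64e19 operations
print(pfs_days_to_flops(10))  # the figure quoted above for GPT-2's training run
```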

GPT-2 also had more utility. It was even used by a cardiologist at Imperial College London to write a scientific paper.

Building on its predecessors, the third iteration, GPT-3, was released by OpenAI in 2020. GPT-3 is a decoder-only transformer model with attention mechanisms. In simpler terms, GPT-3 is designed to understand and generate human-like text by paying attention to the most relevant information.

However, GPT-3 also introduces modifications that enable scaling. With a massive size of 175 billion parameters, it surpasses the previous record set by GPT-2. The model was trained on an extensive dataset of 570 GB of data from CommonCrawl, WebText, English Wikipedia, and Books1 and Books2, which contain a random sampling of public domain books available online.

Lucy, the protagonist of Neil Gaiman and Dave McKean's Wolves in the Walls, which publisher Fable converted into a VR experience, can hold conversations with others owing to GPT-3.

Comprehensive training data allows GPT-3 to have a broad understanding of various domains and eliminates the need for further training on specific language tasks. However, due to the nature of the training data, which contains toxic and biased language, GPT-3 also generates toxic and biased language.

The training process for GPT-3 involved an impressive computational power of 3,640 petaflops-days, indicating a substantial increase in performance and capability compared to its predecessors.

In November 2022, OpenAI introduced a sub-class of GPT-3 models. Initially referred to as text-davinci-002 and code-davinci-002, these models were described as more capable than previous versions in the OpenAI application programming interface (API). 

OpenAI's announcement of GPT-3.

OpenAI later released text-davinci-003 and started using the term GPT-3.5 to refer to this series of models. Based on the GPT-3.5 series, OpenAI released ChatGPT, which was fine-tuned from the model. It's important to note that GPT-3.5 is distinct from the original GPT-3 models.

The GPT-3.5 architecture has not been released by OpenAI. 

The GPT-3.5 models also incorporate a technique called reinforcement learning from human feedback (RLHF). The main objective was to align more closely with the user's intentions, reduce toxicity, and prioritize truthfulness in the generated output. This is discussed at greater length later in the article.

The latest iteration in the GPT-n series is GPT-4, released in March 2023. It can accept both text and image inputs and generates text outputs. The details of the architecture and training data of GPT-4 have not been disclosed by OpenAI, citing the competitive landscape of the industry and safety concerns. However, some estimates suggest that it comprises nearly 1 trillion parameters.

GPT-4 aims to align more closely with human intent and has shown human-level performance in academic and professional settings, such as passing the bar exam with scores in the top 10% of test takers.

Example of GPT-4's ability to accept image inputs and to understand humor.

According to an OpenAI report, GPT-4 is based on the transformer architecture and is pre-trained to predict the next token in a document (a token is a fundamental unit of text that GPT models use to process and generate language). The model also undergoes a post-training alignment process to reduce bias, toxicity, and hallucinations.

OpenAI also discusses the creation of infrastructure and optimization techniques that provide predictable behavior across multiple scales, allowing them to make accurate predictions about GPT-4's performance based on models trained with significantly lower computational resources.

Currently, GPT-4 is only available to ChatGPT Plus subscribers, for a monthly fee (currently US $20).

According to recent reports, ChatGPT has recently experienced a decline in user numbers, with a decrease in website visitors and time spent on the site. OpenAI is addressing this by introducing fine-tuning and customization for GPT-4 and GPT-3.5 Turbo, enabling developers to train models for specific tasks.

GPT-3 and GPT-4 were used to choose facial expressions for Ameca, the humanoid robot by Engineered Arts.

Under the hood of ChatGPT

Now that we know about the history of ChatGPT, let's dive a little deeper into how it actually works. 

Deep learning and NLP

As previously mentioned, ChatGPT uses deep learning (DL) and NLP techniques to generate text. 

DL is a subset of ML that focuses on training artificial neural networks. Neural networks are inspired by the structure and functioning of the human brain and have multiple layers of interconnected nodes, or neurons. These layers enable neural networks to learn hierarchical representations of data, allowing them to extract meaningful features from the input and make accurate predictions or perform other tasks based on those features.

Neural networks come in various types, including feed-forward neural networks, which pass data in one direction from input to output and are commonly used for tasks like classification. Recurrent neural networks (RNNs), on the other hand, perform the same task for every element of a sequence, with the output dependent on the previous computations. Another way to think about RNNs is that they have a memory that captures information about what has been calculated so far.
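That "memory" is just a hidden state updated at every step. A minimal NumPy sketch (toy sizes and random illustrative weights, not a trained model):

```python
import numpy as np

rng = np.random.default_rng(0)
hidden_size, input_size = 4, 3
W_xh = rng.normal(size=(hidden_size, input_size)) * 0.1   # input-to-hidden weights
W_hh = rng.normal(size=(hidden_size, hidden_size)) * 0.1  # hidden-to-hidden ("memory") weights

def rnn_step(x, h):
    """One recurrent step: the new state mixes the current input with the old state."""
    return np.tanh(W_xh @ x + W_hh @ h)

h = np.zeros(hidden_size)
for x in [np.array([1.0, 0, 0]), np.array([0, 1.0, 0]), np.array([0, 0, 1.0])]:
    h = rnn_step(x, h)  # h now summarizes everything seen so far

print(h.shape)  # the fixed-size state carried from step to step
```

Because the whole sequence must be squeezed through this one state, step by step, RNNs struggle with long-range dependencies, which is a key motivation for the transformer architecture discussed below.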

NLP is a subset of AI focusing on the interaction between computers and human language. NLP involves the development of algorithms and techniques that enable computers to understand, interpret, and generate human language. 


NLP encompasses a range of tasks, including language understanding, sentiment analysis, machine translation, and text generation. As mentioned before, language models are the backbone of NLP. 

They capture statistical patterns, semantic relationships, and syntactic structures of language data, which can then be used to build systems that understand and generate human-like text. Think of the language model as the tool, and NLP tasks as the jobs it is used to perform.

Transformer architecture

Transformer architecture is a type of DL model based on a self-attention mechanism that weights the importance of each part of the input data differently. Although the transformer itself was introduced by Google in 2017, the attention mechanism at its heart dates back to 2014, when Bahdanau, Cho, and Bengio proposed using it for machine translation.

Attention mechanisms allow a model to focus on specific parts of the input sequence when generating the output sequence. This means it helps the model capture dependencies and relationships between different words or tokens in the text. 

An illustration of the encoder self-attention mechanism in the transformer architecture.

Self-attention is a type of attention mechanism used to help the model focus on different parts of the input sequence by assigning weights to each position or word in the input sequence, indicating their importance. This allows the model to capture relationships or connections between tokens that are far apart from each other, enabling the model to understand and extract meaningful information from the input. 

The models consist of an encoder and a decoder, both of which contain multiple layers of self-attention and feed-forward neural networks. The encoder is responsible for processing the input sequence, while the decoder is responsible for producing the output sequence. The transformer model enables parallel processing of the input sequence, allowing for efficient and effective modeling of long-range dependencies.
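A bare-bones version of the self-attention computation can be written in a few lines of NumPy. This is a simplified sketch (a single head, random toy weights, no masking or positional encodings), not production transformer code:

```python
import numpy as np

def self_attention(X, W_q, W_k, W_v):
    """Scaled dot-product self-attention over a sequence of token vectors X."""
    Q, K, V = X @ W_q, X @ W_k, X @ W_v
    scores = Q @ K.T / np.sqrt(K.shape[-1])          # pairwise relevance of tokens
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)   # softmax over positions
    return weights @ V  # each output is a weighted mix of all value vectors

rng = np.random.default_rng(0)
seq_len, d = 5, 8  # 5 tokens, 8-dimensional embeddings (toy sizes)
X = rng.normal(size=(seq_len, d))
W_q, W_k, W_v = (rng.normal(size=(d, d)) for _ in range(3))
out = self_attention(X, W_q, W_k, W_v)
print(out.shape)  # one contextualized vector per input token
```

Note that every token attends to every other token in one matrix multiplication, which is what makes the parallel processing and long-range modeling described above possible.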

Text generation

ChatGPT generates text using a multi-step process in which self-attention converts tokens (pieces of text, such as words or parts of words) into vectors that represent the importance of each token in the input sequence.

The first step is tokenizing the input, which can be a user input or a conversation history, and assigning numerical representations to each token. The tokenizing process breaks down a sequence of text into smaller units (tokens), such as words or characters, for analysis. 
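A toy word-level tokenizer illustrates the idea. The GPT models actually use byte-pair encoding (BPE), which splits rare words into subword pieces, but the text-in, integer-IDs-out shape is the same:

```python
# Toy word-level tokenizer for illustration only; real GPT tokenizers use BPE.
def build_vocab(texts):
    """Assign an integer ID to every distinct word in the training texts."""
    words = sorted({w for t in texts for w in t.lower().split()})
    return {w: i for i, w in enumerate(words)}

def tokenize(text, vocab, unk=-1):
    """Map each word to its ID, falling back to `unk` for unknown words."""
    return [vocab.get(w, unk) for w in text.lower().split()]

vocab = build_vocab(["the cat sat on the mat"])
print(vocab)                           # {'cat': 0, 'mat': 1, 'on': 2, 'sat': 3, 'the': 4}
print(tokenize("the cat sat", vocab))  # [4, 0, 3]
```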

An illustration of the self-attention mechanism.

The transformer encoder, which is made up of numerous layers of self-attention and feed-forward neural networks, then processes the encoded tokens into vectors that represent the importance of the token in the input sequence.

GPT actually uses a multi-head attention mechanism, a form of self-attention in which the model runs several attention operations, or "heads", in parallel, each with its own learned weights. By expanding self-attention in this way, the model is capable of grasping sub-meanings and more complex relationships within the input data.

For the next step, the transformer decoder takes over. Like the encoder, the decoder has several layers of feed-forward and self-attention networks. The decoder generates the output sequence token by token, paying attention to the encoded input and previously generated tokens. 

At every step, the model predicts a probability distribution across the vocabulary and typically selects the most likely token (or samples from the distribution) as the next output.
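That final step, turning raw scores (logits) into a probability distribution and picking a token, looks like this in miniature (the five-word vocabulary and the logits are made up for illustration):

```python
import numpy as np

def softmax(logits):
    """Turn raw scores into a probability distribution that sums to 1."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Hypothetical tiny vocabulary and made-up logits for one decoding step.
vocab = ["the", "cat", "sat", "down", "."]
logits = np.array([0.2, 3.1, 0.5, 1.0, -0.3])

probs = softmax(logits)                    # distribution over the vocabulary
next_token = vocab[int(np.argmax(probs))]  # greedy choice: the most likely token
print(next_token)  # cat
```

In practice, decoding often samples from the distribution (with settings such as temperature) rather than always taking the argmax, which is what gives the output its variety.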

A step-by-step process of how ChatGPT generates text.

ChatGPT uses reinforcement learning from human feedback (RLHF) to refine its responses and align them with user intent. A dataset is generated via supervised fine-tuning in which human AI trainers supply input prompts and desired replies.

Following this, the model is trained on this dataset, and a reward model is created by ranking different responses. The model is then fine-tuned through reinforcement learning, in which the model learns to maximize rewards while improving output.
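The reward model's training signal from those rankings is commonly formulated as a pairwise loss that pushes the preferred response's score above the rejected one's. This is a standard formulation from the RLHF literature, not necessarily OpenAI's exact implementation:

```python
import math

def ranking_loss(reward_chosen, reward_rejected):
    """Pairwise ranking loss: small when the preferred response scores higher."""
    # -log(sigmoid(chosen - rejected))
    return -math.log(1 / (1 + math.exp(-(reward_chosen - reward_rejected))))

# The reward model should score the human-preferred response higher.
print(ranking_loss(2.0, 0.0))  # low loss: preferred response already wins
print(ranking_loss(0.0, 2.0))  # high loss: model disagrees with the human ranking
```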

For reinforcement learning, OpenAI uses proximal policy optimization (PPO), a reinforcement learning algorithm that trains models iteratively. PPO strikes a balance between exploration (trying out new actions to learn) and exploitation (using learned knowledge to make better judgments).
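The heart of PPO is a clipped surrogate objective that caps how much a single update can change the policy. Here is a minimal sketch of that objective (simplified from the published PPO formulation; OpenAI's exact training setup for ChatGPT is not public):

```python
import numpy as np

def ppo_clip_objective(ratio, advantage, eps=0.2):
    """PPO's clipped surrogate objective: limit how far each update can move.

    `ratio` is new_policy_prob / old_policy_prob for an action the model took;
    `advantage` estimates how much better that action was than expected.
    """
    return np.minimum(ratio * advantage,
                      np.clip(ratio, 1 - eps, 1 + eps) * advantage)

print(ppo_clip_objective(1.0, 1.0))  # no policy change: objective is just the advantage
print(ppo_clip_objective(3.0, 1.0))  # a big policy jump gets clipped at 1 + eps
```

The clipping is what keeps training stable: even if an update looks very profitable, the objective stops rewarding moves that stray too far from the previous policy.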

Through this combination of input encoding, transformer encoding, decoding, and RLHF, ChatGPT can generate text that is contextually relevant and aligned with user intent.

ChatGPT uses reinforcement learning from human feedback to refine its responses.

Impact across industries

Even though it has only been around for a relatively short time, ChatGPT has had a significant impact on several industries, such as education, tech, communication, and the creative arts. Some industries have seen more effect than others, and here we take a look at a few case studies.

Customer service

The customer service industry has probably been impacted the most by ChatGPT. In some cases, representatives are already being replaced with ChatGPT-powered chatbots that can provide personalized and efficient interactions. The chatbots can handle a high volume of inquiries, reduce wait times, and improve customer satisfaction; they also tend to be cheaper than human agents.

A paper by Chowdhury Naem Azam discusses a case study involving a telecommunications company which deployed a ChatGPT-powered system and showed a 30% increase in customer satisfaction. This system provided personalized and accurate responses, resulting in reduced response times. Additionally, it handled a high volume of inquiries, accurately addressing over 90% of inquiries without human intervention.

ChatGPT is already being used to replace human customer service representatives.

This case study demonstrates the effectiveness of ChatGPT in automating customer support and providing personalized responses.

Content creation and marketing

Another industry that has already seen some impact from ChatGPT is content creation and marketing, where it is being used to generate blog posts, articles, and other copy. However, since ChatGPT has been known to hallucinate, it is not a reliable source of accurate information without human oversight. Most companies use it only as a supplemental tool for their content.

Marketing agencies have also tried using ChatGPT but have noticed a number of issues with the accuracy and style of the content produced. Though ChatGPT can be used for research purposes and as a helpful tool for content creation and marketing, on its own, it is not yet capable of producing high quality content. 

ChatGPT is a useful tool, but there are still issues with hallucinations and style.

In an article in California Management Review, authors Mark Esposito, Terence Tse, and Tahereh Saheb commented on the usefulness of ChatGPT, saying, "For it to be a useful business tool, it will require a lot harder work that no algorithm can take on for the moment, as we may be scratching on the surface of Artificial General Intelligence, but it feels indeed, just as a preliminary sensation."

Education

The education industry is also adapting to the use of ChatGPT. Many EdTech companies have adopted the use of ChatGPT for chatbots to help students with research and problem-solving.

Apart from chatbots, ChatGPT can be used to customize curriculum and lesson plans, and find creative ways to teach and learn. However, the tendency of ChatGPT to hallucinate has meant that progress in this area is slow. 

A team of researchers led by Ahmed Tlili from Beijing Normal University conducted a user-experience study, supported by qualitative and sentiment analysis, to reveal user perceptions of ChatGPT in education. The results showed that ChatGPT has the potential to revolutionize education in various ways. However, there were worries about cheating and manipulation.

The study emphasized the need to embrace the technology rather than ban it and to establish guidelines and policies for its responsible use. 

It also highlighted the importance of developing new teaching philosophies that balance chatbot use with human interaction. Additionally, the study called for fairness, accuracy, upskilling of competencies, and the development of more humanized chatbots for use in education. 

Healthcare

ChatGPT is still in the early stages of use in the healthcare industry but has significant potential. In the future, it could help with improvements in patient care, medical record keeping, and communication among healthcare professionals. 

By automating data entry into electronic health records, ChatGPT could remove human error from the process and save valuable time. It can also be used as a virtual assistant to schedule appointments, address common questions, and even provide patients with direct care. 

Potential uses of ChatGPT for healthcare.

In addition, ChatGPT's vast breadth of knowledge could help in diagnosing illnesses and personalizing treatments. 

For instance, a study led by Pearl Valentine Galido from the Western University of Health Sciences used ChatGPT to assess and provide a treatment plan for a patient with treatment-resistant schizophrenia. 

It successfully recognized the patient's condition, suggested a comprehensive laboratory workup, and proposed a holistic treatment plan in line with clinical standards. 

The authors did, however, stress several of its limitations for clinical use, such as its dependency on input quality, probable mistakes, lack of in-text citations, and inability to manage cases without supervision. And, of course, it is unclear how patients will respond to being treated or diagnosed by a chatbot rather than a human.


ChatGPT in banking and finance

ChatGPT and AI have tremendous potential in digital banking for improving customer experience, automating processes, and reducing fraud risks.

The North American chatbot market from 2020 to 2023 for different sectors.

Since finance also involves a great deal of risk, integrating ChatGPT into the sector has been slow, with most banks using the system for customer service chatbots and virtual assistants. The sector is also seen as one of the most exposed to potential job displacement and the need for re-skilling.

According to the National Institute of Bank Management, India, the integration of ChatGPT in banking could improve efficiency, enhance customer experience, and reduce costs.

ChatGPT has also been explored as a tool for predicting stock prices. By analyzing historical data, market trends, and other relevant factors, it can generate insights and predictions about stocks.

A study by Alejandro Lopez-Lira and Yuehua Tang from the University of Florida explored the potential of ChatGPT in predicting stock movements through sentiment analysis of news headlines. They found a positive correlation between ChatGPT sentiment scores and subsequent daily stock market returns, with the model outperforming traditional sentiment analysis methods.
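The headline-scoring pipeline behind this kind of study can be sketched in miniature. The snippet below is a toy illustration only: a simple keyword lookup stands in for the language model's sentiment judgment, and the keyword lists and example headlines are invented for the demonstration, not taken from the study.

```python
# Toy sketch of a headline-sentiment trading signal:
# score each headline, then average the scores per day.
# In real studies an LLM plays the role of score_headline.

POSITIVE = {"beats", "surges", "record", "growth", "upgrade"}
NEGATIVE = {"misses", "falls", "lawsuit", "recall", "downgrade"}

def score_headline(headline: str) -> int:
    """Return +1 (good news), -1 (bad news), or 0 (neutral/unknown)."""
    words = set(headline.lower().split())
    pos = len(words & POSITIVE)
    neg = len(words & NEGATIVE)
    if pos > neg:
        return 1
    if neg > pos:
        return -1
    return 0

def daily_signal(headlines: list[str]) -> float:
    """Average sentiment across a day's headlines, in [-1, 1]."""
    if not headlines:
        return 0.0
    return sum(score_headline(h) for h in headlines) / len(headlines)
```

In the study itself, ChatGPT was prompted to judge whether each headline was good or bad news for the company's stock, effectively playing the role of the scorer sketched here, and the aggregated scores were compared against the following day's returns.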

Navigating ethics

The rise of ChatGPT and other conversational AIs has created serious new concerns about privacy, bias, and control of the content generated. 

What happens to the data that is collected? Do people have the right to compensation if their works are scraped without permission and then used to create similar works or works in the same style? Who has ownership of the data and creative products generated by these systems? What can be done to minimize bias? These are just some of the questions that need to be addressed. 


Privacy

Addressing what happens to the data shared between a user and a ChatGPT-like system, or to data scraped from the Internet, is crucial. Users may also unintentionally disclose personal information that can be used in ways that violate privacy rights and cause harm.

This risk is more likely when users perceive the conversational agent as human-like and place a high level of trust in it. Chatbots can also use psychological effects, such as nudging or framing, to influence users to reveal more private information. If these effects are opaque, unintended, or lead to harm, they present ethical and safety concerns.

A study led by C. Ischen from the University of Amsterdam found that users who interacted with human-like chatbots shared more personal and private information than those who interacted with more machine-like chatbots. 

Privacy is one of the major issues with conversational AI.

In another study, led by E. V. D. Broeck from the University of Amsterdam, researchers found that users who perceived customer-service chatbots as helpful and valuable were more accepting of their intrusiveness, suggesting that the perceived competence of the technology may lead to greater acceptance of privacy-compromising interventions.

Another major privacy issue is the collection and use of private information for training the systems.

Millions of pages scraped from the web were used to train the generative text system. These include personal information that people have shared online, or that others have shared about them, collected with no clear legal basis, no check on its accuracy, and no notification to the people concerned.

In March 2023, Italy’s data regulator issued a temporary emergency order demanding OpenAI stop using the personal information of millions of Italians included in this training data. 

And there may be further legal troubles ahead. For example, both Europe and California have privacy rules that allow people to request that information collected be deleted, or corrected if it is inaccurate. But deleting something from an AI system as complex as ChatGPT may be very difficult, especially if the origins of the data are unclear. 


Bias

Another concern in conversational AI is bias. Since AI tools are trained on existing data, their output may reproduce underlying biases in the training data, perpetuating them further.

In research led by Terry Yue Zhuo from Monash University, ChatGPT was shown to produce bias in the content it generates. The researchers illustrated this via two case studies.

To investigate bias in language, the researchers evaluated ChatGPT's response to the question "Which country does Kunashir Island belong to?" in Japanese, Russian, and English. The island is a disputed territory, and the researchers observed that ChatGPT responded differently to requests in different languages.

Types of bias in AI systems.

Additionally, they found that the quality of ChatGPT's translation varied from language to language. 

The researchers also investigated ChatGPT's ability to generate unbiased code on its own, with results indicating that 87% of the generated code was biased. Even when asked to eliminate all biases, ChatGPT still produced biased code 31% of the time.

In another case study, ChatGPT was asked to write code to determine if a person would be a good scientist, based on their gender and race. The code ChatGPT offered defined a good scientist as one who was white and male. Now imagine this code being used to determine who gets accepted to university or who is hired for work in a lab.

Safe use

Laws and rules that ensure the safe, responsible, and ethical use of ChatGPT, and of AI in general, are needed to create an inclusive and safe society for everyone. A study led by Jianlong Zhou from the University of Technology Sydney outlines several recommendations for the responsible use of ChatGPT, noting that multiple sectors of society need to be involved to ensure its safe use.

For researchers and developers of ChatGPT, it is recommended they provide background information about privacy and bias, protect vulnerable individuals, and connect the model to domain knowledge for more relevant and accurate responses. 

Users of ChatGPT should be able to verify information, differentiate reality from fiction, understand statements, and be aware of terms and conditions. Users should also be cautious of the emotional language generated by ChatGPT.

Regulators and policymakers should balance regulation with free use while considering the risks and benefits. Additionally, they should prevent the concentration of information and communication to avoid power imbalances. They should collaborate with ethicists and lawyers to fully understand the role of ethics in ChatGPT, address these issues comprehensively, and create guidelines accordingly.

Finally, OpenAI should clarify the authorship of ChatGPT-generated content, ensure its training data is as bias-free as possible, and disclose how personal information is stored and used.

OpenAI claims to focus on the safe use of AI.

These recommendations could promote transparency, protect users, enhance trustworthiness, and improve the overall performance of ChatGPT, although, as we have seen, laws in some places, such as the EU, already require far more than this.

The road ahead

OpenAI's CEO, Sam Altman, has announced that the company doesn't plan on training GPT-5 anytime soon, leaving other companies with an opportunity to pick up the slack.

For example, Google has launched its own AI conversational model called Bard AI, similar to ChatGPT. It's still in the early stages of development, and only time will tell whether it becomes the model to beat.

The future of NLP is looking bright as more companies invest in the technology and its adoption is received positively. The NLP market is expected to grow to US$49.4 billion by 2027.

Of course, many applications for NLP models haven't been realized yet, and more will emerge as the technology advances.

Google's Bard AI is their answer to OpenAI's ChatGPT.

According to a report by MarketsandMarkets, the key factors driving the growth of NLP will include the increasing demand for cloud-based solutions, advancements in NLP techniques for sentiment analysis, and rising investments in the healthcare sector.

However, limitations in the development of NLP technology using neural networks, and increasing concerns over data security and privacy regulations pose challenges to market growth.

North America is anticipated to dominate the market due to rapid innovation and advancements in AI technologies, and somewhat looser privacy laws, at least for now. Asia is also a major player in the sector. It has witnessed rapid technological progress and is home to several emerging economies that are investing heavily in AI and NLP technologies.

Countries like China, Japan, South Korea, and India have made substantial progress in the adoption and development of NLP solutions. These countries have a large consumer base and a growing demand for AI-powered applications across various sectors, including e-commerce, healthcare, finance, and customer service.

The increasing focus on digital transformation, rising internet penetration, and the availability of vast amounts of data are driving the adoption of NLP in Asia.

The report also predicts that the banking, financial services, and insurance industries are leading the adoption of NLP. Companies in these sectors are leveraging NLP technology to enhance various aspects of their operations, such as information retrieval, intent parsing, customer service, and compliance process automation.

One thing is certain: NLP and AI are the future, and it will be exciting to see what unfolds in the next decade or so.

This article was written and edited by a human, with the assistance of Generative AI tools. Find out more about our policy on AI-powered writing here.
