Understanding Language AI and LLMs: Evolution, Applications, and Ethics.
Insights from Chapter 1 of Hands-On Large Language Models.
Introduction to Language AI:
Language AI focuses on creating systems capable of understanding and generating natural language, bridging the gap between human communication and machine processing. These systems are designed to analyze text, interpret its meaning, and produce text that feels natural and relevant, making them essential for applications like virtual assistants, translation tools, and content generation. By leveraging huge amounts of data and advanced algorithms, Language AI enables machines to mimic human language with accuracy and efficiency, revolutionizing how we interact with technology.
What are Large Language Models (LLMs)?
LLMs are advanced AI systems designed to understand, analyze, and generate natural language much like humans do. These models have a wide range of applications, including sentiment analysis, chatbots, and many more. What makes them stand out is their ability to learn and process human language from vast amounts of data, which helps them produce contextually relevant and coherent responses to human questions.
Evolution of LLMs: How Did We Get Here?
Research into systems capable of processing and learning human language began in the 1950s and has evolved steadily ever since.
1. Bag of Words (BoW):
- The simplest way to represent text: a collection of words, also known as tokens, and their counts.
- It ignores grammar and context and has no understanding of the relationships between words; for example, it treats "I love AI" and "AI love I" as the same (see the sketch after this list).
2. Embedding Models (Word2Vec):
- A leap forward: a new model was introduced, known as Word2Vec.
- It maps words into a vector space, capturing their meanings and contextual relationships.
- It opened the door for language models to understand the relationships within language.
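To make the difference concrete, here is a minimal sketch (my own illustration, not code from the book) that builds bag-of-words representations with Python's standard library. Because word order is discarded, the two example sentences above collapse to the same representation.

from collections import Counter

def bag_of_words(sentence: str) -> Counter:
    """Represent a sentence as unordered token counts."""
    return Counter(sentence.lower().split())

# The two sentences differ in meaning, yet their bag-of-words
# representations are identical because counting tokens throws
# away word order and grammar.
print(bag_of_words("I love AI"))   # Counter({'i': 1, 'love': 1, 'ai': 1})
print(bag_of_words("AI love I"))   # Counter({'ai': 1, 'love': 1, 'i': 1})
print(bag_of_words("I love AI") == bag_of_words("AI love I"))  # True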
Introduction to Transformers:
Transformers significantly advanced Natural Language Processing by enabling models to grasp context and relationships within language effectively. Unlike traditional models, Transformers use an "Attention" mechanism, which helps them assess the importance of each word in a sentence, regardless of its position.
Transformer Architecture:
1. Encoder-Decoder Blocks:
- Encoders: responsible for processing the input sequence into a meaningful representation.
- Decoders: designed to use the encoders' output to predict the next token and generate the final output.
2. Self-Attention Mechanism:
- Allows the model to focus on the relevant parts of the input sequence, capturing context regardless of the distance between words.
- It weighs words by their relevance rather than their position, which is what enables contextual understanding.
3. Positional Encoding:
- Since Transformers don't process data sequentially, positional encodings are added to provide information about the position of each word in the sequence (a small sketch follows).
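As a rough illustration (my own sketch, not code from the book), the sinusoidal positional encoding from the original Transformer paper can be computed as follows; each position receives a unique pattern of sine and cosine values that is added to the token embeddings.

import torch

def sinusoidal_positional_encoding(seq_len: int, d_model: int) -> torch.Tensor:
    """Build the (seq_len, d_model) sinusoidal positional encoding matrix."""
    positions = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)  # (seq_len, 1)
    dims = torch.arange(0, d_model, 2, dtype=torch.float32)              # even dimension indices
    div_term = torch.pow(10000.0, dims / d_model)
    pe = torch.zeros(seq_len, d_model)
    pe[:, 0::2] = torch.sin(positions / div_term)  # sine on even dimensions
    pe[:, 1::2] = torch.cos(positions / div_term)  # cosine on odd dimensions
    return pe

# Each row encodes one position and is simply added to the token embeddings.
print(sinusoidal_positional_encoding(seq_len=5, d_model=8))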
Implementation of Self-Attention using PyTorch:
The following code demonstrates a simple self-attention computation (scaled dot-product attention).
import torch
import torch.nn.functional as F
from typing import Tuple

def self_attention(query: torch.Tensor, key: torch.Tensor, value: torch.Tensor) -> Tuple[torch.Tensor, torch.Tensor]:
    """
    Computes scaled dot-product self-attention, returning the output and the attention weights.
    """
    d_k = query.size(-1)
    # Scale the dot products by sqrt(d_k) so the softmax does not saturate for large embedding sizes.
    scores = torch.matmul(query, key.transpose(-2, -1)) / torch.sqrt(torch.tensor(d_k, dtype=torch.float32))
    attention_weights = F.softmax(scores, dim=-1)
    output = torch.matmul(attention_weights, value)
    return output, attention_weights

# Example usage
# Tensors are shaped (batch_size, sequence_length, embedding_dim).
query = torch.rand(1, 5, 64)
key = torch.rand(1, 5, 64)
value = torch.rand(1, 5, 64)

output, attention = self_attention(query, key, value)
print(f"Output: {output}")
print(f"Attention weights: {attention}")
Output: a tensor of shape (1, 5, 64); the exact values change on every run because the inputs are random.
Attention weights: a tensor of shape (1, 5, 5); with random inputs the weights come out nearly uniform, around 0.2 for each of the five positions.
Applications of Large Language Models (LLMs):
LLMs have impacted various industries by unlocking advanced Natural Language Processing capabilities. Here are some noteworthy applications:
- Sentiment Analysis: LLMs can assess the sentiment behind text data, helping businesses understand client opinions and improve products or services (see the short example after this list).
- Summarization: Models like GPT can condense very large volumes of text into concise summaries, aiding comprehension and decision-making.
- Translation: Applications powered by LLMs enable real-time translation, breaking down language barriers, opening global opportunities, and improving communication.
- Chatbots and Virtual Assistants: LLM-powered conversational bots and agents provide automated customer support, answer queries, and improve user engagement.
- Content Creation: LLMs help streamline workflows in marketing and media by producing text for blogs, articles, and ads. They can even help write complex reports by analyzing data.
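As a quick illustration of the first two applications, here is a minimal sketch using the Hugging Face transformers library. This is my own example, not code from the book: it assumes transformers is installed, and each pipeline downloads a default pretrained model on first use.

from transformers import pipeline

# Sentiment analysis: classify the feeling expressed in a piece of text.
sentiment = pipeline("sentiment-analysis")
print(sentiment("I really enjoy using this product; it saves me hours every week."))

# Summarization: condense a longer passage into a short summary.
summarizer = pipeline("summarization")
long_text = (
    "Large Language Models are advanced AI systems trained on vast amounts of text. "
    "They can analyze sentiment, translate between languages, answer questions, "
    "and generate coherent, contextually relevant text for many applications."
)
print(summarizer(long_text, max_length=30, min_length=10))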
What Are the Ethical Considerations in the Development of LLMs?
While Large Language Models are very powerful and offer significant benefits, there are some ethical considerations that must be addressed:
- Bias and Fairness in AI: LLMs often inherit biases from their training data, which can lead to unfair or discriminatory responses. It is crucial to address these issues and ensure fair responses for everyone.
- Addressing Hallucination Risks: Sometimes LLMs generate information that sounds correct but is completely wrong. These outputs are known as "hallucinations," and they can confuse users or mislead decisions.
- Generating Harmful Content: An LLM can be used to generate fake news, articles, or other misleading information that can defame or threaten someone, which raises ethical concerns.
- Respecting Privacy: Training these models requires huge amounts of data, often pulled from the internet. This raises questions about using people's personal data without their consent.
More insights on LLMs and their impact are coming in the next chapters. Stay tuned!