My GenAI Thesaurus

Last updated: August 12, 2024.

By now everyone has heard or read about GenAI, ChatGPT and other AI things. There is a lot of terminology, abbreviations and other clever words, and I found it hard to remember all of them. So I decided to write down my definitions for each of these terms.

TL;DR: If you are familiar with AI terminology, you might want to stop reading this post here. For the rest of the readers, welcome to my personalized thesaurus of GenAI terminology.

This post will be updated once in a while, when I learn about a new term or when I find out that my understanding of a known term needs an update. The “Last updated” line at the top of the post tracks the latest update. If you come to this post more than once, look for 🆕 to find terms that were added or changed in the last update.

Formatting

For my own accountability, I will do my best to describe each term according to a publicly available “official” definition and then add my own clarification when, in my opinion, the definition is insufficient or ambiguous. All terms are in alphabetical order.

Thesaurus

AI Agent

In essence, an application that uses GenAI to provide an interactive experience to a user. Some academic courses and publications distinguish between AI agents and other types of applications that use AI. For example, they may differentiate between AI agents, chat assistants, and data generation and/or augmentation applications. So, when you see “AI agent”, look at the context to understand what the term means.

Chain-of-Thought

A method of creating prompts that mirrors the way people reason. Chain-of-thought prompting helps the model work through complex problems via intermediate reasoning steps. A classic example of chain-of-thought prompting is the following prompt:

I went to the market and bought 10 apples.
I gave 2 apples to the neighbor and 2 to the repairman.
I then went and bought 5 more apples and ate 1.
How many apples did I remain with?
Let's think step by step.

The last instruction (“Let’s think step by step”) triggers the model to reason in steps instead of trying to answer the whole prompt at once, which makes the right answer (10 apples) more likely.
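To make the example above concrete, here is a minimal Python sketch that assembles the prompt and works out by hand the intermediate steps a model is expected to spell out. No model is called, and the helper name `build_cot_prompt` is my own, for illustration only.

```python
def build_cot_prompt(question: str) -> str:
    """Append the step-by-step trigger phrase to a question."""
    return question.rstrip() + "\nLet's think step by step."


question = (
    "I went to the market and bought 10 apples.\n"
    "I gave 2 apples to the neighbor and 2 to the repairman.\n"
    "I then went and bought 5 more apples and ate 1.\n"
    "How many apples did I remain with?"
)
prompt = build_cot_prompt(question)

# The intermediate reasoning steps, done by hand:
apples = 10      # bought 10 apples
apples -= 2 + 2  # gave 2 to the neighbor and 2 to the repairman
apples += 5      # bought 5 more
apples -= 1      # ate 1

print(prompt)
print("Expected answer:", apples)
```

Without the trigger phrase, a model may jump straight to a final number and skip one of these steps.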

Do not confuse it with “prompt chaining”, which is a sequence of prompts where each next prompt adds explanations or instructions in response to the previous output of the model. A chat with an LLM is an example of prompt chaining.
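Here is a rough sketch of prompt chaining in Python. It assumes nothing about a real LLM API: `fake_model` is a stand-in I made up so the example runs on its own, and `chain_prompts` is a hypothetical helper. The point is only that each new prompt is sent together with the conversation so far.

```python
from typing import Callable, List


def fake_model(prompt: str) -> str:
    """Stand-in for a real LLM call (an assumption for this sketch)."""
    return f"(model reply to {len(prompt)} characters of input)"


def chain_prompts(model: Callable[[str], str], prompts: List[str]) -> List[str]:
    """Send prompts one by one, carrying the conversation so far."""
    history = ""
    replies = []
    for text in prompts:
        history += f"User: {text}\n"
        reply = model(history)  # the model sees all previous turns
        history += f"Model: {reply}\n"
        replies.append(reply)
    return replies


replies = chain_prompts(
    fake_model,
    ["Summarize RAG in one line.", "Now make it simpler."],
)
for reply in replies:
    print(reply)
```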

Embedding

An ordered collection of coefficients, stored as floating-point numbers, that represents an item. The collection is a vector pointing to a position in a d-dimensional space, where d is the number of dimensions; the size of the collection equals d. The item can be a token (see Model training) or other data (e.g. an image).
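A small Python sketch may help picture this. The three 4-dimensional vectors below are made up by me for illustration; real embeddings come from a trained model and usually have hundreds or thousands of dimensions. Cosine similarity is one common way to compare two embeddings, though not the only one.

```python
import math
from typing import List


def cosine_similarity(a: List[float], b: List[float]) -> float:
    """Cosine of the angle between two vectors of equal size."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same number of dimensions")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Made-up embeddings with d = 4 (illustrative values only).
embeddings = {
    "cat": [0.9, 0.1, 0.3, 0.0],
    "kitten": [0.8, 0.2, 0.4, 0.0],
    "car": [0.0, 0.9, 0.0, 0.5],
}

print(cosine_similarity(embeddings["cat"], embeddings["kitten"]))
print(cosine_similarity(embeddings["cat"], embeddings["car"]))
```

With these values, “cat” lands much closer to “kitten” than to “car”, which is the kind of relationship real embeddings are meant to capture.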

Generative model

A type of machine learning model that can create novel outputs based on its training data. At its simplest, the model generates new data that looks like the set of categories it was trained on. LLMs are one type of generative model. Other types of models can be generative as well.

Generative AI or GenAI

Generative AI describes AI applications capable of creating and running inference on text, images, videos and audio. GenAI solutions are typically based on generative models. The term is often used for both the applications and the models themselves (Definition in Wiki).

This is a generic term, much like AI or ML. It covers the area of products that are based on LLMs.

LangChain

An open source framework for building applications that use LLMs by chaining interoperable components. Today the name has become a synonym for the methodology of creating GenAI applications in general, not necessarily those that use this particular framework. The method usually means using multiple requests to one or more LLMs together with calls to vector databases or other data sources.

Large Language Model or LLM

A computational model (often described as a neural network) with many parameters that can perform a wide variety of general-purpose language tasks, such as generating, classifying, or summarizing text (Definition in wiki). LLM is also used as a generic term for different types of large models. I guess this is done for simplicity, instead of writing out a list of LVM (Large Video Model), LAM (Large Audio Model), etc. A model is defined by the type of data that was used to build it. For example, a Large Video Model is built using video recordings as input. There are large models that use more than one data type, or modality, as input. These models are usually referred to as multimodal. For example, Google released multimodalembedding as one of its multimodal models. All these models are usually called LLMs in posts, papers and documentation. You can deduce the type of the model from the input the application accepts and the output it produces. If it accepts multiple modalities, or accepts text as input but returns an image, it is a multimodal model.

Practically speaking, LLM is used to reference either the model itself or the software around it. The model itself can be pictured as a large collection of numbers (weights and embeddings) generated after multiple iterations of training. It should not be confused with a vector database, which is a separate store of embeddings that an application can query alongside the model. The software includes the model and the program that accepts requests (a.k.a. prompts) and generates new content (a.k.a. inference output). NOTE: this is a very naive description of the process.

Machine learning model

Very generally, a construct that can find patterns in, or make decisions from, previously unseen data.

Model training

The process of feeding input data (a.k.a. the training dataset) to an algorithm, which results in the creation of a machine learning model. The training can have multiple iterations, where the results of the previous iteration are improved by adjusting different parameters of the algorithm. A simplified description of the algorithm includes 3 steps:

  • Tokenizing – breaking data into smaller chunks that can be processed further (a.k.a. “tokens”)
  • Encoding – converting the tokens to a numerical representation
  • Evaluation – assigning the encoded tokens coefficients that reflect the relevance of each token to other tokens (a.k.a. “weights”)
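The first two steps can be sketched in a few lines of Python. This is a toy version under loud assumptions: it splits on whitespace and numbers tokens in order of first appearance, while real tokenizers work on sub-word pieces with a fixed vocabulary. The third step is left out because it needs actual training. All function names here are mine.

```python
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Toy tokenizer: lowercase the text and split on whitespace."""
    return text.lower().split()


def build_vocabulary(tokens: List[str]) -> Dict[str, int]:
    """Give every distinct token an integer id, in order of first appearance."""
    vocab: Dict[str, int] = {}
    for token in tokens:
        if token not in vocab:
            vocab[token] = len(vocab)
    return vocab


def encode(tokens: List[str], vocab: Dict[str, int]) -> List[int]:
    """Replace each token with its numeric id."""
    return [vocab[token] for token in tokens]


tokens = tokenize("The cat sat on the mat")
vocab = build_vocabulary(tokens)
ids = encode(tokens, vocab)

print(tokens)  # ['the', 'cat', 'sat', 'on', 'the', 'mat']
print(ids)     # [0, 1, 2, 3, 0, 4]
```

Note that both occurrences of “the” get the same id, which is what lets the later steps treat them as the same token.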

Prompt

A request in a format that an LLM or other generative model can understand. Usually the request is a text instruction describing the output that is expected from the model. The request can also be video or audio, or a mix of multiple types of data, i.e. a multimodal request. Multimodal requests can be sent only to multimodal models.

Retrieval-augmented generation or RAG

A technique for enhancing the accuracy and reliability of generative models with facts fetched from external sources. Practically, RAG means that the application pre-processes the request to the LLM by gathering additional information and enhancing or improving (a.k.a. augmenting) the prompt that will be sent to the model.
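Below is a minimal Python sketch of the pre-processing step, under simplifying assumptions. The “external source” is a hard-coded list, and retrieval ranks documents by the number of words they share with the question; real applications typically search a vector database using embeddings. The names `retrieve` and `augment_prompt` are hypothetical, and no model is called.

```python
import re
from typing import List

# Stand-in for an external knowledge source (illustrative content).
DOCUMENTS = [
    "LangChain is an open source framework for building LLM applications.",
    "An embedding is a vector of floating-point numbers.",
    "RAG augments a prompt with facts fetched from external sources.",
]


def words(text: str) -> set:
    """Lowercase a text and return its set of alphanumeric words."""
    return set(re.findall(r"[a-z0-9]+", text.lower()))


def retrieve(question: str, documents: List[str], top_k: int = 1) -> List[str]:
    """Return the top_k documents sharing the most words with the question."""
    q_words = words(question)
    ranked = sorted(documents, key=lambda d: len(q_words & words(d)), reverse=True)
    return ranked[:top_k]


def augment_prompt(question: str, facts: List[str]) -> str:
    """Build the prompt that would be sent to the model."""
    context = "\n".join(f"- {fact}" for fact in facts)
    return f"Use the facts below to answer.\nFacts:\n{context}\nQuestion: {question}"


question = "What is an embedding?"
facts = retrieve(question, DOCUMENTS)
print(augment_prompt(question, facts))
```

The model then receives the question together with the retrieved facts, instead of the bare question.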