📑Primer: AI-ML-DL-NN-LLM

An abbreviated sketch of key terms and relationships in AI and its related components.

Before we can discuss how Lumerin can work with Morpheus to enable decentralized, web3-based AI routing, we need to start with an overview of the concepts, technologies, and terms. This is not meant to be exhaustive or authoritative, but a high-level sketch to get most folks started.

Artificial Intelligence (AI)

  • Artificial Intelligence (AI) represents the broadest field dedicated to creating machines capable of performing tasks that typically require human intelligence.

  • Attempts to train computers to think and learn as humans do, using interconnected artificial "neurons" and deep learning algorithms to solve complex problems. Common applications include:

    • Computer Vision - extract information and insights from images and videos

    • Speech Recognition - analyze human speech across a wide variety of variables (accent, pitch, tone, language, etc.)

    • Natural Language Processing - use deep learning to gather insights and meaning from text-based data

    • Recommendation engines - track user activity to deliver personalized recommendations

    • Generative AI - generate text, images, or videos based on input

Machine Learning (ML)

  • Within AI, Machine Learning (ML) emerges as a subset that focuses on algorithms and statistical models that enable computers to perform specific tasks without explicit instructions, relying instead on patterns and inference.

    • Supervised Learning - a known, labeled ("categorized") data set serves as a starter, with human-based correction

    • Deep Learning (DL) is a specialized subset of ML that employs neural networks with many layers (deep neural networks) to analyze large amounts of data, making it particularly effective for tasks like image and speech recognition.

      • The challenge is that deep learning requires large quantities (petabytes) of high-quality data and substantial compute resources to process that data, and models can have millions of parameters.

      • Strengths include:

        • Efficient processing of unstructured data

        • Pattern discovery via unsupervised learning

        • Volatile data processing

  • The Google AI team provides a great set of YouTube videos; here is just one:

    • Seven Steps of ML - Google AI Team

      • Model - a question-answering system (eg: is a given drink wine or beer?)

      • Training - teach the model to answer the question accurately most of the time

      • Features - the inputs used for prediction; here, color (wavelength) and alcohol percentage

      • Learning Process - Gather & Prepare Data, Choose Model & Train

      • Evaluation - once trained, use the set-aside initial data to validate the model

      • Hyperparameter Tuning - test assumptions and try other values

      • Prediction/Inference - use new data to answer the question: given a color and alcohol percentage as input, the model determines wine vs. beer
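The seven steps above can be sketched end-to-end in a few lines of Python. The data values and the nearest-centroid "model" below are invented for illustration; a real project would use far more data and a proper ML library.

```python
# Toy walk-through of the seven steps for the wine-vs-beer question.
# All numbers are made up for illustration.

# 1-2. Gather and prepare data: (color wavelength nm, alcohol %, label),
#      holding the last two samples out for evaluation.
samples = [
    (510, 5.0, "beer"), (640, 12.0, "wine"),
    (520, 4.5, "beer"), (650, 13.5, "wine"),
    (530, 5.5, "beer"), (660, 12.5, "wine"),
]
train, test = samples[:4], samples[4:]

# 3-4. Choose a model and train it: here the "model" is simply the mean
#      feature vector (centroid) of each class.
def train_model(data):
    centroids = {}
    for label in {"beer", "wine"}:
        rows = [(w, a) for w, a, lbl in data if lbl == label]
        centroids[label] = (
            sum(w for w, _ in rows) / len(rows),
            sum(a for _, a in rows) / len(rows),
        )
    return centroids

# 7. Prediction/inference: classify a new drink by its closest centroid.
def predict(centroids, wavelength, alcohol):
    def dist(label):
        cw, ca = centroids[label]
        return (wavelength - cw) ** 2 + (alcohol - ca) ** 2
    return min(centroids, key=dist)

model = train_model(train)

# 5. Evaluation: score the model on the held-out samples.
#    (Step 6, hyperparameter tuning, would mean revisiting choices such as
#    feature scaling, since wavelength dominates the distance here.)
accuracy = sum(predict(model, w, a) == lbl for w, a, lbl in test) / len(test)
print(accuracy)                    # 1.0 on this toy split
print(predict(model, 655, 13.0))   # a red, strong drink -> "wine"
```

This is supervised learning in miniature: the labels in the training data are the "human-based correction" that shapes the model.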

Neural Networks (NN)

  • Neural Networks (NN), fundamental to deep learning, are architectures inspired by the human brain's structure and function, designed to recognize patterns through a complex system of nodes and connections.

    • Nodes - artificial "neurons" that use mathematical calculations to process data

    • Base Components

      • Input Layer - nodes that ingest data, then process, categorize, and pass it to further layers of the NN

      • Hidden Layer(s) - interconnected nodes in one or more layers that analyze the input, sometimes from several different angles (eg: in animal categorization, multiple hidden layers can be trained to recognize different features: number of legs, fur/no fur, eye shape, etc.)

      • Output Layer - nodes that output the result (eg: a "yes" or "no" answer requires only two nodes in the output layer)

    • Types of neural networks

      • Feedforward - process data in one direction

      • Back-propagation - strictly a training algorithm rather than a network type: reinforces the path from input nodes to output nodes using guess-and-validate feedback loops

      • Convolutional - perform specific mathematical functions like summarizing or filtering (eg: image attribute classification)

    • Deep neural networks (deep learning networks) use weighted relationships between nodes and corrective feedback loops to make better predictions or inferences; they must be trained with various machine learning techniques
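As a concrete sketch of these base components, the toy network below wires an input layer, one hidden layer, and an output layer together with hand-picked weights so that it computes XOR. Real networks learn their weights through training (eg: back-propagation); the step activation and the specific weight values here are illustrative assumptions, not the output of any training process.

```python
# A minimal feedforward network: two inputs, two hidden neurons, one output.
# Weights and biases are hand-picked so the network computes XOR.

def step(x):
    # Activation function: each artificial "neuron" fires (1) or not (0).
    return 1 if x > 0 else 0

def forward(x1, x2):
    # Hidden layer: each neuron is a weighted sum of the inputs plus a bias.
    h1 = step(x1 + x2 - 0.5)    # fires if x1 OR x2
    h2 = step(x1 + x2 - 1.5)    # fires only if x1 AND x2
    # Output layer: one neuron combining the hidden activations.
    return step(h1 - h2 - 0.5)  # OR but not AND -> XOR

for a, b in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(a, b, "->", forward(a, b))   # 0, 1, 1, 0
```

Data flows in one direction here (feedforward); training would run the reverse pass, adjusting the weights whenever a guess is wrong.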

Large Language Models (LLM)

  • Large Language Models (LLMs) are a specific application of neural networks, designed to understand, interpret, and generate human language, exemplifying the advancements in deep learning and AI's capability to process and produce natural language content.

  • Very large deep learning models, pre-trained on vast amounts of data and relying on a set of neural networks

    • Encoder/Decoder - extract meaning from a sequence of text and understand the relationships between words and phrases

    • Self-attention - lets the model weigh the relevance of every other word in a sequence when interpreting each word

    • Requires little domain training

      • Few-shot or zero-shot - the model can perform new tasks with only a few examples, or none at all
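Self-attention, the mechanism at the heart of these models, can be sketched in plain Python: each token scores its query against every key, turns the scores into weights with softmax, and takes a weighted average of the values. The 2-d vectors below are invented for illustration; real models use learned query/key/value projections over hundreds of dimensions.

```python
import math

def softmax(xs):
    # Turn raw similarity scores into weights that sum to 1.
    exps = [math.exp(x) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def self_attention(queries, keys, values):
    d = len(keys[0])
    outputs = []
    for q in queries:
        # Score this query against every key (dot product), scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)
        # Output: attention-weighted average of the value vectors.
        outputs.append([sum(w * v[i] for w, v in zip(weights, values))
                        for i in range(len(values[0]))])
    return outputs

# Three "tokens", each a toy 2-d vector (used as query, key, and value here).
tokens = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
out = self_attention(tokens, tokens, tokens)
```

Because every token attends to every other token, the output for each word carries context from the whole sequence, which is what lets transformers capture relationships between words and phrases.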


{Asked GitBook AI to Generate a paragraph on LLM:} Large Language Models (LLMs), such as GPT (Generative Pre-trained Transformer) and BERT (Bidirectional Encoder Representations from Transformers), represent the cutting edge of AI's ability to process and generate human-like text. These models are built upon deep neural network architectures, particularly transformers, that have been pre-trained on extensive datasets covering a vast swath of human knowledge. This pre-training enables LLMs to understand context, generate coherent and contextually relevant text, and perform tasks like translation, summarization, and question-answering with minimal additional training. Their versatility and capability have made them invaluable tools in developing applications that require a deep understanding of language, from automated customer service bots to sophisticated content creation tools.
