
AI Concepts

The fundamental building blocks of artificial intelligence—explained clearly and understood deeply.
🔤
Tokens
The atoms of AI language
AI models don't read words—they read tokens. A token is a chunk of text that could be a word, part of a word, or even a single character. "Understanding" might be one token, while "artificial" could be split into "art" and "ificial".
Why it matters:
Tokens determine cost (APIs charge per token), context limits (models have maximum token windows), and processing speed.
Example:
"Hello world" = 2 tokens | "Artificial intelligence" = 3-4 tokens depending on the model
Fundamental · Pricing · Context Windows
🔄
Transformers
The architecture that changed everything
Before transformers, AI struggled with long-range dependencies. The transformer architecture introduced self-attention—allowing models to weigh the importance of every word relative to every other word, regardless of distance.
The breakthrough:
The 2017 paper "Attention Is All You Need" proved you could build powerful language models without recurrent networks. GPT, BERT, Claude, Gemini—all built on transformers.
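At its core, self-attention is a few lines of linear algebra. A minimal NumPy sketch of scaled dot-product attention, simplified to a single head with no learned projections (real transformers project the input into separate query, key, and value matrices and run many heads in parallel):

```python
import numpy as np

def self_attention(X):
    """Scaled dot-product self-attention over a sequence.

    X: (seq_len, d) matrix of token vectors. Here queries, keys,
    and values are all X itself, to keep the sketch minimal.
    """
    d = X.shape[-1]
    scores = X @ X.T / np.sqrt(d)  # similarity of every token to every other token
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)  # softmax over each row
    return weights @ X  # each output is a weighted mix of ALL tokens, near or far

X = np.random.randn(5, 8)        # 5 tokens, 8-dimensional
print(self_attention(X).shape)   # (5, 8)
```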
Architecture · Revolutionary · 2017
🪟
Context Window
The model's working memory
The context window is how much information an AI can "remember" at once. GPT-4 Turbo handles 128K tokens (~96,000 words); Gemini 1.5 Pro handles 2M tokens (~1.5 million words). Everything you send lives here: your prompt, the conversation history, documents.
Real-world impact:
Larger context = analyze entire codebases, process full books, maintain longer conversations without forgetting. But it costs more and runs slower.
Comparison:
• GPT-4 Turbo: 128K tokens ≈ 300 pages
• Claude Sonnet 4.5: 200K tokens ≈ 470 pages
• Gemini 1.5 Pro: 2M tokens ≈ 4,700 pages
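Because requests that exceed the window fail or get truncated, applications typically count tokens before sending. A rough sketch using tiktoken (the limit, helper name, and output reserve here are illustrative, not from any particular SDK):

```python
import tiktoken

CONTEXT_LIMIT = 128_000  # hypothetical target model's window; adjust per model
enc = tiktoken.get_encoding("cl100k_base")

def fits_in_context(prompt, history, reserve_for_output=1_000):
    """Rough check: do prompt + history leave room for the model's reply?"""
    used = len(enc.encode(prompt)) + sum(len(enc.encode(m)) for m in history)
    return used + reserve_for_output <= CONTEXT_LIMIT

print(fits_in_context("Summarize this repo.", ["...earlier messages..."]))  # True
```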
Memory · Capacity · Tradeoffs
🎯
RLHF
Teaching AI what humans actually want
Reinforcement Learning from Human Feedback (RLHF) is how we make models helpful instead of just accurate. Humans rank different responses, and the model learns to prefer what people actually want—not just what's technically correct.
Before RLHF:
Model: "Here are 10 ways to hack someone's account."
After RLHF:
Model: "I can't help with that, but I can explain cybersecurity best practices."
Training · Alignment · Safety
🌡️
Temperature
The creativity dial
Temperature controls randomness in AI outputs. Low temperature (0.0-0.3) = focused and deterministic. High temperature (0.7-1.0) = creative and varied. In effect, it's a dial between "confident" (almost always pick the likeliest next token) and "exploratory" (give less likely tokens a real chance).
Temperature 0.0:
"The capital of France is Paris."
Temperature 1.0:
"Paris, that luminous city on the Seine, serves as France's capital—a beacon of culture and history."
Parameter · Creativity · Control
🧬
Embeddings
How AI understands meaning
Embeddings transform words into numbers—specifically, high-dimensional vectors. Words with similar meanings get similar vectors. "King" - "Man" + "Woman" ≈ "Queen" isn't magic—it's vector math in embedding space.
Why it's powerful:
Semantic search, recommendation systems, and RAG (Retrieval-Augmented Generation) all depend on embeddings. They let AI find information based on meaning, not just keywords.
Real-world use:
Search for "fast car" and get results about "speedy vehicle" or "quick automobile" even though the exact words don't match.
Representation · Semantic Search · Vector Math