A plain-language entry point to the technology behind Claude, ChatGPT, Gemini, and Llama — what an LLM is, what it can do, and what it costs to use.
By the end of this lesson, students will be able to — mapped across Bloom's cognitive levels:
A large language model — LLM for short — is a computer program that has read an enormous amount of text and, as a result, has learned to predict which word is likely to come next. That single ability, repeated one word at a time, is what lets systems like Claude, ChatGPT, Gemini, and Llama answer questions, draft emails, write code, and summarize documents.
Modern LLMs are trained on hundreds of billions to trillions of words drawn from books, articles, websites, and code. The model's own internal parameters (the numbers that encode what it has learned) range from a few billion in small models to hundreds of billions in the largest.
The model works on sequences of text. You give it a prompt — a question, an instruction, a document — and it responds with more text. Many modern LLMs also accept images, audio, or files as input.
An LLM is not a search engine and does not look up facts in a database. It generates a response from learned statistical patterns — which is why it is fluent, flexible, and occasionally wrong.
| Attribute | Traditional Software | Large Language Model |
|---|---|---|
| How It Works | Follows a fixed set of rules a programmer wrote: if this, then that. | Follows statistical patterns learned from training data — rules are implicit, not hand-coded. |
| Input Format | Structured — buttons, forms, specific commands. | Natural language — plain English (or code, or another language) is the interface. |
| Output Behavior | Deterministic — the same input always produces the same output. | Probabilistic — the same prompt can yield slightly different answers each time. |
| Failure Mode | Crashes, error messages, or does nothing. | Produces a confident-sounding answer that may be wrong — a hallucination. |
How next-token prediction works. Start typing "The capital of France is —". An LLM does not "know" the answer the way a database does. It has seen that pattern of words countless times during training and calculates that Paris is the most probable next token. It picks a token, appends it, and repeats. Everything the model does — reasoning, coding, writing poetry — emerges from this one loop run billions of times across a vast learned landscape of language.
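The predict, append, repeat loop described above can be sketched with a toy model. This is a minimal illustration, not how a real LLM is built: it counts which word follows which in a tiny made-up corpus (an assumption for the example) and looks only at the last word, whereas a real model uses a neural network over the whole preceding context. The names `train_bigrams` and `generate` are hypothetical.

```python
from collections import Counter, defaultdict

# Tiny made-up training text (an assumption for illustration only).
CORPUS = (
    "the capital of france is paris . "
    "the capital of italy is rome . "
    "the capital of france is paris ."
)

def train_bigrams(text):
    """Count how often each word follows each other word."""
    words = text.split()
    counts = defaultdict(Counter)
    for current, following in zip(words, words[1:]):
        counts[current][following] += 1
    return counts

def generate(counts, prompt, max_new_tokens=5):
    """Repeatedly append the most frequent follower of the last word."""
    tokens = prompt.split()
    for _ in range(max_new_tokens):
        followers = counts.get(tokens[-1])
        if not followers:
            break  # nothing ever followed this word in training
        next_token = followers.most_common(1)[0][0]
        tokens.append(next_token)
        if next_token == ".":
            break  # treat the full stop as the end of the answer
    return " ".join(tokens)

counts = train_bigrams(CORPUS)
print(generate(counts, "the capital of france is"))
# prints: the capital of france is paris .
```

In this corpus "paris" follows "is" twice and "rome" only once, so the toy model picks "paris" because it is the most frequent continuation, not because it looked anything up.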
LLMs are general-purpose tools. The same underlying model can shift between roles depending on the prompt. Below are six categories of tasks where LLMs are already in daily use across workplaces and classrooms.
Drafts, rewrites, tone shifts, outlines, summaries. The model excels at producing a competent first draft quickly, which a human then refines.
Example: Draft a polite follow-up email to a client who hasn't replied in two weeks.
Generating functions, explaining unfamiliar code, spotting bugs, translating between programming languages, and writing test cases.
Example: Convert this Python script to JavaScript and add error handling.
Condensing long documents, extracting key points from meetings, comparing sources, and turning unstructured notes into structured tables.
Example: Summarize this 40-page report into ten bullet points for an executive briefing.
Explaining concepts at different levels, generating practice problems, walking through solutions step by step, and role-playing Socratic dialogue.
Example: Explain subnetting to me as if I've never seen an IP address before.
Brainstorming angles, surfacing counterarguments, synthesizing across sources, and translating specialist language for a general audience.
Example: What are the main criticisms of NIST SP 800-171, and who makes them?
Answering routine questions, classifying tickets, drafting personalized replies, and handing off to a human when the conversation exceeds its scope.
Example: A first-line chatbot that triages IT help-desk tickets before a technician reviews them.
Current frontier LLMs — the generation that includes Claude, GPT, Gemini, and Llama — share a common feature set. Capability does not mean infallibility: each strength has a matching limit worth knowing before you rely on the tool.
Works through problems that require several linked steps — math, planning, code logic — especially when asked to think it through before answering.
Condenses long text, changes tone, adjusts reading level, and rewrites for different audiences without losing the source's meaning.
Translates between dozens of human languages and between programming languages, preserving nuance when the input is clear.
Writes, explains, and refactors code in most modern programming languages, and can produce tests, documentation, and commit messages alongside it.
Modern LLMs can call external tools — search the web, run code, query a database, send an email — as part of answering a single prompt.
Accepts images, PDFs, spreadsheets, and in some products audio or video — allowing questions like "What's wrong with this circuit diagram?"
Four terms explain most of what you'll encounter when you start using LLMs seriously: token, context window, prompt vs. completion, and temperature. Each one is simple once you see it in action.
A token is a chunk of text — roughly three-quarters of a word in English. Short common words are one token; longer or unusual words split into several. Tokens are the unit the model reads, writes, and — importantly — is billed for.
The context window is the maximum number of tokens the model can consider at once: your prompt, any attached documents, and the model's own reply. Current frontier models offer windows of 200,000 tokens or more, roughly a 500-page book.
The prompt is everything you send in: instructions, context, documents. The completion is the model's response. Input tokens and output tokens are usually priced separately, with output costing more.
Temperature is a setting, typically between 0 and 1 (some APIs allow values up to 2), that controls how predictable the model's output is. A low temperature (near 0) makes the model pick the most likely next token almost every time, which suits factual, consistent work. A higher temperature makes the model sample from less-likely options more often, which suits brainstorming or creative writing. Most chat interfaces hide this control; API users adjust it directly.
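One common way to implement temperature is to divide the model's raw scores by the temperature before converting them into probabilities. The sketch below assumes that approach; the scores in `SCORES` are invented for illustration, and `softmax_with_temperature` and `sample` are hypothetical helper names rather than any provider's API.

```python
import math
import random

def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities; a lower temperature sharpens them."""
    if temperature <= 0:
        raise ValueError("temperature must be positive in this sketch")
    scaled = [s / temperature for s in scores.values()]
    top = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scaled]
    total = sum(exps)
    return {tok: e / total for tok, e in zip(scores, exps)}

def sample(scores, temperature, rng):
    """Draw one token at random according to the temperature-adjusted odds."""
    probs = softmax_with_temperature(scores, temperature)
    return rng.choices(list(probs), weights=list(probs.values()), k=1)[0]

# Hypothetical raw scores for the token after "The capital of France is".
SCORES = {"Paris": 5.0, "Lyon": 2.0, "beautiful": 1.5}

low = softmax_with_temperature(SCORES, 0.2)
high = softmax_with_temperature(SCORES, 1.0)
print(f"P(Paris) at 0.2: {low['Paris']:.3f}, at 1.0: {high['Paris']:.3f}")
print(sample(SCORES, 1.0, random.Random()))
```

At the lower setting nearly all of the probability lands on the top-scoring token, so repeated runs give the same answer; at the higher setting the alternatives keep a real chance of being picked.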
Type or paste text below. The rough token breakdown appears as you type. This uses a simple approximation — real tokenizers vary slightly by model — but it's accurate enough to build intuition.
Rough rule of thumb: 1 token ≈ 4 characters or ¾ of a word in English. Rare words, numbers, punctuation, and non-English text tend to produce more tokens per character.
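The rule of thumb above can be turned into a few lines of code. This is a rough sketch that averages the two heuristics (4 characters per token and ¾ of a word per token); it is not a real tokenizer, and `estimate_tokens` is a hypothetical name chosen for this example.

```python
import math

def estimate_tokens(text):
    """Rough token estimate from the 4-characters and 3/4-word rules of thumb."""
    if not text.strip():
        return 0
    by_chars = len(text) / 4             # 1 token is about 4 characters
    by_words = len(text.split()) / 0.75  # 1 token is about 3/4 of a word
    # Average the two heuristics and round up; real tokenizers will differ.
    return math.ceil((by_chars + by_words) / 2)

print(estimate_tokens("What is a subnet?"))  # prints 5
```

For exact counts, use the tokenizer or token-counting tool published by the model's provider, since each model splits text differently.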
The same underlying model is usually offered in three different forms — a free tier, a paid subscription, and direct API access. Which one fits best depends on how much you use it, whether you're building software, and how sensitive your data is.
| Tier | Free Chat | Paid Subscription | API Access |
|---|---|---|---|
| Who It's For | Casual users trying it out or using it lightly. | Regular users who hit free-tier limits or want the best model. | Developers building apps, automations, or integrations. |
| Interface | Web or mobile chat window. | Same chat window, with higher limits and extra features. | Programmatic — requests sent from your own code. |
| Typical Cost | $0 — usage-capped per day or message. | Around $20/month for individuals; enterprise plans available. | Metered — priced per million input and output tokens. |
| Rate Limits | Strict caps; may pause during peak traffic. | Generous caps per hour or day; priority access to newest models. | Tiered — raise your limits as usage and trust grow. |
| Data Handling | Check the provider's policy — some train on free-tier chats. | Usually not used for training by default. | Business-grade controls; zero-retention options available. |
Why cost scales with usage. API pricing is per token, and both your prompt and the model's reply count. A short question ("What is a subnet?") might cost a fraction of a cent. Feeding in a 100-page PDF for summarization can cost several cents per run — and running that summarization on 10,000 PDFs becomes a real line item. The token counter above is a first, practical tool for estimating cost before you hit Send.
Why output tokens cost more than input. The model can read your whole prompt in a single pass, which is comparatively cheap; generating the reply is slower because the model must run its full calculation again for each new token. Most providers price output at roughly 3–5× the input rate, which is why a concise prompt that asks for a concise answer is usually the cheapest combination.
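The arithmetic behind these two paragraphs can be sketched as follows. The prices are hypothetical placeholders, not any provider's real rates, and the token counts (a 100-page PDF taken as about 50,000 tokens) are assumptions chosen to match the examples above; `estimate_cost` is a name invented for this sketch.

```python
# Hypothetical prices in US dollars per million tokens (not real rates).
INPUT_PRICE_PER_MILLION = 1.00
OUTPUT_PRICE_PER_MILLION = 4.00  # 4x input, inside the 3-5x range

def estimate_cost(input_tokens, output_tokens, runs=1):
    """Estimated cost in dollars for a number of identical requests."""
    per_run = (
        input_tokens / 1_000_000 * INPUT_PRICE_PER_MILLION
        + output_tokens / 1_000_000 * OUTPUT_PRICE_PER_MILLION
    )
    return per_run * runs

# Assumed sizes: a short question, and a long PDF of about 50,000 tokens.
short_question = estimate_cost(input_tokens=10, output_tokens=150)
one_pdf = estimate_cost(input_tokens=50_000, output_tokens=500)
batch = estimate_cost(input_tokens=50_000, output_tokens=500, runs=10_000)

print(f"short question: ${short_question:.4f}")
print(f"one PDF summary: ${one_pdf:.3f}")
print(f"10,000 PDF summaries: ${batch:,.2f}")
```

Under these assumed numbers the short question costs a fraction of a cent, one summary costs about five cents, and the batch of 10,000 costs about $520. Substitute your provider's current published rates before relying on any estimate.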