LLM Token Counter

Model	Provider	Count type	Tokens	Context used	Input cost	Total cost
GPT-5	OpenAI	exact	119	0.03%	$0.00015	$0.00515
GPT-5 mini	OpenAI	exact	119	0.03%	$0.00003	$0.00103
GPT-4.1	OpenAI	exact	119	0.01%	$0.00024	$0.00424
GPT-4o	OpenAI	exact	119	0.09%	$0.00030	$0.00530
GPT-4o mini	OpenAI	exact	119	0.09%	$0.00002	$0.00032
o3 / o4-mini	OpenAI	exact	119	0.06%	$0.00024	$0.00424
GPT-4 Turbo	OpenAI	exact	119	0.09%	$0.00119	$0.0162
GPT-3.5 Turbo	OpenAI	exact	119	0.73%	$0.00006	$0.00081
Claude Opus 4	Anthropic	estimate	133	0.07%	$0.00067	$0.0132
Claude Sonnet 4	Anthropic	estimate	133	0.07%	$0.00040	$0.00790
Claude Haiku 4.5	Anthropic	estimate	133	0.07%	$0.00013	$0.00263
Gemini 3 Pro	Google	estimate	119	0.01%	$0.00024	$0.00624
Gemini 3 Flash	Google	estimate	119	0.01%	$0.00006	$0.00156
Gemini 2.0 Flash	Google	estimate	119	0.01%	$0.00001	$0.00021
Llama 3.1 405B	Meta	estimate	126	0.10%	$0.00011	$0.00056
DeepSeek V3	DeepSeek	estimate	129	0.10%	$0.00003	$0.00058
Grok 4	xAI	estimate	123	0.05%	$0.00037	$0.00787
Mistral Large	Mistral	estimate	129	0.10%	$0.00026	$0.00326

What Is an LLM Token Counter?

An LLM token counter is a free online tool that tells you how many tokens a piece of text uses before you send it to a large language model. Models like GPT-5, GPT-4o, Claude, Gemini, Llama and DeepSeek don't read words; they read tokens, the small chunks their tokenizer breaks text into. Because both context limits and API billing are measured in tokens, counting them first lets you control cost, avoid truncation, and optimize your prompts with confidence.

This counter uses OpenAI's official tiktoken encodings (o200k_base and cl100k_base) for exact GPT and o-series counts, and calibrated estimates for providers that don't publish a browser tokenizer. Everything runs locally in your browser — nothing is uploaded.

How to Use the Token Counter

Add your text. Paste a prompt, open a .txt, .md or code file with Open File, or load the sample.
Pick a model. Choose from GPT-5, GPT-4o, Claude, Gemini, Llama and more — token counts update in real time.
Read the metrics. See total tokens, characters, words, tokens-per-word, and how much of the model's context window you're using.
Estimate cost. Enter your expected output tokens to get a per-request price, then compare every model side by side.
Visualize tokens. Toggle Visualize to see exactly where each token boundary falls in your text.

Why Use Our Token Counter

Exact OpenAI counts via the real tiktoken BPE encodings — not a rough character guess.
Built-in cost calculator with separate input and output pricing across 18+ models.
Side-by-side model comparison so you can pick the cheapest model that fits your prompt.
Context-window meter that warns you before you hit a limit.
100% private & offline-capable — your text never leaves the browser, with no sign-up and no limits.

Common Use Cases

Prompt engineering: trim and refine prompts to fit context windows and reduce spend.
Cost budgeting: forecast API bills before shipping a feature to production.
RAG & chunking: size document chunks so they fit a model's window with room for the answer.
Model selection: compare token efficiency and price between GPT, Claude and Gemini for the same content.
Fine-tuning & datasets: estimate training and inference token volumes for large text corpora.
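
For the RAG and chunking case, the sizing arithmetic can be sketched in a few lines of Python. This is only an illustration: the function names, the 128,000-token window, the 500-token prompt overhead, the 1,000-token answer reserve and the 800-token chunk size are all assumptions made for the example, not values taken from this tool.

```python
def chunk_budget(context_window: int, prompt_overhead: int, answer_reserve: int) -> int:
    """Tokens left for retrieved chunks once the fixed prompt text
    and the room reserved for the answer are subtracted."""
    budget = context_window - prompt_overhead - answer_reserve
    if budget <= 0:
        raise ValueError("no room left for document chunks")
    return budget


def max_chunks(context_window: int, prompt_overhead: int,
               answer_reserve: int, chunk_tokens: int) -> int:
    """How many whole chunks of `chunk_tokens` tokens fit in the budget."""
    if chunk_tokens <= 0:
        raise ValueError("chunk_tokens must be positive")
    return chunk_budget(context_window, prompt_overhead, answer_reserve) // chunk_tokens


# Hypothetical numbers: a 128,000-token window, 500 tokens of instructions,
# 1,000 tokens reserved for the answer, and 800-token chunks.
print(chunk_budget(128_000, 500, 1_000))     # 126500
print(max_chunks(128_000, 500, 1_000, 800))  # 158
```

Reserving the answer space first is the design choice that matters here: the window covers input and output together, so a prompt that fills it leaves the model no room to respond.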

Exact vs. Estimated Token Counts

For all OpenAI models we run the exact o200k_base (GPT-5, GPT-4.1, GPT-4o, o-series) and cl100k_base (GPT-4 Turbo, GPT-3.5) encodings, so results match the API to the token. Anthropic, Google, Meta, DeepSeek, xAI and Mistral do not ship public browser tokenizers, so their counts are close approximations using calibrated character-per-token ratios — typically within a few percent for English text. Counts marked EXACT are authoritative; those marked ESTIMATE are a reliable guide for budgeting.
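
A character-per-token estimate of the kind described above might look like the following Python sketch. The actual calibrated ratios used by this tool are not published on the page, so the ratio passed in below (4.0 characters per token) is a placeholder assumption, and the function name is hypothetical.

```python
import math


def estimate_tokens(text: str, chars_per_token: float) -> int:
    """Approximate a token count from character length.

    `chars_per_token` is a calibration constant for one model family.
    The value is an assumption supplied by the caller, not a figure
    published by any provider.
    """
    if chars_per_token <= 0:
        raise ValueError("chars_per_token must be positive")
    # Round up so a short, non-empty text never estimates to zero tokens.
    return math.ceil(len(text) / chars_per_token)


sample = "a" * 400
print(estimate_tokens(sample, 4.0))   # 100
print(estimate_tokens("hello", 4.0))  # 2
```

An estimate like this tracks English prose reasonably well but can drift for code, non-Latin scripts, or text with many rare words, which is why such counts are labeled as estimates rather than exact.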

Frequently Asked Questions

What is an LLM token?

A token is the basic unit a language model reads and generates. Tokens can be whole words, parts of words, punctuation, or single characters. As a rough guide, 1,000 tokens is about 750 English words, but the exact number depends on the model's tokenizer and the language you use.

How accurate is this token counter?

For OpenAI models (GPT-5, GPT-4.1, GPT-4o, o-series, GPT-4 Turbo and GPT-3.5) we use the exact tiktoken BPE encodings (o200k_base and cl100k_base), so counts match the API precisely. Anthropic, Google, Meta, DeepSeek, xAI and Mistral do not publish browser tokenizers, so those counts are close estimates based on calibrated character-to-token ratios.

Is my text sent to a server?

No. This tool runs 100% in your browser. Your prompts, documents and code never leave your device, are never uploaded, and are never logged. That makes it safe for confidential and proprietary content.

How do I estimate my API cost?

Your pasted text is treated as input tokens. Enter the number of tokens you expect the model to generate in the 'expected output' field, and the tool multiplies both by each model's per-million-token price to show a total cost. You can compare the same prompt across every model instantly.
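
The calculation can be sketched in Python as below. The prices ($1.25 input and $10.00 output per million tokens) and the 500 expected output tokens are assumptions chosen for illustration; they happen to reproduce the GPT-5 figures in the comparison table for a 119-token prompt, but the page does not state them, so check current provider pricing before relying on them.

```python
def request_cost(input_tokens: int, output_tokens: int,
                 input_price_per_m: float, output_price_per_m: float) -> tuple[float, float]:
    """Return (input_cost, total_cost) in dollars for one request.

    Prices are expressed per one million tokens.
    """
    input_cost = input_tokens * input_price_per_m / 1_000_000
    output_cost = output_tokens * output_price_per_m / 1_000_000
    return input_cost, input_cost + output_cost


# Assumed values: 119 input tokens, 500 expected output tokens,
# $1.25 / $10.00 per million input / output tokens.
input_cost, total_cost = request_cost(119, 500, 1.25, 10.00)
print(f"${input_cost:.5f}")  # $0.00015
print(f"${total_cost:.5f}")  # $0.00515
```

Because output tokens are usually priced higher than input tokens, the expected output length often dominates the total even when the prompt is short, as it does in this example.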

Why do different models report different token counts?

Each model family is trained with its own tokenizer and vocabulary, so the same text splits into a different number of tokens. Newer vocabularies (like OpenAI's o200k_base) are generally more efficient than older ones, which is why GPT-4o often uses fewer tokens than GPT-3.5 for identical text.

Does counting tokens help me avoid context window errors?

Yes. Every model has a maximum context window (input + output). This tool shows how much of each model's window your prompt consumes, so you can trim or chunk content before you hit a limit and get a truncation or error.
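
As a rough sketch of the check involved, the Python below computes the share of a window a prompt uses and whether the prompt plus the expected output still fits. The 128,000-token window is an assumption (it is consistent with the 0.09% shown for a 119-token prompt on GPT-4o in the table), and the function names are hypothetical.

```python
def context_used_percent(prompt_tokens: int, context_window: int) -> float:
    """Share of the context window consumed by the prompt, in percent."""
    if context_window <= 0:
        raise ValueError("context_window must be positive")
    return 100 * prompt_tokens / context_window


def fits(prompt_tokens: int, expected_output_tokens: int, context_window: int) -> bool:
    """True if prompt and expected output together stay within the window."""
    return prompt_tokens + expected_output_tokens <= context_window


# Assumed 128,000-token window.
print(f"{context_used_percent(119, 128_000):.2f}%")  # 0.09%
print(fits(119, 500, 128_000))                       # True
print(fits(127_900, 500, 128_000))                   # False
```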
