Character Counts
Character frequency analysis is one of the oldest techniques in cryptanalysis. When a substitution cipher replaces each letter with a fixed substitute, the frequency distribution of the ciphertext mirrors the distribution of the original language. In English, E is the most common letter (~12.7%), followed by T, A, O, I, and N. Identifying the highest-frequency ciphertext character and guessing it represents E is often the first step in breaking a simple cipher.
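As a rough illustration of that first step, the sketch below ranks the letters of a ciphertext and picks the most frequent one as a candidate for E. The function names `letter_frequencies` and `guess_for_e` are made up for this example, and the sample ciphertext is a simple shift-by-3 cipher chosen only because it is easy to check by hand. Real ciphertexts need far more text before the guess becomes reliable.

```python
from collections import Counter
from typing import Optional


def letter_frequencies(text: str) -> list[tuple[str, float]]:
    """Return (letter, percentage) pairs, most frequent first.

    Only the letters A-Z are counted, ignoring case.
    """
    letters = [ch for ch in text.upper() if "A" <= ch <= "Z"]
    if not letters:
        return []
    total = len(letters)
    return [(ch, 100 * n / total) for ch, n in Counter(letters).most_common()]


def guess_for_e(ciphertext: str) -> Optional[str]:
    """Return the most frequent ciphertext letter, a first guess for plaintext E."""
    freqs = letter_frequencies(ciphertext)
    return freqs[0][0] if freqs else None


# "MEET ME HERE" with every letter shifted forward by 3 (an illustrative cipher).
ciphertext = "PHHW PH KHUH"
print(guess_for_e(ciphertext))  # H
```

Here H accounts for 5 of the 10 letters, so guessing that H stands for E recovers the shift of 3.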
Beyond cryptanalysis, frequency analysis appears in data compression and in game design. Huffman coding assigns shorter binary codes to more frequent characters, which shrinks file size. Scrabble tile counts and point values are based on English letter frequency: rare letters like Q and Z score 10 points each because they appear so infrequently, while E scores only 1 point because it is so common.
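To see how character counts turn into shorter codes, here is a minimal Huffman coding sketch. It is an illustration under simple assumptions (the function name `huffman_codes` is hypothetical, and the codes are kept as strings of "0" and "1" rather than packed bits), not a production compressor.

```python
import heapq
from collections import Counter


def huffman_codes(text: str) -> dict[str, str]:
    """Build a prefix code in which frequent characters get shorter bit strings."""
    counts = Counter(text)
    if not counts:
        return {}
    if len(counts) == 1:
        # A single distinct character still needs one bit per occurrence.
        return {next(iter(counts)): "0"}

    # Heap entries are (weight, tie_breaker, {char: code_so_far}).
    # The unique tie_breaker keeps Python from ever comparing the dicts.
    heap = [(n, i, {ch: ""}) for i, (ch, n) in enumerate(counts.items())]
    heapq.heapify(heap)
    next_id = len(heap)

    while len(heap) > 1:
        # Merge the two lightest subtrees, prefixing their codes with 0 and 1.
        w1, _, left = heapq.heappop(heap)
        w2, _, right = heapq.heappop(heap)
        merged = {ch: "0" + code for ch, code in left.items()}
        merged.update({ch: "1" + code for ch, code in right.items()})
        heapq.heappush(heap, (w1 + w2, next_id, merged))
        next_id += 1

    return heap[0][2]


text = "aaaabbc"
codes = huffman_codes(text)
encoded_bits = sum(len(codes[ch]) for ch in text)
print(len(codes["a"]), len(codes["b"]), len(codes["c"]))  # 1 2 2
print(encoded_bits, "bits versus", 8 * len(text), "bits at one byte per character")
```

In this sample the most frequent character, a, gets a 1-bit code, and the whole string needs 10 bits instead of 56.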
This tool counts every character in your input, including spaces, newlines, and punctuation, and lists the results sorted by frequency with the most common first. The "Case sensitive" option controls whether uppercase and lowercase letters are counted separately, which is useful when analysing source code where capitalisation carries meaning, or when studying title-case patterns in a document.
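The counting described above can be approximated in a few lines. This is a sketch, not the tool's actual code: the name `count_characters`, folding to lowercase when the option is off, and breaking ties by code point are all assumptions made for the example.

```python
from collections import Counter


def count_characters(text: str, case_sensitive: bool = False) -> list[tuple[str, int]]:
    """Count every character, whitespace and punctuation included.

    Results are sorted by count, highest first. Ties are ordered by
    code point, which is an assumption made here for repeatable output.
    """
    if not case_sensitive:
        # Assumption: fold to lowercase so "A" and "a" share one entry.
        text = text.lower()
    counts = Counter(text)
    return sorted(counts.items(), key=lambda item: (-item[1], item[0]))


sample = "Aa b\n"
print(count_characters(sample))
# [('a', 2), ('\n', 1), (' ', 1), ('b', 1)]
print(count_characters(sample, case_sensitive=True))
# [('\n', 1), (' ', 1), ('A', 1), ('a', 1), ('b', 1)]
```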
Frequently Asked Questions
What is character frequency analysis used for?
Character frequency analysis reveals which characters appear most often in a text. It is used in cryptanalysis (breaking substitution ciphers), linguistics research (letter distribution in different languages), data compression (Huffman coding assigns shorter codes to frequent characters), and game design (Scrabble tile distribution is based on English letter frequency).
What is the most common letter in English?
In typical English text, E is the most frequent letter (~12.7%), followed by T (~9.1%), A (~8.2%), O (~7.5%), I (~7.0%), and N (~6.7%). The space character is the single most common character overall in natural prose. Character frequency varies significantly by genre: technical writing uses more digits and symbols, while poetry often has unusual letter patterns.
Does case sensitivity matter?
By default the tool counts case-insensitively — uppercase A and lowercase a are counted together. Enable the "Case sensitive" option if you need to distinguish them, for example when analysing source code where variable names are case-sensitive, or when studying title-case patterns in a text.
Why are spaces and newlines included?
Whitespace characters (spaces ␣, newlines ↵, tabs ⇥) are real characters that occupy bytes in any encoding. Including them gives a complete picture of the text's composition and is essential for cipher analysis, where whitespace can reveal word boundaries. They are displayed with visual labels so they are not confused with empty entries.
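One way to produce such labels is a small lookup table applied when the counts are displayed. The sketch below reuses the three symbols mentioned above. The names `WHITESPACE_LABELS`, `display_label`, and `format_counts` are hypothetical, and the row format is only an assumption about how results might be shown.

```python
from collections import Counter

# Visible stand-ins for characters that would otherwise look like empty entries.
WHITESPACE_LABELS = {" ": "␣", "\n": "↵", "\t": "⇥"}


def display_label(ch: str) -> str:
    """Return a visible label for whitespace, or the character itself."""
    return WHITESPACE_LABELS.get(ch, ch)


def format_counts(text: str) -> list[str]:
    """Return one 'label  count' row per distinct character, most common first."""
    return [f"{display_label(ch)}  {n}" for ch, n in Counter(text).most_common()]


for row in format_counts("a a\n"):
    print(row)
```

The counting still happens on the real characters. Only the displayed label changes, so the totals stay accurate.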
How to use
- Type or paste your text into the editor.
- The tool will instantly list every character found and how many times it appears.
- Results are sorted by frequency (most common first).
- Special characters like spaces and newlines are clearly labeled.