The problem with playing Japanese games as a learner
There are hundreds of Japanese games that never received an English localization — older JRPGs, niche visual novels, doujin titles, indie games, and console exclusives that are simply not available in any other language. If you want to experience them, you need to read Japanese.
For learners at an intermediate level, this is often achievable but painful in practice. You recognize a common word like 行く but draw a blank on 彷徨う. The standard solution is to alt-tab into a dictionary, look up the word, return to the game, re-read the sentence, and continue. Then do it again for the next unfamiliar word. And the next.
At ten lookups per dialogue box, you're spending more time in a dictionary than in the game. Your immersion collapses. The game stops feeling like a game and starts feeling like homework.
This is the exact problem Japanese OCR overlays solve.
What an OCR overlay actually does
How screen OCR reads game text
OCR stands for Optical Character Recognition. It's the technology that converts images of text — pixels on a screen — into actual readable characters. Your phone uses it to read receipts. Banks use it to process cheques. For Japanese games, it means capturing a screenshot of the game window and extracting the Japanese characters from it.
A Japanese OCR tool for games does this continuously and automatically. It takes a screenshot of the dialogue box (or the whole window), runs it through an OCR model trained on Japanese text, and returns the extracted characters — typically within 200–500 milliseconds.
Why a game overlay is different from just taking a screenshot
The critical insight is what happens with the extracted text. A basic OCR tool gives you a text file. A game overlay takes that extracted text and renders it directly on top of your game window as a transparent, interactive layer.
This means the Japanese text appears floating above the game, positioned precisely where it appears on screen. And because it's rendered as actual HTML text (not an image), your cursor can interact with it. Hover a word. Get a definition. Move on.
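As a rough illustration of that idea, the sketch below turns OCR results into absolutely positioned HTML spans. The `OcrBox` type, its fields, and `render_overlay` are hypothetical names for this example; YomiNinja's actual rendering code is not shown in this article and may work differently.

```python
from dataclasses import dataclass
from html import escape


@dataclass
class OcrBox:
    """One recognized line of text and where it sat in the capture (pixels)."""
    text: str
    left: int
    top: int
    height: int


def render_overlay(boxes: list[OcrBox]) -> str:
    """Build an HTML fragment that places each line over its source position."""
    spans = []
    for box in boxes:
        style = (
            f"position:absolute;left:{box.left}px;top:{box.top}px;"
            f"font-size:{box.height}px"
        )
        # Real text nodes (not images) are what a popup dictionary can scan.
        spans.append(f'<span style="{style}">{escape(box.text)}</span>')
    return "\n".join(spans)


html_fragment = render_overlay([OcrBox("森を彷徨う", left=120, top=640, height=28)])
print(html_fragment)
```

The text is escaped before it is embedded, so stray `<` or `&` characters from a misread cannot break the overlay markup.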
The role of Yomitan in the immersion workflow
Yomitan (the successor to Yomichan) is a browser extension that shows a popup dictionary when you hover over Japanese text. It reads from dictionary files you import — JMdict, pitch accent databases, frequency lists — and shows readings, definitions, and pitch patterns in a compact popup.
In a browser, Yomitan works automatically on any Japanese text on a webpage. The challenge has always been: games are not webpages. Yomitan can't see the text inside a game window.
A Japanese OCR overlay bridges this gap. It extracts the text from the game and renders it in a context where Yomitan can reach it. The result: hover any kanji in any game, get an instant Yomitan popup — exactly as if you were reading a Japanese website.
YomiNinja: the free open-source solution
YomiNinja is a free, open-source Japanese OCR overlay application available for Windows, Linux, and macOS. It was built specifically for this use case: giving Japanese immersion learners a way to read any game or application without leaving the window.
YomiNinja extracts text from any on-screen content
Unlike text-hooking tools (more on those below), YomiNinja doesn't need to understand the game's code or engine. It captures a screenshot of the window and runs OCR on it. This means it works on:
- Any Japanese game, regardless of the engine
- Console games running on emulators (Ryujinx, RPCS3, RetroArch)
- Visual novels with hardcoded, image-rendered text
- Manga readers and Japanese document viewers
- Video players with Japanese subtitles
If you can see Japanese text on screen, YomiNinja can read it.
Yomitan and 10ten ship pre-installed
YomiNinja is an Electron application, so it embeds a Chromium browser context. Yomitan and 10ten Reader come pre-installed in this context. You open YomiNinja, import your dictionary files into Yomitan (a one-time setup of about five minutes), and hover any word in the overlay to get a definition, with no separate browser, no Chrome Web Store, and no additional configuration.
Auto OCR detects new text without a hotkey
YomiNinja can run in a continuous "Auto OCR" mode that detects when the on-screen text changes and automatically runs a new capture. In practice, when a new dialogue line appears in a game, the overlay updates within half a second. You never need to press a hotkey — you can focus entirely on playing.
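One simple way to implement this kind of change detection is to fingerprint each captured frame and run OCR only when the fingerprint changes. The sketch below assumes frames arrive as raw bytes and that `recognize` is whatever OCR call you use; both are placeholders, and this is not a description of YomiNinja's internal logic.

```python
import hashlib
from typing import Callable, Iterable, Iterator


def auto_ocr(
    frames: Iterable[bytes],
    recognize: Callable[[bytes], str],
) -> Iterator[str]:
    """Yield OCR text only for frames whose pixels differ from the last one."""
    last_digest = None
    for frame in frames:
        digest = hashlib.sha256(frame).digest()
        if digest == last_digest:
            continue  # Nothing changed on screen, so skip the expensive OCR call.
        last_digest = digest
        yield recognize(frame)


# Stand-in for a real OCR engine: records each call and echoes the "frame".
calls = []


def fake_recognize(frame: bytes) -> str:
    calls.append(frame)
    return frame.decode("utf-8")


frames = ["こんにちは".encode(), "こんにちは".encode(), "さようなら".encode()]
results = list(auto_ocr(frames, fake_recognize))
print(results, len(calls))
```

An exact hash treats any changed pixel as new text, so a blinking cursor or animated text box would trigger a fresh OCR pass. A practical detector would likely compare frames with some tolerance instead.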
OCR Templates narrow the capture to the dialogue box
Rather than capturing the entire game window (which wastes processing time and introduces false positives from UI elements), you can draw an OCR Template — a rectangle over the specific area where game text appears. For most games, this is the dialogue box at the bottom of the screen. Defining this template once per game dramatically improves both speed and accuracy.
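To make the idea concrete, here is a minimal sketch of a capture rectangle stored as fractions of the window, so it keeps working if the window is resized. The `OcrTemplate` class and the fractional storage format are assumptions for illustration; the article does not describe how YomiNinja stores templates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class OcrTemplate:
    """Capture rectangle stored as fractions of the window (0.0 to 1.0)."""
    left: float
    top: float
    width: float
    height: float

    def to_pixels(self, window_w: int, window_h: int) -> tuple[int, int, int, int]:
        """Return (x0, y0, x1, y1) in pixels for a window of the given size."""
        x0 = round(self.left * window_w)
        y0 = round(self.top * window_h)
        x1 = round((self.left + self.width) * window_w)
        y1 = round((self.top + self.height) * window_h)
        return x0, y0, x1, y1


def crop(pixels: list[list[int]], box: tuple[int, int, int, int]) -> list[list[int]]:
    """Cut the template area out of a row-major pixel grid."""
    x0, y0, x1, y1 = box
    return [row[x0:x1] for row in pixels[y0:y1]]


# A dialogue box across the bottom of a 1920x1080 window.
dialogue_box = OcrTemplate(left=0.1, top=0.7, width=0.8, height=0.25)
pixel_box = dialogue_box.to_pixels(1920, 1080)
print(pixel_box)
```

This example rectangle covers 0.8 × 0.25 = 20% of the window, so the OCR engine has roughly a fifth as many pixels to process as it would with a full-window capture.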
Setting up YomiNinja for the first time
Installation takes under two minutes
YomiNinja is quick to set up: download it from the download page, run the installer (Windows) or extract the archive (Linux), and launch it. No additional runtimes or dependencies are required for the default PaddleOCR engine.
The five steps from zero to first lookup
- Download and install YomiNinja — see the download page for platform-specific instructions.
- Import dictionaries into Yomitan — open YomiNinja, click the Yomitan icon, go to Settings → Dictionaries, and import JMdict. This takes 2–3 minutes and is a one-time process.
- Select your game window — in YomiNinja's capture source list, find and select your game's window.
- Draw an OCR Template — select "OCR Templates", draw a rectangle over the dialogue box area. Save it.
- Enable Auto OCR, then play — enable Auto OCR in settings. The overlay will appear automatically when Japanese text is detected. Hover any word.
Choosing the right OCR engine for your game
YomiNinja supports five OCR engines. The right choice depends on the game's font style and your hardware.
| Engine | Best for | Requires internet? | GPU required? |
|---|---|---|---|
| PaddleOCR (default) | Standard printed game text | No | No |
| MangaOCR | Stylized, hand-drawn, manga fonts | No | Optional |
| Google Cloud Vision | Difficult fonts, maximum accuracy | Yes | No |
| Google Lens | General fallback, no API key | Yes | No |
| Apple Vision | macOS users, vertical text | No | No |
For most games: start with PaddleOCR. If text recognition is inaccurate — especially if the game uses an unusual or artistic font — switch to MangaOCR. It's trained specifically on Japanese comic and game typography and handles stylized characters significantly better.
For vertical text, rotated layouts, or persistent hard cases: Apple Vision on macOS and cloud engines like Google Cloud Vision are the next things to test. The right engine depends less on genre and more on how the game actually renders text.
See the OCR engines comparison for a full technical breakdown of each option.
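The advice in this section can be restated as a small helper that lists engines in the order worth testing. This is only a summary of the table and paragraphs above; the function name and its parameters are made up for illustration and do not correspond to any YomiNinja setting.

```python
def engines_to_try(*, stylized_font: bool, online_ok: bool, on_macos: bool) -> list[str]:
    """Return OCR engines in the order this guide suggests testing them."""
    order = ["PaddleOCR"]  # The default, and the suggested starting point.
    if stylized_font:
        order.append("MangaOCR")  # Suggested for stylized or manga-like fonts.
    if on_macos:
        order.append("Apple Vision")  # Only an option on macOS.
    if online_ok:
        # Cloud engines need an internet connection.
        order += ["Google Cloud Vision", "Google Lens"]
    return order


offline_windows = engines_to_try(stylized_font=True, online_ok=False, on_macos=False)
print(offline_windows)
```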
Tips for better OCR accuracy in games
Use Borderless Windowed mode and scale up the window
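OCR accuracy degrades on small text. If your game window is small, increase its resolution or window size. Text rendered at a larger pixel size is significantly easier to recognize, especially for kanji with complex stroke patterns. Running the game in Borderless Windowed mode lets it fill the screen while still allowing the overlay to appear above it.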
Set up a focused OCR Template, not full-screen capture
Capturing the entire screen includes HUD elements, health bars, map icons, and other non-text graphics. These create OCR noise. A focused OCR Template over just the dialogue box eliminates this and improves both speed and accuracy.
Adjust the furigana threshold if lookups are broken
Furigana, the small kana characters above kanji that show pronunciation, confuse OCR engines. They get captured alongside the kanji, producing text like 行(おこな)い instead of 行い. The dictionary lookup fails because this isn't a real word.
In YomiNinja settings, the furigana threshold slider (added in v0.9.1) filters out text below a certain size. Increasing it progressively removes furigana from the captured output.
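A size-based filter of this kind can be sketched as follows. The `TextBox` type and the relative threshold (a fraction of the tallest recognized line) are assumptions for illustration; the article does not say what units YomiNinja's slider actually uses.

```python
from dataclasses import dataclass


@dataclass
class TextBox:
    text: str
    height: float  # Pixel height of the recognized line.


def drop_furigana(boxes: list[TextBox], threshold: float = 0.6) -> list[TextBox]:
    """Keep boxes at least `threshold` times as tall as the tallest box."""
    if not boxes:
        return []
    tallest = max(box.height for box in boxes)
    return [box for box in boxes if box.height >= threshold * tallest]


boxes = [TextBox("おこな", 12), TextBox("行いを改める", 30)]
kept = [box.text for box in drop_furigana(boxes)]
print(kept)
```

Raising `threshold` removes progressively larger text, which mirrors how moving the slider up strips more furigana. Setting it too high would start dropping real dialogue.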
Font mods solve the stylized font problem
Some games use custom fonts that no OCR engine handles well. Stardew Valley's default Japanese font is a well-known example — PaddleOCR and MangaOCR both struggle with it out of the box. The community solution is a font replacement mod that substitutes the game's font with a standard, legible Japanese typeface. After applying the mod, OCR accuracy improves dramatically. Check NexusMods or the game's subreddit for font mods.
Use Google Cloud Vision for the hardest cases
When PaddleOCR and MangaOCR both fail on a specific game's text, Google Cloud Vision is the fallback. It requires a Google Cloud account and an API key (with a free tier that covers moderate use), but it handles unusual fonts, low-contrast text, and complex layouts more reliably than local engines.
Expect Google Lens to be a fallback, not a perfect replacement
Google Lens can be useful because it does not require an API key, but community reports and YomiNinja issue discussions show that it still struggles on some manga-style layouts, bubbles, and noisy captures. Treat it as a practical backup rather than the default best choice.
OCR overlays vs text hooking: which should you use?
Text-hooking tools like Textractor and LunaTranslator take a completely different approach. Instead of reading text from pixels on screen, they inject into the game's process and extract text directly from memory — exactly the string of characters the game was about to display, before it renders them as pixels.
When text hooking is better than OCR
When a text hook works for a specific game, it's almost always more accurate than OCR. Memory extraction returns the exact, unambiguous text string: no recognition errors, no furigana confusion, no sensitivity to font choices or contrast. For visual novels built on well-supported engines (Kirikiri, RPG Maker, some Unity titles), Textractor or LunaTranslator will give you near-perfect text extraction.
When OCR is the only option
Text hooks require engine-specific plugins and compatibility. They fail on:
- Console ports — games ported from PS4/Switch often use engines that Textractor has no hook for
- Custom engines — games built on proprietary engines have no public hook
- Emulated games — text hooks can't reach into emulator processes to extract game text
- Image-rendered text — some VNs render dialogue as pre-baked images, not text strings
- Action games and JRPGs — real-time games where text appears embedded in the HUD rather than in a dialogue system
In these cases, OCR is not the fallback — it's the only approach that works at all. See the full comparison guide for a decision tree.
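The cases above reduce to a short check. The sketch below is a simplified restatement of this section, not the decision tree from the comparison guide; the function and parameter names are hypothetical.

```python
def pick_extraction_method(
    *, hook_available: bool, emulated: bool, text_is_image: bool
) -> str:
    """Return "text hook" when a hook can work, otherwise "ocr"."""
    if emulated or text_is_image or not hook_available:
        # A hook cannot reach the text in any of these cases, so OCR is the only route.
        return "ocr"
    return "text hook"


switch_game_on_emulator = pick_extraction_method(
    hook_available=False, emulated=True, text_is_image=False
)
print(switch_game_on_emulator)
```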
Adding YomiNinja to your Anki mining workflow
One of the most popular uses of OCR overlays in the Japanese learning community is sentence mining — extracting vocabulary from native content into Anki flashcards. YomiNinja fits into this workflow through its WebSocket output.
The WebSocket → texthooker → Anki pipeline
As YomiNinja's OCR captures text, it simultaneously broadcasts each result over a local WebSocket server. You can connect a texthooker page — a simple HTML page running in a browser — to this WebSocket. The page receives and displays each captured sentence.
With Yomitan running in that browser, you can hover words on the texthooker page to create Anki cards. The complete pipeline: play the game → YomiNinja captures text → texthooker page receives it → hover unfamiliar word → Yomitan exports card to Anki.
Direct Anki integration without the texthooker intermediary is on YomiNinja's roadmap. For current setup instructions, see the Anki mining guide.
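The receiving side of this pipeline can be very small. The sketch below shows the kind of bookkeeping a texthooker page might do with each incoming message: trim it, skip blanks, and skip immediate repeats. The network transport is deliberately left out because this article does not specify the WebSocket port or message format, and `SentenceLog` is a hypothetical name that assumes plain-text messages.

```python
class SentenceLog:
    """Collects captured lines, skipping blanks and immediate repeats."""

    def __init__(self) -> None:
        self.sentences: list[str] = []

    def on_message(self, raw: str) -> bool:
        """Handle one text message; return True if it was stored."""
        sentence = raw.strip()
        if not sentence:
            return False
        if self.sentences and self.sentences[-1] == sentence:
            return False  # Same line captured twice in a row.
        self.sentences.append(sentence)
        return True


log = SentenceLog()
for message in ["森を彷徨う。", "森を彷徨う。", "  ", "町へ行く。"]:
    log.on_message(message)
print(log.sentences)
```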
Known limitations of OCR overlays
OCR is not perfect and makes recognition errors
Unlike text hooking, OCR introduces recognition errors. Visually similar kanji (土 vs 士, 己 vs 巳) are sometimes confused, especially on lower-resolution text. This is less frequent with MangaOCR and Google Cloud Vision, but no OCR engine is error-free.
In practice, most errors are obvious and easy to recognize as errors. The occasional wrong character doesn't prevent understanding the sentence. For vocabulary mining, you should verify the exported text before making Anki cards.
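One lightweight way to support that verification step is to flag characters that have a known lookalike before a sentence becomes a card. The sketch below uses only the two pairs mentioned above; the list is illustrative and far from complete, and the function name is hypothetical.

```python
# Lookalike groups taken from the examples in this section only.
CONFUSABLE_GROUPS = [{"土", "士"}, {"己", "巳"}]


def confusable_chars(sentence: str) -> list[tuple[str, str]]:
    """Return (character, lookalike) pairs worth double-checking by eye."""
    flagged = []
    for char in sentence:
        for group in CONFUSABLE_GROUPS:
            if char in group:
                for other in sorted(group - {char}):
                    flagged.append((char, other))
    return flagged


to_check = confusable_chars("武士は土を踏んだ")
print(to_check)
```

A flagged character is not necessarily wrong. The check only marks where a quick comparison against the game screen is most worthwhile.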
Fullscreen exclusive mode blocks the overlay
This is the most common setup issue. If your game runs in exclusive fullscreen, YomiNinja's transparent window cannot appear above it. The fix is always the same: switch to Borderless Windowed in the game's display settings. Borderless Windowed looks identical to fullscreen but allows the display compositor (and therefore overlay apps) to function.
Linux requires X11 — Wayland support is experimental
On Linux, YomiNinja requires the X11 display server. Wayland support was added as experimental in v0.9.1 but is not yet stable. If you use Wayland, run YomiNinja under XWayland or switch your session to X11.
Dictionary popup setup can fail even when OCR works
A separate failure mode is when OCR extracts text correctly but the Yomitan popup never appears. In practice this usually means JMdict was not imported, Yomitan is still waiting for its scan modifier key (Shift by default), or you are hovering the original game text instead of the overlay text. That is a setup problem, not an OCR problem.
Ready to start?
Download YomiNinja Free
Free, open source, no account required. Windows, Linux, macOS.