AI · 214 File Classes
File Type Detector
Detect a file's true type from its content, not its extension. Powered by Google Magika, a roughly 3 MB neural model that runs entirely in your browser via WebAssembly and ONNX Runtime. It identifies 214 file categories, including source code, documents, images, audio, video, archives, executables, and cryptographic keys.
How to Use File Type Detector in 3 Steps
- Drop files. Drag one or more files onto the drop zone, or click to open the file picker. Detection runs locally — nothing leaves your device.
- Review detected types. Each file's detected label, group (code, document, image, archive, executable…), MIME type, and confidence score are listed in the results table.
- Export results. Copy as JSON or CSV for triaging unknown downloads, validating content against extensions, or scripted pipelines.
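As a rough illustration of the "validating content against extensions" use case, the sketch below compares the MIME type implied by each file name with the detected MIME type in an exported JSON list. The field names (`name`, `label`, `mime`, `score`) are assumptions made for this example; check them against the actual export before relying on it.

```python
import json
import mimetypes


def find_mismatches(export_json: str) -> list:
    """Return rows whose extension-implied MIME type differs from the detected one."""
    rows = json.loads(export_json)
    mismatches = []
    for row in rows:
        # MIME type implied by the file name alone (None if the extension is unknown).
        expected, _ = mimetypes.guess_type(row["name"])
        if expected != row["mime"]:
            mismatches.append(
                {"name": row["name"], "expected": expected, "detected": row["mime"]}
            )
    return mismatches


# Hypothetical export: field names and values are illustrative only.
sample = json.dumps([
    {"name": "notes.txt", "label": "zip", "mime": "application/zip", "score": 0.99},
    {"name": "photo.png", "label": "png", "mime": "image/png", "score": 0.98},
])

for mismatch in find_mismatches(sample):
    print(mismatch)
```

A mismatch is only a prompt for a closer look; some formats legitimately map to more than one MIME type.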
Why File Type Detector on Pixlane
File extensions lie. A .txt can be a ZIP archive, a .jpg can be a PHP shell, and many formats have no extension at all. Magika, Google's content-aware classifier, reads only the first and last 4 KB of a file, packs bytes from those two blocks into a 2048-token tensor, and runs a small neural network with 214 output classes to predict a label. Pixlane ships this exact model inside its WebAssembly bundle, so detection is fast, offline, and private.
- Content, not extension — Works on renamed, broken, or extensionless files.
- 214 classes — Source code (50+ languages), documents (PDF, DOCX, ODT, RTF), images (PNG, JPEG, SVG, HEIC, AVIF), audio/video, archives (ZIP, 7z, TAR, DEB, RPM), executables (ELF, Mach-O, PE), keys & certs.
- In-browser AI — The same standard_v3_3 ONNX model that Magika's Python tool uses, running on ONNX Runtime compiled to WebAssembly with SIMD and threads.
- Privacy-first — Files never upload. The model only needs the first 4 KB and last 4 KB; the middle of your file isn't even read.
- Confidence scores — Every detection reports a probability. Low-confidence results fall back to a strict UTF-8 heuristic ("txt" vs "unknown").
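The "first 4 KB and last 4 KB" idea can be sketched in a few lines of Python. This is a minimal illustration of reading only the two ends of a file, not Pixlane's or Magika's actual code; the function name `read_head_and_tail` is hypothetical.

```python
import io
from typing import BinaryIO, Tuple

BLOCK_SIZE = 4096  # 4 KB, the block size quoted above


def read_head_and_tail(stream: BinaryIO, block_size: int = BLOCK_SIZE) -> Tuple[bytes, bytes]:
    """Read only the first and last block of a seekable binary stream."""
    # Seeking to the end returns the total size without reading any content.
    size = stream.seek(0, io.SEEK_END)
    stream.seek(0)
    head = stream.read(min(block_size, size))
    # For files shorter than one block, head and tail overlap.
    stream.seek(max(0, size - block_size))
    tail = stream.read(block_size)
    return head, tail


data = bytes(range(256)) * 64  # 16 KB of sample content
head, tail = read_head_and_tail(io.BytesIO(data))
print(len(head), len(tail))
```

The same function works on a file opened with `open(path, "rb")`; the middle of the file is skipped by seeking rather than read.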
Frequently Asked Questions
What is Magika?
Magika is an open-source file-type detector by Google (Apache-2.0). It uses a small neural network trained on roughly 100 million samples, and Google reports about 99% average accuracy across 200+ formats, including cases that trip up rule-based tools such as file(1) and libmagic. Pixlane embeds the standard_v3_3 ONNX model (3.1 MB) directly in the browser.
Do my files get uploaded?
No. The entire detection pipeline — reading the first & last 4 KB, preprocessing, inference, postprocessing — runs inside your browser's WebAssembly sandbox. Files of any size can be dropped; only the first and last 4 KB are touched by the detector.
How does it handle tiny or empty files?
Exactly like the Python reference implementation. A zero-byte file returns the label empty. A file with fewer than 8 meaningful (non-whitespace) bytes skips the model: a strict UTF-8 validity check decides between txt and unknown.
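The small-file rules can be sketched as follows. This is an approximation under one stated assumption: "meaningful" bytes are taken to be what remains after trimming ASCII whitespace from both ends, which may differ in detail from the reference implementation. The function name `classify_small_file` is hypothetical.

```python
from typing import Optional

MIN_MEANINGFUL_BYTES = 8  # threshold quoted above


def classify_small_file(content: bytes) -> Optional[str]:
    """Return a label for empty or tiny files, or None if the model should run."""
    if len(content) == 0:
        return "empty"
    # Assumption: trim leading and trailing ASCII whitespace to count meaningful bytes.
    meaningful = content.strip()
    if len(meaningful) < MIN_MEANINGFUL_BYTES:
        try:
            # Strict decoding raises on any invalid UTF-8 sequence.
            content.decode("utf-8", errors="strict")
            return "txt"
        except UnicodeDecodeError:
            return "unknown"
    return None  # enough content: hand the file to the model


print(classify_small_file(b""))
print(classify_small_file(b"hi\n"))
print(classify_small_file(b"\xff\xfe"))
```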
Which score should I trust?
The score column is the raw softmax probability of the top label. For standard labels, Magika compares that score with a per-label confidence threshold; when the score is below the threshold, the result falls back to txt (text-looking content) or unknown (binary content), matching Magika's HIGH_CONFIDENCE mode.
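A minimal sketch of the threshold fallback, assuming the caller already knows whether the content looks like text. The threshold values, the default, and the function name `apply_threshold` are hypothetical and chosen only for illustration; they are not Magika's real numbers.

```python
# Hypothetical per-label thresholds, for illustration only.
THRESHOLDS = {"python": 0.8, "zip": 0.6}
DEFAULT_THRESHOLD = 0.5  # assumed default for labels without an explicit entry


def apply_threshold(label: str, score: float, looks_like_text: bool) -> str:
    """Keep the model's label if its score clears the threshold, else fall back."""
    if score >= THRESHOLDS.get(label, DEFAULT_THRESHOLD):
        return label
    # Below the threshold: generic text label for text-looking content,
    # generic unknown label for binary content.
    return "txt" if looks_like_text else "unknown"


print(apply_threshold("python", 0.95, True))
print(apply_threshold("python", 0.50, True))
print(apply_threshold("zip", 0.30, False))
```

In practice this means a row showing txt or unknown with a middling score may be a fallback rather than a direct prediction.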
Can it replace libmagic?
For most modern workflows, yes. Magika is stronger on polyglot formats, truncated files, unknown extensions, and ambiguous text types. Some legacy niche formats (obscure mainframe binaries, very old OS artifacts) still need libmagic's rule database.