Data Wrangler
Load a CSV or JSON file and clean, reshape, and mine it with real pandas — group-by aggregation, de-duplication, summary statistics, filtering, and regex extraction to pull every email, IP, or handle out of a messy column. The engine is Python (pandas) compiled to WebAssembly and runs entirely in your browser, so even sensitive datasets never leave your device.
Why analyse data in the browser
Investigative datasets — exports, leaks you are authorised to review, scraped tables — are often sensitive, and uploading them to an online CSV tool means handing them to a third party. Here the work runs on real pandas compiled to WebAssembly, so the file stays in your browser. Group-by and value counts surface patterns, de-duplication cleans exports, and regex extraction pulls every email, IP address, URL, or @handle out of a free-text column into a frequency table — the kind of pivot that normally needs a Python notebook.
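For readers who want to see what these operations look like in pandas itself, here is a minimal sketch. The sample data and the column names (`user`, `country`, `amount`) are invented for illustration, and the tool's own internals may differ; only standard pandas calls are used.

```python
import io

import pandas as pd

# Hypothetical sample export; the column names are invented for illustration.
CSV_TEXT = """user,country,amount
alice,DE,10
bob,FR,5
alice,DE,10
carol,DE,7
bob,FR,3
dave,DE,4
"""

df = pd.read_csv(io.StringIO(CSV_TEXT))

# De-duplication: drop rows that exactly repeat an earlier row.
deduped = df.drop_duplicates().reset_index(drop=True)

# Value counts: how often each country appears after de-duplication.
country_counts = deduped["country"].value_counts()

# Group-by aggregation: total amount and number of rows per user.
per_user = deduped.groupby("user")["amount"].agg(["sum", "count"])

# Summary statistics for a numeric column.
summary = deduped["amount"].describe()

print(country_counts)
print(per_user)
print(summary)
```

The repeated `alice,DE,10` row is removed before counting, so the aggregates describe distinct records rather than export duplicates.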
Frequently asked questions
Is my data uploaded anywhere?
No. Data Wrangler runs pandas in your browser through WebAssembly. Your CSV or JSON is parsed and analysed locally and never sent to a server.
Why is the first load larger than the other labs?
pandas pulls in more of the scientific Python stack, so the one-time engine download is around 30 MB. It is cached afterwards and then works offline.
What can the regex extractor do?
Pick a column and a pattern (presets for emails, IPv4, URLs, @handles, and hashtags, or your own regex) and it returns every match across the column as a ranked frequency table — useful for pulling indicators out of notes or log fields.
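The extraction described above can be approximated in plain pandas as follows. This is a sketch under assumptions: the helper name `extract_frequencies`, the `PATTERNS` dictionary, and the two regular expressions are illustrative and are not the tool's actual presets.

```python
import pandas as pd

# Hypothetical preset patterns; the tool's real presets are not documented here.
PATTERNS = {
    "email": r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}",
    # Loose IPv4 shape only; it does not check that each octet is <= 255.
    "ipv4": r"\b(?:\d{1,3}\.){3}\d{1,3}\b",
}


def extract_frequencies(column: pd.Series, pattern: str) -> pd.Series:
    """Return every regex match in a text column as a ranked frequency table."""
    matches = column.dropna().astype(str).str.findall(pattern)
    # explode() gives one row per match; cells with no match become NaN.
    return matches.explode().dropna().value_counts()


notes = pd.Series(
    [
        "login from 10.0.0.5, contact alice@example.com",
        "alice@example.com retried from 10.0.0.5 and 192.168.1.9",
        None,
        "no indicators here",
        "bob@example.org wrote to alice@example.com",
    ]
)

emails = extract_frequencies(notes, PATTERNS["email"])
ips = extract_frequencies(notes, PATTERNS["ipv4"])
print(emails)
print(ips)
```

Because `str.findall` returns all matches per cell, a single note containing several addresses contributes each of them to the table, while empty cells and cells without a match contribute nothing.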
How big a file can it handle?
It runs in your browser's memory, so tens of thousands of rows are comfortable; very large files (hundreds of MB) may be slow or hit memory limits. For those, a desktop notebook is better.
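If you are unsure whether a dataset will fit, one rough check is to compare the file size with the memory pandas reports for the loaded table. The sketch below uses synthetic data and an invented column layout; the in-memory figure can differ substantially from the file size depending on column types and pandas version.

```python
import io

import pandas as pd

# Synthetic stand-in for a real export: 10,000 rows with one text column.
csv_text = "id,note\n" + "".join(f"{i},row number {i}\n" for i in range(10_000))

df = pd.read_csv(io.StringIO(csv_text))

# Size of the raw CSV text versus the memory held by the DataFrame.
file_bytes = len(csv_text.encode("utf-8"))
# deep=True also measures the contents of string columns, not just pointers.
memory_bytes = int(df.memory_usage(deep=True).sum())

print(f"rows: {len(df)}")
print(f"file size: {file_bytes / 1024:.1f} KiB")
print(f"in memory: {memory_bytes / 1024:.1f} KiB")
```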