Data Wrangler

Load a CSV or JSON file, then clean, reshape, and mine it with real pandas: group-by aggregation, de-duplication, summary statistics, filtering, and regex extraction that pulls every email, IP address, or handle out of a messy column. The engine is Python (pandas) compiled to WebAssembly and runs entirely in your browser, so even sensitive datasets never leave your device.

Runs real pandas locally via WebAssembly. Engine downloads once (~30 MB), then works offline. Your data is never uploaded.

Why analyse data in the browser

Investigative datasets — exports, leaks you are authorised to review, scraped tables — are often sensitive, and uploading them to an online CSV tool means handing them to a third party. Here the work runs on real pandas compiled to WebAssembly, so the file stays in your browser. Group-by and value counts surface patterns, de-duplication cleans exports, and regex extraction pulls every email, IP address, URL, or @handle out of a free-text column into a frequency table — the kind of pivot that normally needs a Python notebook.
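
For readers who know pandas, the operations named above correspond roughly to the calls below. This is a minimal sketch on made-up data, not the tool's own code: the column names (user, country, amount) and the sample rows are assumptions chosen for illustration.

```python
import io

import pandas as pd

# Hypothetical sample export; the columns and values are illustrative only.
csv_text = """user,country,amount
alice,DE,10
bob,FR,20
alice,DE,10
carol,DE,5
bob,FR,7
dave,DE,3
"""

df = pd.read_csv(io.StringIO(csv_text))

# De-duplication: drop rows that exactly repeat an earlier row.
deduped = df.drop_duplicates().reset_index(drop=True)

# Value counts: how many remaining rows fall under each country.
country_counts = deduped["country"].value_counts()

# Group-by aggregation: total and mean amount per country.
per_country = deduped.groupby("country")["amount"].agg(["sum", "mean"])

print(country_counts)
print(per_country)
```

Dropping duplicates before counting matters here: the repeated alice row would otherwise inflate both the DE count and the DE total.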

Frequently asked questions

Is my data uploaded anywhere?

No. The data wrangler runs pandas in your browser through WebAssembly. Your CSV or JSON is parsed and analysed locally and never sent to a server.

Why is the first load larger than the other labs?

pandas pulls in more of the scientific Python stack, so the one-time engine download is around 30 MB. It is cached afterwards and then works offline.

What can the regex extractor do?

Pick a column and a pattern (presets for emails, IPv4, URLs, @handles, and hashtags, or your own regex) and it returns every match across the column as a ranked frequency table — useful for pulling indicators out of notes or log fields.
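
In plain pandas, a ranked frequency table of regex matches could be built along the following lines. This is a sketch under stated assumptions: the email pattern is a simplified illustration and not necessarily the preset the tool ships, and `ranked_matches` is a hypothetical helper name.

```python
import pandas as pd

# Simplified email pattern for illustration; the tool's preset may differ.
EMAIL_PATTERN = r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"

# Hypothetical free-text column, including a row with no match and a missing value.
notes = pd.Series([
    "contact a@example.com or b@example.org",
    "no address here",
    "forwarded by a@example.com",
    None,
])


def ranked_matches(column: pd.Series, pattern: str) -> pd.Series:
    """Return every regex match in the column, ranked by how often it occurs."""
    matches = (
        column.dropna()        # skip missing cells
        .astype(str)
        .str.findall(pattern)  # list of all matches per cell
        .explode()             # one row per match
        .dropna()              # cells with no match explode to NaN
    )
    return matches.value_counts()


table = ranked_matches(notes, EMAIL_PATTERN)
print(table)
```

Using `findall` followed by `explode` keeps every match in a cell, not just the first, which is what makes the result a count across the whole column.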

How big a file can it handle?

It runs in your browser's memory, so tens of thousands of rows are comfortable; very large files (hundreds of MB) may be slow or hit memory limits. For those, a desktop notebook is better.
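
If you move to a desktop notebook, one rough way to see how much memory a loaded table occupies is pandas' own memory report. The sketch below uses a synthetic 50,000-row frame as a stand-in for a real file; the column names are assumptions, and the figure it prints says nothing about the browser's actual limits.

```python
import pandas as pd

# Synthetic stand-in for a loaded file: 50,000 rows, one numeric and one text column.
df = pd.DataFrame({
    "id": range(50_000),
    "note": ["example text"] * 50_000,
})

# deep=True also counts the memory held by the string values themselves.
bytes_used = int(df.memory_usage(deep=True).sum())
mb_used = bytes_used / 1_000_000

print(f"{len(df):,} rows use about {mb_used:.1f} MB in memory")
```

The in-memory size of a table can differ considerably from its size on disk, especially when it has many text columns, so measure after loading instead of relying on the file size alone.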