Page OSINT Extractor
One-click bookmarklet that scrapes intelligence from any webpage — emails, phone numbers, social media links, technology fingerprints, hidden metadata, HTML comments — and sends it here for organized analysis. Optionally archives the page on archive.today for evidence preservation. Nothing is stored on our servers.
Drag this button to your bookmarks bar:
🧲 Extract OSINT
Can't drag? Manual install instructions:
1. Right-click the button → "Copy link address"
2. Create a new bookmark manually
3. Name it "Extract OSINT" and paste the copied link as the URL
Page OSINT Extractor — Bookmarklet-Powered Intelligence
Traditional OSINT workflows involve manually scanning webpages for contact information, social links, and technical indicators. The Page OSINT Extractor automates this into a single click. The bookmarklet injects a small script into the currently viewed page that scans the DOM for intelligence-relevant patterns, packages everything into a compressed payload, and opens this analysis page where findings are categorized, deduplicated, and presented in a structured format.
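The categorize-and-deduplicate step on the analysis page could look roughly like the sketch below. It is only an illustration: the function names `dedupe` and `organize`, and the choice to compare values case-insensitively, are assumptions and not the tool's published code.

```javascript
// Hypothetical helper: drop repeated values, comparing them
// case-insensitively and ignoring surrounding whitespace.
function dedupe(values) {
  const seen = new Set();
  const out = [];
  for (const value of values) {
    const trimmed = value.trim();
    const key = trimmed.toLowerCase();
    if (key && !seen.has(key)) {
      seen.add(key);
      out.push(trimmed); // keep the first spelling that was seen
    }
  }
  return out;
}

// Hypothetical helper: apply dedupe to every category of raw findings.
function organize(raw) {
  const result = {};
  for (const [category, values] of Object.entries(raw)) {
    result[category] = dedupe(values);
  }
  return result;
}
```

For example, `organize({ emails: ["A@b.com", "a@b.com "] })` keeps a single entry, because both values normalize to the same key.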
What Gets Extracted
The bookmarklet searches for email addresses (including obfuscated mailto: patterns), phone numbers (with international format normalization), social media profile links across 20+ platforms, technology indicators from script tags and meta elements, HTML comments that developers often leave in production code, Open Graph and Twitter Card metadata, RSS and Atom feed URLs, and any other structured data embedded in the page. The extraction uses regex patterns against both visible text and raw HTML source.
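To make the regex-based approach concrete, here is a minimal sketch that pulls emails, social links, and HTML comments out of an HTML string. The patterns, the `SOCIAL_HOSTS` list, and the function name `extractFindings` are illustrative assumptions; the bookmarklet's actual patterns are not shown on this page and are likely more thorough.

```javascript
// Hypothetical patterns, simplified for illustration.
const EMAIL_RE = /[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}/g;
const COMMENT_RE = /<!--([\s\S]*?)-->/g;
const HREF_RE = /href\s*=\s*["']([^"']+)["']/gi;

// Illustrative subset of platforms, not the tool's real list.
const SOCIAL_HOSTS = [
  "twitter.com", "x.com", "facebook.com",
  "linkedin.com", "instagram.com", "github.com",
];

function extractFindings(html) {
  // Emails anywhere in the markup, including inside mailto: links.
  const emails = html.match(EMAIL_RE) || [];

  // Text of every HTML comment, trimmed.
  const comments = [...html.matchAll(COMMENT_RE)].map((m) => m[1].trim());

  // Absolute links whose host is one of the known platforms.
  const hrefs = [...html.matchAll(HREF_RE)].map((m) => m[1]);
  const social = hrefs.filter((href) => {
    try {
      const host = new URL(href).hostname.replace(/^www\./, "");
      return SOCIAL_HOSTS.includes(host);
    } catch {
      return false; // relative or malformed URL
    }
  });

  return { emails, social, comments };
}
```

In a bookmarklet the input would typically be something like `document.documentElement.outerHTML`; the sketch takes a plain string so it can also run outside a browser.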
Evidence Preservation with archive.today
For investigative purposes, it is critical to preserve the state of a webpage at the moment of analysis. The bookmarklet optionally triggers an archive.today snapshot, creating a permanent, timestamped copy that cannot be altered by the page owner. This archived copy can serve as evidence of what was published at a specific time, which is essential for legal proceedings, journalism, and formal investigations.
Security Model
The extracted data is encoded into the URL hash fragment (the part after #). Browsers strip the fragment before sending an HTTP request (RFC 3986 defines it as something the user agent handles locally), so it stays in your browser. The MaxIntel analysis page reads the hash, decodes the data, and renders results entirely client-side. No extraction data touches any MaxIntel server. The bookmarklet reads the DOM of the page you're on and does not transmit what it extracts; the only network activity it triggers is opening the analysis page and, if enabled, the archive.today snapshot request.
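The hash-fragment hand-off can be sketched as follows. This is an illustration under stated assumptions: `ANALYSIS_URL` is a placeholder address, the function names are hypothetical, and the compression step the tool applies is left out for brevity.

```javascript
// Placeholder address, not the tool's real URL.
const ANALYSIS_URL = "https://analysis.example/page-osint";

// Bookmarklet side: serialize findings into the fragment.
function buildAnalysisUrl(findings) {
  const json = JSON.stringify(findings);
  // encodeURIComponent keeps the payload URL-safe (compression omitted).
  return ANALYSIS_URL + "#" + encodeURIComponent(json);
}

// Analysis-page side: read the fragment back. In a browser this would
// use location.hash instead of parsing a URL string.
function readFindings(url) {
  const hash = new URL(url).hash; // includes the leading "#"
  if (hash.length <= 1) return null;
  return JSON.parse(decodeURIComponent(hash.slice(1)));
}

// The part of the URL that ends up on the HTTP request line:
// path and query only, never the fragment.
function requestTarget(url) {
  const u = new URL(url);
  return u.pathname + u.search;
}
```

Calling `requestTarget(buildAnalysisUrl(findings))` returns only `/page-osint`, which shows why the payload never reaches the server unless a script on the receiving page sends it explicitly.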
- Bookmarklet
- A small JavaScript program stored as a browser bookmark. When clicked, it executes on the currently viewed page in the same security context as the page itself.
- URL Hash Fragment
- The portion of a URL after the # symbol. Browsers do not include the fragment in HTTP requests (see RFC 3986, section 3.5), so it is available only to scripts running in the browser, which makes it well suited to passing sensitive data between pages.
- archive.today
- A web archiving service that creates permanent, timestamped snapshots of webpages. It saves the page as rendered in a browser, so JavaScript-generated content is usually preserved, which the Wayback Machine does not always manage.
- DOM Scraping
- Programmatically reading the Document Object Model (HTML structure) of a webpage to extract specific elements and text patterns.
🧲 Page OSINT Extractor — FAQ
Is the bookmarklet safe to use?
The bookmarklet only reads the DOM of the page you're currently viewing. It does not send the extracted data to any server: the data is passed to the analysis page via the URL hash fragment, which browsers never include in HTTP requests. The full source code is visible, so you can inspect it before installing.
Does this work on JavaScript-rendered pages?
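Yes. The bookmarklet runs when you click it, normally after the page has finished rendering, so it captures dynamically loaded content, AJAX-fetched data, and JavaScript-generated elements. It reads the live DOM, not the initial HTML source. Content that has not loaded yet at that moment, such as lazily loaded sections, is not captured.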
Will the target website know I used this?
No. The bookmarklet reads the DOM silently — it does not modify the page, trigger events, or make network requests that the website could detect. The only detectable action is the archive.today snapshot, which makes a request from archive.today's servers (not your browser).
Can I use this on login-protected pages?
Yes — since the bookmarklet runs in your browser session, it can extract data from pages you're logged into. The extracted data stays entirely in your browser via the URL hash. However, be mindful of terms of service and applicable laws when scraping authenticated content.
What's the maximum amount of data it can handle?
URL length limits vary by browser. Chrome accepts URLs of up to roughly 2 MB in total, including the hash fragment. The bookmarklet compresses data and caps extraction so the URL stays under the limit. For very large pages, some content may be truncated. The extraction prioritizes high-value OSINT data (contacts, links, metadata) over raw text.
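One way to cap a payload while favoring high-value categories is sketched below. The `PRIORITY` order, the category names, and the function `capFindings` are assumptions made for illustration; the tool's real limits and ordering are not documented here.

```javascript
// Hypothetical priority order: contacts and links first, raw text last.
const PRIORITY = ["emails", "phones", "social", "metadata", "comments", "text"];

// Add items in priority order until the URL-encoded JSON would exceed
// maxChars, then stop. Re-serializing on every item is wasteful but
// keeps the sketch short.
function capFindings(findings, maxChars) {
  const kept = {};
  const size = () => encodeURIComponent(JSON.stringify(kept)).length;

  for (const category of PRIORITY) {
    for (const item of findings[category] || []) {
      if (!kept[category]) kept[category] = [];
      kept[category].push(item);
      if (size() > maxChars) {
        // Undo the item that broke the budget and report truncation.
        kept[category].pop();
        if (kept[category].length === 0) delete kept[category];
        return { kept, truncated: true };
      }
    }
  }
  return { kept, truncated: false };
}
```

With a small budget, a long `text` entry is dropped while a short `emails` entry survives, which mirrors the prioritization described above.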