Schema Sniffer – Structured Data Finder & JSON Extractor

Every webpage contains hidden metadata: structured data tucked inside JSON-LD scripts, microdata attributes, RDFa markup, and social meta tags. This data is invisible in the browser, but it shapes how search engines interpret your content, how social platforms display your links, and how web applications understand your pages. Schema Sniffer is a one-click bookmarklet that exposes all of it. No more digging through HTML source code or using clunky online validators. Just click the bookmarklet and see exactly what structured data is on the page: all JSON-LD schemas, microdata items, RDFa triples, Open Graph tags, and Twitter Card metadata in one organized panel. Perfect for SEO audits, competitive analysis, schema debugging, or verifying that your markup is correct. Extract the data, copy it, or download it as JSON, all from your browser in seconds.

Schema Sniffer in action on Arcuras.com – The bookmarklet panel displays all structured data extracted from a music track page, including 2 JSON-LD schemas, 1 Microdata item, 1 RDFa triple, and 7 Open Graph tags, with options to copy, download, collapse, or filter by schema type.

javascript:(()=>{const id="__schemaSniffer__";try{document.getElementById(id)?.remove();const by=Array.from,txt=(n)=>n.textContent||"",trim=(s)=>s.replace(/\s+/g," ").trim();const safeParse=(s)=>{try{return JSON.parse(s)}catch(e){try{return JSON.parse(s.replace(/[\u0000-\u001F]/g,""))}catch{return{__parseError:String(e),__raw:s}}}};const valFromEl=(el)=>{const tn=(el.tagName||"").toLowerCase();if(el.hasAttribute("content"))return el.getAttribute("content");if(tn==="meta")return el.getAttribute("content")||"";if(/^(a|area|link)$/i.test(tn))return el.getAttribute("href")||"";if(/^(img|audio|video|source|track)$/i.test(tn))return el.getAttribute("src")||"";if(tn==="time")return el.getAttribute("datetime")||trim(txt(el));if(tn==="data"||tn==="meter")return el.getAttribute("value")||trim(txt(el));return trim(txt(el))};const isDescOf=(n,root)=>{let p=n.parentElement;while(p&&p!==root){if(p.hasAttribute("itemscope"))return false;p=p.parentElement}return true};const serMicro=(root)=>{const item={__type:"Microdata",itemtype:root.getAttribute("itemtype")||null,itemid:root.getAttribute("itemid")||null,props:{}};by(root.querySelectorAll("[itemprop]")).forEach(p=>{if(!isDescOf(p,root))return;const name=p.getAttribute("itemprop");if(p.hasAttribute("itemscope")){item.props[name]=serMicro(p)}else{item.props[name]=valFromEl(p)}});return item};const collectJSONLD=()=>by(document.querySelectorAll(%27script[type="application/ld+json" i]%27)).map(s=>({__type:"JSON-LD",source:"<script>",data:safeParse(s.textContent||"")}));const collectMicro=()=>by(document.querySelectorAll("[itemscope]")).filter(el=>!el.hasAttribute("itemprop")||!el.parentElement?.closest("[itemscope]")).map(serMicro);const collectRDFa=()=>{const map=new Map();const subjOf=(el)=>{const host=el.closest("[typeof],[about],[resource]");if(!host)return"__page__";return host.getAttribute("about")||host.getAttribute("resource")||("__anon_"+(map.size+1))};const typeOf=(el)=>{const h=el.closest("[typeof]");return 
h?h.getAttribute("typeof"):null};by(document.querySelectorAll("[property]")).forEach(el=>{const subj=subjOf(el);if(!map.has(subj))map.set(subj,{__type:"RDFa",subject:subj,typeof:typeOf(el),props:{}});const it=map.get(subj);const prop=el.getAttribute("property");const val=valFromEl(el);if(it.props[prop]===undefined)it.props[prop]=val;else if(Array.isArray(it.props[prop]))it.props[prop].push(val);else it.props[prop]=[it.props[prop],val]});return [...map.values()]};const collectOGT=()=>{const og={},tw={};by(document.querySelectorAll(%27meta[property^="og:" i],meta[name^="twitter:" i]%27)).forEach(m=>{const p=m.getAttribute("property");const n=m.getAttribute("name");const k=(p||n||"").toLowerCase();const v=m.getAttribute("content")||"";(k.startsWith("og:")?og:tw)[k]=v});return{opengraph:og,twitter:tw}};const data={jsonld:collectJSONLD(),microdata:collectMicro(),rdfa:collectRDFa(),...collectOGT()};const pretty=()=>JSON.stringify(data,null,2);const panel=document.createElement("div");panel.id=id;panel.style.cssText="position:fixed;inset:auto 12px 12px auto;top:12px;z-index:2147483647;max-width:720px;width:720px;max-height:80vh;background:#0f1115;color:#fff;border-radius:12px;box-shadow:0 12px 28px rgba(0,0,0,.45);font:13px/1.45 system-ui,Segoe UI,Roboto,Arial,sans-serif;overflow:hidden;border:1px solid #2a2f3a";const counts=%60JSON-LD: ${data.jsonld.length} · Microdata: ${data.microdata.length} · RDFa: ${data.rdfa.length} · OG: ${Object.keys(data.opengraph).length} · Twitter: ${Object.keys(data.twitter).length}%60;panel.innerHTML=%60<div style="display:flex;gap:8px;align-items:center;padding:10px 12px;background:#11151c;border-bottom:1px solid #232936"><strong style="font-size:13px">Structured Data Finder</strong><span style="opacity:.75">(${counts})</span><span style="flex:1"></span><button id="__copy" style="padding:6px 10px;border-radius:8px;background:#2a6;color:#fff;border:0;cursor:pointer">Copy JSON</button><button id="__save" style="padding:6px 
10px;border-radius:8px;background:#2763c4;color:#fff;border:0;cursor:pointer">Download</button><button id="__min" style="padding:6px 10px;border-radius:8px;background:#444;color:#fff;border:0;cursor:pointer">Collapse</button><button id="__close" style="padding:6px 10px;border-radius:8px;background:#444;color:#fff;border:0;cursor:pointer">Close</button></div><div id="__body" style="padding:10px 12px"><div style="display:flex;gap:8px;margin:6px 0 10px 0;flex-wrap:wrap"><label style="opacity:.85"><input type="checkbox" id="__compact"> compact</label><label style="opacity:.85"><input type="checkbox" id="__onlyLD"> JSON-LD only</label></div><textarea id="__out" spellcheck="false" style="width:100%;height:56vh;resize:vertical;background:#0b0d11;color:#d7dde9;border:1px solid #273041;border-radius:8px;padding:10px;font-family:ui-monospace,SFMono-Regular,Consolas,monospace"></textarea></div>%60;document.body.appendChild(panel);const out=panel.querySelector("#__out"),btnC=panel.querySelector("#__copy"),btnS=panel.querySelector("#__save"),btnM=panel.querySelector("#__min"),body=panel.querySelector("#__body");const render=()=>{const onlyLD=panel.querySelector("#__onlyLD").checked;const compact=panel.querySelector("#__compact").checked;const d=onlyLD?{jsonld:data.jsonld}:data;out.value=compact?JSON.stringify(d):JSON.stringify(d,null,2)};panel.querySelector("#__onlyLD").onchange=render;panel.querySelector("#__compact").onchange=render;render();panel.querySelector("#__close").onclick=()=>panel.remove();btnM.onclick=()=>{if(body.style.display!=="none"){body.style.display="none";btnM.textContent="Expand"}else{body.style.display="";btnM.textContent="Collapse"}};btnC.onclick=()=>{navigator.clipboard?.writeText(out.value).then(()=>{btnC.textContent="Copied ✓";setTimeout(()=>btnC.textContent="Copy JSON",1200)}).catch(()=>{prompt("Copy this:",out.value)})};btnS.onclick=()=>{const blob=new Blob([out.value],{type:"application/json"});const 
a=document.createElement("a");a.href=URL.createObjectURL(blob);a.download=%60schema-data-${(location.hostname||"site")}.json%60;a.click();setTimeout(()=>URL.revokeObjectURL(a.href),1500)};if(!data.jsonld.length&&!data.microdata.length&&!data.rdfa.length&&Object.keys(data.opengraph).length===0&&Object.keys(data.twitter).length===0){out.value="// No structured data found on this page."}}catch(e){console.error(e);alert("Something went wrong: "+e)}})();

Description

A powerful bookmarklet tool that instantly extracts, analyzes, and displays all structured data (JSON-LD, Microdata, RDFa, Open Graph, and Twitter Card metadata) from any webpage. Perfect for SEO audits, schema validation, and structured data analysis.

What It Does

Schema Sniffer is a bookmarklet that runs in your browser and automatically detects and extracts all structured data markup from the current webpage. It supports multiple schema formats and presents them in an organized, easy-to-read interface.

Key Features:

Multiple Schema Format Support
- JSON-LD: JavaScript Object Notation for Linked Data (most common)
- Microdata: HTML5 itemscope/itemprop attributes
- RDFa: Resource Description Framework in Attributes
- Open Graph (OG): Facebook and social media metadata
- Twitter Cards: Twitter-specific metadata
Real-time Data Extraction
- Automatically scans the entire page for structured data
- Counts the items found for each format and shows the totals in the panel header
Interactive UI Panel
- Fixed-position overlay in the top-right corner that leaves the page's own layout untouched
- Clean, dark-themed interface for easy reading
- Displays raw JSON output in textarea
User-Friendly Options
- Copy JSON Button: Instantly copy all extracted data to clipboard
- Download Button: Save extracted data as a .json file
- Collapse/Expand Button: Minimize the panel to reduce screen clutter
- Close Button: Remove the panel from the page
- Compact Checkbox: Toggle between formatted (pretty) and minified JSON
- JSON-LD Only Checkbox: Filter to show only JSON-LD schemas
Smart Data Extraction
- Extracts values from HTML attributes (href, src, content, datetime, value)
- Handles nested structures properly
- Retries JSON-LD that fails to parse after stripping ASCII control characters (U+0000 to U+001F)
- Reports a parse failure inside the output instead of letting it break the tool

How It Works Technically

Step-by-Step Process:

1. Initialization

Removes any panel left over from a previous run
Creates internal functions for data extraction and parsing

2. Value Extraction (`valFromEl`)

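Reads the most meaningful value from each element, checking in this order:
- Any element with a content attribute (including meta tags): the content attribute
- Links (a, area, link): the href attribute
- Media (img, audio, video, source, track): the src attribute
- Time elements: the datetime attribute, falling back to text content
- data and meter elements: the value attribute, falling back to text content
- Everything else: text content
Text content is trimmed and runs of whitespace are collapsed to single spaces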
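
The lookup order can be sketched as a standalone function. This is a simplified illustration rather than the bookmarklet's exact code: `fakeEl` is a hypothetical stand-in for a DOM element so the example runs outside a browser, and the names `valueFromElement` and `normalize` are illustrative. The data/meter branch follows the bookmarklet source.

```javascript
// Hypothetical stand-in for a DOM element, so the sketch runs without a browser.
const fakeEl = (tagName, attrs = {}, textContent = "") => ({
  tagName,
  textContent,
  hasAttribute: (name) => name in attrs,
  getAttribute: (name) => (name in attrs ? attrs[name] : null),
});

// Collapse runs of whitespace to single spaces and trim the ends.
const normalize = (s) => s.replace(/\s+/g, " ").trim();

// Pick a value for an element, checking attributes in priority order.
function valueFromElement(el) {
  const tag = (el.tagName || "").toLowerCase();
  if (el.hasAttribute("content")) return el.getAttribute("content");
  if (/^(a|area|link)$/.test(tag)) return el.getAttribute("href") || "";
  if (/^(img|audio|video|source|track)$/.test(tag)) return el.getAttribute("src") || "";
  if (tag === "time") return el.getAttribute("datetime") || normalize(el.textContent || "");
  if (tag === "data" || tag === "meter") return el.getAttribute("value") || normalize(el.textContent || "");
  return normalize(el.textContent || "");
}

console.log(valueFromElement(fakeEl("A", { href: "/about" }, "About us")));
console.log(valueFromElement(fakeEl("TIME", {}, "  15 January\n 2025 ")));
```

Because the content attribute is checked first, an element such as a span with both visible text and a content attribute yields the attribute value, not the text.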

3. JSON-LD Collection (`collectJSONLD`)

Finds all <script type="application/ld+json"> tags
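Parses each script's text as JSON
If parsing fails, strips control characters and retries; if that also fails, the entry's data holds __parseError and __raw fields instead
Returns an array with one entry per script tag found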
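
The parse-then-retry behavior can be shown in isolation. The sketch below mirrors the `safeParse` helper in the bookmarklet source, written out in readable form; the sample inputs are made up for illustration.

```javascript
// Parse JSON text; on failure, strip ASCII control characters and retry once.
// If both attempts fail, return a marker object instead of throwing.
function safeParse(text) {
  try {
    return JSON.parse(text);
  } catch (firstError) {
    try {
      return JSON.parse(text.replace(/[\u0000-\u001F]/g, ""));
    } catch {
      return { __parseError: String(firstError), __raw: text };
    }
  }
}

// A raw tab inside a string literal is invalid JSON, so the first attempt
// fails and the cleaned retry succeeds (the tab is dropped from the value).
console.log(safeParse('{"name":"Ada\tLovelace"}'));

// Missing closing brace: both attempts fail, so the marker object comes back.
console.log(safeParse('{"name":"Ada"'));
```

Note that the cleanup range also covers tabs and newlines, so any of those that sit inside string values are removed on the retry.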

4. Microdata Collection (`collectMicro`)

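Scans for HTML5 microdata markup ([itemscope])
Starts from top-level items; a nested scope that is itself an itemprop is not listed separately
Recursively serializes nested items as the value of their parent property
Captures itemtype and itemid attributes
Attaches each itemprop value to its nearest enclosing item; if a name repeats within an item, the last value wins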
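
The scoping rule is easier to see on a small tree. The sketch below is a simplified model, not the bookmarklet's code: it uses a hypothetical plain-object node shape (`attrs`, `text`, `children`) instead of the DOM, reads text only rather than the full value lookup, and the names `node` and `serializeItem` are illustrative.

```javascript
// Hypothetical minimal node shape standing in for DOM elements:
// { attrs: {name: value}, text: "string", children: [nodes] }
const node = (attrs, text = "", children = []) => ({ attrs, text, children });

// Serialize one itemscope node. Properties belong to the nearest enclosing
// itemscope, so the walk does not descend past a nested itemscope; a nested
// scope that carries an itemprop is serialized recursively as that value.
function serializeItem(root) {
  const item = {
    __type: "Microdata",
    itemtype: root.attrs.itemtype ?? null,
    itemid: root.attrs.itemid ?? null,
    props: {},
  };
  const walk = (parent) => {
    for (const child of parent.children) {
      const isScope = "itemscope" in child.attrs;
      if ("itemprop" in child.attrs) {
        item.props[child.attrs.itemprop] = isScope ? serializeItem(child) : child.text.trim();
      }
      if (!isScope) walk(child); // nested scopes own their descendants
    }
  };
  walk(root);
  return item;
}

const product = node({ itemscope: "", itemtype: "https://schema.org/Product" }, "", [
  node({ itemprop: "name" }, "Widget"),
  node({ itemprop: "offers", itemscope: "", itemtype: "https://schema.org/Offer" }, "", [
    node({ itemprop: "price" }, "99.99"),
  ]),
]);
console.log(JSON.stringify(serializeItem(product), null, 2));
```

In the output, price appears only inside the nested offers item and not on the Product itself.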

5. RDFa Collection (`collectRDFa`)

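Finds every element with a property attribute
Groups properties by subject, taken from the about or resource attribute of the nearest enclosing element with typeof, about, or resource; properties with no such ancestor are grouped under __page__, and a host with neither attribute gets a generated __anon_N subject (as written, each such property receives its own N)
Collects repeated properties for the same subject into an array
Records the typeof value of the nearest enclosing typed element
Because Open Graph meta tags also use the property attribute, they appear in this section as well, typically under __page__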
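
Handling of repeated properties comes down to a three-way branch: store the first value as-is, promote to an array on the second, and append after that. A minimal sketch of that branch follows; the helper name `addProp` and the sample values are hypothetical, while the logic follows the bookmarklet source.

```javascript
// Add a value under a property name, promoting to an array on repeats.
function addProp(props, name, value) {
  if (props[name] === undefined) props[name] = value;
  else if (Array.isArray(props[name])) props[name].push(value);
  else props[name] = [props[name], value];
  return props;
}

const props = {};
addProp(props, "schema:headline", "Blog Post Title");
addProp(props, "schema:keywords", "seo");
addProp(props, "schema:keywords", "schema");
addProp(props, "schema:keywords", "rdfa");
console.log(props);
```

A property that occurs once stays a plain string, so consumers of the output need to accept either a string or an array for any given key.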

6. Open Graph & Twitter Cards (`collectOGT`)

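Extracts meta tags whose property attribute starts with og: (Open Graph)
Extracts meta tags whose name attribute starts with twitter: (Twitter Cards)
Lowercases the keys and stores them in two separate objects, opengraph and twitter; if a key repeats (for example several og:image tags), the last value wins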
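
The split can be sketched without a DOM by treating each meta tag as a plain object. The function name `splitSocialMeta` and the sample entries are hypothetical; the selection rule (property for og:, name for twitter:) and the lowercased keys follow the bookmarklet source.

```javascript
// Each entry stands in for a <meta> element's attributes.
function splitSocialMeta(metas) {
  const opengraph = {};
  const twitter = {};
  for (const { property = "", name = "", content = "" } of metas) {
    const isOg = property.toLowerCase().startsWith("og:");
    const isTw = name.toLowerCase().startsWith("twitter:");
    if (!isOg && !isTw) continue; // not a social meta tag
    const key = (property || name).toLowerCase();
    (key.startsWith("og:") ? opengraph : twitter)[key] = content;
  }
  return { opengraph, twitter };
}

const social = splitSocialMeta([
  { property: "og:title", content: "Page Title" },
  { property: "og:image", content: "https://example.com/a.jpg" },
  { property: "og:image", content: "https://example.com/b.jpg" },
  { name: "twitter:card", content: "summary" },
  { name: "description", content: "ignored" },
]);
console.log(social);
```

Only the second og:image survives, because each key holds a single string. Tags that put a twitter: key in the property attribute instead of name are not picked up by this rule.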

7. UI Panel Creation

Builds an overlay panel with:
- Header with title and statistics counter
- Control buttons (Copy, Download, Collapse, Close)
- Checkboxes for filtering options (compact, JSON-LD only)
- Large textarea displaying JSON output

8. Rendering & Interaction

Updates textarea based on selected options:
- Pretty-prints JSON for readability
- Minifies when “compact” is checked
- Filters to JSON-LD only when requested
Event listeners handle button clicks and checkbox changes
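
The two checkboxes map directly onto two choices: which object to serialize and whether to indent it. A minimal sketch, assuming a hypothetical function name `renderOutput` and a made-up sample object:

```javascript
// Build the textarea text from the collected data and the two options.
function renderOutput(data, { onlyLD = false, compact = false } = {}) {
  const view = onlyLD ? { jsonld: data.jsonld } : data;
  return compact ? JSON.stringify(view) : JSON.stringify(view, null, 2);
}

const sample = {
  jsonld: [{ __type: "JSON-LD" }],
  microdata: [],
  rdfa: [],
  opengraph: {},
  twitter: {},
};

console.log(renderOutput(sample));                                  // pretty, all sections
console.log(renderOutput(sample, { onlyLD: true, compact: true })); // one line, JSON-LD only
```

Since Copy and Download both read the textarea, whatever options are active at the time also determine what gets copied or saved.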

Data Extraction Examples

JSON-LD Example

{
  "__type": "JSON-LD",
  "source": "<script>",
  "data": {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Article Title",
    "author": "Author Name"
  }
}

Microdata Example

{
  "__type": "Microdata",
  "itemtype": "https://schema.org/Product",
  "itemid": "product-123",
  "props": {
    "name": "Product Name",
    "price": "99.99",
    "description": "Product description here"
  }
}

RDFa Example

{
  "__type": "RDFa",
  "subject": "https://example.com/blog/post",
  "typeof": "schema:BlogPosting",
  "props": {
    "schema:headline": "Blog Post Title",
    "schema:datePublished": "2025-01-15"
  }
}

Open Graph Example

{
  "og:title": "Page Title",
  "og:description": "Page description",
  "og:image": "https://example.com/image.jpg",
  "og:url": "https://example.com/page"
}

UI Components Explained

Header

Title: “Structured Data Finder”
Statistics Counter: Shows counts of each schema type found
- Example: “JSON-LD: 3 · Microdata: 1 · RDFa: 0 · OG: 5 · Twitter: 2”

Buttons

Copy JSON (Green)
- Copies textarea content to clipboard
- Shows “Copied ✓” confirmation for 1.2 seconds
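- If the clipboard write is rejected, falls back to a prompt containing the text; if the Clipboard API is missing altogether (for example on non-HTTPS pages), the click does nothing, so copy from the textarea by hand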
Download (Blue)
- Creates and downloads a .json file
- Filename format: schema-data-{hostname}.json
- Example: schema-data-example.com.json
Collapse/Expand (Gray)
- Hides/shows the main panel body to save screen space
- Button text changes between “Collapse” and “Expand”
- Header remains visible for quick re-expansion
Close (Gray)
- Removes the entire panel from the page
- Run the bookmarklet again to re-open it
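
The bookmarklet's copy handler uses optional chaining on navigator.clipboard, so when that object is missing the whole call chain is skipped and no fallback runs. One possible way to cover both cases is sketched below. This is a suggested variation, not the bookmarklet's code: `copyWithFallback` is a hypothetical helper, and the clipboard is passed in as a parameter so the example runs outside a browser.

```javascript
// clipboard: an object with writeText(text) returning a Promise, or undefined.
// fallback:  called with the text whenever the clipboard cannot be used.
// Resolves to true if the clipboard write succeeded, false otherwise.
function copyWithFallback(text, clipboard, fallback) {
  if (!clipboard || typeof clipboard.writeText !== "function") {
    fallback(text); // API missing, e.g. on a non-HTTPS page
    return Promise.resolve(false);
  }
  return clipboard.writeText(text).then(
    () => true,
    () => {
      fallback(text); // write rejected, e.g. permission denied
      return false;
    }
  );
}

// Demo with stand-in clipboards.
copyWithFallback("hello", undefined, (t) => console.log("fallback:", t));
copyWithFallback(
  "hello",
  { writeText: () => Promise.reject(new Error("denied")) },
  (t) => console.log("fallback after rejection:", t)
);
```

In a browser the call would look like copyWithFallback(out.value, navigator.clipboard, (t) => prompt("Copy this:", t)), where out is the panel's textarea.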

Checkboxes

Compact
- When checked: outputs minified JSON (single line)
- When unchecked: pretty-prints JSON with indentation
- Useful for copying to APIs or reducing file size
JSON-LD Only
- When checked: shows only JSON-LD schemas
- When unchecked: shows all schema types
- Helpful when you only need JSON-LD data

Textarea

Large scrollable area displaying the JSON data
Monospace font for readability
Dark theme to reduce eye strain
Full-width with 56vh height (adjustable with resize handle)

Use Cases

1. SEO Audits

Verify that structured data is correctly implemented
Check if schema markup matches content
Identify missing schema types

2. Schema Validation

Export structured data for validation tools
Check for JSON parsing errors
Verify schema completeness

3. Competitive Analysis

Analyze competitor website schema implementations
Understand their structured data strategy
Identify best practices in your industry

4. Development & Testing

Verify markup during development
Re-run the bookmarklet after editing the DOM to check schema changes without reloading the page
Debug schema implementation issues

5. Data Integration

Extract structured data for integration into other systems
Validate data format before processing
Generate JSON for APIs

6. Content Management

Track structured data across website pages
Ensure consistency across content
Monitor Open Graph metadata

Technical Details

Error Handling

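Malformed JSON: Strips control characters and re-parses; if that fails, the output includes __parseError and __raw for that script
Missing Attributes: Returns null or empty strings gracefully
Nested Scopes: Scope checking keeps a nested item's properties from being repeated on its parent
No Data Found: Shows the message “// No structured data found on this page.” in the textarea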

Performance

Runs entirely client-side (no server requests)
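Collects everything with a handful of querySelectorAll calls
Keeps only the extracted data and one panel element in memory
Typically returns results immediately; pages with very heavy markup can take longer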

Browser Compatibility

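Works in current versions of Chrome, Firefox, Safari, and Edge
Uses ES2020 syntax (optional chaining, optional catch binding)
Uses the Clipboard API for copying, which browsers expose only on secure (HTTPS) pages
Falls back to a prompt if the clipboard write is rejected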

Security

No external requests or tracking
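Runs in the page's own JavaScript context, like any bookmarklet; it is not sandboxed
No data transmission
Only reads the DOM and adds one panel element, which is removed when you click Close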

Installation

Create a new bookmark in your browser
In the URL field, paste the entire bookmarklet code, starting with javascript:
Name it “Schema Sniffer” or preferred name
Save the bookmark
Visit any webpage and click the bookmark to extract structured data

Tips & Tricks

Quick Copy: Use “Compact” + “Copy JSON” for quick API integration
Sharing Results: Use “Download” to share extracted data with team members
Filtering: Check “JSON-LD Only” if page has too many schema types
Multiple Runs: Each run removes previous panel, so you can start fresh
Large Pages: Collapse the panel if it obscures important page content

Output File Format

When downloading, the file contains exactly what the textarea shows. With both checkboxes unchecked, that is:

{
  "jsonld": [...],
  "microdata": [...],
  "rdfa": [...],
  "opengraph": {...},
  "twitter": {...}
}

Each section contains the extracted data organized by type.
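
As one way to consume a downloaded file, the sketch below lists the @type values found in the jsonld section. The function name `listJsonLdTypes` and the sample object are hypothetical; the sketch assumes the file layout shown above and follows @graph arrays, which is a standard JSON-LD keyword.

```javascript
// Collect @type values from the "jsonld" section of an exported file.
function listJsonLdTypes(exported) {
  const types = [];
  const visit = (value) => {
    if (Array.isArray(value)) {
      value.forEach(visit);
    } else if (value && typeof value === "object") {
      // @type may be a single string or an array of strings.
      if (value["@type"] !== undefined) types.push(...[].concat(value["@type"]));
      if (value["@graph"] !== undefined) visit(value["@graph"]);
    }
  };
  for (const entry of exported.jsonld || []) visit(entry.data);
  return types;
}

const exportedSample = {
  jsonld: [
    { __type: "JSON-LD", source: "<script>", data: { "@type": "Article" } },
    {
      __type: "JSON-LD",
      source: "<script>",
      data: { "@graph": [{ "@type": "WebSite" }, { "@type": ["Person", "Author"] }] },
    },
  ],
};
console.log(listJsonLdTypes(exportedSample));
```

Entries that failed to parse carry __parseError and __raw instead of schema data, so they contribute no types here.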

Troubleshooting

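Q: Nothing shows up when I click the bookmarklet
A: If the panel opens with the message “No structured data found on this page”, the page has no structured data. If no panel appears at all, check that the bookmark URL still begins with javascript: and that the full code was pasted; some sites' Content Security Policy may also block bookmarklets in certain browsers.

Q: Copy button shows “Copied” but nothing is pasted
A: “Copied ✓” only appears after the browser reports a successful clipboard write, so try pasting again. If the write is rejected, a fallback prompt appears; select all and copy from there. If the button does nothing at all, the Clipboard API is unavailable on that page (it requires HTTPS), so select the textarea contents and copy them manually.

Q: The panel is covering important content
A: Click the “Collapse” button to hide the panel body, or “Close” to remove it. The panel is fixed in place and cannot be dragged.

Q: I got a parse error in JSON
A: The bookmarklet retries after stripping control characters. If parsing still fails, the entry's data contains __parseError and __raw fields; inspect __raw to find the malformed JSON in the page's script tag.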

Summary

Schema Sniffer is an essential tool for anyone working with web content, SEO, or structured data. It provides instant visibility into all the hidden metadata on any webpage, making it invaluable for analysis, validation, and development workflows.