Schema Sniffer – Structured Data Finder & JSON Extractor

Mudos Digital Mudos Digital
12 min read

Every webpage contains hidden metadata—structured data tucked inside JSON-LD scripts, microdata attributes, RDFa markup, and social meta tags. This data is invisible in the browser but controls how search engines rank your content, how social platforms display your links, and how web applications understand your pages. Schema Sniffer is a one-click bookmarklet that exposes all of it. No more digging through HTML source code or using clunky online validators. Just click the bookmarklet and see exactly what structured data is on the page: all JSON-LD schemas, microdata items, RDFa triples, Open Graph tags, and Twitter Card metadata in one organized panel. Perfect for SEO audits, competitive analysis, schema debugging, or verifying your markup is correct. Extract the data, copy it, download it as JSON—all from your browser in seconds.

A screenshot of the Schema Sniffer bookmarklet running on a music website showing a dark-themed overlay panel on the right side of the screen. The panel displays extracted structured data in JSON format with statistics showing "JSON-LD: 2 · Microdata: 0 · RDFa: 1 · OG: 7 · Twitter: 3" in the header. The main page content shows a music track titled "Borderline" by Tame Impala with album artwork, singer information, release date, genre tags, language options, and playback controls visible on the left side of the screen. Control buttons in the bookmarklet include green "Copy JSON" and blue "Download" buttons, along with gray "Collapse" and "Close" buttons. A textarea displays the raw JSON output containing MusicRecording schema data with properties like name, artist, album, and image URLs.
Schema Sniffer in action on Arcuras.com – The bookmarklet panel displays all structured data extracted from a music track page, including 2 JSON-LD schemas, 1 Microdata item, 1 RDFa triple, and 7 Open Graph tags, with options to copy, download, collapse, or filter by schema type.
javascript:(()=>{const id="__schemaSniffer__";try{document.getElementById(id)?.remove();const by=Array.from,txt=(n)=>n.textContent||"",trim=(s)=>s.replace(/\s+/g," ").trim();const safeParse=(s)=>{try{return JSON.parse(s)}catch(e){try{return JSON.parse(s.replace(/[\u0000-\u001F]/g,""))}catch{return{__parseError:String(e),__raw:s}}}};const valFromEl=(el)=>{const tn=(el.tagName||"").toLowerCase();if(el.hasAttribute("content"))return el.getAttribute("content");if(tn==="meta")return el.getAttribute("content")||"";if(/^(a|area|link)$/i.test(tn))return el.getAttribute("href")||"";if(/^(img|audio|video|source|track)$/i.test(tn))return el.getAttribute("src")||"";if(tn==="time")return el.getAttribute("datetime")||trim(txt(el));if(tn==="data"||tn==="meter")return el.getAttribute("value")||trim(txt(el));return trim(txt(el))};const isDescOf=(n,root)=>{let p=n.parentElement;while(p&&p!==root){if(p.hasAttribute("itemscope"))return false;p=p.parentElement}return true};const serMicro=(root)=>{const item={__type:"Microdata",itemtype:root.getAttribute("itemtype")||null,itemid:root.getAttribute("itemid")||null,props:{}};by(root.querySelectorAll("[itemprop]")).forEach(p=>{if(!isDescOf(p,root))return;const name=p.getAttribute("itemprop");if(p.hasAttribute("itemscope")){item.props[name]=serMicro(p)}else{item.props[name]=valFromEl(p)}});return item};const collectJSONLD=()=>by(document.querySelectorAll(%27script[type="application/ld+json" i]%27)).map(s=>({__type:"JSON-LD",source:"<script>",data:safeParse(s.textContent||"")}));const collectMicro=()=>by(document.querySelectorAll("[itemscope]")).filter(el=>!el.hasAttribute("itemprop")||!el.parentElement?.closest("[itemscope]")).map(serMicro);const collectRDFa=()=>{const map=new Map();const subjOf=(el)=>{const host=el.closest("[typeof],[about],[resource]");if(!host)return"__page__";return host.getAttribute("about")||host.getAttribute("resource")||("__anon_"+(map.size+1))};const typeOf=(el)=>{const h=el.closest("[typeof]");return h?h.getAttribute("typeof"):null};by(document.querySelectorAll("[property]")).forEach(el=>{const subj=subjOf(el);if(!map.has(subj))map.set(subj,{__type:"RDFa",subject:subj,typeof:typeOf(el),props:{}});const it=map.get(subj);const prop=el.getAttribute("property");const val=valFromEl(el);if(it.props[prop]===undefined)it.props[prop]=val;else if(Array.isArray(it.props[prop]))it.props[prop].push(val);else it.props[prop]=[it.props[prop],val]});return [...map.values()]};const collectOGT=()=>{const og={},tw={};by(document.querySelectorAll(%27meta[property^="og:" i],meta[name^="twitter:" i]%27)).forEach(m=>{const p=m.getAttribute("property");const n=m.getAttribute("name");const k=(p||n||"").toLowerCase();const v=m.getAttribute("content")||"";(k.startsWith("og:")?og:tw)[k]=v});return{opengraph:og,twitter:tw}};const data={jsonld:collectJSONLD(),microdata:collectMicro(),rdfa:collectRDFa(),...collectOGT()};const pretty=()=>JSON.stringify(data,null,2);const panel=document.createElement("div");panel.id=id;panel.style.cssText="position:fixed;inset:auto 12px 12px auto;top:12px;z-index:2147483647;max-width:720px;width:720px;max-height:80vh;background:#0f1115;color:#fff;border-radius:12px;box-shadow:0 12px 28px rgba(0,0,0,.45);font:13px/1.45 system-ui,Segoe UI,Roboto,Arial,sans-serif;overflow:hidden;border:1px solid #2a2f3a";const counts=%60JSON-LD: ${data.jsonld.length} · Microdata: ${data.microdata.length} · RDFa: ${data.rdfa.length} · OG: ${Object.keys(data.opengraph).length} · Twitter: ${Object.keys(data.twitter).length}%60;panel.innerHTML=%60<div style="display:flex;gap:8px;align-items:center;padding:10px 12px;background:#11151c;border-bottom:1px solid #232936"><strong style="font-size:13px">Structured Data Finder</strong><span style="opacity:.75">(${counts})</span><span style="flex:1"></span><button id="__copy" style="padding:6px 10px;border-radius:8px;background:#2a6;color:#fff;border:0;cursor:pointer">Copy JSON</button><button id="__save" style="padding:6px 10px;border-radius:8px;background:#2763c4;color:#fff;border:0;cursor:pointer">Download</button><button id="__min" style="padding:6px 10px;border-radius:8px;background:#444;color:#fff;border:0;cursor:pointer">Collapse</button><button id="__close" style="padding:6px 10px;border-radius:8px;background:#444;color:#fff;border:0;cursor:pointer">Close</button></div><div id="__body" style="padding:10px 12px"><div style="display:flex;gap:8px;margin:6px 0 10px 0;flex-wrap:wrap"><label style="opacity:.85"><input type="checkbox" id="__compact"> compact</label><label style="opacity:.85"><input type="checkbox" id="__onlyLD"> JSON-LD only</label></div><textarea id="__out" spellcheck="false" style="width:100%;height:56vh;resize:vertical;background:#0b0d11;color:#d7dde9;border:1px solid #273041;border-radius:8px;padding:10px;font-family:ui-monospace,SFMono-Regular,Consolas,monospace"></textarea></div>%60;document.body.appendChild(panel);const out=panel.querySelector("#__out"),btnC=panel.querySelector("#__copy"),btnS=panel.querySelector("#__save"),btnM=panel.querySelector("#__min"),body=panel.querySelector("#__body");const render=()=>{const onlyLD=panel.querySelector("#__onlyLD").checked;const compact=panel.querySelector("#__compact").checked;const d=onlyLD?{jsonld:data.jsonld}:data;out.value=compact?JSON.stringify(d):JSON.stringify(d,null,2)};panel.querySelector("#__onlyLD").onchange=render;panel.querySelector("#__compact").onchange=render;render();panel.querySelector("#__close").onclick=()=>panel.remove();btnM.onclick=()=>{if(body.style.display!=="none"){body.style.display="none";btnM.textContent="Expand"}else{body.style.display="";btnM.textContent="Collapse"}};btnC.onclick=()=>{navigator.clipboard?.writeText(out.value).then(()=>{btnC.textContent="Copied ✓";setTimeout(()=>btnC.textContent="Copy JSON",1200)}).catch(()=>{prompt("Copy this:",out.value)})};btnS.onclick=()=>{const blob=new Blob([out.value],{type:"application/json"});const a=document.createElement("a");a.href=URL.createObjectURL(blob);a.download=%60schema-data-${(location.hostname||"site")}.json%60;a.click();setTimeout(()=>URL.revokeObjectURL(a.href),1500)};if(!data.jsonld.length&&!data.microdata.length&&!data.rdfa.length&&Object.keys(data.opengraph).length===0&&Object.keys(data.twitter).length===0){out.value="// No structured data found on this page."}}catch(e){console.error(e);alert("Something went wrong: "+e)}})();

Description

A powerful bookmarklet tool that instantly extracts, analyzes, and displays all structured data (JSON-LD, Microdata, RDFa, Open Graph, and Twitter Card metadata) from any webpage. Perfect for SEO audits, schema validation, and structured data analysis.


What It Does

Schema Sniffer is a bookmarklet that runs in your browser and automatically detects and extracts all structured data markup from the current webpage. It supports multiple schema formats and presents them in an organized, easy-to-read interface.

Key Features:

  1. Multiple Schema Format Support
    • JSON-LD: JavaScript Object Notation for Linked Data (most common)
    • Microdata: HTML5 itemscope/itemprop attributes
    • RDFa: Resource Description Framework in Attributes
    • Open Graph (OG): Facebook and social media metadata
    • Twitter Cards: Twitter-specific metadata
  2. Real-time Data Extraction
    • Automatically scans the entire page for structured data
    • Counts and displays statistics for each schema type
    • Shows total number of each data type found
  3. Interactive UI Panel
    • Fixed position overlay that doesn’t interfere with page content
    • Clean, dark-themed interface for easy reading
    • Displays raw JSON output in textarea
  4. User-Friendly Options
    • Copy JSON Button: Instantly copy all extracted data to clipboard
    • Download Button: Save extracted data as a .json file
    • Collapse/Expand Button: Minimize the panel to reduce screen clutter
    • Close Button: Remove the panel from the page
    • Compact Checkbox: Toggle between formatted (pretty) and minified JSON
    • JSON-LD Only Checkbox: Filter to show only JSON-LD schemas
  5. Smart Data Extraction
    • Extracts values from HTML attributes (href, src, content, datetime, value)
    • Handles nested structures properly
    • Parses malformed JSON with character cleanup
    • Prevents parsing errors from breaking the tool

How It Works Technically

Step-by-Step Process:

1. Initialization

Removes any existing panel if tool was run multiple times
Creates internal functions for data extraction and parsing

2. Value Extraction (valFromEl)

  • Reads content from HTML elements smartly:
    • Meta tags: extracts content attribute
    • Links: extracts href attribute
    • Media (img/video/audio): extracts src attribute
    • Time elements: extracts datetime attribute
    • Default: extracts text content
  • Trims whitespace and normalizes output

3. JSON-LD Collection (collectJSONLD)

  • Finds all <script type="application/ld+json"> tags
  • Safely parses JSON content
  • Handles malformed JSON gracefully
  • Returns array of all JSON-LD scripts found

4. Microdata Collection (collectMicro)

  • Scans for HTML5 microdata markup ([itemscope])
  • Respects hierarchy – doesn’t extract nested scopes
  • Recursively processes nested microdata items
  • Captures itemtype and itemid attributes
  • Extracts all itemprop values

5. RDFa Collection (collectRDFa)

  • Identifies RDFa triples using [property] attributes
  • Groups properties by subject (about/resource/anonymous)
  • Handles multiple values for same property
  • Preserves RDFa types and relationships

6. Open Graph & Twitter Cards (collectOGT)

  • Extracts meta tags with og: prefix (Open Graph)
  • Extracts meta tags with twitter: prefix (Twitter Cards)
  • Separates them into distinct objects

7. UI Panel Creation

  • Builds an overlay panel with:
    • Header with title and statistics counter
    • Control buttons (Copy, Download, Collapse, Close)
    • Checkboxes for filtering options (compact, JSON-LD only)
    • Large textarea displaying JSON output

8. Rendering & Interaction

  • Updates textarea based on selected options:
    • Pretty-prints JSON for readability
    • Minifies when “compact” is checked
    • Filters to JSON-LD only when requested
  • Event listeners handle button clicks and checkbox changes

Data Extraction Examples

JSON-LD Example

{
  "__type": "JSON-LD",
  "source": "<script>",
  "data": {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "Article Title",
    "author": "Author Name"
  }
}

Microdata Example

{
  "__type": "Microdata",
  "itemtype": "https://schema.org/Product",
  "itemid": "product-123",
  "props": {
    "name": "Product Name",
    "price": "99.99",
    "description": "Product description here"
  }
}

RDFa Example

{
  "__type": "RDFa",
  "subject": "__page__",
  "typeof": "schema:BlogPosting",
  "props": {
    "schema:headline": "Blog Post Title",
    "schema:datePublished": "2025-01-15"
  }
}

Open Graph Example

{
  "og:title": "Page Title",
  "og:description": "Page description",
  "og:image": "https://example.com/image.jpg",
  "og:url": "https://example.com/page"
}

UI Components Explained

Header

  • Title: “Structured Data Finder”
  • Statistics Counter: Shows counts of each schema type found
    • Example: “JSON-LD: 3 · Microdata: 1 · RDFa: 0 · OG: 5 · Twitter: 2”

Buttons

  1. Copy JSON (Green)
    • Copies textarea content to clipboard
    • Shows “Copied ✓” confirmation for 1.2 seconds
    • Falls back to prompt if clipboard unavailable
  2. Download (Blue)
    • Creates and downloads a .json file
    • Filename format: schema-data-{hostname}.json
    • Example: schema-data-example.com.json
  3. Collapse/Expand (Gray)
    • Hides/shows the main panel body to save screen space
    • Button text changes between “Collapse” and “Expand”
    • Header remains visible for quick re-expansion
  4. Close (Gray)
    • Removes the entire panel from the page
    • Run the bookmarklet again to re-open it

Checkboxes

  1. Compact
    • When checked: outputs minified JSON (single line)
    • When unchecked: pretty-prints JSON with indentation
    • Useful for copying to APIs or reducing file size
  2. JSON-LD Only
    • When checked: shows only JSON-LD schemas
    • When unchecked: shows all schema types
    • Helpful when you only need JSON-LD data

Textarea

  • Large scrollable area displaying the JSON data
  • Monospace font for readability
  • Dark theme to reduce eye strain
  • Full-width with 56vh height (adjustable with resize handle)

Use Cases

1. SEO Audits

  • Verify that structured data is correctly implemented
  • Check if schema markup matches content
  • Identify missing schema types

2. Schema Validation

  • Export structured data for validation tools
  • Check for JSON parsing errors
  • Verify schema completeness

3. Competitive Analysis

  • Analyze competitor website schema implementations
  • Understand their structured data strategy
  • Identify best practices in your industry

4. Development & Testing

  • Verify markup during development
  • Test schema changes without page reload
  • Debug schema implementation issues

5. Data Integration

  • Extract structured data for integration into other systems
  • Validate data format before processing
  • Generate JSON for APIs

6. Content Management

  • Track structured data across website pages
  • Ensure consistency across content
  • Monitor Open Graph metadata

Technical Details

Error Handling

  • Malformed JSON: Attempts to clean control characters and re-parse
  • Missing Attributes: Returns null or empty strings gracefully
  • Circular References: Prevented by scope checking
  • No Data Found: Shows helpful message “No structured data found on this page.”

Performance

  • Runs entirely client-side (no server requests)
  • Efficient DOM traversal and filtering
  • Minimal memory footprint
  • Instant results even on large pages

Browser Compatibility

  • Works on all modern browsers (Chrome, Firefox, Safari, Edge)
  • Uses ES6+ syntax
  • Supports Clipboard API for copying
  • Falls back to prompt if clipboard unavailable

Security

  • No external requests or tracking
  • Runs in isolated execution context
  • No data transmission
  • Safe to use on any website

Installation

  1. Create a new bookmark in your browser
  2. In the URL field, paste the entire bookmarklet code
  3. Name it “Schema Sniffer” or preferred name
  4. Save the bookmark
  5. Visit any webpage and click the bookmark to extract structured data

Tips & Tricks

  • Quick Copy: Use “Compact” + “Copy JSON” for quick API integration
  • Sharing Results: Use “Download” to share extracted data with team members
  • Filtering: Check “JSON-LD Only” if page has too many schema types
  • Multiple Runs: Each run removes previous panel, so you can start fresh
  • Large Pages: Collapse the panel if it obscures important page content

Output File Format

When downloading, the file includes:

{
  "jsonld": [...],
  "microdata": [...],
  "rdfa": [...],
  "opengraph": {...},
  "twitter": {...}
}

Each section contains the extracted data organized by type.


Troubleshooting

Q: Nothing shows up when I click the bookmarklet A: The page might not have any structured data. The message “No structured data found on this page” will be displayed.

Q: Copy button shows “Copied” but nothing is pasted A: Your browser’s clipboard API might be restricted. The fallback prompt will appear – select all and copy from there.

Q: The panel is covering important content A: Click the “Collapse” button to hide the panel body, or drag the panel using its header.

Q: I got a parse error in JSON A: The bookmarklet tries to clean the JSON automatically. If still failing, check if the page has malformed script tags.


Summary

Schema Sniffer is an essential tool for anyone working with web content, SEO, or structured data. It provides instant visibility into all the hidden metadata on any webpage, making it invaluable for analysis, validation, and development workflows.

Share this Post

Let's Build Something Great Together

Have a project in mind or just want to say hello? We'd love to hear from you. Fill out the form and we'll get back to you as soon as possible.

Fast response times
Free project consultation

Or Contact Us Directly