Enrichment Skill — Understanding Workbench

What It Does

The Enrichment skill assigns structured metadata to each document in a corpus. For every document, it determines who the content is for, what problems it addresses, what key ideas it covers, and what kind of content it is. It also computes related articles using TF-IDF cosine similarity.
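The related-articles step can be pictured with a small pure-Python sketch of TF-IDF cosine similarity. This is an illustration under stated assumptions, not the enrichment script's actual code: the function names, the simple tokenizer, and the smoothed IDF formula are all hypothetical choices.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on non-letters (a deliberately simple tokenizer)."""
    return re.findall(r"[a-z]+", text.lower())


def tfidf_vectors(docs):
    """Return one {term: weight} dict per document.

    Assumption: smoothed IDF, log((1 + N) / (1 + df)) + 1, so terms that
    appear in every document still get a small positive weight.
    """
    tokenized = [tokenize(d) for d in docs]
    n_docs = len(docs)
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))
    vectors = []
    for tokens in tokenized:
        counts = Counter(tokens)
        vectors.append({
            term: count * (math.log((1 + n_docs) / (1 + doc_freq[term])) + 1)
            for term, count in counts.items()
        })
    return vectors


def cosine(a, b):
    """Cosine similarity of two sparse vectors stored as dicts."""
    dot = sum(weight * b.get(term, 0.0) for term, weight in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def related(docs, index, top_n=3):
    """Indices of the top_n documents most similar to docs[index]."""
    vectors = tfidf_vectors(docs)
    scores = [
        (cosine(vectors[index], vec), i)
        for i, vec in enumerate(vectors)
        if i != index
    ]
    # Highest similarity first; ties broken by document order.
    scores.sort(key=lambda pair: (-pair[0], pair[1]))
    return [i for _, i in scores[:top_n]]
```

Calling related(docs, 0) on a list of document texts returns the positions of the nearest neighbors of the first document. For larger corpora, a sparse-matrix library would likely be a better fit than dicts.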

This is the Classification step of the Librarian pipeline — the work that transforms a pile of documents into a browsable, filterable, targetable collection.

Classification Facets

Each document is classified along five dimensions. Following Ranganathan's tradition of faceted classification, the facets are independent: a document can be reached through any one of them.
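One way to picture facet independence is an inverted index per facet, so that any facet alone (or any combination) can retrieve documents. The sketch below is illustrative only: the record layout and the facet names used in the usage note (persona, format, concepts) are hypothetical, and the code works for any number of facets.

```python
from collections import defaultdict


def build_facet_index(records):
    """Map facet -> value -> set of document ids.

    Each record is a dict with an "id" key; every other key is treated as a
    facet whose value is a string or a list of strings.
    """
    index = defaultdict(lambda: defaultdict(set))
    for record in records:
        for facet, values in record.items():
            if facet == "id":
                continue
            if isinstance(values, str):
                values = [values]
            for value in values:
                index[facet][value].add(record["id"])
    return index


def lookup(index, **filters):
    """Ids of documents matching every facet=value filter given."""
    result = None
    for facet, value in filters.items():
        ids = index.get(facet, {}).get(value, set())
        result = set(ids) if result is None else result & ids
    return result if result is not None else set()
```

For example, lookup(index, persona="editor") and lookup(index, format="how-to") each work on their own, and lookup(index, persona="editor", format="how-to") intersects the two.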

Two Approaches

LLM-Assisted Classification

Claude reads each document and assigns metadata based on its content. This approach is most accurate for small corpora.

Best for: New corpora, < 100 docs, exploring what categories make sense

Script-Based Classification

When personas and vocabularies are already established, use the enrichment script to apply them systematically with computed similarity.

Best for: Established schemas, repeatable pipeline, > 50 docs

How to Use

Via Claude:

"I have a collection of [N] documents. Please classify each one by
audience persona, key concepts, and pain points addressed."

Claude will read each document and propose classifications. Review and refine the proposals, then save them as an enrichments JSON file.
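Before saving, it can help to check the reviewed classifications against the vocabularies you have settled on. The sketch below assumes a hypothetical enrichments layout (a JSON object mapping each document name to its facet values) and made-up vocabulary values; the schema that enrich_corpus.py actually expects is not described here, so adapt the keys to match it.

```python
import json

# Hypothetical controlled vocabularies; replace with your own.
VOCAB = {
    "persona": {"editor", "engineer", "executive"},
    "format": {"how-to", "thought-leadership", "reference"},
}


def validate_enrichments(enrichments, vocab=VOCAB):
    """Return a list of human-readable problems; empty means all values are known."""
    problems = []
    for doc_id, facets in enrichments.items():
        for facet, allowed in vocab.items():
            value = facets.get(facet)
            if value is None:
                problems.append(f"{doc_id}: missing {facet}")
            elif value not in allowed:
                problems.append(f"{doc_id}: unknown {facet} {value!r}")
    return problems


def save_enrichments(enrichments, path):
    """Write enrichments as indented UTF-8 JSON."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(enrichments, handle, indent=2, ensure_ascii=False)
```

A typical flow would be to call validate_enrichments on the reviewed dictionary, fix anything it reports, and only then call save_enrichments(enrichments, "enrichments.json").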

Via command line (once you have enrichments):

python skills/librarian-enrichment/scripts/enrich_corpus.py \
  --input ./raw \
  --enrichments enrichments.json \
  --output data.json

Gap Analysis

After enrichment, the script automatically identifies gaps in your content strategy:

Underserved personas — Which audiences have few articles targeted at them?

Thin pain points — Which problems have only 1-2 articles addressing them?

Missing concepts — Are there important topics with no coverage?

Format imbalance — Too much thought-leadership, not enough practical how-to?

Persona imbalance — Detects when one persona has 3x more content than another
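The checks listed above largely come down to counting facet values across the enriched corpus. The following is a rough sketch rather than the script's implementation: the record keys (persona, pain_points), the function name, and the thin-coverage cutoff of 2 articles are assumptions, while the 3x ratio is taken from the persona-imbalance description.

```python
from collections import Counter


def find_gaps(records, thin_threshold=2, imbalance_ratio=3):
    """Summarize coverage gaps in a list of enriched records.

    Assumed record shape: {"persona": str, "pain_points": [str, ...]}.
    """
    persona_counts = Counter(r["persona"] for r in records)
    pain_counts = Counter(p for r in records for p in r.get("pain_points", []))

    # Pain points addressed by at most `thin_threshold` articles.
    thin = sorted(p for p, n in pain_counts.items() if n <= thin_threshold)

    # Pairs (larger, smaller) where one persona has at least
    # `imbalance_ratio` times the content of another.
    imbalanced = sorted(
        (big, small)
        for big, big_n in persona_counts.items()
        for small, small_n in persona_counts.items()
        if big != small and big_n >= imbalance_ratio * small_n
    )
    return {
        "persona_counts": dict(persona_counts),
        "thin_pain_points": thin,
        "persona_imbalance": imbalanced,
    }
```

Note that this sketch only sees personas and pain points that appear in at least one record; detecting concepts or personas with no coverage at all would require comparing against the full vocabulary.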

What Comes Next

Briefing

Present analysis options to the team

Organizer Console

View enriched data interactively
