The minimal upgrade path from blog to technology platform (2): The design art of tags and topics

What is the difference between topics and tags? Why is it harder to find content when there are too many tags? This article dismantles the three most common misunderstandings in content taxonomy and shares a practical 'three-tier tag system' design method.

Topic: Content Platform Engineering · Series: Minimal upgrade path from blog to technology platform (2/4)

Blog Upgrade Taxonomy Content Strategy Information Architecture Tagging

Introduction: When the number of tags grows from 5 to 50, you find that you can’t find anything.

After finishing the topic upgrade, I confidently started tagging my articles.

I was restrained at first: rag, agent, deployment, testing, each one a precise description. But three months later, my tag list looked like this:

python, javascript, typescript
docker, kubernetes, github-actions
openai, anthropic, langchain, llamaindex
beginner, intermediate, advanced
tutorial, guide, reference, opinion
2024, 2025, 2026…

**More than 50 tags, which in theory should make content easier to discover, actually left readers more confused.**

A reader asked me: “I’m looking for RAG deployment articles suitable for beginners. Which tag should I click?”

I was stunned. Such an article does exist, but it is tagged rag, deployment, beginner, docker, and python. The reader has to guess which of the five to click, or jump back and forth between five tag pages.

**The problem with tag proliferation is not the number of tags, but that the relationships between tags were never designed.**

Misunderstanding 1: Treating tags as a keyword dump

Why does everyone think so?

The default design of many blogging platforms (WordPress, Hexo) implies a simple logic: tags are the keywords of the article, and the more you add, the better.

SEO tutorials say the same thing: “Adding more tags helps search engines index the page.” So the tag list becomes:

tags:
  - python
  - flask
  - web
  - api
  - restful
  - backend
  - tutorial
  - 2025

The article has 8 tags, which looks rich, but each tag page shows only this one article, because other articles use different combinations.

Why is this understanding wrong?

The value of tags is not to “describe every aspect of this article”, but to establish connections between articles.

If a tag is associated with only 1-2 articles, it is an orphan and has no navigation value.

What is a more accurate understanding?

**Tags should correspond to the reader’s query intent, not to an inventory of the article’s contents.**

Ask yourself: Under what circumstances would a reader click on this tag? What do they expect to see?

❌ “This article uses Python” → Not a good tag intent
✅ “I want to see all Python-related practices” → Good tag intent

A tag is meaningful only when it can aggregate more than 3 related articles.

Misunderstanding 2: Treating topics and tags as the same thing

Why does everyone think so?

Many CMS interfaces display “categories” and “tags” side by side, which makes them feel like two names for the same dimension.

That leads to this kind of confusion:

An article is filed under the topic “AI Engineering” and also tagged “ai-engineering”
Readers see the same content on the topic page and the tag page and don’t know which route to take

Why is this understanding wrong?

Topics and tags solve completely different problems:

Dimension	Topic	Tag
Level	High-level grouping (domain)	Fine-grained features (attributes)
Quantity	Few and stable (5-10)	Flexible (20-50)
Relationship	Mutually exclusive (one article belongs to one topic)	Many-to-many (one article can have multiple tags)
Purpose	Navigation entry point	Content association
Change	Relatively stable	Grows with the content

Key Design Principles:

A topic answers “Which field is this?”
A tag answers “What are the characteristics of this article?”

What is a more accurate understanding?

In my blog, an article is organized as follows: it must belong to exactly one topic, and it may carry multiple optional tags.

---
title: "RAG cache"
topic: ai-engineering-delivery  # required, exactly one
tags:
  - rag          # technical focus
  - deployment   # scenario
  - caching      # specific technique
  - production   # environment
---
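The “one required topic, optional tags” rule can be checked mechanically. Below is a minimal sketch in plain TypeScript, not the blog's actual validation code: the `PostFrontmatter` shape, the `KNOWN_TOPICS` list (only `ai-engineering-delivery` appears in this article; `agent-systems` is made up) and the `validateFrontmatter` helper are all assumptions for illustration.

```typescript
// Hypothetical frontmatter shape: `topic` is a single string, `tags` is optional.
interface PostFrontmatter {
  title: string;
  topic?: string;
  tags?: string[];
}

// Hypothetical topic slugs; only the first one appears in the article.
const KNOWN_TOPICS: string[] = ["ai-engineering-delivery", "agent-systems"];

// Returns a list of problems; an empty list means the frontmatter passes.
function validateFrontmatter(fm: PostFrontmatter): string[] {
  const problems: string[] = [];

  if (!fm.topic) {
    problems.push("missing topic: every article must belong to exactly one topic");
  } else if (!KNOWN_TOPICS.includes(fm.topic)) {
    problems.push(`unknown topic: ${fm.topic}`);
  }

  // Tags are optional, but repeating the same tag adds nothing.
  const tags = fm.tags ?? [];
  const duplicates = tags.filter((tag, i) => tags.indexOf(tag) !== i);
  if (duplicates.length > 0) {
    problems.push(`duplicate tags: ${duplicates.join(", ")}`);
  }

  return problems;
}

const ragCacheProblems = validateFrontmatter({
  title: "RAG cache",
  topic: "ai-engineering-delivery",
  tags: ["rag", "deployment", "caching", "production"],
});
console.log(ragCacheProblems); // the example frontmatter above produces no problems
```

Because `topic` is a single string rather than an array, the “exactly one” constraint is carried by the type itself; the function only has to check that the field is present and known.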

Reader path design:

Enter the “AI Engineering Practice” topic from the homepage
See all articles under this topic
Find the “deployment” tag on the article, click on it
See all deployment related articles (across topics)

**Topics dig vertically; tags connect horizontally.**
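The two directions of that reader path map onto two simple queries. This is a sketch under assumptions: the `NavPost` shape, the sample slugs and the `agent-systems` topic are hypothetical and are not taken from the blog's source.

```typescript
// Hypothetical minimal post shape for navigation queries.
interface NavPost {
  slug: string;
  topic: string;
  tags: string[];
}

// Vertical: everything inside one topic.
function postsInTopic(posts: NavPost[], topic: string): NavPost[] {
  return posts.filter(post => post.topic === topic);
}

// Horizontal: everything carrying a tag, regardless of topic.
function postsWithTag(posts: NavPost[], tag: string): NavPost[] {
  return posts.filter(post => post.tags.includes(tag));
}

// Made-up sample data for illustration only.
const navSample: NavPost[] = [
  { slug: "rag-cache", topic: "ai-engineering-delivery", tags: ["rag", "deployment", "caching", "production"] },
  { slug: "rag-eval", topic: "ai-engineering-delivery", tags: ["rag", "testing"] },
  { slug: "agent-rollout", topic: "agent-systems", tags: ["agent", "deployment"] },
];

console.log(postsInTopic(navSample, "ai-engineering-delivery").map(p => p.slug));
console.log(postsWithTag(navSample, "deployment").map(p => p.slug));
```

The tag query deliberately ignores `topic`, which is what lets a “deployment” page pull together articles from different topics.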

Misunderstanding 3: Pursuing a perfect classification system

Why does everyone think so?

A programmer’s instinct is to pursue completeness. When designing categories, I kept thinking:

“Front-end, back-end, AI, operations… and testing still has nowhere to go”
“Does testing belong to the backend, or is it an independent topic?”
“Is CI/CD operations or engineering practice?”

So I started drawing complex classification trees, trying to cover every possibility.

Why is this understanding wrong?

**Classification is not for completeness, but for usefulness.**

Your readers won’t be confused because you lack a “testing” topic, but they will be confused if you scatter 3 testing articles across different topics.

A perfect classification system often means:

Levels that are too deep (readers need 4 clicks to reach the content)
Blurred boundaries (an article looks like both A and B)
High maintenance cost (every new article raises an ownership conflict to resolve)

What is a more accurate understanding?

Adopt the principle of “practical first, moderate redundancy”:

Start from current content: ask “what do I have now”, not “what might I write in the future”.
Reader perspective test: if a friend came looking for content on X, which topic would they check first?
Allow blurred boundaries: for the small amount of cross-domain content, just pick the most relevant topic.

In my blog, the boundary between “Agent system construction” and “AI engineering practice” is not perfectly clear. I don’t get hung up on that. If an article fits both, I choose based on the main problem it solves.

Three-tier tag system: a practical design approach

After trial and error, I designed a “three-tier tag system”, with each layer addressing a different dimension:

The first layer: technical tags (What)

Describe the specific technology, tool, or concept covered in the article.

Example:

rag, agent, llm, vector-db
docker, k8s, github-actions
react, vue, astro

Design Principles:

Use the names common in the technical community (do not invent your own abbreviations)
Use the singular form (rag instead of rags)
Keep the total at 20 or fewer, and merge similar tags regularly

The second layer: scenario tags (When/Where)

Describe the scenario, stage, or environment to which the article applies.

Example:

prototyping (prototyping stage)
production (production environment)
migration (migration scenario)
troubleshooting

Design Principles:

Scenarios are more stable than technologies (a production environment is always a production environment)
They help readers decide “Is this article useful to me right now?”
Keep the total at 10 or fewer

The third layer: type tags (How)

Describe the content form and reading expectations of the article.

Example:

tutorial (step-by-step tutorial)
guide
interpretation
reference
opinion

Design Principles:

Help readers set reading expectations (a tutorial and an opinion piece are read in completely different ways)
Keep the number small and don’t over-subdivide
Each article has only one type tag
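One way to keep the three layers from blurring is a small registry that records which layer each tag belongs to. The following is a sketch, not the blog's implementation: `TagLayer`, `TAG_LAYERS`, `checkArticleTags` and `checkLayerSizes` are hypothetical names, the registry holds only a few of the example tags above, and “only one type tag” is interpreted here as “at most one” (an assumption).

```typescript
type TagLayer = "technical" | "scenario" | "type";

// Hypothetical registry: a subset of the example tags, each assigned to a layer.
const TAG_LAYERS = new Map<string, TagLayer>([
  ["rag", "technical"],
  ["agent", "technical"],
  ["docker", "technical"],
  ["prototyping", "scenario"],
  ["production", "scenario"],
  ["migration", "scenario"],
  ["troubleshooting", "scenario"],
  ["tutorial", "type"],
  ["guide", "type"],
  ["reference", "type"],
  ["opinion", "type"],
]);

// Per-article check: every tag must be registered, and at most one type tag is allowed.
function checkArticleTags(tags: string[]): string[] {
  const problems: string[] = [];
  const unknown = tags.filter(tag => !TAG_LAYERS.has(tag));
  if (unknown.length > 0) {
    problems.push(`unregistered tags: ${unknown.join(", ")}`);
  }
  const typeTags = tags.filter(tag => TAG_LAYERS.get(tag) === "type");
  if (typeTags.length > 1) {
    problems.push(`more than one type tag: ${typeTags.join(", ")}`);
  }
  return problems;
}

// Registry-wide check: the size limits stated for the first two layers.
function checkLayerSizes(registry: Map<string, TagLayer>): string[] {
  const sizes: Record<TagLayer, number> = { technical: 0, scenario: 0, type: 0 };
  registry.forEach(layer => {
    sizes[layer] += 1;
  });
  const problems: string[] = [];
  if (sizes.technical > 20) problems.push(`too many technical tags: ${sizes.technical}`);
  if (sizes.scenario > 10) problems.push(`too many scenario tags: ${sizes.scenario}`);
  return problems;
}

console.log(checkArticleTags(["rag", "production", "tutorial"]));
console.log(checkArticleTags(["rag", "tutorial", "opinion"]));
```

A registry like this also makes year tags such as 2025 show up as unregistered, which forces a deliberate decision about whether they belong in any layer at all.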

Practical configuration: tag implementation in Astro

Tag data aggregation

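// src/lib/taxonomy.ts
import type { CollectionEntry } from 'astro:content';

// Assumes the collection is named 'blog', matching the getCollection('blog') calls
type Post = CollectionEntry<'blog'>;

// Count how many posts use each tag and return [tag, count] pairs
export function collectTagCounts(posts: Post[]): [string, number][] {
  const counts = new Map<string, number>();

  posts.forEach(post => {
    post.data.tags?.forEach(tag => {
      counts.set(tag, (counts.get(tag) ?? 0) + 1);
    });
  });

  // Keep only tags used by at least 2 posts
  return Array.from(counts.entries())
    .filter(([, count]) => count >= 2)
    .sort((a, b) => b[1] - a[1]);  // sort by count descending
}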

Tag page implementation

---
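// src/pages/tags/[tag].astro
import { getCollection } from 'astro:content';
// formatTagLabel is used below but not shown in this article;
// importing it from taxonomy.ts is an assumption
import { collectTagCounts, formatTagLabel } from '../../lib/taxonomy';

export async function getStaticPaths() {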
  const posts = await getCollection('blog');
  const tagCounts = collectTagCounts(posts);

  return tagCounts.map(([tag]) => ({
    params: { tag },
    props: {
      tag,
      posts: posts.filter(p => p.data.tags?.includes(tag))
    },
  }));
}

const { tag, posts } = Astro.props;
---

<h1>tags: {formatTagLabel(tag)}</h1>
<p> {posts.length} posts</p>

{posts.map(post => (
  <article>
    <h2><a href={`/blog/${post.slug}`}>{post.data.title}</a></h2>
    <p>{post.data.description}</p>
  </article>
))}

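Tag cloud component (only displays valuable tags)

---
// src/components/TagCloud.astro
import { getCollection } from 'astro:content';
// formatTagLabel is not shown in this article; importing it from taxonomy.ts is an assumption
import { collectTagCounts, formatTagLabel } from '../lib/taxonomy';

const posts = await getCollection('blog');
const tagCounts = collectTagCounts(posts);

// collectTagCounts sorts by count, so this keeps the 15 most-used tags
const topTags = tagCounts.slice(0, 15);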
---

<div class="tag-cloud">
  {topTags.map(([tag, count]) => (
    <a href={`/tags/${tag}`} class="tag">
      {formatTagLabel(tag)}
      <span class="count">{count}</span>
    </a>
  ))}
</div>

Tag governance: regular cleaning and merging

Tags need maintenance. I do a round of “tag governance” once a month:

1. Merge similar tags

# Tag merge examples
- "docker" and "containers" → merge into docker
- "testing" and "test" → merge into testing
- "ai" and "artificial-intelligence" → merge into ai
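Merges like these can be applied automatically at build time through an alias map. A minimal sketch, assuming a hypothetical `TAG_ALIASES` table and `normalizeTags` helper (neither is from the blog's source):

```typescript
// Hypothetical alias table: retired tag -> canonical tag, mirroring the merges above.
const TAG_ALIASES = new Map<string, string>([
  ["containers", "docker"],
  ["test", "testing"],
  ["artificial-intelligence", "ai"],
]);

// Lowercase each tag, map aliases to their canonical form, and drop repeats
// while preserving the original order.
function normalizeTags(tags: string[]): string[] {
  const seen = new Set<string>();
  const result: string[] = [];
  tags.forEach(raw => {
    const key = raw.trim().toLowerCase();
    const canonical = TAG_ALIASES.get(key) ?? key;
    if (!seen.has(canonical)) {
      seen.add(canonical);
      result.push(canonical);
    }
  });
  return result;
}

console.log(normalizeTags(["Docker", "containers", "test", "ai"]));
// → ["docker", "testing", "ai"]
```

Normalizing at read time means old articles do not need to be edited immediately after a merge; the alias table carries the decision.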

2. Delete orphan tags

For tags associated with only 1 article, consider:

Are there other articles that could be tagged with this?
If not, remove the tag and describe it in text in the article instead
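Finding the orphans is easy to script so the monthly review starts from a list instead of from memory. The sketch below assumes a hypothetical `TaggedPost` shape and a `findOrphanTags` helper; the sample slugs are made up.

```typescript
// Hypothetical minimal post shape for the audit.
interface TaggedPost {
  slug: string;
  tags?: string[];
}

// Maps each tag used by fewer than `minPosts` posts to the slugs that use it.
function findOrphanTags(posts: TaggedPost[], minPosts = 2): Map<string, string[]> {
  const usage = new Map<string, string[]>();
  posts.forEach(post => {
    (post.tags ?? []).forEach(tag => {
      const slugs = usage.get(tag) ?? [];
      slugs.push(post.slug);
      usage.set(tag, slugs);
    });
  });

  const orphans = new Map<string, string[]>();
  usage.forEach((slugs, tag) => {
    if (slugs.length < minPosts) orphans.set(tag, slugs);
  });
  return orphans;
}

// Made-up sample data for illustration only.
const auditSample: TaggedPost[] = [
  { slug: "flask-api", tags: ["python", "flask"] },
  { slug: "rag-cache", tags: ["python", "rag"] },
  { slug: "rag-eval", tags: ["rag"] },
  { slug: "untagged" },
];

findOrphanTags(auditSample).forEach((slugs, tag) => {
  console.log(`${tag}: only used by ${slugs.join(", ")}`);
});
```

Returning the slugs along with the tag makes the two follow-up questions concrete: either another article should share the tag, or the single article listed should lose it.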

3. Check for confusion between tag and topic levels

When a tag overlaps with a topic (for example, the topic is “AI Engineering” and there is also an “ai-engineering” tag), delete the tag decisively so that the topic remains the only navigation entry for that domain.
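This overlap can be detected by comparing slugs. A minimal sketch follows; the `slugify` rule (lowercase, non-alphanumeric runs become hyphens) and the `findTagsShadowingTopics` helper are assumptions, and a real site may slugify differently.

```typescript
// Assumed slug rule: lowercase, collapse anything that is not a-z or 0-9 into "-".
function slugify(label: string): string {
  return label
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Returns the tags whose slug collides with the slug of any topic name.
function findTagsShadowingTopics(topics: string[], tags: string[]): string[] {
  const topicSlugs = new Set<string>(topics.map(topic => slugify(topic)));
  return tags.filter(tag => topicSlugs.has(slugify(tag)));
}

console.log(findTagsShadowingTopics(["AI Engineering"], ["ai-engineering", "rag", "deployment"]));
// → ["ai-engineering"]
```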

Conclusion: A good classification is one that readers don’t have to think about.

After rebuilding the tag system, I ran a simple test: I asked a friend who was not familiar with my blog to find an article about “production practices for RAG deployment”.

His path was:

Saw the “AI Engineering Practice” topic on the homepage → clicked to enter
Browsed the article list and found an article titled “RAG Deployment”
Saw the “production” tag on the article → clicked to view all production-related articles
Filtered down to content tagged both RAG and production
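The last step, narrowing to RAG plus production, amounts to a tag intersection. If a site wanted to offer that as a filter rather than leave it to the reader's eye, it could look roughly like this; the `FilterablePost` shape, the `postsWithAllTags` helper and the sample slugs are hypothetical.

```typescript
// Hypothetical minimal post shape for filtering.
interface FilterablePost {
  slug: string;
  tags: string[];
}

// Keeps only the posts that carry every one of the required tags.
function postsWithAllTags(posts: FilterablePost[], required: string[]): FilterablePost[] {
  return posts.filter(post => required.every(tag => post.tags.includes(tag)));
}

// Made-up sample data for illustration only.
const filterSample: FilterablePost[] = [
  { slug: "rag-deployment", tags: ["rag", "deployment", "production"] },
  { slug: "rag-prototype", tags: ["rag", "prototyping"] },
  { slug: "agent-rollout", tags: ["agent", "production"] },
];

console.log(postsWithAllTags(filterSample, ["rag", "production"]).map(p => p.slug));
// → ["rag-deployment"]
```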

**Throughout the process, he never wondered “where should I click”, and he never got lost in duplicated content.**

This is the standard for a good classification system: readers find what they want intuitively, without needing to understand your classification philosophy.

In the next article, I will share how to design a platform-based homepage - integrating topics, tags, and latest content into an entrance for readers to “discover” rather than “browse”.

Series of articles

From “file pile” to “topicization” - the first principle of blog content organization
The Design Art of Tags and Topics ← This article
Building a platform-based homepage - allowing readers to go from “seeing” to “discovering”
Astro + Content Collections Practical Guide

Reference resources

Source code of this blog: GitHub repository
Information Architecture Classic: “How to Make Sense of Any Mess” by Abby Covert
