Hualin Luan Cloud Native · Quant Trading · AI Engineering

The minimal upgrade path from blog to technology platform (2): The art of designing tags and topics

What is the difference between topics and tags? Why is it harder to find content when there are too many tags? This article dismantles the three most common misunderstandings in content taxonomy and shares a practical 'three-tier tag system' design method.

Published: 3/21/2026
Category: guide
Reading time: 9 min read

Writing statement: This article is based on the author’s own trial and error in organizing blog content. The classification dilemmas and solutions described here come from pitfalls actually encountered along the way.


Introduction: When your tags grow from 5 to 50, you find that you can’t find anything

After completing the topic upgrade, I confidently started tagging my articles.

I was restrained at first: rag, agent, deployment, testing, each a precise description. But three months later, my tag list looked like this:

  • python, javascript, typescript
  • docker, kubernetes, github-actions
  • openai, anthropic, langchain, llamaindex
  • beginner, intermediate, advanced
  • tutorial, guide, reference, opinion
  • 2024, 2025, 2026…

**More than 50 tags, which in theory should have made content more discoverable, actually made it more confusing for readers.**

A reader asked me: “I’m looking for RAG deployment articles suitable for beginners. Which tag should I click?”

I was stunned. Such an article does exist, but it is tagged rag, deployment, beginner, docker, and python. The reader has to guess which of the five to click, or jump back and forth between five tag pages.

**The problem with tag proliferation is not that there are too many tags, but that the relationships between tags were never designed.**


Misunderstanding 1: Treating tags as a keyword dumping ground

Why does everyone think so?

The default design of many blogging platforms (WordPress, Hexo) implies a simple logic: tags are the keywords of the article, and the more you add, the better.

SEO tutorials say the same thing: “Adding more tags helps search engines index the page.” So the tag list becomes:

tags:
  - python
  - flask
  - web
  - api
  - restful
  - backend
  - tutorial
  - 2025

An article with 8 tags looks richly described, but each tag page displays only this one article, because other articles use different combinations.

Why is this understanding wrong?

The value of tags is not to “describe all aspects of this article”, but to establish connections between articles.

If each tag is associated with only 1-2 articles, it is an orphan and has no navigation value.

What is a more accurate understanding?

**Tags should correspond to the reader’s query intent, not to an inventory of the article’s contents.**

Ask yourself: Under what circumstances would a reader click on this tag? What do they expect to see?

  • ❌ “This article uses Python” → Not a good tag intent
  • ✅ “I want to see all Python-related practices” → A good tag intent

A tag is only meaningful when it can aggregate at least 3 related articles.


Misunderstanding 2: Treating topics and tags as the same thing

Why does everyone think so?

Many CMS UI designs display “categories” and “tags” side by side, making people feel that they are different names for the same dimension.

So there is this confusion:

  • Some articles are classified under the topic “AI Engineering” and are tagged “ai-engineering”
  • Readers see duplicate content on topic pages and tag pages and don’t know which way to go.

Why is this understanding wrong?

Topics and tags solve completely different problems:

| Dimension | Topic | Tag |
| --- | --- | --- |
| Hierarchy | High-level aggregation (domain) | Low-level features (attributes) |
| Quantity | Few and stable (5-10) | More, and variable (20-50) |
| Relation | Mutually exclusive (one article belongs to one topic) | Many-to-many (one article can have multiple tags) |
| Use | Navigation entry point | Content association |
| Change | Relatively stable | Grows with content |

Key design principles:

  • A topic answers “Which field is this?”
  • A tag answers “What are the characteristics of this article?”

What is a more accurate understanding?

In my blog, every article is organized the same way: it must belong to exactly one topic, and it may optionally carry multiple tags.

---
title: "RAG cache"
topic: ai-engineering-delivery  # required, exactly one
tags:
  - rag          # technical focus
  - deployment   # scenario
  - caching      # specific technique
  - production   # environment
---
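
The "one required topic, optional tags" rule can be checked mechanically before a post is published. The following is a minimal sketch, not the blog's actual validation: the `Frontmatter` interface and `validateFrontmatter` function are hypothetical names introduced here, and the `topic` field is typed loosely on purpose so that malformed input can be detected.

```typescript
// Hypothetical shape of the frontmatter shown above.
// `topic` is deliberately loose so that bad input (missing or a list) can be reported.
interface Frontmatter {
  title: string;
  topic?: string | string[];
  tags?: string[];
}

// Returns a list of human-readable problems; an empty list means the rule is satisfied.
function validateFrontmatter(fm: Frontmatter): string[] {
  const problems: string[] = [];

  if (fm.topic === undefined || fm.topic === "") {
    // Every article must belong to a topic.
    problems.push("missing topic");
  } else if (Array.isArray(fm.topic)) {
    // Topics are mutually exclusive: one article, one topic.
    problems.push("topic must be a single value, not a list");
  }

  // Tags are optional, so a missing `tags` field is not a problem.
  return problems;
}

const result = validateFrontmatter({
  title: "RAG cache",
  topic: "ai-engineering-delivery",
  tags: ["rag", "deployment", "caching", "production"],
});
console.log(result); // []
```

In an Astro project the same constraint would more naturally live in the content collection schema; the plain function here only illustrates the rule itself.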

Reader path design:

  1. Enter the “AI Engineering Practice” topic from the homepage
  2. See all articles under this topic
  3. Find the “deployment” tag on the article, click on it
  4. See all deployment related articles (across topics)

**Topics dig vertically; tags connect horizontally.**
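
The two directions of this reader path correspond to two different queries over the same list of posts. A minimal sketch follows; the `Post` shape, the function names, the sample titles, and the `agent-systems` topic slug are all assumptions made for illustration.

```typescript
// Simplified post shape for illustration (not Astro's CollectionEntry).
interface Post {
  title: string;
  topic: string;
  tags: string[];
}

// Vertical: everything inside one topic.
function postsInTopic(posts: Post[], topic: string): Post[] {
  return posts.filter((p) => p.topic === topic);
}

// Horizontal: everything carrying a tag, regardless of topic.
function postsWithTag(posts: Post[], tag: string): Post[] {
  return posts.filter((p) => p.tags.includes(tag));
}

// Sample data; the titles and the second topic slug are invented.
const samplePosts: Post[] = [
  { title: "RAG cache", topic: "ai-engineering-delivery", tags: ["rag", "deployment"] },
  { title: "Agent rollout", topic: "agent-systems", tags: ["agent", "deployment"] },
  { title: "Prompt testing", topic: "ai-engineering-delivery", tags: ["testing"] },
];

const inTopic = postsInTopic(samplePosts, "ai-engineering-delivery");
const withTag = postsWithTag(samplePosts, "deployment");
console.log(inTopic.map((p) => p.title)); // the two posts in the topic
console.log(withTag.map((p) => p.title)); // deployment posts from both topics
```

The tag query deliberately ignores `topic`, which is what lets a tag page cut across topic boundaries.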


Misunderstanding 3: Pursuing a perfect classification system

Why does everyone think so?

A programmer’s instinct is to pursue completeness. When designing categories, you start to worry:

  • “I have front-end, back-end, AI, operations… but there is no place to put testing”
  • “Does testing belong to the back-end, or is it an independent topic?”
  • “Is CI/CD operations or an engineering practice?”

So I started drawing complex classification trees, trying to cover every possibility.

Why is this understanding wrong?

**Classification is not for completeness, but for usefulness.**

Your readers won’t be confused because you lack a “testing” topic, but they will be confused if you scatter 3 testing articles across different topics.

A perfect classification system often means:

  • Levels that are too deep (readers need 4 clicks to reach the content)
  • Blurred boundaries (an article looks like it belongs to both A and B)
  • Difficult maintenance (every new article raises an ownership conflict)

What is a more accurate understanding?

Adopt the principle of “practical first, moderate redundancy”:

  1. Start from current content: not “what might I write about in the future”, but “what do I have now”
  2. Reader perspective test: if a friend comes looking for X, which topic would they check first?
  3. Allow blurred boundaries: for the small amount of cross-domain content, just choose the most relevant topic.

In my blog, the boundary between “Agent system construction” and “AI engineering practice” is not absolutely clear. But I don’t get hung up on that—if an article looks like both sides, I choose based on the problem it mainly solves.


Three-tier tag system: a practical design approach

After trial and error, I designed a “three-tier tag system”, with each layer solving a problem in a different dimension:

The first layer: technical tags (What)

Describe the specific technology, tool, or concept covered in the article.

Example:

  • rag, agent, llm, vector-db
  • docker, k8s, github-actions
  • react, vue, astro

Design Principles:

  • Use common names in the technical community (do not make up your own abbreviations)
  • Singular form (use rag instead of rags)
  • Keep the count within 20 and merge similar tags regularly

The second layer: scenario tags (When/Where)

Describe the scenario, stage, or environment to which the article applies.

Example:

  • prototyping (prototyping stage)
  • production (production environment)
  • migration (migration scenario)
  • troubleshooting

Design Principles:

  • The scenario is more stable than the technology (the production environment is always the production environment)
  • Help readers determine “Is this article useful to me now?”
  • Keep the count within 10

The third layer: type tags (How)

Describe the content form and reading expectations of the article.

Example:

  • tutorial (step-by-step tutorial)
  • guide
  • interpretation
  • reference
  • opinion

Design Principles:

  • Help readers set reading expectations (a tutorial and an opinion piece are read very differently)
  • Keep the number small and don’t over-subdivide
  • Each article has only one type tag
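
The three layers and their limits can be encoded as a small registry and checked automatically. The sketch below is only one possible encoding: `TAG_LAYERS`, `LAYER_LIMITS`, `checkArticleTags`, and `checkRegistrySize` are hypothetical names, the registry holds only a few of the example tags, and it reads "only one type tag" as "exactly one" (relax the check if articles without a type tag are acceptable).

```typescript
type Layer = "technical" | "scenario" | "type";

// Hypothetical registry: every known tag is assigned to exactly one layer.
const TAG_LAYERS = new Map<string, Layer>([
  ["rag", "technical"],
  ["agent", "technical"],
  ["docker", "technical"],
  ["prototyping", "scenario"],
  ["production", "scenario"],
  ["migration", "scenario"],
  ["tutorial", "type"],
  ["guide", "type"],
  ["opinion", "type"],
]);

// Size limits from the design principles above; the type layer has no stated number.
const LAYER_LIMITS = { technical: 20, scenario: 10 };

// Checks one article's tags against the three-layer rules.
function checkArticleTags(tags: string[]): string[] {
  const problems: string[] = [];

  const unknown = tags.filter((t) => !TAG_LAYERS.has(t));
  if (unknown.length > 0) {
    problems.push(`unregistered tags: ${unknown.join(", ")}`);
  }

  // Assumption: "only one type tag" is enforced as exactly one.
  const typeTags = tags.filter((t) => TAG_LAYERS.get(t) === "type");
  if (typeTags.length !== 1) {
    problems.push(`expected exactly 1 type tag, found ${typeTags.length}`);
  }
  return problems;
}

// Checks the registry itself against the per-layer size limits.
function checkRegistrySize(): string[] {
  const sizes = { technical: 0, scenario: 0, type: 0 };
  TAG_LAYERS.forEach((layer) => {
    sizes[layer] += 1;
  });
  const problems: string[] = [];
  if (sizes.technical > LAYER_LIMITS.technical) problems.push("too many technical tags");
  if (sizes.scenario > LAYER_LIMITS.scenario) problems.push("too many scenario tags");
  return problems;
}

console.log(checkArticleTags(["rag", "production", "tutorial"])); // no problems
console.log(checkArticleTags(["rag", "tutorial", "opinion"]));    // reports two type tags
```

Keeping the registry in one place also makes the monthly cleanup easier, since adding a tag becomes an explicit decision about which layer it belongs to.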

Practical configuration: tag implementation in Astro

Tag data aggregation

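// src/lib/taxonomy.ts
import type { CollectionEntry } from 'astro:content';

// Assumes the collection is named 'blog', as in the getCollection('blog') calls in this article
type Post = CollectionEntry<'blog'>;

// Counts how many posts use each tag and returns [tag, count] pairs
export function collectTagCounts(posts: Post[]): [string, number][] {
  const counts = new Map<string, number>();

  posts.forEach(post => {
    post.data.tags?.forEach(tag => {
      counts.set(tag, (counts.get(tag) || 0) + 1);
    });
  });

  // Keep only tags used by at least 2 posts
  return Array.from(counts.entries())
    .filter(([_, count]) => count >= 2)
    .sort((a, b) => b[1] - a[1]);  // sort by count descending
}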

Tag page implementation

---
// src/pages/tags/[tag].astro
import { getCollection } from 'astro:content';
// formatTagLabel is assumed to be exported from the same module as collectTagCounts
import { collectTagCounts, formatTagLabel } from '../../lib/taxonomy';

export async function getStaticPaths() {
  const posts = await getCollection('blog');
  const tagCounts = collectTagCounts(posts);

  // One page per tag that passed the minimum-usage filter
  return tagCounts.map(([tag]) => ({
    params: { tag },
    props: {
      tag,
      posts: posts.filter(p => p.data.tags?.includes(tag))
    },
  }));
}

const { tag, posts } = Astro.props;
---

<h1>Tag: {formatTagLabel(tag)}</h1>
<p>{posts.length} posts</p>

{posts.map(post => (
  <article>
    <h2><a href={`/blog/${post.slug}`}>{post.data.title}</a></h2>
    <p>{post.data.description}</p>
  </article>
))}
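
The page above calls `formatTagLabel`, which is not shown in this article. As a rough guess at what such a helper might do, the sketch below turns a tag slug into a display label; the override table and the title-casing fallback are assumptions, not the blog's actual implementation.

```typescript
// Hypothetical display overrides for tags whose casing cannot be derived from the slug.
const LABEL_OVERRIDES = new Map<string, string>([
  ["rag", "RAG"],
  ["llm", "LLM"],
  ["github-actions", "GitHub Actions"],
]);

// Converts a tag slug such as "vector-db" into a human-readable label.
function formatTagLabel(tag: string): string {
  const override = LABEL_OVERRIDES.get(tag);
  if (override !== undefined) {
    return override;
  }
  // Fallback: split on hyphens and capitalize the first letter of each word.
  return tag
    .split("-")
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join(" ");
}

console.log(formatTagLabel("rag"));            // RAG
console.log(formatTagLabel("deployment"));     // Deployment
console.log(formatTagLabel("github-actions")); // GitHub Actions
```

Keeping the slug in URLs and frontmatter and formatting only at render time means a label can change without breaking any existing tag link.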

Tag cloud component (only displays valuable tags)

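---
// src/components/TagCloud.astro
import { getCollection } from 'astro:content';
// formatTagLabel is assumed to be exported from the same module as collectTagCounts
import { collectTagCounts, formatTagLabel } from '../lib/taxonomy';

const posts = await getCollection('blog');
const tagCounts = collectTagCounts(posts);

// Show only the 15 most-used tags
const topTags = tagCounts.slice(0, 15);
---

<div class="tag-cloud">
  {topTags.map(([tag, count]) => (
    <a href={`/tags/${tag}`} class="tag">
      {formatTagLabel(tag)}
      <span class="count">{count}</span>
    </a>
  ))}
</div>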

Tag governance: regular cleaning and merging

Tags need maintenance. My approach is to do a round of “tag governance” once a month:

1. Merge similar tags

# Tags to merge
- "docker" and "containers": merge into docker
- "testing" and "test": merge into testing
- "ai" and "artificial-intelligence": merge into ai
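
A merge list like this can be applied consistently with an alias map. The following is a minimal sketch; `TAG_ALIASES` and `normalizeTags` are hypothetical names, and trimming and lowercasing the input is an extra assumption that is not part of the merge list itself.

```typescript
// Alias → canonical tag, mirroring the merge list above.
const TAG_ALIASES = new Map<string, string>([
  ["containers", "docker"],
  ["test", "testing"],
  ["artificial-intelligence", "ai"],
]);

// Maps every tag to its canonical form and removes duplicates created by the merge.
function normalizeTags(tags: string[]): string[] {
  const canonical = tags.map((t) => {
    const key = t.trim().toLowerCase(); // assumption: tags are case-insensitive
    return TAG_ALIASES.get(key) ?? key;
  });
  // A Set keeps first-seen order, so the result is stable.
  return Array.from(new Set(canonical));
}

const merged = normalizeTags(["Docker", "containers", "test"]);
console.log(merged); // → ["docker", "testing"]
```

Running the normalization at build time, rather than editing every article by hand, keeps old posts valid while the alias list grows.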

2. Delete orphan tags

For tags associated with only 1 article, consider:

  • Are there other articles that could be tagged with this?
  • If not, remove the tag and describe it in text in the article instead
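
Orphan tags are easy to list automatically before deciding what to do with them. A minimal sketch, assuming a simplified `TaggedPost` shape and a hypothetical `findOrphanTags` helper whose default threshold matches the "used by at least 2 posts" filter in `collectTagCounts`:

```typescript
// Simplified post shape for illustration (not Astro's CollectionEntry).
interface TaggedPost {
  title: string;
  tags?: string[];
}

// Returns each tag used by fewer than `minPosts` posts, with the titles that use it.
function findOrphanTags(posts: TaggedPost[], minPosts = 2): Map<string, string[]> {
  const titlesByTag = new Map<string, string[]>();

  posts.forEach((post) => {
    (post.tags ?? []).forEach((tag) => {
      const titles = titlesByTag.get(tag) ?? [];
      titles.push(post.title);
      titlesByTag.set(tag, titles);
    });
  });

  const orphans = new Map<string, string[]>();
  titlesByTag.forEach((titles, tag) => {
    if (titles.length < minPosts) {
      orphans.set(tag, titles);
    }
  });
  return orphans;
}

// Sample data; titles are invented.
const orphanReport = findOrphanTags([
  { title: "RAG cache", tags: ["rag", "caching"] },
  { title: "RAG deployment", tags: ["rag", "deployment"] },
  { title: "Notes without tags" },
]);
orphanReport.forEach((titles, tag) => {
  console.log(`${tag}: only used by ${titles.join(", ")}`);
});
```

Listing the article titles next to each orphan makes the two questions above quick to answer: either another article deserves the tag, or the tag goes.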

3. Check for tag-level confusion

When a tag overlaps with a topic (for example, the topic is “AI Engineering” and there is also an “ai-engineering” tag), delete the tag decisively so that the topic remains the only navigation entry for that concept.
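
This overlap check can also be automated by comparing slugified topic names against the tag list. The sketch below uses hypothetical helper names and a simple slug rule (lowercase, non-alphanumeric runs become hyphens), which is an assumption about how topic names map to tag slugs.

```typescript
// Lowercases and replaces runs of non-alphanumeric characters with single hyphens.
function slugify(value: string): string {
  return value
    .trim()
    .toLowerCase()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Returns the tags whose slug collides with the slug of any topic.
function findOverlappingTags(topics: string[], tags: string[]): string[] {
  const topicSlugs = new Set(topics.map(slugify));
  return tags.filter((tag) => topicSlugs.has(slugify(tag)));
}

const overlaps = findOverlappingTags(
  ["AI Engineering", "Agent Systems"], // topic names; the second one is invented
  ["ai-engineering", "rag", "deployment"],
);
console.log(overlaps); // → ["ai-engineering"]
```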


Conclusion: A good classification is one that readers don’t have to think about.

After rebuilding the tag system, I ran a simple test: I asked a friend who was unfamiliar with my blog to find an article on production practices for RAG deployment.

His path was:

  1. Saw the “AI Engineering Practice” topic on the homepage → clicked to enter
  2. Browsed the article list and found an article titled “RAG Deployment”
  3. Noticed the article’s “production” tag → clicked to view all production-related articles
  4. Narrowed the results down to content on RAG + production

**During the whole process, he never wondered “where should I click”, nor did he get lost in duplicated content.**

This is the standard for a good classification system: readers can find what they are looking for intuitively, without having to understand your complicated classification philosophy.

In the next article, I will share how to design a platform-based homepage - integrating topics, tags, and latest content into an entrance for readers to “discover” rather than “browse”.


Series of articles

  1. From “file pile” to “topicization” - the first principle of blog content organization
  2. The Design Art of Labels and Topics ← This article
  3. Building a platform-based homepage - allowing readers to go from “seeing” to “discovering”
  4. Astro + Content Collections Practical Guide

Reference resources

  • Source code of this blog: GitHub repository
  • Information Architecture Classic: “How to Make Sense of Any Mess” by Abby Covert
