Building an Automated Notes Publishing Pipeline at Zero Cost

Disclaimer: this is an English translation of my Chinese post, done with Gemini; I really don't have time to write two versions of the same topic!

Before we start, note two things:

  • "Zero-cost" means "hard costs" excluding your time and effort. Cheers!
  • "Note" is equivalent to "Summary".

Have you ever found yourself overwhelmed by the sheer number of links while browsing online? I certainly have. These links can be anything that piques your interest, whether from someone's tweet, search results, or links within an article or video you're reading.

I suspect this need is not uncommon, as "read it later" plugins or applications are ubiquitous.

I tried a couple of them myself, but after the initial novelty wore off, I never used them again. This is because they (note: several years ago) were essentially just bookmark management tools, some with superfluous features like article recommendations. Given my already broad range of interests, and the overwhelming backlog of links that came with them, recommending even more articles seemed like nonsense!

An Impromptu Requirement

Years later, in this era of democratized AI (while I was engrossed in developing a personal TSW plugin), a thought suddenly struck me: why not use AI to process these links? After some initial brainstorming, I outlined the following requirements:

  1. Automatically generate a summary based on the currently open article.
  2. The summary format should include: keywords, overview, section summaries, in-text tool links, references, and the original article link.
  3. Export the summary.

These requirements were purely based on my personal needs, as I wanted to:

  1. Quickly grasp the article's content to decide whether to continue reading.
  2. Have tool or reference links for easy access to related resources.
  3. Have the original article link for reference.
  4. Export the summary for convenient storage.

However, being averse to repetitive tasks, I soon added new requirements after manually saving summaries for a while:

  1. Directly export the summary to my GitHub repository, rather than downloading it locally, manually committing it to the repository, and then syncing it to the remote repository.
  2. Create a summary site to share these summaries, facilitating my own reading and that of others (primarily my team members).
  3. Minimize costs, ideally incurring no expenses on these infrastructures.

Zero-Cost Technical Solutions

The above requirements can be broadly divided into three parts:

  1. Summary Generation
  2. Summary Export
  3. Summary Site

Let's explore how to implement these three parts at zero cost.

Summary Generation

LLMs have significantly lowered the barrier to solving NLP problems. Today, you can easily generate summaries from text using any mature LLM provider's API.

However, the devil is in the details, and to maximize the LLM's capabilities, you need to consider several factors.

Link vs. Content

My initial approach was to use links directly, prioritizing convenience.

While it seemed acceptable at first, a closer look at the generated summaries revealed unsatisfactory results, often with fabricated information, i.e., hallucinations.

Thus, I reverted to the traditional method of parsing the link, extracting the content, and then feeding it into the LLM.
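
To make the approach concrete, here is a minimal sketch of the content-first flow from inside a content script, where the open article's DOM is already at hand. The `<article>` fallback is my simplification, not the plugin's actual extraction logic, and `summarize` is sketched under LLM Selection below.

```typescript
// Sketch only: grab the open page's readable text instead of passing a URL.
// The <article> fallback is a simplification; real extraction is messier.
function currentPageText(): string {
  const root = document.querySelector("article") ?? document.body;
  return root.innerText.trim();
}

// The extracted text, not the link, is what the LLM sees:
// const summary = await summarize(currentPageText());
```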

LLM Selection

To minimize or eliminate costs, the selection here involves two aspects: the free tier and the context window size.

Considering the "big three" at the time—OpenAI, Claude, and Gemini—let's temporarily ignore other vendors like Mistral.

  • In terms of free tier: Gemini is the most generous.
  • In terms of scenario: For processing general text, there's no significant difference among the three.
  • In terms of the context window size: Gemini offers the largest capacity, which simplifies development: the input is a single page's text, unlikely to exceed Gemini's limit even for lengthy articles, so no chunking is needed (a minimal call sketch follows this list).
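
As a minimal sketch of the Gemini call, using Google's official SDK; the model name, prompt, and environment variable are my placeholders, not the plugin's actual configuration:

```typescript
// Sketch with the @google/generative-ai SDK; model and prompt are placeholders.
import { GoogleGenerativeAI } from "@google/generative-ai";

const genAI = new GoogleGenerativeAI(process.env.GEMINI_API_KEY!);
const model = genAI.getGenerativeModel({ model: "gemini-1.5-flash" });

export async function summarize(markdown: string): Promise<string> {
  const result = await model.generateContent(
    `Summarize the following article. Include keywords, an overview, and section summaries.\n\n${markdown}`,
  );
  return result.response.text();
}
```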

However, this doesn't mean you can directly feed the webpage's HTML to Gemini. Considerations include:

  1. Webpages often contain noise like ads, navigation, and comments.
  2. For the main content, code snippets and images are generally irrelevant for summarization.
  3. Markdown (MD) format is the optimal input for LLMs.

Therefore, the HTML underwent simple cleaning:

  1. Filtering irrelevant tags
    • Keeping the tags that hold the main article content
    • Dropping the irrelevant ones
  2. Converting HTML to MD
    • Using the turndown library (a sketch follows below)

This also offers the added benefit of reducing input length, allowing more summaries within the same quota.

Note: code, https://github.com/foxgem/tsw/blob/main/src/ai/utils.ts
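
A minimal sketch of that cleaning pipeline, assuming the selectors below; the real tag lists live in utils.ts, linked above:

```typescript
// Sketch: drop noisy nodes, strip code and images, convert the rest to MD.
// The selector list is illustrative; see utils.ts for the real filtering.
import TurndownService from "turndown";

export function htmlToCleanMarkdown(root: HTMLElement): string {
  const clone = root.cloneNode(true) as HTMLElement;
  clone
    .querySelectorAll("script, style, nav, header, footer, aside, iframe")
    .forEach((el) => el.remove());

  const turndown = new TurndownService();
  // Code snippets and images rarely help a summary, so drop them entirely.
  turndown.remove(["pre", "img"]);
  return turndown.turndown(clone.innerHTML);
}
```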

Summary Format

This involves prompt engineering, which is straightforward. See the code directly: https://github.com/foxgem/tsw/blob/main/src/ai/ai.ts#L261
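
The gist of such a prompt, paraphrased from the format requirements listed earlier; this is not the actual prompt in ai.ts:

```typescript
// Illustrative template only; the real prompt is at the ai.ts link above.
const SUMMARY_PROMPT = `
Summarize the article below in Markdown with these parts:
- Keywords
- Overview
- Section summaries
- Tools mentioned in the text, with links
- References
- Link to the original article: {url}

Article:
{content}
`;
```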

Overall infrastructure cost for these three steps: 0.

Summary Export

The requirement here is clear: use GitHub's free API with the octokit library. A GitHub Personal Access Token (PAT) is required, created via: Settings -> Developer settings -> Personal access tokens.
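
A minimal export sketch with octokit; the owner, repo, path, and environment variable are placeholders, and the PAT is supplied by the user:

```typescript
// Sketch: commit a summary straight to a repo via GitHub's contents API.
// Owner, repo, and path are placeholders; updating an existing file would
// additionally require passing its current sha.
import { Octokit } from "octokit";

export async function exportSummary(markdown: string, fileName: string) {
  const octokit = new Octokit({ auth: process.env.GITHUB_PAT });
  await octokit.rest.repos.createOrUpdateFileContents({
    owner: "your-github-name",
    repo: "your-notes-repo",
    path: `summaries/${fileName}`,
    message: `Add summary: ${fileName}`,
    content: Buffer.from(markdown).toString("base64"), // API expects base64
  });
}
```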