Rebuilding My TG Forward Bot
The Beginning
The forwarding bot I used before was Node Forward Bot (NFD). I had used it for quite a while, but it kept receiving all kinds of ads, which got extremely annoying. When I checked again, I found that the project had already been inactive for a year. NFD2 does not seem to be open source (I could not find a repository), so I did not try it, and I do not know whether it has ad-blocking features. So I decided to reinvent the wheel for my own use.
Keyword Filtering?
At first, I tried stuffing it with a huge list of sensitive words so it would automatically block any message containing one. Then I got baffling false positives: x86_64 (matched "64"), "Steam platform exclusive" (whose Chinese text contains the substring for "Taiwan independence"), Python scripts (flagged as cheat tools), and listening port (flagged as monitoring).
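To make the failure mode concrete, here is a minimal sketch of naive substring filtering. The keyword list is a hypothetical English stand-in (the real list is not shown in this post), and `isBlockedBySubstring` is an illustrative name, not a function from the project.

```typescript
// Hypothetical blocklist: English stand-ins for the kind of entries described above.
const BLOCKED_KEYWORDS: string[] = ["64", "script", "listening"];

// Naive filter: block a message if any keyword occurs anywhere inside it.
function isBlockedBySubstring(message: string, keywords: string[]): boolean {
  const lower = message.toLowerCase();
  return keywords.some((keyword) => lower.includes(keyword.toLowerCase()));
}

// A harmless technical message is blocked because "64" is a substring of "x86_64".
const falsePositive = isBlockedBySubstring("My server runs on x86_64", BLOCKED_KEYWORDS);

// A message with no listed keyword passes, whatever it actually means.
const passes = isBlockedBySubstring("hello there", BLOCKED_KEYWORDS);
```

Substring matching has no notion of word boundaries or context, so the longer the list gets, the more ordinary messages it catches.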
Then What About Regex?
Ugh… clearly this path was not going to work. So I took the advice of some friends in a group chat and tried writing a bunch of regular expressions. But as everyone knows, Chinese spam is very mysterious: 丅子, 微 P 嗯, weird emoji splicing… Can regex really defend against all of that? Maybe it can, but my brain obviously is not that powerful (sweat), so in the end I decided to use an LLM for moderation.
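A small sketch of why the regex route turns into whack-a-mole. The pattern below is a hypothetical rule for one keyword, not one taken from the project; it tolerates spacing tricks but still misses 微 P 嗯, which appears to be a phonetic respelling of the same word.

```typescript
// Hypothetical rule: match "vpn" case-insensitively, allowing whitespace between letters.
const VPN_PATTERN = /v\s*p\s*n/i;

function matchesVpn(message: string): boolean {
  return VPN_PATTERN.test(message);
}

const plain = matchesVpn("cheap VPN here");      // true
const spaced = matchesVpn("cheap v p n here");   // true: \s* absorbs the spaces
const respelled = matchesVpn("微 P 嗯");          // false: no "v" or "n" left to match
```

Every new respelling needs another hand-written alternative, and the spammers only have to invent one more.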
Gemini!
I needed a model that was as fast as possible and did not need to be too smart, as long as it could still understand the text. Google has a model called gemini-3-flash that seemed suitable for content moderation, so I used it. I wrote a simple prompt asking it to judge the user input and output either SAFE or UNSAFE:
```typescript
// Moderation prompt: the model is asked to reply with a single word, SAFE or UNSAFE.
const MODERATION_PROMPT = `# Role
Content Moderator API. Output one word only.

# Rules
UNSAFE if:
- Real human nudity/sex
- QR codes/spam/ads/gambling promotion
- Real gore/shock content
- Illegal content promotion
- Scam/phishing attempts

SAFE if:
- 2D/Anime/Cartoon (even suggestive)
- Normal photos/text/screenshots
- Regular conversation

# Output
One word: "SAFE" or "UNSAFE"

Analyze the content:`;
```

There is a little latency, but it is almost negligible. Compared with traditional rule matching, an LLM can understand context and semantics, so the false-positive rate is much lower. The downside is probably cost: Gemini has a free quota, but not a large one. With three API keys connected, I only get about 60 calls per day, which is barely enough for ad blocking.
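Turning the model's reply into a decision could look roughly like the sketch below. `ModelCall` is a hypothetical wrapper around whatever Gemini client is in use, and the third `UNKNOWN` state is my assumption for replies that are not exactly one of the two words; none of these names come from the project.

```typescript
type Verdict = "SAFE" | "UNSAFE" | "UNKNOWN";

// Hypothetical wrapper: sends one prompt to the model and resolves to its raw text reply.
type ModelCall = (prompt: string) => Promise<string>;

// Normalize the reply and accept only an exact one-word answer.
// A substring check would be wrong here, because "UNSAFE" contains "SAFE".
function parseVerdict(raw: string): Verdict {
  const word = raw.trim().replace(/^["']+|["'.!]+$/g, "").toUpperCase();
  if (word === "SAFE") return "SAFE";
  if (word === "UNSAFE") return "UNSAFE";
  return "UNKNOWN";
}

// Append the user content to the prompt and classify it.
// Errors (network, quota) are reported as UNKNOWN so the caller decides what to do.
async function moderate(prompt: string, content: string, callModel: ModelCall): Promise<Verdict> {
  try {
    const reply = await callModel(`${prompt}\n${content}`);
    return parseVerdict(reply);
  } catch {
    return "UNKNOWN";
  }
}
```

Whether `UNKNOWN` should forward the message or hold it back is a policy choice; with a small daily quota, forwarding on failure avoids silently dropping legitimate messages.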
TS Is the Way!
The whole project uses a modular structure:

```
.
├── src
│   ├── ai.ts
│   ├── config.ts
│   ├── handlers
│   │   ├── admin
│   │   │   ├── callbacks.ts
│   │   │   ├── commands.ts
│   │   │   ├── index.ts
│   │   │   ├── replies.ts
│   │   │   └── shared.ts
│   │   └── guest.ts
│   ├── i18n
│   │   ├── en.ts
│   │   └── zh.ts
│   ├── i18n.ts
│   ├── index.ts
│   ├── storage.ts
│   ├── telegram.ts
│   └── types.ts
├── tsconfig.json
└── wrangler.toml
```

Implemented features:

| Feature | Description |
|---|---|
| LLM Content Moderation | LLM-based harmful content detection |
| Ban List | View all banned users |
| Content Hash Cache | Avoid wasting tokens on repeated spam content |
| Blacklist System | Users are blacklisted after repeated blocks |
| Whitelist System | Stop moderation after consecutive clean messages |
| Stats System | Message count, user count, and AI block count |
| Multi API Key Rotation | Google’s API quota is too small |
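
The content hash cache and the blacklist/whitelist rows above can be sketched as follows. This is an illustration under assumptions, not the project's code: the hash is a simple FNV-1a-style function over UTF-16 code units standing in for whatever hash is actually used, storage is an in-memory `Map`, and both thresholds are made-up numbers.

```typescript
type Verdict = "SAFE" | "UNSAFE";

// FNV-1a-style 32-bit hash over UTF-16 code units (stand-in for the real hash).
function hashContent(text: string): string {
  let hash = 0x811c9dc5;
  for (let i = 0; i < text.length; i++) {
    hash ^= text.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193) >>> 0;
  }
  return hash.toString(16);
}

// Remembers verdicts by content hash so repeated spam does not cost another model call.
class VerdictCache {
  private readonly verdicts = new Map<string, Verdict>();

  get(content: string): Verdict | undefined {
    return this.verdicts.get(hashContent(content.trim()));
  }

  set(content: string, verdict: Verdict): void {
    this.verdicts.set(hashContent(content.trim()), verdict);
  }
}

interface UserState {
  blocks: number;      // total messages blocked for this user
  cleanStreak: number; // consecutive messages judged SAFE
}

type UserStatus = "blacklisted" | "whitelisted" | "moderated";

const BLACKLIST_AFTER_BLOCKS = 3; // assumed threshold
const WHITELIST_AFTER_CLEAN = 5;  // assumed threshold

// Returns the updated counters after one moderation result.
function recordVerdict(state: UserState, verdict: Verdict): UserState {
  return verdict === "UNSAFE"
    ? { blocks: state.blocks + 1, cleanStreak: 0 }
    : { blocks: state.blocks, cleanStreak: state.cleanStreak + 1 };
}

// Blacklist wins over whitelist; everyone else keeps being moderated.
function statusOf(state: UserState): UserStatus {
  if (state.blocks >= BLACKLIST_AFTER_BLOCKS) return "blacklisted";
  if (state.cleanStreak >= WHITELIST_AFTER_CLEAN) return "whitelisted";
  return "moderated";
}
```

Both mechanisms exist for the same reason: every message answered from the cache, or skipped because the sender is already blacklisted or whitelisted, is one API call saved.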
Who is 20 API calls per day supposed to be enough for? The limit used to be 100 calls per day. Gemini CLI and Antigravity were generous, but the API is just stingy.
Second edit: now Gemini CLI and Antigravity are not enough either, FUCKING GOOGLE.
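With a quota that small, rotating across several keys is the obvious workaround. Below is a minimal round-robin sketch with a per-key daily counter. `KeyRotator` and its parameters are hypothetical names, and the state is kept in memory for illustration only; on Cloudflare Workers the counters would presumably have to live in some persistent storage instead.

```typescript
// Round-robin over API keys, skipping keys that have used up today's quota.
class KeyRotator {
  private readonly keys: string[];
  private readonly dailyLimit: number;
  private readonly used = new Map<string, number>();
  private day = "";
  private next = 0;

  constructor(keys: string[], dailyLimit: number) {
    this.keys = keys;
    this.dailyLimit = dailyLimit;
  }

  // `today` is a date string supplied by the caller, e.g. "2025-01-31".
  // Returns a key with remaining quota and counts one call against it, or null if all are spent.
  acquire(today: string): string | null {
    if (today !== this.day) {
      // A new day started: reset every counter.
      this.day = today;
      this.used.clear();
    }
    for (let i = 0; i < this.keys.length; i++) {
      const index = (this.next + i) % this.keys.length;
      const key = this.keys[index];
      const count = this.used.get(key) ?? 0;
      if (count < this.dailyLimit) {
        this.used.set(key, count + 1);
        this.next = (index + 1) % this.keys.length;
        return key;
      }
    }
    return null;
  }
}
```

With three keys and a limit of 20, `acquire` hands out 60 keys in a day and then returns `null`, which matches the roughly 60 calls per day mentioned earlier.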
The whole project runs on Cloudflare Workers (same as NFD: convenient, useful, and free). Since the LLM is also free, the whole thing is a completely zero-cost solution.
Finally, I pushed the code to GitHub and open-sourced it under the BSD 2-Clause license.