HashnodeRSS

Filter a Hashnode blog RSS feed by tags (and other properties)

The problem

If you have a blog hosted on Hashnode, you'll automatically get a lot of great features. One of these essential blog features is the RSS feed, which lets readers subscribe with whatever RSS reader they're most comfortable with.

The problem is that this feed contains every post, so if your blog covers many different topics (different programming languages, frameworks, personal posts, movie reviews), your readers can't subscribe to a single topic; they'll receive all your posts. Or maybe you want to add your feed to a planet or another topic-specific aggregator (my own use case).

The solution

An externally hosted script that parses the RSS and returns only the relevant posts. A reader (or feed aggregator) can then subscribe to this feed instead and only get the posts they're interested in.
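The core idea can be sketched in a few lines of PHP (an illustration of the approach, not the repository's actual code): load the feed XML, drop every item that doesn't carry the wanted category tag, and return the rest.

```php
<?php
// Sketch of the core idea (illustration, not the repo's code):
// keep only <item> elements that carry a given <category> tag.
function filterFeedByCategory(string $rssXml, string $tag): string {
  $doc = new DOMDocument();
  $doc->loadXML($rssXml);
  // Copy the node list first; removing nodes while iterating a live list is unsafe.
  $items = iterator_to_array($doc->getElementsByTagName('item'));
  foreach ($items as $item) {
    $keep = false;
    foreach ($item->getElementsByTagName('category') as $category) {
      if (trim($category->textContent) === $tag) {
        $keep = true;
        break;
      }
    }
    if (!$keep) {
      $item->parentNode->removeChild($item);
    }
  }
  return $doc->saveXML();
}
```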

The stack

Code is available at the following GitHub repo (the readme holds a lot of the information in this blog post): github.com/BirkAndMe/hashnoderss

The stack for this project is kept as simple as possible: it consists only of PHP. All you need is a host with PHP support; this can be any shared host, dedicated server, or cloud solution.

It's a standalone script with no ties to any external libraries. And since the script is so small, anyone with PHP knowledge should be able to grasp what's going on and understand its ins and outs, no matter what framework they usually work with. That makes it more approachable for everyone.

The installation

Testing the code locally.

There's also a quick intro to testing locally in the repository.

This is a quick and dirty way of testing the functionality and should not be used in production.

git clone git@github.com:BirkAndMe/hashnoderss.git
cd hashnoderss
php -S localhost:8000 router-hack.php

Quick intro to a Linode LEMP installation.

Use whatever hosting option you're comfortable with; this is just a quick way to get started.

Since the script doesn't use MySQL, you could opt for manually setting up an instance with just a web server and PHP, but that is out of scope for this post.

1) Find and install LEMP in the Linode marketplace.

Fill in all the options, and start by choosing the smallest (and cheapest) Shared CPU plan (Nanode 1 GB at the time of writing). The instance should start automatically and set everything up as a standard LEMP stack.

1.1) Open an SSH connection to the Linode instance.

Or use the Linode LISH console if you prefer.

2) Getting the repository into the web root.

Find the web root (on the Linode instance) in /var/www/html/[SITE_NAME]/public_html and install the script there (notice git is installed first, and the directory is cleaned):

sudo apt install git
cd /var/www/html/[SITE_NAME]/public_html
rm *
git clone https://github.com/BirkAndMe/hashnoderss.git .

The [SITE_NAME] is the Reverse DNS name of the Linode instance (should be something like [IP-ADDRESS].ip.linodeusercontent.com). You can find this name in the Network tab when inspecting your Linode instance in the Linode backend https://cloud.linode.com/linodes/[INSTANCE_ID]/networking.

3) Setting up PHP and NGINX.

Just a few changes are needed to host the script.

3.1) Install PHP XML.

The XML module is not installed by default with the PHP Debian packages, so install it:

sudo apt install php-xml
3.2) Change the NGINX site config.

Edit the site configuration found in /etc/nginx/sites-enabled/[SITE_NAME]; on a clean installation there should only be one site, so that's the one (the server has nano ready for editing, but use whatever editor you like).

Set up a redirect of all requests to the index.php script. Do so by replacing this:

location / {
  try_files $uri $uri/ =404;
}

With this:

location / {
  try_files $uri $uri/ /index.php;
}

Reload NGINX by running this command:

sudo systemctl reload nginx

Now visit the site to make sure everything works (this should get the complete RSS feed for this blog): https://[SERVER-IP].ip.linodeusercontent.com/blog.birk-jensen.dk

The API

The primary (and only) endpoint is:

https://host/[BLOG_DOMAIN]

It's possible to set a blog domain in settings.php (as $hostname). Doing so makes the script ignore the BLOG_DOMAIN path, so https://host works without the extra path.
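For example, settings.php could pin the script to a single blog (the $hostname option comes from the description above; check the repository's settings file for the exact contents):

```php
<?php
// settings.php — pin the script to one blog so the BLOG_DOMAIN path
// is no longer needed ($hostname as described above).
$hostname = 'blog.birk-jensen.dk';
```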

Filtering

Filtering is done using query parameters.

These parameters are mapped to the PostDetailed query on api.hashnode.com, so use the documentation available there to figure out which properties to filter by.

Simple filter.

host/blog.birk-jensen.dk?_id=6298c0135787a911a45d5fcf

This will get a single blog post with a specific ID.

Nested filtering.

It's possible to check nested properties using a double dash -- as a separator, so the following is probably the most useful filter:

host/blog.birk-jensen.dk?tags--name=Drupal

Only return blog posts tagged with Drupal.
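How such a nested lookup might work can be sketched like this, under the assumption that the API response is decoded into nested arrays (the function name resolvePath is mine, not the repository's):

```php
<?php
// Sketch: resolve a "--" separated key like "tags--name" against a
// decoded API response. Descends into numeric lists (e.g. tags) so a
// single key can yield several values. Illustrative, not the repo's code.
function resolvePath($node, array $parts): array {
  if ($parts === []) {
    return [$node];
  }
  $key = array_shift($parts);
  if (!is_array($node)) {
    return [];
  }
  if (array_key_exists($key, $node)) {
    return resolvePath($node[$key], $parts);
  }
  // Numeric list: collect matches from every element.
  $values = [];
  foreach ($node as $child) {
    if (is_array($child) && array_key_exists($key, $child)) {
      $values = array_merge($values, resolvePath($child[$key], $parts));
    }
  }
  return $values;
}
```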

Combining filters.

host/blog.birk-jensen.dk?author--username=BirkAndMe&tags--name[]=Drupal

Get blog posts written by BirkAndMe and tagged with Drupal.

The filter only supports AND conditions.

Operators.

The value checks support 4 operators, which are written as a prefix to the value (remember to URL-escape them):

  • = Equal, the default operator; makes sure the post value is the same as the given value.
  • ! Not, makes sure the value is not the same.
  • < Less than, only matches if the post value is less than the given value.
  • > Greater than, only matches if the post value is greater than the given value.

Get all posts except a specific one.

host/blog.birk-jensen.dk?_id=!6298c0135787a911a45d5fcf

Only posts with more than 5 reactions (%3E is the URL-escaped >):

host/blog.birk-jensen.dk?totalReactions=%3E5
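A sketch of how the operator prefix could be parsed and applied (the function name matchesFilter is hypothetical; the actual script may differ):

```php
<?php
// Sketch: strip an optional operator prefix (!, <, >) from the filter
// value and compare against the post's value. Illustrative only.
function matchesFilter($actual, string $filter): bool {
  $op = $filter !== '' && in_array($filter[0], ['!', '<', '>'], true)
    ? $filter[0]
    : '=';
  $value = $op === '=' ? $filter : substr($filter, 1);
  switch ($op) {
    case '!': return $actual != $value;
    case '<': return $actual < $value;  // PHP compares numeric strings numerically
    case '>': return $actual > $value;
    default:  return $actual == $value;
  }
}
```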

Debugging.

Add a debug parameter to get a better view of what's going on. It will show the normalized values and the GraphQL used to query Hashnode.

host/blog.birk-jensen.dk?author--username=BirkAndMe&tags--name[]=Drupal&debug

The code

Repo link again: github.com/BirkAndMe/hashnoderss

There's not much to be said about the code, it should be pretty self-explanatory.

It's kept simple rather than smart. Whenever I could do something smart, I chose to do something explicit instead, to make the code more approachable.

Because of the simplicity, reading the index.php from the beginning should give a pretty good picture of what's going on.

Here's the gist of it (gives you an overview, before diving into the code):

  1. Read the original RSS feed from the blog.
  2. Get all URLs in the original RSS (these are used as slugs when querying the Hashnode API).
  3. Normalize the query parameters into a filter array, and use this array to build the GraphQL property list.
  4. Query the Hashnode API, and run through all the posts checking the filter matches (and removing the posts that don't match from the RSS).
  5. Return the filtered RSS.
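Step 3 above, building the GraphQL property list from filter keys, could look roughly like this (a sketch; buildSelection and renderSelection are my names, not the repository's):

```php
<?php
// Sketch of step 3: merge "--" separated filter keys into a tree and
// render it as a GraphQL selection set. Illustrative, not the repo's code.
function buildSelection(array $keys): string {
  $tree = [];
  foreach ($keys as $key) {
    $ref = &$tree;
    foreach (explode('--', $key) as $part) {
      if (!isset($ref[$part])) {
        $ref[$part] = [];
      }
      $ref = &$ref[$part];
    }
    unset($ref);
  }
  return renderSelection($tree);
}

function renderSelection(array $tree): string {
  $fields = [];
  foreach ($tree as $field => $children) {
    $fields[] = $children === []
      ? $field
      : $field . ' { ' . renderSelection($children) . ' }';
  }
  return implode(' ', $fields);
}
```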

Remarks

I hope anyone other than me can use this for their Hashnode blog (I might set up an actual service in the future, so you don't have to host the script).

I also hope someone will find it refreshing to read a simple, framework-independent open source project. I know I found it refreshing to write something using only the simplest of tools.

Real-life use case.

I haven't tested this in production yet, but I would likely set up an HTTP cache (Varnish) in front of it to minimize the external requests (RSS and API calls). This should be almost plug and play, since the script only uses GET requests.

Also, using the $hostname setting and hosting the RSS feed on something like rss.domain.com will give prettier URLs.

Future features.

My guess is that tag filtering is the only filter needed 99.9% of the time, so any future feature is primarily just for the fun of it, but here goes:

  • Get the greater and lesser than working on dates.
  • Implement a regular expression operator.
  • Improve code documentation on helper functions.
  • Error handling (there's none at the moment).
  • Get posts from a specific user (using the user query in the Hashnode API) across multiple blogs.

Limitations.

The Hashnode API is slow, especially when querying many posts (to the point where it sometimes fails). This is why the script limits posts to a max of 20; in most cases this is more than enough, because RSS clients only look for new posts anyway.

The script also caches the requests for 10 minutes, so you won't see new posts instantly.
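A simple file-based cache along those lines could look like this (an assumption about the mechanism, not the repository's actual code; the $fetch callback stands in for the real HTTP request):

```php
<?php
// Sketch of a 10-minute file cache (assumption; see the repo for the
// actual caching). $fetch stands in for the real HTTP request.
function cachedFetch(string $url, callable $fetch, int $ttl = 600): string {
  $file = sys_get_temp_dir() . '/hashnoderss_' . md5($url);
  if (is_file($file) && time() - filemtime($file) < $ttl) {
    // Fresh enough: serve the cached body without hitting the network.
    return file_get_contents($file);
  }
  $body = $fetch($url);
  file_put_contents($file, $body);
  return $body;
}
```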
