
Now I'm using Notion for this blog + notion2jekyll tool for sponsors

02 Sep 2022

The problem

Lately I wanted to start writing more blog posts, but writing them in markdown is neither comfortable nor fast:

  • I have to be on the computer
  • Open an IDE
  • Manually create a file
  • Copy the template
  • Set the properties
  • Then I have to manually create and copy the images to the folder and reference them
  • And finally commit and push
  • 🤯

I still don’t want to deal with databases or backups, nor use WordPress or anything else that will definitely kill my server. Ghost was somewhat fine with SQLite, but it still required backups and doesn’t support plugins, so it cannot support the kind of sponsored content I want to publish, in the way I want, with the flexibility I want in terms of page generation. So I want to keep using my landinger (a Kotlin-based, Jekyll-compatible generator) approach.


The solution

I wanted a WYSIWYG approach, but also to keep my Jekyll setup. I wanted support for post scheduling; simple, boilerplate-free and fine-grained page generation; and everything in a git repository with full history, so that GitHub and my server make the backups implicitly and no databases are involved on my end.

I explored a few options here, like using GitHub issues, GitHub discussions, and GitHub projects. But in the end I decided to use Notion and to write a synchroniser. Eventually I might also be able to synchronise from GitHub projects, but for now I decided to stick to Notion.

Doing this required some time and a few steps, but it was worth the time.

Also, I’m now offering this approach to some of my sponsors based on their sponsoring tier.

Writing a cacheable Notion API

So the first thing I needed was an API client for accessing Notion databases. Since Notion exposes a standard REST API, I decided to implement the client myself using OkHttp, with generic pagination built on kotlinx-coroutines Flow.
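To make the pagination idea concrete, here is a minimal sketch and not the actual notion2jekyll code. It swaps Flow for the standard library's lazy Sequence so it runs without extra dependencies. The has_more and next_cursor fields mirror Notion's list responses; the Page class, the paginate function and the fakeFetch stand-in are hypothetical names invented for this example.

```kotlin
// One page of results, mirroring the has_more / next_cursor fields of a Notion list response.
data class Page<T>(val results: List<T>, val nextCursor: String?, val hasMore: Boolean)

// Lazily walks every page. fetchPage is a hypothetical callback that would
// perform the real HTTP request for the given cursor (null means "first page").
fun <T> paginate(fetchPage: (cursor: String?) -> Page<T>): Sequence<T> = sequence {
    var cursor: String? = null
    do {
        val page = fetchPage(cursor)
        yieldAll(page.results)
        cursor = page.nextCursor
    } while (page.hasMore && cursor != null)
}

// In-memory stand-in for the remote API: three items served in two pages.
fun fakeFetch(cursor: String?): Page<String> = when (cursor) {
    null -> Page(listOf("a", "b"), nextCursor = "c2", hasMore = true)
    else -> Page(listOf("c"), nextCursor = null, hasMore = false)
}
```

Calling paginate(::fakeFetch).toList() walks both pages, while paginate(::fakeFetch).take(1) only requests the first one, which is the main reason to keep the pagination lazy.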

When we query a database, it returns the matching pages with lots of information, including each page’s id and last modification time. With that information we can query each page and its blocks individually only if the page was modified since it was cached (by comparing against the timestamp stored for the page).
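The cache check can be sketched roughly like this. It assumes the cache stores one timestamp per page id; PageSummary and pagesToRefresh are hypothetical names, and the real tool may organise its cache differently.

```kotlin
// Minimal view of a page as returned by a database query (illustrative fields only).
data class PageSummary(val id: String, val lastEditedTime: String)

// Returns the pages whose content has to be downloaded again: those that were
// never cached, or whose timestamp differs from the one stored in the cache.
// cachedTimestamps maps page id -> last edited time saved on the previous run.
fun pagesToRefresh(
    queried: List<PageSummary>,
    cachedTimestamps: Map<String, String>
): List<PageSummary> = queried.filter { cachedTimestamps[it.id] != it.lastEditedTime }
```

Pages that are not in the returned list can be served straight from the cache, so an unchanged database costs a single query.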

Converting a page into markdown with FrontMatter

The Notion API is really complex, with dozens of possible object types. So I used Jackson and its subtype support to mirror their objects, and on those objects I created a toMarkdown method that serializes each one to the equivalent markdown/HTML.
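As a rough illustration of the "one class per block type, each with its own toMarkdown" idea, here is a simplified sketch. It leaves Jackson out entirely and uses a plain sealed class; the block types, their fields and the renderPost helper are hypothetical simplifications, not the real Notion object model.

```kotlin
// Simplified stand-ins for a few Notion block types; the real API has many more.
sealed class Block {
    abstract fun toMarkdown(): String
}

data class Paragraph(val text: String) : Block() {
    override fun toMarkdown() = text
}

data class Heading(val level: Int, val text: String) : Block() {
    override fun toMarkdown() = "#".repeat(level) + " " + text
}

data class BulletedItem(val text: String) : Block() {
    override fun toMarkdown() = "- $text"
}

// Joins the blocks with blank lines between them.
fun List<Block>.toMarkdown(): String = joinToString("\n\n") { it.toMarkdown() }

// Prepends a Front Matter header. Values are written as-is, so this sketch
// does not handle values that would need YAML quoting.
fun renderPost(frontMatter: Map<String, String>, blocks: List<Block>): String {
    val header = frontMatter.entries.joinToString("\n") { (key, value) -> "$key: $value" }
    return "---\n$header\n---\n\n" + blocks.toMarkdown() + "\n"
}
```

With a sealed hierarchy, supporting a new block type means adding one class, and the compiler flags it if toMarkdown is missing.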

Synchronising existing/nonexistent pages

Once I was able to retrieve all the pages in a database in a reasonably efficient way thanks to the cache, the next step was to put those pages, as markdown + Front Matter, in the posts folder of the Jekyll site. But I also wanted to track removed, updated and new pages, so I could remove old files and update existing ones.

To keep track of this, I added a Front Matter field called notion_page_id, set automatically for pages generated by the notion2jekyll synchroniser. That way I can iterate over all the posts up front, figure out which ones were created in Notion, and remove them if they don’t exist anymore, without affecting the other existing posts in the blog.
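The removal logic could look roughly like the sketch below. It assumes posts are available as a map from file path to file content and that the Front Matter is a simple block delimited by --- lines; notionPageId and stalePosts are hypothetical helper names, not the tool's real functions.

```kotlin
// Extracts notion_page_id from a post's Front Matter, or returns null for
// hand-written posts that have no such field (or no Front Matter at all).
fun notionPageId(fileContent: String): String? {
    val lines = fileContent.lines()
    if (lines.firstOrNull()?.trim() != "---") return null
    return lines.drop(1)
        .takeWhile { it.trim() != "---" }
        .firstOrNull { it.startsWith("notion_page_id:") }
        ?.substringAfter(":")
        ?.trim()
}

// Paths of generated posts whose Notion page is no longer in the database.
// Posts without a notion_page_id are never returned, so manual posts stay untouched.
fun stalePosts(existingPosts: Map<String, String>, liveIds: Set<String>): List<String> =
    existingPosts.filter { (_, content) ->
        val id = notionPageId(content)
        id != null && id !in liveIds
    }.keys.toList()
```

The important property is the null check: a post is only ever a deletion candidate if the synchroniser itself marked it as generated.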

Actually synchronising into GitHub with commits

Deployment happens on commit: either through GitHub Pages, where each commit triggers a page upload, or, in my case, through a server refresh triggered by each commit, since I’m using landinger and not Jekyll directly.

But how to synchronize it without requiring a server? GitHub Actions workflows. By creating a .github/workflows/sync.yml file I was able to run a cron job twice per day, with the possibility of triggering it manually from GitHub if required. The job executes the notion2jekyll code and then commits the changes to the main branch.

Using it yourself

Create a Notion Integration so we can access the Notion API

First go to https://notion.so/my-integrations:

Then create a new integration:

And grab the secret token:

Create a new database in Notion

Then you can create a new Notion database and start adding posts. The database can have these properties: Permalink (string), Tags (multi-select), Sponsor (number/enum), Draft (boolean) and Published (date). The converter uses them to decide whether to put the article in the posts folder or in the draft folder: an article counts as unpublished if Draft is set or if the publication date has not arrived yet. The permalink, the tags and the sponsor are put in the Front Matter of the article.

Then you have to give permissions to the integration in the panel where you share the page/database with other people.
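The draft/published rule for the database properties can be sketched as follows. This is one possible reading, not the converter's real code: it assumes a missing Published date also counts as unpublished, and PostProperties, targetFolder and the "drafts" folder name are hypothetical.

```kotlin
import java.time.LocalDate

// The two properties that decide where an article ends up.
data class PostProperties(val draft: Boolean, val published: LocalDate?)

// Returns "posts" for published articles and "drafts" (assumed folder name) otherwise.
// An article is published when Draft is unset and its date is today or earlier.
fun targetFolder(props: PostProperties, today: LocalDate): String {
    val date = props.published
    val dateReached = date != null && !date.isAfter(today)
    return if (!props.draft && dateReached) "posts" else "drafts"
}
```

Because the date is compared on every run, an article scheduled for the future moves from drafts to posts on the first synchronisation after its date arrives.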

Files:

notion2jekyll.jar

The JAR file contains the Kotlin code. It accepts the NOTION_SECRET and NOTION_DATABASE_ID environment variables and generates its output in the current directory: the .notion_cache folder for caches, the posts folder for posts and the static folder for images.

notion2jekill.jar.zip : rename it from .zip to .jar to use it

sync.sh

Dependencies are stored on GitHub, while the jar is provided here.

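# Download the dependencies jar once; later runs reuse the local copy
if [ ! -f notion2jekyll-deps.jar ]
then
  wget -q -O notion2jekyll-deps.jar https://github.com/soywiz/soywiz/releases/download/artifacts/notion2jekyll-deps.jar
fi
# Run the synchroniser, forwarding any arguments given to this script
NOTION_SECRET=... NOTION_DATABASE_ID=... java -classpath ./notion2jekyll-deps.jar:notion2jekyll.jar MainKt "$@"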

.github/workflows/sync.yml

on:
  # https://crontab.guru/#0_10,22_*_*_*
  # twice per day at 10:00 and 22:00
  schedule:
    - cron: '0 10,22 * * *'
  workflow_dispatch:
    inputs:
      anything:
        required: false
        type: boolean

jobs:
  sync:
    runs-on: ubuntu-latest
    timeout-minutes: 2
    #if: ${{ inputs.print_tags }}
    steps:
      - uses: actions/checkout@v3
        timeout-minutes: 1
        with:
          persist-credentials: false # otherwise, the token used is the GITHUB_TOKEN, instead of your personal token
          fetch-depth: 0 # otherwise, you will fail to push refs to the dest repo
      - name: Synchronizes notion
        timeout-minutes: 1
        run: ./sync.sh
      - name: Print the manual trigger input to STDOUT
        timeout-minutes: 1
        run: echo The input is ${{ inputs.anything }}
      - name: Commit & Push changes
        timeout-minutes: 1
        uses: actions-js/push@5f565701a8b9f9aa6b7efc25f28994eabfcf5312
        with:
          branch: master
          github_token: ${{ secrets.GITHUB_TOKEN }}

This usually takes less than 30 seconds per run, so even if the repository is private, two runs a day add up to about 30 minutes per month. That is way less than the 2000 free minutes per month that GitHub Actions provides for private repos.

It is recommended to keep the .notion_cache folder committed in the repo, so this action can reuse it and doesn’t have to download all the pages, images and files every time.