Deploying data on Github using Portal.js and Github pages
Use Case:
You have some data in a Github repo and you'd like to deploy it online using "portal" so that it is easy for others to view, explore and use.
Here we show how you can use portal.js plus github actions to deploy your dataset in minutes and keep it updated as you make changes.
The example focuses on the case of a Frictionless dataset but it works for any dataset type supported by portal.js.
We provide three options on how to do this and recommend using the first one unless you really want to get hands on:
- Deploying datasets automatically by setting up a github actions script.
- Deploying datasets from a local bash script with portal code commits
- Deploying datasets from a local bash script without portal code commits
Deploy datasets automatically by setting up a github actions script
The github actions below will automatically build and deploy a single page, Frictionless dataset to gh-pages
branch. Follow the steps below to achieve this:
- Create a secret so we can automatically commit to gh-pages branch (see below)
- Set up the github action to build portal to your dataset and deploy it (see below)
- Wait for your page to build and then setup github pages (see below)
- View the results: visit
https://<your github username>/github.io/<dataset repo name>/
Step 1
In the dataset repository you want to deploy, create a github secret with the name PORTAL_REPO_NAME
and the value should be the name of the repository.
See steps on creating a secret here
Step 2
In the dataset repository you want deploy create a .github/workflow
directory and add a main.yml
file with the following content (you can also view/download this action file here:
name: github pages
on:
push:
branches:
- master
- main
jobs:
run:
runs-on: ubuntu-latest
steps:
- uses: actions/checkout@v2
- name: Setup Node
uses: actions/setup-node@v2.1.2
with:
node-version: '12.x'
- name: Build datasets
env:
PORTAL_REPO_NAME: ${{ secrets.PORTAL_REPO_NAME }}
run: |
curl https://raw.githubusercontent.com/datopian/portal.js/main/site/public/scripts/single-dataset-no-commit.sh > portal.sh
git config --local user.email "$(git log --format='%ae' HEAD^!)"
git config --local user.name "$(git log --format='%an' HEAD^!)"
source ./portal.sh
Then, commit and push your code.
git add .
git commit -m "Build dataset page"
git push
Step 3
Wait for a while as your page builds, and once you see the green check mark, navigate to your repository's github pages
in settings, set the source
to gh-pages
and folder to /root
:
Deploy single dataset without commiting portal.js code
Users who want to deploy datasets from a local bash script without saving/commiting the portal.js code, can use the script shown below.
Using this script means you do not have access to the portal.js code used to generate the dataset page, and as such cannot modify/extend it.
This script creates and commit only the build/output files to the gh-pages branch. Follow the steps below to achieve this.
Step 1
Clone/Pull the dataset repository you want deploy. For example:
git clone https://github.com/datasets/finance-vix
cd finance-vix
Step 2
In a terminal, export an env variable with the name of your dataset github repo. For example if deploying https://github.com/datasets/finance-vix, then export the name as:
export PORTAL_REPO_NAME=finance-vix
Step 3
In the dataset repository's root folder, create a file called portal.sh
and paste the following content:
#!/bin/bash
git checkout -b gh-pages
git rm -r --cached .
rm -rf portal
mkdir -p portal
npx create-next-app portal -e https://github.com/datopian/portal.js/tree/main/examples/dataset-frictionless
mkdir portal/public/dataset
cp -a ./data portal/public/dataset
cp -a ./datapackage.json portal/public/dataset
cp -a ./README.md portal/public/dataset
PORTAL_DATASET_PATH=$PWD"/portal/public/dataset"
export PORTAL_DATASET_PATH
cd portal
assetPrefix='"/'$PORTAL_REPO_NAME'/"'
basePath='"/'$PORTAL_REPO_NAME'"'
echo 'module.exports = {assetPrefix:' ${assetPrefix}', basePath: '${basePath}' }' > next.config.js ## This ensures css and public folder works
yarn export
cd ..
cp -R -a portal/out/* ./
touch .nojekyll
git add $PWD'/_next' $PWD'/index.html' $PWD'/dataset' $PWD'/404.html' $PWD'/.nojekyll' $PWD'/favicon.ico'
git commit -m "Build new dataset page"
git push origin gh-pages
Step 4
Run the bash script in a terminal with:
source portal.sh
Note: Use
source
instead ofbash
so that the script can work well with environment variables.
Step 5
Go to your repository's github pages
in setting and set the Branch to gh-pages and folder to root:
Step 6
Open your deployed site at https://<your github username>/github.io/<dataset repo name>
Deploy single dataset with portal commit
Users who want access to the portal.js code used for generating the dataset page can use the script shown in the following section.
This script creates and commits the portal.js code to the root branch and also adds an automated script to deploy to gh-page. Follow the steps below to use this script.
Step 1
Create a Github Personal Access Token (PAT). See steps here
Step 2
In the dataset repository you want to deploy, create a github secret with the name PORTAL_NEXT_TOKEN
. The value should be the PAT created in step 1. See steps on creating a secret here
Note: Without the PAT and the secret configured, the automatic build will fail.
Step 3
Clone/Pull the dataset repository you want deploy. For example:
git clone https://github.com/datasets/finance-vix
cd finance-vix
Step 4
In your computer's terminal/command prompt, export an environment variable with the name of your dataset's github repo.
For example if you want to deploy the dataset at https://github.com/datasets/finance-vix, then export the name using the command:
export PORTAL_REPO_NAME=finance-vix
Step 5
Create a file called portal.sh
and paste the following content:
#!/bin/bash
rm -rf portal
mkdir -p portal
npx create-next-app portal -e https://github.com/datopian/portal.js/tree/main/examples/dataset-frictionless
mkdir portal/public/dataset
cp -a ./data portal/public/dataset
cp -a ./datapackage.json portal/public/dataset
cp -a ./README.md portal/public/dataset
PORTAL_DATASET_PATH=$PWD"/portal/public/dataset"
export PORTAL_DATASET_PATH
mkdir -p .github && mkdir -p .github/workflows && touch .github/workflows/main.yml
curl https://raw.githubusercontent.com/datopian/portal.js/main/site/public/scripts/gh-page-builder-action.yml > .github/workflows/main.yml
cd portal
assetPrefix='"/'$PORTAL_REPO_NAME'/"'
basePath='"/'$PORTAL_REPO_NAME'"'
echo 'module.exports = {assetPrefix:' ${assetPrefix}', basePath: '${basePath}' }' > next.config.js ## This ensures css and public folder works
cd ..
git add .
git commit -m "Add dataset build feature"
git push
echo "Portal generated, please push your code to github"
Step 6
Run the bash script with:
source portal.sh
Note: Use
source
instead ofbash
so that the script can work well with environment variables.
Step 7
Go to your repository's github pages
in setting and set the Branch to gh-pages and folder to root:
Step 8
Open your deployed site at https://<your github username>/github.io/<dataset repo name>