First steps

After installing the Operator and its dependencies as described in Installation, follow the steps below to set up and connect to a Superset instance.

Database for the Superset metadata

Superset metadata (slices, connections, tables, dashboards, etc.) is stored in a SQL database.

For testing purposes, you can spin up a PostgreSQL database with the following commands:

helm repo add bitnami https://charts.bitnami.com/bitnami

helm install --wait superset bitnami/postgresql \
    --set auth.username=superset \
    --set auth.password=superset \
    --set auth.database=superset

This setup is unsuitable for production use! Follow the specific production setup instructions for one of the supported databases to get a production-ready database.

Secret with Superset credentials

Superset needs a secret that contains the database connection credentials as well as an admin account for Superset itself. Create a file called superset-credentials.yaml:

---
apiVersion: v1
kind: Secret
metadata:
  name: simple-superset-credentials
type: Opaque
stringData:
  adminUser.username: admin
  adminUser.firstname: Superset
  adminUser.lastname: Admin
  adminUser.email: admin@superset.com
  adminUser.password: admin
  connections.secretKey: thisISaSECRET_1234
  connections.sqlalchemyDatabaseUri: postgresql://superset:superset@superset-postgresql.default.svc.cluster.local/superset

And apply it:

kubectl apply -f superset-credentials.yaml

The connections.secretKey is used to securely sign the session cookies and can also be used by extensions for any other security-related needs. It should be a long random string of bytes.
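One way to obtain such a random string is sketched below. It assumes that openssl is installed and falls back to /dev/urandom otherwise; the variable name SECRET_KEY and the length of 42 bytes are illustrative choices, not requirements of the Operator.

```shell
# Generate 42 random bytes and encode them as base64 (56 characters).
if command -v openssl > /dev/null 2>&1; then
  SECRET_KEY="$(openssl rand -base64 42)"
else
  # Fallback for systems without openssl.
  SECRET_KEY="$(head -c 42 /dev/urandom | base64 | tr -d '\n')"
fi

# Print a line that can be pasted into the stringData section of the secret.
printf 'connections.secretKey: "%s"\n' "$SECRET_KEY"
```

The value is quoted because base64 output may contain characters such as + and /.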

connections.sqlalchemyDatabaseUri must contain the connection string to the SQL database storing the Superset metadata.
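As a rough illustration, the connection string used in this guide can be assembled from its parts as follows. The variable names are hypothetical; the values are the ones from the test database installed above, with the Helm release named superset in the default namespace.

```shell
# Parts of the connection string (values taken from the test setup above).
DB_USER="superset"
DB_PASSWORD="superset"
DB_SERVICE="superset-postgresql"   # service created by the Helm release
DB_NAMESPACE="default"             # namespace the database was installed into
DB_NAME="superset"

# Scheme, credentials, cluster-internal host name and database name.
DB_URI="postgresql://${DB_USER}:${DB_PASSWORD}@${DB_SERVICE}.${DB_NAMESPACE}.svc.cluster.local/${DB_NAME}"
echo "$DB_URI"
```

If the password contains characters that are special in URLs, such as @ or /, it needs to be URL-encoded before it is placed in the connection string.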

The adminUser fields are used to create an admin user. Note that the admin user is disabled if you use a non-default authentication mechanism like LDAP.

Creation of a Superset node

A Superset node must be created as a custom resource. Create a file called superset.yaml:

---
apiVersion: superset.stackable.tech/v1alpha1
kind: SupersetCluster
metadata:
  name: simple-superset
spec:
  image:
    productVersion: 3.1.0
  clusterConfig:
    credentialsSecret: simple-superset-credentials
    listenerClass: external-unstable
  nodes:
    roleGroups:
      default:
        config:
          rowLimit: 10000
          webserverTimeout: 300

And apply it:

kubectl apply -f superset.yaml

metadata.name contains the name of the Superset cluster.

The previously created secret must be referenced in spec.clusterConfig.credentialsSecret.

The rowLimit configuration option defines the row limit when requesting chart data.

The webserverTimeout configuration option defines the maximum number of seconds a Superset request can take before timing out. This setting therefore also limits how long a query to an underlying datasource can take. If you get timeout errors before your query returns a result, you may need to increase this timeout.
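If the timeout needs to be raised later, editing superset.yaml and applying it again works. As an alternative, the following sketch builds a JSON merge patch for the same field. The value 600 is only an example, and the fully qualified resource name in the commented command is an assumption derived from the apiVersion and kind shown above.

```shell
# New timeout in seconds (example value, adjust as needed).
NEW_TIMEOUT=600

# JSON merge patch that only touches webserverTimeout of the default role group.
PATCH="$(printf '{"spec":{"nodes":{"roleGroups":{"default":{"config":{"webserverTimeout":%d}}}}}}' "$NEW_TIMEOUT")"
echo "$PATCH"

# To apply the patch (requires access to the cluster; resource name is an assumption):
# kubectl patch supersetclusters.superset.stackable.tech simple-superset --type merge --patch "$PATCH"
```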

Wait for the Superset node to finish deploying. You can do so with this command:

kubectl rollout status --watch statefulset/simple-superset-node-default --timeout 300s

Connecting to the web interface

When the Superset node is created and the database is initialized, Superset can be opened in the browser.

The Superset port, which defaults to 8088, can be forwarded to the local host:

kubectl port-forward service/simple-superset-external 8088 > /dev/null 2>&1 &

Superset can then be opened in the browser at http://localhost:8088.
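Because the port-forward runs in the background with its output discarded, it can be useful to check that Superset actually responds before switching to the browser. The helper below is a minimal sketch: the function name wait_for_superset is hypothetical, it assumes that curl is installed, and it relies on the /health endpoint that Superset exposes.

```shell
# Poll a URL until it answers successfully or the attempts run out.
# $1: URL to check (default: the forwarded Superset health endpoint)
# $2: maximum number of attempts (default: 30), with 2 seconds between attempts
wait_for_superset() {
  url="${1:-http://localhost:8088/health}"
  attempts="${2:-30}"
  i=0
  while [ "$i" -lt "$attempts" ]; do
    if curl --silent --fail --max-time 5 "$url" > /dev/null 2>&1; then
      echo "Superset is reachable at $url"
      return 0
    fi
    i=$((i + 1))
    sleep 2
  done
  echo "Superset did not respond at $url" >&2
  return 1
}
```

Calling wait_for_superset without arguments checks the forwarded port for up to about a minute and returns a non-zero exit code if Superset never answers.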

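Enter the admin credentials from the Kubernetes secret, that is, the values of adminUser.username and adminUser.password (admin and admin in this example).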

Great! Superset is now ready to use. If you also want some sample data and dashboards to explore what Superset has to offer, continue with the next step.

Loading examples and accessing example dashboards

Superset comes with example data and dashboards that you can load to explore its features. To do so, create a file called superset-load-examples-job.yaml with this content:

---
apiVersion: batch/v1
kind: Job
metadata:
  name: superset-load-examples
spec:
  template:
    spec:
      volumes:
      - configMap:
          defaultMode: 420
          name: simple-superset-node-default
        name: config
      containers:
      - name: superset
        image: docker.stackable.tech/stackable/superset:3.1.0-stackable24.3.0
        command: [
          "/bin/sh",
          "-c",
          "mkdir --parents /stackable/app/pythonpath && \
          cp /stackable/config/* /stackable/app/pythonpath && \
          echo 'SQLALCHEMY_EXAMPLES_URI = os.environ.get(\"SQLALCHEMY_DATABASE_URI\")' >> /stackable/app/pythonpath/superset_config.py && \
          superset load_examples"
        ]
        env:
        - name: SECRET_KEY
          valueFrom:
            secretKeyRef:
              key: connections.secretKey
              name: simple-superset-credentials
        - name: SQLALCHEMY_DATABASE_URI
          valueFrom:
            secretKeyRef:
              key: connections.sqlalchemyDatabaseUri
              name: simple-superset-credentials
        volumeMounts:
        - mountPath: /stackable/config
          name: config
        resources:
          limits:
            cpu: 1200m
            memory: 1000Mi
          requests:
            cpu: 300m
            memory: 1000Mi
      restartPolicy: Never
  backoffLimit: 4

This Kubernetes Job loads the example data, using the same connection information and credentials as the Superset instance. Run it and wait for it to complete:

kubectl apply -f superset-load-examples-job.yaml
sleep 5
kubectl wait --for=condition=complete --timeout=300s job/superset-load-examples

The Job takes a few minutes to complete. Afterwards, check the web interface again. New dashboards should be available.

Great! Now you can explore this sample data, run queries on it or create your own dashboards.

What’s next

Look at the Usage guide to find out more about configuring your Superset instance or have a look at the Superset documentation to create your first dashboard.