Spark Operator

The Spark Operator aims to make specifying and running Spark applications as easy and idiomatic as running other workloads on Kubernetes. We have deployed a cluster-wide Spark Operator that defines the kinds ScheduledSparkApplication and SparkApplication. The structure of both kinds is fully documented in the operator's API documentation, and the official user guide covers their use in detail.

Important Configuration

You have to set spec.driver.serviceAccount to default, otherwise your Spark application will fail with permission errors. The upstream Spark examples use hostPath as a volume source, but this won't work in the cluster because host mounts are forbidden. Use persistentVolumeClaim instead (examples), and don't forget to create the PVC before submitting the application.
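The two requirements above can be sketched in one manifest. This is a minimal illustration, not a tested configuration for this cluster. The names (spark-data, spark-pi), the image, the jar path, the Spark version, the storage size, and the ReadWriteMany access mode are all assumptions chosen for the example. Only serviceAccount: default and the use of persistentVolumeClaim come from the text above.

```yaml
# Hypothetical PVC; name, size, and access mode are placeholders.
# ReadWriteMany is assumed so the driver and executors can mount the
# volume from different nodes; your storage class must support it.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: spark-data
spec:
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 1Gi
---
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi                          # hypothetical application name
spec:
  type: Scala
  mode: cluster
  image: "your-registry/spark:3.5.0"      # hypothetical image
  mainClass: org.apache.spark.examples.SparkPi
  # Assumed jar location inside the image; adjust to your image.
  mainApplicationFile: local:///opt/spark/examples/jars/spark-examples_2.12-3.5.0.jar
  sparkVersion: "3.5.0"
  volumes:
    - name: data
      persistentVolumeClaim:              # used instead of hostPath
        claimName: spark-data             # must match the PVC above
  driver:
    cores: 1
    memory: 512m
    serviceAccount: default               # required, see above
    volumeMounts:
      - name: data
        mountPath: /data
  executor:
    instances: 1
    cores: 1
    memory: 512m
    volumeMounts:
      - name: data
        mountPath: /data
```

Create the PVC first, or apply both documents together, so that the claim exists when the driver pod is scheduled.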

Other Configuration

Spark UI

The Spark UI is automatically created for each of your applications at the HTTPS address

https://spark-[app_namespace].dyn.cloud.e-infra.cz/[app_namespace]/[app_name]

and is also visible in the Rancher UI. Follow the steps to see its final form. When you copy the address into the browser, omit the trailing character group (/|$)(.*) (a Rancher-related issue).

[Image: sparkaddress]

When you access the path, you should be presented with a dashboard similar to the following:

[Image: sparkdashboard]

Custom Spark UI Ingress

If you want to add custom annotations to the Ingress (e.g. to use a different certificate issuer; we use letsencrypt) or use a custom TLS secret, include the following snippet in your Spark application YAML.

  sparkUIOptions:
    ingressAnnotations:
      [annotation_key]: [annotation_value]
      # ...
    ingressTLS:  # optional section
      - secretName: [secretname_in_app_namespace]
        hosts:
          - [host]

We merge the items you provide with our configuration. If you set the same annotation as we do, your value takes precedence.
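As an illustration, a filled-in version of the snippet might look like the sketch below. The issuer name, secret name, namespace, and application name are hypothetical. The cert-manager.io/cluster-issuer annotation assumes that certificates on the cluster are issued through cert-manager, which the text above does not confirm. The placement under spec is inferred from the indentation of the template snippet.

```yaml
apiVersion: sparkoperator.k8s.io/v1beta2
kind: SparkApplication
metadata:
  name: spark-pi                    # hypothetical application name
  namespace: my-namespace           # hypothetical namespace
spec:
  # ... rest of the application spec ...
  sparkUIOptions:
    ingressAnnotations:
      # Assumes cert-manager; "my-issuer" is a hypothetical issuer name.
      # Because the same key may be set by the cluster defaults, this
      # value would replace the default one after merging.
      cert-manager.io/cluster-issuer: my-issuer
    ingressTLS:                     # optional section
      - secretName: my-spark-ui-tls # hypothetical secret in the app namespace
        hosts:
          # Follows the address pattern shown in the Spark UI section.
          - spark-my-namespace.dyn.cloud.e-infra.cz
```

Annotations that you do not set keep their default values, so you only need to list the keys you want to add or override.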