Guidelines for using PySpark 3.X on EVOLVE dashboard

This document describes how to use PySpark 3.X through a Zeppelin notebook on the EVOLVE dashboard. It provides a simple ETL example that loads a 2.5 GB dataset and runs an SQL query, followed by the configuration for CPU-only and GPU-accelerated execution in PySpark 3.X.
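
As a rough illustration of that workflow, the sketch below builds a set of Spark properties for CPU-only or GPU-accelerated execution, then loads a dataset and runs an SQL query against it. Several parts are assumptions and not taken from the guidelines themselves: that GPU acceleration is provided by the RAPIDS Accelerator for Apache Spark (hence the `spark.plugins` and `spark.rapids.sql.enabled` properties), that the dataset is a CSV file with a header row, and the executor sizing values. The helper names `spark_conf` and `run_etl` and the view name `records` are hypothetical.

```python
def spark_conf(use_gpu: bool = False) -> dict:
    """Return Spark properties for CPU-only or GPU-accelerated execution."""
    conf = {
        # Hypothetical executor sizing; adjust to the resources you are given.
        "spark.executor.cores": "4",
        "spark.executor.memory": "8g",
    }
    if use_gpu:
        # Assumption: GPU execution relies on the RAPIDS Accelerator plugin.
        conf.update({
            "spark.plugins": "com.nvidia.spark.SQLPlugin",
            "spark.rapids.sql.enabled": "true",
            # One GPU per executor, shared by its four concurrent tasks.
            "spark.executor.resource.gpu.amount": "1",
            "spark.task.resource.gpu.amount": "0.25",
        })
    return conf


def run_etl(path: str, use_gpu: bool = False):
    """Load a CSV dataset and run a simple SQL query over it."""
    # Imported lazily so spark_conf() can be used without PySpark installed.
    from pyspark.sql import SparkSession

    builder = SparkSession.builder.appName("etl-sketch")
    for key, value in spark_conf(use_gpu).items():
        builder = builder.config(key, value)
    spark = builder.getOrCreate()

    # Assumption: the dataset is CSV with a header row.
    df = spark.read.csv(path, header=True, inferSchema=True)
    df.createOrReplaceTempView("records")
    return spark.sql("SELECT COUNT(*) AS n FROM records")
```

In a Zeppelin notebook a `spark` session is normally created by the interpreter, so the same properties would more likely be set in the interpreter settings than through `SparkSession.builder`; the builder form is used here only to keep the sketch self-contained.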

For any issues or questions, please contact aferikoglou@microlab.ntua.gr
