MEVN stack experiment with Jupyter

In this notebook I want to play with Jupyter and walk through the steps of creating a MEVN (MongoDB, Express, Vue.js, Node.js) application from a notebook. Normally I would do this in a Linux terminal and a text editor, but since a notebook can combine code, explanation and shell commands, I want to tell the story here, which will hopefully help people experimenting with full-stack development. I will create the application, use some simple Linux tricks and use Selenium to test the application.

Create Spark dataframe column with lag

Create a lagged column in a PySpark dataframe:

from pyspark.sql.functions import monotonically_increasing_id, lag
from pyspark.sql.window import Window

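# Add an ID to be used by the window function.
# The IDs are increasing and unique, but not necessarily consecutive.
df = df.withColumn('id', monotonically_increasing_id())
# Set the window, ordered by the ID.
# Without partitionBy, Spark moves all rows into a single partition.
w = Window.orderBy('id')
# Create the lagged value (the value from the previous row)
value_lag = lag('value').over(w)
# Add the lagged values to a new column; the first row gets null
df = df.withColumn('prev_value', value_lag)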
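
To make it clearer what ends up in the prev_value column, here is a minimal plain-Python sketch of the lag semantics, without Spark. The helper lag_values and the sample list are hypothetical and only for illustration; the offset and default parameters mirror the optional arguments of PySpark's lag function, assuming the rows are already in the desired order.

```python
from typing import List, Optional, Sequence, TypeVar

T = TypeVar("T")


def lag_values(
    values: Sequence[T], offset: int = 1, default: Optional[T] = None
) -> List[Optional[T]]:
    """Return, for each position, the value `offset` rows earlier.

    Positions that have no earlier row get `default`
    (None here, which corresponds to null in Spark).
    """
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return [
        values[i - offset] if i >= offset else default
        for i in range(len(values))
    ]


# Hypothetical sample data standing in for the 'value' column
values = [10, 20, 30, 40]
prev_values = lag_values(values)

# Pair each value with its lagged counterpart
print(list(zip(values, prev_values)))
# [(10, None), (20, 10), (30, 20), (40, 30)]
```

The first row has no predecessor, so it gets the default, just as the first row of the Spark dataframe gets null in prev_value.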