Timelapse data exploration of NYC taxi rides
Back in 2014 the city of New York put online a dataset with yellow cab rides comprising a full year of data. Back then I remember struggling quite a bit with managing the sheer volume of the dataset involved, trying out various alternatives for reading in the full dataset. After a few years SAP introduced an “Express edition” of their HANA in-memory database which allowed you to run a 32 GB database just from your own hardware. That was enough to load a full years’ worth of data and be able to analyze it using a standard SQL approach.
Nov 24, 2021