O'Reilly - SQL for CSV datasets
MP4 | Video: AVC 1920 x 1080 | Audio: AAC 44 Khz 2ch | Duration: 00:07:37 | 136.51 MB
Execute SQL with CSV datasets
SQL is usually reserved for interacting with databases, but in this video I show how you can use Databricks to run SQL queries against a CSV dataset. A few defaults can make working with a CSV dataset problematic, such as disabled schema inference and no header row. Changing these options is crucial when running SQL against the CSV, since the defaults treat every single value as a string. Although this video uses Azure Databricks, the same concepts should apply to any Databricks cluster.
In this video you will learn:
Upload a CSV dataset to Databricks
Create a Notebook to work with the CSV dataset
Find potential pitfalls in default options for CSV and SQL
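
The pitfalls described above can be illustrated outside Databricks. The following is a minimal sketch, not the notebook from the video: it uses Python's standard csv and sqlite3 modules as a stand-in for a Databricks cluster, and the sample data, the load_rows and max_score helpers, and the table name t are all hypothetical. It shows how a SQL aggregate gives misleading answers when the header row is read as data and every value stays a string. In Spark, the corresponding CSV reader options are header and inferSchema.

```python
import csv
import io
import sqlite3

# Hypothetical sample data, not taken from the video.
CSV_TEXT = "name,score\nalice,9\nbob,10\ncarol,100\n"


def load_rows(text, has_header):
    """Parse CSV text; every value comes back as a string."""
    rows = list(csv.reader(io.StringIO(text)))
    if has_header:
        return rows[0], rows[1:]
    # Without a header, invent positional names (Spark uses _c0, _c1, ...).
    return [f"_c{i}" for i in range(len(rows[0]))], rows


def max_score(text, has_header=False, cast_numbers=False):
    """Run SELECT MAX(...) on the second column of the CSV."""
    header, data = load_rows(text, has_header)
    if cast_numbers:
        # Crude stand-in for schema inference: digit-only strings become ints.
        data = [[int(v) if v.isdigit() else v for v in row] for row in data]
    conn = sqlite3.connect(":memory:")
    # Columns are declared without a type so SQLite stores values as given.
    cols = ", ".join(f'"{c}"' for c in header)
    conn.execute(f"CREATE TABLE t ({cols})")
    marks = ", ".join("?" for _ in header)
    conn.executemany(f"INSERT INTO t VALUES ({marks})", data)
    (result,) = conn.execute(f'SELECT MAX("{header[1]}") FROM t').fetchone()
    conn.close()
    return result


if __name__ == "__main__":
    print(max_score(CSV_TEXT))                                       # header row treated as data
    print(max_score(CSV_TEXT, has_header=True))                      # strings compared as text
    print(max_score(CSV_TEXT, has_header=True, cast_numbers=True))   # numeric comparison
```

With both options off, the header text itself wins the MAX comparison. With only the header recognized, "9" sorts above "100" because the values are compared as text. Only when the numbers are typed does the query return the largest score.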