O'Reilly - SQL for CSV datasets

Posted By: sammoh
MP4 | Video: AVC 1920 x 1080 | Audio: AAC 44 kHz 2ch | Duration: 00:07:37 | 136.51 MB

Execute SQL with CSV datasets

SQL is usually reserved for interacting with databases, but in this video I show how you can use Databricks to run SQL queries against a CSV dataset. A few defaults can make working with a CSV dataset problematic, such as schema inference being disabled and the first row not being treated as a header. Changing these is crucial when running SQL against the CSV, since with the defaults every single value is treated as a string. Although this video uses Azure Databricks, the same concepts should apply to any Databricks cluster.
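
As a rough illustration of overriding those two defaults, here is a minimal PySpark sketch. It is not taken from the video: the helper name `register_csv_as_view`, the view name, and the file path in the usage comments are hypothetical, and the `spark` argument is assumed to be the SparkSession that a Databricks notebook normally provides.

```python
# Reader options that override the two defaults discussed above.
CSV_OPTIONS = {
    "header": "true",       # treat the first row as column names
    "inferSchema": "true",  # make an extra pass over the data to pick column types
}


def register_csv_as_view(spark, path, view_name):
    """Load a CSV file into a DataFrame and expose it to SQL as a temporary view.

    `spark` is assumed to be an existing SparkSession.
    """
    df = spark.read.options(**CSV_OPTIONS).csv(path)
    df.createOrReplaceTempView(view_name)
    return df


# Possible usage in a notebook cell (path and view name are placeholders):
# df = register_csv_as_view(spark, "/FileStore/tables/example.csv", "example")
# df.printSchema()
# spark.sql("SELECT * FROM example LIMIT 10").show()
```

Calling `printSchema()` with and without `inferSchema` is a quick way to check whether the columns are still all strings.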

In this video you will learn:

How to upload a CSV dataset to Databricks
How to create a notebook to work with the CSV dataset
How to find potential pitfalls in the default options for CSV and SQL
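
To see why string-typed values are a pitfall, the following sketch uses only Python's standard library as a stand-in. It does not run on Databricks or use Spark, the sample data is made up, and Spark's own handling of string columns can differ in detail. It only shows the general problem that numbers stored as text compare character by character.

```python
import csv
import io

# Hypothetical sample data standing in for an uploaded CSV file.
SAMPLE = "item,price\napple,9\npear,10\nmelon,100\n"

rows = list(csv.DictReader(io.StringIO(SAMPLE)))

# Without type inference, every field is a string.
prices_as_text = [row["price"] for row in rows]

# String comparison is character by character, so "9" sorts above "100".
largest_as_text = max(prices_as_text)

# Converting to integers restores numeric ordering.
largest_as_number = max(int(p) for p in prices_as_text)

print(largest_as_text, largest_as_number)  # prints: 9 100
```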