FWSSUG - Querying Semistructured Data in Azure Data Lake with U-SQL

Fort Worth SQL Server Users Group


August 2018: Russ Loski presents:
Querying Semistructured Data in Azure Data Lake with U-SQL

Data is exploding across enterprises. Much of it is semi-structured junk. Or is it junk? What are you going to do with it until you can find out? Microsoft’s Azure Data Lake is a cloud storage and analytics service for parking a variety of data. When you are ready, you can query that semi-structured data using a SQL-like language called U-SQL, which combines SQL-style declarative queries with C# expressions. In this session I will demonstrate the similarities and differences between U-SQL and T-SQL. I will also show how easy it is to build a query against 21 GB of CSV files. Such queries can help you determine whether you have a gold mine in your data or a bunch of garbage before investing in a full data warehouse build.

We will go beyond simply extracting data from CSV files. We will look at how easy it is to gain greater insight from your data using U-SQL and Microsoft’s Cognitive Services. We will also look at how to extract data from JSON files using U-SQL.

Russ Loski is a SQL Server ETL developer based in Dallas, TX. Twenty years ago, he began working with SQL Server 6.5. He has continued to develop applications connected to every version of SQL Server since. His clients have ranged from insurance to healthcare, from movie theaters to American football. Russ is a regular speaker at SQL Saturday events, as well as at the SQL Server Users Groups in the North Texas region. Russ likes working with data in various shapes. He has used DTS, SSIS, U-SQL, R, and .NET to process XML, CSV, and other file formats.