Upcoming events (3)
Information Asset (www.information-asset.com) has partnered with Collibra to provide you visibility into Catalog and Lineage: https://www.collibra.com/data-catalog-lineage-demo

Organizations across the world are using Collibra Data Catalog to empower business users to quickly discover and understand data that matters, so they can generate impactful insights that drive business value. And with the addition of Collibra Data Lineage, they can automatically map data flows and transformations as data moves from source to destination to get an end-to-end view of their data.

In this product demonstration webinar, you’ll learn how Collibra can help you:
● Gain a unified view of data assets across your enterprise data landscape
● Get access to compliant data via an eCommerce-like shopping experience
● Integrate with commonly used BI tools to catalog reports and share insights
● Discover and extract technical lineage automatically from source systems
● Trace data flows from source to report with summary business lineage

Please register at https://www.collibra.com/data-catalog-lineage-demo or contact Kevin Ladwig at [masked] if you have any questions.
[Session] SQL Query Anything, Anywhere with Starburst Presto, an Open Source Data Access Layer

[Abstract] Pulling all of the data sources we need within reach of our SQL-based BI, analytics, and data science tools can be challenging, especially when some of those sources are data lakes of schema-on-read data optimized for Hadoop clusters. The usual way to solve this problem is to create a "single source of truth" by moving the data via an ETL process into a data warehouse. This often takes significant time across the planning, infrastructure, and ETL project phases. Meanwhile, we either can’t answer those multi-data-source questions, or we end up running our analysis on partial data while the new data sources remain out of reach.

Presto was developed at Facebook to address this very problem and then released as an open-source project. It's based on a simple premise: rather than ETL the data into a "single source of truth", provide a "single source of access." As users run queries, worker nodes pull data from a mixture of databases, data warehouses, and data lakes and merge it on the fly into a single result set, so one SQL prompt reaches all of them. The cluster can also scale elastically, adding or removing workers to handle more concurrency or deliver faster response times. Data analysts can then use a single SQL connection string to access all of their data sources from their tools.

Under this model, when a new data source is needed, users gain access simply by updating the cluster with that source's connection string and login. There's no need to go through an ETL process. For data lakes that implement schema-on-read, Presto connects to one or more metastores and reads the data directly, giving end users access as if the lake were a traditional SQL database. Presto bypasses the engines for Hive, Hadoop, and others.
This can provide faster response times to the data lake than those engines can deliver, because Presto skips many of the disk-based operations found in map-reduce systems.

This session will explore this "single point of access" approach to data sources and discuss how you can use open source Presto in your own environment to bring all of your data within reach of your analysis tools. Plus, you will hear how a Cost-Based Optimizer can handle getting the fastest results across multiple data sources ("SQL on anything").

Topics include:
● The architecture of open source Presto.
● How to bring multiple data sources within reach of your data analysis tools using a single point of access.
● How schema-on-read data lakes can be merged with schema-on-write SQL data sources.
● How a Cost-Based Optimizer can handle getting the fastest results across multiple data sources.

[About the Speaker] Randy Chertkow is a Solutions Engineer at Starburst Data and has over 25 years of IT experience as an infrastructure architect for Fortune 500 companies. He has used his skills as a database developer, architect, and tools expert to implement enterprise-scale IT at companies like Abbott Laboratories, VMware, Mesosphere, Cockroach Labs, and more. He has a Master’s in Computer Science: Data Communications with a concentration in artificial intelligence. Besides his IT background, Randy is a musician on the side and has written four books with major publishers about the music business. He has also spoken professionally across the country at organizations such as the Recording Academy (grammy.com) and the City of New York at Carnegie Hall.

Starburst Data: https://www.starburstdata.com/
Information Asset: https://www.information-asset.com/
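
As a rough illustration of the "single point of access" idea in the abstract above, the sketch below uses Python's built-in sqlite3 module, not Presto. It only mimics the pattern: two separate databases are reached through one connection and joined in a single SQL statement, with no ETL step in between. The names `lake`, `customers`, `clicks`, `build_single_access_connection`, and `clicks_per_customer` are hypothetical and chosen purely for this example.

```python
import sqlite3


def build_single_access_connection() -> sqlite3.Connection:
    """Open one connection that can see two independent in-memory databases."""
    # The default database ("main") stands in for a warehouse-style source.
    conn = sqlite3.connect(":memory:")
    # ATTACH adds a second, separate database under the hypothetical name "lake".
    conn.execute("ATTACH DATABASE ':memory:' AS lake")

    conn.execute("CREATE TABLE main.customers (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("CREATE TABLE lake.clicks (customer_id INTEGER, page TEXT)")
    conn.executemany(
        "INSERT INTO main.customers VALUES (?, ?)",
        [(1, "Ada"), (2, "Grace")],
    )
    conn.executemany(
        "INSERT INTO lake.clicks VALUES (?, ?)",
        [(1, "/home"), (1, "/pricing"), (2, "/home")],
    )
    conn.commit()
    return conn


def clicks_per_customer(conn: sqlite3.Connection) -> list:
    """Join tables from both databases in one query and return (name, clicks) rows."""
    query = """
        SELECT c.name, COUNT(*) AS clicks
        FROM main.customers AS c
        JOIN lake.clicks AS k ON k.customer_id = c.id
        GROUP BY c.name
        ORDER BY c.name
    """
    return conn.execute(query).fetchall()


if __name__ == "__main__":
    connection = build_single_access_connection()
    for name, clicks in clicks_per_customer(connection):
        print(name, clicks)
```

In a Presto cluster the two sources would typically be different systems registered with the cluster, and the join would run across worker nodes. The point here is only that the analyst writes one query against one connection.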
Information Asset (www.information-asset.com) has partnered with Collibra to provide you visibility into Data Governance and Privacy: https://www.collibra.com/data-governance-privacy-demo

Organizations across the world are using Collibra Data Governance and Data Privacy to understand their ever-growing amounts of data and operationalize privacy policies. With Collibra, organizations can ensure their teams can trust and use data to improve business outcomes in a scalable, compliant manner.

In this product demonstration webinar, you’ll learn how Collibra can help you:
● Establish a common understanding of data and create a shared language so that everyone works from an agreed-upon source of truth
● Easily automate governance and stewardship tasks so that your data management practice can remain in place as your business evolves
● Operationalize privacy policies and create a sustainable approach that keeps up with the pace of regulatory change

Sign up for a demo to see how Collibra Data Governance and Data Privacy help create a common data understanding, scale compliance and unlock the true value of your data.

Please register at https://www.collibra.com/data-governance-privacy-demo or contact Kevin Ladwig at [masked] if you have any questions.