Categorical Databases


Details
This month Ryan Wisnesky will be presenting on Categorical
Databases.
We also invite everyone to give a lightning talk (5-15 minutes)
on any topic you like, as long as it's of interest to the
Clojure community.
Pizza will be provided. Please RSVP so we have an idea of how
much to order.
------
The open-source Categorical Query Language (CQL) and integrated
development environment (IDE) perform data-related tasks, such
as querying, combining, migrating, and evolving databases,
using category theory, a branch of mathematics that has
revolutionized several areas of computer science. Open-source
CQL is production-ready for single-node in-memory data
processing workloads, such as integrating data for data
science. It is available for free download at
categoricaldata.net.
Its value proposition is:
- Reduce risk of failure through artificial intelligence. CQL
contains an embedded automated theorem prover that
guarantees the correctness of CQL programs. For example, a
CQL program cannot materialize an instance that violates a
data integrity constraint. Such errors are detected at
compile time, when they are easiest to fix.
- Preserve data quality. High-quality data is expensive to
obtain, so it is important to preserve that quality
throughout the data life-cycle. CQL programs evolve and
migrate data in a mathematically universal way, with zero
degradation. As such, data integrated by CQL has many
advantages, including perfect provenance: every row in the
output of a CQL program contains a lineage that describes
exactly how that row was obtained from input data.
- Increase developer productivity through higher-level
abstractions. CQL generalizes concepts from SQL using
powerful principles from category theory. For example, CQL
generalizes SQL's select-from-where queries from returning
single tables to returning many tables related by foreign
keys. Such higher-level abstractions enable developers to be
more productive.
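
To make the last two points a little more concrete, here is a rough
sketch in plain Python. It is not CQL syntax and does not use the CQL
tool. It is only an analogy for a query that returns several tables
related by a foreign key, with each output row carrying a lineage
back to its input row. The table names, column names, and helper
functions are all hypothetical, chosen for illustration.

```python
# Hypothetical input data: two tables keyed by id, where each
# employee's "dept" column is a foreign key into departments.
departments = {1: {"name": "Math"}, 2: {"name": "CS"}, 3: {"name": "Physics"}}
employees = {
    10: {"name": "Ada", "dept": 1, "salary": 120},
    11: {"name": "Bob", "dept": 2, "salary": 90},
    12: {"name": "Cy", "dept": 1, "salary": 150},
}


def well_paid_with_departments(employees, departments, min_salary):
    """Return two related output tables instead of one flat table.

    Every output row records a lineage: the input table and row id
    it was derived from.
    """
    out_emps = {
        eid: {"name": e["name"], "dept": e["dept"], "lineage": ("employees", eid)}
        for eid, e in employees.items()
        if e["salary"] >= min_salary
    }
    # Keep only the departments that the selected employees point at,
    # so the foreign key in the output still resolves.
    used = {e["dept"] for e in out_emps.values()}
    out_depts = {
        did: {"name": d["name"], "lineage": ("departments", did)}
        for did, d in departments.items()
        if did in used
    }
    return {"employees": out_emps, "departments": out_depts}


def foreign_keys_hold(result):
    """Check that every output employee refers to an output department."""
    return all(
        e["dept"] in result["departments"] for e in result["employees"].values()
    )


result = well_paid_with_departments(employees, departments, min_salary=100)
```

In this sketch the integrity check runs after the fact, at run time.
The abstract above describes CQL as catching such violations at
compile time, which this Python analogy does not attempt to model.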
============
About the speaker:
Ryan Wisnesky obtained B.S. and M.S. degrees in mathematics and
computer science from Stanford University and a Ph.D. in
computer science from Harvard University, where he studied the
design and implementation of provably correct software systems.
Previously, he was a postdoctoral associate in the MIT
department of mathematics, where he developed the categorical
query language, CQL. He currently leads open-source and
commercial development of CQL as CTO of Conexus AI. He
maintains an active collaboration with the
information-integration department of IBM Research, where he
contributed to the Clio, Orchid, and HIL projects.
