This month we'll have talks showing how to use Neo4j to explore the Bitcoin blockchain and various open government datasets.
Using Neo4j for the blockchain.
In this talk Greg will explain how the Bitcoin blockchain can be stored in Neo4j, and the advantages this has over using a relational database.
Exploring Open Government Data using Neo4j
Fake News is a hot topic at the moment and Emil Eifrem has just written about how Graph Databases can help unlock the truth (https://neo4j.com/emil/help-world-make-sense-news-data/) as evidenced by the Panama Papers investigation.
Open Government Data is a fantastic resource that allows a unique insight into the records of recent events within the country.
Neo4j is a mature, native graph database and the features added over the last few major releases have made it a compelling platform for rapidly exploring datasets.
During this Lightning Talk we'll:
* See how to derive an initial graph model
* Perform ETL Cypher-style
* Solve referential data quality issues
* Explore why graphs are revealing!
The session will be a hands-on journey from identifying a dataset to being able to explore it in Neo4j.
Diving into the UK’s corporation “beneficial ownership” with Neo4j
In June 2016, Companies House started publishing the world’s first open data register of “beneficial owners” or “people with significant control” of companies registered in the UK.
In November, DataKind UK ran a weekend DataDive in cooperation with Global Witness and OpenCorporates to explore this new data for the first time and see whether it points to any promising leads for further investigation into cases of tax evasion and corruption.
One of the three teams at the event was tasked with building a graph of the data and running network analysis on it, with the aim of producing easily reproducible queries and results.
Company neighbourhood Instantiator.
The purpose of this package is to collect data from Companies House (https://www.gov.uk/government/organisations/companies-house) about a specific company and about that company's n-hop network, and to import that data into Neo4j for inspection.
The relationships identified between two companies are based on intermediate links between officers participating in both companies. Officers may participate in companies with different roles. These roles are captured to characterise the different kinds of relationships.
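As a rough illustration of this idea (not the package's actual code), the sketch below derives company-to-company links from a list of officer appointments and keeps the role held on each side. The function name `link_companies`, the tuple layout and the sample names are all assumptions made for the example.

```python
from collections import defaultdict
from itertools import combinations


def link_companies(appointments):
    """Link companies that share an officer.

    `appointments` is an iterable of (officer, company, role) tuples
    (a hypothetical layout chosen for this sketch).
    Returns a list of (company_a, company_b, officer, role_a, role_b).
    """
    # Group every appointment by the officer who holds it.
    by_officer = defaultdict(list)
    for officer, company, role in appointments:
        by_officer[officer].append((company, role))

    links = []
    for officer, held in by_officer.items():
        # Every pair of distinct companies served by the same officer
        # becomes one link, labelled with the role held in each company.
        for (c1, r1), (c2, r2) in combinations(sorted(held), 2):
            if c1 != c2:
                links.append((c1, c2, officer, r1, r2))
    return links


# Made-up sample data, for illustration only.
sample = [
    ("Alice", "Acme Ltd", "director"),
    ("Alice", "Beta Ltd", "secretary"),
    ("Bob", "Acme Ltd", "director"),
]
links = link_companies(sample)
```

Here Alice connects Acme Ltd and Beta Ltd as a director of one and secretary of the other, while Bob, who sits on a single company, creates no link.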
The code runs iteratively using a Breadth First Search (BFS) approach and stops either when no more neighbours can be found or when the maximum (user-defined) n-hop distance from the root company has been covered (e.g. 10 hops).
By leveraging this code we were able to construct a company/officers neighbourhood with affiliated companies and offices.
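A minimal sketch of such a hop-limited BFS is shown here, assuming a caller-supplied `get_neighbours` function that returns the companies linked to a given company. That function, the name `collect_neighbourhood` and the toy data are hypothetical stand-ins, not taken from the package itself.

```python
from collections import deque


def collect_neighbourhood(root, get_neighbours, max_hops):
    """Breadth-first walk from `root`.

    Stops when no new neighbours remain or when `max_hops` is reached.
    Returns a dict mapping each company found to its hop distance from root.
    """
    distance = {root: 0}
    queue = deque([root])
    while queue:
        company = queue.popleft()
        if distance[company] >= max_hops:
            continue  # do not expand companies already at the hop limit
        for neighbour in get_neighbours(company):
            if neighbour not in distance:  # skip companies seen before
                distance[neighbour] = distance[company] + 1
                queue.append(neighbour)
    return distance


# Toy adjacency data standing in for real lookups (an assumption).
toy_links = {
    "A": ["B", "C"],
    "B": ["A", "D"],
    "C": ["A"],
    "D": ["B", "E"],
    "E": ["D"],
}


def toy_neighbours(company):
    return toy_links.get(company, [])


result = collect_neighbourhood("A", toy_neighbours, max_hops=2)
```

With a limit of two hops, company E (three hops from A) is never reached. Raising `max_hops` lets the walk continue until the neighbourhood is exhausted.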
The code is available on DataReply’s public repo: https://github.com/DataReplyUK/datareplyuk/tree/master/Company_Neighbourhood