Upcoming events (1)
Mate Toth (BlackRock) - Extracting topics from the FED: The monetary policy implemented by the Federal Reserve has a significant effect on the economy. Though these policies are usually intended to work over the long term, information released by the FED can also have an immediate impact on asset prices, as the market reacts to its statements instantaneously. In this presentation, we are going to look at how unsupervised Natural Language Processing techniques can help us quickly gain insights into the contents of the speeches and statements released by the FED. In particular, we will use Topic Modelling to extract the relative importance of the topics discussed, study their trends over time, and look at how separating the corpus by topic can provide us with enhanced sentiment signals.

Peter Koczka (BlackRock) - Document Type Recommendation with Restricted Resources: In the document management space, it is crucial to have a reliable document classification system that can distinguish different types of forms and contracts. Permissions and a number of applications rely heavily on the type a document belongs to. Classification is therefore an important but non-trivial task, facing obstacles such as low OCR quality, a large number of named entities, and similarly worded but distinct contract types. Both machine learning and basic techniques built on Python primitives can lead to acceptable results, with slightly different pros and cons, which play an important role in making the decision in a corporate environment.

Géza Kulcsár - The Life of a Sign is its Use: On Context-Awareness in Semantic Graph Grammars: The first part of the title paraphrases Wittgenstein's Blue Book; already at this stage, Wittgenstein began to contemplate the notion of meaning that shaped his late thought, led by a desire to eliminate the traditional concept of meaning as something exterior pointed at by a semantic mapping.
Graph grammars are a powerful, high-level and generic formalism for describing the structure and generation of any suitable conceptual domain. Graph models are flexible enough to capture the static structure of domain elements (such as language ingredients of varying granularity) as well as the generation of such structures by the replacement (rewriting) rules of the grammar. In the context of NLP, graph grammars are extensively used to represent linguistic structures, i.e., language objects and their relations, in the spirit of the aforementioned Wittgensteinian "structuralist" program. In particular, hyperedge replacement grammars (HRGs) have frequently been studied for representing the structure of natural-language semantic units, and even their generation from elementary building blocks of the language. However prolific, this framework is not free from implicit assumptions, such as the context-free, and thus constrained, generation principle of HRGs and a necessarily fixed notion of when a semantic unit (e.g., a sentence) is complete. To address these issues, we propose the systematic study of alternative graph-rewriting formalisms, in particular algebraic graph rewriting, to potentially broaden the horizon of NLP analysis techniques. Here, the generation principle has a context-sensitive flavor, which, we argue, harmonizes better with structuralist semantic notions. The underlying graph models remain mathematically well-founded, yet are expressive enough to host arbitrary graph structures (including hypergraphs) without the burden of constrained derivation steps and enforced termination of semantic generation chains.
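The contrast drawn above can be made concrete with a toy sketch (all rule, node and label names here are invented for illustration, and plain dicts and sets stand in for a graph library): a context-sensitive rewriting step may inspect the rest of the host graph before firing, something a purely context-free HRG derivation step cannot do.

```python
# Toy sketch of a context-sensitive rewriting step on a labelled graph.
# All names (rule, nodes, labels) are hypothetical illustrations.

def apply_rule(edges, labels, lhs_edge, context_label, new_node):
    """Replace edge (u, v) by a path u -> new_node -> v, but only if a
    node carrying `context_label` exists somewhere in the host graph.
    The context check is what makes the rule context-sensitive: a
    context-free derivation step could not inspect the rest of the graph."""
    u, v = lhs_edge
    if (u, v) not in edges or context_label not in labels.values():
        return False  # no match, or required context absent
    edges.discard((u, v))
    labels[new_node] = "sem"
    edges.update({(u, new_node), (new_node, v)})
    return True

# Host graph: one edge plus a separate context node.
edges = {("subject", "object")}
labels = {"tense-marker": "ctx"}

apply_rule(edges, labels, ("subject", "object"), "ctx", "predicate")
# edges is now {("subject", "predicate"), ("predicate", "object")}
```

Without the node labelled `"ctx"` in the host graph, the same rule would refuse to fire, even though the edge it matches is present.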
In addition, algebraic graph rewriting has a well-studied extension for externally controlling generation steps, which appears to be a meaningful vehicle for achieving higher semantic awareness in automated natural-language understanding.
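That externally controlled extension can likewise be sketched in miniature (the rules, their names, and the termination bound are all invented for illustration): a control program, rather than nondeterministic matching, decides which rules fire, in what order, and when the generated structure counts as complete.

```python
# Toy sketch of externally controlled rewriting: a control expression
# such as "grow!; close" (apply `grow` as long as possible, then `close`
# once) steers the derivation. All rule names are hypothetical.

def as_long_as_possible(rule, state):
    """Control operator: apply `rule` repeatedly until it stops matching."""
    while rule(state):
        pass

def grow(state):
    """Rule: extend the chain by a fresh node, up to an arbitrary bound."""
    if len(state) >= 4:
        return False
    state.append(f"n{len(state)}")
    return True

def close(state):
    """Rule: declare the chain complete by appending a terminator."""
    if state and state[-1] != "END":
        state.append("END")
        return True
    return False

# Control expression: grow!; close
state = ["n0"]
as_long_as_possible(grow, state)
close(state)
# state == ["n0", "n1", "n2", "n3", "END"]
```

The point of the sketch is that completeness of the generated unit is decided by the control layer (the `close` step), not baked into the rules themselves, mirroring the flexible notion of "finished semantic unit" discussed above.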