Details

In this meetup, you will learn the fundamental steps of preparing and working with text data, such as full copies of famous literary works, song lyrics, or written and transcribed language. Using fairy tales as examples, such as the Brothers Grimm collection and Alice in Wonderland, we will prepare and tokenize text data, extract the most frequent words, and display the results graphically. Then we will explore methods for comparing multiple text sources: finding words that are especially characteristic of a particular author, speaker, genre, or chapter, and comparing them to similar texts. After this meetup, you'll know how to compute word frequencies, identify unique word choices, calculate text-level measures such as lexical density, and create engaging visualizations of text statistics in R.

R users of all levels can get something out of this meetup, as we will walk you through every step from reading in data to tuning the graphical parameters. The only requirement is to have R and RStudio up and running on your computer. Main packages include: tidyverse, tidytext, gutenbergr, scales. Primary source: Text Mining with R by Julia Silge and David Robinson.
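To give a feel for the workflow, here is a minimal sketch of tokenizing text, counting word frequencies, and estimating lexical density with dplyr and tidytext. It is not taken from the meetup materials: the two-line sample text and all object names (`sample_text`, `tokens`, `word_counts`, `content_words`, `lexical_density`) are hypothetical, and treating every non-stop-word as a content word is only a rough approximation of lexical density.

```r
library(dplyr)
library(tidytext)

# Hypothetical two-line sample standing in for a full text
# (in practice the text might come from gutenbergr instead)
sample_text <- tibble(
  line = 1:2,
  text = c(
    "Alice was beginning to get very tired of sitting by her sister on the bank",
    "and of having nothing to do"
  )
)

# One row per word; unnest_tokens() lowercases and strips punctuation by default
tokens <- sample_text %>%
  unnest_tokens(word, text)

# Word frequencies, most frequent first
word_counts <- tokens %>%
  count(word, sort = TRUE)

# Drop common function words using tidytext's built-in stop_words table
content_words <- tokens %>%
  anti_join(stop_words, by = "word")

# Rough lexical density: share of tokens that are not stop words (an assumption,
# not a formal part-of-speech based measure)
lexical_density <- nrow(content_words) / nrow(tokens)
```

Printing `word_counts` shows the frequency table, which could then be passed to ggplot2 for a bar chart of the most frequent words.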

The meetup will be hosted online via Zoom. The link will be sent out via Meetup shortly before the start time.
