Room 215
Exploratory Analysis of Textual Data: Collect a corpus of textual data and begin to analyze it.
Topics covered: pandas, textacy, Beautiful Soup
Working with unstructured, textual data in Python presents new challenges. We can use some of our familiar pandas idioms to organize our corpus of text documents, but even a surface knowledge of the corpus demands new tools for analyzing data. Here, we’ll build a corpus of text and begin looking for macro trends in it.
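As a taste of what organizing a corpus with pandas might look like, here is a minimal sketch. It is not the session's actual material: the two-document toy corpus, the `tokenize` helper, and the column names are all assumptions made for illustration, and a simple regular expression stands in for the richer processing a library such as textacy would provide.

```python
import re
from collections import Counter

import pandas as pd

# Hypothetical toy corpus; in practice the text might be scraped
# from web pages (for example with Beautiful Soup) or read from files.
raw_docs = {
    "doc_a": "The whale surfaced. The crew watched the whale.",
    "doc_b": "A quiet harbor, a quiet town, and a long winter.",
}


def tokenize(text):
    """Lowercase the text and return its alphabetic word tokens."""
    return re.findall(r"[a-z']+", text.lower())


# One row per document: a familiar pandas layout for a corpus.
corpus = pd.DataFrame(
    {"doc_id": list(raw_docs), "text": list(raw_docs.values())}
)
corpus["tokens"] = corpus["text"].apply(tokenize)
corpus["n_tokens"] = corpus["tokens"].apply(len)

# Corpus-wide term frequencies, a first look at macro trends.
term_counts = Counter(tok for toks in corpus["tokens"] for tok in toks)
top_terms = pd.Series(term_counts).sort_values(ascending=False)

print(corpus[["doc_id", "n_tokens"]])
print(top_terms.head())
```

Keeping one row per document means per-document measures (length, vocabulary size, metadata) sit alongside the text, while corpus-wide counts can be computed by flattening the token lists.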
Data Club meetups typically occur twice monthly, on Thursdays, throughout the semester. Open to everyone in the Columbia community, these informal events start with a presentation on a specific use case for Python, R, Julia, or JavaScript, then open up to questions, collaborative work, and discussion. Computation typically occurs within a Jupyter/Colab workflow, and participants of all skill levels are welcome.