Wharton Text Analytics course

Wharton course details

I teach a course (STAT 4240/7240) for Wharton MBA and upper-level undergraduate students on the theory and practical applications of text analytics, natural language processing (NLP), and large language models (LLMs) across industries. The course provides a comprehensive overview of text analytics and NLP, starting from the basics and ending with LLMs and multi-modal models. It covers enough theory for students to gain an intuition for how the models work and what the models are learning. Unlike a typical CS/NLP class, however, it emphasizes how these techniques and models are applied in different industries.

The course’s primary objective is to enable non-specialists from other fields to understand how these models work and to apply these techniques to their own data to obtain interesting insights. The lectures are complemented by Python code in Jupyter Notebooks, which helps students, especially non-specialists, climb the steep learning curve. The syllabus and links to the coursework are listed below.

Lec.	Topic	Notebook
1	Introduction to Text Analytics	Colab link
2	Converting text from files into statistical data	Colab link
3	N-grams, tagging and parsing text, regular expressions	Colab link
4	Language models Pt. 1 (EDA, simple statistics, text classifiers)	Colab link
5	Language models Pt. 2 (best practices, topic modeling, sequence models)	Colab link
6	Text Analytics & NLP in the E-Commerce Industry	Colab link
7	Language models Pt. 3 (word embeddings & sequence models)	Colab link
8	Text Analytics & NLP in Finance	Colab link
9	Language models Pt. 4 (deep learning models & large language models)	Colab link
10	Text Analytics & NLP in Healthcare	Colab link
11	NLP Applications	Colab link
12	LLMs & Prompt Engineering	Colab link
13	NLP in Tech & State of the Art	Colab link