Wharton course details
I teach a course (STAT 4240/7240) to Wharton MBA and upper-level undergraduate students on the theory and practical applications of text analytics, natural language processing (NLP), and large language models (LLMs) in various industries. The course provides a comprehensive overview of text analytics and NLP, starting from the basics and ending with LLMs and multi-modal models. It covers enough theory for students to gain an intuition for how the models work and what the models are learning. Unlike a CS or NLP class, however, the course emphasizes how these techniques and models are applied in different industries.
The course’s primary objective is to enable non-specialists from other fields to understand how these models work and apply these techniques to their own data to obtain interesting insights. The lectures are complemented with Python code in Jupyter Notebooks, which help students, especially non-specialists, climb the steep learning curve. The syllabus and links to the coursework are listed below.
Lec. | Topic | Notebook |
---|---|---|
1 | Introduction to Text Analytics | Colab link |
2 | Converting text from files into statistical data | Colab link |
3 | N-grams, tagging and parsing text, regular expressions | Colab link |
4 | Language models Pt. 1 (EDA, simple statistics, text classifiers) | Colab link |
5 | Language models Pt. 2 (best practices, topic modeling, sequence models) | Colab link |
6 | Text analytics & NLP in the e-Commerce Industry | Colab link |
7 | Language models Pt. 3 (word embeddings & sequence models) | Colab link |
8 | Text analytics & NLP in Finance | Colab link |
9 | Language models Pt. 4 (deep learning models & large language models) | Colab link |
10 | Text analytics & NLP in Healthcare | Colab link |
11 | NLP Applications | Colab link |
12 | LLMs & Prompt Engineering | Colab link |
13 | NLP in Tech & State of the Art | Colab link |