Archived course content from the inaugural offering of Generative AI for Software Development. The Spring 2025 curriculum covered five core modules focused on the foundations of AI for code.
Overview of AI for software engineering, course structure, and the landscape of generative models applied to code — from early statistical approaches to modern LLMs.
Techniques for extracting, cleaning, and analyzing data from version control systems, issue trackers, and code hosting platforms to build training datasets for AI models.
Foundations of probabilistic language models for code: n-grams, smoothing, perplexity, and entropy — understanding how machines can learn the statistical patterns of programming languages.
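As a rough illustration of these ideas, the sketch below trains a bigram model on a few tokenized code lines, applies add-one (Laplace) smoothing, and computes perplexity. The function names, the `<s>`/`</s>` boundary markers, and the toy corpus are assumptions made for this example, not course material.

```python
import math
from collections import Counter


def train_bigram(token_seqs):
    """Count bigrams and their contexts over tokenized sequences."""
    context_counts = Counter()  # how often each token appears as a context
    bigram_counts = Counter()   # how often each (prev, cur) pair appears
    vocab = set()
    for seq in token_seqs:
        padded = ["<s>"] + list(seq) + ["</s>"]  # hypothetical boundary markers
        vocab.update(padded)
        for prev, cur in zip(padded, padded[1:]):
            context_counts[prev] += 1
            bigram_counts[(prev, cur)] += 1
    return context_counts, bigram_counts, vocab


def prob(prev, cur, context_counts, bigram_counts, vocab_size):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigram_counts[(prev, cur)] + 1) / (context_counts[prev] + vocab_size)


def perplexity(seq, context_counts, bigram_counts, vocab_size):
    """Perplexity = 2 ** (average negative log2 probability per token)."""
    padded = ["<s>"] + list(seq) + ["</s>"]
    log2_sum = 0.0
    steps = 0
    for prev, cur in zip(padded, padded[1:]):
        log2_sum += math.log2(prob(prev, cur, context_counts, bigram_counts, vocab_size))
        steps += 1
    return 2 ** (-log2_sum / steps)


if __name__ == "__main__":
    # Toy corpus: each line of code is already split into tokens.
    corpus = [
        ["for", "i", "in", "range", "(", "n", ")", ":"],
        ["for", "x", "in", "items", ":"],
        ["if", "x", "in", "items", ":"],
    ]
    ctx, bi, vocab = train_bigram(corpus)
    print(perplexity(corpus[0], ctx, bi, len(vocab)))             # familiar token order
    print(perplexity([")", "range", "in"], ctx, bi, len(vocab)))  # unfamiliar token order
```

A token order the model has seen gets a lower perplexity than a scrambled one, which is the sense in which the model has picked up statistical patterns of the language.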
Measuring the quality of AI-generated code: BLEU, CodeBLEU, exact match, functional correctness, and the challenges of evaluating generated outputs rigorously.
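To make the contrast between surface-level and behavioral metrics concrete, here is a minimal sketch of exact match and a test-based functional correctness check. The helper names and the whitespace normalization are assumptions for illustration. Running untrusted generated code with `exec` is unsafe outside a sandbox; it is used here only to keep the example short.

```python
def exact_match(generated, reference):
    """True if the two snippets are identical after collapsing whitespace.

    Collapsing whitespace is a simplifying assumption; it ignores
    indentation, which is significant in Python.
    """
    def normalize(text):
        return " ".join(text.split())
    return normalize(generated) == normalize(reference)


def functionally_correct(source, func_name, cases):
    """True if the function defined in `source` passes every test case.

    `cases` is a list of (args_tuple, expected_result) pairs.
    Any exception (syntax error, missing function, runtime error) counts as failure.
    """
    namespace = {}
    try:
        exec(source, namespace)  # unsafe for untrusted code; sandbox in practice
        func = namespace[func_name]
        return all(func(*args) == expected for args, expected in cases)
    except Exception:
        return False


if __name__ == "__main__":
    reference = "def add(a, b):\n    return a + b"
    generated = "def add(x, y):\n    return y + x"
    cases = [((1, 2), 3), ((0, 0), 0), ((-1, 1), 0)]

    print(exact_match(generated, reference))              # text differs
    print(functionally_correct(generated, "add", cases))  # behavior agrees
```

The generated function differs textually from the reference yet behaves identically on the test cases, which is one reason exact match alone can understate the quality of generated code.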
Neural approaches to code: word embeddings, RNNs, LSTMs, and the Transformer architecture — the building blocks behind modern code generation tools.