Learning Material — Spring 2025 Archive

Spring 2025 — Core Modules

Course Introduction

Overview of AI for software engineering, course structure, and the landscape of generative models applied to code — from early statistical approaches to modern LLMs.

Spring 2025 · Overview

Mining Software Repositories

Techniques for extracting, cleaning, and analyzing data from version control systems, issue trackers, and code hosting platforms to build training datasets for AI models.

Spring 2025 · Data Collection · Git

Probabilistic Source Code Modeling

Foundations of probabilistic language models for code: n-grams, smoothing, perplexity, and entropy — understanding how machines can learn the statistical patterns of programming languages.

Spring 2025 · N-grams · Probability

Evaluation Metrics

Measuring the quality of AI-generated code: BLEU, CodeBLEU, exact match, functional correctness, and the challenges of evaluating generated outputs rigorously.

Spring 2025 · BLEU · Benchmarks

Deep Learning for Software Engineering

Neural approaches to code: word embeddings, RNNs, LSTMs, and the Transformer architecture — the building blocks behind modern code generation tools.

Spring 2025 · Transformers · Neural Networks

Note: The Spring 2025 curriculum covered five foundational modules. Spring 2026 expanded the course to eight modules, adding coverage of prompting LLMs, code hallucinations, and genetic algorithms.