Module 0 · Slide 01
William & Mary · Department of Computer Science
CSCI 455 / CSCI 555

0. Course Introduction

Generative AI for Software Development — equipping the next generation of developers with AI-powered tools and techniques.

Spring 2026 · 3 Credits · Prof. Antonio Mastropaolo
Module 0 · Slide 02

Welcome & Course Vision

Software development is being fundamentally reshaped by generative AI. This course is designed to put you at the forefront of that transformation — not just as users of AI tools, but as engineers who understand the science behind them.

What This Course Is About
We explore how deep learning and large language models can automate core software engineering tasks: code generation, documentation, testing, code review, and more. You will learn both the theory and the hands-on practice.
Why It Matters
AI coding assistants are already changing how professional software is built. Understanding their foundations — and limitations — will make you a more effective, more employable engineer.
Our Approach
This is a hands-on, project-driven course. You will design, prompt, evaluate, and build with state-of-the-art AI systems. Every module includes interactive exercises and real-world applications.
Who This Is For
Whether you are an undergraduate (CSCI 455) or a graduate student (CSCI 555), if you are curious about the intersection of AI and software engineering, you are in the right place.
Module 0 · Slide 03

What Is Generative AI?

Not all AI is created equal. Understanding the distinction between generative and discriminative models is the foundation for everything we cover in this course.

Generative AI (Creates New Content)
Generative models learn the underlying probability distribution of data and can produce entirely new, plausible outputs that did not exist in the training set.
Text Generation
Writing essays, emails, documentation, and code comments from a prompt.
Code Completion
Autocompleting functions, classes, and entire modules based on context.
Image Synthesis
Creating visuals from textual descriptions (DALL-E, Midjourney, Stable Diffusion).
Discriminative AI (Classifies Existing Data)
Discriminative models learn the decision boundary between categories. They classify or label data but do not create new content.
Spam Detection
Classifying emails as spam or legitimate based on learned patterns.
Sentiment Analysis
Determining whether a review is positive, negative, or neutral.
Bug Classification
Categorizing reported issues by severity, component, or priority.
Key Insight
GenAI models learn probability distributions over data and sample from them to generate new, plausible outputs. This is why the same prompt can produce different results each time — the model is sampling from a distribution, not looking up an answer.
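The sampling idea can be illustrated with a minimal sketch. The token scores below are invented for illustration, and `sample_next_token` is a hypothetical helper, not part of any library; real models sample over vocabularies of tens of thousands of tokens.

```python
import math
import random

def sample_next_token(logits, temperature=1.0, rng=random):
    """Sample one token from a softmax distribution over `logits`."""
    # Scale scores by temperature, then apply a numerically stable softmax.
    scaled = {tok: score / temperature for tok, score in logits.items()}
    max_score = max(scaled.values())
    weights = {tok: math.exp(s - max_score) for tok, s in scaled.items()}
    total = sum(weights.values())
    tokens = list(weights)
    probs = [weights[t] / total for t in tokens]
    return rng.choices(tokens, weights=probs, k=1)[0]

# Hypothetical scores a model might assign to the token after "return a +".
toy_logits = {"b": 2.0, "1": 1.0, "c": 0.5}

rng = random.Random(0)
samples = [sample_next_token(toy_logits, temperature=1.0, rng=rng) for _ in range(10)]
print(samples)  # "b" is the most likely token, but other tokens can appear too
```

Lowering `temperature` concentrates probability on the highest-scoring token, which is one reason some tools feel more deterministic than others.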
Module 0 · Slide 04

The Evolution of AI for Code

AI-assisted programming did not appear overnight. Decades of research built the foundation for today's large language models.

1. Rule-Based Systems (1970s–1990s)

Expert systems and static analysis tools relied on hand-crafted rules written by domain experts. Effective for narrow tasks like linting and style checking, but brittle and unable to generalize.

2. Statistical Methods (2000s–early 2010s)

N-gram models and probabilistic approaches treated source code like natural language text. Researchers found that code is even more repetitive and predictable than English text, a key insight you will explore in Module 2.

3. Deep Learning (2010s)

Recurrent neural networks (RNNs) and sequence-to-sequence models enabled code completion, bug detection, and code summarization. The introduction of the Transformer architecture in 2017 was a turning point.

4. Large Language Models (2020s)

GPT, Codex, CodeLlama, StarCoder — models trained on billions of lines of code can now generate entire functions, explain complex code, and assist with debugging. This is the era we focus on.

Course Focus
This course focuses on the LLM era while teaching you the foundations — from statistical models to deep learning — that make it all work. Each era built on the previous one, and understanding this progression makes you a stronger practitioner.
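To make the statistical era concrete, here is a minimal sketch of a bigram (2-gram) model that predicts the next code token from the previous one. The toy corpus and whitespace tokenization are assumptions for illustration; research models are trained on far larger corpora with proper lexers and smoothing.

```python
from collections import Counter, defaultdict

def train_bigram_model(token_sequences):
    """Count how often each token follows each other token."""
    counts = defaultdict(Counter)
    for tokens in token_sequences:
        for prev, nxt in zip(tokens, tokens[1:]):
            counts[prev][nxt] += 1
    return counts

def predict_next(counts, prev_token):
    """Return the most frequent follower of `prev_token`, or None if unseen."""
    followers = counts.get(prev_token)
    if not followers:
        return None
    return followers.most_common(1)[0][0]

# Toy corpus: whitespace-tokenized Python lines (an assumption for illustration).
corpus = [
    "for i in range ( n ) :".split(),
    "for x in items :".split(),
    "for j in range ( m ) :".split(),
]
model = train_bigram_model(corpus)
print(predict_next(model, "in"))     # range
print(predict_next(model, "range"))  # (
```

Even this tiny model captures the repetitiveness of code: after `in`, `range` is the most common continuation in the corpus.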
Module 0 · Slide 05

GenAI in Action: Before & After

What does AI-assisted development actually look like in practice? Here is a side-by-side comparison of traditional vs. AI-augmented workflows.

Before GenAI
The traditional workflow:
Search & Copy
Browse Stack Overflow, copy snippets, adapt them manually to your codebase. Time: 15-30 min per problem.
Write Boilerplate
Manually write repetitive code: CRUD endpoints, data models, config files. Tedious but necessary.
Debug Alone
Read error traces, add print statements, search forums. Debugging could consume entire afternoons.
Write Tests Manually
Think through edge cases yourself. Write each test by hand. Coverage was often an afterthought.
With GenAI
The AI-augmented workflow:
AI Autocomplete
Copilot suggests the next line or entire function as you type. Accept, reject, or modify in real time.
Scaffold Features
Describe what you need in plain English. AI generates a working starting point for entire features in seconds.
AI-Assisted Debugging
Paste the error into an LLM. Get an explanation, root cause analysis, and suggested fix instantly.
Generated Test Suites
Ask AI to generate unit tests for your function. It identifies edge cases you might have missed.
Important Caveat
AI does not replace your need to understand the code. It accelerates your workflow, but you are still the engineer. Blindly accepting AI suggestions leads to bugs, security vulnerabilities, and technical debt. This course teaches you to be a critical collaborator, not a passive consumer.
Module 0 · Slide 06

Your Instructor: Prof. Antonio Mastropaolo

A software engineering researcher passionate about the intersection of artificial intelligence and code.

Research Focus
My work centers on AI for Software Engineering (AI4SE): using deep learning and large language models to automate software development tasks including code generation, code summarization, bug fixing, and code review.
Background
Ph.D. in Software Engineering with a focus on applying neural machine translation techniques to source code. Published research on code generation, automated documentation, and empirical software engineering.
Teaching Philosophy
I believe in learning by doing. In this course, you will not just read about AI tools — you will build with them, evaluate them, and push their boundaries. My goal is to help you develop critical thinking about what AI can and cannot do for software development.
Office & Contact
Department of Computer Science, William & Mary
Office hours: Check the syllabus for schedule
I encourage you to reach out early and often — collaboration is key to learning.
Module 0 · Slide 07

Research Spotlight

The topics in this course are rooted in active research. Here are some contributions that inform our curriculum.

NMT for Code

Applying neural machine translation techniques to translate code between programming languages and generate code from natural language descriptions.

  • Treating code as a translatable language
  • Encoder-decoder architectures for source code
  • Cross-language code migration

AI for Code Review

Training models to automatically generate code review comments, catching bugs and suggesting improvements the way a human reviewer would.

  • Automated review comment generation
  • Defect prediction from code changes
  • Learning from millions of real reviews

Evaluating LLMs for SE

Developing rigorous methodologies for measuring how well AI models perform on software engineering tasks — beyond simple accuracy metrics.

  • Benchmark design for code generation
  • Human evaluation protocols
  • Measuring functional correctness

Code Summarization

Generating natural language descriptions of source code automatically — helping developers understand unfamiliar codebases faster.

  • Method-level documentation generation
  • Commit message synthesis
  • API documentation automation
Why This Matters For You
This is not a course taught from a textbook — it is taught by someone who actively publishes research in this field. You will engage with cutting-edge papers, and your projects may contribute to real research questions.
Module 0 · Slide 08

Course Structure

The course is organized into focused modules, each building on the last. Here is the roadmap of what you will learn.

Module 1

Mining Software Repositories

Extracting insights from version control, issue trackers, and code databases to fuel AI models.

Module 2

Source Code Modeling

Understanding n-grams, language models, and statistical approaches to modeling source code.

Module 3

Evaluating AI Techniques

Metrics and methodologies for rigorously assessing AI-generated code quality and correctness.

Module 4

Deep Learning for SE

Neural networks, transformers, and architectures that power modern code intelligence systems.

Module 5

Prompting LLMs

Prompt engineering strategies for maximizing the effectiveness of large language models in SE.

Module 6

Hallucinations & Reliability

Understanding and mitigating AI errors, biases, and the challenge of trustworthy code generation.

Module 0 · Slide 09

Learning Objectives

By the end of this course, you will be able to:

  1. Design and implement deep learning pipelines tailored to software engineering tasks such as code generation and summarization.
  2. Apply prompt engineering techniques to effectively leverage large language models for real-world development workflows.
  3. Critically evaluate AI-generated code using established metrics (BLEU, CodeBLEU, pass@k) and identify failure modes.
  4. Mine software repositories to extract datasets suitable for training and evaluating AI models.
  5. Analyze state-of-the-art research in AI4SE and propose innovative refinements to existing techniques.
  6. Identify and mitigate AI hallucinations and reliability issues in code generation systems.
  7. Build complete, functional applications using AI-assisted "vibe coding" workflows from concept to deployment.
  8. Communicate findings through technical presentations and written reports at a professional level.
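As a preview of the pass@k metric named in the objectives, the sketch below uses a commonly cited unbiased estimator: generate n samples for a problem, count the c samples that pass the unit tests, and estimate the probability that at least one of k randomly chosen samples passes. The function name and the example numbers are illustrative assumptions.

```python
from math import comb

def pass_at_k(n, c, k):
    """Estimate pass@k from n generated samples, c of which pass the tests."""
    if not 0 <= c <= n or not 1 <= k <= n:
        raise ValueError("require 0 <= c <= n and 1 <= k <= n")
    # comb(n - c, k) / comb(n, k) is the probability that a random size-k
    # subset contains no passing sample. math.comb returns 0 when
    # k > n - c, so the estimate is 1.0 in that case.
    return 1.0 - comb(n - c, k) / comb(n, k)

print(pass_at_k(n=10, c=3, k=1))  # approximately 0.3
print(pass_at_k(n=10, c=3, k=5))  # approximately 0.917
```

With k=1 the estimate reduces to the fraction of passing samples; larger k rewards models that get it right at least once in several attempts.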
Module 0 · Slide 10

How This Course Connects

This course sits at a critical intersection in your CS education. Here is how it builds on what you know and opens doors to what comes next.

Prerequisites You Bring

CS fundamentals (data structures, algorithms), programming proficiency (Python preferred), and version control (Git). These are the building blocks we will stand on.

NOW

This Course: GenAI for Software Development

You will gain hands-on experience with AI-assisted coding, deep learning for SE, prompt engineering, and evaluation methodologies. These are skills that bridge traditional CS and the AI-powered future.

What It Unlocks

ML/AI research (graduate studies, publications), industry AI roles (ML engineer, AI-augmented developer), and entrepreneurship (building AI-powered products). This course makes you fluent in the language of AI4SE.

For Undergrads (CSCI 455)
This course gives you a competitive edge in the job market. Companies actively seek developers who can work effectively with AI tools and understand their strengths and limitations.
For Graduate Students (CSCI 555)
This course provides the research foundation for a thesis or dissertation in AI4SE. You will learn to read, critique, and build upon state-of-the-art research papers.
Module 0 · Slide 11

Course Logistics

Essential details about enrollment, prerequisites, and how the course is structured.

Course Numbers
CSCI 455 — Undergraduate section
CSCI 555 — Graduate section
Both sections meet together. Graduate students have additional depth requirements on projects and presentations.
Schedule & Credits
Semester: Spring 2026 (also offered Spring 2025)
Credits: 3 credit hours
Class size: ~36 students
Format: In-person lectures with hands-on lab components
Prerequisites
Solid foundation in programming (Python preferred)
Basic understanding of data structures & algorithms
Familiarity with version control (Git)
No prior ML/AI experience required — we build from the fundamentals
What You Will Need
A laptop capable of running Python and Jupyter notebooks
Access to GitHub (free account)
Willingness to experiment with AI tools — some may require API keys (provided or free-tier)
Module 0 · Slide 12

Prerequisites in Detail

An honest look at what you should know coming in — and what we will teach you along the way.

Required
Python Proficiency
You will write Python daily. Comfort with functions, classes, file I/O, and pip/conda is essential.
Basic Data Structures
Lists, dictionaries, trees, graphs. You need to read and reason about algorithmic code.
Git Basics
Clone, commit, push, pull, branch. We use GitHub extensively for projects and assignments.
Helpful but Not Required
Probability & Statistics
Useful for understanding language models and evaluation metrics. We review the basics in Module 2.
Linear Algebra
Helps with understanding neural network internals. Not strictly required — we provide intuitions.
Prior ML Exposure
Familiarity with concepts like training/testing, overfitting, and loss functions gives you a head start.
Need to Ramp Up?
If you feel rusty on any required topic, reach out during the first week. We can recommend specific resources to get you up to speed quickly. The course is designed to be accessible — we will build ML/AI concepts from scratch.
Module 0 · Slide 13

Getting Set Up

Make sure you have everything ready before the first lab session. Use the checklist below to track your progress.

Environment Checklist
  • Python 3.8+ installed — verify with python --version
  • Jupyter Notebook / JupyterLab ready — install via pip install jupyterlab
  • Git installed & configured — set your name and email with git config
  • GitHub account created — sign up at github.com if you do not have one yet
  • VS Code + Copilot extension — free for students via the GitHub Student Developer Pack
  • API keys obtained (OpenAI, HuggingFace) — instructions will be provided in the first lab
Do Not Worry
If you cannot complete all items before class, that is okay. We will walk through the setup during the first lab session. The most important thing is having Python and a GitHub account.
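If you want a quick self-check before the lab, a short script along these lines can verify part of the checklist. It is a sketch only: `check_environment` is a hypothetical helper, and it covers just the Python version, Git, and JupyterLab, not accounts, extensions, or API keys.

```python
import importlib.util
import shutil
import sys

def check_environment(min_python=(3, 8)):
    """Return a dict mapping each checked item to True (ready) or False."""
    return {
        # Interpreter version meets the course minimum.
        "python": sys.version_info >= min_python,
        # A `git` executable is available on the PATH.
        "git": shutil.which("git") is not None,
        # The jupyterlab package is importable in this environment.
        "jupyterlab": importlib.util.find_spec("jupyterlab") is not None,
    }

if __name__ == "__main__":
    for item, ok in check_environment().items():
        print(f"{item:<12}{'ok' if ok else 'missing'}")
```

Run it with the same interpreter you plan to use for the course, since JupyterLab may be installed in one environment but not another.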
Module 0 · Slide 14

Why GenAI for Software Development?

Generative AI is not a distant future — it is transforming software development right now. Here is the landscape.

92%
of U.S.-based developers in a 2023 GitHub survey reported using AI coding tools
55%
faster task completion with GitHub Copilot in a controlled experiment by GitHub
$1.5B+
invested in AI coding startups
150M+
developers on GitHub worldwide
The Opportunity
Developers who understand AI-assisted coding are in high demand. Companies are integrating these tools into every stage of the software lifecycle. Knowing how they work — and where they fail — gives you a significant competitive advantage.
The Challenge
AI tools are powerful but not infallible. They can hallucinate code, introduce subtle bugs, and perpetuate bad practices. This course teaches you to leverage AI effectively while maintaining the critical thinking that defines a great engineer.
Module 0 · Slide 15

The State of AI in Software Engineering (2025)

The AI-for-code landscape is evolving rapidly. Here is where things stand as you start this course.

97%
of surveyed developers have used AI coding tools at work (GitHub survey, 2024)
Top 3
Copilot, ChatGPT, Claude
$30B+
AI coding tools market size
30–50%
productivity gains reported
What Is Working
Code autocomplete — accepted suggestions save significant keystrokes
Boilerplate generation — CRUD, configs, and scaffolds in seconds
Documentation — generating docstrings and README files
Debugging assistance — explaining errors and suggesting fixes
The Reality Check
Complex logic — AI still struggles with multi-step reasoning
Security — generated code often contains vulnerabilities
Testing — AI tests frequently miss critical edge cases
Architecture — system-level design remains a human skill
Hype vs. Reality
Headlines claim AI will replace developers. The research tells a different story: AI is a powerful amplifier for skilled engineers, but it cannot replace understanding, judgment, and creativity. This course teaches you to separate the signal from the noise.
Module 0 · Slide 16

What Can Go Wrong?

AI-assisted development brings real risks. Understanding these upfront makes you a more responsible and effective engineer.

Code Hallucinations

AI can confidently generate code that calls APIs that do not exist, uses deprecated methods, or invents function signatures.

  • Fabricated library functions
  • Incorrect API parameters
  • Plausible but wrong logic
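One cheap first line of defense against fabricated library functions is to check that a suggested name actually exists before building on it. The sketch below uses a hypothetical helper, `api_exists`; it only confirms that a name is present, and says nothing about whether the parameters or the logic are right.

```python
import importlib

def api_exists(module_name, attr_path):
    """Return True if `module_name` imports and exposes the dotted `attr_path`."""
    try:
        obj = importlib.import_module(module_name)
    except ImportError:
        return False
    # Walk the dotted path one attribute at a time, e.g. "path.join".
    for part in attr_path.split("."):
        if not hasattr(obj, part):
            return False
        obj = getattr(obj, part)
    return True

print(api_exists("json", "loads"))       # True: a real standard-library function
print(api_exists("json", "parse_file"))  # False: plausible-sounding, but not in json
print(api_exists("os", "path.join"))     # True
```

Reading the official documentation and running the code remain necessary; existence checks catch only the most obvious hallucinations.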

Security Vulnerabilities

AI-generated code frequently contains security flaws: SQL injection, XSS, hardcoded secrets, and improper input validation.

  • Insecure default configurations
  • Missing input sanitization
  • Exposed credentials in examples
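SQL injection is a good example of a flaw that looks harmless in generated code. The sketch below contrasts string concatenation with a parameterized query, using Python's built-in `sqlite3` module and an in-memory table invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, role TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [("alice", "admin"), ("bob", "student")])

def find_user_unsafe(name):
    # Vulnerable: user input is concatenated directly into the SQL string.
    query = f"SELECT name, role FROM users WHERE name = '{name}'"
    return conn.execute(query).fetchall()

def find_user_safe(name):
    # Safer: the driver binds `name` as data, so it cannot alter the query.
    return conn.execute(
        "SELECT name, role FROM users WHERE name = ?", (name,)
    ).fetchall()

malicious = "nobody' OR '1'='1"
print(find_user_unsafe(malicious))  # returns every row in the table
print(find_user_safe(malicious))    # []
```

When reviewing AI-generated database code, look for queries built with f-strings or `+` and replace them with placeholders.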

Over-Reliance

Developers who lean too heavily on AI risk losing fundamental skills. If the AI is down or wrong, can you still solve the problem?

  • Atrophy of debugging skills
  • Reduced deep understanding
  • Dependency on AI availability

Licensing & IP Concerns

AI models trained on open-source code may reproduce copyrighted snippets. The legal landscape is still evolving.

  • The ongoing GitHub Copilot lawsuit
  • License compliance questions
  • Attribution requirements
Bias in AI Models
AI models reflect the biases in their training data. This can mean generating code that follows outdated patterns, reinforces non-inclusive variable naming, or performs poorly on underrepresented programming languages and paradigms. We will explore this in depth in Module 6 (Hallucinations & Reliability).
Module 0 · Slide 17

The SE + AI Landscape

AI is transforming every phase of the software development lifecycle. Here are the key areas we will explore.

Code Generation

From natural language descriptions to working code. AI models can now write functions, classes, and entire modules.

  • Function-level synthesis from docstrings
  • Autocomplete and inline suggestions
  • Full application scaffolding

Code Review & Quality

Automated detection of bugs, code smells, and security vulnerabilities before they reach production.

  • Static analysis augmented by AI
  • Automated PR review comments
  • Vulnerability detection

Testing & Verification

AI-generated test cases, property-based testing, and automated regression detection.

  • Unit test generation from source code
  • Fuzz testing with AI guidance
  • Test oracle synthesis

Documentation & Maintenance

Automatic code summarization, API documentation generation, and commit message synthesis.

  • Javadoc / docstring generation
  • README and changelog automation
  • Code-to-explanation pipelines
Module 0 · Slide 18

AI Across the Software Lifecycle

AI is not just for writing code. It is being applied to every phase of the SDLC. Here is a stage-by-stage walkthrough.

1. Requirements

Natural language specifications transformed into structured user stories. AI extracts acceptance criteria and identifies ambiguities.

2. Design

Architecture suggestions based on project description. AI recommends patterns, databases, and tech stacks suited to your requirements.

3. Coding

Real-time autocomplete, function generation, and code translation. The most mature area of AI-assisted development today.

4. Testing

Automated test generation, mutation testing, and fuzz testing. AI identifies edge cases and generates assertions from code behavior.

5. Code Review

Automated review comments, style enforcement, and bug detection. AI reviewers complement human judgment for faster, more thorough reviews.

6. Documentation

Javadoc, changelogs, and API docs generated from code. AI maintains documentation in sync with code changes automatically.

7. Deployment & Ops

CI/CD pipeline optimization, infrastructure-as-code generation, and intelligent monitoring. AI reduces deployment friction and catches issues early.

Module 0 · Slide 19

Tools of the Trade

Throughout this course, you will work with a variety of AI models and tools — both commercial and open-source.

🧠

GPT-4 / GPT-4o

OpenAI's flagship multimodal models for code generation, analysis, and reasoning.

commercial
💬

Claude

Anthropic's assistant with strong coding, analysis, and long-context capabilities.

commercial
⚙️

GitHub Copilot

AI pair programmer integrated directly into your IDE for real-time code suggestions.

ide-integrated
🌐

CodeLlama / StarCoder

Open-source code LLMs you can run locally, fine-tune, and study architecturally.

open-source
📓

Jupyter + Python

Your primary workspace for experiments, model evaluation, and data analysis.

open-source
🛠️

HuggingFace

The platform for accessing pre-trained models, datasets, and the Transformers library.

open-source
Exercise: Match the Tool

Match each tool with its capability.

Tools:
  • GitHub Copilot
  • HuggingFace
  • GPT-4
  • CodeLlama
  • Jupyter Notebooks

Capabilities:
  • Interactive environment for running Python experiments and visualizing results
  • Real-time AI code suggestions directly inside your IDE as you type
  • Platform for downloading pre-trained models and NLP datasets
  • Multimodal reasoning model for complex code generation and analysis
  • Open-source code LLM you can run locally and fine-tune
Module 0 · Slide 20

Open Source vs. Commercial Models

Choosing the right model for a task is an important skill. Here is how the two major categories compare.

Commercial Models
GPT-4, Claude, Gemini
Strengths
Highest performance on complex reasoning, large context windows, multimodal capabilities, and regular updates.
Tradeoffs
API costs per token, data sent to third-party servers, rate limits, and vendor lock-in risk.
Best For
Complex code generation, debugging, architecture discussions, and rapid prototyping when quality matters most.
Open-Source Models
CodeLlama, StarCoder, DeepSeek Coder
Strengths
Free to use, self-hosted (full data privacy), customizable through fine-tuning, and community-driven development.
Tradeoffs
Require GPU hardware to run, generally lower performance on complex tasks, and less frequent updates.
Best For
Research experiments, domain-specific fine-tuning, privacy-sensitive projects, and understanding model internals.
Course Approach
We use both categories in this course. Commercial models for rapid prototyping and project development; open-source models for understanding architectures, fine-tuning experiments, and research. The best engineers know when to use each.
Module 0 · Slide 21

Model Size Explorer

Understanding model scale helps you choose the right tool for the job. The models compared:

  • GPT-4 (OpenAI · Commercial)
  • Claude 3.5 Sonnet (Anthropic · Commercial)
  • CodeLlama-34B (Meta · Open Source)
  • StarCoder2-15B (BigCode · Open Source)

Why Size Matters
Larger models generally perform better on complex tasks, but they cost more to run and require more hardware. The trend in 2025 is toward smaller, more efficient models that match larger models on specific domains — a key research direction you will explore.
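A rough back-of-the-envelope calculation shows why size drives hardware needs: the memory for the weights alone is roughly the parameter count times the bytes per parameter. The sketch below takes the parameter counts from the model names above; the precision choices are assumptions, and real deployments also need memory for activations and caches.

```python
def weight_memory_gb(num_params, bytes_per_param=2):
    """Approximate memory for model weights alone, in gigabytes (1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Parameter counts taken from the model names; precision choices are assumptions.
for name, params in [("CodeLlama-34B", 34e9), ("StarCoder2-15B", 15e9)]:
    fp16 = weight_memory_gb(params, bytes_per_param=2)    # 16-bit weights
    int4 = weight_memory_gb(params, bytes_per_param=0.5)  # 4-bit quantized weights
    print(f"{name}: ~{fp16:.0f} GB at 16-bit, ~{int4:.1f} GB at 4-bit")
```

By this estimate a 34B-parameter model needs about 68 GB for 16-bit weights, which is more than a single typical consumer GPU provides, while 4-bit quantization brings it to about 17 GB.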
Module 0 · Slide 22

Course Projects

The heart of this course: you will build real, functional applications using AI-assisted development — what we call "vibe coding."

1. Concept & Design

Start with a problem statement. Use AI to brainstorm architectures, generate wireframes, and plan your tech stack.

2. AI-Assisted Implementation

Build your application using LLMs as coding partners. Learn when to trust, when to verify, and when to override AI suggestions.

3. Testing & Refinement

Use AI to generate tests, identify edge cases, and refactor your code. Evaluate the quality of AI-generated components.

4. Presentation & Reflection

Present your project to the class. Reflect on where AI helped, where it hindered, and what you learned about the process.

Example Projects
Stock Trading Dashboard — Real-time financial data visualization with AI-generated components

AI Interview Prep Platform — Practice coding interviews with AI-powered feedback

Automated Code Reviewer — Build a tool that reviews PRs using LLM analysis

Prompt Engineering Studio — A platform to test, compare, and optimize prompts for code generation
Vibe Coding Philosophy
The goal is not to have AI write everything for you. It is to learn the art of human-AI collaboration — knowing how to guide AI, validate its output, and integrate it into a professional development workflow.
Module 0 · Slide 23

What Is Vibe Coding?

A development approach where you describe what you want in natural language and iterate with AI until the code works. Here is the process.

1. Describe Your Intent

Write a clear, detailed prompt describing what you want to build. The better your description, the better the AI output. Include constraints, technologies, and expected behavior.

2. AI Generates Code

The LLM produces a first draft — often a working scaffold with routing, data models, and basic UI. This is your starting point, not your final product.

3. Test and Evaluate

Run the generated code. Does it work? Does it handle edge cases? Does it follow best practices? Identify what is correct, what is broken, and what is missing.

4. Refine the Prompt

Based on your evaluation, refine your instructions. Be more specific about what went wrong. Iterate until the code meets your requirements. This is where the real skill lies.

5. Integrate and Deploy

Once the components work, integrate them into your application. Add manual refinements, write tests, and prepare for deployment. Document which parts were AI-generated.

The Key Skill
Vibe coding is not about being lazy — it is about being strategically efficient. The best vibe coders understand the code AI generates, can debug it when it breaks, and know when to write code manually instead. Your projects will demonstrate this balance.
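The describe, generate, evaluate, refine cycle above can be sketched as a small loop. Everything here is a stand-in: `vibe_coding_loop`, `fake_generate`, and `fake_evaluate` are hypothetical names, and in practice `generate` would call an LLM while `evaluate` would run your tests and your own code review.

```python
def vibe_coding_loop(generate, evaluate, prompt, max_rounds=3):
    """Iterate generate -> evaluate -> refine until no problems remain.

    `generate(prompt)` returns code as a string; `evaluate(code)` returns a
    list of problem descriptions (empty when the code is acceptable).
    Both are supplied by the caller and are hypothetical here.
    """
    history = []
    code = ""
    for round_number in range(1, max_rounds + 1):
        code = generate(prompt)
        problems = evaluate(code)
        history.append({"round": round_number, "prompt": prompt, "problems": problems})
        if not problems:
            break
        # Refine the prompt with specific feedback about what went wrong.
        prompt += "\nFix the following issues: " + "; ".join(problems)
    return code, history

# Stand-ins for an LLM call and a test run, for illustration only.
def fake_generate(prompt):
    if "division by zero" in prompt:
        return "def div(a, b):\n    return a / b if b else None"
    return "def div(a, b):\n    return a / b"

def fake_evaluate(code):
    return [] if "if b" in code else ["no handling of division by zero"]

final_code, log = vibe_coding_loop(fake_generate, fake_evaluate, "Write div(a, b).")
print(len(log))  # 2
```

The `history` list doubles as a record of each round, which is the kind of trace a project write-up can draw on.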
Module 0 · Slide 24

Demo Project Walkthrough

Let us trace through a real vibe coding project to see what the process looks like end to end.

Project: Stock Trading Dashboard
Goal: Build a web dashboard that displays real-time stock prices, charts, and portfolio tracking using AI-generated components.
Stage 1: Initial Prompt
"Build a React dashboard with a stock ticker search, real-time price charts using Chart.js, and a portfolio tracker. Use the Alpha Vantage API for data." — AI generates a full scaffold in ~30 seconds.
Stage 2: What Worked
Component structure — clean separation of concerns
API integration — correct endpoint usage
Basic UI — functional layout with Chart.js rendering
Stage 3: What Needed Fixing
Error handling — no fallback for API failures
Rate limiting — hit API limits immediately
Styling — generic look, needed brand customization
State management — prop drilling instead of context
Stage 4: Final Result
After 3 rounds of prompt refinement and ~2 hours of manual polish, the dashboard was fully functional, styled, and deployed. Estimated time saved vs. building from scratch: 60–70%.
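The error-handling and rate-limiting gaps listed under Stage 3 are typical of AI-generated scaffolds. One common remedy is retrying with exponential backoff. The demo project itself was built in React; the sketch below shows the same idea in Python with a simulated fetcher, and all names in it are hypothetical.

```python
import time

def fetch_with_retry(fetch, max_attempts=4, base_delay=1.0,
                     retry_on=(Exception,), sleep=time.sleep):
    """Call `fetch()` and retry with exponential backoff if it raises."""
    for attempt in range(max_attempts):
        try:
            return fetch()
        except retry_on:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the error to the caller
            sleep(base_delay * 2 ** attempt)  # wait 1s, 2s, 4s, ...

# Simulated API call that fails twice before succeeding (illustration only).
calls = {"count": 0}

def flaky_fetch():
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("simulated rate limit")
    return {"symbol": "DEMO", "price": 100.0}

delays = []  # record the requested waits instead of actually sleeping
print(fetch_with_retry(flaky_fetch, sleep=delays.append))
print(delays)  # [1.0, 2.0]
```

Passing `sleep` as a parameter keeps the example fast and testable; in real code you would leave the default and narrow `retry_on` to the errors your API client actually raises.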
What Students Actually Submit
Your project submission includes: the final application, a prompt log (all AI interactions), a reflection report (what worked, what did not, what you learned), and a live demo to the class. We grade the process as much as the product.
Module 0 · Slide 25

Assessment & Expectations

This is a hands-on course. Your grade reflects both your technical skills and your engagement with the material.

Component              Weight
Vibe Coding Projects   40%
Paper Presentations    20%
Assignments & Labs     25%
Participation          15%
What We Expect From You
Be curious. Experiment with tools beyond what is assigned.

Be critical. Do not blindly trust AI output — verify, test, and question.

Be collaborative. Share discoveries and help your peers.

Be honest. Academic integrity applies to AI-assisted work — always document how AI contributed to your submissions.
CSCI 555 (Graduate) Additional Requirements
Graduate students are expected to engage with primary research literature at greater depth, produce more rigorous project reports, and lead at least one paper presentation during the semester.
Self-Assessment: What Do You Already Know?

Rate your familiarity with each topic. This helps you understand where you are starting from; there are no wrong answers.

Q1. Python programming & scripting
New to it / Some experience / Comfortable / Expert

Q2. Machine learning / deep learning concepts
New to it / Some experience / Comfortable / Expert

Q3. Using AI coding assistants (Copilot, ChatGPT, etc.)
Never used / Tried a few times / Regular user / Power user

Q4. Natural language processing (NLP)
New to it / Some exposure / Comfortable / Expert

Q5. Git & version control workflows
New to it / Basic commands / Comfortable / Expert
Module 0 · Slide 26

How to Present a Research Paper

Paper presentations are worth 20% of your grade. Here is a proven approach to delivering a clear, engaging presentation.

1. Read the Abstract and Conclusion First

Get the big picture before diving into details. Understand what the paper claims to contribute and what the authors conclude. This frames your entire reading.

2. Identify the Problem, Approach, and Key Results

Most research papers follow this structure: What problem does it solve? What method did they use? What did they find? Organize your presentation around these three pillars.

3. Create a 15-Minute Presentation

Aim for 10–12 slides. Do not try to cover everything — focus on the core contributions. Use diagrams and examples rather than walls of text. Practice your timing.

4. Include a Live Demo or Worked Example

Show the paper's technique in action. Run code, walk through an example input/output, or demonstrate a tool. This is what makes a presentation memorable.

5. Prepare 3 Discussion Questions

End with questions that spark class discussion: What are the limitations? How could this be improved? How does it connect to other topics in the course?

Pro Tip
The best presentations do not just summarize a paper — they critique it. Tell us what you think the authors got right, what they missed, and what you would do differently. Your informed opinion is what earns the highest marks.
Module 0 · Slide 27

AI Usage Policy

This course embraces AI tools — but with clear boundaries. Transparency is non-negotiable.

Encouraged
Using AI for project development — this is literally the point of vibe coding projects
Exploring new AI tools — try Copilot, ChatGPT, Claude, and open-source alternatives
Documenting AI contributions — keep a prompt log and note which code AI generated
Using AI to learn — ask an LLM to explain concepts, then verify its answers
Allowed with Citation
Using AI to help understand concepts — cite it like you would any other source
Generating boilerplate code — acknowledge which parts were AI-generated
Getting debugging help — note that you used AI assistance in your submission
Drafting documentation — always review and refine AI-generated text
Not Allowed
Submitting AI-generated work as your own analysis — your critical thinking must be yours
Using AI for exam answers — exams test your understanding, not AI's
Not disclosing AI usage — hiding AI contributions is an academic integrity violation
Copying another student's prompts — your AI interactions should be your own
The Golden Rule
Always document your AI interactions. We value transparency. If you used AI, tell us how. This is not about restricting you — it is about building the professional habit of accountability in AI-assisted work.
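A prompt log does not need to be elaborate. The sketch below appends one JSON object per interaction to a text stream (the JSON Lines convention). The field names and the `log_interaction` helper are assumptions for illustration, not a format required by the course; check the assignment instructions for what to include.

```python
import io
import json
from datetime import datetime, timezone

def log_interaction(stream, tool, prompt, response, used_in=None):
    """Write one AI interaction to `stream` as a JSON line and return the entry."""
    entry = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "tool": tool,
        "prompt": prompt,
        "response": response,
        "used_in": used_in,  # e.g. the file where the output ended up
    }
    stream.write(json.dumps(entry) + "\n")
    return entry

# An in-memory buffer stands in for an open log file in this example.
buffer = io.StringIO()
log_interaction(
    buffer,
    tool="ChatGPT",
    prompt="Explain this stack trace",
    response="The error is raised because ...",
    used_in="debugging notes",
)
print(buffer.getvalue())
```

In a project you could pass a file opened in append mode, for example `open("prompt_log.jsonl", "a", encoding="utf-8")`, so each session adds to the same log.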
Module 0 · Slide 28

Ethics & Responsible AI in SE

As future engineers, you will shape how AI is used in software development. These are the ethical considerations you need to understand.

Bias in Training Data

AI models learn patterns from their training data — including biased naming conventions, underrepresented languages, and skewed coding practices.

  • Non-inclusive variable naming patterns
  • Underperformance on non-English codebases
  • Reinforcing outdated practices

Environmental Cost

Training large language models requires massive computational resources with significant carbon footprints.

  • GPT-4 training cost estimated at ~$100M in compute
  • Inference costs scale with usage
  • Push toward efficient, smaller models

Job Displacement Concerns

Will AI replace developers? The evidence points to augmentation, not replacement — but the nature of development work is changing.

  • Augmentation vs. full replacement
  • Shifting skill requirements
  • New roles: prompt engineer, AI auditor

Intellectual Property

AI models trained on open-source code raise complex legal questions about copyright, licensing, and attribution.

  • The GitHub Copilot class-action lawsuit
  • Fair use in model training
  • License compliance in generated code
Your Responsibility
As future engineers, you will make decisions about how AI is integrated into real products used by real people. This course does not just teach you how to use AI for code — it teaches you to think critically about when and whether you should.
Module 0 · Slide 29

Your Semester Roadmap

Here is a week-by-week overview of what to expect. Each module builds on the previous one, culminating in your final project presentations.

Weeks 1–2

Intro + MSR

Course overview, setup, and mining software repositories for AI training data.

Weeks 3–4

Source Code Modeling

N-grams, language models, and statistical approaches to code prediction.

Weeks 5–6

Deep Learning

Neural networks, transformers, and architectures for code intelligence.

Weeks 7–8

Evaluation Metrics

BLEU, CodeBLEU, pass@k, and methodologies for assessing AI code quality.

Weeks 9–10

Prompting LLMs

Prompt engineering, chain-of-thought, few-shot learning, and advanced techniques.

Weeks 11–12

Hallucinations

Understanding AI errors, reliability testing, and mitigation strategies.

Weeks 13–14

Project Work

Dedicated time for vibe coding projects, office hours, and peer reviews.

Week 15

Presentations

Final project demos, reflections, and course wrap-up.

Key Dates
Paper presentation sign-ups: Week 2 — choose your paper early for the best selection
Project proposals due: Week 4 — team formation and idea pitch
Midpoint check-in: Week 8 — project progress review
Final presentations: Week 15 — demo day with invited guests
Module 0 · Slide 30
William & Mary · CSCI 455 / 555

Let's Get Started

You are about to embark on a journey at the intersection of artificial intelligence and software engineering. The future of development is collaborative — human and AI, working together.

7 Core Modules · 4+ Vibe Coding Projects · 1 Semester of Discovery

// welcome to the future of software development
