Generative AI for Software Development — equipping the next generation of developers with AI-powered tools and techniques.
Spring 2026 · 3 Credits · Prof. Antonio Mastropaolo
Module 0 · Slide 02
Welcome & Course Vision
Software development is being fundamentally reshaped by generative AI. This course is designed to put you at the forefront of that transformation — not just as users of AI tools, but as engineers who understand the science behind them.
What This Course Is About
We explore how deep learning and large language models can automate core software engineering tasks: code generation, documentation, testing, code review, and more. You will learn both the theory and the hands-on practice.
Why It Matters
AI coding assistants are already changing how professional software is built. Understanding their foundations — and limitations — will make you a more effective, more employable engineer.
Our Approach
This is a hands-on, project-driven course. You will design, prompt, evaluate, and build with state-of-the-art AI systems. Every module includes interactive exercises and real-world applications.
Who This Is For
Whether you are an undergraduate (CSCI 455) or a graduate student (CSCI 555), if you are curious about the intersection of AI and software engineering, you are in the right place.
Module 0 · Slide 03
What Is Generative AI?
Not all AI is created equal. Understanding the distinction between generative and discriminative models is the foundation for everything we cover in this course.
Generative AI (Creates New Content)
Generative models learn the underlying probability distribution of data and can produce entirely new, plausible outputs that did not exist in the training set.
Text Generation
Writing essays, emails, documentation, and code comments from a prompt.
Code Completion
Autocompleting functions, classes, and entire modules based on context.
Image Synthesis
Creating visuals from textual descriptions (DALL-E, Midjourney, Stable Diffusion).
Discriminative AI (Classifies Existing Data)
Discriminative models learn the decision boundary between categories. They classify or label data but do not create new content.
Spam Detection
Classifying emails as spam or legitimate based on learned patterns.
Sentiment Analysis
Determining whether a review is positive, negative, or neutral.
Bug Classification
Categorizing reported issues by severity, component, or priority.
Key Insight
GenAI models learn probability distributions over data and sample from them to generate new, plausible outputs. This is why the same prompt can produce different results each time — the model is sampling from a distribution, not looking up an answer.
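The sampling idea above can be sketched in a few lines of Python. This is a toy illustration: the token list and the scores are made up for the example, not taken from any real model.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Convert raw model scores (logits) into a probability distribution.
    Higher temperature flattens the distribution, giving more varied samples."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token candidates after a prompt like "return a "
tokens = ["+ b", "- b", "* b", "# TODO"]
logits = [4.0, 1.0, 0.5, 0.2]  # invented scores for illustration

probs = softmax(logits, temperature=0.8)

# Sampling (rather than always picking the argmax) is exactly why the
# same prompt can yield different completions on different runs.
choice = random.choices(tokens, weights=probs, k=1)[0]
print(choice)
```

Run it a few times: the most likely token usually wins, but not always, which is the behavior the key insight describes.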
Module 0 · Slide 04
The Evolution of AI for Code
AI-assisted programming did not appear overnight. Decades of research built the foundation for today's large language models.
1
Rule-Based Systems (1970s–1990s)
Expert systems and static analysis tools relied on hand-crafted rules written by domain experts. Effective for narrow tasks like linting and style checking, but brittle and unable to generalize.
2
Statistical Methods (2000s)
n-gram models and probabilistic approaches treated source code as a natural language. Researchers discovered that code is even more predictable than English text — a key insight you will explore in Module 2.
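A minimal sketch of that idea, using a tiny hand-made corpus rather than a real mined dataset: a bigram model simply counts which token follows which, and code's rigid syntax makes those counts very predictive.

```python
from collections import Counter, defaultdict

# Toy corpus of pre-tokenized code lines (illustrative, not a real dataset)
corpus = [
    "for i in range ( n ) :",
    "for j in range ( m ) :",
    "if x in cache :",
]

# Count bigram frequencies: how often does `nxt` follow `prev`?
bigrams = defaultdict(Counter)
for line in corpus:
    toks = line.split()
    for prev, nxt in zip(toks, toks[1:]):
        bigrams[prev][nxt] += 1

def predict_next(prev):
    """Return the most frequent successor of `prev`, as an n-gram model would."""
    return bigrams[prev].most_common(1)[0][0]

print(predict_next("range"))  # prints '(' because code syntax is highly regular
```

Even this trivial model gets tokens like the `(` after `range` right every time, which is the "code is more predictable than English" insight in miniature.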
3
Deep Learning (2010s)
Recurrent neural networks (RNNs) and sequence-to-sequence models enabled code completion, bug detection, and code summarization. The introduction of the Transformer architecture in 2017 was a turning point.
4
Large Language Models (2020s)
GPT, Codex, CodeLlama, StarCoder — models trained on billions of lines of code can now generate entire functions, explain complex code, and assist with debugging. This is the era we focus on.
Course Focus
This course focuses on the LLM era while teaching you the foundations — from statistical models to deep learning — that make it all work. Each era built on the previous one, and understanding this progression makes you a stronger practitioner.
Module 0 · Slide 05
GenAI in Action: Before & After
What does AI-assisted development actually look like in practice? Here is a side-by-side comparison of traditional vs. AI-augmented workflows.
Before GenAI
The traditional workflow:
Search & Copy
Browse Stack Overflow, copy snippets, adapt them manually to your codebase. Time: 15–30 min per problem.
Write Boilerplate
Manually write repetitive code: CRUD endpoints, data models, config files. Tedious but necessary.
Manual Testing
Think through edge cases yourself. Write each test by hand. Coverage was often an afterthought.
With GenAI
The AI-augmented workflow:
AI Autocomplete
Copilot suggests the next line or entire function as you type. Accept, reject, or modify in real time.
Scaffold Features
Describe what you need in plain English. AI generates a working starting point for entire features in seconds.
AI-Assisted Debugging
Paste the error into an LLM. Get an explanation, root cause analysis, and suggested fix instantly.
Generated Test Suites
Ask AI to generate unit tests for your function. It identifies edge cases you might have missed.
Important Caveat
AI does not replace your need to understand the code. It accelerates your workflow, but you are still the engineer. Blindly accepting AI suggestions leads to bugs, security vulnerabilities, and technical debt. This course teaches you to be a critical collaborator, not a passive consumer.
Module 0 · Slide 06
Your Instructor: Prof. Antonio Mastropaolo
A software engineering researcher passionate about the intersection of artificial intelligence and code.
Research Focus
My work centers on AI for Software Engineering (AI4SE): using deep learning and large language models to automate software development tasks including code generation, code summarization, bug fixing, and code review.
Background
Ph.D. in Software Engineering with a focus on applying neural machine translation techniques to source code. Published research on code generation, automated documentation, and empirical software engineering.
Teaching Philosophy
I believe in learning by doing. In this course, you will not just read about AI tools — you will build with them, evaluate them, and push their boundaries. My goal is to help you develop critical thinking about what AI can and cannot do for software development.
Office & Contact
Department of Computer Science, William & Mary Office hours: Check the syllabus for schedule I encourage you to reach out early and often — collaboration is key to learning.
Module 0 · Slide 07
Research Spotlight
The topics in this course are rooted in active research. Here are some contributions that inform our curriculum.
NMT for Code
Applying neural machine translation techniques to translate code between programming languages and generate code from natural language descriptions.
Treating code as a translatable language
Encoder-decoder architectures for source code
Cross-language code migration
AI for Code Review
Training models to automatically generate code review comments, catching bugs and suggesting improvements the way a human reviewer would.
Automated review comment generation
Defect prediction from code changes
Learning from millions of real reviews
Evaluating LLMs for SE
Developing rigorous methodologies for measuring how well AI models perform on software engineering tasks — beyond simple accuracy metrics.
Benchmark design for code generation
Human evaluation protocols
Measuring functional correctness
Code Summarization
Generating natural language descriptions of source code automatically — helping developers understand unfamiliar codebases faster.
Method-level documentation generation
Commit message synthesis
API documentation automation
Why This Matters For You
This is not a course taught from a textbook — it is taught by someone who actively publishes research in this field. You will engage with cutting-edge papers, and your projects may contribute to real research questions.
Module 0 · Slide 08
Course Structure
The course is organized into focused modules, each building on the last. Here is the roadmap of what you will learn.
Module 1
Mining Software Repositories
Extracting insights from version control, issue trackers, and code databases to fuel AI models.
Module 2
Source Code Modeling
Understanding n-grams, language models, and statistical approaches to modeling source code.
Module 3
Evaluating AI Techniques
Metrics and methodologies for rigorously assessing AI-generated code quality and correctness.
Module 4
Deep Learning for SE
Neural networks, transformers, and architectures that power modern code intelligence systems.
Module 5
Prompting LLMs
Prompt engineering strategies for maximizing the effectiveness of large language models in SE.
Module 6
Hallucinations & Reliability
Understanding and mitigating AI errors, biases, and the challenge of trustworthy code generation.
Module 0 · Slide 09
Learning Objectives
By the end of this course, you will be able to:
1. Design and implement deep learning pipelines tailored to software engineering tasks such as code generation and summarization.
2. Apply prompt engineering techniques to effectively leverage large language models for real-world development workflows.
3. Critically evaluate AI-generated code using established metrics (BLEU, CodeBLEU, pass@k) and identify failure modes.
4. Mine software repositories to extract datasets suitable for training and evaluating AI models.
5. Analyze state-of-the-art research in AI4SE and propose innovative refinements to existing techniques.
6. Identify and mitigate AI hallucinations and reliability issues in code generation systems.
7. Build complete, functional applications using AI-assisted "vibe coding" workflows from concept to deployment.
8. Communicate findings through technical presentations and written reports at a professional level.
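Objective 3 mentions pass@k. As a preview, here is a small Python sketch of the standard unbiased pass@k estimator (the numbers in the example are illustrative, not benchmark results):

```python
from math import comb

def pass_at_k(n, c, k):
    """Unbiased pass@k estimator: the probability that at least one of k
    samples passes, given n generations of which c are functionally correct.

        pass@k = 1 - C(n - c, k) / C(n, k)
    """
    if n - c < k:
        # Fewer than k incorrect samples exist, so any k samples
        # must include at least one correct one.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)

# e.g. 200 generations per problem, 30 of which pass the tests
print(round(pass_at_k(200, 30, 1), 3))   # 0.15  (for k=1 this equals c/n)
print(round(pass_at_k(200, 30, 10), 3))
```

Note how pass@10 is much higher than pass@1 for the same model: sampling more candidates raises the chance that at least one is correct, which is why the metric is reported at several values of k.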
Module 0 · Slide 10
How This Course Connects
This course sits at a critical intersection in your CS education. Here is how it builds on what you know and opens doors to what comes next.
←
Prerequisites You Bring
CS fundamentals (data structures, algorithms), programming proficiency (Python preferred), and version control (Git). These are the building blocks we will stand on.
NOW
This Course: GenAI for Software Development
You will gain hands-on experience with AI-assisted coding, deep learning for SE, prompt engineering, and evaluation methodologies. These are skills that bridge traditional CS and the AI-powered future.
→
What It Unlocks
ML/AI research (graduate studies, publications), industry AI roles (ML engineer, AI-augmented developer), and entrepreneurship (building AI-powered products). This course makes you fluent in the language of AI4SE.
For Undergrads (CSCI 455)
This course gives you a competitive edge in the job market. Companies actively seek developers who can work effectively with AI tools and understand their strengths and limitations.
For Graduate Students (CSCI 555)
This course provides the research foundation for a thesis or dissertation in AI4SE. You will learn to read, critique, and build upon state-of-the-art research papers.
Module 0 · Slide 11
Course Logistics
Essential details about enrollment, prerequisites, and how the course is structured.
Course Numbers
CSCI 455 — Undergraduate section
CSCI 555 — Graduate section
Both sections meet together. Graduate students have additional depth requirements on projects and presentations.
Schedule & Credits
Semester: Spring 2026 (also offered Spring 2025)
Credits: 3 credit hours
Class size: ~36 students
Format: In-person lectures with hands-on lab components
Prerequisites
Solid foundation in programming (Python preferred)
Basic understanding of data structures & algorithms
Familiarity with version control (Git)
No prior ML/AI experience required — we build from the fundamentals
What You Will Need
A laptop capable of running Python and Jupyter notebooks
Access to GitHub (free account)
Willingness to experiment with AI tools — some may require API keys (provided or free-tier)
Module 0 · Slide 12
Prerequisites in Detail
An honest look at what you should know coming in — and what we will teach you along the way.
Required
Python Proficiency You will write Python daily. Comfort with functions, classes, file I/O, and pip/conda is essential.
Basic Data Structures Lists, dictionaries, trees, graphs. You need to read and reason about algorithmic code.
Git Basics Clone, commit, push, pull, branch. We use GitHub extensively for projects and assignments.
Helpful but Not Required
Probability & Statistics Useful for understanding language models and evaluation metrics. We review the basics in Module 2.
Linear Algebra Helps with understanding neural network internals. Not strictly required — we provide intuitions.
Prior ML Exposure Familiarity with concepts like training/testing, overfitting, and loss functions gives you a head start.
Need to Ramp Up?
If you feel rusty on any required topic, reach out during the first week. We can recommend specific resources to get you up to speed quickly. The course is designed to be accessible — we will build ML/AI concepts from scratch.
Module 0 · Slide 13
Getting Set Up
Make sure you have everything ready before the first lab session. Click each item to mark it as complete.
✅ Environment Checklist interactive
✓ Python 3.8+ installed — verify with python --version
✓ Git installed & configured — set your name and email with git config
✓ GitHub account created — sign up at github.com if you do not have one yet
✓ VS Code + Copilot extension — free for students via the GitHub Student Developer Pack
✓ API keys obtained (OpenAI, HuggingFace) — instructions will be provided in the first lab
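If you want to check the local items programmatically, here is an optional Python sketch. It only covers what can be verified on your machine: account-based items (GitHub account, API keys) cannot be checked locally, and the `code` command is present only if the VS Code command-line launcher is installed.

```python
import shutil
import sys

def check_setup():
    """Report which locally verifiable checklist items this machine satisfies."""
    return {
        "python_3.8+": sys.version_info >= (3, 8),
        "git": shutil.which("git") is not None,          # git on PATH
        "code_editor": shutil.which("code") is not None, # VS Code CLI, if installed
    }

status = check_setup()
for item, ok in status.items():
    print(f"{'OK ' if ok else '-- '}{item}")
```

Run it before the first lab; any `--` lines are items to finish during the setup walkthrough.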
Do Not Worry
If you cannot complete all items before class, that is okay. We will walk through the setup during the first lab session. The most important thing is having Python and a GitHub account.
Module 0 · Slide 14
Why GenAI for Software Development?
Generative AI is not a distant future — it is transforming software development right now. Here is the landscape.
92%
of developers use AI coding tools
55%
faster task completion with AI assistance
$1.5B+
invested in AI coding startups
150M+
developers on GitHub worldwide
The Opportunity
Developers who understand AI-assisted coding are in high demand. Companies are integrating these tools into every stage of the software lifecycle. Knowing how they work — and where they fail — gives you a significant competitive advantage.
The Challenge
AI tools are powerful but not infallible. They can hallucinate code, introduce subtle bugs, and perpetuate bad practices. This course teaches you to leverage AI effectively while maintaining the critical thinking that defines a great engineer.
Module 0 · Slide 15
The State of AI in Software Engineering (2025)
The AI-for-code landscape is evolving rapidly. Here is where things stand as you start this course.
97%
of developers have used AI tools (GitHub Survey)
Top 3
Copilot, ChatGPT, Claude
$30B+
AI coding tools market size
30–50%
productivity gains reported
What Is Working
Code autocomplete — accepted suggestions save significant keystrokes
Boilerplate generation — CRUD, configs, and scaffolds in seconds
Documentation — generating docstrings and README files
Debugging assistance — explaining errors and suggesting fixes
The Reality Check
Complex logic — AI still struggles with multi-step reasoning
Security — generated code often contains vulnerabilities
Testing — AI tests frequently miss critical edge cases
Architecture — system-level design remains a human skill
Hype vs. Reality
Headlines claim AI will replace developers. The research tells a different story: AI is a powerful amplifier for skilled engineers, but it cannot replace understanding, judgment, and creativity. This course teaches you to separate the signal from the noise.
Module 0 · Slide 16
What Can Go Wrong?
AI-assisted development brings real risks. Understanding these upfront makes you a more responsible and effective engineer.
Code Hallucinations
AI can confidently generate code that calls APIs that do not exist, uses deprecated methods, or invents function signatures.
Skill Atrophy
Developers who lean too heavily on AI risk losing fundamental skills. If the AI is down or wrong, can you still solve the problem?
Atrophy of debugging skills
Reduced deep understanding
Dependency on AI availability
Licensing & IP Concerns
AI models trained on open-source code may reproduce copyrighted snippets. The legal landscape is still evolving.
Copilot lawsuit precedent
License compliance questions
Attribution requirements
Bias in AI Models
AI models reflect the biases in their training data. This can mean generating code that follows outdated patterns, reinforces non-inclusive variable naming, or performs poorly on underrepresented programming languages and paradigms. We will explore this in depth in Module 6 (Hallucinations & Reliability).
Module 0 · Slide 17
The SE + AI Landscape
AI is transforming every phase of the software development lifecycle. Here are the key areas we will explore.
Code Generation
From natural language descriptions to working code. AI models can now write functions, classes, and entire modules.
Function-level synthesis from docstrings
Autocomplete and inline suggestions
Full application scaffolding
Code Review & Quality
Automated detection of bugs, code smells, and security vulnerabilities before they reach production.
Static analysis augmented by AI
Automated PR review comments
Vulnerability detection
Testing & Verification
AI-generated test cases, property-based testing, and automated regression detection.
Unit test generation from source code
Fuzz testing with AI guidance
Test oracle synthesis
Documentation & Maintenance
Automatic code summarization, API documentation generation, and commit message synthesis.
Javadoc / docstring generation
README and changelog automation
Code-to-explanation pipelines
Module 0 · Slide 18
AI Across the Software Lifecycle
AI is not just for writing code. It is being applied to every phase of the SDLC. Here is a stage-by-stage walkthrough.
1
Requirements
Natural language specifications transformed into structured user stories. AI extracts acceptance criteria and identifies ambiguities.
2
Design
Architecture suggestions based on project description. AI recommends patterns, databases, and tech stacks suited to your requirements.
3
Coding
Real-time autocomplete, function generation, and code translation. The most mature area of AI-assisted development today.
4
Testing
Automated test generation, mutation testing, and fuzz testing. AI identifies edge cases and generates assertions from code behavior.
5
Code Review
Automated review comments, style enforcement, and bug detection. AI reviewers complement human judgment for faster, more thorough reviews.
6
Documentation
Javadoc, changelogs, and API docs generated from code. AI maintains documentation in sync with code changes automatically.
7
Deployment & Ops
CI/CD pipeline optimization, infrastructure-as-code generation, and intelligent monitoring. AI reduces deployment friction and catches issues early.
Module 0 · Slide 19
Tools of the Trade
Throughout this course, you will work with a variety of AI models and tools — both commercial and open-source.
🧠
GPT-4 / GPT-4o
OpenAI's flagship multimodal models for code generation, analysis, and reasoning.
commercial
💬
Claude
Anthropic's assistant with strong coding, analysis, and long-context capabilities.
commercial
⚙️
GitHub Copilot
AI pair programmer integrated directly into your IDE for real-time code suggestions.
ide-integrated
🌐
CodeLlama / StarCoder
Open-source code LLMs you can run locally, fine-tune, and study architecturally.
open-source
📓
Jupyter + Python
Your primary workspace for experiments, model evaluation, and data analysis.
open-source
🛠️
HuggingFace
The platform for accessing pre-trained models, datasets, and the Transformers library.
open-source
🎯 Match the Tool interactive
Click a tool on the left, then click the matching capability on the right. Match all pairs correctly.
GitHub Copilot
HuggingFace
GPT-4
CodeLlama
Jupyter Notebooks
Interactive environment for running Python experiments and visualizing results
Real-time AI code suggestions directly inside your IDE as you type
Platform for downloading pre-trained models and NLP datasets
Multimodal reasoning model for complex code generation and analysis
Open-source code LLM you can run locally and fine-tune
Module 0 · Slide 20
Open Source vs. Commercial Models
Choosing the right model for a task is an important skill. Here is how the two major categories compare.
Commercial Models
GPT-4, Claude, Gemini
Strengths
Highest performance on complex reasoning, large context windows, multimodal capabilities, and regular updates.
Tradeoffs
API costs per token, data sent to third-party servers, rate limits, and vendor lock-in risk.
Best For
Complex code generation, debugging, architecture discussions, and rapid prototyping when quality matters most.
Open-Source Models
CodeLlama, StarCoder, DeepSeek Coder
Strengths
Free to use, self-hosted (full data privacy), customizable through fine-tuning, and community-driven development.
Tradeoffs
Require GPU hardware to run, generally lower performance on complex tasks, and less frequent updates.
Best For
Research experiments, domain-specific fine-tuning, privacy-sensitive projects, and understanding model internals.
Course Approach
We use both categories in this course. Commercial models for rapid prototyping and project development; open-source models for understanding architectures, fine-tuning experiments, and research. The best engineers know when to use each.
Module 0 · Slide 21
Interactive: Model Size Explorer
Click on a model below to explore its specifications. Understanding model scale helps you choose the right tool for the job.
GPT-4
OpenAI · Commercial
Claude 3.5 Sonnet
Anthropic · Commercial
CodeLlama-34B
Meta · Open Source
StarCoder2-15B
BigCode · Open Source
Why Size Matters
Larger models generally perform better on complex tasks, but they cost more to run and require more hardware. The trend in 2025 is toward smaller, more efficient models that match larger models on specific domains — a key research direction you will explore.
Module 0 · Slide 22
Course Projects
The heart of this course: you will build real, functional applications using AI-assisted development — what we call "vibe coding."
1
Concept & Design
Start with a problem statement. Use AI to brainstorm architectures, generate wireframes, and plan your tech stack.
2
AI-Assisted Implementation
Build your application using LLMs as coding partners. Learn when to trust, when to verify, and when to override AI suggestions.
3
Testing & Refinement
Use AI to generate tests, identify edge cases, and refactor your code. Evaluate the quality of AI-generated components.
4
Presentation & Reflection
Present your project to the class. Reflect on where AI helped, where it hindered, and what you learned about the process.
Example Projects
Stock Trading Dashboard — Real-time financial data visualization with AI-generated components
AI Interview Prep Platform — Practice coding interviews with AI-powered feedback
Automated Code Reviewer — Build a tool that reviews PRs using LLM analysis
Prompt Engineering Studio — A platform to test, compare, and optimize prompts for code generation
Vibe Coding Philosophy
The goal is not to have AI write everything for you. It is to learn the art of human-AI collaboration — knowing how to guide AI, validate its output, and integrate it into a professional development workflow.
Module 0 · Slide 23
What Is Vibe Coding?
A development approach where you describe what you want in natural language and iterate with AI until the code works. Here is the process.
1
Describe Your Intent
Write a clear, detailed prompt describing what you want to build. The better your description, the better the AI output. Include constraints, technologies, and expected behavior.
2
AI Generates Code
The LLM produces a first draft — often a working scaffold with routing, data models, and basic UI. This is your starting point, not your final product.
3
Test and Evaluate
Run the generated code. Does it work? Does it handle edge cases? Does it follow best practices? Identify what is correct, what is broken, and what is missing.
4
Refine the Prompt
Based on your evaluation, refine your instructions. Be more specific about what went wrong. Iterate until the code meets your requirements. This is where the real skill lies.
5
Integrate and Deploy
Once the components work, integrate them into your application. Add manual refinements, write tests, and prepare for deployment. Document which parts were AI-generated.
The Key Skill
Vibe coding is not about being lazy — it is about being strategically efficient. The best vibe coders understand the code AI generates, can debug it when it breaks, and know when to write code manually instead. Your projects will demonstrate this balance.
Module 0 · Slide 24
Demo Project Walkthrough
Let us trace through a real vibe coding project to see what the process looks like end to end.
Project: Stock Trading Dashboard
Goal: Build a web dashboard that displays real-time stock prices, charts, and portfolio tracking using AI-generated components.
Stage 1: Initial Prompt
"Build a React dashboard with a stock ticker search, real-time price charts using Chart.js, and a portfolio tracker. Use the Alpha Vantage API for data." — AI generates a full scaffold in ~30 seconds.
Stage 2: What Worked
Component structure — clean separation of concerns
API integration — correct endpoint usage
Basic UI — functional layout with Chart.js rendering
Stage 3: What Needed Fixing
Error handling — no fallback for API failures
Rate limiting — hit API limits immediately
Styling — generic look, needed brand customization
State management — prop drilling instead of context
Stage 4: Final Result
After 3 rounds of prompt refinement and ~2 hours of manual polish, the dashboard was fully functional, styled, and deployed. Estimated time saved vs. building from scratch: 60–70%.
What Students Actually Submit
Your project submission includes: the final application, a prompt log (all AI interactions), a reflection report (what worked, what did not, what you learned), and a live demo to the class. We grade the process as much as the product.
Module 0 · Slide 25
Assessment & Expectations
This is a hands-on course. Your grade reflects both your technical skills and your engagement with the material.
Component
Weight
Vibe Coding Projects
40%
Paper Presentations
20%
Assignments & Labs
25%
Participation
15%
What We Expect From You
Be curious. Experiment with tools beyond what is assigned.
Be critical. Do not blindly trust AI output — verify, test, and question.
Be collaborative. Share discoveries and help your peers.
Be honest. Academic integrity applies to AI-assisted work — always document how AI contributed to your submissions.
CSCI 555 (Graduate) Additional Requirements
Graduate students are expected to engage with primary research literature at greater depth, produce more rigorous project reports, and lead at least one paper presentation during the semester.
📋 What Do You Already Know? self-assessment
Rate your familiarity with each topic. This helps you understand where you are starting from — no wrong answers here.
Q1 Python programming & scripting
New to it · Some experience · Comfortable · Expert
Q2 Machine learning / deep learning concepts
New to it · Some experience · Comfortable · Expert
Q3 Using AI coding assistants (Copilot, ChatGPT, etc.)
Never used · Tried a few times · Regular user · Power user
Q4 Natural language processing (NLP)
New to it · Some exposure · Comfortable · Expert
Q5 Git & version control workflows
New to it · Basic commands · Comfortable · Expert
Module 0 · Slide 26
How to Present a Research Paper
Paper presentations are worth 20% of your grade. Here is a proven approach to delivering a clear, engaging presentation.
1
Read the Abstract and Conclusion First
Get the big picture before diving into details. Understand what the paper claims to contribute and what the authors conclude. This frames your entire reading.
2
Identify the Problem, Approach, and Key Results
Every research paper follows this structure: What problem does it solve? What method did they use? What did they find? Organize your presentation around these three pillars.
3
Create a 15-Minute Presentation
Aim for 10–12 slides. Do not try to cover everything — focus on the core contributions. Use diagrams and examples rather than walls of text. Practice your timing.
4
Include a Live Demo or Worked Example
Show the paper's technique in action. Run code, walk through an example input/output, or demonstrate a tool. This is what makes a presentation memorable.
5
Prepare 3 Discussion Questions
End with questions that spark class discussion: What are the limitations? How could this be improved? How does it connect to other topics in the course?
Pro Tip
The best presentations do not just summarize a paper — they critique it. Tell us what you think the authors got right, what they missed, and what you would do differently. Your informed opinion is what earns the highest marks.
Module 0 · Slide 27
AI Usage Policy
This course embraces AI tools — but with clear boundaries. Transparency is non-negotiable.
Encouraged
Using AI for project development — this is literally the point of the vibe coding projects
Exploring new AI tools — try Copilot, ChatGPT, Claude, and open-source alternatives
Documenting AI contributions — keep a prompt log and note which code AI generated
Using AI to learn — ask an LLM to explain concepts, then verify its answers
Allowed with Citation
Using AI to help understand concepts — cite it like you would any other source
Generating boilerplate code — acknowledge which parts were AI-generated
Getting debugging help — note that you used AI assistance in your submission
Drafting documentation — always review and refine AI-generated text
Not Allowed
Submitting AI-generated work as your own analysis — your critical thinking must be yours
Using AI for exam answers — exams test your understanding, not AI's
Not disclosing AI usage — hiding AI contributions is an academic integrity violation
Copying another student's prompts — your AI interactions should be your own
The Golden Rule
Always document your AI interactions. We value transparency. If you used AI, tell us how. This is not about restricting you — it is about building the professional habit of accountability in AI-assisted work.
Module 0 · Slide 28
Ethics & Responsible AI in SE
As future engineers, you will shape how AI is used in software development. These are the ethical considerations you need to understand.
Bias in Training Data
AI models learn patterns from their training data — including biased naming conventions, underrepresented languages, and skewed coding practices.
Non-inclusive variable naming patterns
Underperformance on non-English codebases
Reinforcing outdated practices
Environmental Cost
Training large language models requires massive computational resources with significant carbon footprints.
GPT-4 training cost estimated at ~$100M in compute
Inference costs scale with usage
Push toward efficient, smaller models
Job Displacement Concerns
Will AI replace developers? The evidence points to augmentation, not replacement — but the nature of development work is changing.
Augmentation vs. full replacement
Shifting skill requirements
New roles: prompt engineer, AI auditor
Intellectual Property
AI models trained on open-source code raise complex legal questions about copyright, licensing, and attribution.
The GitHub Copilot class-action lawsuit
Fair use in model training
License compliance in generated code
Your Responsibility
As future engineers, you will make decisions about how AI is integrated into real products used by real people. This course does not just teach you how to use AI for code — it teaches you to think critically about when and whether you should.
Module 0 · Slide 29
Your Semester Roadmap
Here is a week-by-week overview of what to expect. Each module builds on the previous one, culminating in your final project presentations.
Weeks 1–2
Intro + MSR
Course overview, setup, and mining software repositories for AI training data.
Weeks 3–4
Source Code Modeling
N-grams, language models, and statistical approaches to code prediction.
Weeks 5–6
Deep Learning
Neural networks, transformers, and architectures for code intelligence.
Weeks 7–8
Evaluation Metrics
BLEU, CodeBLEU, pass@k, and methodologies for assessing AI code quality.
Weeks 9–10
Prompting LLMs
Prompt engineering, chain-of-thought, few-shot learning, and advanced techniques.
Weeks 11–12
Hallucinations
Understanding AI errors, reliability testing, and mitigation strategies.
Weeks 13–14
Project Work
Dedicated time for vibe coding projects, office hours, and peer reviews.
Week 15
Presentations
Final project demos, reflections, and course wrap-up.
Key Dates
Paper presentation sign-ups: Week 2 — choose your paper early for the best selection
Project proposals due: Week 4 — team formation and idea pitch
Midpoint check-in: Week 8 — project progress review
Final presentations: Week 15 — demo day with invited guests
Module 0 · Slide 30
William & Mary · CSCI 455 / 555
Let's Get Started
You are about to embark on a journey at the intersection of artificial intelligence and software engineering. The future of development is collaborative — human and AI, working together.