SEMERU Lab
Software Engineering Maintenance and Evolution Research Unit — advancing the science of intelligent software development through deep learning and program analysis.
Mission
With over two decades of sustained research output, SEMERU has shaped the field of AI for software engineering. Our work on code search, software traceability, and neural code models has been widely adopted by both the academic community and industry practitioners.
By the Numbers
Research Areas
SEMERU's research program spans five major areas at the forefront of intelligent software engineering:
Deep Learning for SE
Designing and training neural architectures for code understanding, generation, and transformation. From sequence-to-sequence models to large language models applied to software tasks.
Semantic Code Search
Building intelligent code retrieval systems that understand developer intent. Using natural language queries to find relevant code snippets across massive codebases.
Program Repair
Automated techniques for detecting, localizing, and fixing software bugs. Combining static analysis with neural models to generate candidate patches.
Software Traceability
Recovering and maintaining trace links between requirements, design artifacts, and code. Using information retrieval and deep learning to keep these artifacts consistent as software evolves.
Neural Code Models
Pre-training and fine-tuning code-aware language models for downstream SE tasks including clone detection, code summarization, and vulnerability detection.
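The information-retrieval side of code search and traceability recovery mentioned above can be illustrated with a small sketch. The Python below ranks software artifacts against a natural language query using TF-IDF weighting and cosine similarity. It is a minimal illustration under stated assumptions, not SEMERU's implementation or the behavior of any lab tool: the function names (tokenize, tfidf_vectors, cosine, rank) and the sample artifacts are hypothetical, and the smoothed IDF formula is one common choice among several.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Split text into lowercase terms, breaking camelCase identifiers apart."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t for t in re.split(r"[^a-z0-9]+", spaced.lower()) if t]


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption; other weighting schemes are equally valid).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in df.items()}
    return [{t: tf * idf[t] for t, tf in Counter(doc).items()} for doc in docs]


def cosine(a, b):
    """Cosine similarity between two sparse vectors; 0.0 if either is empty."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank(query, artifacts):
    """Rank artifacts (name -> text) by similarity to the query, best first."""
    names = list(artifacts)
    # The query is vectorized together with the artifacts so they share IDF weights.
    docs = [tokenize(artifacts[name]) for name in names] + [tokenize(query)]
    vectors = tfidf_vectors(docs)
    query_vec = vectors[-1]
    scored = [(name, cosine(query_vec, vec)) for name, vec in zip(names, vectors[:-1])]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical artifacts, for illustration only.
sample_artifacts = {
    "auth.py": "def validatePassword(user, password): check password hash",
    "report.py": "def renderChart(data): draw bar chart",
}
ranking = rank("user password validation", sample_artifacts)
```

With the sample artifacts, auth.py ranks first because it shares the terms "user" and "password" with the query, while report.py shares none. Purely lexical matching of this kind misses synonyms and paraphrases, which is the gap that learned, neural representations of code and text aim to close.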
Key Publications
A selection of influential papers from the SEMERU Lab:
Tools & Artifacts
Open-source tools and datasets developed by the SEMERU Lab:
CoDiSum
A neural model for automatic commit message generation. Uses code diffs to produce concise, human-readable commit messages that describe code changes.
CSDA
Code Search with Deep Attention: a semantic code search engine that uses neural attention mechanisms to match natural language queries to code.
TraceLink
An information retrieval framework for recovering traceability links between software artifacts, supporting requirements-to-code and design-to-test mappings.
Code Clone Benchmark
A large-scale benchmark for evaluating code clone detection tools across multiple programming languages, including Type-1 through Type-4 clones.
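For readers unfamiliar with the clone taxonomy used above: Type-1 clones are identical apart from whitespace, layout, and comments; Type-2 clones additionally differ in identifier names and literal values; Type-3 clones have statements added, removed, or modified; Type-4 clones implement the same behavior with different syntax. The Python sketch below shows how the first two types can be checked by token normalization. It is an illustrative assumption, not the benchmark's tooling: the function names normalized_tokens and clone_type are hypothetical, it handles Python source only, and it does not verify that identifiers are renamed consistently.

```python
import io
import keyword
import tokenize


def normalized_tokens(source, abstract_names=False):
    """Tokenize Python source, dropping comments and layout differences.

    With abstract_names=True, identifiers and literals are replaced by
    placeholders so that renamed copies produce the same token sequence.
    """
    out = []
    for tok in tokenize.generate_tokens(io.StringIO(source).readline):
        if tok.type in (tokenize.COMMENT, tokenize.NL, tokenize.ENDMARKER):
            continue  # comments and blank lines do not affect clone type
        if tok.type == tokenize.NEWLINE:
            out.append("<NEWLINE>")
        elif tok.type == tokenize.INDENT:
            out.append("<INDENT>")  # ignore tabs vs. spaces and indent width
        elif tok.type == tokenize.DEDENT:
            out.append("<DEDENT>")
        elif (abstract_names and tok.type == tokenize.NAME
              and not keyword.iskeyword(tok.string)):
            out.append("<ID>")
        elif abstract_names and tok.type in (tokenize.NUMBER, tokenize.STRING):
            out.append("<LIT>")
        else:
            out.append(tok.string)
    return out


def clone_type(a, b):
    """Return "Type-1", "Type-2", or None if neither check matches."""
    if normalized_tokens(a) == normalized_tokens(b):
        return "Type-1"
    if normalized_tokens(a, True) == normalized_tokens(b, True):
        return "Type-2"
    return None  # Type-3 and Type-4 need similarity or semantic analysis


original = "def add(x, y):\n    return x + y\n"
reformatted = "def add(x, y):  # sum two values\n\treturn x + y\n"
renamed = "def total(a, b):\n    return a + b\n"
rewritten = "def add(x, y):\n    return sum([x, y])\n"
```

Here clone_type(original, reformatted) reports a Type-1 clone and clone_type(original, renamed) a Type-2 clone, while clone_type(original, rewritten) returns None even though the two functions behave the same for numbers. Pairs like the last one are where token-level comparison stops working and where similarity-based or learned detectors become necessary.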
Team
Meet the researchers behind the SEMERU Lab: