Data-Driven Software Engineering Department

The Data-Driven Software Engineering Department (dataSED) aims to advance the frontiers of software engineering by building on the wealth of data produced during software development and operation. The goal is to support software engineers in the analysis, evolution, and operation of large and complex software-intensive systems. Our focus is on the development and empirical evaluation of custom machine learning and data mining techniques that solve software engineering problems by providing evidence-based, actionable insights. These techniques operate on various types of data, including source code, change histories from version control systems, data from issue tracking databases, logs from building, deploying, and testing the system, and run-time information collected through logging and instrumentation.

Research activities in dataSED address four areas of software engineering:

  1. Cybersecurity, in particular, automated identification and repair of software security vulnerabilities;
  2. Software Resilience, through adaptive bio-inspired approaches, to create autonomously self-healing systems;
  3. Intelligent Analytics, to deal with the vast amounts of data produced in iterative development processes, such as continuous engineering;
  4. Recommendation Systems, aimed at smarter evolution and testing of software-intensive systems.

We aim to work in close collaboration with industry, both to ensure that our research addresses questions of practical value and to evaluate candidate solutions under real-life conditions. Our research is firmly rooted in well-established disciplines of software engineering, such as software repository mining, program analysis, software reverse engineering, generic language technology, and empirical software engineering.

