Big Code

Creating software tools and services from massive open-source code repositories

We apply big data and machine learning techniques to massive open-source code databases to create tools and services that improve the software development process.

We are currently working on the following projects:

  • Improving decompilation of binary files.

  • Classifying source code for different purposes, such as predicting programmers' expertise level from their code.

  • Source code representation and analysis using overlaid graph representations.

  • Predicting the impact of type annotations on the runtime performance of gradually typed programs.

  • New algorithms for structured data mining on large amounts of data.