Machine learning tool trains on old code to spot bugs in new code

Posted on: May 18, 2020

Microsoft and Altran release Code Defect AI to identify potential problems in software development and suggest fixes.

Altran has released a new tool that uses artificial intelligence (AI) to help software engineers spot bugs during the coding process instead of at the end.

Available on GitHub, Code Defect AI uses machine learning (ML) to analyze existing code, spot potential problems in new code, and suggest tests to diagnose and fix the errors.

Walid Negm, group chief innovation officer at Altran, said that this new tool will help developers release quality code quickly.

"The software release cycle needs algorithms that can help make strategic judgments, especially as code gets more complex," he said in a press release.

Code Defect AI uses several ML techniques including random decision forests, support vector machines, multilayer perceptron (MLP) and logistic regression. The platform extracts, processes and labels historical data to train the algorithm and build a reliable decision model. Developers can use a confidence score from Code Defect AI that predicts whether the code is compliant or buggy.
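
To make the idea of a confidence score concrete, here is a minimal sketch of one of the listed techniques, logistic regression, written in plain Python. It is not Code Defect AI's implementation: the two features (scaled lines changed and scaled files touched), the toy training rows, and the function names `train_logistic` and `buggy_confidence` are all assumptions made for illustration.

```python
import math


def sigmoid(z):
    """Map any real number to a value between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


def train_logistic(rows, labels, lr=0.1, epochs=2000):
    """Fit logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * len(rows[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(rows, labels):
            p = sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))
            err = p - y  # gradient of the log loss with respect to the logit
            weights = [w - lr * err * xi for w, xi in zip(weights, x)]
            bias -= lr * err
    return weights, bias


def buggy_confidence(weights, bias, x):
    """Return the model's confidence (0 to 1) that a commit is buggy."""
    return sigmoid(bias + sum(w * xi for w, xi in zip(weights, x)))


# Hypothetical historical commits: [scaled lines changed, scaled files touched]
TRAIN_ROWS = [[0.1, 0.1], [0.2, 0.1], [0.15, 0.3],
              [0.8, 0.7], [0.9, 0.9], [0.7, 0.8]]
TRAIN_LABELS = [0, 0, 0, 1, 1, 1]  # 1 = commit later linked to a bug

if __name__ == "__main__":
    w, b = train_logistic(TRAIN_ROWS, TRAIN_LABELS)
    print("large commit:", round(buggy_confidence(w, b, [0.85, 0.8]), 3))
    print("small commit:", round(buggy_confidence(w, b, [0.1, 0.2]), 3))
```

A score near 1 would flag a new commit for closer review, while a score near 0 would suggest it resembles past clean commits. A real system would train on far more data and many more features.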

Here is how Code Defect AI works:

  1. For an open source GitHub project, historical data is collected using RESTful interfaces and the Git CLI. This data includes the complete commit history and the complete bug history.
  2. Preprocessing techniques such as feature identification, label encoding, one-hot encoding, data scaling and normalization are applied to the collected historical commit data.
  3. Labelling is performed on the preprocessed data. This involves understanding the pattern by which the fix commits (those that closed a bug) are tagged for each closed issue. Once the fix commits are collected, the commits that introduced the bugs are identified by backtracking through the historical changes to each file in a fix commit.
  4. If a data set contains a very small amount of bug data as compared with clean records, synthetic data is also generated to avoid bias toward the majority class.
  5. Multiple modelling algorithms are trained on the prepared data.
  6. Once a model achieves acceptable precision and recall, it is selected and deployed for prediction on new commits.
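
Steps 3, 4 and 6 above can be sketched in a few lines of plain Python. This is a simplified illustration under stated assumptions, not the tool's actual code: the `HISTORY` data is made up, backtracking is done per file rather than per changed line, and the synthetic data is generated by interpolating between two real buggy rows (a SMOTE-style approach chosen here for illustration; the article does not say which method the tool uses).

```python
import random

# Hypothetical history, oldest first: (commit id, files changed, is it a fix commit?)
HISTORY = [
    ("c1", ["parser.py"], False),
    ("c2", ["parser.py", "cli.py"], False),
    ("c3", ["docs.md"], False),
    ("c4", ["cli.py"], True),  # fix commit tagged against a closed issue
    ("c5", ["util.py"], False),
]


def label_bug_introducing(history):
    """Step 3, simplified: for every file in a fix commit, flag the most
    recent earlier commit that touched that file."""
    last_touch = {}  # file path -> id of the latest commit that changed it
    flagged = set()
    for commit_id, files, is_fix in history:
        if is_fix:
            flagged.update(last_touch[f] for f in files if f in last_touch)
        for f in files:
            last_touch[f] = commit_id
    return flagged


def oversample_minority(rows, labels, rng):
    """Step 4, simplified: add synthetic buggy rows (label 1) until the
    two classes are the same size."""
    minority = [r for r, y in zip(rows, labels) if y == 1]
    majority_count = sum(1 for y in labels if y == 0)
    new_rows, new_labels = list(rows), list(labels)
    if not minority:
        return new_rows, new_labels  # nothing to interpolate from
    while sum(new_labels) < majority_count:
        a, b = rng.choice(minority), rng.choice(minority)
        t = rng.random()  # position on the segment between a and b
        new_rows.append([x + t * (y - x) for x, y in zip(a, b)])
        new_labels.append(1)
    return new_rows, new_labels


def precision_recall(y_true, y_pred):
    """Step 6: the two metrics used to decide whether a model is acceptable."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


if __name__ == "__main__":
    print("bug-introducing commits:", label_bug_introducing(HISTORY))
    rows = [[0.1, 0.2]] * 5 + [[0.8, 0.9], [0.6, 0.7]]
    labels = [0] * 5 + [1] * 2
    _, balanced = oversample_minority(rows, labels, random.Random(0))
    print("buggy rows after balancing:", balanced.count(1))
    print("precision, recall:", precision_recall([1, 0, 1, 1], [1, 1, 0, 1]))
```

In practice the backtracking in step 3 would more likely rely on line-level history (for example via `git blame`) than on whole files, and the precision and recall thresholds that count as acceptable would depend on the project.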

Code Defect AI supports integration with third-party analysis tools and can help identify bugs in a given program's code. The tool also lets developers assess which features in the code should take higher priority for bug fixes.

"Microsoft and Altran have been working together to improve the software development cycle, and Code Defect AI, powered by Microsoft Azure, is an innovative tool that can help software developers through the use of machine learning," said David Carmona, general manager of AI marketing at Microsoft, in a press release.