External libraries are widely used to expedite software development, but like any software component, they are updated over time, introducing new features and deprecating or removing old ones. When a library introduces breaking changes, all its clients must be updated to avoid disruptions. A dependency update that introduces such a breaking change is called a Breaking Dependency Update. Repairing these breakages is challenging and time-consuming because the error originates in the dependency, while the fix must be applied to the client codebase. Automatic Program Repair (APR) is a research area focused on developing techniques to repair code failures without human intervention. With the advent of Large Language Models (LLMs), learning-based APR techniques have significantly improved in software repair tasks. However, their effectiveness on Breaking Dependency Updates remains unexplored. This thesis investigates the efficacy of an LLM-based APR approach to Breaking Dependency Updates and examines the impact of different components on the model’s performance and efficiency. The focus is on two components: the API differences between the old and new versions of the dependency, and a set of error-type specific repair strategies. Experiments conducted on a subset of BUMP, a new benchmark for Breaking Dependency Updates, with a strong focus on build failures, demonstrate that a naive approach to these client breakages is insufficient: additional context from the dependency changes is necessary. Furthermore, error-type specific repair strategies are essential to repair some blocking failures that prevent the tool from completely repairing the projects. Finally, our research found that GPT-4, Gemini, and Llama exhibit similar efficacy levels but differ significantly in cost-efficiency, with GPT-4 having the highest cost per repaired failure among the tested models, almost 30 times higher than Gemini.
arXiv:2407.18760 Java-Class-Hijack: Software Supply Chain Attack for Java based on Maven Dependency Resolution and Java Classloading
We introduce Java-Class-Hijack, a novel software supply chain attack that enables an attacker to inject malicious code by crafting a class that shadows a legitimate class that is in the dependency tree. We describe the attack, provide a proof-of-concept demonstrating its feasibility, and replicate it in the German Corona-Warn-App server application. The proof-of-concept illustrates how a transitive dependency deep within the dependency tree can hijack a class from a direct dependency and entirely alter its behavior, posing a significant security risk to Java applications. The replication on the Corona-Warn-App demonstrates how compromising a small JSON validation library could result in a complete database takeover.
Toast Endet
Developed a cloud-based full-stack application with 15+ modules using Laravel, React, Tailwind, MySQL and Docker to manage the delivery process of an SME. Designed a SQL database schema to manage all the information, reducing data inconsistencies by more than 85% compared with the previous system. Implemented a REST API to allow communication with third-party apps, reducing the time needed to switch to the new platform by 35%. Built a mobile app using React Native to allow warehouse workers to insert data into the system, eliminating 65% of the interactions between the warehouse and the back-office. Increased the number of orders processed by the company by 10% YoY.
Double Master's degree in computer science and engineering, with a focus on Data Science and Machine Learning. Minor in ICT Innovation with a focus on startup development and digital innovation.