Ransomware is a serious cybersecurity threat. In recent years it has gained the ability to evade traditional detection systems by manipulating user-level signals. This thesis evaluates how effectively machine learning models can detect ransomware early and robustly by analyzing low-level memory and storage I/O patterns captured at the hypervisor, a data source that is more resistant to tampering from within the guest system.
The research systematically compared three machine learning architectures: non-sequential, recurrent, and attention-based. The comparison used a public dataset and a cross-validation framework to evaluate performance, efficiency, and generalization. The main finding is that the non-sequential model outperformed the other two on all evaluation metrics, suggesting that the overall statistical properties of a trace matter more than the temporal order of its events.
The research also measured the direct trade-off between detection speed and accuracy, and it found that the models generalized inconsistently to ransomware variants with unfamiliar behaviors. The experiments confirm that hypervisor-level monitoring is a viable approach and that a non-sequential model provides the most practical and effective solution for this specific detection context. The findings highlight the important role of feature representation in model selection and support a tiered strategy for ransomware defense.