AI-BASED ANOMALY DETECTION IN SOFTWARE LOGS FOR PROACTIVE FAULT DIAGNOSIS AND SELF-HEALING

Authors

  • Rizwan Iqbal Author

Keywords:

AI-powered anomaly detection, software logs, machine learning, deep learning, self-healing systems, log analysis, LSTM, Transformer models, contrastive learning, proactive fault diagnosis, reinforcement learning, explainable AI, system resilience, automated fault recovery, real-time anomaly detection

Abstract

Modern software systems generate vast volumes of logs, making manual analysis impractical for effective monitoring and fault diagnosis. This paper proposes an architecture leveraging advanced machine learning and deep learning techniques—such as LSTMs, Transformers, contrastive learning, and self-supervised learning—to detect anomalous patterns in logs for proactive fault diagnosis and self-healing. Traditional rule-based methods fall short in handling the complexity and scale of contemporary systems. Reinforcement learning and rule-based automation further enable fault correction, reducing system downtime. Evaluation across various log datasets using metrics like precision, recall, F1-score, and AUC-ROC shows Transformer-based models outperform traditional methods, albeit with higher computational demands. The proposed self-healing systems reduce downtime by up to 68.2%, highlighting AI’s potential to enhance system resilience. However, challenges remain in model interpretability, computational cost, and real-time deployment. Addressing these through lightweight models, explainable AI, and scalable deployment is key to advancing AI-driven anomaly detection in safety-critical systems. This work also offers a state-of-the-art review and outlines future research directions to improve practicality and scalability.

Downloads

Published

2025-06-30