Machine Learning Engineer
Atos Madrid
Experteer Overview
As a Machine Learning Engineer at Bull, you will develop AI-driven solutions for HPC infrastructure monitoring, reliability, and cybersecurity. You will leverage large-scale telemetry and logs to build predictive models and enable proactive operations within a Kubernetes-based platform. You will validate, deploy, and operationalize models in production while collaborating with cross-functional teams to improve system availability. This role offers a chance to work at the intersection of AI, HPC, and cybersecurity in a cutting-edge R&D environment.
Compensation / Benefits
- Design and develop ML/DL models to predict hardware failures and detect anomalies in HPC systems
- Apply time-series forecasting, anomaly detection, classification, and predictive maintenance on large-scale monitoring data
- Build and maintain data pipelines from infrastructure telemetry and logs
- Perform rigorous model validation for robustness and production readiness
- Deploy and operate models within a Kubernetes environment with scalable inference and lifecycle management
- Contribute to AI-driven cybersecurity use cases detecting abnormal behaviors or intrusions
- Work within an Agile/Scrum framework, participating in sprint activities
- Collaborate with system administrators, support teams, and data engineers to translate operational challenges into data-driven solutions
- Strong experience with ML and DL frameworks (TensorFlow, PyTorch, Scikit-learn)
- Experience with time-series data and anomaly detection
- Proficiency in Python and the data science ecosystem
- Experience with Prometheus or similar monitoring/telemetry systems
- Familiarity with containerization/orchestration, especially Kubernetes
- Experience building production-grade ML pipelines
- Experience handling large-scale monitoring/operational datasets
- Understanding of distributed systems and infrastructure monitoring
- Knowledge of HPC environments, GPUs, and high-speed interconnects (e.g., InfiniBand) is highly desirable
- Proficiency with Git-based version control (GitHub, GitLab)
- Linux proficiency
- Understanding of Scrum; experience with Jira and Confluence
- Flexible work schedule: half-day Fridays
- Learning and growth: opportunities to work with advanced AI technologies