Architecting Resilient Cloud Infrastructure & Automating DevOps Pipelines
Hi, I'm Bharath A C, a Cloud, DevOps, and SRE Engineer and Master of Engineering student. I'm passionate about automating software delivery, building AIOps and MLOps pipelines, and managing high-availability infrastructure deployments.
Cloud Computing
Master of Engineering • GPA: 9.56/10
SAP Labs: BTP
BTP Core • Cloud Foundry • Kyma • DMS
AWS Cloud, AI & DevOps
Certifications
AI-Driven Platform Engineering
Agentic AI • SRE • GitOps
Professional Journey
Software Engineer Intern @ SAP Labs
BTP Core — SAP Document Management Service (SDM)
Worked in the SAP Business Technology Platform (BTP) Core division. SDM provides centralized, secure handling of unstructured business content for Cloud ERP environments, covering storage, retrieval, governance, and document lifecycle management on a scalable cloud architecture.
- Developed a multi-threaded Python toolkit to enable forward and reverse metadata migration between MongoDB and PostgreSQL, improving data portability and migration efficiency.
- Worked with Cloud Foundry on SAP BTP to deploy, manage, and monitor applications, gaining hands-on experience in cloud-native application lifecycle management.
- Set up and operated self-managed PostgreSQL clusters with High Availability (HA).
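As a rough illustration of the forward and reverse metadata migration idea, here is a minimal sketch. It is not the actual toolkit: the field names in `FIELD_MAP`, the helper names, and the in-memory records are all hypothetical, and real database reads and writes are left out so the example stays self-contained.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical mapping from MongoDB-style document keys to PostgreSQL-style column names.
FIELD_MAP = {"_id": "id", "fileName": "file_name", "mimeType": "mime_type"}
# The reverse direction is derived by inverting the forward mapping.
REVERSE_MAP = {pg: mongo for mongo, pg in FIELD_MAP.items()}


def convert(record, mapping):
    """Rename keys according to mapping; unmapped keys pass through unchanged."""
    return {mapping.get(key, key): value for key, value in record.items()}


def migrate(records, mapping, workers=4):
    """Convert records on a thread pool; executor.map preserves input order."""
    with ThreadPoolExecutor(max_workers=workers) as executor:
        return list(executor.map(lambda record: convert(record, mapping), records))


mongo_docs = [
    {"_id": "a1", "fileName": "invoice.pdf", "mimeType": "application/pdf"},
    {"_id": "b2", "fileName": "scan.png", "mimeType": "image/png"},
]
pg_rows = migrate(mongo_docs, FIELD_MAP)      # forward: MongoDB -> PostgreSQL shape
round_trip = migrate(pg_rows, REVERSE_MAP)    # reverse: PostgreSQL -> MongoDB shape
```

Threads help most when each unit of work waits on database I/O; in this toy version the conversion is pure CPU work, so the pool only demonstrates the structure.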
Software Engineer Intern @ Robosoft Technologies
Web Engineering Division
Engineered responsive web applications and backend API interfaces, adhering to modern software design patterns and SDLC guidelines.
- Developed responsive user interface components using React and managed application state efficiently with Redux.
- Built and integrated secure backend RESTful APIs using Node.js and Express.js with database layers.
- Collaborated using Git, participated in Scrum sprints and code reviews, and followed standard SDLC workflows.
Platform Engineering — Live View
$ watch -n 0 'commit → build → test → iac → config → deploy → observe'
- COMMIT
- BUILD
- TEST
- SCAN
- IAC PLAN
- IAC APPLY
- CONFIG
- DEPLOY
- OBSERVE
Operations Registry
AIOps Agentic Self-Healing Kubernetes Platform
An intelligent AIOps agentic platform that automates incident detection, Root Cause Analysis (RCA), and self-healing within Kubernetes clusters.
- Built a Multi-Agent System using LangGraph for coordinated monitoring, analysis, remediation, and reporting.
- Integrated Prometheus, Alertmanager, and Loki to enable real-time observability and log aggregation.
- Implemented RAG-based Root Cause Analysis using ChromaDB vector database and LLMs via Ollama.
- Developed policy-driven Auto-Remediation loops (restarts, scale-ups, diagnostic runbooks) using the Kubernetes Python Client.
- Automated build and deployment workflows using Jenkins CI/CD pipelines, sending alert responses directly to Discord.
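The policy-driven remediation step can be pictured as a lookup from alert type to action, with a guard on scaling. The sketch below is only an assumption about how such a policy might look: the alert names, the `POLICY` table, `MAX_REPLICAS`, and the `plan_remediation` helper are hypothetical, and no Kubernetes API is called.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alert:
    name: str
    namespace: str
    workload: str


# Hypothetical policy table: alert name -> remediation action.
POLICY = {
    "PodCrashLooping": "restart",
    "HighCPUUsage": "scale_up",
}
DEFAULT_ACTION = "run_diagnostics"  # fallback for alerts without an explicit policy
MAX_REPLICAS = 10                   # assumed safety cap for automatic scale-ups


def plan_remediation(alert, current_replicas):
    """Return a remediation plan as a dict; the caller decides how to execute it."""
    action = POLICY.get(alert.name, DEFAULT_ACTION)
    if action == "scale_up":
        if current_replicas >= MAX_REPLICAS:
            # Never scale past the cap automatically; hand over to a human.
            return {"action": "escalate", "reason": "replica cap reached"}
        return {"action": "scale_up", "replicas": current_replicas + 1}
    return {"action": action}
```

Keeping the decision separate from execution makes the policy easy to unit test; in a real loop the returned plan would then be handed to whatever executes it, for example code built on the Kubernetes Python client.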
AI Infrastructure Governance Platform
An AI-powered multi-agent auditing engine that scans Kubernetes, Terraform, and Helm configurations for security vulnerabilities, reliability risks, and cost optimization opportunities.
- Built a sequential LangGraph pipeline orchestrating specialized agents (Security, Reliability, Cost, Architecture Reviewer).
- Combines deterministic checks (tfsec, checkov) with local LLMs (Gemma via Ollama) for contextual reasoning.
- Utilizes ChromaDB to store report histories, track posture scores (0-100), and find similar past risk profiles.
- Supports multi-cloud (AWS, GCP, Azure) and parses Helm charts server-side via Helm CLI.
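One simple way to turn scanner findings into a 0-100 posture score is to subtract a severity-weighted penalty from 100. The weights and the finding format below are illustrative assumptions, not the scoring the platform actually uses.

```python
# Hypothetical penalty per finding, by severity.
SEVERITY_PENALTY = {"critical": 25, "high": 10, "medium": 4, "low": 1}


def posture_score(findings):
    """Return a score from 0 to 100; unknown severities carry no penalty."""
    penalty = sum(SEVERITY_PENALTY.get(finding["severity"], 0) for finding in findings)
    return max(0, 100 - penalty)  # clamp so many findings cannot go below 0


findings = [
    {"tool": "tfsec", "severity": "critical"},
    {"tool": "checkov", "severity": "low"},
]
score = posture_score(findings)
```

A deterministic score like this is easy to store next to each report and compare across runs.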
Progressive Delivery & Canary Deployments
Implemented production-grade progressive delivery strategies on Kubernetes using Service Meshes and GitOps pipelines to launch dynamic website releases safely.
- Designed Canary strategies using Argo Rollouts, Istio Service Mesh, and Flagger.
- Automated CI with Jenkins and GitOps-based deployment with ArgoCD, driven by Helm chart configurations.
- Configured advanced observability dashboards in Grafana with Prometheus, Jaeger (tracing), and EFK stack logs.
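The core decision in a canary rollout is whether to shift more traffic or roll back. Tools such as Argo Rollouts and Flagger handle this through their own resources and analysis runs; the sketch below only illustrates the logic in plain Python, with hypothetical step weights and an assumed error-rate threshold.

```python
CANARY_STEPS = [10, 25, 50, 100]  # hypothetical traffic weights in percent
MAX_ERROR_RATE = 0.01             # assumed threshold: above 1% errors, roll back


def next_canary_weight(current_weight, error_rate):
    """Return the next traffic weight, or 0 to signal a rollback."""
    if error_rate > MAX_ERROR_RATE:
        return 0
    for step in CANARY_STEPS:
        if step > current_weight:
            return step
    return 100  # already fully promoted
```

For example, a canary at 25% with a 0.5% error rate advances to 50%, while the same canary with a 5% error rate drops back to 0%.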
Kubernetes VIND Deployment Architecture
A complete CI/CD automation pipeline deploying the CineVerse static movie website to an isolated virtual Kubernetes cluster inside GitHub Actions using VIND.
- Leverages Loft's VIND (Virtual-cluster IN Docker) to spin up CNCF-conformant control planes inside Docker containers.
- Enables fast boot times (~30s) and lower overhead than VM-based local clusters, such as Minikube running with a VM driver.
- Configured container workloads with persistent volume mounts, environment context mapping, and multi-node clusters.
- Automated ephemeral environment workflows, provisioning test clusters and clean teardowns within pipeline runs.
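The provision, test, and teardown pattern for ephemeral clusters maps naturally onto a context manager, which guarantees cleanup even when tests fail. This is a sketch under assumptions: `ephemeral_cluster`, `fake_provision`, and `fake_teardown` are hypothetical names, and the fakes only record events instead of invoking any real cluster tooling.

```python
from contextlib import contextmanager


@contextmanager
def ephemeral_cluster(name, provision, teardown):
    """Provision a cluster, yield its handle, and always tear it down afterwards."""
    handle = provision(name)
    try:
        yield handle
    finally:
        teardown(handle)  # runs on success and on failure


events = []


def fake_provision(name):
    events.append(f"create {name}")
    return name


def fake_teardown(handle):
    events.append(f"delete {handle}")


try:
    with ephemeral_cluster("ci-run", fake_provision, fake_teardown) as cluster:
        events.append(f"test on {cluster}")
        raise RuntimeError("simulated test failure")
except RuntimeError:
    pass  # the failure is expected here; teardown has already run
```

In a pipeline, the two callables would wrap whatever commands create and delete the virtual cluster.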
CrewAI Document Scanning Agent
An AI-powered multi-agent system executing local-first Q&A and classification against internal organization documents.
- Configured collaborative agent roles, tasks, goals, and backstories via YAML configurations.
- Implemented 100% local document scanning using Gemma 4 via Ollama with no data sent to external cloud APIs.
- Integrated nomic-embed-text for local RAG indexing, parsing, and semantic PDF document search.
- Scaffolded workflow with CrewAI and optimized dependencies using uv package manager.
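Semantic document search over locally generated embeddings usually comes down to ranking stored vectors by similarity to a query vector. The sketch below uses cosine similarity on tiny hand-written vectors; the file names and numbers are made up, and a real index would hold the much longer vectors produced by an embedding model.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_match(query_vec, indexed):
    """Return the name of the indexed document most similar to the query vector."""
    return max(indexed, key=lambda item: cosine_similarity(query_vec, item[1]))[0]


# Hypothetical index of (document name, embedding) pairs.
INDEX = [
    ("leave-policy.pdf", [0.9, 0.1, 0.0]),
    ("expense-guide.pdf", [0.1, 0.8, 0.3]),
]
best = top_match([1.0, 0.0, 0.0], INDEX)
```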
Secure & Scalable 3-tier AWS Architecture
Designed and implemented a highly-available, secure, fault-tolerant 3-tier architecture hosting dynamic web services on AWS.
- Established public and private subnets within a VPC to secure the database layer. Hosted the frontend on S3/CloudFront and the application layer on EC2.
- Configured RDS MySQL, Auto Scaling Groups, an Elastic Load Balancer (ELB), and Route 53 DNS routing.
- Automated serverless reporting pipelines with AWS Lambda and provisioned infrastructure via CloudFormation stacks.
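Planning the public and private subnet ranges for a VPC can be checked with Python's standard `ipaddress` module before anything is provisioned. The CIDR block, prefix length, and `plan_subnets` helper below are illustrative assumptions, not the values used in this project.

```python
import ipaddress
from itertools import islice


def plan_subnets(vpc_cidr, new_prefix=24, per_tier=2):
    """Carve the first subnets out of a VPC CIDR and split them into two tiers."""
    vpc = ipaddress.ip_network(vpc_cidr)
    # subnets() is a generator, so islice avoids materializing every block.
    blocks = [str(net) for net in islice(vpc.subnets(new_prefix=new_prefix), per_tier * 2)]
    return {"public": blocks[:per_tier], "private": blocks[per_tier:]}


layout = plan_subnets("10.0.0.0/16")
```

With two subnets per tier, each tier can span two Availability Zones, which is the usual starting point for a highly available layout.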
AI News Curator
An AI-powered local-first news extraction and newsletter generation orchestrator utilizing model-guided browser automation.
- Uses Amazon Nova Act browser-automation agent SDK to execute Google News searches and UI workflows.
- Runs parallel browser sessions for multiple topics with trace replays for visual debugging of steps.
- Enforces structured schema extraction via Pydantic, generates summaries, and deduplicates headlines.
- Features dual-persistence (CSV + MongoDB) with a Flask dashboard for visual summary reading.
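Structured extraction followed by headline deduplication can be sketched as below. A standard-library dataclass stands in for the Pydantic model here so the example has no dependencies; the `Headline` fields and the normalization rule are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Headline:
    title: str
    source: str

    def __post_init__(self):
        # Minimal validation, standing in for a schema library's checks.
        if not self.title.strip():
            raise ValueError("title must not be empty")


def normalize(title):
    """Lowercase and collapse whitespace so trivially different titles compare equal."""
    return " ".join(title.lower().split())


def deduplicate(headlines):
    """Keep the first occurrence of each normalized title, preserving order."""
    seen = set()
    unique = []
    for headline in headlines:
        key = normalize(headline.title)
        if key not in seen:
            seen.add(key)
            unique.append(headline)
    return unique


items = [
    Headline("AI Chips  Surge", "A"),
    Headline("ai chips surge", "B"),
    Headline("Cloud costs fall", "C"),
]
unique_sources = [headline.source for headline in deduplicate(items)]
```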
OpenTelemetry Instrumentation Demo
Implemented end-to-end distributed tracing, metric collection, and log correlation across microservices using OpenTelemetry APIs and collectors.
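Log correlation works by stamping each log line with the trace ID carried in the request context. OpenTelemetry propagates that context by default using the W3C Trace Context `traceparent` header; the sketch below parses the header by hand with the standard library instead of using the OpenTelemetry SDK, and the `parse_traceparent` and `log_line` helpers are hypothetical names. It checks only the basic format, not every rule in the specification.

```python
import re

# traceparent layout: version-traceid-parentid-flags, all lowercase hex.
TRACEPARENT = re.compile(r"^([0-9a-f]{2})-([0-9a-f]{32})-([0-9a-f]{16})-([0-9a-f]{2})$")


def parse_traceparent(header):
    """Return trace context fields, or None if the header is malformed."""
    match = TRACEPARENT.match(header)
    if not match:
        return None
    _version, trace_id, parent_id, flags = match.groups()
    return {
        "trace_id": trace_id,
        "parent_id": parent_id,
        "sampled": bool(int(flags, 16) & 1),  # lowest flag bit marks a sampled trace
    }


def log_line(message, header):
    """Prefix a log message with the trace ID so logs can be joined to traces."""
    context = parse_traceparent(header)
    trace_id = context["trace_id"] if context else "-"
    return f"trace_id={trace_id} {message}"


header = "00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01"
line = log_line("payment ok", header)
```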
Technical Skills & Toolkit
DevOps, Security & CI/CD
Cloud Infrastructure
Languages & DBs
Observability & Monitoring
AI & Intelligent Agents
Tech Ecosystem
Certifications
Committed to continuous learning and cloud excellence. Below are my major certificates, which you can verify in my shared Google Drive repository.
Academic Foundation
Master of Engineering (M.Eng)
Specialization in Cloud Computing
Manipal School of Information Sciences (MSIS)
Manipal Academy of Higher Education (MAHE), Manipal
07/2024 – Present
Bachelor of Engineering (B.Eng)
Computer and Communication Engineering
NMAM Institute of Technology (NITTE)
Visvesvaraya Technological University, Belagavi
12/2020 – 03/2024
Initialize Connection
Let's build something secure.
Whether you want to discuss AIOps self-healing systems or progressive delivery pipelines on AWS/SAP BTP, or you have a software engineering opportunity, feel free to drop a message.
