I build Kubernetes-first platforms that ship fast, stay observable, and survive incidents.
AI-Powered / Open Infrastructure / FinTech & Startup Opportunities
Open to DevOps / SRE / Platform Engineering roles at FinTech companies and startups, in Europe or LATAM, and on remote-first or international engineering teams.
I focus on production-grade platforms where releases are repeatable, systems are observable, failures are recoverable, and operational context is ready for humans and AI-assisted workflows.
- Platform Engineering: Kubernetes, OpenShift, Helm, Argo CD, GitOps, Docker, Linux, KubeVirt (pet-project experience), multi-environment and hybrid infrastructure patterns.
- SRE / Production Reliability: incident response, RCA, high availability, automated failover, zero-downtime migrations, SLI/SLO thinking, error budgets, release reliability, BC/DR readiness.
- CI/CD & Release Engineering: Jenkins, GitLab CI/CD, GitHub Actions, canary deployments, automated rollbacks, quality gates, security-scan stages, deployment scripts and controlled production changes.
- IaC & Automation: Terraform, OpenTofu, Pulumi, Ansible, Python, Bash, repeatable VM/platform provisioning, multi-data-centre and on-prem/cloud-connected automation.
- Stateful Platforms: PostgreSQL, Patroni, etcd, PgBouncer, Envoy, Redis, MongoDB, MinIO, Kafka, Debezium, backups, restore readiness, replication, failover and query/index troubleshooting.
- Observability: Prometheus, VictoriaMetrics, Grafana, ELK/Kibana, OpenTelemetry, Jaeger-to-log workflows, dashboards, alerting, RED/USE, Four Golden Signals, capacity and degradation visibility.
- AI-Powered Ops / IDP: n8n, AI agents, Streamlit, Backstage-style service catalog patterns, alert enrichment, structured service metadata, runbook-friendly context and support automation.
- Security & Compliance: RBAC, SSO, LDAP/Active Directory, Keycloak, OAuth/OIDC concepts, JWT sessions, auditability, access traceability, SOC 2-aware operational controls and sensitive-data discipline.
- Startup/Product Delivery: MVP-to-production delivery with CI/CD, IaC, monitoring and guardrails; simple ML prototyping with Python and scikit-learn when it helps the product.
| Area | Signal |
|---|---|
| Production scope | 300+ microservices across Kubernetes/OpenShift |
| Platform size | 30+ Kubernetes/OpenShift nodes |
| Critical services | 99.99% measured uptime |
| Detection speed | ~30 min → under 2 min |
| DB incident MTTR | hours → ~10–20 min |
| Release delivery | every 3 days → daily production releases |
| Deployment speed | 3x faster delivery workflows |
| Toil reduction | 30+ hours/week automated |
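
For context on the uptime figure, a 99.99% availability level leaves only a few minutes of downtime per month. The sketch below shows that arithmetic only. The 30-day and 365-day windows and the function name `downtime_budget_minutes` are illustrative assumptions, not the measurement method behind the table.

```python
def downtime_budget_minutes(availability_pct: float, window_days: float) -> float:
    """Return the downtime, in minutes, that an availability target allows.

    availability_pct: target such as 99.99 (percent).
    window_days: length of the measurement window in days (assumed here).
    """
    total_minutes = window_days * 24 * 60
    return total_minutes * (1 - availability_pct / 100)


if __name__ == "__main__":
    # Hypothetical windows: a 30-day month and a 365-day year.
    for days in (30, 365):
        budget = downtime_budget_minutes(99.99, days)
        print(f"{days} days at 99.99%: {budget:.2f} min of allowed downtime")
```

Under these assumptions the allowance is about 4.32 minutes per 30 days, or about 52.56 minutes per year, which is why detection time in minutes rather than tens of minutes matters at this level.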
Merged upstream contributions focused on Kubernetes, Helm charts, observability, GitOps documentation, dashboard correctness and runtime regression coverage.
| Project | PR | What changed | Status |
|---|---|---|---|
| Kubespray | #13249 | Removed duplicated inline fallback defaults from selected download and Kubernetes preinstall role paths, relying on configured role defaults while keeping behavior unchanged. | Merged · May 27, 2026 |
| Prometheus Community Helm Charts | #6902 | Added kube-prometheus-stack support for ThanosRuler extraEnv through a strategic merge patch path for cleaner Ruler deployment extension. | Merged · May 22, 2026 |
| Open Policy Agent / Gatekeeper | #4557 | Fixed Helm webhook namespaceSelector rendering so generated exempt label values are quoted and stay stable as strings. | Merged · May 19, 2026 |
| Prometheus Community Helm Charts | #6905 | Fixed invalid kube-prometheus-stack PrometheusRule rendering when single-alert default rule groups are disabled; updated generated rules and tests. | Merged · May 17, 2026 |
| Prometheus Community Helm Charts | #6906 | Fixed KubeletDown alert generation with additional aggregation labels to avoid false positives for healthy kubelets. | Merged · May 16, 2026 |
| bpftrace | #5161 | Added runtime regression coverage around repeated for loops and mixed map value types. | Merged · May 13, 2026 |
| Kubernetes Mixin | #1219 | Fixed API server Grafana dashboard error-budget wording and percentage formatting at the upstream mixin source. | Merged · May 13, 2026 |
| Flux website | #2553 | Added AWS CodeCommit SSH authentication documentation for Flux source git and bootstrap git workflows. | Merged · May 11, 2026 |
| Prometheus Community Helm Charts | #6901 | Fixed duplicate thanos.image rendering in kube-prometheus-stack generated Prometheus custom resources. | Merged · May 11, 2026 |
| Repository | What it demonstrates |
|---|---|
| heritage-infra | End-to-end infrastructure rollout: VM preparation, Kubernetes automation, GitOps delivery, Helm charts and Vault integration. |
| heritage-cicd | Reference CI/CD pipeline with quality gates, security scanning, container builds, semantic release and GitOps deployment. |
| heritage-vm-create | Terraform + Pulumi automation for repeatable VM provisioning and infrastructure rollout patterns. |
| social-project | Startup MVP baseline for a professional social-networking product. |
| DB-Interface-childs-policl | C# + MySQL educational DB interface with procedures, triggers, functions and application logic. |
DevOps / SRE Engineer — Alfa-Bank environment / vendor-side delivery
DFA / RWA / blockchain-backed banking platform · Apr 2023 — Present
- Operated reliability and deployment workflows for 300+ microservices across Kubernetes/OpenShift clusters.
- Improved release flow from every three days to daily production releases with safer rollout controls.
- Operated PostgreSQL HA with Patroni, etcd, PgBouncer and Envoy: failover, pooling, read/write separation and zero-downtime maintenance.
- Built AI-powered operational workflows with n8n, AI agents and structured service metadata for ticket/status aggregation, alert enrichment and faster triage.
- Led an internal Python + Streamlit operations platform deployed on Kubernetes with LDAP auth, JWT sessions, read-only DB replicas, Swagger discovery, SQL utilities, Kubernetes config diffing, certificate checks and Jira/n8n integrations.
- Delivered infrastructure automation across on-prem and cloud-connected environments using Terraform, OpenTofu/Pulumi, Ansible, Jenkins, Python and Bash.
- Supported secure banking-grade operations: RBAC/SSO, LDAP/AD, Keycloak, controlled production changes, auditability and sensitive-data discipline.
Lead System Administrator / DevOps — Block4Block
Blockchain infrastructure · May 2022 — Dec 2022
- Automated provisioning, deployment and operations for blockchain nodes across Ethereum and Polygon using Docker and Ansible.
- Built infrastructure for a crypto ETF portfolio product and supported production transactional workloads from day one.
- Built Python Telegram bots and SQL dashboards for real-time blockchain analytics and operational visibility.
- Higher School of Economics (HSE) — Product Management / MBA track
- Innopolis University — Software Engineering & Machine Learning Technologies, graduated with honors
- MIREA — Russian Technological University — Informatics and Programming
- Rostelecom + MIREA — DevOps Engineering & Site Reliability Engineering, graduated with honors
- Russian — Native
- English — B2
- Email: zakhardenn@gmail.com
- Telegram: @Zakhardenn
- LinkedIn: linkedin.com/in/zakharden
- GitHub: github.com/Zakharden
I build AI-powered, highly reliable platforms that are boring in production, fast in delivery, and clear during incidents.


