Tech Lead Manager

Super Dispatch

Kazakhstan
Permanent position
Full-time

Posted 10 days ago

About the Role

The Tech Lead Manager (TLM) for the Platform squad at Super Dispatch is a hybrid leadership role that combines deep technical ownership with people management. Unlike a traditional Engineering Manager role, this position requires you to be the primary technical authority for our platform infrastructure: the person who makes architecture decisions, leads incident response, and directly contributes to critical platform systems.

The Platform squad owns the foundation that all product squads build on: infrastructure (GCP, Kubernetes, CloudSQL, RabbitMQ, Elasticsearch), CI/CD pipelines, observability (Datadog), identity and authentication services, security controls, and developer experience tooling. With a team of ~4-5 engineers, this role demands someone who can personally drive technical direction while growing and managing the team.

You will report to the CTO and collaborate closely with Engineering Managers across all product squads, as well as Security, Analytics, and Customer Support teams.

What You'll Do

Your work will split roughly 60% technical leadership / 40% people management, though this will shift based on team and organizational needs.

You will own platform architecture and technical direction.

Own the technical roadmap for the Platform squad, including infrastructure modernization, reliability improvements, and cost optimization.
Make and document architecture decisions (ADRs) that affect the entire engineering organization - service decomposition, API contracts, database strategies, and infrastructure patterns.
Design and implement cross-cutting platform capabilities: identity/auth services, observability pipelines, deployment infrastructure, and security controls.
Drive API-first design practices across services, including OpenAPI specifications and generated client libraries for Go, Python, and Java consumers.
Evaluate and adopt new technologies and tools (e.g., transitioning observability to Datadog, implementing row-level security in databases, adopting infrastructure-as-code with Terraform).
Lead cost analysis and optimization of cloud infrastructure.
Be the go-to technical escalation point for platform-related questions from all product squads.

You will lead incidents and ensure platform reliability.

Own incident response for platform-level outages (SEV-0/SEV-1), coordinating across squads to restore service.
Define and maintain runbooks, monitoring alerts, and escalation procedures.
Conduct post-incident reviews and drive follow-up action items to prevent recurrence.
Set and track platform reliability metrics (uptime, latency percentiles, deployment frequency, MTTR).
Design and implement resilience patterns: circuit breakers, graceful degradation, database failover strategies.

You will be hands-on in code and infrastructure.

Contribute directly to platform services - writing production code in Python, Go, or other languages as needed.
Review code and architecture proposals from platform engineers and cross-squad contributions to shared infrastructure.
Manage deployment configurations (Kubernetes manifests, Helm charts, ArgoCD), secrets management (Vault, 1Password), and CI/CD pipelines (GitHub Actions).
Set high standards for coding, testing, deployment, and monitoring practices within the squad and across the organization.

You will manage and grow the platform team.

Manage a team of ~4-5 platform engineers with regular 1:1s, career development conversations, and performance reviews.
Coach engineers on both technical depth and breadth - helping backend engineers grow into infrastructure and reliability expertise.
Identify hiring needs and technical skill gaps; lead recruiting efforts for the platform squad.
Onboard new team members effectively, building their context on a complex, cross-cutting codebase.
Foster a collaborative culture where product squads feel supported (not blocked) by the platform team.
Delegate effectively - empower team members to own subsystems while maintaining architectural coherence.

You will communicate and coordinate across the organization.

Proactively communicate platform changes, maintenance windows, and new capabilities to engineering and non-engineering stakeholders.
Partner with product squad EMs to understand their infrastructure needs and pain points.
Coordinate with Security and Compliance on audit logging, access controls, and data protection requirements.
Collaborate with Data/Analytics teams on database access policies, ETL pipelines, and data governance.

Competencies

If you consider yourself an eager learner, a conscientious worker, and a thoughtful, kind, supportive human, you might just thrive at Super Dispatch.

To be successful, you will need a combination of deep technical skills and leadership abilities. We expect you are:

Technically deep - you can debug a production database replication issue at 2 AM, design a new service architecture on a whiteboard, and review a Kubernetes deployment manifest with equal confidence.
Proactive - you act without being told what to do. You identify reliability risks before they become incidents and technical debt before it slows the team.
Pragmatic - you make sound trade-offs between engineering perfection and business velocity. You know when a 4 ms response time is good enough and when to stop optimizing.
Fast-moving - you execute quickly and get things done, while maintaining the quality bar expected of platform infrastructure.
Growth-driven - you seek growth in learning and efficiency, and you celebrate wins.
Customer-focused - you treat product squads as your customers and empathize with their needs and constraints.
Strong communicator - you can explain a database failover strategy to engineers and a platform investment to leadership with equal clarity. You communicate comfortably in English (speaking and writing).

As a technical leader:

You have strong opinions, loosely held - you drive decisions forward while remaining open to better ideas.
You can evaluate complex system designs and identify where they will break at scale.
You balance "build vs. buy" decisions thoughtfully, considering long-term maintenance burden.
You write clear ADRs and technical documentation that help future engineers understand the "why" behind decisions.
You stay current with infrastructure and platform engineering trends without chasing every new tool.

As a people manager:

You can manage engineers with different skill sets from your own.
You communicate expectations clearly, and you solicit and deliver feedback frequently.
You run effective 1:1s, planning sessions, and retrospectives.
You can develop processes and remove hurdles to facilitate great execution.
You have a high tolerance for ambiguity, especially around organizational boundaries.
You value empathetic and direct communication, particularly when giving and receiving feedback.

Minimum Requirements

Advanced-level English skills, especially speaking and writing.
At least 7 years of experience as a software engineer, with at least 3 years in infrastructure, platform, or SRE roles.
At least 2 years of experience managing a team of 3-10 engineers.
Deep hands-on experience with cloud infrastructure (GCP or AWS), Kubernetes, and container orchestration.
Strong background in at least two of Java, Python, and Go, with willingness to work across all three.
Production experience with relational databases at scale (PostgreSQL), including replication, failover, and performance tuning.
Experience with message brokers (RabbitMQ, Kafka, or similar) and event-driven architectures.
Experience with observability and monitoring tools (Datadog, Grafana, Prometheus, or similar).
Track record of leading incident response for production systems and driving reliability improvements.
Experience with CI/CD pipeline design and deployment automation.
Demonstrated ability to make architectural decisions and communicate them clearly through ADRs or similar documentation.

Plus Points

Experience with Elasticsearch at scale (cluster management, index optimization, migration strategies).
Experience building identity/authentication services.
Experience with infrastructure-as-code (Terraform, Pulumi).
Experience with API-first design and OpenAPI code generation workflows.
Experience managing platform/infrastructure teams at startups during periods of rapid growth.
Experience rolling out engineering practices and processes where they didn't exist before.
Experience managing partially or entirely remote teams across multiple time zones.
Familiarity with cost optimization strategies for cloud infrastructure.
Experience with database security controls.
