System Design Architect
Xgrid.co
Job Description
We are seeking a System Design Architect to lead the architecture, design, and evolution of modern, scalable, and resilient workflow platforms for our clients.
As a trusted technology partner for several enterprise clients, we help organizations design distributed systems that are fault-tolerant, event-driven, and cloud-native. You'll play a key role in driving these architectural engagements: defining patterns, mentoring teams, and ensuring solutions are robust and maintainable at scale.
While direct experience with specific workflow orchestration technologies is a strong advantage, we value system design expertise, distributed systems fundamentals, and rapid learning ability above all.
Key Responsibilities
- Lead end-to-end system design and architecture for workflow and automation platforms across client engagements.
- Design and implement distributed, event-driven systems with high scalability, availability, and fault tolerance.
- Define architecture blueprints, reference implementations, and best practices for workflow orchestration and stateful service design.
- Collaborate with client engineering teams to evaluate, onboard, and scale workflow solutions.
- Guide decisions around data consistency, reliability patterns (sagas, retries, compensation), and observability.
- Conduct architecture reviews and provide technical governance across multiple concurrent projects.
- Partner with internal solution teams to establish accelerators, templates, and frameworks for rapid client adoption.
- Mentor engineers and provide architectural leadership within the organization.
Qualifications
- 8+ years of experience in software architecture, backend design, or distributed systems.
- Proven experience designing microservice-based or event-driven architectures.
- Deep understanding of scalability, reliability, consistency models, and system resiliency.
- Proficiency in one or more languages such as Go, Java, Python, or TypeScript.
- Strong understanding of messaging, streaming, and asynchronous communication (e.g., Kafka, RabbitMQ, Pub/Sub).
- Experience with cloud-native infrastructure (Kubernetes, Docker, CI/CD, observability).
- Solid background in API design, workflow modeling, or automation systems.
Nice to Have (Highly Advantageous)
- Experience with workflow orchestration platforms or stateful orchestration frameworks (e.g., Temporal, Cadence, Airflow, Step Functions).
- Understanding of orchestration vs. choreography, activity/task design, and failure handling patterns.
- Experience running or optimizing workflow or event-processing clusters in production environments.
- Familiarity with Kubernetes operators, service meshes, and observability stacks (Grafana, Prometheus, OpenTelemetry).