
Backend Engineer

PolarGrid

Software Engineering
Ottawa, ON, Canada · Remote
Posted on Oct 18, 2025

Role Overview

We're seeking a Backend Engineer to build and scale our edge inference infrastructure. You'll architect distributed compute systems that run GPU-accelerated AI workloads across edge nodes while meeting sub-10ms latency requirements.

Core Responsibilities

Infrastructure Engineering

  • Design and implement Kubernetes-native distributed compute platforms
  • Build GPU resource management and allocation systems (sketched after this list)
  • Develop edge deployment pipelines with automated testing
  • Create high-performance inference serving infrastructure
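
To make the GPU-allocation item concrete, here is a minimal sketch using the official Kubernetes Python client. The image name, labels, namespace, and node selector are illustrative assumptions, not PolarGrid specifics.

```python
# Sketch: requesting a GPU for an inference pod via the Kubernetes Python client.
from kubernetes import client, config

def build_inference_pod(model: str) -> client.V1Pod:
    container = client.V1Container(
        name="inference",
        image="registry.example.com/inference:latest",  # hypothetical image
        resources=client.V1ResourceRequirements(
            # The NVIDIA device plugin exposes GPUs as the extended
            # resource nvidia.com/gpu, which is requested via limits.
            limits={"nvidia.com/gpu": "1", "memory": "8Gi"},
            requests={"cpu": "2"},
        ),
    )
    return client.V1Pod(
        metadata=client.V1ObjectMeta(
            name=f"infer-{model}",
            labels={"app": "edge-inference", "model": model},
        ),
        spec=client.V1PodSpec(
            containers=[container],
            # Pin to GPU-equipped edge nodes; the label is an assumption.
            node_selector={"hardware/gpu": "true"},
            restart_policy="Never",
        ),
    )

if __name__ == "__main__":
    config.load_kube_config()  # or load_incluster_config() inside the cluster
    pod = build_inference_pod("resnet50")
    client.CoreV1Api().create_namespaced_pod(namespace="inference", body=pod)
```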

Backend Systems

  • Architect microservices for distributed model serving
  • Implement API gateways with OpenAI- and Hugging Face-compatible endpoints (see the sketch after this list)
  • Build dynamic resource allocation and load balancing
  • Design multi-backend serving that enforces mutual exclusivity between backends (e.g., Python vs. TensorRT-LLM)
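
The gateway item above calls for OpenAI-compatible endpoints. Here is a minimal sketch assuming FastAPI, with a stubbed generate() standing in for the real model backend; mirroring the OpenAI chat-completions response shape lets existing client SDKs point at the gateway unchanged.

```python
# Sketch: a minimal OpenAI-compatible chat completions endpoint.
import time
import uuid
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()

class Message(BaseModel):
    role: str
    content: str

class ChatRequest(BaseModel):
    model: str
    messages: list[Message]
    max_tokens: int = 256

def generate(prompt: str, max_tokens: int) -> str:
    # Placeholder for the real inference backend (e.g. Triton / TensorRT-LLM).
    return f"echo: {prompt[:max_tokens]}"

@app.post("/v1/chat/completions")
def chat_completions(req: ChatRequest):
    # Mirror the OpenAI response shape so off-the-shelf clients work.
    text = generate(req.messages[-1].content, req.max_tokens)
    return {
        "id": f"chatcmpl-{uuid.uuid4().hex}",
        "object": "chat.completion",
        "created": int(time.time()),
        "model": req.model,
        "choices": [{
            "index": 0,
            "message": {"role": "assistant", "content": text},
            "finish_reason": "stop",
        }],
    }
```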

Performance & Optimization

  • Optimize GPU memory utilization and inference latency
  • Implement streaming inference with TensorRT acceleration
  • Build comprehensive monitoring and observability systems
  • Design automatic scaling based on workload patterns (a toy policy follows this list)
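
One way the autoscaling item could look, reduced to a toy policy driven by observed p95 latency and queue depth. The thresholds and the metrics source are assumptions, not PolarGrid's values.

```python
# Sketch: a workload-driven scaling decision for inference replicas.
from dataclasses import dataclass

@dataclass
class WorkloadSample:
    p95_latency_ms: float
    queue_depth: int

def desired_replicas(current: int, sample: WorkloadSample,
                     latency_slo_ms: float = 10.0,
                     max_replicas: int = 16) -> int:
    # Scale out when the latency SLO is breached or requests are queueing;
    # scale in conservatively only when there is clear headroom.
    if sample.p95_latency_ms > latency_slo_ms or sample.queue_depth > 100:
        return min(current * 2, max_replicas)
    if sample.p95_latency_ms < latency_slo_ms * 0.5 and sample.queue_depth == 0:
        return max(current - 1, 1)
    return current
```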

Required Technical Skills

Core Infrastructure

Kubernetes: Production experience with cluster management, resource allocation, networking
Containerization: Docker, container security, multi-stage builds, optimization
Distributed Systems: Service mesh, load balancing, distributed consensus, fault tolerance
Cloud: GitOps, infrastructure as code, AWS CDK (see the sketch below)
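
A minimal infrastructure-as-code flavor of the cloud item, assuming AWS CDK v2 in Python; the ECR repository and the stack name are illustrative, not PolarGrid's actual stack.

```python
# Sketch: a tiny AWS CDK v2 app defining a container registry as code.
from aws_cdk import App, Stack
from aws_cdk import aws_ecr as ecr
from constructs import Construct

class EdgeImagesStack(Stack):
    def __init__(self, scope: Construct, construct_id: str, **kwargs) -> None:
        super().__init__(scope, construct_id, **kwargs)
        # Registry for inference container images, scanned on push.
        ecr.Repository(self, "InferenceImages", image_scan_on_push=True)

app = App()
EdgeImagesStack(app, "polargrid-edge-images")  # stack name is hypothetical
app.synth()
```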

Backend Development

Languages: TypeScript, Go, Python, or Rust
APIs: RESTful services, gRPC, WebSocket streaming, rate limiting (token-bucket sketch after this list)
Databases: Distributed databases, caching systems, data consistency
Message Queues: Kafka, Redis, SQS, distributed event systems
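
As a flavor of the rate-limiting item above, here is an in-process token-bucket limiter sketch of the kind a gateway applies per API key. A production deployment would typically back the bucket state with Redis so limits hold across gateway replicas.

```python
# Sketch: a token-bucket rate limiter.
import time

class TokenBucket:
    def __init__(self, rate: float, capacity: float):
        self.rate = rate          # tokens added per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.updated = time.monotonic()

    def allow(self, cost: float = 1.0) -> bool:
        now = time.monotonic()
        # Refill proportionally to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.updated) * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False

bucket = TokenBucket(rate=50.0, capacity=100.0)  # 50 req/s, bursts to 100
assert bucket.allow()
```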

AI Inference Infrastructure

GPU Computing: NVIDIA CUDA, TensorRT, GPU memory management
AI/ML Serving: Triton Inference Server (example call below), model optimization, batch processing
Performance: Latency optimization, throughput tuning, resource profiling
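
To illustrate the Triton item, a minimal request using the official tritonclient HTTP API; the model name, tensor names, shapes, and dtype are illustrative and depend on the deployed model's configuration.

```python
# Sketch: a single inference request against a Triton Inference Server.
import numpy as np
import tritonclient.http as httpclient

client = httpclient.InferenceServerClient(url="localhost:8000")

batch = np.random.rand(1, 3, 224, 224).astype(np.float32)
inputs = [httpclient.InferInput("input__0", batch.shape, "FP32")]
inputs[0].set_data_from_numpy(batch)
outputs = [httpclient.InferRequestedOutput("output__0")]

result = client.infer(model_name="resnet50", inputs=inputs, outputs=outputs)
print(result.as_numpy("output__0").shape)
```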

Preferred Experience

Infrastructure Platforms

  • Edge computing deployments
  • Multi-region distributed systems
  • Hardware acceleration (GPUs)
  • Container security (Kata, gVisor)

Monitoring & Operations

  • Prometheus, Grafana, distributed tracing (instrumentation sketch after this list)
  • SRE practices, incident response
  • Capacity planning, cost optimization
  • Automated testing and deployment
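
A sketch of what the Prometheus item can look like in practice, using prometheus_client; the metric names are assumptions, and the histogram buckets are chosen to resolve sub-10ms latencies.

```python
# Sketch: exposing inference latency and request counts to Prometheus.
import random
import time
from prometheus_client import Counter, Histogram, start_http_server

REQUESTS = Counter("inference_requests_total",
                   "Inference requests", ["model", "status"])
LATENCY = Histogram("inference_latency_seconds",
                    "End-to-end inference latency", ["model"],
                    buckets=(0.001, 0.005, 0.01, 0.025, 0.05, 0.1))

def handle(model: str) -> None:
    with LATENCY.labels(model).time():
        time.sleep(random.uniform(0.001, 0.02))  # stand-in for inference
    REQUESTS.labels(model, "ok").inc()

if __name__ == "__main__":
    start_http_server(9100)  # Prometheus scrapes /metrics on this port
    while True:
        handle("resnet50")
```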

What You'll Build

Edge Inference Platform

  • Multi-tenant GPU inference clusters serving 10,000+ concurrent requests
  • Sub-10ms latency targets maintained across geographically distributed nodes
  • Automatic model loading and resource optimization
  • Comprehensive health monitoring and alerting

Backend Architecture

  • Microservices handling model lifecycle management
  • API gateway with authentication and rate limiting
  • Dynamic backend switching (Python/TensorRT-LLM)
  • Streaming inference with WebSocket support (see the sketch after this list)
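
A minimal WebSocket streaming sketch assuming FastAPI; generate_tokens() is a stand-in for the real streaming backend, and the route path is illustrative.

```python
# Sketch: streaming inference tokens to a client over WebSocket.
import asyncio
from fastapi import FastAPI, WebSocket

app = FastAPI()

async def generate_tokens(prompt: str):
    # Placeholder: yields "tokens" as the backend produces them.
    for word in f"streamed reply to: {prompt}".split():
        await asyncio.sleep(0.01)
        yield word

@app.websocket("/v1/stream")
async def stream(ws: WebSocket):
    await ws.accept()
    prompt = await ws.receive_text()
    async for token in generate_tokens(prompt):
        await ws.send_text(token)  # client renders tokens incrementally
    await ws.close()
```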

DevOps Infrastructure

  • Kubernetes operators for inference workload management (skeleton after this list)
  • Automated testing covering performance and reliability
  • GitOps deployment with rollback capabilities
  • Cloud and edge resource monitoring and cost optimization
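
Finally, a skeleton of the operator item using the kopf framework; the CRD group, kind, and spec fields are hypothetical. Run it with `kopf run operator.py` against a cluster where the CRD is installed.

```python
# Sketch: a Kubernetes operator skeleton for inference workloads (kopf).
import kopf

@kopf.on.create("polargrid.example.com", "v1", "inferenceworkloads")
def on_create(spec, name, namespace, logger, **kwargs):
    # Reconcile: a real operator would create the Deployment, Service,
    # and autoscaling policy for the requested model here.
    model = spec.get("model", "unknown")
    logger.info(f"provisioning inference workload {name!r} for model {model}")
    return {"phase": "Provisioning"}  # stored under the object's status

@kopf.on.delete("polargrid.example.com", "v1", "inferenceworkloads")
def on_delete(name, logger, **kwargs):
    logger.info(f"tearing down inference workload {name!r}")
```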