
Site Reliability Engineer
- On-site
MONTREAL, QC CANADA
FULL-TIME
CURRENT VACANCY
Job description
WE'RE HIRING!
At HTG, you’ll push boundaries with the latest tech and collaborate with a team that loves what they do. Be part of a design services company that ranks among the world's leaders in technology and innovation.
Your next chapter starts here.
In this role, you will:
Perform advanced troubleshooting and service recovery for residential and small-business networks, energy systems, and IoT technologies.
Support and troubleshoot Azure workloads, cloud integrations, and general IT issues across Windows, Linux, and hybrid environments.
Perform detailed incident investigations, document findings, and implement corrective actions. Strengthen the team’s ability to resolve complex technical issues.
Communicate clearly with customers and internal teams, providing timely updates, guidance, and expectations throughout the incident lifecycle.
Assist L1 analysts through training, coaching, and escalation support to increase first-contact resolution rates.
Stay current with developments in EV charging, solar energy, and related emerging technologies to enhance overall technical support.
Apply security-first practices in daily operations, including secrets management, patching cycles, baseline image maintenance, identity hygiene, and RBAC reviews.
Work with security and compliance teams on vulnerability management, audit documentation, and evaluation of control effectiveness across applicable frameworks.
Develop and maintain operational documentation such as SOPs, runbooks, knowledge base articles, escalation procedures, and service catalogs.
Ensure documentation accuracy, version control, and comprehensive coverage; treat documentation as a critical operational output.
Utilize Jira for incident, problem, and change workflows, including SLAs, dashboards, and reporting.
Collaborate with Engineering and DevOps teams to define operational requirements, enhance service design, and prioritize reliability improvements.
Provide ongoing mentorship and training opportunities to L1 analysts to support skill growth and improve initial resolution outcomes.
Communicate effectively with both internal and external stakeholders regarding incidents, maintenance activities, service enhancements, and post-incident analyses.
Job requirements
At least 3 years of experience in network operations, site reliability, or cloud platform support roles managing production systems
Strong understanding of networking, VPNs, firewalls, load balancers, DNS, and certificate management
Hands-on experience with cloud services including compute, storage, networking, and identity management
Practical experience with both Linux and Windows systems administration
Proficiency in one or more scripting languages such as Python, PowerShell, or Bash, and ability to create dependable automation workflows
Familiarity with monitoring, alerting, and telemetry systems, including the design of meaningful service-level indicators
Working knowledge of service management platforms and workflow automation tools
Proven ability to write accurate operational documentation, including procedures and troubleshooting guides
Strong communication skills for both technical and customer-facing interactions
Preferred Qualifications:
Experience with Infrastructure-as-Code tools (e.g., Terraform, Bicep) and CI/CD systems
Knowledge of IoT or distributed device management at scale
Understanding of system reliability concepts such as graceful degradation and autoscaling
Exposure to industrial or energy systems involving telemetry, control, or gateway operations
Relevant certifications such as Azure Administrator, Azure Network Engineer, ITIL, or CCNA (or equivalents)
High Tech Genesis Inc. is an Equal Opportunity Employer. Diversity and inclusion are at the core of our values.
Please advise High Tech Genesis of any accommodation measures you may require.
Please be advised:
Applicants must have the legal right to work in Canada.
Please submit your resume in MS Word format when applying for this position.
