Site Reliability Engineering (SRE) Developer
City : Montreal, Quebec
Category : Permanent Full-Time
Industry : Information technology
Employer : National Bank
A career in technology at National Bank means taking part in a transformation that has a direct impact on clients. As a Systems Reliability Specialist, you will support all IT teams in putting in place the mechanisms needed to improve and maintain the highest standards of resilience and availability for IT services.
Your job
- Promote resilience and stability best practices to application and infrastructure teams.
- Understand the main flows of our critical environments to identify single points of failure.
- Support IT teams in improving their supporting documentation and architecture diagrams so that they include resilience and stability information.
- Develop applications and automation as needed.
- Promote and increase automation of IT tasks to reduce human error.
- Perform end-to-end stability analyses and recommend improvements to system performance and resilience.
- Promote good monitoring practices and support IT teams in the implementation of key resilience and stability indicators.
- Support IT teams following major events impacting the resilience of their systems.
Our training programs use on-the-job learning to help you master your role. You can access personalized training content on such topics as banking solutions and the advisory approach to support your ongoing learning. You’ll also have access to colleagues with a wide range of expertise, experience and backgrounds to enrich all aspects of your development.
Basic requirements
- At least 10 years of experience developing online services in a complex environment that combines new and legacy technologies.
- Expertise in the software design of complex systems supporting thousands of concurrent customers.
- Excellent understanding of DevSecOps principles, monitoring, and observability.
- Knowledge of various database technologies: MongoDB, Redis, and SQL databases.
- Experience with AWS cloud technology (service development, deployment, automation, and operations).
- Experience in infrastructure capacity analysis (CPU, memory, latency, I/O, bandwidth, etc.).
- 24/7 operational experience.
- Experience in load testing and analysis.
- Strong ability to solve complex multi-system problems.
Your benefits
- Health and wellness program, including many options
- Flexible group insurance
- Generous pension plan
- Employee Share Ownership Plan
- Employee and Family Assistance Program
- Preferential banking services
- Opportunities to get involved in community initiatives
- Telemedicine service
- Virtual sleep clinic