Intermediate System Reliability Engineer
City : Toronto, ON, CA, M5H1H1
Category : Computer Specialists
Industry : Finance
Employer : Scotiabank
Requisition ID: 149096
Join a purpose-driven winning team, committed to results, in an inclusive and high-performing culture.
The Global Systems Reliability team interfaces with senior management, infrastructure, IT Operations, and business partners to continuously improve the stability, reliability and efficiency of our global systems. We do this through Site Reliability Engineering (SRE) principles and practices, including continuous people, process and technology enhancements (automating all the things), in support of our rapidly changing technology product portfolio.
You will work cross-functionally with a variety of teams and contribute to every significant engineering service or solution delivered to the Global Systems Reliability Office and its stakeholders. You will also understand ‘what could go wrong’, help to solve complex problems and have a flair for communicating and participating in discussions with technical and business partners. You will work directly with our Software Engineering teams to both maintain and operate our existing technology and build our next generation of technologies.
Is this role right for you?
- Work in collaboration with the Global System Reliability Engineering team as well as with software development, Quality, Product and Data Engineering teams to champion SRE/DevOps culture and practices
- Lead and collaborate with a team of Reliability Engineers (directly and through local and global communities of practice)
- Work closely with global and regional architecture boards to champion the definition and implementation of resilience policies for new and existing solutions
- Lead management of Service Level Objectives with senior development and business leads
- Lead initiatives to continuously refine our build, plan and deploy practices for improved stability, reliability, efficiency, repeatability and security. You’ll create plans and collaborate with other SROs and DevOps team members, coordinating activity with development and business leads to increase service levels, lower costs, and support delivery velocity objectives
- Work closely with development and operations teams to lead troubleshooting of our most severe incidents: leading senior stakeholder communication, driving problem-solving (e.g., log analysis, non-invasive tests) and debugging with best-practice techniques
- Lead continuous improvement and execution of high-quality, timely root cause analysis and blameless post-mortem activities for major incidents, to ensure we take action to avoid similar problems in the future
- Lead prioritization of reliability features and contribute to the design, development and delivery of effective tooling, alerts, and automated responses to identify and address reliability risks
Do you have the skills that will enable you to succeed in this role?
- 5+ years' experience in IT
- Strong understanding of SRE
- Degree in Computer Science, Engineering, or equivalent
- Excellent communication (both verbal and written). The ability to communicate confidently and clearly on conference calls, in meetings, via email, etc. at all levels of the organization is essential
- Performance- and results-oriented leadership skills, with a developmental bias (coaching)
- Experience working with large-scale distributed systems
- Experience using Jenkins, Bamboo or other CI tools
- Experience with GCP/Azure/AWS
- Experience working in an Agile environment
- Deep understanding of containerization and orchestration
- Experience with monitoring/observability tooling such as Dynatrace, Datadog, Splunk, Elastic Stack, Prometheus, Jaeger, OpenTelemetry, etc.
- Experience in at least one high-level programming language such as Python or Go
- Knowledge of R is a plus
- Configuration management using Ansible, Terraform, Puppet, Chef, or similar
What's in it for you?
- We have an inclusive and collaborative working environment that encourages creativity and curiosity, and celebrates success!
- We provide you with the tools and technology needed to create beautiful customer experiences
- You'll get to work with and learn from diverse industry leaders, who have hailed from top technology companies around the world
- Dress codes don't apply here; being comfortable does
- We offer a competitive total rewards package that includes a base salary, a performance bonus, company matching programs (on pension & profit sharing), generous vacation, personal & sick days, personal development funding, maternity leave top-up, parental leave and much more.
Location(s): Canada : Ontario : Toronto
Scotiabank is a leading bank in the Americas. Guided by our purpose: "for every future", we help our customers, their families and their communities achieve success through a broad range of advice, products and services, including personal and commercial banking, wealth management and private banking, corporate and investment banking, and capital markets.
At Scotiabank, we value the unique skills and experiences each individual brings to the Bank, and are committed to creating and maintaining an inclusive and accessible environment for everyone. If you require accommodation (including, but not limited to, an accessible interview site, alternate format documents, ASL Interpreter, or Assistive Technology) during the recruitment and selection process, please let our Recruitment team know. Candidates must apply directly online to be considered for this role. We thank all applicants for their interest in a career at Scotiabank; however, only those candidates who are selected for an interview will be contacted.