Pride At Work Canada
Senior Site Reliability Engineer

City: TORONTO, Ontario, Canada

Category: Technology | Analytics | Research

Industry: Financial/Banking

Employer: RBC

Come Work with Us!

At RBC, our culture is deeply supportive and rich in opportunity and reward. You will help our clients thrive and our communities prosper, empowered by a spirit of shared purpose.

Whether you’re helping clients find new opportunities, developing new technology, or providing expert advice to internal partners, you will be doing work that matters in the world, in an environment built on teamwork, service, responsibility, diversity, and integrity.

Job Title

Senior Site Reliability Engineer

Job Description

What is the opportunity?

The Enterprise DevOps SRE team is undertaking multiple complex enterprise-wide initiatives as part of RBC’s ongoing plan to improve and standardize application releases across Cloud, Distributed, Mainframe, and other platforms. This role will be responsible for the development, implementation, administration, and support of Site Reliability Engineering (SRE) solutions for the Enterprise DevOps tools and CI/CD pipeline supported by the Enterprise DevOps SRE Team.

What will you do?

- Champion Stability and Reliability across DevOps applications and services

- Develop SRE solutions (monitoring, alerting, self-healing and reliability testing)

- Build automated solutions to remove toil

- Explore & evaluate new technologies and drive innovation by designing/implementing new practices/processes.

- Implement and drive proactive monitoring solutions for internally hosted applications

- Own and develop reports for SRE Metrics (including incident metrics) - gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding.

- Identify and establish SLOs and error budgets for DevOps applications

- Assist in incident management and problem management for applications in scope

- Evaluate & iterate continuously – what went well, what went wrong, what can be done to improve and prevent in future

- Spearhead blameless post-mortems for high-impact incidents

- Collaborate on and contribute to cross-functional enterprise initiatives, and manage the effective implementation of assigned deliverables

- Work with necessary stakeholders to mature processes and ensure SRE and ITSM processes are effective and understood

- Identify potential issues, conflicts, and risks. Analyze, mitigate, and escalate to management where appropriate

- Provide guidance to other team members on managing end-to-end availability and performance of mission-critical services, on building automation to prevent problem recurrence, and on building automated responses for non-exceptional service conditions

What do you need to succeed?

Must-have

- Strong development background in at least a couple of programming/scripting languages (preferably Python, JavaScript, Java, Shell scripting, PowerShell)

- Strong problem solving and analytical skills to triage issues

- Experience in Configuration Management (config as code) using Ansible or Terraform

- Well-rounded understanding of the Linux operating system, including the command line, firewalls, certificates, PGP encryption, and various file transfer protocols (e.g., SFTP, AS2, Connect:Direct)

- Thorough understanding of SRE principles

- Extensive experience working with APIs (REST and/or SOAP endpoints)

- Ability to quickly pick up new tools, programming languages, libraries, frameworks, and other technical concepts as needed.

- Hands-on experience with a variety of industry-standard SRE tools (Ansible, Dynatrace, Moogsoft, PagerDuty, ServiceNow, Slack, Elastic Stack, CatchPoint)

- Deadline-driven and results-oriented; able to meet consistently high-quality standards while handling a variety of tasks and deadlines simultaneously.

- Excellent written and verbal communication skills, with the ability to work with key partners across the organization: Business, Operations, Application Development, Maintenance, and Infrastructure Teams

Nice-to-have

- Computer Engineering, Computer Science, related (technical) degree/diploma, or related breadth of experience

- Exposure to Docker, Kubernetes, OpenShift, GitHub, JFrog Artifactory, JFrog Xray, NexusRepo & IQ, DevSecOps, IBM UrbanCode Deploy, Jenkins, MongoDB, Jira, Confluence, Jira Service Desk, Databases, PagerDuty

- Experience in Vendor Management, application development, database, system engineering and/or systems analysis

- Understanding of banking/financial services industry.

- Experience working as a member of an Agile development team.

What’s in it for you?

We thrive on the challenge to be our best, progressive thinking to keep growing, and working together to deliver trusted advice to help our clients thrive and communities prosper. We care about each other, reaching our potential, making a difference to our communities, and achieving success that is mutual.

  • A comprehensive Total Rewards Program including bonuses and flexible benefits, competitive compensation, commissions, and stock where applicable
  • Leaders who support your development through coaching and managing opportunities
  • Ability to make a difference and lasting impact
  • Work in a dynamic, collaborative, progressive, and high-performing team
  • A world-class training program in financial services
  • Flexible work/life balance options
  • Opportunities to do challenging work

Address:

TORONTO, Ontario, Canada

City:

CAN-ON-TORONTO

Country:

Canada

Work hours/week:

37.5

Employment Type:

Full time

Platform:

Technology and Operations

Job Type:

Regular

Pay Type:

Salaried

Posted Date:

2023-02-16

Application Deadline:

2023-03-31

Inclusion and Equal Opportunity Employment

At RBC, we embrace diversity and inclusion for innovation and growth. We are committed to building inclusive teams and an equitable workplace for our employees to bring their true selves to work. We are taking actions to tackle issues of inequity and systemic bias to support our diverse talent, clients and communities.

We also strive to provide an accessible candidate experience for our prospective employees with different abilities. Please let us know if you need any accommodations during the recruitment process.

Join our Talent Community

Stay in the know about great career opportunities at RBC. Sign up and get customized info on our latest jobs, career tips, and recruitment events that matter to you.

Expand your limits and create a new future together at RBC. Find out how we use our passion and drive to enhance the well-being of our clients and communities at rbc.com/careers.

© Pride at Work Canada 2022