Bank of America is looking for a Site Reliability Engineer in Richmond
The individual in this role is accountable for establishing and maintaining partnerships with Application Development and Production Support teams to implement the measures prescribed through the collaboration of the Senior Site Reliability Engineer (SRE) and the SRE team(s) they are leading. Responsibilities include ensuring that the appropriate instrumentation, tooling, ticketing, alerting and on-call routines are in place for key services. This role demonstrates a high level of technical expertise within one or more technical domains and the ability to decompose issues or objectives into units of work that can be assigned to other team members. This individual will advocate and advance more efficient solution delivery practices and evangelize great design, engineering and organizational practices.
• Collaborate with Development and Infrastructure teams to understand technical solutions and to implement the monitoring capabilities outlined in the application and system monitoring designs put forward by the Senior SRE
• Mentor SRE resources on reliability practices and established tools/capabilities.
• Develop and maintain a catalog of extensible reliability scripts, tools, and libraries that can be leveraged for common instrumentation, automation and operational needs.
• Partner with infrastructure engineers and application teams to implement the necessary code changes to make use of common reliability libraries and tools and help the APS and Application Development teammates understand how to use them.
• Engage as a subject matter expert (SME) in major incident triage efforts and failure scenario modelling, and work with the Problem Manager to diagnose root causes in major incident and problem management investigations.
• Identify vulnerabilities and opportunities for reliability improvement, such as investigating low-level error rates and ‘noise’ in monitoring, and help define solutions to reduce manual support effort and/or improve system reliability.
• Participate regularly in architecture community of practice meetings and communicate via other channels.
• Performance Testing
• Tertiary Skill
• EXPERT – Learns & Adapts – Demonstrates the ability to identify more complex problems that cross several infrastructure domains and gather insights from others and data. Demonstrates the ability to assess issues identified by others and update the mental model of software and infrastructure services.
• EXPERT – Analytical Thinking – Develops frameworks for solving a range of complex problems. Can foresee future problems and define resolutions/new opportunities.
• ADVANCED – Production Support – Has knowledge of and expertise in supporting production environments and the associated maintenance, change control, incident and problem management.
• ADVANCED – Solution Design – Has demonstrated ability to design and develop significant components within an application; has performed code reviews.
• ADVANCED – Application, Data, & Infrastructure Architecture – Designs architecture and drives optimal solutions. Ensures design reviews and compliance with bank information security and infrastructure standards.
• ADVANCED – Innovation – Shares new ideas and consistently demonstrates openness to the opinions and views of others. Identifies new and different patterns, trends, and opportunities. Generates novel solutions. Seeks to involve other stakeholders in developing solutions to problems.
• ADVANCED – Promotes Collaboration – Facilitates collaboration within and across teams. Go-to person for consultation from others.
• ADVANCED – Influences Decisions – Uses a combination of factual and emotional points to exert influence. Receives agreement to ideas by identifying and working collaboratively with specific stakeholders and senior managers.
• ADVANCED – Achieves Sustainable Results – Sets KPIs for teams to follow. Makes decisions about technology that will increase or deliver business value.
• PROFICIENT – Business Products & Strategy – Working knowledge of products, services, business flows and understanding of financial context in which technology and operating principles are used within their business area.
• PROFICIENT – DevOps Practices & Automation – Ability to perform or support continuous integration and deployment activities within their role. Understanding and appreciation of DevOps culture.
• PROFICIENT – Portfolio, Program, & Project Management – Manages at least one project. Has knowledge and hands-on experience with project scope management, project time management, project cost management and proficient experience with project management tools / technology. Is adept at addressing project change (e.g., scope, schedule, resources).
• PROFICIENT – Solution Delivery Process – Understands all roles and responsibilities within their team, and uses the tool sets within the software / infrastructure lifecycle.
• PROFICIENT – Financials & Resource Management – Ability to provide input for financial planning. Understands the variances.
• AWARENESS – Governance & Stakeholder Management – Understands the importance of governance processes, standards and tools used within the organization to build and support solutions.
• AWARENESS – Risk Management – Understands the basic elements of risk and control within the organization.
• AWARENESS – User Experience Design – Has a basic understanding of experience design principles and is familiar with the bank brand and accessibility guidelines.