Full-time Site Reliability Engineer openings in New York, United States on September 04, 2022

Site Reliability Engineering (SRE) at People Can Fly

Location: New York

Company Description

People Can Fly is one of the leading independent AAA games development studios with an international team of hundreds of talented individuals working from offices located in Poland, UK, US, and Canada, and from all over the world thanks to our remote work programs.

Founded in 2002, we made our mark on the shooter genre with titles such as Painkiller, Bulletstorm, Gears of War: Judgment, and Outriders. We are one of the most experienced Unreal Engine studios in the industry and we are expanding it with in-house solutions called PCF Framework.

Our creative teams are currently working on several exciting AAA projects with the top publishers in the industry: Project Gemini with Square Enix and Project Dagger with Take-Two (2K), in addition to a new IP to be self-published and two other games in a concept phase. One of our IPs is also being adapted for VR technology.

With over 20 years of experience, PCF sets out to explore new horizons. We aim to combine our expertise with the creativity of the best and most forward-thinking talents in the industry to work together on the new generation of action games for the global gaming community.

If you decide to accompany us on this journey, you’ll have a chance to perfect your craft and expand your knowledge, working alongside leaders in the industry on bringing a brand-new unique experience to the players worldwide.

If you feel able to deliver like nobody else, take ownership of your projects, and are ready to leave a mark on the games you work on, apply now!

Job Description
• Build and deploy the cloud-native infrastructure of the online services platform.
• Build the tools, and foster the culture, for reliability across all our services.
• Plan for, and exercise recovery from, disasters.
• Build and deploy the platform to cloud service providers in an automated, reproducible way. Provision additional instances for development, testing, load testing, certification and (if needed) external publishers.
• Harden the platform; advise the programmers on maximizing the reliability, scalability and uptime of their services.
• Deploy the required tools to ensure maintenance, updates and recoveries of the services are quick, seamless, traceable, reproducible, and simple to revert if needed.
• Establish disaster recovery protocols. Put them to the test.
• Write and deploy monitoring dashboards and alerting systems to ascertain the state of online services and their dependencies in real-time. Assist programmers in instrumenting their services so that they’re monitored effectively.
• Build dashboards to monitor the cost of our online systems in real-time. Advise programmers on minimizing operational costs.
• Communicate with 3rd party providers and/or publishers in case of outages on their end.
• Establish protocols for 24/7 on-call support of our live games.

Qualifications
• Typically: 2+ years of experience in a Site Reliability Engineering (SRE) or DevOps position.
• Videogame-specific experience is useful but not mandatory.
• Other relevant domains to look into: content distribution, ad-tech, news, mobile gaming, finance.
• FAANG (or adjacent) experience highly sought after.
• Strong knowledge of one or two of: Amazon Web Services, Microsoft Azure, Google Cloud Platform.
• Experience building, deploying and operating Kubernetes clusters in cloud-native environments (EKS on Amazon, AKS on Azure, GKE on Google).
• Knowledge of infrastructure-as-code tooling (e.g. HashiCorp Terraform) and integration into CI/CD pipelines (e.g. Atlantis).
• Experience deploying software on Kubernetes clusters using Docker, Helm and ArgoCD (GitOps-style operations).
• Experience with monitoring and tracing stacks: Prometheus, InfluxDB, Loki, Grafana, OpenTelemetry.
• Deep understanding of scalability, security and maintainability considerations.
• Being able to work efficiently under tight deadlines.
• Knowledge of any project management and bug tracking software.
• Strong verbal and written communication skills in English.
• Open-minded team player attitude.
• Strong work ethic and self-motivated.
• Passionate about playing and making video games.

Additional Information

What we offer:

• 100% of group health insurance premiums paid by PCF (Medical, Dental, Vision, Group Life, and Supplemental Life), with coverage starting on day 1 of employment.
• 401K with 100% match, up to 3% of employee salary, and vested immediately.
• Paid week off during Winter Holidays.
• 20 paid vacation days and 5 paid sick days.
• Free virtual health and mental wellbeing sessions included in the plan for members and their dependents.
• A competitive salary and performance-based annual bonuses.
• Personal development opportunities and ability to work in a global environment.
• Work in a creative team with people full of passion for what they do.
• Long term disability, short term disability, travel insurance, as well as other benefits provided.

• Benefit package 100% paid by PCF. Insurance company reimburses 100% of claims (Up to $500 per service a year, as well as individual family coverage).
• Full Dental coverage, including major dental and orthodontics.
• 4% RRSP matching before tax deductions, 100% vested on day 1.
• Paid week off during Winter Holidays.
• 20 paid vacation days and 5 paid sick days.
• Free virtual health and mental wellbeing sessions included in the plan for members and their dependents.
• A competitive salary and performance-based annual bonuses.
• Personal development opportunities and ability to work in a global environment.
• Work in a creative team with people full of passion for what they do.


Site Reliability Engineer at Adobe Systems Incorporated

Location: New York

Our Company
• Changing the world through digital experiences is what Adobe’s all about.
• We give everyone, from emerging artists to global brands, everything they need to design and deliver exceptional digital experiences!
• We’re passionate about empowering people to create beautiful and powerful images, videos, and apps, and transform how companies interact with customers across every screen.
• We’re on a mission to hire the very best and are committed to creating exceptional employee experiences where everyone is respected and has access to equal opportunity.
• We realize that new ideas can come from everywhere in the organization, and we know the next big idea could be yours!
About Adobe
• At Adobe, we’re changing the world.
• We give people the tools to bring their ideas to life and build content that makes life more fun and work more significant.
• We give businesses and organizations the power to truly engage their customers.
• We’re the ones behind the elegantly designed content that streams across your laptop, TV, phone, and tablet every day, and the ones who harness the data to help companies move from data to insight and insight to action by delivering content that people care about most.
• We’re a company that understands that product innovation comes from people innovation, and that’s why we invest in cultivating leaders throughout the organization.
• If you’re passionate about leading from where you sit, join us.
The Challenge
• Are you comfortable running production environments AND writing code?
• Do you have an intimate understanding of the operational challenges of running services at scale?
• Do you tackle challenges through software engineering instead of sustained human toil?
• Are you keen to explore new technologies and looking to help build a path not-yet-traveled?
• Do you combine the above experiences and talent with strong communications skills, a customer-centric demeanor, and a will to “get stuff done” – all while learning and having fun?
About the Role
• We need an engineer who loves learning technology and uses it to deliver sustainable business solutions.
• Our focus is helping engineering teams know how their systems are performing, so they can find problems before customers do.
• We believe in automating whenever possible, and so we firmly believe in pull requests, infrastructure as code, and failing fast.
• We are also advocates for standard methodologies, and regularly train internal teams on how they can use our technologies.
• This team member builds and maintains solutions for getting insights on the infrastructure and services that support applications, with a focus on the logs, metrics and application traces that improve observability.
• The observability engineer will think about the problem end to end: automating data collection from common data sources and storing the data efficiently in performance management and monitoring tools.
• The successful candidate will have proven experience or capability in the following areas:
• Collaborate with operations & engineering teams, application developers, management and infrastructure teams to assess near- and long-term monitoring needs.
• Keep an eye on emerging observability tools, trends and methodologies, and continuously enhance our existing systems and processes.
• Assist with driving observability standards to improve the consumer experience of meaningful applications, services, and business processes with a strong focus on the end-to-end journey.
• Assist in scheduling and hosting regular tool training sessions to better enable tool adoption and standard methodologies.
• Work with multiple teams to design, deploy, and support large scale clustered software platforms in multiple datacenters and public clouds around the world.
• Deliver highly secure solutions within comprehensive compliance regulations.
• Experience with monitoring and observability solutions and methodologies, including server and network performance, hardware, web synthetics, and application performance monitoring, is a plus.
• Eager to learn, able to take guidance, and able to understand that there is always more to learn.


Senior Site Reliability Engineer, Tock at Squarespace

Location: New York

The Gig
• The Tock engineering team is looking for a site reliability engineer to help us support the next generation of restaurant bookings based on the proven system we’ve deployed everywhere from local dive bars to Alinea, The French Laundry, and Eleven Madison Park. Our engineering team was founded in 2015 by ex-Google engineers and combines FAANG team quality with the speed and personal impact of a startup.
• As a Site Reliability Engineer, you will work closely with the rest of our Engineering team to ensure our products and infrastructure are reliable, fast, efficient, and secure with an eye to reducing toil.
• This is an opportunity for you to work with world-class engineers and SREs on challenging problems while having a big impact on the hospitality industry.
• As a member of a small and growing team, you’ll help define the next stage of our company’s growth.
• Tock Engineering takes a fully hybrid approach to work.
• Qualified engineers can choose to work from our offices, from home, or a combination of the two.
• Work in complex production environments and seek out ways to simplify systems and reduce toil.
• Accelerate our adoption of Config as Code and Infrastructure as Code.
• Troubleshoot production incidents and debug across distributed systems and at multiple layers (including network, system, and application).
• Define and measure Service Level Indicators (SLIs) and Service Level Objectives (SLOs) to help teams make informed decisions about balancing reliability against engineering velocity.
• Participate in a culture of direct, compassionate feedback and engage in a results-oriented work style with transparent communication.
• Contribute to core team culture.
• Shape the evolution of a large and complex system.
Who We’re Looking For
• At least 5 years of experience in an SRE or DevOps role.
• Experience supporting HA products / infrastructure on public clouds using Kubernetes (AWS / GCP). We use GCP.
• Familiarity with using Config as Code and Infrastructure as Code to manage configuration and cloud infrastructure. We use Terraform, Ansible, and Helm.
• An understanding of all parts of a modern web-based application stack including frontend, backend, database, and networking.
• Fluency with one or more general purpose programming languages and/or scripting languages, including but not limited to: Java, Go, Bash, Python, Ruby, JavaScript, or TypeScript.
• Familiarity with security best practices.
• We are changing the way restaurants, wineries, and culinary event organizers run their business and how guests explore, discover, and book at these places all around the globe.
About Squarespace
• Squarespace is a leading all-in-one website building and ecommerce platform that enables millions to build a brand and transact with their customers in an impactful and beautiful online presence.
• Our suite of products enables anyone at any stage of their journey to manage their projects and businesses through websites, domains, ecommerce, marketing tools, and scheduling, along with tools for managing a social media presence with Unfold and hospitality business management via Tock. Squarespace democratizes access to best-in-class design, helping our customers in approximately 200 countries and territories maintain consistent branding across all digital touchpoints to stand out online.
• Our team of more than 1,400 is headquartered in downtown New York City, with offices in Dublin, Ireland; Portland, Oregon; Los Angeles, California; and Chicago, Illinois.
Our Commitment
• Not only do we embrace and celebrate the diversity of our customer base, but we also strive for the same in our employees.
• At Tock, we are committed to equal employment opportunity regardless of race, color, ethnicity, ancestry, religion, national origin, gender, sex, gender identity or expression, sexual orientation, age, citizenship, marital or parental status, disability, veteran status, or other class protected by applicable law.
• We are proud to be an equal opportunity workplace.


Site Reliability Engineer at Cerner Corporation

Location: New York

We are hiring a motivated Site Reliability Engineer to join our amazing team at Cerner Corporation in Missouri.
Growing your career as a Full Time Site Reliability Engineer is a great opportunity to develop relevant skills.
If you are strong in project management and teamwork and have the right mindset for the job, apply for the position of Site Reliability Engineer at Cerner Corporation today.

As a Site Reliability Engineer, you will focus on improving the reliability of mission-critical solutions and platforms for large-scale or critical systems through software development. In this role you will spend time identifying gaps in current technology and processes to recommend system and software improvements, and you will develop software to improve the availability, scalability, performance, stability, security and reliability of the solution or platform. In addition, you will refine and improve monitoring through software development, inject variable system failures, and develop mitigations to prevent impact. The team participates in operational review, incident recovery and post-incident review with appropriate stakeholders, and in the on-call rotation for the operations of the solution or platform. Everyone collaborates with development, operations and program management peers to share knowledge and promote best practices.

Working Environment: Hybrid or Remote

This position offers a Hybrid or Remote working environment. This means that if you live within a metro area of a Cerner office, you will split your working time between a Cerner office and remote work. If you are not within a metro area of a Cerner office, you can live and work in your current geographical location and work primarily remotely.

Engineering & Technology

Innovation occurs everywhere but maybe you are also looking for a purpose. Nothing is more impactful than improving the health of others. Develop cutting edge technologies that have real meaning.
Additional Information
Working Environment: Hybrid or Remote
Relocation Assistance Available for this Job: Yes – Domestic/Regional

Basic Qualifications
At least 6 years total combined higher education and related work experience, including:
At least 1 year of software engineering work experience
At least 5 years of higher education and/or additional work experience directly related to the duties of the job, including:
Bachelor’s degree in Computer Science, Computer Engineering, Software Engineering, Data Processing and/or a related field
Due to the client contract you will be assigned to, this position requires you to be a U.S. citizen

Preferred Qualifications
At least 1 year of Linux Operating System experience
At least 1 year of Advanced scripting or development work (Any of the following: Python, Ruby, Go)
At least 1 year of Infrastructure as Code work (Any of the following: Chef, Terraform, Ansible)
At least 1 year of Monitoring Software/Services (Any of the following: New Relic, Prometheus, Splunk, “Home Grown” services)
Nice to have – AWS (Amazon Web Services)

Expectations
Willing to work additional or irregular hours as needed and allowed by local regulations
Work in accordance with corporate and organizational security policies and procedures, understand personal role in safeguarding corporate and client assets, and take appropriate action to prevent and report any compromises of security within scope of position
Perform other responsibilities as assigned

Applicants for U.S. based positions with Cerner Corporation must be legally authorized to work in the United States. Verification of employment eligibility will be required at the time of hire. Visa sponsorship is not available for this position.

Some Cerner positions may be obligated to comply with client-facing requirements and occupational health requests, including but not limited to, an immunization set, covid-19 vaccination, an annual flu shot, an annual TB screen, an updated background check, and/or an updated drug screen.

Cerner is a place where people are encouraged to innovate with confidence and focus on what is important – people’s health and the care they receive. We are transforming health care by developing tools and technologies that make it more efficient for care providers and patients to navigate the complexity of our health. From single offices to entire countries, Cerner solutions are licensed at more than 25,000 facilities in over 35 countries.

Cerner’s policy is to provide equal opportunity to all people without regard to race, color, religion, national origin, ancestry, marital status, veteran status, age, disability, pregnancy, genetic information, citizenship status, sex, sexual orientation, gender identity or any other legally protected category. Cerner is proud to be a drug-free workplace.

Company Benefits:
● Company offers great benefits
● Company offers career progression opportunities
● Competitive salary

● Remote Work opportunity


site reliability engineering at Randstad US

Location: New York

site reliability engineering.
• new york , new york
• posted today

job details

• $53 – $63 per hour
• contract
• bachelor degree
• category computer and mathematical occupations
• reference 950730

job summary:

• Site Reliability Engineering (SRE) is an exciting and emerging area that applies engineering discipline to proactively solve operational problems. SRE focuses on mitigating incidents while optimizing large-scale, massively distributed, fault-tolerant systems.
• The SRE team has responsibility for improving overall availability, reliability, resiliency, monitoring, emergency response, latency, change management and capacity planning for services in the platform.
• The team provides applications and tools that are used by trading desks and their supporting communities to understand the state of trading platforms and the orders that flow through them.

Job Description
• As a Site Reliability Engineer, you will join an engineering team that focuses on engineering. Our culture is built on continuous learning and supported by transparency, trust, and cooperation. While there will be an operations aspect to the role, the bulk of your time will be spent in development.
• If you quickly become bored by performing manual tasks and you have a passion and the aptitude for writing software to replace this manual work, then this role will interest you.
• You will collaborate with Feature teams and Product Development partners to achieve a collective result.
• As part of the team, you will build applications, services, and tools that drive end-to-end automation, continuous monitoring, exception alerting, and metrics gathering for decision making in support of the Electronic Trading businesses.

• As an Engineer, you will work with Architecture, Application Development, Product Delivery, Enterprise, and Technology Infrastructure partners to deliver business critical features and site reliability non-functional requirements into the platform.
• With automation as a key motivator, you will implement predictive, preventative, and self-healing full stack monitoring; centralized logging to enable incident triage; execute on a High Availability (HA) and DR strategy that exceeds RTOs; deliver automated Load Testing, with known capacity thresholds, to provide business scalable systems; provide application performance planning, monitoring, and tuning that exceeds key business process SLOs.
• You will be deeply involved in delivering the solution for change management automation, extending the current CI and CD pipelines.
• You will provide applications that are necessary for trading desks and their supporting communities to understand the state of trading platforms and the flow of orders and executions through them.

Primary Skill: Core Java

Required Skills
• B.S. degree in Computer Science or equivalent practical experience
• 5+ years professional coding experience in Java
• Professional coding experience in Web UI and JavaScript frameworks (e.g. Angular, React)
• Mastery of one or more scripting languages (e.g. Python, Bash)
• Deep understanding of Linux Operating System internals, Namespaces, and Containers
• Working knowledge of networking (e.g., firewall, routing, network topologies and hardware, SDN)
• Hands-on experience with Kubernetes orchestration platform
• SQL Database (e.g. Oracle) and No-SQL experience
• Experience with messaging systems and APIs (JMS, EMS, AMPS)
• Hands-on experience in a variety of SRE languages and tools (Ansible, Dynatrace, GitHub, Elasticsearch, Logstash, Kibana, Grafana, Prometheus)
• Experience with SDLC process and Agile development practices (JIRA, Jenkins, Confluence, git)

Desired Skills
• Experience with public cloud technologies
• Financial industry experience with understanding of Equities

location: NEW YORK, New York

job type: Contract

salary: $53 – $63 per hour

work hours: 9am to 5pm

education: Bachelors

• Experience level: Experienced
• Minimum 8 years of experience
• Education: Bachelors

• Java

Equal Opportunity Employer: Race, Color, Religion, Sex, Sexual Orientation, Gender Identity, National Origin, Age, Genetic Information, Disability, Protected Veteran Status, or any other legally protected group status.

At Randstad, we welcome people of all abilities and want to ensure that our hiring and interview process meets the needs of all applicants. If you require a reasonable accommodation to make your application or interview experience a great one, please contact [Email available when viewing the job].

Pay offered to a successful candidate will be based on several factors including the candidate’s education, work experience, work location, specific job duties, certifications, etc. In addition, Randstad offers a comprehensive benefits package, including health, an incentive and recognition program, and 401K contribution (all benefits are based on eligibility).

For certain assignments, Covid-19 vaccination and/or testing may be required by Randstad’s client or applicable federal mandate, subject to approved medical or religious accommodations. Carefully review the job posting for details on vaccine/testing requirements or ask your Randstad representative for more information.



Site Reliability Engineer at Interactive Brokers LLC

Location: New York

Join the Interactive Brokers Team

Interactive Brokers Group has been consistently at the forefront of trading innovation, starting with the invention of the first floor-based handheld computer in 1983, and we pride ourselves on being primarily a technology company. We continue to challenge the status quo and push boundaries to offer the best trading platform with the most sophisticated features, all for the lowest cost to our customers. Software development is the lifeblood of our firm, and it shows in our stellar brokerage platform. We offer award-winning desktop, mobile and web applications which provide our clients with the tools they need to be successful. Interactive Brokers Group, Inc. (IBKR) is rated #1 – Best Online Brokers 4 years in a row by Barron’s.

About the role

As a global technology leader in Financial Services, IBKR maintains tens of thousands of individual IT components and millions of dollars of infrastructure supporting the business. Inside our global IT operations centers, these systems, networks, processes and infrastructure are monitored 24/7/365, ensuring platform stability and proper function. We are searching for an SRE / IT Operations Engineer to work within the technical operations group, to support our technical operation analysts through automation and tooling.

Your Responsibilities
• Take ownership of software tooling and configuration management software and infrastructure.
• Maintain and improve reliability of production services by developing innovative monitoring and scaling solutions to measure system health and automate resilience.
• Continuously improve hybrid, on-premise and cloud infrastructure to support development teams throughout the full service lifecycle.
• Support teams across the organization with connectivity configuration and file delivery automation using software development, networking and cyber security knowledge.
• Provide second level support for incident management across the brokerage system and create corrective action plans through collaboration with problem managers using post-mortem best practices.

Key Requirements
• Bachelor of Engineering or equivalent relevant technical experience in Computer Science, Software Engineering, Mathematics, Physics or similar.
• Experience developing applications in any of the following languages: Python, C++, Java or Kotlin.

Preferred
• Practical background with Linux based systems and networking.
• Experience working with and managing configuration for monitoring / alerting systems (Prometheus, Grafana, Kibana, ElasticSearch, Logstash, AlertManager).
• Infrastructure configuration and management with AWS and supporting cloud technologies (CloudWatch, Terraform, Kubernetes, Lambdas/Functions).
• Experience with DevOps technologies such as Docker / Docker-Compose, software build/packaging systems (Gradle, SetupTools, CMake, Make), dependency management (pip, maven, NPM etc.).
• Knowledge of file transfer protocols (SFTP, FTP), certificate management and modern encryption standards.
• Experience working in a software development team using supporting development tools (Jira, Git, Gitlab, Jenkins) and best practices (test coverage, unit & integration testing, linting).
• Experience with ITIL best practices and collaboration tools such as Jira, Confluence and ServiceNow.

Company Benefits & Perks
• Competitive salary, annual performance-based bonus and stock grant
• Retirement plan 401(k) with competitive company match
• Excellent health and welfare benefits including medical, dental, and vision benefits
• Wellness screenings and assessments, health coaches and counseling services through Employee Assistance Program (EAP)
• Paid time off and a generous parental leave policy
• Daily company paid lunch and a fully stocked kitchen with healthy options for breakfast and snack
• Corporate events including team outings, dinners, volunteer activities and company sports teams
• Education reimbursement and learning opportunities
• Modern offices with sit/stand desks and multi-monitor setups


Site Reliability Engineer at OLO

Location: New York

Olo is experiencing tremendous growth, and as we enhance our platform to support increased demand, it must be positioned for continued stability, reliability and resiliency. Reporting to the Engineering Manager of Site Reliability, the Site Reliability Engineer will partner with Engineering and Product Managers to learn, improve system availability and sharpen our execution skills to provide an amazing experience for our customers. Olo is a remote-first company, offering all full-time employees the option to work from anywhere in the U.S.

What You’ll Do
• Guide everything from observability and SLIs/SLOs to Incident Response, postmortems and follow-up actions.
• Implement and tailor our incident response tools to minimize outage durations.
• Build collaborative monitoring solutions with members across multiple product teams.
• Contribute insights across teams to help us improve or re-architect existing systems to support scale, performance and extensibility.
• Rethink our observability tooling to improve architecture, knowledge models, user experience, performance and stability.
• Analyze and mature our processes around Incident Response, Observability, Postmortems and Predictive Monitoring.
• Influence an engineering culture of reliability, observability, and availability.
• Participate in an Incident Commander on-call rotation to help drive remediation efforts to improve our user experience through incidents across our Platform.
• Mentor engineering teams through game days, SRE boot camps and other training and feedback channels.

What We’ll Expect From You
• 3+ years of professional experience building scalable, efficient, and resilient systems.
• Experience with monitoring tools like Datadog, Sumo Logic, Raygun, New Relic, Grafana, CloudWatch, and Splunk SignalFx.
• Fluency in Incident Management using tools such as FireHydrant, OpsGenie, PagerDuty, VictorOps, or similar.
• Experience with build and deploy tools (i.e. Jenkins, TeamCity, Octopus, or CircleCI).
• Prior hands-on software development experience.

About Olo

Olo is a leading on-demand commerce platform powering the restaurant industry’s digital transformation. Millions of orders per day run on Olo’s enterprise SaaS engine, enabling brands to maximize the convergence of digital and brick-and-mortar operations. The Olo platform provides the infrastructure to capture demand and manage consumer orders from every channel. With integrations to over 100 technology partners, Olo customers can build digital experiences with the largest and most flexible restaurant commerce ecosystem on the market. Over 500 restaurant brands use Olo to grow digital sales, maximize profitability, and preserve direct consumer relationships. Learn more at olo.com.

Olo’s headquarters is located on the 82nd floor of One World Trade Center. In addition to our NYC cohort, over 75% of our team works remotely across the U.S. We offer great benefits, such as 20 days of Paid Time Off, fully paid health, dental and vision care premiums, a 401k match, company equity, a generous parental leave plan, and perks like team events. Check out our culture map:

We encourage you to apply! We value diversity. At Olo, we know a diverse and inclusive team not only makes our products better, but our workplace better. Many groups are underrepresented across the tech sector and we are committed to doing our part to move the needle. Olo is an equal opportunity employer and diversity is valued at our company. All applicants receive consideration for employment. We do not discriminate on the basis of race, religion, color, national origin, gender identity, sexual orientation, pregnancy, age, marital status, veteran status, or disability status.

If you like what you read, hear, and/or know about Olo, and want to be a part of our team, please do not hesitate to apply! We are excited to hear from you!


Staff Site Reliability Engineer at Maven Clinic

Location: New York

Maven is the largest virtual clinic for women’s and family health, offering continuous, holistic care for fertility, pregnancy and parenting. Maven’s award-winning digital programs are trusted by leading employers and health plans to reduce costs and drive better health outcomes for both parents and children. Founded in 2014 by CEO Kate Ryder, Maven has supported more than 10 million women and families to date. Maven has raised more than $200 million in funding from leading investors including Sequoia, Oak HC/FT, Dragoneer Investment Group and Lux Capital.

An award-winning culture working towards an important mission – Maven Clinic is a recipient of over 20 workplace and innovation awards, including:
• Fast Company Best Workplaces for Innovators (2022)
• Fortune Best Workplaces NY (2020, 2021, 2022)
• Great Place to Work certified (2020, 2021, 2022)
• Inc. Best Workplaces (2022)
• CNBC Disruptor 50 List (2022)
• Fast Company Most Innovative Company in Health (2020)
• Built In NYC Best Paying Companies (2022)
• Built In LGBTQIA+ Advocacy Award (2022)

Maven is looking for an experienced Staff Site Reliability Engineer in our New York City office or completely remote to work on all things infrastructure, site and system reliability, tooling, deployment, and information security. You will work with a strong cross-functional team of engineers, supporting our infrastructure using tools like Google Cloud Platform, Kubernetes, and Docker. Prior experience in a healthcare setting is a nice-to-have, but not a requirement.

As a Staff Site Reliability Engineer at Maven, you will:
• Be a leader within Maven and help drive our technical direction and vision
• Maintain existing infrastructure and systems
• Design new deployment and security strategies
• Facilitate collaboration with other engineers and product teams on their infrastructure needs
• Maintain Maven’s existing Kubernetes deployment via Google Cloud Platform (GCP) to ensure high levels of availability to our members who rely on access to our healthcare platform
• Empower our developer teams to ship frequently, safely, and quickly
• Work with our engineering teams on broad infrastructure initiatives, like multi-cloud deployments, disaster recovery processes and testing
• Be a key member of our third-party audits and assessments like SOC 2, HITECH, and more
• Debug and resolve issues in our testing and production environments
• Assist deploying new versions of our systems in production (note that we only deploy during regular work hours)

We’re looking for you to bring:
• 8+ years of production-level experience using cloud-based hosting, notably with Google Cloud Platform, AWS, or Azure
• 2+ years of production-level experience with Terraform, Ansible, Chef, Puppet, SaltStack, or other infrastructure-as-code tools
• Experience with containerized systems like Docker and Kubernetes
• One or more scripting languages (Python, Ruby, Perl, etc.)
• Experience with continuous testing and continuous deployment (CI/CD) systems and strategies
• Shell scripting skills (Bash)
• Strong communication, interpersonal, written and verbal skills

Helpful experiences and skills (if you don’t have them, you can learn them with us):
• Experience supporting a healthcare SaaS-type platform
• Experience with various security and compliance frameworks like SOC, HITECH

At Maven we believe that a diverse set of backgrounds and experiences enriches our teams and allows us to achieve above and beyond our goals. If you do not have experience in all of the areas detailed above, we hope that you will share your unique background with us in your application and how it can be additive to our teams.

Benefits & Perks:

We are reimagining what a supportive workplace looks like, from the inside out. On top of standards such as employer-covered health, dental, and insurance plan options, and generous PTO, we offer an all-of-you, inclusive approach to benefits:
• Maven for Mavens: access to the full platform and specialists, including care for everything from mental health and reproductive health to family planning and pediatrics.
• Whole-self care through wellness partnerships
• Weekly breakfast, lunch, and get-togethers
• 16 weeks of 100% paid parental leave, flexible time upon return, and a new parent stipend of $1.5K/mo for 2 months (for Mavens who’ve been with us at least six months)
• Udemy, annual professional development stipend, and access to a personal career coach through Maven
• 401K matching for US-based employees (immediately vesting)

These benefits are applicable to Maven Clinic Co., US-based, full-time employees only. 1099/Contract Providers are ineligible for these benefits.

Maven is an equal opportunity employer. We celebrate diversity and are committed to creating an inclusive environment for all employees. Please note that Maven Clinic interview requests and job offers only originate from email address (e.g ). Maven Clinic will also never ask for bank account information (routing, account numbers), social security numbers, passwords or any other sensitive information to be delivered over email, or phone. If you receive a scam issue or a security issue involving Maven Clinic please notify us at: .


Site Reliability Engineer at Aspen Capital

Location: New York

Aspen Capital is currently seeking a Site Reliability Engineer to join our engineering team. This position will work collaboratively with our technology teams to deploy and operate custom built applications and commercially available systems and applications. You will help automate and streamline operations and processes, and build and maintain tools for deployment, monitoring and operations.

We are currently in a phase of rapid growth driven by increasing demand for our services. To meet this demand, we are looking to hire exceptional developers to work alongside our existing team, together driving our platform and services forward.

Day to day you will:
• Create and maintain infrastructure as code using Terraform and Ansible
• Create CI/CD pipelines to automate code deploy/release and infrastructure creation/destruction
• Assist engineering team in refactoring legacy application to modern cloud native architecture
• Assist in troubleshooting and resolving production issues
• Collaborate across technical and business teams to speed delivery of our applications and foster open communication and learning


• 4+ years of demonstrated work experience supporting cloud scalable high-performance data solutions
• Strong background in Linux/Unix Administration and operations
• Experience with Kubernetes and containers
• Experience with automation/configuration management using Ansible and Terraform
• Experience with modern build strategies. Continuous delivery experience preferred
• GitOps: leveraging Git to automate and orchestrate product delivery and monitoring
• Strong working understanding of code (Java, JavaScript)
• Ability to effectively work in a team environment and independently as needed
• Ability to be on call after hours as needed


Work with cutting-edge technology. Data Science is in the DNA of Aspen Capital and we hope it is in yours as well. Join us as we rewrite the rules of residential and commercial mortgages and real estate.

We are a private equity firm based in Portland, OR and New York, NY. We utilize data and technology to enhance business insight, propel growth, transform our investment strategies and business operations, and execute industry-leading deals. The unique Aspen Capital worldview is reflected in a nimble, efficient organizational structure that allows the company to capitalize on market demands, seize business opportunities and excel in a wide range of roles including investment, lending and servicing, acquisitions, management, joint ventures, asset management, recapitalization and advisory services.

We offer a competitive salary and a great benefits package including medical, dental, and vision insurance, covering 100% of the employee premiums and 50% for dependents. We provide life insurance, short & long term disability insurance, 15 days of PTO, 8 paid holidays and a 401(k) plan with company match up to 4%. We have a dog-friendly work environment and casual dress.
• We are an equal employment opportunity employer. All qualified applicants will receive consideration for employment without regard to race, color, religion, gender, national origin, disability status, protected veteran status or any other characteristic protected by law.
• We maintain a drug-free workplace and perform pre-employment substance abuse testing.


Site Reliability Engineer – Trading – New York at GSR

Location: New York

About us:

Founded in 2013, GSR is a crypto market maker with more than 300 employees in 5 countries. We provide billions of dollars of liquidity to cryptocurrency protocols and exchanges on a daily basis. We build long-term relationships with cryptocurrency communities and traditional investors by offering exceptional service, expertise and trading capabilities tailored to their specific needs.

GSR works with token issuers, traders, investors, miners, and more than 60 cryptocurrency exchanges around the world. In volatile markets we are a trusted partner to crypto native builders and to those exploring the industry for the first time.

Our team of veteran finance and technology executives from Goldman Sachs, Two Sigma, Citadel, and Tower Research among others, has developed one of the world’s fastest and most robust trading platforms designed to navigate issues unique to the digital asset markets. We have continuously improved our technology throughout our history, allowing for our clients to scale and execute their strategies with the highest level of efficiency.

Working at GSR is an opportunity to be deeply embedded in every major sector of the cryptocurrency ecosystem.

Our proprietary trading platform was designed to navigate issues unique to the digital asset markets. We have continuously improved our technology throughout our nine year history, allowing for our clients to scale and execute their strategies with the highest level of efficiency.

Our edge is fuelled by a sophisticated, data-driven process of hypothesis, research, and validation.

Gaining an edge in cryptocurrency markets requires investment in the right tools and a systematic approach. Our research pipeline is built on data that has been harvested on a 24-hour basis from 50+ exchanges.

Your role will be to help deliver a secure, reliable, and best-in-class production environment for our global trading franchise.

• Provide first-line support and troubleshooting for our exchange connectivity applications.
• Track, escalate, and prioritise incidents, ensuring timely resolution.
• Implement configuration changes and software upgrades.
• Actively participate in the continuous monitoring of exchange connectivity, order flows, latency etc.
• Actively participate in the design and improvement of processes around monitoring and alerting.
• Plan the long-term roadmap for the trading system, from capacity to tools to software features.
• Work with developers to design and implement features in the trading system.
• Perform ad-hoc operational tasks as needed to support the global trading function.

• 2 to 7 years’ experience in a similar role at a bank or proprietary trading firm.
• Bachelor’s degree in a relevant subject (e.g., Computer Science).
• Required: Experience working in front-line support.
• Required: Experience in MySQL/MSSQL and Linux.
• Required: Ability to understand and read API documentation. Familiarity with WS, REST and FIX is nice to have.
• Required: Ability to analyse logs, metrics and system performance of applications in general.
• Required: Ability to successfully manage multiple tasks in a fast-paced environment.
• Nice to have: Exposure to container technology (e.g., Docker, Rancher).
• Nice to have: Software development experience (a plus, but not required).
• Disciplined self-starter with high attention to detail and ability to work autonomously and/or remotely.
• A high degree of motivation, adaptability and proactiveness are key success factors for the role.
• Optimistic and willing to learn and work with other SREs, even outside the domain.

What we offer:

A collaborative and transparent company culture founded on Integrity, Innovation and Performance. Competitive salary with two discretionary bonus payments a year. Benefits such as Healthcare, Dental, Vision, Retirement Planning, 30 days holiday and free lunches when in the office.

Hybrid working pattern in all of our offices: London, New York, Singapore, Zug and Malaga.

Regular Town Halls and off-sites, team lunches and drinks.

A Corporate and Social Responsibility program as well as charity fundraising matching and volunteer days.

Immigration and relocation support where required.

GSR is proudly an Equal Employment Opportunity employer. We do not discriminate based upon any applicable legally protected characteristics such as race, religion, colour, country of origin, sexual orientation, gender, gender identity, gender expression or age. We operate a meritocracy: all aspects of people engagement, from the decision to hire or promote to our performance management process, will be based on business needs, individual merit and competence in the role.

Learn more about us at www.gsr.io.

