FULL TIME JOB / SENIOR SITE RELIABILITY ENGINEER

Job Overview

Location: Irvine, California
Job Type: Full Time Job
Job ID: 125721
Date Posted: 5 months ago
Recruiter: Dennis Ruth
Job Views: 152

Job Description

About the Position

As a member of the SRE team, you will bring a collaborative style to leading efforts that raise the maturity of engineering practices across all agile teams delivering our products. The tools and use cases are diverse, and our challenge is to increase development velocity by optimizing various parts of the pipeline while improving application stability. Much of our software development focuses on optimizing existing systems by measuring elasticity and saturation, building infrastructure through Infrastructure as Code (IaC), and eliminating or reducing toil through automation. We also look to instill core SRE practices in the engineering teams, including measuring SLIs/SLOs, increasing visibility and observability through monitoring tools, guiding chaos engineering efforts to improve overall resiliency, and leading Gameday/Production Readiness reviews across all engineering disciplines. We’re experts in AWS, and we use cutting-edge in-house tools and open-source software to enable teams to deploy faster with zero downtime.

We are looking for engineers who are passionate about automation, who like planting the seeds of DevOps in an organization and watching it benefit and grow from their ideas, and who own best practices grounded in SRE principles to build scalable and highly reliable applications.

If you love figuring out how all the pieces fit together, and if automating and building tools to monitor and manage your applications sounds interesting, we want to talk to you.

Cox Automotive is transforming the way the world buys, sells and owns cars. Come join the transformation! 

Primary Responsibilities and Essential Functions:

As a Senior Site Reliability Engineer at Cox Automotive you will:

  • Have a natural tendency to avoid toil and want to automate it away

  • Automate anything and everything (testing, deploying, monitoring, etc.)

  • Take complex and possibly ill-defined problems and come up with technically reasonable solutions

  • Take ownership of processes or solutions that can be shared across teams globally

  • Build and roll out solutions to be consumed by multiple teams

  • Have innate curiosity about how things work

  • Design and assist in the authoring of software tools that reliably manage application delivery & performance

  • Define requirements, functional specifications, and deliverables

  • Create automated delivery pipelines for deployment of internal and third-party services

  • Design and assist in the setup and maintenance of application monitoring and alerting

  • Engage with product/capability engineering teams to ensure best practices are implemented

  • Improve predictability and reliability of software releases, workflows, and operating software

  • Reduce application deployment windows by leading engineering teams towards a Continuous Deployment environment

  • Reduce mean time to recovery (MTTR) by helping troubleshoot, monitor, alert, and automate recovery

  • Facilitate Gamedays and Production Readiness reviews to continue increasing resiliency in our applications

  • Provide consulting expertise in AWS, cloud design, and operations

  • Identify new technologies that can improve our area of responsibility, design and conduct proofs-of-concept, and communicate results throughout the organization

Minimum Qualifications:

  • Bachelor’s degree in Computer Science or related field and 4+ years of relevant experience

  • Expertise in designing, analyzing, and troubleshooting large-scale distributed systems

  • Ability to debug, optimize code, and automate routine tasks

  • Systematic problem-solving approach, coupled with effective communication skills and a sense of drive

  • Understanding of Linux/Windows operating systems

  • Experience with Python or PowerShell or related scripting languages

  • Experience with configuration management systems (Spinnaker, Chef, Puppet, or Ansible)

  • Experience rolling out highly available, mission-critical applications

  • Experience with version control systems (Git or SVN) and branching strategies

  • Experience with cloud computing platforms (Amazon AWS, Kubernetes, Heroku, etc.)

  • Experience in release engineering / automation with cloud environments

  • Experience with security and network / distributed computing concepts

  • Experience with continuous integration tools (Jenkins, GitHub Actions, CircleCI, TeamCity, etc.) and artifact repositories (Artifactory or Nexus)

  • Experience with Database Server infrastructure (RDS, Aurora, DynamoDB, MySQL, Postgres, etc)

  • Experience with agile development, continuous integration and automated testing

  • Experience with Infrastructure as Code (Terraform or CloudFormation)

  • Excellent written communication, problem solving, and process management skills

  • Desire to work in a fast-paced, evolving, growing, dynamic environment
