National Society of Black Engineers

Site Reliability Engineer / DevOps

GRIID Infrastructure
Austin, Texas, United States
18 days ago

Description

We are looking for a Site Reliability Engineer (SRE) / DevOps engineer who will be responsible for creating, maintaining, and scaling our cloud and edge data center infrastructure. Your primary duties will include working with software engineers to build robust infrastructure with Continuous Integration / Continuous Deployment (CI/CD), monitoring, and alerting, while meeting security and data protection requirements and participating in on-call rotations.

Key Responsibilities:

Essential duties and responsibilities include, but are not limited to, the following:

  • Champion automation to reduce toil and increase development velocity
  • Deploy, monitor, and maintain the production environment and services, taking a holistic view of system health and ensuring that our platform is stable and balanced
  • Work closely with developers to provide feedback and drive operational improvements within our production infrastructure
  • Develop infrastructure and network best practices
  • Measure and optimize system performance with an eye toward pushing our capabilities forward, getting ahead of customer needs, and innovating to continually improve
  • Gather and analyze metrics from both operating systems and applications to assist in performance tuning and fault finding
  • Document processes and procedures with an appropriate level of detail
  • Participate in our on-call operations and monitoring pool

Qualifications:

Education and/or Experience

  • Bachelor of Science degree in Computer Science or equivalent experience
  • 3 to 5 years of related experience
  • Knowledge of one or more scripting languages (ex: Python, Bash)
  • Experience with multiple areas of AWS (EC2, RDS, S3, IAM, Route53, etc.)
  • Experience with Infrastructure as Code (IaC) tools such as Terraform, CloudFormation, and Ansible
  • Experience with configuration management tools such as Ansible, Puppet, or Chef
  • Experience with CI/CD tooling (Jenkins preferred)
  • Experience with bare metal servers (Linux)
  • Experience with dashboarding, monitoring, and alerting (ex: CloudWatch, Prometheus / Grafana stack)
  • Experience maintaining relational and time series database systems
  • Experience with Kubernetes a plus
  • Ability to create accurate system diagrams and documentation for designing and planning cloud and edge computing systems
  • A proactive approach to spotting problems, areas for improvement, and performance bottlenecks
  • AWS, CNCF, and CompTIA Linux+ certifications a plus

Additional information:

Physical Demands

        Vision and hearing correctable to within normal ranges.

        Must be able to speak and write clearly and distinctly.

        Must be able to lift and move typical server equipment in the 10 lb range.

Work Environment

        Remote work initially, with an eventual on-site or split arrangement

        10 to 20% travel

Reasonable accommodations may be made to enable individuals with disabilities to perform essential functions.



Job Information

  • Job ID: 63914118
  • Location: Austin, Texas, United States
  • Company Name For Job: GRIID Infrastructure
  • Position Title: Site Reliability Engineer / DevOps
  • Industry: Engineering
  • Job Function: Engineering
  • Job Type: Full-Time
  • Job Duration: Indefinite
  • Min Education: Bachelor's Degree
  • Years of Experience: 2+ to 5 Years
  • Required Travel: 10-25%
  • Salary: $100,000.00 - $130,000.00 (Yearly Salary)
Technical Communications
Norwood, OH, US

GRIID is an American infrastructure company that procures low-cost, carbon-free energy to build, manage, and operate a growing portfolio of vertically integrated bitcoin mining facilities in the US.
