SysOps / Site Reliability Engineer (SRE)
Chess.com is one of the largest gaming sites in the world and the clear #1 platform for playing, learning, and enjoying chess.
We are a team of 335+ fully remote people in 45+ countries working hard to serve the global chess community. We are also growing fast, with more than 80 million players having chosen Chess.com as their online chess home, and we support a large base of happy premium subscribers.
We are a tech company. A gaming company. A content company. And we do it all with passion and commitment to the game. Above all we prize our mission-driven, flat, life-celebrating, no-corporate culture, and we look forward to meeting you and learning more about what you can bring to the team.
You are passionate about building and managing infrastructure. It brings you joy to learn new technologies and use them to help reach challenging product goals. You have solid experience diving deep into Linux internals, as well as the future-oriented skills of managing Cloud/Kubernetes ecosystems. You are humble, have a sense of humor, and are eager to be part of a like-minded team. You have worked in, or dreamed of working in, the gaming industry and are ready to turn your talents towards chess!
What You’ll Do
- You will help us to reinvent how people experience chess around the world
- You will take part in building a multi-regional, resilient system capable of handling millions of games each day, along with many additional services.
- You are willing and able to participate in an on-call schedule
- You will have the opportunity to solve interesting challenges like scaling 1 million CPUs across a dozen regions.
- You will help us maintain stability and performance as we blend our existing bare-metal datacenter hosting with GCP for microservices and scaling
- You’ll be proactive in improving our users' experience
What We’re Looking For
- Sense of ownership and responsibility
- Strong understanding of cloud principles and datacenter design
- Strong collaboration and communication skills working in a fully distributed team
- Strong knowledge of UNIX-based OS fundamentals
- Detailed understanding of HTTP and related technologies
- Knowledge of configuration management systems
- Understanding of networks and low-level network protocols
- Experience with Content Delivery Networks
- Experience with server application monitoring & visualization (StatsD, Datadog, ELK, Prometheus, etc.)
- Experience with server-side automation scripting
- Experience with data layer technologies (RDBMS/SQL, NoSQL/key-value, etc.)
- Security knowledge and risk assessment ability
- Experience managing and running chess engines is a major plus!
- Lifelong learner
About the Opportunity
- This is a contract position
- We are 100% remote (work from anywhere!)
- This is open to applicants in time zones between GMT+0 and GMT+3
You can learn more about us here:
We look forward to meeting you!