Site Reliability Engineer – UMP

Location: Kraków | Remote


A Site Reliability Engineer is responsible for maintaining and monitoring our Device Management software products, automating processes, handling emergencies and incidents, troubleshooting, managing risk and building scalable systems. Our main customers are Tier 1 telecom players and large enterprises all around the world. As a Site Reliability Engineer, you will have the chance to create and implement leading technology and to work in varied environments for clients from all parts of the globe, across many types of implementation and maintenance scenarios.

What we are looking for:

  • min. 2 years of experience in systems administration,
  • strong Unix (Ubuntu, RedHat) and network knowledge,
  • good scripting skills: one of Python/Perl/Ruby/Go,
  • expertise with container orchestration and/or virtualization,
  • English (B2+).

Nice to have:

  • expertise with monitoring systems like Zabbix, Prometheus, Elasticsearch,
  • experience with continuous integration and delivery environments,
  • experience with maintaining Java applications,
  • experience with AWS, Azure Cloud or Google Cloud,
  • experience with Ansible, Salt, Helm.

What we offer:

  • flexible working hours,
  • a part in creating a modern product, which will keep you up to date with emerging technologies,
  • work in a team of professionals,
  • a chance to expand your skills by working with developers, testers and people responsible for the business side,
  • Multisport card,
  • company’s own parking and bike room,
  • a kitchen full of snacks,
  • a relaxed work atmosphere – no dress code, no open space.

Typical working day:

  • install/upgrade/reconfigure network, OSes, packages or our services,
  • create documentation or guides,
  • find workarounds and solutions to problems in monitored installations,
  • discuss and analyze solutions to client requirements to ensure that what we introduce is reliable,
  • join postmortems or similar meetings to keep the same errors from recurring,
  • develop and integrate tools that improve our internal procedures and make our work more efficient,
  • plan and apply changes to our internal systems as well as to clients' systems.

The technologies we use:

We have heterogeneous deployments: bare metal, virtualization and, increasingly, clouds and Kubernetes. Other than that, we use Ansible, Zabbix, the Elastic stack and the Atlassian stack. Various networking tools like pfSense and Shorewall for VPN automation. Lots of Python scripts and Django systems. And many, many more!
