Senior Site Reliability Engineer

 

Description:

The Wikimedia Foundation is looking for a Senior Site Reliability Engineer to join our team, reporting to the Engineering Manager, Cloud Services. Cloud Services curates environments that host tools and services used across Wikimedia projects. For example, a significant portion of edit traffic on Wikipedia comes from community-developed tools that we host!

Our team maintains Infrastructure as a Service, Platform as a Service, and Data as a Service products. The team works in partnership with the larger Wikimedia volunteer community to manage these environments (our Puppet repo is public, and yes, you can contribute to it!). Candidates should be comfortable communicating publicly and asynchronously with volunteers and developers from around the world.

 

You’ll work remotely with a full-time distributed team whose members are spread between Europe and North America, so your working hours will need to overlap with the team's (UTC-5 to UTC+1). Some examples of the type of work you’ll be doing include:

 

  • Expanding the capabilities of our Toolforge platform
  • Expanding and refining our storage offerings, backed by Ceph and NFS
  • Scaling our team via automation
  • Providing a curated Jupyter notebook environment for data analysis and queries of Wikimedia data
  • Upgrading and customizing OpenStack, and adding new services to it, such as Terraform support and Database as a Service
  • Developing new web services for our technical community, like Quarry and PAWS

And the backlog has even more details!

 

You are responsible for:

 

  • Helping to create a repeatable OpenStack cloud deployment
  • Implementing a network topology using Open vSwitch, providing per-tenant networking, load balancing, and IPv6
  • Performing day-to-day operational tasks on Wikimedia’s Cloud Services infrastructure (deployment, maintenance, configuration, troubleshooting), and developing and supporting the automation tools and processes behind these tasks
  • Participating in the on-call rotation and providing support in a 24x7 environment

 

Skills and Experience:

 

  • Comfort working and thriving within a Linux ecosystem
  • Understanding of networking in the physical domain of switches and servers
  • Software development skills in at least one of the following languages: Python, Go, JavaScript, or Ruby
  • B.S. or M.S. in Computer Science or a related field, or equivalent related work experience

Organization: Wikimedia Foundation
Industry: Engineering
Occupational Category: Senior Site Reliability Engineer
Job Location: Dublin, Ireland
Shift Type: Morning
Job Type: Full Time
Gender: No Preference
Career Level: Intermediate
Experience: 2 Years
Posted at: 2023-11-18 5:29 am
Expires on: 2024-12-06