
Site Reliability Expert
1 day ago
Job Description:
The Site Reliability / GitOps Engineer develops infrastructure as code, automates software operations, and improves the resilience and scalability of cloud and container portfolios.
This position requires a deep understanding of IT operations automation, infrastructure as code, and Linux networking, routing, and firewalls.
You will work collaboratively with development teams to design service architecture, documentation, playbooks, policies, and operational procedures.
Main Responsibilities:
- Apply your IaC experience to develop the infrastructure-as-code practice within IS by continually increasing automation and improving IaC processes
- Automate software operations for re-usability and consistency across private and public clouds, taking into consideration the complexities of distributed systems
- Develop new features and improve the resilience and scalability of the existing cloud and container portfolio
- Maintain operational responsibility for all of Canonical's core services, networks, and infrastructure
- Develop skills in troubleshooting, capacity planning, and performance investigation
- Set up, maintain, and use observability tools such as Prometheus, Grafana, and Elasticsearch; design, implement, and maintain monitoring and alerting for various systems and services
- Collaborate with development teams to design service architecture, documentation, playbooks, policies and operational procedures
- Provide assistance and work with globally distributed engineering, operations, and support peers
Requirements:
- Deep experience defining operations in code, using version control, peer review, and CI/CD to roll out changes to both applications and infrastructure
- Strong modern engineering background (peer-review, unit testing, SCM, CI/CD, Agile)
- Python software development experience on large projects
- Practical knowledge of Linux networking, routing, and firewalls
- Affinity with various forms of Linux storage, from Ceph to databases
- Hands-on experience administering enterprise Linux servers
- Extensive knowledge of cloud computing concepts and technologies
- Bachelor's degree or greater, preferably in computer science or related engineering field
- Able to communicate clearly and effectively in English over email, chat, video or voice calls, and in person
- Motivated and able to troubleshoot from kernel to web, and willing to ask others when appropriate
- Flexible and able to learn new things quickly
- Inspired by the needs of fast-changing environments
- Happy to work within distributed teams
- Passionate about and familiar with open source, especially Ubuntu or Debian