Monitoring Specialist

7 days ago


Muscat, Muscat, Oman Infoline Full time

Job Title: Monitoring & Incidents Analyst

Type of contract: One year extendable

Main Role (Overall Accountability)

Our client is seeking a Command Centre Monitoring & Incidents Analyst to operate within the bank's 24/7 command center. The command center is responsible for end-to-end monitoring and observation of all IT services using the monitoring and observability tools implemented at the bank. The candidate will be part of the team that manages round-the-clock operations and monitors the bank's IT assets, working in shifts according to the roster prepared by supervisors.

Overall Responsibilities

  1. Work in shifts to perform 24x7 command center duties, ensuring all monitoring tools operate uninterrupted.
  2. Monitor all critical systems, follow up on service requests from monitoring tools, and manage emergency escalations and reporting.
  3. Coordinate with on-site technical teams and management on system inspections and emergency repairs to maintain 100% IT service uptime.
  4. Perform emergency escalations and report abnormalities.
  5. Ensure proper handovers between shifts, including preparing daily shift reports and activity summaries.
  6. Provide incident details to Command Centre Leadership for incident reports within SLA.
  7. Assist in preparing historical data logs from monitoring tools as required.
  8. Handle Incident and Problem Management; assess and validate major incidents.
  9. Manage notifications, escalations, and coordinate recovery actions for major incidents with application owners.
  10. Provide timely updates to management and stakeholders.
  11. Participate in post-incident root cause analysis and follow up on improvement plans.
  12. Track outstanding actions and improvement plans related to incidents.
  13. Report monthly incident trends to management.
  14. Collaborate with internal teams to ensure alignment across the incident lifecycle.
  15. Contact support and vendor teams promptly during incidents, once the event has been scrutinized.
  16. Arrange triage during crises, monitor using existing and new tools, and follow predefined recovery processes.
  17. Create and maintain a knowledge library for the command center team.
  18. Manage and follow up on incidents, lead crisis calls, and manage war-room activities.
  19. Classify incidents by priority and coordinate resolution with IT teams and vendors.
  20. Escalate incidents requiring additional resources or senior management intervention.
  21. Maintain detailed incident records, generate performance reports, and conduct post-incident reviews.
  22. Update crisis management plans based on lessons learned.
  23. Act as the primary contact for major incidents and crises, ensuring proper logging and progress updates.
  24. Ensure compliance with risk management and audit standards.
  25. Contribute ideas for support team improvements and seek input from team members.

Specific Responsibilities:

  1. System Maintenance: Conduct impact analysis, recommend solutions, and ensure methodology adherence. Support decision-making and research emerging technologies.
  2. Projects: Advise digital transformation and change management teams on monitoring and observability projects.

Educational & Personnel Specifications

  • Bachelor's degree in IT or related discipline.
  • Minimum 3 years' experience as an Incident or Problem Manager in IT applications.
  • 3+ years' experience with performance monitoring and observability tools.
  • Proven techno-functional knowledge in performance and observability tools and IT fields.
  • SRE certification is an advantage.
  • Understanding of market trends and current technology.
  • Strong presentation skills and ability to communicate complex topics.
  • Experience in banking is a plus.
  • Programming knowledge in various languages.
  • Knowledge of databases and SQL.
  • System design expertise using structured and OOP methodologies; SDLC knowledge.
  • Excellent communication and interpersonal skills, proficiency in English.

Seniority level
  • Mid-Senior level
Employment type
  • Contract
Job function
  • Information Technology
Industries
  • Banking

  • Muscat, Muscat, Oman beBee Careers Full time

    Role Summary: The Equipment Reliability Specialist will work closely with the condition monitoring team to provide expert technical advice and support. Key responsibilities include analyzing and evaluating data, working with the condition monitoring specialists, and utilizing Condition Monitoring tools to monitor machine health. Additional requirements include...

  • Muscat, Muscat, Oman beBee Careers Full time

    Cost Planning Specialist Position. This role involves ensuring accurate cost entry into systems, tracking project performance, and supporting financial planning and risk management efforts within construction or engineering projects. This full-time on-site position is based in Oman. Develop and edit work estimates using iTWO estimation software. Monitor and...


  • Muscat, Muscat, Oman beBee Careers Full time

    **Monitoring & Incidents Analyst Job Description** The ideal candidate will have experience in Incident or Problem Management in an IT operations environment and be proficient in performance monitoring and observability tools. **Key Responsibilities** Analyzing issues thoroughly and recommending solutions with application custodians. Implementing corrective and...


  • Muscat, Muscat, Oman beBee Careers Full time

    Job Description: As a Monitoring and Incidents Analyst, you will play a crucial role in ensuring the smooth operation of our IT services. Your primary responsibility will be to operate and monitor the 24/7 Command Centre for IT services, guaranteeing uninterrupted functionality of monitoring and observability tools. Perform system inspections, emergency...


  • Muscat, Muscat, Oman beBee Careers Full time

    Incident Management Specialist. This position requires strong problem-solving skills and attention to detail. The Incident Management Specialist will work closely with cross-functional teams to ensure seamless incident resolution. The ideal candidate will have experience working with IT infrastructure and troubleshooting complex technical issues. Key...


  • Muscat, Muscat, Oman beBee Careers Full time

    This role is responsible for designing, deploying, and maintaining Dynatrace monitoring solutions across the organization. The successful candidate will possess strong knowledge of Dynatrace architecture and deployment strategies, as well as experience in configuring tags, services, and SLIs/SLOs. Qualifications: Bachelor's degree in Computer Science or a...


  • Muscat, Muscat, Oman beBee Careers Full time

    Monitoring and Incident Analyst. This is a role that requires the ability to work independently in a fast-paced environment. The Monitoring and Incident Analyst will be responsible for ensuring the smooth operation of IT systems, identifying and resolving technical issues promptly, and maintaining detailed records of incidents. The successful candidate will have...


  • Muscat, Muscat, Oman beBee Careers Full time

    Job Title: IT Operations Specialist. The role of the IT Operations Specialist involves ensuring the smooth operation of all IT systems and services. This includes monitoring performance, identifying and resolving issues, and implementing process improvements. The successful candidate will have a strong understanding of IT service management principles and...


  • Muscat, Muscat, Oman beBee Careers Full time

    Monitoring and Incident Management Role. About the Job: We are seeking a skilled Monitoring Specialist to join our team in a critical role that requires strong technical expertise, excellent communication skills, and the ability to work effectively under pressure.