Monitoring & Incidents Analyst

6 days ago


Muscat, Muscat, Oman TAT IT Technologies Full time

We have an urgent requirement for a Monitoring & Incidents Analyst for our banking client in Muscat, Oman.

The candidate is required to work in shifts to perform 24x7 command center duties. (Must)

Experience as an Incident or Problem Manager in an IT Application Operations environment. (Must)

Experience with performance monitoring and observability tools such as Dynatrace, Riverbed NPM, SolarWinds, and Grafana. (Must)

SRE certification is an added advantage.

Overall Accountability

The command centre provides end-to-end monitoring and observation of all IT services, using the monitoring and observability tools implemented or being implemented at the bank. The candidate will be part of the team responsible for operating the round-the-clock command centre and monitoring all of the bank's IT assets using the command centre's monitoring capabilities. This involves working in shifts according to the roster prepared by the supervisors.

  • The candidate is required to work in shifts to perform 24x7 command centre duties and ensure all monitoring and observability tools run without interruption.
  • Contribute to command centre activities: monitoring the operation of all 24x7 critical systems, following up on service requests initiated from monitoring tools, and performing emergency escalation and reporting.
  • Liaise with the on-site shift technical team to perform system inspection, emergency repair, and emergency management, with the aim of achieving 100% facility uptime for IT services operations.
  • Perform emergency escalation and reporting when system abnormalities occur.
  • Perform proper handover and takeover of daily duties with the next shift co-worker, clearly indicating all tasks to be followed up. This includes preparing the daily 24x7 shift handover report and work activities summary.
  • Provide incident details to Command Centre Leadership so that the incident report can be prepared and issued within SLA.
  • Assist in preparing historical data logs from the monitoring tools whenever required.
  • Be part of a Command Centre team to handle Incident and Problem Management.
  • Assess and validate major incidents.
  • Manage notifications and escalations as defined in the major incident management process.
  • Help coordinate recovery actions and plans for major incidents through to resolution with the respective application owners.
  • Provide timely and informative updates to management, stakeholders, and users until incident closure.
  • Participate in post-incident root cause analysis (RCA) as required and follow up on improvement plans.
  • Understand and track outstanding actions and improvement plans for incidents escalated to the Command Centre until closure.
  • Provide monthly incident trend updates to management.
  • Work in close collaboration with internal teams throughout the life cycle to ensure cross-team alignment.
  • Contact the right support and vendor team(s) promptly in case of an incident (after scrutinizing the event/alert with the subject matter experts).
  • Arrange triage in case of a crisis.
  • Monitor using existing tools and the new EPM.
  • Perform pre-defined recovery processes (following the runbooks).
  • Create and maintain a knowledge library for the Command Centre team to use in operations and detection.
  • Manage and follow up on the incidents created by the monitoring tools/team.
  • Follow up on RCA, problem calls, and implementation.
  • Lead crisis calls and manage the war-room.
  • Classify incidents based on priority (e.g., critical, high, medium, low).
  • Coordinate between various IT teams & vendors to ensure swift resolution of incidents.
  • Escalate incidents that require additional resources or senior management intervention.
  • Maintain detailed records of incidents, actions taken, and resolution processes.
  • Generate reports on incident management performance and incident trends.
  • Conduct post-incident reviews to assess the effectiveness of the crisis response.
  • Update crisis management plans based on lessons learned from the incident.
  • Act as the primary point of contact for all IT-related major incidents and crises.
  • Ensure all issues are logged in the appropriate internal and vendor tracking tools.
  • Ensure that regular progress updates are conveyed to line managers.
  • Ensure that appropriate corrective and preventive actions are undertaken, and resolve problems as soon as they arise.
  • Ensure compliance with Risk Management and Audit standards.
  • Contribute ideas to help the support team become more effective, and seek ideas from other team members.

Specific Responsibilities

  • System Maintenance
    • Ensure that a detailed impact analysis of any issue is carried out and a viable solution is recommended with the help of the respective application custodians.
    • 24/7 shift availability.
  • System Enhancements
    • Ensure the Bank's methodology is used for any enhancements undertaken.
    • Propose appropriate solutions to support line managers in making decisions and using the right methodology during any development or issue resolution.
    • Research and evaluate emerging technologies and trends, and suggest a course of action to line managers.

Key Skills

  • Bachelor's degree in IT or a related discipline.
  • At least 3 years' experience as an Incident or Problem Manager in an IT Application Operations environment.
  • 3+ years' experience with performance monitoring and observability tools such as Dynatrace, Riverbed NPM, SolarWinds, Grafana, etc.
  • Proven techno-functional knowledge of performance and observability tools.
  • Proven techno-functional knowledge of IT-related fields.
  • SRE certification is an added advantage.
  • Good understanding of market trends and current technology.
  • Good presentation skills and the ability to explain complex technical and business topics.
  • Work experience in the banking industry is an advantage.

The candidate should also meet the following requirements:

  • Ready to work on shifts.
  • Good level of programming knowledge in various languages.
  • Good level of knowledge in database management systems and SQL.
  • Solid knowledge in system design using Structured and Object-Oriented methodology and good knowledge of SDLC.
  • Good knowledge of current technology in the IT industry.
  • Documentation & Report/MIS Preparation.
  • Good communication and presentation skills, with a good command of written English.
  • Good interpersonal skills and a pleasant personality.

Skills: presentation skills, grafana, incident, communication, sre certification, 24x7, solarwinds, system design, problem management, documentation, monitoring, incident management, observability tools, 24x7 command center duties, riverbed npm, dynatrace, sdlc, sql, performance monitoring tools.

