Checklist of the information a manager of a NOC needs to have close at hand


This is a checklist devised by DS of the information a manager of a Network Operations Centre (NOC) needs to have close at hand:

Command and control

  • Date and time of the current shift, including start, finish and handover
  • NOC manager on duty
  • Shift leaders
  • Service Delivery Manager on duty/standby
  • Major Incident Manager on duty/standby

Tiger Team status (refer here for process)

  • echo
  • whisky
  • delta
  • romeo
  • bravo
  • alpha

Red-Amber-Green (RAG) status of the trenches

  • Security
  • Data centre
  • Apps
  • Support
  • Infrastructure
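
A RAG board like this is often rolled up from many component statuses. The sketch below is one possible way to do that, not a prescribed method: it assumes each area reports a list of component statuses and that the area shows the worst of them. The area names come from the list above; the sample data, the `Rag` enum and the `rollup` helper are hypothetical.

```python
from enum import IntEnum


class Rag(IntEnum):
    """RAG status ordered by severity, so max() picks the worst one."""
    GREEN = 0
    AMBER = 1
    RED = 2


def rollup(statuses):
    """Return the worst status in the list; an empty list counts as GREEN (assumption)."""
    return max(statuses, default=Rag.GREEN)


# Hypothetical component statuses per area, for illustration only.
areas = {
    "Security": [Rag.GREEN, Rag.GREEN],
    "Data centre": [Rag.GREEN, Rag.AMBER],
    "Apps": [Rag.GREEN],
    "Support": [],
    "Infrastructure": [Rag.AMBER, Rag.RED],
}

summary = {name: rollup(statuses) for name, statuses in areas.items()}

for name, status in summary.items():
    print(f"{name}: {status.name}")
```

Taking the worst status is a deliberately pessimistic choice: one red component turns the whole area red, which suits a manager who needs to spot trouble at a glance. A NOC that prefers weighted or majority rules would swap out `rollup`.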

Notifications

  • Ongoing Service Level Agreement (SLA) or contract violations
  • All Major Incidents
  • All failures and outages
  • Last 10 maintenance tasks completed
  • Next 10 maintenance tasks scheduled
  • Planned continuity tests scheduled (inverter/generator tests, network path protection tests, business continuity or application high availability tests)
  • Resources available to the NOC
  • Resources unavailable to the NOC
  • Changes completed during the past week (including whether each succeeded or failed)
  • Changes scheduled for the next week
  • Emergency changes completed or in progress
  • Top 10 most important projects that are ongoing
  • Top 10 congested links
  • Top 10 devices with temperature alerts
  • Top 10 devices with cooling alerts
  • Top 10 devices with storage or capacity alerts (including RAID failures)
  • Systems/devices with known problems or symptoms of degradation
  • Top 10 networks with path protection faults
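
Several items above are "Top 10" lists. As a minimal sketch of how such a list might be produced, the example below ranks links by utilisation. It assumes the monitoring system can export records as dictionaries; the link names, the `utilisation_pct` field and the `top_n` helper are all hypothetical.

```python
import heapq


def top_n(records, metric, n=10):
    """Return up to n records with the highest value for `metric`, worst first."""
    return heapq.nlargest(n, records, key=lambda record: record[metric])


# Hypothetical link utilisation samples, for illustration only.
links = [
    {"name": "core-1 to edge-3", "utilisation_pct": 97.5},
    {"name": "core-1 to core-2", "utilisation_pct": 41.0},
    {"name": "edge-2 to dc-1", "utilisation_pct": 88.2},
    {"name": "edge-4 to dc-2", "utilisation_pct": 63.7},
]

worst_links = top_n(links, "utilisation_pct", n=3)

for link in worst_links:
    print(f"{link['name']}: {link['utilisation_pct']}%")
```

The same helper would serve the temperature, cooling and storage lists by passing a different metric name, provided each record carries that field.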
