A robust monitoring service, integral to the operations of the IT Security department, depends on the seamless and timely ingestion of event and flow data. This timely acquisition forms the bedrock of proactive threat detection, incident response, and overall cybersecurity resilience.

The term ‘timely’ adds a critical dimension to the monitoring process: data must arrive promptly so that potential security incidents can be identified and addressed before they escalate. Delays in data ingestion widen the gap between the occurrence of security events and their detection, leaving an organization exposed for longer and at greater risk.

Moreover, the concept of ‘ingested’ underscores the need for the efficient and accurate assimilation of diverse data types into the monitoring system. This involves not only collecting raw data but also transforming it into a format suitable for analysis and interpretation. Typical ingestion challenges include data quality, compatibility, and handling large volumes of information efficiently.

Addressing these challenges requires a proactive approach: implementing robust data management strategies, streamlining data pipelines, and utilizing advanced technologies such as machine learning for automated anomaly detection. Establishing effective communication channels between the various IT infrastructure components is equally important to keep information flowing seamlessly.


The ingested challenge

Various services within the IT Security department rely heavily on ingested data. Besides the monitoring service, Cyber Threat Intelligence and Cyber Threat Hunting depend on it just as much.

Therefore, a logical question is: ‘Where is the data?’ A general rule of thumb is: ‘Assets listed in the CMDB and matching certain criteria should be onboarded.’ This is a good starting point, but everyone knows how hard it is to keep a CMDB up to date. You therefore need a safety net to identify gaps in the CMDB data.

A different service of the IT Security department can help you here. The Enterprise Vulnerability Management program owns the data you need to identify the gaps in the CMDB, provided the program is set up and functioning correctly. Besides running periodic vulnerability scans, it should also conduct discovery scans at regular intervals. These discovery scans help you identify CMDB gaps if you implement the KPI/KRI ‘VLAN Discovery scan coverage’: how many of the active VLANs are regularly scanned to determine which addresses are alive?
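The coverage KPI/KRI is straightforward to compute once you have two exports. A minimal sketch, assuming two hypothetical plain-text files with one VLAN ID per line (the active VLANs known to the network team, and the VLANs targeted by the scanner's discovery scans):

```shell
#!/usr/bin/env bash
# Sketch of the 'VLAN Discovery scan coverage' KPI/KRI. Both input files are
# assumed exports: one VLAN ID per line.
#
# Usage: vlan_coverage <active_vlans_file> <scanned_vlans_file>
vlan_coverage() {
  local active="$1" scanned="$2"
  local total covered
  total=$(sort -u "$active" | wc -l)
  # VLANs that are active AND covered by a discovery scan
  covered=$(comm -12 <(sort -u "$active") <(sort -u "$scanned") | wc -l)
  awk -v c="$covered" -v t="$total" \
      'BEGIN { printf "covered=%d total=%d coverage=%.1f%%\n", c, t, (t ? 100*c/t : 0) }'
  # Active VLANs missing from discovery scans are CMDB-gap candidates
  comm -23 <(sort -u "$active") <(sort -u "$scanned")
}
```

The list printed after the coverage line is the actionable part: every active VLAN that no discovery scan touches is a potential blind spot for the monitoring service.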

Discovery scanning focuses on the on-premises estate, not the cloud estate. To find and identify cloud assets, you need a different approach. A Cloud Access Security Broker (CASB) solution only detects so much, and its effectiveness depends heavily on how it is deployed. Therefore, the IT Security department must take a proactive stance and hunt for cloud-based assets. Analyzing firewall and authentication data may reveal clues about cloud assets in use, but that only works when traffic is initiated from the internal network towards the cloud asset and/or the cloud asset authenticates via the company’s central authentication service.
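Such a hunt can start very simply. A sketch, assuming a hypothetical CSV export of outbound firewall logs (`timestamp,src_ip,dst_host,dst_port`); the field layout and the provider-domain list are assumptions you would adapt to your own environment:

```shell
#!/usr/bin/env bash
# Sketch: summarize outbound destinations under well-known cloud provider
# domains from a hypothetical firewall CSV export.
#
# Usage: cloud_dst_summary <firewall_csv>
cloud_dst_summary() {
  awk -F',' '
    # Field 3 is the destination host; match common cloud provider suffixes
    $3 ~ /(amazonaws\.com|azurewebsites\.net|cloudapp\.azure\.com|appspot\.com)$/ {
      count[$3]++
    }
    END { for (h in count) printf "%6d %s\n", count[h], h }
  ' "$1" | sort -rn
}
```

Any destination on this list that is not in the CMDB is a candidate shadow cloud asset worth investigating.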

Even then, you will only discover some assets, not all of them. You also need a way to discover cloud assets that use their own authentication system and/or are only reachable from outside the company’s internal network. In other words, you need an attack surface management solution, one based on actively crawling the internet.

But this is only part of the ingested challenge. The next part is determining which events to ingest. If you apply the ‘I want everything to be ingested’ approach, you are wasting your budget and only making your technology provider happy, as you will need additional licenses. Instead, categorize the events into three buckets: ‘Required for detection,’ ‘Required for context analysis,’ and ‘Not required.’

The last category is easy: configure the log source so it does not forward those events. Admittedly, it might not be that simple, because not all log sources support selective forwarding. Therefore, you should build a log forwarding architecture that supports selective forwarding. This extra layer is also useful for the second category, ‘Required for context analysis,’ because you can route those events to a less expensive solution such as a data lake and thereby avoid consuming the more expensive SIEM solution license.
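As a sketch of what such a forwarding layer can look like, here is a hypothetical rsyslog (RainerScript) fragment; the host names and the program names used as routing criteria are illustrative assumptions, not a recommendation of which sources belong in which bucket:

```conf
# Route events by bucket at the forwarding layer.
# 'Required for detection' -> SIEM
if $programname == 'sshd' then {
    action(type="omfwd" target="siem.example.com" port="514" protocol="tcp")
    stop
}
# 'Required for context analysis' -> cheaper data lake ingest point
if $programname == 'dhcpd' then {
    action(type="omfwd" target="datalake.example.com" port="514" protocol="tcp")
    stop
}
# 'Not required' -> discard everything that did not match above
stop
```

The design choice here is that the classification lives in one place, the forwarding layer, instead of being scattered across hundreds of log sources with varying capabilities.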

The ingested challenge also has a technical element, one that can profoundly impact the workings of the detection engine. Although the UDP protocol is fast at delivering data packets across the network, it is also the least reliable protocol. You must conduct a performance test on the log forwarding architecture to understand its limits, for example by running a simple Bash script that generates 100k messages to be forwarded and ingested. Are all 100k events ingested in your environment? In performance tests I have conducted, severe data loss occurred once the network became congested.
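A minimal sketch of such a load-generating script, using Bash's `/dev/udp` redirection; the collector host/port and the message format are assumptions. Each message carries a sequence number, so you can count gaps on the ingestion side afterwards:

```shell
#!/usr/bin/env bash
# Sketch: flood a syslog collector with numbered UDP test messages.
#
# Usage: flood_syslog <host> <port> <count>
flood_syslog() {
  local host="$1" port="$2" count="$3"
  local i
  for ((i = 1; i <= count; i++)); do
    # <13> = facility user, severity notice; RFC 3164-style payload
    printf '<13>%s perftest: seq=%d\n' "$(date '+%b %d %H:%M:%S')" "$i" \
      > "/dev/udp/$host/$port"
  done
}
```

After running it with a count of 100000, search your SIEM for `perftest` and compare the highest `seq` values present against the gaps: every missing sequence number is an event your architecture silently dropped.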


The timely challenge

So far, I have only talked about which data should be ingested. However, the time element of the event data is as crucial to the IT Security department as the event itself. Every generated event may contain the data needed to identify the boom moment (also known as moment zero). Therefore, you must understand the time difference between an event’s generation and ingestion timestamps, especially for events in the category ‘Required for detection.’
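Measuring that difference can be as simple as comparing the two timestamps per event. A sketch, assuming a hypothetical CSV export from the SIEM with two epoch-second columns per event, generation timestamp and ingestion timestamp:

```shell
#!/usr/bin/env bash
# Sketch: report average and maximum ingestion lag from a hypothetical
# two-column CSV (generated_epoch,ingested_epoch), one event per line.
#
# Usage: ingest_lag <csv_file>
ingest_lag() {
  awk -F',' '
    { lag = $2 - $1; sum += lag; if (lag > max) max = lag }
    END { if (NR) printf "events=%d avg_lag=%.1fs max_lag=%ds\n", NR, sum/NR, max }
  ' "$1"
}
```

The maximum lag is the number to watch: it tells you the worst-case delay between moment zero and the moment the event is even available to the detection engine.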

At the same time, it is crucial to understand how detection works. Is it based on the near-real-time detection principle or on the scheduled search principle? Each approach has its pros and cons, and each profoundly impacts how a use case is implemented.

When you combine these two elements, you also see there may already be a gap between moment zero and the moment the detection engine triggers the alert. Take this into account when analyzing the alert.