We’ve talked about Microsoft’s view of threat modeling, but this next one might appeal to folks with a background in software that doesn’t crash – sorry, I jest, I jest! (Or do I?) Well, in this useful method, we’re all about understanding the flow of data across our systems. We’re talking Data Flow Diagrams (DFDs), as used in software engineering and systems analysis.

These are handy because they give you a visual representation of how data flows within a system or a process. They help in understanding how data is input, processed, stored, and output within a system or between different components of a system. DFDs are part of structured analysis and design methods and are commonly used for modeling and documenting information systems. If you are coming from a software-centric organization that has a lot of software engineers, developers, and architects, borrowing this approach for threat modeling might be a great idea.

Nerd surfing with his laptop open in a killer halfpipe of nerd symbols.
This is the best AI could do with “nerd surfing on a keyboard through a half-pipe of data flow diagram symbols.”

When I worked for Boeing’s Integrated Defense Systems business, we worked on US DoD programs that were using a more ‘robust’ version of this process up front. This gets out of hand quickly, so we used IBM’s Rational Rose software to capture Unified Modeling Language (UML) models that basically did this and other things. Similar concept, but we even used it to develop requirements for each stage of those flows. I wish we had just used the pure DFDs 😉

Back to the cyber use case, it is helpful to understand how these simpler Data Flow Diagrams are structured:

  1. There are several key components of a DFD:
    • Processes: You may see these as functions or transformations, but either way they represent specific functions or activities that manipulate data.
    • Data Stores: Also known as Warehouses, Files, or Databases, these depict where data is stored within the system.
    • Data Flows: Show the movement of data between processes, data stores, external entities, and data sources/destinations.
    • External Entities: Sometimes called Terminators, external entities may be anything or anyone interacting with the system being modeled, such as users, groups, or other outside systems.
  2. Notation: DFDs use specific symbols to represent these components. Commonly used symbols include circles for processes, open-ended rectangles (or a pair of parallel lines) for data stores, arrows for data flows, and squares or rectangles for external entities. While the symbols are important, the naming is critical: names should be understood without requiring a decoder ring or lookup table, and each element should also have a unique ID number for easy cross-referencing.
  3. Levels of DFDs: DFDs can be developed at different levels of detail, such that the complexity of any DFD can be limited to between three and nine processes.
    • Level 0 DFD provides a high-level overview of the system
    • Subsequent levels (Level 1, Level 2, etc.) break down processes into more detailed sub-processes, and typically inherit their ID prefix from the tier above
    • Need an example? Level 0 might use IDs 1, 2, and 3, while Level 1 would then see 1.1, 1.2, 1.3, and so on
  4. Development: DFDs are typically developed through a process of analysis, where analysts gather requirements and model how data flows through a system. Tools like data flow diagramming software can be used to create DFDs. Chances are, someone in your organization has already done something like this. Now whether it is right is another topic…
Special note on notation - use whatever floats your boat! So long as you and your other stakeholders all find it clear and useful, that is fine.
Basic symbols for the DFDs.
Someone finally catered to my level of artistic talent
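
If it helps to see the pieces outside of a drawing tool, here is a minimal Python sketch of the four components and the dotted ID scheme described above. Everything in it is an assumption made for illustration: the class names, the helper functions, and the tiny order-handling example are hypothetical and do not come from any DFD standard or tool.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class Kind(Enum):
    PROCESS = "process"
    DATA_STORE = "data store"
    EXTERNAL_ENTITY = "external entity"


@dataclass(frozen=True)
class Element:
    id: str    # unique ID, e.g. "1" at Level 0 or "1.2" at Level 1
    name: str  # plain-language name, no decoder ring required
    kind: Kind


@dataclass(frozen=True)
class DataFlow:
    source: str  # Element.id the data leaves
    target: str  # Element.id the data arrives at
    label: str   # what is actually moving


def level_of(element_id: str) -> int:
    """Level 0 IDs have no dots; each dot is one level deeper."""
    return element_id.count(".")


def parent_of(element_id: str) -> Optional[str]:
    """Return the ID prefix inherited from the tier above, if any."""
    head, sep, _ = element_id.rpartition(".")
    return head if sep else None


def children_of(parent_id: str, elements: List[Element]) -> List[Element]:
    """Sub-processes that break down the given element."""
    return [e for e in elements if parent_of(e.id) == parent_id]


# Hypothetical example: one Level 0 process broken into two Level 1 steps.
ELEMENTS = [
    Element("1", "Handle Order", Kind.PROCESS),
    Element("1.1", "Validate Order", Kind.PROCESS),
    Element("1.2", "Record Order", Kind.PROCESS),
    Element("2", "Customer", Kind.EXTERNAL_ENTITY),
    Element("3", "Orders Database", Kind.DATA_STORE),
]

FLOWS = [
    DataFlow("2", "1.1", "order request"),
    DataFlow("1.1", "1.2", "validated order"),
    DataFlow("1.2", "3", "order record"),
]

if __name__ == "__main__":
    for child in children_of("1", ELEMENTS):
        print(child.id, child.name, "(level", level_of(child.id), ")")
```

Storing flows by ID rather than by name is a deliberate choice in this sketch: it keeps the cross-referencing cheap when names get reworded, which tends to happen once more stakeholders start reading the diagram.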

Regardless of my bias from jobs long ago, Data Flow Diagrams can be a valuable tool in the context of Threat Modeling. Think about it! To better know ourselves, we need a systematic approach to identifying and mitigating security threats and vulnerabilities in software or system designs. Sweet, huh? Here’s how DFDs might be useful in Threat Modeling:

  1. Identifying Attack Surfaces: DFDs help in identifying potential entry points (external entities) into a system. These entry points are potential attack surfaces that need to be secured.
  2. Mapping Data Flow: By tracing data flows through the system, DFDs highlight how sensitive data moves within the system. This helps in identifying areas where data might be at risk and needs protection.
  3. Identifying Vulnerabilities: Security analysts use DFDs to identify potential vulnerabilities, such as unauthorized data flows, data leaks, or improper data handling.
  4. Risk Assessment: DFDs assist in assessing the risk associated with different components and data flows within the system. Analysts can prioritize security efforts based on this assessment.
  5. Security Control Placement: Based on the identified threats and vulnerabilities, security controls (e.g., access controls, encryption) can be strategically placed within the system to mitigate risks.
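
To make the first few items in that list a little more concrete, here is a rough Python sketch that treats a DFD as a plain list of flows and asks two questions of it: which flows touch an external entity (a first cut at the attack surface), and which flows carry sensitive data without protection. The element names, the `sensitive` and `encrypted` flags, and both helper functions are hypothetical, and using encryption as the only control is a simplification for the sake of the example.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass(frozen=True)
class Flow:
    source: str
    target: str
    data: str
    sensitive: bool = False  # assumption: someone has classified the data
    encrypted: bool = False  # assumption: stand-in for "a control is in place"


# Hypothetical external entities for a small web application.
EXTERNAL: Set[str] = {"Customer", "Payment Gateway"}

FLOWS: List[Flow] = [
    Flow("Customer", "Web App", "login credentials", sensitive=True, encrypted=True),
    Flow("Web App", "Orders DB", "order record", sensitive=True, encrypted=False),
    Flow("Web App", "Payment Gateway", "card details", sensitive=True, encrypted=True),
    Flow("Web App", "Customer", "order confirmation"),
]


def attack_surface(flows: List[Flow], external: Set[str]) -> List[Flow]:
    """Flows that cross the system boundary by touching an external entity."""
    return [f for f in flows if f.source in external or f.target in external]


def unprotected_sensitive(flows: List[Flow]) -> List[Flow]:
    """Sensitive flows with no protection recorded; candidates for a control."""
    return [f for f in flows if f.sensitive and not f.encrypted]


if __name__ == "__main__":
    print("Boundary-crossing flows:")
    for f in attack_surface(FLOWS, EXTERNAL):
        print(f"  {f.source} -> {f.target}: {f.data}")

    print("Sensitive flows lacking protection:")
    for f in unprotected_sensitive(FLOWS):
        print(f"  {f.source} -> {f.target}: {f.data}")
```

A list like this will not find threats for you, but it does turn the diagram into something you can query, and the output is a reasonable starting agenda for the risk assessment and control placement conversations.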

So I encourage you to do some web searches, where you will see the seemingly simple concept adapted a million ways to fit the user’s needs. That is sort of the point of these guides and frameworks – to make them work for you! If you find yourself in a process you inherited or borrowed and the juice isn’t worth the squeeze, maybe you are doing it wrong. Try to adapt it to your environment’s needs, and take the best parts of what you find.

OWASP Data Flow Diagram for an online banking application
OWASP made it their own in this notional PBA DFD

So now that we’ve seen how DFDs can frame the “know yourself” context, the next question is – does this fit your environment? Sometimes the process of defining your process is part of the point! To that effect, we’ll talk Attack Chains next – maybe there is a hybrid of these techniques or others that works well for you?