What are hidden relationships and how can I find them?
by Jim Nolan
March 2008
Join Us for a Complimentary Online Seminar on this Subject
Date: Wednesday, April 2, 2008
Time: 2pm Eastern; 11am Pacific
Duration: 1 hour
Host: Carahsoft

Last week I was updating my CEO, John Donnellon, on the capabilities of BOBCAT when he asked me a question: "What do you mean by a hidden relationship?" He followed that by saying, "Show me one." I was thrown off by the question. I thought to myself, "What does he mean by that? Isn't it obvious?" We have been grinding on the algorithms to find these hidden relationships for years now; it must be obvious by now, right? When he asked me to show him a hidden relationship and I didn't have an immediate answer, I realized there was a major problem in my thinking. Upon further review I realized that "hidden" and "obvious" don't really go together (see oil and water). I needed to do a better job of explaining to people exactly what I mean by the "hidden relationships" that BOBCAT discovers.
Definition: Hidden Relationships
So, let's start by defining a hidden relationship. My Random House dictionary defines hidden as "concealed; obscure; covert" and relationship as "a connection, association, or involvement". OK, good, I've got the fundamentals down now - a hidden relationship is a concealed connection or association. But that word "concealed" jumps out at me. If the association is concealed, it means that someone is actively cloaking his or her activity. This deliberate concealment suggests a different type of problem than we might otherwise expect. It is not enough to find relationships that merely happen to be obscure; I need to find relationships that someone is purposefully hiding.
Finding the Hidden Relationships
Now that we've got a clear definition of a hidden relationship, let us move on to the really interesting part: finding them. In the past, document co-occurrence was the primary technique for identifying relationships between entities. Co-occurrence refers to two entities being mentioned in the same document, email, report, or other media. In other words, if two individuals, John and Mary, were mentioned in the same report, we would make an association between them.
Co-occurrence has its advantages in that it is easy to understand and implement. However, it has a few disadvantages:
- Co-occurrence will miss relationships between entities that are not mentioned in the same report.
- Co-occurrence may imply relationships between individuals who are mentioned in the same report but do not have any meaningful relationship at all.
The bottom line is that co-occurrence is not really going to reduce the workload on the intelligence analyst - which is really our ultimate goal. The analyst will have to work through all of these missed and spurious associations by hand, which can be frustrating and cause him/her to lose confidence in the results of this technique.
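To make the two disadvantages concrete, here is a minimal sketch of document co-occurrence in Python. The reports and the names in them are made up for illustration (John and Mary come from the example above; Bob and Alice are hypothetical), and this is only the textbook technique, not anything specific to BOBCAT.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy reports: each one is the set of entities it mentions.
reports = [
    {"John", "Mary"},         # John and Mary appear together
    {"John", "Mary", "Bob"},  # Bob is only mentioned in passing
    {"Alice"},                # Alice never shares a report with anyone
]

def cooccurrence_counts(docs):
    """Count how many documents mention each unordered pair of entities."""
    counts = Counter()
    for entities in docs:
        # Sort so that each pair is always stored in the same order.
        for pair in combinations(sorted(entities), 2):
            counts[pair] += 1
    return counts

pairs = cooccurrence_counts(reports)
print(pairs[("John", "Mary")])   # 2: a repeated, probably meaningful link
print(pairs[("Bob", "John")])    # 1: a link that may mean nothing at all
print(pairs[("Alice", "John")])  # 0: any real Alice-John link is missed
```

The count is trivial to compute, which is the appeal. But it only reflects how the reports were written, so Bob gets a link he may not deserve and Alice gets none, whatever she is actually doing.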
So how do we find the true hidden relationships? Well, first we need to move away from techniques that center on how the data is reported and towards techniques that identify activities. Think about your own relationships for a minute. Whether it is your work colleagues, family, golf buddies, or bridge partners, your relationships center on common activities. The same can be said for terrorist operations - the cells and individuals are linked through common activities.
Discovering Hidden Relationships through Common Activities or Themes
Let's take a look at one approach that we have created here at DAC to identify common activities. We have previously illustrated this approach in the "Five Steps to Finding and Stopping the True Terrorist Threat" article that Jessica published earlier. Here I am going to illustrate how to solve the hidden relationship problem using news articles published yesterday, March 25, 2008.
What I have done is run a series of news articles through BOBCAT to identify activities (we call them "Themes") and relationships. The articles were collected from the Really Simple Syndication (RSS) feeds of public news providers such as Yahoo or CNN. As you can see below on the left, there are quite a few relationships in the news data, and things really do not become clear until I start to filter based on the strength of relationships (on the right).
One entity that jumps out at me (and that I have highlighted) is Al-Qaida. I am particularly interested in exploring the relationships that have been identified between Al-Qaida and other entities.
[NOTE: Zoom in on these images in your browser for full resolution]
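The strength-based filtering described above can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the dictionary layout, the `filter_by_strength` helper, the 0-to-1 strength scale, and the numbers themselves, which are invented and are not BOBCAT output.

```python
# Hypothetical weighted relationships: (entity_a, entity_b) -> strength.
# The strengths are invented for illustration, on an assumed 0-to-1 scale.
relationships = {
    ("Al-Qaida", "Osama"): 0.9,
    ("Al-Qaida", "Iraq"): 0.7,
    ("Al-Qaida", "Hezbollah"): 0.6,
    ("Iraq", "Shiite"): 0.5,
    ("CNN", "Iraq"): 0.1,
}

def filter_by_strength(edges, threshold):
    """Keep only the relationships whose strength meets the threshold."""
    return {pair: s for pair, s in edges.items() if s >= threshold}

strong = filter_by_strength(relationships, threshold=0.6)
for (a, b), strength in sorted(strong.items(), key=lambda item: -item[1]):
    print(f"{a} -- {b}: {strength}")
```

Raising the threshold thins the graph, so the entities that remain connected are the ones worth a closer look. The right cutoff depends on the data and is a judgment call for the analyst.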
By clicking on Al-Qaida I can immediately see what other entities are related (below left). I see some expected links: Osama, Iraq, Shiite, etc. However, there is a relationship that stands out to me: Hezbollah, another well-known terrorist group that has a history of both feuding and working with Al-Qaida.
Looking through the themes that I have generated, I quickly find the Al-Qaida theme (above right) to see if Al-Qaida and Hezbollah are mentioned together in a news article. After reading through several of the news stories, I see no mention of Al-Qaida and Hezbollah in the same article. They are mentioned frequently in their own separate articles, but there is no apparent linkage in the current news stream.
So how are they linked? If I examine the articles within the theme that has created the association it becomes obvious: the linkage is through their common declaration against Israel. We can begin to draw many conclusions from this linkage, but that is outside the scope of this article.
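The idea of linking entities through a shared theme rather than a shared article can be sketched as follows. This is not BOBCAT's algorithm: the theme labels are assigned by hand here, whereas discovering the themes is the hard part, and the toy articles and helper names (`pairs_of`, `cooccurring_pairs`, `theme_pairs`) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

# Toy articles as (theme label, entities mentioned).
# Theme labels are assigned by hand; the data is invented for illustration.
articles = [
    ("statements against Israel", {"Al-Qaida", "Israel"}),
    ("statements against Israel", {"Hezbollah", "Israel"}),
    ("Iraq security", {"Al-Qaida", "Iraq"}),
]

def pairs_of(entities):
    """Return all unordered entity pairs, each in a consistent order."""
    return set(combinations(sorted(entities), 2))

def cooccurring_pairs(arts):
    """Pairs of entities that appear together in at least one article."""
    found = set()
    for _, entities in arts:
        found |= pairs_of(entities)
    return found

def theme_pairs(arts):
    """Map each entity pair to the themes in which both entities appear."""
    by_theme = defaultdict(set)
    for theme, entities in arts:
        by_theme[theme] |= entities
    linked = defaultdict(set)
    for theme, entities in by_theme.items():
        for pair in pairs_of(entities):
            linked[pair].add(theme)
    return linked

direct = cooccurring_pairs(articles)
via_theme = theme_pairs(articles)

# A "hidden" link shares a theme but never shares an article.
hidden = {pair: themes for pair, themes in via_theme.items()
          if pair not in direct}
print(hidden)
# {('Al-Qaida', 'Hezbollah'): {'statements against Israel'}}
```

In this toy data the two groups never appear in the same article, so co-occurrence alone would not connect them. Grouping by theme first surfaces the pair and also records which theme produced the link, so the analyst knows which articles to read.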
So how does this reduce the workload of an analyst? By making these associations through activities or themes, the analyst can quickly focus on the entities of interest, or be notified when new relationships are created. By organizing the data based on themes, and creating relationships based upon the themes, the analyst can focus on the data that is most important and ignore data that is not relevant.
Simply put, the analyst can ask the right questions and drive to more accurate conclusions faster than ever before.
As usual, I look forward to your feedback on this article.
Jim
jim.nolan@dac.us