Starting with the fundamental idea that information security is supposed to "secure information", we first need to determine what information must be protected. Regulations may help specify this, but there is much more information to protect in your environment than what they require. Certainly confidential patient data and customer financial records must be protected, and not just because HIPAA or PCI DSS require it. Your organization may also have trade secrets, marketing campaigns, merger plans, or other information which should be protected regardless of regulatory imperatives.
A basic rule of protection is that you must know what you have and where it is before you can protect it- even if the folks at MA OCABR can't figure this out. It doesn't matter if you need to defend jewelry from theft or credit card numbers from loss, you have to know where they are before you can protect them- so identifying the information you must protect is a logical first step towards both security and compliance.
The information to be secured will vary by organization and change over time, and therefore will require a flexible and versatile identification method. One effective approach is to start by asking three questions about the information to be protected:
- How does the information enter the environment?
  - Identify every point of entry for the information.
  - Include the origins of internally created information.
- Where is the information stored and accessed internally?
  - Not simply where it is stored, but also where it is used.
  - Not just where it is supposed to be, but where it really is stored and used.
- How does the information leave your organization?
  - Map every egress point, including submissions to any outside organizations.
Note that you will have to account for remote workers, road warriors, and other "insiders" who store and access information while "outside".
Now for the truly informative step: connect the dots. All of the dots. Map all of those entry and creation points to the storage points to the use points, and then to the egress points. You will likely discover paths and storage locations previously overlooked. You may even need to go back and re-answer the three questions armed with your new insights.
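One way to picture the "connect the dots" step is as a small directed graph. The sketch below is only an illustration, not a prescribed tool: every location name, the `FLOWS` list, and the `DOCUMENTED` inventory are made-up assumptions standing in for your own answers to the three questions. It walks every path from an entry point to an egress point and flags any location on those paths that is missing from the official inventory.

```python
from collections import defaultdict

# Hypothetical answers to the three questions, recorded as (source, destination) hops.
FLOWS = [
    ("web_form", "crm_db"),
    ("email", "sales_laptop"),
    ("crm_db", "reporting_app"),
    ("crm_db", "sales_laptop"),
    ("reporting_app", "partner_sftp"),
    ("sales_laptop", "personal_email"),
]
ENTRY_POINTS = {"web_form", "email"}
EGRESS_POINTS = {"partner_sftp", "personal_email"}
# Hypothetical inventory: where the information is *supposed* to be.
DOCUMENTED = {"web_form", "email", "crm_db", "reporting_app", "partner_sftp"}


def build_graph(flows):
    """Turn the list of hops into an adjacency map."""
    graph = defaultdict(set)
    for src, dst in flows:
        graph[src].add(dst)
    return graph


def find_paths(graph, start, egress, path=None):
    """Depth-first walk that returns every cycle-free path ending at an egress point."""
    path = (path or []) + [start]
    paths = []
    if start in egress:
        paths.append(path)
    for nxt in sorted(graph.get(start, ())):
        if nxt not in path:  # do not revisit a location already on this path
            paths.extend(find_paths(graph, nxt, egress, path))
    return paths


def map_information_flows(flows, entries, egress):
    """Connect every entry point to every egress point it can reach."""
    graph = build_graph(flows)
    all_paths = []
    for entry in sorted(entries):
        all_paths.extend(find_paths(graph, entry, egress))
    return all_paths


def undocumented_locations(paths, documented):
    """Locations that appear on a real path but not in the official inventory."""
    seen = {node for path in paths for node in path}
    return sorted(seen - documented)


if __name__ == "__main__":
    paths = map_information_flows(FLOWS, ENTRY_POINTS, EGRESS_POINTS)
    for p in paths:
        print(" -> ".join(p))
    print("Not in inventory:", undocumented_locations(paths, DOCUMENTED))
```

In this made-up data set the laptop and the personal email account only show up once the paths are traced, which is the kind of overlooked storage location the exercise is meant to surface.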
With this exercise complete you can begin to build a plan for both securing the information and meeting your compliance goals. Streamlining the information flow and reducing the number of storage points would be good starting points; these will reduce your exposure and simplify future security and compliance tasks.
Jack
4 comments:
Great points, as usual, Jack. When I did consulting gigs, my progression of initial questions was usually:
1) What information is important to your organization?
2) Where is that information stored?
3) What is the data flow of that information into and out of your environment?
Not many orgs could answer #1, even fewer #2, and almost none #3. Bog-mindling. :)
I'm mulling over the same problem, agree with your approach, but am left wondering about how to execute practically...
Context: After conducting an eye-opening risk assessment we issued an RFP for a Data Loss/Leakage Prevention (DLP) solution and are starting implementation. The initial focus is on monitoring and protecting data specific to the industry sector my organization operates in, and the project would be considered successful if it achieves the objectives specific to this data. However, the organization has tons of other sensitive information and we would not realize the full benefits of the DLP solution if we didn't also monitor and protect this information. So we want to follow a process similar to the one you've outlined.
Thoughts I'd like to add to the conversation:
1. The data mapping process should follow a structured approach to be repeatable, produce consistent results, and scale in large organizations where interview-based data discovery is not feasible and self-assessments are required. Even then, this process will require much in the way of information security practitioner time.
2. Metadata and taxonomy are important if you want consistency and need to be able to slice and dice the data, e.g. to be able to focus on the top data entry points, or the most important exit points. The metadata and taxonomy probably need to include:
- Entry points (physical and electronic) - email, call center (voice), outlets, branches, stores, warehouse
- Internally created information - organizational structure or HR organogram could serve as the base starting point for the "creator" dimension although you'd also need to look at system-generated data
- Where the information is stored - list of all applications/systems and an understanding of how the entry/exit points, users and systems interact [although I think there is a system/application aspect to each of create/store/use/exchange]
- Exit points - similar to entry points, but we'd need to come up with a list (of the names) of all other organizations with which we have business relationships, e.g. financial institutions, HR agencies, suppliers, etc.
- Data types
- Data sensitivity or classifications
3. From the above two points it becomes obvious that we need a system in which to populate the metadata, run the surveys and questionnaires, report on the data gathered, etc. I'd be very interested to know what other organizations are using software-wise (commercial or open source) to facilitate all of this, how long it took them to build the software internally, and how much effort is involved in maintaining it. [Access and Excel - while useful - don't count as sustainable and scalable solutions in my view but I'm open to being corrected.]
4. Organizations who have already documented their business processes have a significant advantage over organizations who haven't. If the process documentation is available, it should help information security practitioners better understand the businesses they are supporting and the information flows.
5. Some industry sectors have commonly accepted process frameworks which may give them a head-start in identifying business processes and data types and hence information flows; eTOM in telecommunications is one example.
6. Information security practitioners are in this mess - not knowing in enough detail what information is most important to the business and who is authorized to access it - because we've historically focused on securing the data containers (systems/databases/applications) instead of the data itself. To some extent this is because the solutions haven't existed to secure data portably, i.e. so that the protection and data are inseparable as data is moved between systems.
7. I believe that Information Lifecycle Management and Information Rights Management, when widely adopted, are going to help us significantly to protect unstructured data.
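As a rough illustration of the metadata and taxonomy idea in point 2, the dimensions listed there could be held as structured records and then sliced by dimension. This is a minimal sketch under stated assumptions, not a recommendation of any product: the `DataAsset` fields, the classification labels, and all sample values are hypothetical.

```python
from collections import Counter
from dataclasses import dataclass


@dataclass(frozen=True)
class DataAsset:
    """One row of a hypothetical data-mapping inventory."""
    name: str
    data_type: str
    classification: str   # hypothetical labels: "internal", "confidential", "restricted"
    entry_points: tuple   # physical and electronic points of entry
    creators: tuple       # internal creators, e.g. departments
    systems: tuple        # applications/systems where the data is stored
    exit_points: tuple    # outside organizations that receive the data


# Made-up sample records for illustration only.
ASSETS = [
    DataAsset("payroll", "HR record", "confidential",
              ("email",), ("HR",), ("hr_app",), ("bank",)),
    DataAsset("card numbers", "payment data", "restricted",
              ("call center", "store"), (), ("pos", "billing_db"), ("payment processor",)),
    DataAsset("price list", "marketing", "internal",
              ("email",), ("Marketing",), ("file_share",), ("supplier",)),
]


def top_points(assets, dimension, classifications=None):
    """Rank the values of one dimension, optionally limited to some classifications."""
    counts = Counter()
    for asset in assets:
        if classifications and asset.classification not in classifications:
            continue
        counts.update(getattr(asset, dimension))
    return counts.most_common()


if __name__ == "__main__":
    print(top_points(ASSETS, "entry_points"))
    print(top_points(ASSETS, "exit_points", {"restricted"}))
```

Keeping each dimension as a consistent field is what makes it possible to ask questions such as "which entry points carry the most data sets?" without re-interviewing anyone.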
Ben, again our different experiences and perspectives converge and have us in agreement- thank you for adding your insight. I am both gratified and mortified that your experience is in line with my thoughts on this.
And Stephen- excellent comments. I especially appreciate the insight in point number 6, "...because we've historically focused on securing the data containers...instead of the data itself". You bring up very good points on scalability issues; this is one of the places where smaller businesses truly have an advantage, and never having worked in a large enterprise I appreciate your insights.
I had to perform this same exercise a few months back, and took a different approach. Instead of going at the data from the data, go at the data from the process.
That is, I met with each line of business and area of the company and began very broadly and worked our way in. Just define the pockets of data such as: Human Resources stores and gets payroll data, social security numbers, applications, retirement information and so on. Instead of defining that by the different applications and systems that hold that data first, just allow them to work with you to label the information they know about.
Next, you have to rate that information in terms of sensitivity and begin locating the systems and databases it is stored in.
Then you determine the appropriate level of protection, storage, and destruction of that data.