Complex Data: Key Best Practices On How Organizations Can Handle It More Efficiently

Written by Jeffrey Wolff, Certified E-Discovery Specialist 


Many of us have a love-hate relationship with data, specifically the complex data generated by modern cloud-based software. On one hand, it’s a necessary part of most organizations’ daily operations, especially for litigators and other legal professionals who deal with eDiscovery. On the other hand, complex data is exactly that – complex – and it can make the eDiscovery process slower and more challenging.

In this post, we’ll first give an overview of complex data and explain the difference between the two types of complex data, structured and unstructured. We’ll then share how today’s legal professionals deal with complex data and offer some tips about how your organization can handle complex data more efficiently, especially when it is scattered across multiple repositories.


What is complex data?
What are the challenges legal professionals face with complex data?
     Legal operations
     Information technology
     Lawyers within healthcare organizations
     Outside legal counsel
How can organizations manage complex data across multiple repositories more efficiently?
     1. Practice proper data governance
     2. Invest in technology
With diligent information governance and advanced technology, managing complex data is less complex

What is complex data?

In this post, “complex data” means the data generated by modern cloud-based applications and software-as-a-service (SaaS) programs. Examples include word processing documents, maps, images, video, audio files, graphs, and databases.

There are two main types of complex data: structured and unstructured.

Structured data is typically quantitative data that has clearly defined internal parameters and relationships. Structured data generally appears in columns and rows of a relational database. It may include dates, names, addresses, credit card numbers, and other data types that fit within defined fields. This type of data is easy to use and access but is generally inflexible.

In contrast, unstructured data is usually qualitative data that users cannot readily process or analyze using conventional tools or approaches.

Unstructured data includes:

  • freeform text such as documents and text or chat messages;
  • audio and visual content including pictures, video, and audio recordings;
  • online content such as social media posts;
  • sensor data from internet of things (IoT) devices; and more.

This type of data requires more expertise and specialized tools to manage.
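
To make the contrast concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: the table, field names, and sample messages are hypothetical, and the plain keyword scan stands in for the far more capable indexing and AI tools used in real eDiscovery.

```python
import sqlite3

# Structured data: the fields are defined up front, so a query can target them directly.
# The "patients" table and its columns are hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE patients (name TEXT, visit_date TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO patients VALUES (?, ?, ?)",
    [("A. Rivera", "2023-01-14", "Austin"), ("B. Chen", "2023-02-02", "Denver")],
)
structured_hits = conn.execute(
    "SELECT name FROM patients WHERE visit_date >= ?", ("2023-02-01",)
).fetchall()

# Unstructured data: there are no fields to query, so we scan the text itself.
# These chat messages are hypothetical sample data.
chat_messages = [
    "Can you resend the contract draft before Friday?",
    "Lunch at noon?",
    "Legal flagged the contract clause about termination.",
]


def keyword_search(messages, term):
    """Return every message that contains the term, ignoring case."""
    term = term.lower()
    return [message for message in messages if term in message.lower()]


unstructured_hits = keyword_search(chat_messages, "contract")

print(structured_hits)
print(unstructured_hits)
```

The structured query can rely on the `visit_date` field, while the unstructured search can only guess at relevance from the words in each message, which is one reason unstructured data tends to need more specialized tooling.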

What are the challenges legal professionals face with complex data?

The legal discovery process can be tedious and complicated even without complex data. But when it comes to eDiscovery, new forms of complex data can be particularly difficult to search, review, and collect. That’s especially true because complex data tends to be scattered across multiple repositories.

Complex data presents common challenges, but those challenges affect different groups of legal professionals in different ways. Consider the following four groups.

Legal operations

Legal operations (legal ops) may lack the specialized tools and expertise they need to manage complex data. This becomes especially apparent when they are tasked with handling complex data spread across multiple repositories, and when much of that data is unstructured data of varying types (such as social media, audio, and video).

Legal ops may also struggle to keep their lawyers up to date on new eDiscovery technology, which is often necessary when it comes to processing unstructured data. This technology can include artificial intelligence (AI) tools and other searching and indexing software.

Additionally, budgetary constraints and a competitive job market can make it difficult for legal ops teams to maintain enough staff who can competently handle eDiscovery involving complex data. Staffing is particularly challenging when eDiscovery workloads are intermittent, as it’s difficult to scale upward quickly and financially infeasible to keep a large staff when the workload is low.

Information technology

The information technology (IT) teams that support legal professionals often face unique time constraints due to their other obligations within their organizations. IT cannot focus solely on assisting legal to the detriment of other business units.

If left to manage complex data for eDiscovery independently, IT may also miss important documents or over-collect information. IT staff generally aren’t lawyers, so this is a considerable risk when there is a lack of specificity or direction concerning the kinds of information they need to identify and collect. Both missing pertinent information and over-collecting can ultimately inflate the cost of eDiscovery and drag matters out.

Lawyers within healthcare organizations 

Healthcare organizations and their lawyers face highly specific challenges when it comes to handling complex data.

Healthcare organizations store a tremendous volume of clinical data – approximately 19 terabytes per year on average – which means they have an even greater need for assistive technology such as AI and search-in-place functionality.

Additionally, healthcare organizations face unique challenges due to the sensitive information contained in their data. This information often falls under the Health Insurance Portability and Accountability Act (HIPAA) and includes both personally identifiable information (PII) and protected health information (PHI).

Finally, many healthcare organizations rely on relational databases to manage patient information, but those structured repositories cannot efficiently handle unstructured data like clinical notes and transcripts.

Outside legal counsel

Many organizations turn to outside legal counsel to manage their eDiscovery needs. Outside legal counsel can be very skilled and knowledgeable, but they are rarely the most cost-effective option for eDiscovery. Outside legal counsel may be even more costly when they are called upon to locate and search complex data across multiple repositories.

How can organizations manage complex data across multiple repositories more efficiently?

There are two practices that today’s legal professionals can adopt to efficiently deal with complex data across multiple repositories.

1. Practice proper data governance

Implementing and maintaining proper governance can minimize the challenges of data identification and collection by supporting your organization’s overall data management strategy.

Data governance is a system of “people, processes, and technologies” that manages and protects data assets and defines who has authority and control over those data assets. Data governance is a very important component of data management because it “supports an organization’s overarching data management strategy” and ensures accountability and ownership.

Here are six core principles for proper data governance that your organization can employ now to improve your data management strategy:

  • Recognize that data is an important asset to your organization and has real value.
  • Clearly define who owns data and who is accountable for its management.
  • Clarify standardized rules for data management and ensure compliance with those rules.
  • Consistently manage data quality and implement regular testing to monitor whether quality standards are being met.
  • Create a unified change management strategy to encourage adoption of the new approach.
  • Practice transparent data auditing.

By implementing policies and procedures that align with these core principles, your organization can ensure proper governance of complex data.
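
As one illustration of what regular quality testing and clear ownership might look like in practice, here is a minimal Python sketch. The record layout, the field names (`owner`, `created`), and the two rules are all assumptions chosen for the example; your organization's own standards would define the real checks.

```python
from datetime import date

# Hypothetical inventory of data assets; the field names are assumptions.
records = [
    {"id": "1", "owner": "legal-ops", "created": "2023-03-01"},
    {"id": "2", "owner": "", "created": "2023-03-05"},
    {"id": "3", "owner": "it", "created": "not-a-date"},
]


def check_record(record):
    """Return a list of rule violations for one record (empty if it passes)."""
    problems = []
    # Rule 1: every data asset must have a named, accountable owner.
    if not record.get("owner"):
        problems.append("missing owner")
    # Rule 2: the creation date must be a valid ISO date (YYYY-MM-DD).
    try:
        date.fromisoformat(record.get("created", ""))
    except ValueError:
        problems.append("invalid created date")
    return problems


# Build an audit report that lists only the records that failed a rule.
report = {}
for record in records:
    problems = check_record(record)
    if problems:
        report[record["id"]] = problems

print(report)
```

Running a check like this on a schedule, and keeping the resulting reports, is one simple way to combine the quality-testing and transparent-auditing principles.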

2. Invest in technology

Managing complex data is no easy task, especially under the time constraints imposed by discovery deadlines. Luckily, technology is advancing to meet the ever-growing needs of organizations that are swimming in seas of complex data.

The right technology can make data management processes faster, easier, and less costly for your organization. If you store complex data across multiple repositories, consider an in-place search solution, which lets your organization search for and locate the necessary data before collection.

This, in turn, makes early case assessment (ECA) possible. ECA is a process that focuses on locating crucial documents and communications at the outset of a matter. You can then use your findings to calculate risk, determine the proportionality of different approaches, and, where appropriate, formulate a litigation strategy during the early stages of discovery.
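
The idea of searching in place can be sketched in a few lines of Python. This is a simplified illustration, not a description of any vendor's product: the repositories are modeled as in-memory dictionaries with hypothetical names and contents, whereas a real tool would connect to each source system through its own connectors. The point it shows is that the search returns only the locations of matching items, so nothing is copied out of its repository until you decide what to collect.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    """Location of a matching item; the content itself is not copied."""
    repository: str
    item_id: str


def search_in_place(repositories, term):
    """Scan every repository and return where the term appears."""
    term = term.lower()
    hits = []
    for repo_name, items in repositories.items():
        for item_id, text in items.items():
            if term in text.lower():
                hits.append(Hit(repo_name, item_id))
    return hits


# Hypothetical repositories and contents, for illustration only.
repositories = {
    "email": {"msg-1": "Q3 merger timeline attached", "msg-2": "Team lunch Friday"},
    "chat": {"c-7": "Did legal approve the merger terms?"},
    "file_share": {"doc-3": "Holiday schedule"},
}

hits = search_in_place(repositories, "merger")

# Summarize hits per repository to support an early view of scope.
hits_per_repo = {}
for hit in hits:
    hits_per_repo[hit.repository] = hits_per_repo.get(hit.repository, 0) + 1

print(hits)
print(hits_per_repo)
```

A per-repository summary like `hits_per_repo` is the kind of early signal that can inform ECA, since it indicates where relevant material is concentrated before any collection takes place.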

With diligent information governance and advanced technology, managing complex data is less complex

Managing complex data doesn’t have to be so complex! By implementing the data governance and technology solutions we’ve discussed, your organization can achieve greater efficiency, accuracy, and excellence in complex data management and eDiscovery. If you are ready to step up your data management and collection processes, get in touch with our experts to learn more about IPRO’s solutions.