Semi-Structured Data

Semi-structured data sits between structured data and unstructured data. It carries some level of metadata tagging that gives context to what the data points are about. But, like unstructured data, it’s not collected in accordance with a particular data model, or schema.

Semi-Structured Data vs. Unstructured Data: What’s the Difference?

For example, an image file may be considered unstructured data. But adding ALT tags that provide some information on what the image is about transforms the file into semi-structured data.

Semi-structured data is the fastest-growing area of data. This is due to the increase in metadata tagging across documents, images, and video to help classify and categorize content for search engine optimization and organization.
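To make the ALT tag example concrete, here is a minimal Python sketch that reads the tagged parts of an image reference out of a web page. The HTML snippet, the file name, and the `AltTextCollector` class are invented for illustration; only the standard library's `html.parser` module is assumed.

```python
from html.parser import HTMLParser


class AltTextCollector(HTMLParser):
    """Collects the src and alt attributes of every <img> tag it sees."""

    def __init__(self):
        super().__init__()
        self.images = []

    def handle_starttag(self, tag, attrs):
        if tag == "img":
            attr_map = dict(attrs)
            self.images.append(
                {"src": attr_map.get("src"), "alt": attr_map.get("alt")}
            )


# Hypothetical page fragment: the pixels of team.jpg stay unstructured,
# but the alt attribute is a tag that describes what the image shows.
html_snippet = (
    '<p>Our team</p>'
    '<img src="team.jpg" alt="Five colleagues at a whiteboard">'
)

collector = AltTextCollector()
collector.feed(html_snippet)

for image in collector.images:
    print(image["src"], "->", image["alt"])
```

The image content itself is never interpreted here. Only the tag attached to it is read, which is what makes the file searchable and classifiable.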

What are the Different Types of Semi-Structured Data?

Types of semi-structured data include:

  • Compressed Files
  • Emails (unstructured body text, but with structured data like subject line and send date)
  • Images (that include metadata)
  • Webpages
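The email entry in the list above can be sketched in a few lines of Python. The message below is made up for illustration; the parsing relies only on the standard library's `email` package. The header fields can be read by name, while the body comes back as free-form text.

```python
from email import message_from_string
from email.utils import parsedate_to_datetime

# A made-up message used only for illustration.
raw_message = """\
From: alice@example.com
To: bob@example.com
Subject: Quarterly report
Date: Mon, 01 Jan 2024 09:30:00 +0000

Hi Bob, the report is attached. Let me know what you think.
"""

msg = message_from_string(raw_message)

# Header fields are tagged, so each one can be looked up by name.
structured_fields = {
    "from": msg["From"],
    "subject": msg["Subject"],
    "sent_at": parsedate_to_datetime(msg["Date"]),
}

# The body has no predefined fields; it is plain, unstructured text.
body_text = msg.get_payload()

print(structured_fields["subject"])
print(body_text.strip())
```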

How Does a Customer Data Platform Manage Semi-Structured Data?

Data collection needs to be standardized in order to integrate successfully. More often than not, that data is fractured and resides in disparate silos. The right technology solution can help gather that data and combine it in a standardized fashion.

A customer data platform (CDP) is able to integrate, unify and deliver structured, unstructured and semi-structured data to the right teams across the organization. Organizations are also using CDPs to ensure that data is secure and compliant with emerging global data privacy regulations.

With data that is standardized and integrated into unified profiles, enterprise businesses can de-silo different departments and work together using a single source of truth for all customer data. Transforming semi-structured data into structured data with a CDP can be the differentiator brands need to stay ahead of the competition and stay relevant to their customers.
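As a rough illustration of what transforming semi-structured data into structured data can mean in practice, the Python sketch below flattens loosely tagged records into rows with a fixed set of columns. It is not a CDP's actual API or method: the JSON records, the field names, and the `to_row` helper are all assumptions made for this example.

```python
import json

# Hypothetical customer records. Each one is tagged, but the set of
# fields varies from record to record, as semi-structured data often does.
raw_records = [
    '{"email": "a@example.com", "name": "Ana", "tags": {"plan": "pro"}}',
    '{"email": "b@example.com", "tags": {"plan": "free", "region": "EU"}}',
]

# The fixed schema every record is mapped onto.
COLUMNS = ("email", "name", "plan", "region")


def to_row(raw):
    """Map one tagged record onto the fixed columns, using None for gaps."""
    record = json.loads(raw)
    tags = record.get("tags", {})
    flat = {
        "email": record.get("email"),
        "name": record.get("name"),
        "plan": tags.get("plan"),
        "region": tags.get("region"),
    }
    return tuple(flat[column] for column in COLUMNS)


rows = [to_row(raw) for raw in raw_records]

for row in rows:
    print(row)
```

Every row ends up with the same columns in the same order, so the result can be loaded into a table and joined with other structured sources.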

Amy Onorato
Amy Onorato is the Managing Editor and Senior Content Marketing Manager at Treasure Data. Prior editorial and creative roles include journalism, content marketing and content strategy for CBSNewYork, Newsday, DMN, and Publicis Sapient.
