
ChaosSearch Blog

How to Index and Process JSON Data for Hassle-free Business Insights

If your IT department is generating a tsunami of JSON-based log and event data, ChaosSearch® JSON FLEX® can fast-track automatic, flexible indexing so you can draw custom insights from your valuable business data.

JavaScript Object Notation (JSON) has become the de facto standard for log and event data created by business applications and services. The easy-to-read, semi-structured format can hold a wealth of information and statistics. As users perform their work, submit forms, buy products, and run their critical business operations, the JSON events are a continuous record of those essential operations, events, and interactions throughout the day, every day.

The challenge is keeping up with a steady stream of thousands, millions, or even billions of generated events. Analysts need an easy way to condense that volume into compact, searchable data that they can quickly query and visualize. For JSON data in particular, nested arrays and nested properties support very rich information layers, but converting those layers into searchable two-dimensional representations is not straightforward. There's always the risk of the JSON permutation explosion: just one nested JSON record could quickly balloon into a million rows, or a million columns, of indexed data to represent that information.
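To make the permutation explosion concrete, here is a minimal Python sketch. It is not ChaosSearch code; the function names and the sample event are hypothetical. It assumes the naive approach of expanding every array in a record into rows, which produces the cross product of all the arrays.

```python
import math
from itertools import product


def vertical_row_count(record):
    """Rows produced if every top-level array is expanded into rows."""
    return math.prod(len(v) for v in record.values() if isinstance(v, list))


def expand_all_arrays(record):
    """Naive expansion: one row per combination of array elements."""
    scalars = {k: v for k, v in record.items() if not isinstance(v, list)}
    arrays = {k: v for k, v in record.items() if isinstance(v, list)}
    rows = []
    for combo in product(*arrays.values()):
        row = dict(scalars)                    # scalar fields repeat on every row
        row.update(zip(arrays.keys(), combo))  # one element from each array
        rows.append(row)
    return rows


# Hypothetical event with three small arrays.
event = {
    "order_id": 1001,
    "items": ["book", "pen", "lamp"],
    "tags": ["gift", "rush"],
    "regions": ["us-east", "eu-west"],
}
rows = expand_all_arrays(event)
print(len(rows))  # 3 * 2 * 2 = 12 rows from a single record

# Three arrays of 100 elements each already reach a million rows.
big = {"a": list(range(100)), "b": list(range(100)), "c": list(range(100))}
print(vertical_row_count(big))  # 1000000
```

The row count is the product of the array lengths, so it grows multiplicatively with every additional array in the record.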

 

JSON Indexing and Processing

 

JSON FLEX® leverages the advantages of the Chaos Index® technology to uniquely represent the dimensionality of JSON. With JSON FLEX, administrators have the tools to selectively control how JSON source data is indexed, creating highly optimized Chaos Index data for in-depth querying at scale and offering flexible ways to unlock value for end-user analysts.

JSON FLEX tackles the JSON indexing and searching challenges from two important vectors:

  • Flexible Chaos Index® choices that can process and efficiently store JSON structures in the patented ChaosSearch indexed data format
  • Flexible Chaos Refinery® views with schema-on-read transformation features that empower analysts and end-users to specify, quickly search, and visualize their important business data


The patented ChaosSearch Index and data analysis solution includes proven features for filtering, indexing, and creating that compact Chaos Index data. The ChaosSearch core design champions born-in-the-cloud behaviors like scaling, high availability, and centralized processing. It keeps our services close to your cloud storage and follows the driving principle of always keeping your data inside your cloud storage.

When JSON events are in the mix, ChaosSearch adds powerful JSON-processing features.

 


Options to selectively apply JSON array expansion rules

 

Avoid the JSON permutation explosion, not by leaving important data out of source files, but by using the Chaos Index to flatten some arrays horizontally for storage, some vertically for filtering and aggregation, and some into JSON string blobs when the data might be valuable for search and query results.
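The three flattening choices can be pictured with a small Python sketch. This only illustrates the general idea and is not how the Chaos Index is implemented; the function names and the `field.N` column naming are assumptions made for the example.

```python
import json


def flatten_horizontal(record, field):
    """One row; each array element gets its own column (field.0, field.1, ...)."""
    row = {k: v for k, v in record.items() if k != field}
    for i, element in enumerate(record[field]):
        row[f"{field}.{i}"] = element
    return row


def flatten_vertical(record, field):
    """One row per array element; the other fields repeat on each row."""
    base = {k: v for k, v in record.items() if k != field}
    return [{**base, field: element} for element in record[field]]


def flatten_to_string(record, field):
    """One row; the whole array is kept as a single JSON string column."""
    row = dict(record)
    row[field] = json.dumps(record[field])
    return row


# Hypothetical log event with one array.
event = {"host": "web-1", "codes": [200, 404, 500]}

print(flatten_horizontal(event, "codes"))
# {'host': 'web-1', 'codes.0': 200, 'codes.1': 404, 'codes.2': 500}
print(len(flatten_vertical(event, "codes")))
# 3
print(flatten_to_string(event, "codes"))
# {'host': 'web-1', 'codes': '[200, 404, 500]'}
```

Horizontal flattening keeps the record as a single row, vertical flattening yields rows that are easy to filter and aggregate, and the string form preserves the data without expanding it at all.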

 

Rules to exclude (or include only) the target JSON file content

Keep source data intact to avoid costly re-pipelining to scale down content. Use Chaos Index rules to specify the arrays and properties you want to index within an object group, reduce index storage footprint, maximize scan performance, and filter out unneeded data. If other analysts wish to evaluate different or excluded arrays from the same source files, they can create their own object group and indexed data with different inclusion rules.
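As a rough illustration of include and exclude rules, the Python sketch below filters a nested record by dotted property path while leaving the source record untouched. The rule format, the helper names, and the sample event are hypothetical and do not reflect the actual ChaosSearch rule syntax.

```python
def walk(value, prefix=""):
    """Yield (dotted_path, leaf) pairs for nested dicts; arrays count as leaves."""
    if isinstance(value, dict):
        for key, child in value.items():
            path = f"{prefix}.{key}" if prefix else key
            yield from walk(child, path)
    else:
        yield prefix, value


def matches(path, rule):
    """A rule matches the path itself or anything nested beneath it."""
    return path == rule or path.startswith(rule + ".")


def select_fields(record, include=None, exclude=()):
    """Return only the paths allowed by the rules; the source record is not modified."""
    selected = {}
    for path, value in walk(record):
        if include is not None and not any(matches(path, r) for r in include):
            continue
        if any(matches(path, r) for r in exclude):
            continue
        selected[path] = value
    return selected


# Hypothetical source event.
event = {
    "user": {"id": 7, "email": "a@example.com"},
    "request": {"path": "/cart", "headers": ["h1", "h2"]},
    "debug": {"trace": "verbose output"},
}

# One analyst excludes noisy properties.
print(select_fields(event, exclude=["debug", "request.headers"]))
# {'user.id': 7, 'user.email': 'a@example.com', 'request.path': '/cart'}

# Another analyst indexes only the user properties from the same source.
print(select_fields(event, include=["user"]))
# {'user.id': 7, 'user.email': 'a@example.com'}
```

Because the rules are applied at indexing time and the source stays intact, two analysts can index the same files differently without changing the pipeline that produced them.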

 

Views with JSON transformations and schema-on-read features in the Refinery

 

Views can materialize columns from the content of JSON strings, make a JSON string field searchable with an Elastic nested query path, or apply JSON Array Transformation, which transforms a horizontally indexed array into a virtual vertical array. JSON Array Transformation lets you take advantage of the storage benefits of horizontal expansion and the analysis features of vertical expansion, all as a schema-on-read view materialization. Different business analysts can create their own views with different transformation rules to mine the insights they want from the same indexed data.
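A minimal Python sketch of these schema-on-read ideas follows. It is a conceptual illustration, not the Refinery API: the function names, the `field.N` column naming, and the stored row are all assumptions. The stored row is never rewritten; each view derives its own shape when it reads the data.

```python
import json
import re


def horizontal_to_vertical(row, field):
    """Read-time view: turn columns field.0, field.1, ... into one row per element."""
    pattern = re.compile(rf"^{re.escape(field)}\.(\d+)$")
    base, elements = {}, []
    for key, value in row.items():
        match = pattern.match(key)
        if match:
            elements.append((int(match.group(1)), value))
        else:
            base[key] = value
    # Sort by the numeric index so element order matches the original array.
    return [{**base, field: value} for _, value in sorted(elements)]


def materialize_from_json_string(row, field, key, column):
    """Read-time view: expose one property of a JSON string field as its own column."""
    parsed = json.loads(row[field])
    return {**row, column: parsed.get(key)}


# Hypothetical stored row: a horizontally flattened array plus a JSON string blob.
stored = {
    "host": "web-1",
    "codes.0": 200,
    "codes.1": 404,
    "payload": '{"user": "ann", "plan": "pro"}',
}

vertical_view = horizontal_to_vertical(stored, "codes")
print([r["codes"] for r in vertical_view])  # [200, 404]

plan_view = materialize_from_json_string(stored, "payload", "plan", "plan")
print(plan_view["plan"])  # pro
```

Each analyst could apply a different combination of such transformations to the same stored row, which is the practical benefit of doing the work at read time.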

 

Transformation Process To Get Value From JSON Data

 

It's a lot to take in, but we're excited about the JSON FLEX capabilities and how they can help organizations with JSON files stacking up in their cloud storage. Don't let your valuable JSON log and event data go unused. Don't spend critical time and storage budget duplicating data through log shippers to work around the JSON permutation explosion. Try JSON FLEX.

 


About the Author, Barbara O'Toole

Barb O'Toole is the Senior Technical Writer at ChaosSearch. Barb works with Engineering and Customer Success teams to create the information content to support the product. She loves learning about and testing the technology that she documents, and has a long history of working with industry-leading computing products. Barb unwinds with her family, friends, time on Cape Cod, and a healthy golf addiction. She holds a Bachelor of Science in Interdisciplinary Studies from WPI.