About Gadi Yedwab

Gadi Yedwab is the founder and CEO of Explore Analytics. Prior to founding Explore Analytics, Gadi served as VP of Product Development at ServiceNow, a leading provider of cloud-based services that automate enterprise IT operations. Prior to ServiceNow, Gadi Yedwab held executive positions at Quest Software and Brio Technology (which was acquired by Hyperion and then by Oracle). You can reach Gadi on Twitter at @GYedwab or using the Feedback Form.

Broadening the Reach of Self-Service BI

The necessity of self-service is obvious once you realize that traditional BI has limited reach within user communities. For example, BI dashboards are typically tailored to the needs of decision makers and leave out a broader group of analytically-minded users who could leverage data to innovate and make improvements. For small companies and for teams with limited budgets, self-service is often the only viable option because current BI approaches require people’s time and expertise to set up.

The proliferation of spreadsheets as tools for data analysis is proof that existing needs are unmet.

The current approaches to self-service often suffer from the same problems that limit the reach of BI to broader user communities. This article focuses on these problems and discusses a new approach that can significantly broaden the reach of BI.

The Problems with Current Approaches to Self-Service BI

  • IT organizations often concentrate their efforts on the most strategic data, while leaving a lot of useful data outside the scope.
  • For performance reasons, IT often opts for data warehousing. This approach is expensive and therefore has limited reach. Small companies lack the resources and find this approach cost-prohibitive.
  • Providing self-service by periodically delivering data sets for analysis in spreadsheets or desktop tools does not satisfy the need for real-time data. Latency of information is often cited by users as the major drawback of their BI solution.
  • Desktop BI tools and spreadsheet downloads can be a security risk when users keep data on laptops, or send it via email. This approach also makes it hard to share and collaborate in the analysis.

The Spreadsheet as a Self-Service BI Tool

Let’s admit it: the number one tool for self-service BI is the spreadsheet. It’s been that way since the invention of the spreadsheet, and it still is. The most typical scenario is exporting data from an application and then analyzing it in Excel. The main drawback of this approach is that it’s outside the skill-set of most users.

Sure, having the data in a spreadsheet is better than having nothing, but using Excel for BI has serious limitations. Most users do not have the necessary skills to analyze data in Excel, especially if the data resides in more than one table. Even for users who are skilled in Excel, the data quickly becomes stale and there is no good way of collaborating with other users in the analysis.

A New Approach to Self-Service

The new approach minimizes the need for data warehousing, thereby reducing costs and providing real-time data. It uses cloud-based solutions to facilitate collaboration and sharing. Moreover, cloud-based tools can bring the required expertise and cost down to within the means of small companies and teams inside large companies.

The premise is simple: if a solution can be useful to small companies with limited resources, then it can be very useful for all the under-served constituencies inside large companies. The spreadsheet already proved that, but we can do much better than that.

Reducing the Need for Data Warehousing

For more than two decades the common wisdom has been to keep ad-hoc query away from production systems. This is generally still a good idea. However, there are good reasons to reconsider that widely accepted notion.

A good self-service BI tool can control and prevent runaway queries.

Explore Analytics, for example:

  • Only joins tables on the primary key
  • Puts a limit on every query to prevent it from returning too many rows
  • Pushes all the filtering and aggregation to the data source, thus eliminating the need to pull large query results
  • Controls the number of queries that concurrently execute against a data source
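
As a rough illustration of three of the guardrails above (row limit, push-down, and concurrency control), here is a minimal Python sketch using the standard library's sqlite3 and threading modules. It is not Explore Analytics' implementation: the function name run_guarded, the limits MAX_ROWS and MAX_CONCURRENT, and the incidents table are all hypothetical, and the primary-key join rule is not shown.

```python
import sqlite3
import threading

MAX_ROWS = 1000       # hypothetical cap on rows returned by any query
MAX_CONCURRENT = 2    # hypothetical cap on queries running at the same time

# Each running query holds one slot; extra callers wait for a free slot.
_slots = threading.BoundedSemaphore(MAX_CONCURRENT)

def run_guarded(conn, sql, params=()):
    """Run a SELECT with a row cap while holding a concurrency slot."""
    # Wrapping the query lets the cap apply no matter what the query does.
    capped = f"SELECT * FROM ({sql}) LIMIT {MAX_ROWS}"
    with _slots:
        return conn.execute(capped, params).fetchall()

# Hypothetical sample data standing in for a production table.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE incidents (id INTEGER PRIMARY KEY, priority TEXT, is_open INTEGER)"
)
conn.executemany(
    "INSERT INTO incidents (priority, is_open) VALUES (?, ?)",
    [("high", 1), ("high", 0), ("low", 1), ("low", 1), ("low", 0)],
)

# The WHERE and GROUP BY run inside the database, so only one small
# summary row per priority comes back instead of the raw rows.
summary = run_guarded(
    conn,
    "SELECT priority, COUNT(*) AS n FROM incidents WHERE is_open = ? GROUP BY priority",
    (1,),
)
print(summary)
```

The point of the design is that the tool, not the user, decides how much work a query may impose on the data source.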

Modern database servers eliminate three reasons why a bad query would previously bring a database to its knees.

  • Having multiple CPU cores, the database performs well even if several cores are momentarily tied up.
  • Large portions of the database reside in memory and a full-table scan can be done without noticeable impact on other transactions.
  • Liberally creating indexes doesn’t come with the performance penalty that it had a decade ago.

While “Big Data” is an important category, a lot of useful data resides in tables with fewer than a few million rows. Running a query to summarize data across a million rows can complete in a few seconds. That wasn’t the case a decade ago.

Using the Cloud

Having a centralized web-based self-service BI solution allows users to share and publish their analysis. It allows teams to leverage the diverse strengths of individuals and review the analysis to increase its accuracy. Analytically-minded people can create data analysis and share it with the rest of the team.

By keeping data sets and reports securely in the cloud, companies can avoid distributing data to laptops and desktops and passing it around in email attachments.

If you’re thinking that the same can be accomplished using an in-house web-based solution, you may be right, but you should consider the cost and expertise that’s required to build and support this solution. A cloud solution can greatly reduce the expertise that’s needed as well as the direct costs of the service. It then becomes feasible even for small companies or teams.

IT Call to Action

IT organizations should identify data sources for real-time access. For other data sources, consider publishing data sets to the cloud. Then provide a cloud-based tool such as Explore Analytics to deliver self-service analysis to users and unleash their creativity.

Application Vendor Call to Action

Application vendors should enable real-time data access by providing web-services APIs that allow ad-hoc query including joining data, filtering and aggregation. Remember that if you allow tools to push the filtering and aggregation to your application, then they’d have no need to pull large results in real time.
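
To make the idea concrete, here is one possible shape for such an ad-hoc query request, sketched in Python. Everything in it is an assumption for illustration: the request fields (table, group_by, aggregate, measure, filters), the build_query function, and the orders table are hypothetical and not part of any real vendor API. Joins are left out to keep the sketch short.

```python
ALLOWED_AGGREGATES = {"count", "sum", "avg", "min", "max"}

def build_query(spec):
    """Translate a hypothetical ad-hoc query request into parameterized SQL."""
    # Accept only plain identifiers for names, since names cannot be bound
    # as SQL parameters the way filter values can.
    names = [spec["table"], spec["group_by"], spec["measure"], *spec["filters"]]
    if not all(name.isidentifier() for name in names):
        raise ValueError("table and column names must be plain identifiers")
    if spec["aggregate"] not in ALLOWED_AGGREGATES:
        raise ValueError("unsupported aggregate")

    # Filter values travel as bound parameters, never as SQL text.
    where = " AND ".join(f"{col} = ?" for col in spec["filters"])
    sql = (
        f"SELECT {spec['group_by']}, {spec['aggregate']}({spec['measure']}) "
        f"FROM {spec['table']}"
        + (f" WHERE {where}" if where else "")
        + f" GROUP BY {spec['group_by']}"
    )
    return sql, list(spec["filters"].values())

# A client asks for shipped order totals per region; the application
# filters and aggregates, and returns one row per region.
request = {
    "table": "orders",
    "group_by": "region",
    "aggregate": "sum",
    "measure": "amount",
    "filters": {"status": "shipped"},
}
sql, params = build_query(request)
print(sql, params)
```

Because the request names the grouping and the aggregate, the application can answer with a handful of summary rows rather than shipping every matching record to the BI tool.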

About Explore Analytics

Explore Analytics is a SaaS BI tool for data analysis, visualization, and reporting. Its approach is to meet the needs of currently under-served users by designing a self-service solution that is usable by small companies, individuals, and small teams with limited expertise and resources.

2012 Presidential Elections Popular Vote

The following chart is a good example of displaying diverging data using color. We set “50%” as the neutral point and show higher values in one color and lower values in another color. We show “50%” as white and the color gets darker as values diverge from 50%. The value being measured in this chart is the percentage of votes for Barack Obama. You can use the mouse to hover over a state (or, on a touch device, use your finger to touch) to see additional details.
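
The diverging scale described above can be sketched as a simple linear blend from white toward one of two end colors. This is only an illustration, not the code behind the chart: the function name diverging_color and the two end colors are assumptions, and the sketch assumes the neutral point lies strictly between 0 and 100.

```python
def diverging_color(pct, neutral=50.0, low_rgb=(178, 24, 43), high_rgb=(33, 102, 172)):
    """Map a percentage (0-100) to a hex color: white at `neutral`,
    darker as the value diverges from it. End colors are hypothetical."""
    pct = max(0.0, min(100.0, pct))  # clamp out-of-range input
    if pct >= neutral:
        t = (pct - neutral) / (100.0 - neutral)  # 0 at neutral, 1 at 100%
        end = high_rgb
    else:
        t = (neutral - pct) / neutral            # 0 at neutral, 1 at 0%
        end = low_rgb
    # Blend each channel linearly from white (255) toward the end color.
    r, g, b = (round(255 + (c - 255) * t) for c in end)
    return f"#{r:02x}{g:02x}{b:02x}"

print(diverging_color(50))   # neutral point: white
print(diverging_color(58))   # slightly above neutral: a light tint
print(diverging_color(30))   # below neutral: a tint of the other color
```

In practice a chart might stretch the scale to the observed range of values so that small differences around 50% remain visible; the fixed 0 to 100 range here keeps the example short.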

This chart is powered by Explore Analytics.