We go to great lengths to capture it, store it, organize it and, ultimately, use some of it! Data is all over the place within our organization, whether it exists in electronic or hardcopy format, and we constantly are looking for more of it. Data has become an essential component in our business toolbox. When faced with a business decision, we look for data to bolster our wisdom and experience. Within our information environment, however, the never-ending quest to use data efficiently and appropriately is constantly being challenged by the validity and source of the data.
Navigating the data
Most organizations today have many transaction systems to support their business processes. These include financial systems, production systems, maintenance management systems, customer service systems and the like. Transaction systems record events and subsequently store the information within their respective database environments. In some cases, many of these systems share the same database as a result of integration. In most organizations, though, islands of data and information often are the norm, as the transaction systems are truly stand-alone and never integrate with other systems. This is not necessarily a poor business process since information requirements vary within business functions.
Consider for a moment the information required to manage the maintenance organization on a day-to-day basis. This information includes work orders, spare parts and labor resources, to name a few. Do the folks in the accounts receivable department need to have access to this data? Probably not. However, this information is contained within the maintenance management system to enable the maintenance department to track and record their maintenance activities. Over time, this information continues to grow and grow within the database. At some point, some part of this data may be retrieved in the form of a report, chart or spreadsheet in order to examine trends or status. One of the primary roles of these transaction systems is to record and store data.
Now, imagine many transaction systems recording and storing data in different places. For example, every time you dial the telephone, a transaction is recorded: the date and time, the number called and the length of the call are just some of the data elements stored. Considering that there are many transactions being recorded from multiple transaction systems, how do businesses consolidate all this information and why?
The most common way to bring your data together is data warehousing. Simply put, a data warehouse is a collection of information within a single information environment, usually located on dedicated computer hardware. This type of strategy provides the user community with a common location to look for and retrieve information. Sounds easy and straightforward, but as we have learned, there are always challenges to overcome.
Dealing with the challenges
First and foremost, remember “garbage-in, garbage-out.” Data warehousing will not ensure the quality or validity of data. To a certain degree, this is one of the jobs of the transaction system, but the ultimate responsibility lies with the source and entry point of the data. As in any analysis and decision process, the supporting documentation and information must be of impeccable quality.
Secondly, be prepared for a significant effort at the beginning to initialize the data warehouse with all the appropriate data from the appropriate sources. It is sometimes easy to visualize this effort as filling a wheelbarrow with shovels of data from different piles of information created by multiple transaction systems. Done without a plan, what you end up with is a wheelbarrow full of stuff that has little or no meaning. It becomes the duty of those responsible for the data warehouse not only to define the appropriate sources, but also to define the appropriate relationships between the data elements so that the data carries meaning and retrieval performance can be optimized. This becomes an ongoing task as transaction systems are added, removed or upgraded.
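To make the idea of "defining relationships between data elements" concrete, here is a minimal sketch in Python using the standard sqlite3 module. Everything in it is an assumption made for illustration: the table names, the column names and the sample rows stand in for data that would come from a maintenance system and a parts inventory system. It is not a recommended warehouse design.

```python
import sqlite3


def build_warehouse():
    """Load sample rows from two hypothetical source systems into one database."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE parts (              -- hypothetical: from the inventory system
            part_id     TEXT PRIMARY KEY,
            description TEXT NOT NULL,
            unit_cost   REAL NOT NULL
        );
        CREATE TABLE work_orders (        -- hypothetical: from the maintenance system
            work_order_id INTEGER PRIMARY KEY,
            asset         TEXT NOT NULL,
            part_id       TEXT NOT NULL REFERENCES parts(part_id),
            quantity      INTEGER NOT NULL
        );
        -- An index on the shared key helps retrieval performance.
        CREATE INDEX idx_work_orders_part ON work_orders(part_id);
    """)
    conn.executemany(
        "INSERT INTO parts VALUES (?, ?, ?)",
        [("P-100", "Bearing", 40.0), ("P-200", "Drive belt", 15.0)],
    )
    conn.executemany(
        "INSERT INTO work_orders VALUES (?, ?, ?, ?)",
        [(1, "Pump A", "P-100", 2), (2, "Pump A", "P-200", 1), (3, "Conveyor 3", "P-200", 4)],
    )
    conn.commit()
    return conn


def parts_cost_by_asset(conn):
    """Join the two sources on part_id and total the parts cost per asset."""
    rows = conn.execute("""
        SELECT w.asset, SUM(w.quantity * p.unit_cost)
        FROM work_orders AS w
        JOIN parts AS p ON p.part_id = w.part_id
        GROUP BY w.asset
        ORDER BY w.asset
    """).fetchall()
    return dict(rows)
```

Without the shared part_id key, the two tables are just two piles in the wheelbarrow. With it, a question such as "what did we spend on parts for each asset?" becomes a single query.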
Implementing the appropriate tools to retrieve, analyze and format the information is another important component of utilizing any data warehouse or data storage configuration. It is essential that the capabilities of end users be considered when providing a reporting tool. End users should not need a degree in computer programming or computer science to be able to retrieve and present desired information.
“Data mining” is a term that’s frequently used to define the process of retrieving data. While purists may correctly insist that data mining is more than just retrieving data from a data source, the casual user thinks of data mining as the activity of extracting data from a database—whether it is from a data warehouse or a transactional system. Fortunately, there are many tools available today to perform this function, from spreadsheet software to sophisticated business intelligence toolsets. Presentation of retrieval results can vary, including reports, information dashboards, multi-dimensional charts, etc. Remember, all this capability requires quality data, training of users on the right tools and enough computing power to process the requests.
A question that is often asked is, why not simply point the data retrieval tools at the transaction data environment? Think about the transaction system you use in your job today and the last time someone (or you) attempted to retrieve a large amount of information (maybe accidentally). Remember the resulting decline in system performance, as evidenced by the moans and groans coming from an adjacent cubicle? The more users there are on the transaction system, the more often this problem is likely to occur. To keep it from happening, data warehouses typically are installed on their own computer, which can be sized appropriately for retrieval and analysis without competing with transaction processing.
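The separation described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two in-memory SQLite databases stand in for the transaction system and the warehouse, the calls table and its rows are hypothetical, and a real warehouse load would normally be selective rather than a full copy. The backup method is part of Python's standard sqlite3 module.

```python
import sqlite3


def snapshot_to_warehouse(transaction_db, warehouse_db):
    """Copy the current contents of the transaction database into the warehouse."""
    transaction_db.backup(warehouse_db)


# Hypothetical transaction system recording telephone calls.
transaction_db = sqlite3.connect(":memory:")
transaction_db.execute("CREATE TABLE calls (number TEXT, seconds INTEGER)")
transaction_db.executemany(
    "INSERT INTO calls VALUES (?, ?)",
    [("555-0100", 120), ("555-0101", 45)],
)
transaction_db.commit()

# Separate database that reporting tools query instead of the live system.
warehouse_db = sqlite3.connect(":memory:")
snapshot_to_warehouse(transaction_db, warehouse_db)

# A heavy report now runs against the copy, not the transaction system.
total = warehouse_db.execute("SELECT SUM(seconds) FROM calls").fetchone()[0]
```

Note that the warehouse copy reflects the moment the snapshot was taken. Calls recorded afterward do not appear until the next load.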
Security of information is another challenge facing the implementation of a data warehouse and data access tools. Where the transaction systems have their own security as to who can see certain data and perform certain tasks, this same capability has to be implemented within the data warehouse for obvious reasons.
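As a rough illustration of carrying access rules over into the warehouse, the Python sketch below filters a record down to the fields a given role is allowed to see. The role names, field names and permission table are all hypothetical. In practice, these rules would be derived from the security settings of each source transaction system.

```python
# Hypothetical mapping of roles to the fields each role may see.
VISIBLE_FIELDS = {
    "maintenance": {"work_order_id", "asset", "labor_hours"},
    "finance": {"work_order_id", "labor_cost"},
}


def filter_row_for_role(row, role):
    """Return only the fields of `row` that `role` may see (nothing for unknown roles)."""
    allowed = VISIBLE_FIELDS.get(role, set())
    return {field: value for field, value in row.items() if field in allowed}


row = {"work_order_id": 1, "asset": "Pump A", "labor_hours": 3.5, "labor_cost": 210.0}
finance_view = filter_row_for_role(row, "finance")
```

Defaulting to an empty set for unknown roles means a user who has not been given explicit permissions sees nothing, which is the safer failure mode.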
Data warehouse questions
The previous paragraphs have identified some of the considerations for a data warehouse environment. The greatest advantage of a data warehouse is a common set of data elements for use by the organization. This eliminates “my data doesn’t agree with your data” situations since all of the information is coming from the same place. Of course, timing is constantly an issue: the underlying data is continually changing, so anything retrieved reflects only a single point in time. Typically, where there is a great quantity of information from numerous transaction systems with many users, the benefits of data warehousing far outweigh the challenges.
What about small- to medium-sized organizations? Does the data warehouse strategy make sense? Such questions should be considered on a case-by-case basis.
For many small companies, data warehousing is probably unnecessary because the volume of data and the number of users are low. For medium-sized companies, the decision depends on the same two factors: the larger the data volume and the greater the number of users retrieving the data, the more appropriate the strategy becomes.
Using the right tools
Today, there are many tools available within the transaction software to retrieve and present data. Don’t let that, however, obscure the fact that evaluating and analyzing the data is still the most important activity.
Remember, too, that trends may be more important than snapshots. For example, losing a baseball game in the middle of a major league season is only a single event—losing eight consecutive games in the middle of the season is a trend that requires corrective action. Similarly, within the maintenance arena, schedule compliance being low for a week is not likely cause for concern. On the other hand, a trend downward for several weeks does require attention.
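The difference between a snapshot and a trend can be expressed as a simple check. The Python sketch below counts how many consecutive week-over-week declines lead up to the most recent reading and flags the series only when that run is long enough. The three-week threshold and the sample schedule compliance figures are assumptions chosen for illustration, not a standard.

```python
def trailing_declines(values):
    """Count consecutive week-over-week decreases ending at the latest value."""
    count = 0
    # Walk backward through adjacent (previous, latest) pairs.
    for previous, latest in zip(reversed(values[:-1]), reversed(values[1:])):
        if latest < previous:
            count += 1
        else:
            break
    return count


def needs_attention(values, min_declines=3):
    """Flag a series only when it has declined for `min_declines` weeks in a row."""
    return trailing_declines(values) >= min_declines


# Hypothetical weekly schedule compliance percentages.
one_bad_week = [90, 91, 92, 70]
steady_slide = [90, 88, 85, 80]
```

Here, needs_attention(one_bad_week) returns False even though the latest reading is the lowest of all, while needs_attention(steady_slide) returns True. A single low week is a snapshot. Three declines in a row is a trend.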
“Information overload” and “analysis paralysis” are terms to keep in mind when determining data and information requirements. Collecting, storing and retrieving data that does not provide value is a waste of valuable resources. To get the most out of information systems, improve the quality of the data that supports valuable decisions, and use appropriate tools to improve and enhance the performance of the organization. Although effort is noble, in the end, it’s results that count!