OnLine Analytical Processing (OLAP)
When should we consider implementing OLAP technology?
Basically, any business process that requires us to analyze( roll up,drill down etc.)transactional data across a variety of categories is an excellent application of OLAP technology.
- Akey tenet of OLAP is that users should see consistent response times for each view, or slice, of the data they request. Because data is collected at the detail level only, the information summary usually computed in advance. These precomputed values,or aggregations, are the basis of the OLAP performance gains.
- OLAP services includes a middle_tier server that allows users to perform sophisticated analysis on large volumes of data with exceptional performance.
- Another feature of OLAP services is PivotTable service , which allows users to conduct analyses while disconnected from the corporate network.
- OLAP services organizes data from a data warehouse into multidimensional cubes with pre calculated summary information to provide a answers to complex analytical queries.
- OLAp services can access source data in any supported OLE DB data provider, which including not only SQL server but also a large number of desktop and server databases including Microsoft Access,MicroSoft FoxPro,Oracle,Sybase and Informix.
Performance of OLAP depends up on these things
- Aggregations
- Materializing aggregations usually leads to a faster query response since we probably need to do less work to answer a request for cell values.
- Partitions
- Partitions give you the ability to choose different storage strategies to optimize the tradeoff between processing and querying performance.
- Data slices on partitions
- Setting a data slice is an efficient way to avoid querying irrelevent partitions.
Among the key features of OLAP services are
- Ease of use provides by user interface wizards
- A flexible,robust data model for cube definition and storage
- Write enabled cubes for "what if " scenarios analysis
- Scalable architecture that provides a variety of storage scenarios and an automated solution to the "data-explosion syndrome" that plagues traditional OLAP technologies
- Integration of administration tools,security,data sources and client/server caching.
- Widely supported APIs and open architecture to support custom applications
OLAP and Data WarehouseOLAP and Data Warehouse

Data Warehouse
When should we consider a data warehousing solution ?
When users are requesting access to large amounts of historical information for reporting purposes, we should strongly consider a ware house or mart. The user will be benefit when the information is organized in an efficient manner for this type of access
A data warehouse is often used as the basis for a decision support system. It is designed to overcome problems encountered when an organization attempts to perform strategic analysis using the same database that is used for online transaction processing (OLTP).Data warehouses are read-only,integrated databases designed to answer comparative and what if questions.
- OLAP is a key component of data warehousing, and OLAP Services provides essential functionality for a wide array of applications ranging from reporting to advanced decision support.
- Unlike OLTP systems that store data in a highly normalized fashion, the data in the data warehouse is stored in a very de normalized manner to improve query performance.
- Data warehouses often use star and snowflake schemas to provide the fastest possible response times to complex queries, and the basis for aggregations managed by OLAP tools.
Difficulties often encountered when OLTP databases are used for online analysis include the following
- Analysts do not have the technical expertise required to create ad hoc queries against the complex data structure.
- Analytical queries that summarize large volumes of data adversely affect the ability of the system to respond to online transactions.
- System performance when responding to complex analysis queries can be slow or unpredictable, providing inadequate support to online analytical users.
- Constantly changing data interferes with the consistency of analytical information.
- Security becomes more complicated when online analysis is combined with online transaction processing.
Data warehousing provides one of the keys to solving these problems, organizing data for the purpose of analysis. Data warehouses
- Can combine data from heterogeneous data sources into a single homogenous structure.
- Organize data in simplified structures for efficiency of analytical queries rather than for transaction processing.
- Contain transformed data that is valid, consistent, consolidated, and formatted for analysis.
- Provide stable data that represents business history.
- Are updated periodically with additional data rather than frequent transactions.
- Simplify security requirements.
- Provide a database organized for OLAP rather than OLTP.
Data Warehousing Architecture
Two basic types of data warehouse architecture exist: enterprise data warehouses and data marts.
- The enterprise data warehouse contains enterprise-wide information integrated from multiple operational data sources for consolidated data analysis.
- The data mart contains a subset of enterprise-wide data that is built for use by an individual department or division in an organization.
Data Granularity
A data warehouse typically stores data in different levels of granularity or summarization, depending on the data requirements of the business. If an enterprise needs data to assist strategic planning, then only highly summarized data is required. The lower the level of granularity of data required by the enterprise, the higher the number of resources (specifically data storage) required to build the data warehouse. The different levels of summarization in order of increasing granularity are:
- Current operational data
- Historical operational data
- Aggregated data
- Metadata
The components of schema design are dimensions, keys, and fact and dimension tables.
Fact tables
- Contain data that describes a specific event within a business, such as a bank transaction or product sale. Alternatively, fact tables can contain data aggregations, such as sales per month per region. Except in cases such as product or territory realignments, existing data within a fact table is not updated; new data is simply added.
- Because fact tables contain the vast majority of the data stored in a data warehouse, it is important that the table structure be correct before data is loaded. Expensive table restructuring can be necessary if data required by decision support queries is missing or incorrect.
- The characteristics of fact tables are:
- Many rows; possibly billions
- Primarily numeric data; rarely character data.
- Multiple foreign keys (into dimension tables).
- Static data.
Dimension tables
- Contain data used to reference the data stored in the fact table, such as product descriptions, customer names and addresses, and suppliers. Separating this verbose (typically character) information from specific events, such as the value of a sale at one point in time, makes it possible to optimize queries against the database by reducing the amount of data to be scanned in the fact table.
- Dimension tables do not contain as many rows as fact tables, and dimensional data is subject to change, as when a customer's address or telephone number changes. Dimension tables are structured to permit change.
- The characteristics of dimension tables are:
- Fewer rows than fact tables; possibly hundreds to thousands.
- Primarily character data.
- Multiple columns that are used to manage dimension hierarchies.
- One primary key (dimensional key).
- Updatable data.
Dimensions
- Are categories of information that organize the warehouse data, such as time, geography, organization, and so on. Dimensions are usually hierarchical in that one member may be a child of another member
Dimensional keys
- Are unique identifiers used to query data stored in the central fact table
Changes in the Data Warehouse
- Data is usually added periodically to the data warehouse to include more recent information about the organization's business activities.
- Changes to data already in the data warehouse are less frequent and usually made only to incorporate corrections to errors discovered in the source from which the data was extracted, or to restructure data due to organizational changes.
- Structural changes to the data warehouse design typically are the least common.
- Referential integrity must be maintained when data warehouse data is added, changed, or deleted. Loss of referential integrity can cause errors during cube processing, fact table records to be bypassed, or result in inaccurate OLAP information.
Creating the informational data, that is, the data warehouse, from the operational systems is a key part of the overall data warehousing solution. Building the informational database is done with the use of transformation or propagation tools . These tools not only move the data from multiple operational systems, but often manipulate the data into a more appropriate format for the warehouse. This could mean:
- The creation of new fields that are derived from existing operational data
- Summarizing data to the most appropriate level needed for analysis
- Denormalizing the data for performance purposes
- Cleansing of the data to ensure that integrity is preserved
Example
Now we will consider an example.Consider a site which deals with sales of different products in different regions.The final thing which we want to analyze is, how much sales happened in specific region, one particular product movement in particular region etc. Figure (1) shows some of the tables in OLTP.
Figure(2) Shows the data warehouse design. For transferring data from OLTP to dataware house we can use Data transformation service(DTS) or procedures which will copy the needed information from OLTP to data warehouse. Here in the OLTP we have a date on which the transaction had happen. We have to transform it into period information as shown in the figure(2).

Figure(1): Some of the OLTP tables

Figure(2): Data warehouse Design
From this data warehouse we can create the OLAP cubes using OLAP cube Wizard.figure(3) shows the OLAP cube wizard.After creating a cube, we should process it. After that we can browse the cube for data. Figure(4) shows the browsing tool.

Figure(3): The Cube Wizard

Figure(4): Browsing the Cube
Querying the OLAP Cube Data
Microsoft SQL Server services provides an architecture for access to multidimensional data. Through OLE DB for OLAP, a Pivot Table Service provides client acces to this multidimensional OLAP data. For expressing queries to this data, OLE DB for OLAP employs a full edged, highly functional syntax : multidimensional expressions(MDX). Let's start by outlining one of the simplest forms an MDX expression :
SELECT axis specification ON COLUMNS axis specification ON ROWS FROM cube_name WHERE slicer_specification
Example
This visual Basic project demonstrates the basics of using ADO MD to access cube data. It displays member captions for columns and row headers,then displays formatted values of specific cells with in the cellset.
'----------------------------------
' PROJECT
'----------------------------------
Private Sub cmdCellSettoDebugWindow_Click()
Dim cat As New ADOMD.Catalog
Dim cst As New ADOMD.Cellset
Dim i As Integer
Dim j As Integer
Dim strServer As String
Dim strSource As String
Dim strColumnHeader As String
Dim strRowText As String
On Error GoTo Error_cmdCellSettoDebugWindow_Click
Screen.MousePointer = vbHourglass
'*-----------------------------------
'* Set Server to Local Host
'*-----------------------------------
strServer = "LOCALHOST"
'*-----------------------------------
'* Set MDX query string Source
'*-----------------------------------
strSource = strSource & "SELECT "
strSource = strSource & "{[Measures].members} ON COLUMNS,"
strSource = strSource & _
"NON EMPTY [Store].[Store City].members ON ROWS"
strSource = strSource & " FROM Sales"
'*-------------------------------------
'* Set Active Connection
'*-------------------------------------
cat.ActiveConnection = "Data Source=" & strServer & _
";Provider=msolap;"
'*--------------------------------------
'* Set Cell Set source to MDX query string
'*--------------------------------------
cst.Source = strSource
'*---------------------------------------
'* Set Cell Sets active connection to current connection
'*---------------------------------------
Set cst.ActiveConnection = cat.ActiveConnection
'*----------------------------------------
'* Open Cell Set
'*----------------------------------------
cst.Open
'*---------------------------------------
'* Allow space for Row Header Text
'*---------------------------------------
strColumnHeader = vbTab & vbTab & vbTab & vbTab & vbTab & vbTab
'*------------------------------------------
'* Loop through Column Headers
'*------------------------------------------
For i = 0 To cst.Axes(0).Positions.Count - 1
strColumnHeader = strColumnHeader & _
cst.Axes(0).Positions(i).Members(0).Caption & vbTab & _
vbTab & vbTab & vbTab
Next
Debug.Print vbTab & strColumnHeader & vbCrLf
'*---------------------------------------------
'* Loop through Row Headers and Provide data for each row
'*---------------------------------------------
strRowText = ""
For j = 0 To cst.Axes(1).Positions.Count - 1
strRowText = strRowText & _
cst.Axes(1).Positions(j).Members(0).Caption & vbTab & _
vbTab & vbTab & vbTab
For k = 0 To cst.Axes(0).Positions.Count - 1
strRowText = strRowText & cst(k, j).FormattedValue & _
vbTab & vbTab & vbTab & vbTab
Next
Debug.Print strRowText & vbCrLf
strRowText = ""
Next
Screen.MousePointer = vbDefault
Exit Sub
Error_cmdCellSettoDebugWindow_Click:
Beep
Screen.MousePointer = vbDefault
MsgBox "The Following Error has occurred:" & vbCrLf & _
Err.Description, vbCritical, " Error!"
Exit Sub
End Sub
'----------------------------------------
' END OF THE PROJECT
'----------------------------------------
References
Sites
Books
- Microsoft SQL Server 7.0 DBA and Survival Guide SAMS,TechMedia

