Outrigger is Kimball terminology for a dimension, such as date, that is conformed and can be used to model a many-to-one relationship between dimensions, one layer removed. In the previous example, you learned about the transactional fact. A transactional fact is a Fact table that has one record per transaction; this type of Fact table usually has the most detailed grain. There is also another type of fact: the snapshot Fact table.
In a snapshot fact, each record is an aggregation of transactional records over a snapshot period of time. For example, consider financial periods; you can create a snapshot Fact table with one record for each financial period, and the details of the transactions will be aggregated into that record.
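The aggregation from transactional records into snapshot records can be sketched in Python (a minimal illustration with hypothetical periods and amounts, not the book's implementation):

```python
from collections import defaultdict

# Hypothetical transactional fact rows: (period_key, amount),
# one row per transaction.
transactions = [
    ("2024-Q1", 100.0), ("2024-Q1", 250.0),
    ("2024-Q2", 75.0),  ("2024-Q2", 125.0), ("2024-Q2", 50.0),
]

def build_snapshot(rows):
    """Aggregate transactional fact rows into one record per period."""
    snapshot = defaultdict(lambda: {"total": 0.0, "count": 0})
    for period, amount in rows:
        snapshot[period]["total"] += amount
        snapshot[period]["count"] += 1
    return dict(snapshot)

print(build_snapshot(transactions))
# {'2024-Q1': {'total': 350.0, 'count': 2},
#  '2024-Q2': {'total': 250.0, 'count': 3}}
```

The snapshot table trades detail for speed: the per-transaction rows are gone, but period-level dashboards read one row per period instead of scanning every transaction.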
Transactional facts are a good source for detailed and atomic reports, and they are also good for aggregations and dashboards. Snapshot Fact tables provide a very fast response for dashboards and aggregated queries, but they don't cover detailed transactional records. Based on your requirement analysis, you can create both kinds of facts or only one of them. There is also a third type of Fact table, called the accumulating Fact table, which is useful for storing processes and activities, such as order management.
You can read more about the different types of Fact tables in The Data Warehouse Toolkit by Ralph Kimball (Wiley), which was referenced earlier in this chapter. We've explained that Fact tables usually contain the FKs of dimensions and some measures. However, there are times when you require a Fact table without any measure. These types of Fact tables are usually used to show the non-existence of a fact. For example, assume that the sales business process runs promotions as well, and you have a promotion dimension.
So, each entry in the Fact table shows that a customer X purchased a product Y on a date Z from a store S while the promotion P (such as the new year's sale) was on. This Fact table covers every requirement that queries information about the sales that happened, or in other words, the transactions that occurred. However, there are times when a promotion is on but no transaction happens!
This is a valuable analytical report for decision makers because they can see the situation and investigate what was wrong with a promotion that didn't produce sales. So, this is an example of a requirement that the existing Fact table, with its sales amount and other measures, doesn't fulfill. The solution is a Fact table that doesn't have any fact or measure related to it; it just has FKs to dimensions.
However, it is very informative because it tells us on which dates there was a promotion at specific stores on specific products. We call this a Factless Fact table or Bridge table. Using examples, we've explored the usual dimensions, such as customer and date. When a dimension participates in more than one business process and serves different data marts, such as date, it is called a conformed dimension.
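The promotions-without-sales question above amounts to a set difference between the factless promotion events and the actual sales facts. Here is a minimal Python sketch (all key values are hypothetical, not from the book):

```python
# Hypothetical dimension keys: (date_key, store_key, product_key, promotion_key).
# Factless fact: promotion P was active at store S for product Y on date Z.
promotion_events = {
    (20240101, 1, 10, 5),
    (20240101, 1, 11, 5),
    (20240102, 2, 10, 6),
}
# Regular sales fact: a sale actually occurred under these keys.
sales_facts = {
    (20240101, 1, 10, 5),
}

# Promotions that produced no sales = promotion events minus sales events.
promotions_without_sales = promotion_events - sales_facts
print(sorted(promotions_without_sales))
# [(20240101, 1, 11, 5), (20240102, 2, 10, 6)]
```

In SQL terms this would be an anti-join between the two Fact tables; the point is that the factless table carries the information needed to detect the non-event.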
Sometimes, a dimension is required more than once in the same Fact table. For example, in the FactSales table, you may want to store the order date, shipping date, and transaction date. All three of these columns point to the date dimension. In this situation, we won't create three separate dimensions; instead, we will reuse the existing DimDate three times under three different names.
So, the date dimension literally plays the role of more than one dimension; this is why we call such dimensions role-playing dimensions. There are other types of dimensions with some differences, such as the junk dimension and the degenerate dimension. The junk dimension is used for dimensions that are very narrow (with few member records) and that are used in almost only one data mart (that is, not conformed).
For example, status dimensions are good candidates for a junk dimension. If you create a status dimension for each situation in each data mart, then you will probably end up with more than ten status dimensions with fewer than five records in each. The junk dimension is a solution that combines such narrow dimensions into one bigger dimension. You may or may not use a junk dimension in your data mart: using a junk dimension reduces readability, while not using it increases the number of narrow dimensions.
So, its usage is decided during the requirement analysis phase and the dimensional modeling of the star schema. A degenerate dimension is another type of dimension, one that is not a separate dimension table. In other words, a degenerate dimension doesn't have a table; it sits directly inside the Fact table.
Assume that you want to store the transaction number (a string value). Where do you think would be the best place to add that information? You might create another dimension, enter the transaction number there, assign a surrogate key, and use that surrogate key in the Fact table. This is not an ideal solution because that dimension would have exactly the same grain as your Fact table; the number of records in your sales transaction dimension would equal that of the Fact table, leaving you with a very deep dimension table, which is not recommended.
On the other hand, you cannot think about another attribute for that dimension because all attributes related to the sales transaction already exist in other dimensions connected to the fact.
So, instead of creating a dimension with the same grain as the fact and with only one column, we leave that column, even though it is a string, inside the Fact table. This type of dimension is called a degenerate dimension.
Now that you understand dimensions, it is a good time to go into more detail about one of the most challenging concepts of data warehousing: the slowly changing dimension (SCD). A dimension's attribute values may change, and depending on the requirement, you will take different actions in response to the change. Because changes in a dimension's attribute values happen only occasionally, this is called the slowly changing dimension. SCD is split into different types depending on the action to be taken after a change.
In this section, we only discuss types 0, 1, and 2. Type 0 doesn't accept any changes. Let's assume that the Employee Number is inside the Employee dimension. The Employee Number is the business key, and it is an important attribute for ETL because ETL distinguishes new employees from existing employees based on this field, so we don't accept any changes in this attribute. This means that SCD type 0 is applied to this attribute. Sometimes, a value may be typed wrongly in the source system, such as a first name, and it is likely that someone will come and fix it with a change.
In such cases, we will accept the change, and we won't need to keep the historical information (the previous name), so we simply replace the existing value with the new value.
This type of SCD is called type 1. The following screenshot shows how type 1 works. Sometimes, however, it is a requirement to maintain historical changes. For example, consider this situation: a customer recently changes their city from Seattle to Charlotte. You cannot use type 0 because it is likely that someone will change their city of residence. If you behave like type 1 and update the existing record, then you will lose the information about the customer's purchases made while they were in Seattle, and all entries will show them as a customer from Charlotte.
So, the requirement of keeping historical versions resulted in the third type of SCD, which is type 2. Type 2 is about maintaining historical changes. The way to keep historical changes is through a couple of metadata columns: FromDate and ToDate. Each new customer is imported into DimCustomer with FromDate set to the start date, and ToDate is left as null (or a far-future default value). If a change happens to the city, the existing record's ToDate is set to the date of change, and a new record is created as an exact copy of the previous record, with the new city, a new FromDate (the date of change), and the ToDate field left as null.
Using this solution, to find the latest and most up-to-date member information, you just need to look for the member record whose ToDate is null.
To fetch historical information, you search within the specified time span for the historical record. The following screenshot shows an example of SCD type 2. There are other types of SCD that are based on combinations of the first three types and cover other kinds of requirements. You can read more about the different types of SCD and methods of implementing them in The Data Warehouse Toolkit, referenced earlier in this chapter.
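The FromDate/ToDate mechanics of type 2 can be sketched in Python (a minimal illustration only, not the book's implementation; the DimCustomer rows, column names, and the apply_scd2 helper are hypothetical):

```python
from datetime import date

# Hypothetical DimCustomer rows; ToDate of None marks the current version.
dim_customer = [
    {"CustomerKey": 1, "CustomerID": "C100", "City": "Seattle",
     "FromDate": date(2020, 1, 1), "ToDate": None},
]

def apply_scd2(dim, customer_id, column, new_value, change_date):
    """Type 2 change: expire the current record and insert a new version."""
    current = next(r for r in dim
                   if r["CustomerID"] == customer_id and r["ToDate"] is None)
    if current[column] == new_value:
        return                                 # nothing changed
    current["ToDate"] = change_date            # close the old version
    new_row = dict(current, ToDate=None)       # exact copy, open-ended
    new_row[column] = new_value
    new_row["FromDate"] = change_date          # new validity window starts here
    new_row["CustomerKey"] = max(r["CustomerKey"] for r in dim) + 1
    dim.append(new_row)

apply_scd2(dim_customer, "C100", "City", "Charlotte", date(2024, 6, 1))
current = [r for r in dim_customer if r["ToDate"] is None]
print(current[0]["City"])  # Charlotte
```

After the change, the dimension holds two rows for the same business key: the expired Seattle row (its ToDate set to the change date) and the open Charlotte row, so historical facts still join to the version that was valid at the time.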
In this chapter, you learned what Business Intelligence is and what its components are. You studied the requirement for BI systems, and you saw the solution architecture to solve the requirements. Then, you read about data warehousing and the terminologies in dimensional modeling.
If you come from a DBA or database developer background and are familiar with database normalization, then you know that in dimensional modeling, you should avoid normalization in some parts and instead design a star schema. You've learned that the Fact table holds numeric and additive values, while descriptive information is stored in dimensions. You've learned about different types of facts, such as transactional, snapshot, and accumulating, and also about different types of dimensions, such as outrigger, role-playing, and degenerate.
Data warehousing and dimensional modeling together constitute the most important part of the BI system, which is sometimes called the core of the system. Reza Rad has more than 10 years of experience in databases and software applications. Most of his work experience is in data warehousing and business intelligence. He has a Bachelor's degree in Computer Engineering. He has worked with large enterprises around the world and delivered high-quality data warehousing and BI solutions for them.
He has worked with industries in different sectors, such as Health, Finance, Logistics, Sales, Order Management, Manufacturing, Telecommunication, and so on. Reza has written books on SQL Server and databases.
His blog contains the latest information on his presentations and publications.
Microsoft SQL Server Business Intelligence Development Beginner’s Guide | Packt.
Reza is a Mentor and a Microsoft Certified Trainer. He has been in the professional training business for many years. He conducts extensive hands-on training for many enterprises around the world, both remotely and in person.
However, they might require getting some data from outside, for example, from another vendor's web service or through many other protocols and channels for sending and receiving information. This indicates that there would be a requirement for consolidated analysis of such information, which brings the BI requirement back to the table. After understanding what the BI system is, it's time to discover more about its components and understand how these components work with each other.
There are also some BI tools that help to implement one or more components. The following diagram shows an illustration of the architecture and main components of the Business Intelligence system. The BI architecture and components differ based on the tools, environment, and so on.
The architecture shown in the preceding diagram contains components that are common in most of the BI systems. In the following sections, you will learn more about each component.
The data warehouse is the core of the BI system. A data warehouse is a database built for the purpose of data analysis and reporting. This purpose changes the design of this database as well. As you know, operational databases are built on normalization standards, which are efficient for transactional systems, for example, to reduce redundancy.
As you probably know, a 3NF-designed database for a sales system contains many tables related to each other. So, for example, a report on sales information may require more than 10 join conditions, which slows down the response time of the query and report.
A data warehouse comes with a new design that reduces the response time and increases the performance of queries for reports and analytics. You will learn more about the design of a data warehouse, called dimensional modeling, later in this chapter. It is very likely that more than one system acts as the source of data required for the BI system. So, there is a requirement for data consolidation that extracts data from different sources, transforms it into the shape that fits the data warehouse, and finally loads it into the data warehouse; this process is called Extract, Transform, and Load (ETL).
There are many challenges in the ETL process, some of which will be revealed conceptually later in this chapter. As the definition states, ETL is not just a data integration phase. Let's discover more about it with an example: in an operational sales database, you may have dozens of tables that hold sales transactional data.
When you design that sales data into your data warehouse, you denormalize it and build one or two tables for it. So, the ETL process should extract data from the sales database and transform it (combine, match, and so on) to fit the model of the data warehouse tables. There are some ETL tools on the market that perform the extract, transform, and load operations; Microsoft's ETL tool is SQL Server Integration Services (SSIS), which has many built-in transformations to transform the data as required.
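The extract-transform-load flow just described can be sketched in Python (a toy illustration only; the source tables and helper functions are made up for this example, and a real implementation would use an ETL tool such as SSIS):

```python
# Hypothetical source tables, stood in for by plain Python lists.
source_customers = [
    {"id": 1, "first": "Ann", "last": "Lee"},
    {"id": 2, "first": "Bob", "last": "Kay"},
]
source_orders = [
    {"order_id": 10, "customer_id": 1, "amount": 99.5},
    {"order_id": 11, "customer_id": 2, "amount": 10.0},
]

def extract():
    """Pull raw rows from the operational sources."""
    return source_customers, source_orders

def transform(customers, orders):
    """Denormalize: match each order to its customer and flatten the result."""
    by_id = {c["id"]: c for c in customers}
    return [
        {"order_id": o["order_id"],
         "customer_name": f'{by_id[o["customer_id"]]["first"]} '
                          f'{by_id[o["customer_id"]]["last"]}',
         "amount": o["amount"]}
        for o in orders
    ]

def load(rows, warehouse):
    warehouse.extend(rows)  # stands in for a bulk insert into the DW table

warehouse = []
load(transform(*extract()), warehouse)
print(warehouse[0]["customer_name"])  # Ann Lee
```

The transform step is where the real work happens: matching rows across sources and reshaping them into the denormalized structure the data warehouse expects.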
A data warehouse is designed to be the source of analysis and reports, so it works much faster than operational systems for producing reports. However, a DW is still not fast enough to cover all requirements, because it is still a relational database, and databases have many constraints that reduce the response time of queries.
The requirement for faster processing and a lower response time on one hand, and for aggregated information on the other, led to the creation of another layer in BI systems. This layer, which we call the data model, contains a file-based or in-memory model of the data for producing very quick responses to reports. Microsoft's solution for the data model is split into two technologies: the OLAP cube and the in-memory tabular model. The OLAP cube is a file-based data store that loads data from a data warehouse into a cube model.
The cube contains descriptive information as dimensions (for example, customer and product) and cells (for example, facts and measures, such as sales and discount). The following diagram shows a sample OLAP cube. In the preceding diagram, the illustrated cube has three dimensions: Product, Customer, and Time. Each cell in the cube is a junction of these three dimensions. Aggregated data can be fetched easily within the cube structure as well.
For example, the orange set of cells shows how much Mark paid on June 1 for all products. As you can see, the cube structure makes it easier and faster to access the required information. Multidimensional modeling is based on the OLAP cube, fitted with measures and dimensions, as you can see in the preceding diagram.
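The way such a slice is answered can be illustrated with a small Python sketch (cell keys and values are hypothetical; a real OLAP engine stores and aggregates cells far more efficiently):

```python
# Hypothetical cube cells keyed by (product, customer, day);
# each cell holds one measure value (amount paid).
cells = {
    ("Bike",   "Mark", "June 1"): 120.0,
    ("Helmet", "Mark", "June 1"): 35.0,
    ("Bike",   "Anna", "June 1"): 110.0,
    ("Helmet", "Mark", "June 2"): 30.0,
}

def slice_total(cells, customer, day):
    """Total across all products for one customer on one day --
    the 'orange set of cells' in the cube diagram."""
    return sum(value for (product, cust, d), value in cells.items()
               if cust == customer and d == day)

print(slice_total(cells, "Mark", "June 1"))  # 155.0
```

Fixing two dimension members and aggregating over the third is exactly what slicing a cube means; the cube's precomputed structure lets the engine answer it without touching the underlying fact rows.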
The tabular model is based on a new in-memory engine for tables. The in-memory engine loads all data rows from tables into memory and responds to queries directly from memory, which is very fast in terms of response time. The frontend of a BI system is data visualization; in other words, data visualization is the part of the BI system that users can see. There are different methods of visualizing information, such as strategic and tactical dashboards, Key Performance Indicators (KPIs), and detailed or consolidated reports.
As you probably know, there are many reporting and visualizing tools on the market. Microsoft has provided a set of visualization tools to cover dashboards, KPIs, scorecards, and reports required in a BI application.
Excel is also a great slicing and dicing tool, especially for power users. There are also components in Excel, such as Power View, that are designed to build performance dashboards.
Sometimes, you will need to embed reports and dashboards in your own custom-written application. Chapter 12, Integrating Reports in Application, of this book explains that in detail. Every organization has a part of its business that is common between different systems.
That part of the data in the business can be managed and maintained as master data. For example, an organization may receive customer information from an online web application form, from a retail store's spreadsheets, or from a web service provided by other vendors. Master Data Management (MDM) is the process of maintaining a single version of the truth for master data entities across multiple systems. Microsoft's MDM solution is Master Data Services (MDS); even if one or more systems are able to change the master data, they can write their changes back into MDS through the staging architecture.
The quality of data differs in each operational system, especially when we deal with legacy systems or systems that depend heavily on user input. As the BI system is based on data, the better the quality of data, the better the output of the BI solution. Because of this, improving data quality is one of the components of BI systems. As an example, Auckland might be written as "Auck land" in some Excel files or typed as "Aukland" by a user in an input form.
As a solution to improve the quality of data, Microsoft provides Data Quality Services (DQS). DQS works based on Knowledge Base domains: a Knowledge Base can be created for different domains, and the Knowledge Base is maintained and improved by a data steward as time passes.
There are also matching policies that can be used to apply standardization to the data. A data warehouse is a database built for analysis and reporting. In other words, a data warehouse is a database whose only data entry point is ETL and whose primary purpose is to cover reporting and data analysis requirements.
This definition clarifies that a data warehouse is not like other transactional databases that operational systems write data into. When there is no operational system that works directly with a data warehouse, and when the main purpose of this database is for reporting, then the design of the data warehouse will be different from that of transactional databases.
If you recall the database normalization concepts, the main purpose of normalization is to reduce redundancy and dependency. The following table shows customers' data with their geographical information. Let's elaborate on this example: as you can see from the preceding list, the geographical information in the records is redundant.
This redundancy makes it difficult to apply changes. For example, with this structure, if Remuera, for any reason, is no longer part of Auckland city, then the change has to be applied to every record that has Remuera as its suburb.
The following screenshot shows the tables of geographical information. So, a normalized approach is to take the geographical information out of the customer table and put it into another table; only a key to that table is then referenced from the customer table. This way, every time the value Remuera changes, only one record in the geographical table changes, and the key number remains unchanged.
So, you can see that normalization is highly efficient in transactional systems. This normalization approach is not that effective for analytical databases, however. If you consider a sales database with many tables related to each other and normalized at least up to the third normal form (3NF), then analytical queries on such a database may require more than 10 join conditions, which slows down the query response.
In other words, from the point of view of reporting, it would be better to denormalize data and flatten it in order to make it easier to query data as much as possible.
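The difference between the normalized and flattened designs can be illustrated with a small Python sketch (table contents are hypothetical, echoing the Remuera example above):

```python
# Normalized design: the customer row stores only a key into a separate
# geography table, so reporting needs a join (a dictionary lookup here).
geography = {1: {"suburb": "Remuera", "city": "Auckland"}}
customers_norm = [{"name": "Jane", "geo_key": 1}]

# Denormalized (flattened) design for the data warehouse: geography is
# repeated on every row, so a report reads one table with no join.
customers_flat = [{"name": "Jane", "suburb": "Remuera", "city": "Auckland"}]

# Both designs answer the same question, but the flat one skips the join:
norm_city = geography[customers_norm[0]["geo_key"]]["city"]
flat_city = customers_flat[0]["city"]
print(norm_city == flat_city)  # True
```

The trade-off is exactly the one described above: the normalized form makes updates cheap (change Remuera in one place), while the flattened form makes reads cheap, which is what reporting workloads care about.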
This means the first design in the preceding table might be better for reporting. However, the query and reporting requirements are not that simple, and the business domains in the database are not as small as two or three tables. So real-world problems can be solved with a special design method for the data warehouse called dimensional modeling. There are two well-known methods for designing the data warehouse: the Kimball and Inmon methodologies.
The Inmon and Kimball methods are named after their authors, Bill Inmon and Ralph Kimball. Both of these methods are in use nowadays. The main difference between them is that Inmon's approach is top-down and Kimball's is bottom-up. In this chapter, we will explain the Kimball method. Both authors' books are must-read references for BI and DW professionals and are recommended to be on the bookshelf of every BI team.
This chapter draws on The Data Warehouse Toolkit, so for a detailed discussion, read the referenced book. To gain an understanding of data warehouse design and dimensional modeling, it's better to learn about the components and terminologies of a DW. A DW consists of Fact tables and dimensions.
The relationship between a Fact table and dimensions is based on the foreign key and primary key (the primary key of the dimension table is referenced in the Fact table as a foreign key).