Friday, February 24, 2012

Can someone help explain?

Hi,

I am working with several tables, but for now I will mention just four: one fact table (named Usage) and three dimension tables (Periods, Products, and Regions). The fact table contains references to the dimension tables. The Periods table contains two other columns, month and year.

I created a cube containing columns from those four tables, and deployment was successful. The trouble comes when I want to create a mining structure using Time Series that contains these columns:

- Period
- Amount (of money)
- Product name
- Region name

When I choose a cube (instead of a table) as the source for the mining structure, I am forced to choose only one dimension (from among Periods, Products, and Regions). Whichever dimension I choose, I end up unable to use the period column as the Time-Key column. In effect, I cannot use the Time Series method, because I cannot use the period column.

(1) Why is this so? Why does Visual Studio force us to use only one dimension from the cube?
(2) Why does Visual Studio eliminate the period column, which has a relationship with the time dimension?
(3) What use is a cube to mining anyway? Is there still any use for it?
(4) What is the solution to the kind of problem I face?

Thank you,

Bernaridho
|||

Dear all,

I have not received any answer as to why I can use only one dimension when using a cube to mine data. Can someone please help explain? The details are in my previous post.

Thank you,

Bernaridho
|||

#1. The current design of OLAP-based mining structures requires that case-level columns be member properties of the key. This is a strict way of making sure that the attributes refer to the same entity (the case), and the requirement is met if all the case columns come from the same dimension.

#2. The data mining engine has no time intelligence, in the sense that it does not recognize any special status for the time dimension. Unless there is an explicit relationship, established by including both the case dimension and the time dimension in a measure group, the time dimension cannot be included (as a nested table).

#3. Building mining models on top of cubes is useful when you already have the data in that form (and you don't want to use, or don't have access to, the source data). It is also useful for applying the results of data mining back to the cube, so that you can slice and dice the fact data with the hierarchies discovered by the model. For instance, you could build a clustering model and slice your facts by the clusters.

#4. In your case, it might be better to build your time series model directly on the source (relational) data since you will have much more flexibility in that case.
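To illustrate point #4, here is a minimal DMX sketch of a time series model built directly on the relational tables. Everything named here is an assumption for illustration only: the model name UsageForecast, the data source name UsageSource, and the key and column names (PeriodID, ProductID, RegionID, ProductName, RegionName, Amount) are hypothetical, so substitute whatever your schema actually uses. Because the Microsoft Time Series examples I am sure of use a single series key alongside the time key, the sketch combines product and region into one series column rather than using two.

```sql
-- Sketch only: every object and column name below is hypothetical.
-- DMX runs one statement at a time, so execute each statement separately.

-- 1. Define the model: one time key, one series key, one predictable column.
CREATE MINING MODEL [UsageForecast]
(
    [MonthIndex] LONG   KEY TIME,   -- year * 12 + month, so consecutive months differ by 1
    [Series]     TEXT   KEY,        -- product name and region name combined
    [Amount]     DOUBLE CONTINUOUS PREDICT
)
USING Microsoft_Time_Series

-- 2. Train it from the relational source. The query text is T-SQL passed
--    through to the data source; quotes inside the string are doubled.
INSERT INTO [UsageForecast] ([MonthIndex], [Series], [Amount])
OPENQUERY([UsageSource],
    'SELECT pe.[year] * 12 + pe.[month]              AS MonthIndex,
            pr.ProductName + '' / '' + r.RegionName  AS Series,
            CAST(SUM(u.Amount) AS FLOAT)             AS Amount
     FROM Usage u
     JOIN Periods  pe ON pe.PeriodID  = u.PeriodID
     JOIN Products pr ON pr.ProductID = u.ProductID
     JOIN Regions  r  ON r.RegionID   = u.RegionID
     GROUP BY pe.[year] * 12 + pe.[month],
              pr.ProductName + '' / '' + r.RegionName
     ORDER BY 2, 1')

-- 3. Forecast the next 6 periods for every product/region series.
SELECT FLATTENED [Series], PredictTimeSeries([Amount], 6) AS Forecast
FROM [UsageForecast]
```

The month index is used instead of something like year * 100 + month so that the time key advances in uniform steps across year boundaries. If your Periods table already has a date or sequential period column, that would probably be a simpler time key.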

|||

You can also use DMX to create the models. DMX allows the form:

INSERT INTO <mining model>(<columns>)
<MDX statement>

In this form, DMX uses the flattening semantics of MDX to populate the mining model. This way you can feed any arbitrary dataset into the mining model.

|||

Are there any more details to this question?
