Tuesday, March 20, 2012

Association Questions

I'm wondering if anyone can give me some help with an association model I'd like to setup. It's a typical market-basket analysis, but rather than grouping by individual customers, I'd like to group by customer grouping. (In our database, customers are grouped into categories like: large, small, medium) If this is possible, I'd like to generate the most popular items (so just querying the most probable itemsets), for each customer grouping (I'll refer to this as 'segments' from here on out), and then create a listing of customers in each segment which do not have the most popular items for their segment. I know for this last part I can use reporting services to tackle that problem, however, I'm not really sure how I can really do the rest of this with an association model in SSAS.

Our table structure looks like this:

Code Snippet

CustomerTable PurchasesTable

- --

CustomerName(key) CustomerName

CustomerGroup PurchasedProduct

And the data is arranged in this fashion:

Code Snippet

Customer Table:

CustomerName CustomerGroup

- -

A large

B large

C small

Purchases Table:

CustomerName PurchasedProduct

-

A ProductA

A ProductB

B ProductA

C ProductC

C ProductD

I know this is a lot of information but any help you guys may be able to offer would be great! Thanks!

Just a suggestion which might not do exactly what you requested, but might help:

You could create one association model looking like below:

(

Customer TEXT KEY,

CustomerGroup TEXT DISCRETE, // i.e. used as input, on the left hand side of the rules

Products TABLE PREDICT

(

Product TEXT KEY

)

)

Then, for each customer, get the recommended association predictions based on:

- the products currently owned by the customer

- the group that the customer belongs to

This will create rules that might include the customer group on the left hand side (if that it found to be interesting). It may also generate rules which ignore the group, containing items that sell together across groups of customers.

In the prediction query, if you ask solely for predictions that are NOT based on rules (but on popular itemsets), you will get a list of popular itemsets that should match each customer according to previous purchases and possibly group. Of course, excluding predictions that are based on rules might actually reduce the accuracy of the recommendations. You might want to sort the prediction results by support and use some sort of threshold

If you must really separate the groups (i.e. require to take the customer group into account for each individual customer), you could also build 3 models, one for each group. Then, you can use the same kind of predictions, but a different model for each customer (based on the customer group). This way, the recommendations will be based solely on popular items within the current customer group.

Hope this helps

|||

I was thinking that might be the way to go about doing that, but just have a couple quick questions to clarify.

If I do it the way you suggested, you said that I might get rules that ignore the group/contain the group. How may I filter(?) the results so I only obtain those with with the groups, as I'm not as interested in what individual customers purchased, so much as just purchases made within the groups?

If I built three models, how would I tell the model to only be concerned with the 'small' group when all the groups belong in the same database? Is there somewhere I can specify a filter on the mining model to only focus on certain fields...or something?

Thanks for your help so far!

|||

In SQL Server 2005 you would need to build three separate views (or named queries in the DSV) each showing only a single group of customers. Then for each group you build a separate mining structure and a separate model.

We are working on some advances that may make this easier in future versions, but for now, you should find that this approach works well.

No comments:

Post a Comment