in BI by (17.6k points)

I've been trying to solve an issue, and to date, I haven't been able to reach what I'd say is an optimal solution. I have a dimension (Features) that needs to be referenced in 2 other dimensions (Actions and Sessions), which in turn are referenced from the same Fact table (UserAction). This creates ambiguity and I can't complete the schema:

(note: snip of the model, not the whole thing) (included the bridge tables to show some of the added complexity in the model with many-to-many relationships)

I think the issue might be that Dim_Features technically has a different meaning in each of the two dimensions, yet I'm still trying to use it as if it were the same thing. It means both:

  • An Action belongs to this Feature / Feature Area
  • A Session had this Feature / Feature Area available (owned)

What I need to accomplish is being able to filter/slice Fact_UserActions by Sessions where certain features are available / unavailable, to then analyse things like:

  • Which Features are used when Feature 'A' is owned (as in, correlations between certain features being owned and others being used)?
  • How many users who own a Feature have not used it?
  • How often is a Feature used? (constrained by the population of sessions that own it, i.e. where it could actually be used)

Any ideas on what I might be doing wrong, or how I might improve the model?

EDIT: In case it helps, the sort of thing we'd want to get out of this is a table such as:

Where we can see the impact a feature has on the population as a whole, and within the population that owns it.

1 Answer

0 votes
by (47.2k points)

The standard Kimball advice for star schemas is to find the lowest grain available, because you can always aggregate up from it. Your questions are all about the use of Features, but you are trying to answer them with a fact table that is not at the Feature-usage grain, and the Bridge tables exist to work around that. Keep in mind that, the vast majority of the time, dimensions should be related to each other only indirectly, through Fact tables. A Bridge table is sometimes necessary, but rarely. It's hard to suggest a schema without knowing how this fits into the rest of the model, but consider the following:

  • Replace Fact_UserAction with something like Fact_FeatureUsage.

  • Have action_id, session_id, and feature_id in Fact_FeatureUsage.

  • Get rid of your Bridge tables.
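
As a rough illustration of the suggestions above, here is a minimal sketch in Python using an in-memory SQLite database. Only Fact_FeatureUsage, Dim_Features and the three key columns come from this thread; the feature_name column, the sample rows and the query are assumptions made up for the example, and a real model would carry more dimensions and measures.

```python
import sqlite3

# In-memory database, for illustration only.
conn = sqlite3.connect(":memory:")

# One fact row per feature use: the grain is (action, session, feature).
# feature_name is a hypothetical attribute, not taken from the original model.
conn.executescript("""
CREATE TABLE Dim_Features (
    feature_id   INTEGER PRIMARY KEY,
    feature_name TEXT
);
CREATE TABLE Fact_FeatureUsage (
    action_id  INTEGER,
    session_id INTEGER,
    feature_id INTEGER REFERENCES Dim_Features(feature_id)
);
""")

# Made-up sample data: two features, two sessions, four uses.
conn.executemany("INSERT INTO Dim_Features VALUES (?, ?)",
                 [(1, "A"), (2, "B")])
conn.executemany("INSERT INTO Fact_FeatureUsage VALUES (?, ?, ?)",
                 [(10, 100, 1), (11, 100, 1), (12, 100, 2), (13, 101, 1)])

# Aggregate up from the lowest grain: number of uses and number of
# distinct sessions per feature, with no bridge table involved.
rows = conn.execute("""
SELECT f.feature_name,
       COUNT(*)                     AS uses,
       COUNT(DISTINCT u.session_id) AS sessions
FROM Fact_FeatureUsage AS u
JOIN Dim_Features AS f ON f.feature_id = u.feature_id
GROUP BY f.feature_name
ORDER BY f.feature_name
""").fetchall()

print(rows)
```

With the fact table at feature-usage grain, Dim_Features joins directly to the fact, so the ambiguous path through Actions and Sessions no longer appears in this part of the model. How feature ownership per session should be recorded is not covered by this sketch.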
