Intellipaat Back

Explore Courses Blog Tutorials Interview Questions
0 votes
2 views
in BI by (17.6k points)

I have this DAX formula that gives me a count of id that appears on the fact table in a month, averaged over the year. I can put this measure is a table ad it's unpacked by row with no issues (by adding variables from dimensions)

Measure:= AVERAGEX(

    SUMMARIZE(

        CALCULATETABLE(fact_table;FILTER('Time_Dimension';'Time_Dimension'[Last_month] <> "LAST"));

         Time_Dimension[Month Name];

        "Count";DISTINCTCOUNT(fact_table[ID])

    );

    [Count]

)

But it's terrible slow (I have 3 measures like this on a single table) and the fact table is big (like 300Million rows big)

I was reading that SUMMARIZE perform really bad with aggregations and It should be replaced with SUMMARIZECOLUMNS. I wrote this formula

Measure_v2:= AVERAGEX(

    SUMMARIZECOLUMNS(

        Time_Dimension[Month Name];

        FILTER(Time_Dimension;

            Time_Dimension[Month Name]<>"LAST"

        );

        "Count";DISTINCTCOUNT(fact_table[ID])

    )

    [Count]

And it works when I visualize the measure as it is, but when I try to put it in a context (like the table above) it gives me the error "Can't use SUMMARIZECOLUMN and ADDMISSINGITEMS() in this context" How can I make a sustainable optimization from the original SUMMARIZE function? 

1 Answer

0 votes
by (47.2k points)
  • We need to re-visit the overall approach before optimizing SUMMARIZE. If your goal is to calculate average fact count per year-month, there is a simpler (and faster) way.

[ID Count]:=CALCULATE(COUNT('fact_table'[ID]),'Time_Dimension'[Last_month] <> "LAST") [Average ID Count]:=AVERAGEX( VALUES('Time_Dimension'[Year_Month]), [ID Count`])

  • assuming that:

    • you have a year-month attribute in your time dimension;

    • IDs in your fact table are unique (and therefore, the simple count is enough)

  • If this solution does not solve your problem, then please post your data model - it's hard to optimize without knowing the data structure.

  • On a side note, I would remove ID field from the fact table. It adds no value to the model, and consumes huge amounts of memory. Your objective can be achieved by simply counting rows:

[Fact Count]:=CALCULATE(COUNTROWS('fact_table'),'Time_Dimension'[Last_month] <> "LAST")

Related questions

...