0 votes
1 view
in BI by (17.6k points)

I have a requirement where I have to loop through each character in a text field

For example:

a#123456; 12341 becomes 123456;12341

a123456 12341bd becomes 123456;12341

a2017d  12341ds becomes 12341

a123456/12341bd becomes 123456;12341

n/a becomes (empty)

I have a requirement where I have to loop through each character in a text field

For example:

a#123456; 12341 becomes 123456;12341
a123456 12341bd becomes 123456;12341
a2017d  12341ds becomes 12341
a123456/12341bd becomes 123456;12341
n/a becomes (empty)

Basically I am cleaning my data here:

  1. I will remove any charachter that is not a number.
  2. A number is only valid if there are a minimum of 5 consecutive numbers.
  3. In the Input to differentiate the ID's a space or ";" or ',' or '/' could have been used. Multiple ID's will be present in some cases and not restricted to two.
  4. To keep it the same, I will replace all these connectors by ";"

Note:- My steps may not be right, but my expected output is what I am looking to get from the data.

How can we do this using DAX or Query Editor in Power BI? 

1 Answer

0 votes
by (47.2k points)
  • First, let's remove anything that isn't a number or a separator. We can create a custom column to do this with this formula (assuming your starting column is named Raw):

Text.Remove([Raw], List.Transform({33..43,60..255}, each Character.FromNumber(_)))

Custom 123456; 12341 123456 12341 2017 12341 123456/12341 /

  • From here, the plan is to split it using each separator and filter out strings that are too short.

  • The Text.SplitAny function allows us to specify multiple separators. For example, the formula Text.SplitAny("1;23 4.5-6/ 7:8,9", " ,-./:;") would return {"1","23","4","5","6","","7","8","9"}.

  • Once we have a list of substrings, we can filter the ones we don't want with List.Select and then concatenate them from a list of substrings into a single string using Text.Combine.

  • Putting all this together, we have this formula

Text.Combine(List.Select(Text.SplitAny([Custom], " ,-./:;"), each Text.Length(_) > 4), ";")

  • You can smash it all into one step if you prefer:=

Table.AddColumn(#"Previous Step", "Clean", each Text.Combine( List.Select( Text.SplitAny( Text.Remove( [Raw], List.Transform( {33..43,60..255}, each Character.FromNumber(_) ) ), " ,-./:;" ), each Text.Length(_) > 4), ";" ) )

Related questions

Welcome to Intellipaat Community. Get your technical queries answered by top developers !


Categories

...