Wednesday, March 21, 2012

Fuzzy Grouping Transform Corrupts Pass-through Data

We are working with a client and are using the Fuzzy Grouping transform for de-duplication and hierarchy creation on a national account list.

I've found that if a large number of pass-through columns are sent to the Fuzzy Grouping transform, it randomly corrupts the character columns.

Our workaround was to pass through only the ID columns and then build out the attributes we needed from views against the Fuzzy Grouping output. However, the product team should take a look at this.
By corruption I mean that random characters from other records would show up in character columns (we had address and name corruption in about 10% of a 1.5 million record dataset).
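
For anyone hitting the same thing, here is a rough sketch of the shape of that workaround, not our actual client code. It uses Python's built-in sqlite3 module only so it can run anywhere; on SQL Server the view would be an ordinary CREATE VIEW. The table, view, and column names (accounts, fuzzy_group_output, account_groups, account_id) and the sample rows are made up for illustration. The only names taken from the transform itself are its default output columns _key_in, _key_out and _score.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Hypothetical source table holding the full attribute set.
    CREATE TABLE accounts (
        account_id INTEGER PRIMARY KEY,
        name       TEXT,
        address    TEXT
    );

    -- Hypothetical landing table for the Fuzzy Grouping output.
    -- Only account_id was passed through; _key_in, _key_out and _score
    -- are the transform's default output columns.
    CREATE TABLE fuzzy_group_output (
        _key_in    INTEGER,
        _key_out   INTEGER,
        _score     REAL,
        account_id INTEGER
    );

    -- Rebuild the attributes from the source table instead of passing
    -- them through the transform. The self-join on _key_out = _key_in
    -- finds the canonical row of each group.
    CREATE VIEW account_groups AS
    SELECT m.account_id,
           m.name,
           m.address,
           g._score,
           c.account_id AS canonical_account_id,
           cs.name      AS canonical_name
    FROM fuzzy_group_output AS g
    JOIN accounts           AS m  ON m.account_id  = g.account_id
    JOIN fuzzy_group_output AS c  ON c._key_in     = g._key_out
    JOIN accounts           AS cs ON cs.account_id = c.account_id;
""")

# Made-up sample data: rows 1 and 2 were grouped together, row 3 stands alone.
conn.executemany(
    "INSERT INTO accounts VALUES (?, ?, ?)",
    [
        (1, "Acme Corp", "1 Main St"),
        (2, "Acme Corporation", "1 Main Street"),
        (3, "Globex", "9 Elm Ave"),
    ],
)
conn.executemany(
    "INSERT INTO fuzzy_group_output VALUES (?, ?, ?, ?)",
    [(1, 1, 1.0, 1), (2, 1, 0.87, 2), (3, 3, 1.0, 3)],
)

rows = conn.execute(
    "SELECT * FROM account_groups ORDER BY account_id"
).fetchall()
for row in rows:
    print(row)
```

Because the name and address values are read from the source table by ID, they never travel through the transform, so whatever is corrupting the pass-through columns cannot touch them.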

Thanks.

Michael Slater
Software Architects

Michael,

Thanks for your post. We have been unable to reproduce the problem you have reported in the new test cases that we have created for this issue. We would very much like to get to the bottom of what you are seeing. Can you please contact me directly so that we might work with you to find a better repro case that can be used to further investigate and fix this problem?

Please send an email to KrisGan@.microsoft.com

Thanks,
Kris Ganjam
