Hi - we have been evaluating Fuzzy Grouping and Fuzzy Lookup for maintaining our large list of customer records. Initial testing with Fuzzy Grouping on about 300K records went great, but now with a larger sample of 7.3 million records we are running into problems. It doesn't appear to be a system limitation: the index is built reasonably quickly and without errors, but when the matching starts we get these errors:
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: The ProcessInput method on component "Fuzzy Lookup" (86) failed with error code 0x8000FFFF. The identified component returned an error from the ProcessInput method. The error is specific to the component, but the error is fatal and will cause the Data Flow task to stop running.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread0" has exited with error code 0x8000FFFF.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread1" received a shutdown signal and is terminating. The user requested a shutdown, or an error in another thread is causing the pipeline to shutdown.
[Fuzzy Grouping Inner Data Flow : OLE DB Source [1]] Error: The attempt to add a row to the Data Flow task buffer failed with error code 0xC0047020.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "WorkThread1" has exited with error code 0xC0047039.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: The PrimeOutput method on component "OLE DB Source" (1) returned error code 0xC02020C4. The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.
[Fuzzy Grouping Inner Data Flow : DTS.Pipeline] Error: Thread "SourceThread0" has exited with error code 0xC0047038.
One thing we did find is that our test server didn't have SP1 installed; installing it seemed to help a lot (we were getting buffer errors prior to SP1). One other note: the destination table is populated with all the data, but no scoring has been applied to it.
Does anyone have any ideas what could be causing this?
Thanks!
Keith Doyle
Had the same problem, and thanks to you, I managed to get this running by doing multiple passes on around 200k rows in each pass.
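As a rough illustration of the multi-pass approach described above, here is a minimal Python sketch that splits a large row count into fixed-size passes. The function name `batch_ranges` and the idea of driving each pass from a row-offset range are assumptions for illustration, not something from the original package. Note that grouping each pass on its own can only find duplicates that fall within the same pass.

```python
def batch_ranges(total_rows, batch_size=200_000):
    """Yield (start, end) row offsets, end exclusive, that cover total_rows."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, total_rows, batch_size):
        # The last pass may be smaller than batch_size.
        yield start, min(start + batch_size, total_rows)


# 7.3 million rows in passes of about 200k rows each.
ranges = list(batch_ranges(7_300_000))
print(len(ranges), ranges[0], ranges[-1])
```

Each `(start, end)` pair could then be used to filter the source query for one run of the package.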
I also get lots of errors that say "Buffer Manager found Virtual Memory Low but unable to swap out any buffer". The package works in spite of these errors when the row count is low enough.
DTSDebugHost.exe consumes a lot of memory during processing. LSASS usage also seems unnaturally high. I added a lot of paging file space and that seemed to improve things to the extent that DTSDebugHost usage increased even further :-). But from the PerfMon stats, it looks to me like DTSDebugHost isn't using the paging file. The other processes actually released memory instead.
I use SQL2005Ent SP1 on Win2003R2 - on Intel X-64.
I hope and pray that there is a fix on the way, because I found Fuzzy quite useful.
|||I tried out a few things:
1. Added RAM
2. Used DTExec instead of running in Debug mode in BIDS
3. Reduced columns
Have so far succeeded with 600k rows and am running 1.5 million now, and haven't encountered any Buffer Manager errors so far (WIP, and fingers crossed).
My PerfMon observations, however, are the same as before. I have a suspicion Fuzzy does not use the paging file efficiently (or at all). I had more than 8 GB of pagefile space allocated, but my memory utilization never went beyond physical RAM. If anything, all other running processes got squeezed out.
I have run some heavy mining stuff on the same infrastructure without any crashes (even before I added RAM). So why is Fuzzy Grouping crashing? Or is it SSIS?
Another observation - Memory usage did not drop back after closing Devenv or DTExec. Memory gets freed only after a machine restart.
I hope someone at Microsoft tells us what the problem is. And gives us a fix please!!!
|||
Great info - I am seeing the exact same thing. See the other thread I opened for "Fuzzy Grouping: Any success with > 3 million records?". I'll post my latest test results there - please copy your info from this thread to that one.
Thanks!!
|||I'll have someone investigate at this end and we'll get back to these threads. Thanks for the great info.
One thing would be to try executing using DTExec outside the debug environment. Or try "Start without Debugging." I doubt this will solve the problem entirely, but it will reduce any effect of the IDE on the problem.
Donald
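For anyone scripting the DTExec suggestion above, here is a minimal Python sketch that builds a `dtexec` command line for one pass. The package file name `FuzzyGrouping.dtsx` and the package variables `StartRow` and `EndRow` are hypothetical; the sketch assumes the package has been set up to read its row range from such variables. It only builds the argument list and does not run anything.

```python
def dtexec_command(package_path, variables=None):
    """Build an argument list for dtexec; this function does not execute it."""
    cmd = ["dtexec", "/F", package_path]
    for name, value in (variables or {}).items():
        # /Set takes "propertyPath;value"; the variable names are hypothetical.
        path = f"\\Package.Variables[User::{name}].Properties[Value]"
        cmd += ["/Set", f"{path};{value}"]
    return cmd


# Hypothetical package and variables for a single 200k-row pass.
cmd = dtexec_command("FuzzyGrouping.dtsx", {"StartRow": 0, "EndRow": 200000})
print(" ".join(cmd))
```

On a machine with SSIS installed, the list could be passed to `subprocess.run(cmd, check=True)` so that a failed pass raises an error instead of going unnoticed.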
|||Please see the follow up to this thread at:
http://forums.microsoft.com/MSDN/ShowPost.aspx?PostID=417880&SiteID=1&mode=1