I think I might have come across a problem with using the BDC to index entity data in legacy systems. I’m sure I can’t be the first person to have hit this problem, but I can’t find a solution. Thus, I thought I’d blog it so other folks might be aware of it, and to open the problem up to suggestions.
I’m not sure I can describe the system I’m actually working on, so I’m going to use a slightly contrived example…
We’ve got data in a SQL server database, and we want to expose this via the BDC. In particular, we want to expose the data to SharePoint search.
In my contrived example we’ve got a couple of entities – Person and JobTitle. A Person entity has a JobTitle. In the database it might look something like:
Person:
PersonID | Name | JobTitleID | LastUpdate |
---|---|---|---|
4 | Jane Doe | 3 | 2009-05-21 12:01 |
5 | John Smith | 1 | 2009-05-21 12:05 |
6 | John Smith | 2 | 2009-05-21 12:10 |
Job Title:
ID | Name | LastUpdate |
---|---|---|
1 | Painter | 2009-03-21 12:00 |
2 | Plumber | 2009-03-19 12:32 |
3 | Tailor | 2009-02-21 10:21 |
Clearly, JobTitleID is the foreign key to the JobTitle table. We’re also using a Last Updated date/time to allow use to incrementally index the database with the BDC.
Now, you’ll note that we’ve got 2 people called John Smith. I do mean that we’ve got two people, not that John Smith has two jobs. What I’d like to be able to do, then, is search for “John Smith Painter” and get one record, or “John Smith Plumber” and get the other. In short, the Person entity is the important one; Job Title is just a look-up.
Unfortunately, while we can index these tables via the BDC, essentially the relationship is lost. A Search for “John Smith Painter” would return 2 matches for the “John Smith” part (our two people) and 1 match for the “Painter” part (our job title).
Ideally we could denormalised this database structure, and store the person’s job against their record directly. Sadly, for the system we’re working with this isn’t really an option.
So, what I can’t figure out is how to get the BDC to index our Person entity and related entity data as a reference to our Person record. I don’t know if anyone has any ideas, but I’d come up with:
- Denormalise the data (we can’t really do for certain relationships)
- Build a separate ASP.NET web site to display our data and index that (but would screw up incremental index)
- Write a custom protocol handler (complex, scary, and probably expensive) (Although this codeplex project might help)
If anyone has other suggestions or tips, let me know!
Hi Andy
It’s been a little while since I did any hands on BDC but the possibility that occurs me is writing a lightweight web services wrapper that handles the denormalisation. You could then create your BDC entities from the web services instead?
Could you use a database view to “denormalize” the data and index the view instead of the tables themselves?
Hmm. Both seem like possibilities. There are some tricky many-to-many relationships here, so I’ll have to have an investigate.