Sunday, February 26, 2012

Full-text SharePoint

Hi all,
I'm looking for in-depth information on the search functionality of
SharePoint 2003. As it is using the SQL full-text search, I post my question
to this newsgroup.
The main goal is to figure out how I can add the functionality of rating
documents to the SharePoint search, in order to make high-rated documents
show up higher in the search results.
Kind regards,
Jeroendb
Jeroen,
I think what you're looking for in SharePoint 2003 is called "best bets",
see "Planning Your Information Structure Using Microsoft Office SharePoint
Portal Server 2003" at:
http://www.microsoft.com/technet/pro...c0861881x.mspx and
under "Planning Keywords and Keyword Best Bets".
While you can do much the same thing using SQL Server, it's can be a bit
complex to get a scalable solution. How would you determine what is a
"highly-rated document" based upon a keyword or phrase search entered by the
searcher?
Regards,
John
"Jeroen De Brabander" <Jeroen.De.Brabander@.edan.be> wrote in message
news:OHe$OhCZEHA.2480@.tk2msftngp13.phx.gbl...
> Hi all,
> I'm looking for in-depth information on the search functionality of
> SharePoint 2003. As it is using the SQL full-text search, I post my
question
> to this newsgroup.
> The main goal is to figure out how I can add the functionality of rating
> documents to the SharePoint search, in order to make high-rated documents
> show up higher in the search results.
> Kind regards,
> Jeroendb
>
|||Thanks John, for your quick reply.
By rating I mean that I would like to let the end-user rate the documents he
finds. I.e. documents with an average rating of > 50% should show up higher
in the search hierarchy. Documents that are never rated or searched on
should be discarded after a certain period of time.
This probably is too much for SharePoint ?
Kind regards,
Jeroendb
"John Kane" <jt-kane@.comcast.net> wrote in message
news:OCB1l7JZEHA.3112@.tk2msftngp13.phx.gbl...
> Jeroen,
> I think what you're looking for in SharePoint 2003 is called "best bets",
> see "Planning Your Information Structure Using Microsoft Office SharePoint
> Portal Server 2003" at:
> http://www.microsoft.com/technet/pro...c0861881x.mspx
and
> under "Planning Keywords and Keyword Best Bets".
> While you can do much the same thing using SQL Server, it's can be a bit
> complex to get a scalable solution. How would you determine what is a
> "highly-rated document" based upon a keyword or phrase search entered by
the[vbcol=seagreen]
> searcher?
> Regards,
> John
>
> "Jeroen De Brabander" <Jeroen.De.Brabander@.edan.be> wrote in message
> news:OHe$OhCZEHA.2480@.tk2msftngp13.phx.gbl...
> question
documents
>
|||You're welcome, Jeroen,
Yes, this can be done using SQL FTS and adding a new column to your
FT-enabled table called "ExtendedRank" and then adding an value to the RANK
value in the CONTAINSTABLE or FREETEXTTABLE query and then ordering the
results by the new column "Extended Rank". Below is an example of the query
and table structure:
CREATE TABLE FTSTable (
KeyCol int IDENTITY (1,1) NOT NULL
CONSTRAINT FTSTable_IDX PRIMARY KEY CLUSTERED,
TextCol text NULL,
ImageCol image NULL,
ExtCol sysname NULL,
Keyword varchar(500),
KeywordWeight int,
TimeStampCol timestamp NULL
) ON [PRIMARY]
go
-- Note, the column KeywordWeight may need to be re-define as FLOAT
depending upon the type of calculations you are planning on doing.
-- Insert data...
INSERT FTSTable values('This is random data for row 1', 0xFFFFFFFF, 'zip',
'random,data,row', 1, NULL)
INSERT FTSTable values('The brown fox jumped over the fence. row 2',
0xFFFFFFFF, 'xls', 'brown,fox,fence', 10, NULL)
INSERT FTSTable values('Mary had a little lamb and its fleece was white',
0xFFFFFFFF, 'ppt', 'random,data,row', 0, NULL)
INSERT FTSTable values('Moby Dick is considered on of the greatest books
available.', 0xFFFFFFFF, 'txt', 'moby,dick,book', 50, NULL)
INSERT FTSTable values('This has more to do this short text that with rows',
0xFFFFFFFF, 'htm', 'short,text,row', 100, NULL)
INSERT FTSTable values('Taglines would be helpful here as I could get more
text added', 0xFFFFFFFF, 'pdf', 'tagline,helpful,text', 22, NULL)
INSERT FTSTable values('Winston Churchill was one of the greatest speakers
of all time', 0xFFFFFFFF, 'xml', 'winston,churchill,great,speakers', 1000,
NULL)
go
-- Note, that KeyCol rows 5 and 6 have very high values in the KeyWordWeight
column
-- Test FTS with containstable and freetexttable
SELECT FTS.KeyCol, FTS.TextCol, FTS.Keyword, FTS.KeywordWeight, CT.[RANK]
from FTSTable AS FTS
JOIN containstable(FTSTable,*,'text') as CT ON CT.[KEY] = FTS.KeyCol
ORDER by CT.[RANK] DESC
-- as expected two rows returned: KeyCol 6 and 5
-- caculated ExtendedRank = Rank + KeywordWeight
SELECT FTS.KeyCol, FTS.TextCol, CT.[RANK], FTS.KeywordWeight,
FTS.KeywordWeight + CT.[RANK] as ExtendedRank
from FTSTable AS FTS
JOIN containstable(FTSTable,*,'text') as CT ON CT.[KEY] = FTS.KeyCol
ORDER by ExtendedRank DESC -- CT.[RANK] DESC
/* -- Returns: as expected two rows returned: KeyCol 6 and 5 with HIGHEST
ExtendedRank first!!
KeyCol TextCol RANK
KeywordWeight ExtendedRank
-- ---- -- -
-- --
5 This has more to do this short text that with rows 48
100 148
6 Taglines would be helpful here as I could get more 48
22 70
(2 row(s) affected)
*/
Additionally, the above method is very scallable as you do not have to join
your FT-enabled table with another FT-enabled table in order to get the
"best bet" to the top of the resultset. This method could also be extended
and use a true "computed column" and support more complex calculations as
well. I don't know if this is "too much for SharePoint", but this is very
doable in SQL Server!
Regards,
John
"Jeroen De Brabander" <Jeroen.De.Brabander@.edan.be> wrote in message
news:#Ik6EwOZEHA.3988@.tk2msftngp13.phx.gbl...
> Thanks John, for your quick reply.
> By rating I mean that I would like to let the end-user rate the documents
he
> finds. I.e. documents with an average rating of > 50% should show up
higher[vbcol=seagreen]
> in the search hierarchy. Documents that are never rated or searched on
> should be discarded after a certain period of time.
> This probably is too much for SharePoint ?
> Kind regards,
> Jeroendb
> "John Kane" <jt-kane@.comcast.net> wrote in message
> news:OCB1l7JZEHA.3112@.tk2msftngp13.phx.gbl...
bets",[vbcol=seagreen]
SharePoint[vbcol=seagreen]
> and
> the
rating
> documents
>
|||Jeroen,
Did you find the below SQL FTS solution to provide the "functionality of
rating documents to the SharePoint search, in order to make high-rated
documents show up higher in the search results"? If so, how would you let
the end-user rate the documents he
finds? You programmatically set the column KeywordWeight to some value and
then use this a as multiplier (vs. just adding the value to the CT.[RANK]
value) and then ordering by the computed column: ExtendedRank.
If you're interested I've been able to extend this concept to use the
Keyword column to further extend it to a "keyword specific" solution
without joining to another table and without using cursor processing and
therefore is very scalable... Let me know if you would be interested in this
alternative approach to "best bet", i.e., ranking specific rows higher than
other rows in the CONTAINSTABLE or FREETEXTTABLE resultset.
Regards,
John
"John Kane" <jt-kane@.comcast.net> wrote in message
news:OGnTQvPZEHA.3132@.TK2MSFTNGP10.phx.gbl...
> You're welcome, Jeroen,
> Yes, this can be done using SQL FTS and adding a new column to your
> FT-enabled table called "ExtendedRank" and then adding an value to the
RANK
> value in the CONTAINSTABLE or FREETEXTTABLE query and then ordering the
> results by the new column "Extended Rank". Below is an example of the
query
> and table structure:
> CREATE TABLE FTSTable (
> KeyCol int IDENTITY (1,1) NOT NULL
> CONSTRAINT FTSTable_IDX PRIMARY KEY CLUSTERED,
> TextCol text NULL,
> ImageCol image NULL,
> ExtCol sysname NULL,
> Keyword varchar(500),
> KeywordWeight int,
> TimeStampCol timestamp NULL
> ) ON [PRIMARY]
> go
> -- Note, the column KeywordWeight may need to be re-define as FLOAT
> depending upon the type of calculations you are planning on doing.
> -- Insert data...
> INSERT FTSTable values('This is random data for row 1', 0xFFFFFFFF, 'zip',
> 'random,data,row', 1, NULL)
> INSERT FTSTable values('The brown fox jumped over the fence. row 2',
> 0xFFFFFFFF, 'xls', 'brown,fox,fence', 10, NULL)
> INSERT FTSTable values('Mary had a little lamb and its fleece was white',
> 0xFFFFFFFF, 'ppt', 'random,data,row', 0, NULL)
> INSERT FTSTable values('Moby Dick is considered on of the greatest books
> available.', 0xFFFFFFFF, 'txt', 'moby,dick,book', 50, NULL)
> INSERT FTSTable values('This has more to do this short text that with
rows',
> 0xFFFFFFFF, 'htm', 'short,text,row', 100, NULL)
> INSERT FTSTable values('Taglines would be helpful here as I could get more
> text added', 0xFFFFFFFF, 'pdf', 'tagline,helpful,text', 22, NULL)
> INSERT FTSTable values('Winston Churchill was one of the greatest speakers
> of all time', 0xFFFFFFFF, 'xml', 'winston,churchill,great,speakers', 1000,
> NULL)
> go
> -- Note, that KeyCol rows 5 and 6 have very high values in the
KeyWordWeight
> column
> -- Test FTS with containstable and freetexttable
> SELECT FTS.KeyCol, FTS.TextCol, FTS.Keyword, FTS.KeywordWeight, CT.[RANK]
> from FTSTable AS FTS
> JOIN containstable(FTSTable,*,'text') as CT ON CT.[KEY] = FTS.KeyCol
> ORDER by CT.[RANK] DESC
> -- as expected two rows returned: KeyCol 6 and 5
> -- caculated ExtendedRank = Rank + KeywordWeight
> SELECT FTS.KeyCol, FTS.TextCol, CT.[RANK], FTS.KeywordWeight,
> FTS.KeywordWeight + CT.[RANK] as ExtendedRank
> from FTSTable AS FTS
> JOIN containstable(FTSTable,*,'text') as CT ON CT.[KEY] = FTS.KeyCol
> ORDER by ExtendedRank DESC -- CT.[RANK] DESC
> /* -- Returns: as expected two rows returned: KeyCol 6 and 5 with HIGHEST
> ExtendedRank first!!
> KeyCol TextCol RANK
> KeywordWeight ExtendedRank
> -- ---- --
-
> -- --
> 5 This has more to do this short text that with rows 48
> 100 148
> 6 Taglines would be helpful here as I could get more 48
> 22 70
> (2 row(s) affected)
> */
> Additionally, the above method is very scallable as you do not have to
join[vbcol=seagreen]
> your FT-enabled table with another FT-enabled table in order to get the
> "best bet" to the top of the resultset. This method could also be extended
> and use a true "computed column" and support more complex calculations as
> well. I don't know if this is "too much for SharePoint", but this is very
> doable in SQL Server!
> Regards,
> John
>
> "Jeroen De Brabander" <Jeroen.De.Brabander@.edan.be> wrote in message
> news:#Ik6EwOZEHA.3988@.tk2msftngp13.phx.gbl...
documents[vbcol=seagreen]
> he
> higher
> bets",
> SharePoint
http://www.microsoft.com/technet/pro...c0861881x.mspx[vbcol=seagreen]
bit[vbcol=seagreen]
by
> rating
>

No comments:

Post a Comment