El Blanco's Office 2007 Blog

Tuesday, April 15, 2008

SharePoint Search Crawl Rules

Searches in SharePoint often return noisy results: alongside the document you searched for, you also get the view and edit properties pages, the AllItems.aspx view form, and so on.

To keep these pages out of the results, you need to update the crawl rules. Follow the steps below:

  1. Open the Crawl Rules section of the Search Settings in the SSP admin site.
  2. Add crawl rules to exclude the following paths:

    *://*webfldr.aspx*

    *://*my-sub.aspx*

    *://*mod-view.aspx*

    *://*allitems.aspx*

    *://*all forms.aspx*
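
If you want to sanity-check which URLs a set of wildcard patterns would catch before committing to a full crawl, a rough sketch like the one below can help. It is not SharePoint's own matching engine. It assumes that `*` stands for any run of characters, that the pattern has to cover the whole URL, and that matching ignores case. The function names and the sample URLs are made up for illustration.

```python
import re

# Exclusion patterns copied from the list above.
EXCLUDE_PATTERNS = [
    "*://*webfldr.aspx*",
    "*://*my-sub.aspx*",
    "*://*mod-view.aspx*",
    "*://*allitems.aspx*",
    "*://*all forms.aspx*",
]


def wildcard_to_regex(pattern: str) -> re.Pattern:
    """Turn a '*' wildcard pattern into a compiled, case-insensitive regex.

    Assumption: '*' is the only wildcard and matching ignores case.
    """
    # Escape the literal pieces, then let each '*' match any run of characters.
    parts = [re.escape(part) for part in pattern.split("*")]
    return re.compile(".*".join(parts), re.IGNORECASE)


def is_excluded(url: str, patterns=EXCLUDE_PATTERNS) -> bool:
    """Return True if the whole URL matches at least one exclusion pattern."""
    return any(wildcard_to_regex(p).fullmatch(url) for p in patterns)


if __name__ == "__main__":
    # Hypothetical URLs, purely for illustration.
    samples = [
        "http://servername/sites/team/Lists/Tasks/AllItems.aspx",
        "http://servername/sites/team/Shared%20Documents/report.docx",
    ]
    for url in samples:
        print(url, "->", "excluded" if is_excluded(url) else "crawled")
```

Running the same check over a list of URLs taken from your own search results is a quick way to spot a pattern that is too broad or too narrow.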

Has anyone got any more that should be added to this list to bring back a better set of search results?

11 Comments:

  • Interesting post. But what if you're trying to index a list, and specifically the ID column? I found that I had to use AllItems.aspx to make the ID searchable.

    By Anonymous, at 4:28 pm

  • Hello, thanks for the info. I have also found that *://*DispForm.aspx* should be excluded.

    By Anonymous, at 4:22 pm

  • Good one!
    Adding to this, is it possible to exclude a file from the search results if it contains data that the logged-on user does not have permission to see?

    I'd appreciate your comments on this.

    By Anonymous, at 2:11 pm

  • Hi Karri,

    Search results are security trimmed anyway, so users won't see results for items they don't have permission to view. I think this is what your question referred to?

    The only exception to this rule, as far as I know, is best bets. I don't believe that pre-defined best bets are security trimmed, although I could be wrong :)

    Cheers,
    Chris

    By Blogger Chris White, at 2:20 pm  

  • Hi,

    I made a crawl rule http://servername/*AllItems.aspx* (exclude all items),
    but I am still finding AllItems.aspx links in my search results.
    Any idea why this is happening?

    By Anonymous, at 3:32 am

  • Hi Chetali,

    The rule you want is in the article:

    *://*allitems.aspx*

    Have you run a full crawl, rather than just an incremental crawl?

    Thanks,
    Chris

    By Blogger Chris White, at 7:55 am  

  • You should also check out the "ViewFormsPagesLockDown" feature for stopping Google etc. from indexing your site's form pages.

    By Anonymous, at 3:27 pm

  • If I enter *://*all forms.aspx*, I get an "Invalid URL" error and cannot create the rule. If I use http://servername/*AllItems.aspx instead, the rule is created, but allitems.aspx still appears in the search results.

    Any ideas?

    Thanks
    Furqan

    By Blogger Furqan, at 8:41 am  

  • Please go to Shared Services Administration: SSP default > Search Administration > Crawl rules, and NOT to the crawl rules of a specific scope. Within the general crawl rules you can use the wildcards as mentioned in this post.

    Regards
    Mark de Bruijne

    By Blogger Unknown, at 12:29 pm  

  • Thank you.

    Very useful Information.

    By Anonymous Monali, at 5:23 am  

  • *://*all forms.aspx* cannot be used because spaces are not allowed in the rule. I get the following error - "The site name is not specified or is invalid. Specify a site name that does not contain the following characters: [/\\@#|] or spaces.".

    By Anonymous, at 5:47 pm
