SearchView

SearchView is a free search tool that lets you search multiple search engines and save the results as XML or PDF. Search engines increasingly return their results as XML, which makes it possible to save those results and reaggregate them. SearchView is available as a free standalone tool and is also incorporated into the web research tools BigBlogZoo Classic and MediaMiner. Below is a summary of SearchView's features, followed by details on how to configure a new search engine to work with it.

 

a) To open other previously saved search results.
b) To save the current search results.
c) To save the current search results under a new name (Save As).
d) To autodetect channels. Many web pages now have links to XML channels. Certain browsers, for example Firefox, detect these links; Firefox displays a symbol in the bottom right corner when it finds one. The BigBlogZoo can detect and read any of these embedded links. Simply enter the URL of the web page in the browser and press the autodetect button. Once you have detected the channel, drag it into MyZoo or into the BigBlogZoo. Alternatively, you can use cut and paste.
e) To delete the currently selected entry.
f) To delete all of the entries.
g) To reaggregate the selected entries.
h) To create and view a PDF document from the selected search results.
i) A selected search engine.
j) Another selected search engine. Note that more than one search engine can be used.

Context Menu

After pressing the right mouse button more options become available.

k) To open a search result in a new tab.
l) To email this search result.
m) To save the search results under a new name (Save As).
n) To save the search results.
o) To open previously saved search results.

To Configure New Search Engines

A search engine of your choice can be configured with XML files in the searchxml directory. If you are not interested in configuring your own search engine, you need read no further on this page.

Below you can see the definition of the Yahoo! Search.

<?xml version="1.0"?>

<PostGetDefinition>

<SearchName>Yahoo!</SearchName>
<StartReferer>http://search.yahoo.com/</StartReferer>
<Method>GET</Method>
<GeneratesXMLLink>TRUE</GeneratesXMLLink>
<DefaultSearch>FALSE</DefaultSearch>

<FormHandler URL="search">
 <FormField name="p" value="THEQUERYSTRING"/>
 <FormField name="fr" value="FP-tab-web-t"/>
 <FormField name="toggle" value="1"/>
 <FormField name="cop" value=""/>
 <FormField name="ei" value="UTF-8"/>
</FormHandler>

</PostGetDefinition>

The value of SearchName is what gets copied to the menu (h). StartReferer is normally the home page of the search engine. Currently Method is always GET. GeneratesXMLLink indicates whether the search engine produces XML results. Not all search engines do (DMOZ, for example, does not); when the search engine produces no XML, this value should be FALSE. DefaultSearch should be TRUE for the search you wish to use as the default. The URL attribute of FormHandler is essentially everything that comes after the StartReferer, up to the question mark, when you do a search. The FormField elements that follow are the parameters of the search. At this point let's work through an example. Here is a search on Richard Nixon using Yahoo!:

http://search.yahoo.com/search?p=richard+nixon&fr=FP-tab-web-t&toggle=1&cop=&ei=UTF-8

This is the URL that comes back after you search for Richard Nixon using Yahoo! Note that this is only an example: you can plug the BigBlogZoo into any search engine that a) uses GET and b) doesn't block access to the BigBlogZoo. More about this later.
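To make the relationship between the definition and the resulting URL concrete, here is a minimal Python sketch that reads the Yahoo! definition shown earlier and rebuilds the example URL from it. This is only an illustration, not how SearchView itself is implemented; the function name build_search_url is made up for this example.

```python
import xml.etree.ElementTree as ET
from urllib.parse import urlencode

# The Yahoo! definition from this page, embedded as a string for the example.
DEFINITION = """<?xml version="1.0"?>
<PostGetDefinition>
<SearchName>Yahoo!</SearchName>
<StartReferer>http://search.yahoo.com/</StartReferer>
<Method>GET</Method>
<GeneratesXMLLink>TRUE</GeneratesXMLLink>
<DefaultSearch>FALSE</DefaultSearch>
<FormHandler URL="search">
 <FormField name="p" value="THEQUERYSTRING"/>
 <FormField name="fr" value="FP-tab-web-t"/>
 <FormField name="toggle" value="1"/>
 <FormField name="cop" value=""/>
 <FormField name="ei" value="UTF-8"/>
</FormHandler>
</PostGetDefinition>
"""

def build_search_url(definition_xml, query):
    """Hypothetical helper: turn a definition plus a query into a GET URL."""
    root = ET.fromstring(definition_xml)
    start_referer = root.findtext("StartReferer")
    handler = root.find("FormHandler")
    params = []
    for field in handler.findall("FormField"):
        value = field.get("value")
        # Swap the placeholder for the text the user searched for.
        params.append((field.get("name"), query if value == "THEQUERYSTRING" else value))
    # StartReferer + FormHandler URL + "?" + encoded parameters.
    return start_referer + handler.get("URL") + "?" + urlencode(params)

search_url = build_search_url(DEFINITION, "richard nixon")
print(search_url)
```

The sketch only builds the URL; it does not send any request to the search engine.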

First, note where the value of StartReferer comes from: it is everything up to and including the third forward slash. The URL attribute of FormHandler is what remains up to the question mark. That brings us to the parameters. Everything to the left of an equals sign is a name and everything to the right is a value. Here p is a special parameter because it contains our search. Ensure that the parameter carrying your search is given the value THEQUERYSTRING in the XML definition. This XML file needs to be copied into the xmlsearches directory.
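The same breakdown can be done mechanically. The following Python sketch splits a result URL into the three pieces described above and prints FormField lines you could paste into a definition. It is a rough aid under the assumption that the search text appears as exactly one parameter value; split_search_url is a hypothetical name, not part of SearchView.

```python
from urllib.parse import urlsplit, parse_qsl

def split_search_url(url, query):
    """Break a GET search URL into the pieces a definition file needs."""
    parts = urlsplit(url)
    # StartReferer: everything up to and including the third forward slash.
    start_referer = f"{parts.scheme}://{parts.netloc}/"
    # FormHandler URL: what remains of the path before the question mark.
    handler_url = parts.path.lstrip("/")
    fields = []
    # keep_blank_values keeps empty parameters such as cop=.
    for name, value in parse_qsl(parts.query, keep_blank_values=True):
        # The parameter carrying the search text gets the placeholder value.
        fields.append((name, "THEQUERYSTRING" if value == query else value))
    return start_referer, handler_url, fields

example = ("http://search.yahoo.com/search"
           "?p=richard+nixon&fr=FP-tab-web-t&toggle=1&cop=&ei=UTF-8")
start_referer, handler_url, fields = split_search_url(example, "richard nixon")

print(f"<StartReferer>{start_referer}</StartReferer>")
print(f'<FormHandler URL="{handler_url}">')
for name, value in fields:
    print(f' <FormField name="{name}" value="{value}"/>')
print("</FormHandler>")
```

Note that the plus sign in the URL stands for a space, so the query to look for is "richard nixon", not "richard+nixon".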

That's about all there is to it. Almost. Before you configure a search engine, read its terms of service to make sure you are not violating its rules. No automated querying is provided, as this normally violates a search engine's terms of service. What does it mean that only GET is supported? There are two main types of internet request, POST and GET. GET passes parameters in the URL, so if you type something into a search engine and then see what you searched for in the URL, it uses GET. POST works more behind the scenes: if you type something into a search engine and get back a URL without your search query, it is using POST.
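The rule of thumb above (your search text is visible in the URL, so the engine uses GET) can be written as a small check. This is a heuristic sketch only; query_visible_in_url is a hypothetical name, and the second URL uses a made-up host, search.example.com, to stand in for an engine whose result URL carries no parameters.

```python
from urllib.parse import urlsplit, parse_qsl

def query_visible_in_url(result_url, query):
    """Return True if the search text appears as a parameter value in the URL."""
    pairs = parse_qsl(urlsplit(result_url).query, keep_blank_values=True)
    return any(value == query for _, value in pairs)

# The Yahoo! example from this page: the search text is in the URL.
get_style = "http://search.yahoo.com/search?p=richard+nixon&fr=FP-tab-web-t&toggle=1&cop=&ei=UTF-8"
# Hypothetical result URL with no search text in it.
post_style = "http://search.example.com/results"

print(query_visible_in_url(get_style, "richard nixon"))
print(query_visible_in_url(post_style, "richard nixon"))
```

If the check fails for a search engine you want to add, it is probably using POST and cannot currently be configured.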

Even after making sure all these parameters are correct, the search engine may still block access. For example, Google normally allows access only from a list of accepted browsers. Unfortunately the BigBlogZoo hasn't yet made the list, but we will keep everyone posted. If you have configured a search engine that you think the rest of us would like to use, please submit it to the Safari Lodge and we will ensure it goes out with the release.

If you wish to know whether a web page has an embedded link, either look for the symbol in Firefox or the equivalent in other browsers that support autodetection of channels. Internet Explorer does not currently support this feature. You can also read the HTML. Such an embedded link typically looks like this:

<link rel="alternate"
  type="application/rss+xml" 
  href="http://syndicatescape.com/bigblogzoo/index.php?type=rss2;limit=20;action=.xml" />
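If you would rather not read the HTML by eye, a few lines of Python can pick out such links. The sketch below uses the standard library's HTMLParser and assumes that a channel is any link element with rel="alternate" and an RSS or Atom type. The class name ChannelLinkFinder is made up for this example, and this is not necessarily how the BigBlogZoo's autodetect button works.

```python
from html.parser import HTMLParser

# Assumption: these two types are treated as channels in this sketch.
CHANNEL_TYPES = {"application/rss+xml", "application/atom+xml"}

class ChannelLinkFinder(HTMLParser):
    """Collect (type, href) pairs for embedded channel links."""

    def __init__(self):
        super().__init__()
        self.channels = []

    def handle_starttag(self, tag, attrs):
        if tag != "link":
            return
        attributes = dict(attrs)
        rel = (attributes.get("rel") or "").lower().split()
        link_type = (attributes.get("type") or "").lower()
        href = attributes.get("href")
        if "alternate" in rel and link_type in CHANNEL_TYPES and href:
            self.channels.append((link_type, href))

page = """<html><head>
<link rel="stylesheet" type="text/css" href="style.css" />
<link rel="alternate"
  type="application/rss+xml"
  href="http://syndicatescape.com/bigblogzoo/index.php?type=rss2;limit=20;action=.xml" />
</head><body></body></html>"""

finder = ChannelLinkFinder()
finder.feed(page)
for link_type, href in finder.channels:
    print(link_type, href)
```

The stylesheet link in the sample page is there only to show that ordinary link elements are skipped.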

 

 
 
     Copyright 2005. GenerateScape