
Lesson 13

This lesson covers working with the internet. You will also learn how to find a substring in a string.

Example 1

The program finds the position of a domain in Google search results and obtains the total number of results. It accepts several search queries and several domain names, and looks for the domains on the first five result pages.

The global array words stores the queries, and the domains array stores the domains to look for. An entry in domains does not have to be a complete domain name: any part of the name will do, and the result URLs are searched for that part.

arr  words of str = %{
   "q=free+programming+language",
   "q=gentee+programming+language"
}
arr  domains of str = %{
   "programming",
   "gentee.com",
   "php.net"
}

Let us look at the getresults function. It searches the string that contains the source of the search result page for the total number of results. First, we examine the result page to find the substring that precedes the total number of results, and assign that substring to the pattern variable. The number we need immediately follows this substring.

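uint     offset end
spattern sp
str      pattern = "of about <b>"
   
out.clear()
// Prepare a case-insensitive search pattern
sp.init( pattern, $QS_IGNCASE )
offset = sp.search( input, 0 )
// Skip the pattern itself; the number starts right after it
offset += *pattern
if offset < *input 
{
   // The number ends at the next '<' character
   if ( end = input.findchfrom( '<', offset )) < *input
   {
      out.substr( input, offset, end - offset )
   }
}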

A variable of the spattern type is used for searching. We first initialize the search pattern with the init method and then run the search with the search method.
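For readers who know Python better than Gentee, the same extraction logic can be sketched as follows. This is not Gentee code and not part of the lesson's program: the function name get_results and the sample string are made up for illustration, and the "of about <b>" marker is taken from the lesson as-is.

```python
import re


def get_results(page, pattern="of about <b>"):
    """Return the text between `pattern` and the next '<', or '' if not found."""
    # Case-insensitive search, mirroring the $QS_IGNCASE flag used in the lesson.
    match = re.search(re.escape(pattern), page, re.IGNORECASE)
    if match is None:
        return ""
    start = match.end()          # first character after the pattern
    end = page.find("<", start)  # the number ends at the next '<'
    if end == -1:
        return ""
    return page[start:end]


# Hypothetical fragment of a result page, for illustration only.
sample = "Results 1 - 10 OF ABOUT <b>1,230,000</b> for gentee"
print(get_results(sample))
```

As in the Gentee version, the function returns an empty string when either the marker or the closing '<' is missing.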

The displayed function works like the previous one; it looks for the specified domain substrings in the result pages. The sp pattern, built from the pattern string, locates the start of the next search result, and the dp pattern finds the domain substring in the URL of that result.

spattern sp dp
str      pattern = "<p class=g><a class=l href=\""
   
sp.init( pattern, $QS_IGNCASE )
dp.init( domains[id], $QS_IGNCASE )
url.clear()

while ( offset = sp.search( input, offset )) < *input
{
   ret++ 
   offset += *pattern
   if ( end = input.findchfrom( '"', offset )) < *input
   {
      str stemp     
      stemp.substr( input, offset, end - offset )
      if dp.search( stemp, 0 ) < *stemp
      { 
         url = stemp
         return page * 10 + ret
      }
   }
}

id is the index of a domain name in the domains array. The URL that contains the required domain name is returned in the url argument. The function returns the position of the domain name in the search results.
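The scan performed by displayed can be sketched in Python as follows. This is an illustrative analogue rather than Gentee code: the function signature, the RESULT_MARKER constant, and the sample HTML are assumptions, and returning (0, "") when nothing matches is a choice made for this sketch. The marker string itself comes from the lesson.

```python
import re

# Marker that precedes each result URL, taken from the lesson.
RESULT_MARKER = '<p class=g><a class=l href="'


def displayed(page_html, domain, page_index):
    """Return (position, url) of the first result whose URL contains `domain`.

    Returns (0, "") if no result on this page matches.
    """
    marker = re.compile(re.escape(RESULT_MARKER), re.IGNORECASE)
    rank = 0
    for match in marker.finditer(page_html):
        rank += 1                          # count every result on the page
        start = match.end()
        end = page_html.find('"', start)   # the URL ends at the closing quote
        if end == -1:
            continue
        url = page_html[start:end]
        if domain.lower() in url.lower():  # case-insensitive substring test
            return page_index * 10 + rank, url
    return 0, ""


# Hypothetical page fragment with two results, for illustration only.
sample = (
    '<p class=g><a class=l href="http://www.php.net/">PHP</a>'
    '<p class=g><a class=l href="http://www.gentee.com/">Gentee</a>'
)
print(displayed(sample, "gentee.com", 0))
```

The page index is multiplied by 10 because each result page is assumed to hold ten results, as in the lesson's code.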

Now consider the main function, googlesearch. It first calls inet_init to initialize the internet library and then loops over the queries in the words array. For each query, http_get downloads the first search result page into the result string, and getresults extracts the total number of results found.

inet_init()
   
fornum iword, *words
{
   str  result stemp
   print("\nSearching \(words[ iword ])\nPage 1.")
   if !http_get( "http://www.google.com/search?\(words[ iword ])", 
                        result->buf, 0, $HTTPF_STR ) : break
   getresults( result, stemp )
   output += "   Search: \(words[ iword ])\lAll pages: \(stemp)\l"

Next, we look for each entry of the domains array on the first five search result pages. The position of each domain is stored in the pos array; a value of 0 means that the domain has not been found yet.

fornum id = 0, *domains : pos[ id ] = 0
page = 0
while 1
{                 
   fornum id = 0, *domains
   {            
      if !pos[ id ]
      {
         pos[ id ] = displayed( result, id, page, urls[ id ] )
      }                       
   }
   if ++page == 5 : break
   print("Page \( page + 1 ).") 
   if !http_get( "http://www.google.com/search?\(words[ iword ])&start=\(page * 10 )", 
   result->buf, 0, $HTTPF_STR ) : break
}
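The way the request URL changes from page to page can be sketched in Python as follows. This is an illustration, not Gentee code: the helper name search_url and its per_page parameter are hypothetical, while the base address and the q and start parameters are the ones used in the lesson. Unlike the lesson's hand-written query strings, urlencode escapes the query automatically and turns spaces into '+'.

```python
from urllib.parse import urlencode


def search_url(query, page, per_page=10):
    """Build the search URL for a zero-based result page."""
    params = {"q": query}
    if page > 0:
        # Later pages are addressed by the index of their first result.
        params["start"] = page * per_page
    return "http://www.google.com/search?" + urlencode(params)


# URLs for the first five result pages of one query.
for page in range(5):
    print(search_url("free programming language", page))
```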

Each subsequent page of Google search results is downloaded with another http_get call. Finally, we write the collected data to a file and open it. The program is now complete.

      output += "\lResults for 1-5 pages\l"            
      fornum id = 0, *domains
      {
         output += "\( domains[ id ]): \(?( pos[ id ], str( pos[id] ) + 
                  "  \(urls[id])", 
                       "Not found"))\l"            
      }      
      output += "==========================================\l"
   }
   inet_close()
   output.write("search.txt")
   shell("search.txt")
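The report section for one query can be sketched in Python as follows. This is an illustrative analogue, not Gentee code: the function name format_report and the sample data are made up, and the conditional expression plays the role of the ?( condition, a, b ) construct in the Gentee code above.

```python
def format_report(domains, positions, urls):
    """Format one report section; a position of 0 means 'Not found'."""
    lines = ["Results for 1-5 pages"]
    for domain, pos, url in zip(domains, positions, urls):
        detail = f"{pos}  {url}" if pos else "Not found"
        lines.append(f"{domain}: {detail}")
    lines.append("=" * 42)  # separator line between queries
    return "\n".join(lines)


# Hypothetical positions and URLs, for illustration only.
print(format_report(
    ["gentee.com", "php.net"],
    [3, 0],
    ["http://www.gentee.com/", ""],
))
```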


 Copyright © 2004-2006 Gentee Inc. All rights reserved.