This lesson covers working with the internet. You will also learn how to find a substring in a string.
Example 1
Find the position of a domain in Google search results and obtain the total number of results. The program accepts several search queries and several domain names. Domains are looked up on the first five result pages.
The global array words stores the queries, and the domains array stores the domains to look for. A domains entry does not have to be a complete domain name: any fragment of it will do, and a search result matches when its URL contains that fragment.
arr words of str = %{
"q=free+programming+language",
"q=gentee+programming+language"
}
arr domains of str = %{
"programming",
"gentee.com",
"php.net"
}
Let us look at the getresults function. It searches the string that contains the source of the search result page for the total number of results. First, we inspect the search page to find the substring that comes right before the total number of results, and assign that substring to the pattern variable. The value we need is the text that follows this substring.
uint offset end
spattern sp
str pattern = "of about <b>"
out.clear()
sp.init( pattern, $QS_IGNCASE )
offset = sp.search( input, 0 )
offset += *pattern
if offset < *input
{
   if ( end = input.findchfrom( '<', offset )) < *input
   {
      out.substr( input, offset, end - offset )
   }
}
A variable of the spattern type is used for searching. We first initialize the search pattern with the init method and then run the search with the search method. Judging by the checks against *input, a result equal to the length of the string means that nothing was found.
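The same find-then-slice logic can be tried outside Gentee. The following is a minimal sketch in Python, not Gentee: get_results and the sample string are hypothetical, and the regular expression with IGNORECASE only stands in for the $QS_IGNCASE flag used above.

```python
import re


def get_results(page, marker="of about <b>"):
    """Return the text between `marker` and the next '<', or "" if absent."""
    # Case-insensitive search for the literal marker text.
    match = re.search(re.escape(marker), page, re.IGNORECASE)
    if match is None:
        return ""
    start = match.end()           # first character after the marker
    end = page.find("<", start)   # the next tag ends the value
    if end < 0:
        return ""
    return page[start:end]


sample = "Results 1 - 10 OF ABOUT <b>1,230,000</b> for <b>gentee</b>"
print(get_results(sample))
```

As in the Gentee fragment, the output stays empty when either the marker or the closing '<' is missing.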
The displayed function works like the previous one and searches the result pages for the specified domain substrings. The pattern string locates the start of the next search result, and the dp pattern finds the domain substring in that result's URL.
spattern sp dp
str pattern = "<p class=g><a class=l href=\""
sp.init( pattern, $QS_IGNCASE )
dp.init( domains[id], $QS_IGNCASE )
url.clear()
while ( offset = sp.search( input, offset )) < *input
{
   ret++
   offset += *pattern
   if ( end = input.findchfrom( '"', offset )) < *input
   {
      str stemp
      stemp.substr( input, offset, end - offset )
      if dp.search( stemp, 0 ) < *stemp
      {
         url = stemp
         return page * 10 + ret
      }
   }
}
id is the index of a domain name in the domains array. The URL that contains the required domain name is returned in the url argument. The function returns the position of the domain in the search results.
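The position arithmetic may be easier to follow in a small standalone sketch. The code below is Python, not Gentee, and rests on assumptions: the displayed signature, the sample markup and the (0, "") result for "not found" are chosen for illustration, with 0 picked because the caller shown later treats a zero position as not yet found.

```python
import re


def displayed(page_html, domain, page,
              marker='<p class=g><a class=l href="'):
    """Return (position, url) of the first result whose URL contains
    `domain`, or (0, "") when this page has no such result."""
    rank_on_page = 0
    for match in re.finditer(re.escape(marker), page_html, re.IGNORECASE):
        rank_on_page += 1                 # every marker starts one result
        start = match.end()
        end = page_html.find('"', start)  # closing quote of the href value
        if end < 0:
            continue
        url = page_html[start:end]
        if domain.lower() in url.lower():
            # `page` is zero-based and each page holds ten results
            return page * 10 + rank_on_page, url
    return 0, ""


sample = ('<p class=g><a class=l href="http://www.php.net/">PHP</a>'
          '<p class=g><a class=l href="http://www.gentee.com/">Gentee</a>')
print(displayed(sample, "gentee.com", page=1))  # (12, 'http://www.gentee.com/')
```

The second result on the page with index 1 is reported as position 12, which matches the page * 10 + ret expression in the Gentee code.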
Let us look at the main function, googlesearch. First, call inet_init to initialize the internet library, then loop over the queries in the words array. The http_get function downloads the search result page into the result string, and getresults extracts the total number of results found.
inet_init()
fornum iword, *words
{
   str result stemp
   print("\nSearching \(words[ iword ])\nPage 1.")
   if !http_get( "http://www.google.com/search?\(words[ iword ])",
                 result->buf, 0, $HTTPF_STR ) : break
   getresults( result, stemp )
   output += " Search: \(words[ iword ])\lAll pages: \(stemp)\l"
Next, we look for the positions of the required domains on the first five result pages. The positions are stored in the pos array.
   fornum id = 0, *domains : pos[ id ] = 0
   page = 0
   while 1
   {
      fornum id = 0, *domains
      {
         if !pos[ id ]
         {
            pos[ id ] = displayed( result, id, page, urls[ id ] )
         }
      }
      if ++page == 5 : break
      print("Page \( page + 1 ).")
      if !http_get( "http://www.google.com/search?\(words[ iword ])&start=\(page * 10 )",
                    result->buf, 0, $HTTPF_STR ) : break
   }
The next page of Google results is downloaded by calling the http_get function with the start parameter. Finally, we write the collected data to a file, which completes the program.
   output += "\lResults for 1-5 pages\l"
   fornum id = 0, *domains
   {
      output += "\( domains[ id ]): \(?( pos[ id ], str( pos[id] ) +
                " \(urls[id])",
                "Not found"))\l"
   }
   output += "==========================================\l"
}
inet_close()
output.write("search.txt")
shell("search.txt")
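The five requests made for each query differ only in the start parameter. As a rough illustration in Python rather than Gentee, the URLs could be built as follows. page_urls is a hypothetical helper, the URL layout is copied from the http_get calls above, and nothing is sent over the network.

```python
from urllib.parse import urlencode


def page_urls(query, pages=5, per_page=10):
    """Build one search URL per result page; the first page has no offset."""
    base = "http://www.google.com/search?"
    urls = []
    for page in range(pages):
        params = {"q": query}
        if page > 0:
            params["start"] = page * per_page  # 10, 20, 30, 40
        # urlencode escapes the values, turning spaces into '+'
        urls.append(base + urlencode(params))
    return urls


for url in page_urls("gentee programming language"):
    print(url)
```

Encoding the query in one place avoids having to write the plus signs by hand in every query string.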
Copyright © 2004-2006 Gentee Inc. All rights reserved.