06-10-2008, 09:39 PM
|
#1 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
Search Engine Help in Java
hi to all,
I want to know how to capture, programmatically in Java, a value generated by a search engine. For example, if we search for a keyword in any search engine, it returns some results, and at the top of the page it shows the total number of results, like "Results 1 - 10 of about 43,100". In this case I want to store the value 43,100 in a variable automatically using Java. Any ideas, please?
any help will be appreciated,
thanks in advance.
|
|
|
06-11-2008, 02:19 AM
|
#2 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
What search engine? I don't understand what you're trying to do.
|
|
|
06-11-2008, 05:01 AM
|
#3 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
Actually, I want to capture the result count (43,100 in the case above) with a Java program, to avoid manual input. For instance, assume I want to do this with Google; may I know the process? Ultimately I want this to work with any search engine.
thanks!!!!!!!!!
|
|
|
06-11-2008, 04:34 PM
|
#4 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
So you're reading the response from an HTTP request, and wanting to screen-scrape the output? Can you post what you're working with? There are ten-million different ways to store the number "43,100" in Java, but I assume you're asking how to get it from the HTML spit back by Google into some data structure. I need to be familiar with what you've got set up for that before I can help you.
|
|
|
06-11-2008, 10:26 PM
|
#5 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
Yes, I want to read the response from HTTP. Suppose that on a daily basis I want to check the result count for my site and store it in a database to track its performance, avoiding manual input. Could you please describe some of those ten-million different ways?
cheers!!!
|
|
|
06-12-2008, 02:29 AM
|
#6 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
You need to be a little bit more precise - *are* you reading an HTTP response already, or do you *want* to read an HTTP response and need help figuring it out?
Assuming the former, for Google, you could simply put the whole response into a String and search for a table containing 'class="t bt"' - as that seems to hold the search results. The "total" is contained in the last <td> element - I'd just parse it out. Every search engine will be different, and need a custom parser. Then you could use JDBC to stick it into a database, or if you didn't want to mess around with that, just dump it into a flat file.
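For the flat-file option mentioned above, a minimal sketch might look like the following. It is not from the thread: the class name ResultLog, the file name results.csv, and the "date,engine,count" line layout are all assumptions made for illustration, and it assumes Java 8 or later for java.time.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.nio.file.StandardOpenOption;
import java.time.LocalDate;

// Hypothetical helper: keeps one line per daily check in a plain text file.
class ResultLog {

    // Appends a "date,engine,count" line, creating the file if it does not exist yet.
    static void append(Path file, LocalDate date, String engine, long count) throws IOException {
        String line = date + "," + engine + "," + count + System.lineSeparator();
        Files.write(file, line.getBytes(StandardCharsets.UTF_8),
                StandardOpenOption.CREATE, StandardOpenOption.APPEND);
    }

    public static void main(String[] args) throws IOException {
        // "results.csv" is an assumed file name; 43100 is the example count from the first post.
        append(Paths.get("results.csv"), LocalDate.now(), "google", 43100L);
    }
}
```

Run once a day, this builds up a file that can later be loaded into a spreadsheet or a database if the flat file stops being enough.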
|
|
|
06-12-2008, 05:32 AM
|
#7 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
I want to read from the HTTP response. As I'm new to Java, please show me a method to parse the search engine's output, or the whole process, pleaseeeee.
thanks!!!!!!!
|
|
|
06-12-2008, 05:14 PM
|
#8 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
If you're new to Java, you might be trying a bridge too far with this project. You'd need to install J2EE in addition to J2SE, then work with the (relatively) clunky framework to just do a simple HTTP request. Java is more designed for enterprise systems, not checking on your page's ranking in Google. I'd suggest using PHP or Perl.
|
|
|
06-12-2008, 10:05 PM
|
#9 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
I have already installed J2EE along with J2SE. So can you give me the pseudocode so that I can program it accordingly? I understand that Java is not meant for checking my site's page ranking in Google, but being able to extract the total number of results is enough for now, sir!
|
|
|
06-15-2008, 10:31 PM
|
#10 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
any ideas from any1 please!!!!!!!!!!!!!!!!!!!!!!!
|
|
|
06-16-2008, 04:20 PM
|
#11 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
What do you have so far for code?
|
|
|
07-02-2008, 12:03 AM
|
#12 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
hi 2 all,
after so many attempts I have the following code:
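import java.io.BufferedReader;
import java.io.InputStreamReader;
import java.net.URL;
import java.net.URLConnection;

class Test {
    public static void main(String[] args) throws Exception {
        new Test().go();
    }

    void go() throws Exception {
        URL url = new URL("http://search.yahoo.com/search?p=java");
        URLConnection conn = url.openConnection();
        conn.setRequestProperty("User-Agent", "");
        conn.connect();

        // Read the whole response into one string, echoing each line as it arrives.
        StringBuilder page = new StringBuilder();
        BufferedReader in = new BufferedReader(new InputStreamReader(conn.getInputStream()));
        String line;
        while ((line = in.readLine()) != null) {
            page.append(line);
            System.out.println(line);
        }
        in.close();
        String resultText = page.toString();

        // Take the text between the first "a href=" after "<p class=g>" and the next "onmousedown".
        String urlResult = "";
        int block = resultText.indexOf("<p class=g>");
        if (block != -1) {
            int start = resultText.indexOf("a href=", block);
            // Search for "onmousedown" from start, so the end index cannot fall before the start index.
            int end = (start == -1) ? -1 : resultText.indexOf("onmousedown", start);
            if (end > start + 7) {
                urlResult = resultText.substring(start + 7, end - 1);
            }
        }
        System.out.println(urlResult);
    }
}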
which prints the source code of the search results page.
here i am stuck. can any body pleeeeeeeeeeaaassssseeeeeeee!!!!!!!!!! help me!
|
|
|
07-02-2008, 03:01 PM
|
#13 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
What's wrong with it?
Also, here are some things that will encourage people to help you:
- Use the "code" tag so your code is legible.
- Don't beg.
- Be judicious with exclamation points. The more exclamation points, the less likely anyone is to help you.
- Same can be said for extended spellings of "please".
- Don't replace two-letter words with a digit.
|
|
|
07-02-2008, 09:00 PM
|
#14 (permalink)
|
|
Code Monkey
Join Date: May 2008
Posts: 36
|
I wrote the code to fetch the source of a search results page. The result count will be somewhere in that source. The problem is that I want to extract that number and store it in a variable.
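One way to pull the number out of the page source is a regular expression, sketched below. This is only an illustration under assumptions: the class name ResultCount and the method parseTotal are made up here, and the pattern assumes the page contains a phrase shaped like "of about 43,100" (the example from the first post). A real results page may word it differently, and the first "of <number>" on the page may not be the total, so the pattern would need adjusting per search engine.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper for extracting a result total from page source.
class ResultCount {

    // Assumed phrase shape: "of 43,100" or "of about 43,100".
    private static final Pattern TOTAL =
            Pattern.compile("\\bof\\s+(?:about\\s+)?([0-9][0-9,]*)");

    // Returns the first matching total as a long, or -1 if the phrase is not found.
    static long parseTotal(String pageSource) {
        // Strip tags first so markup such as <b>43,100</b> does not split the phrase.
        String text = pageSource.replaceAll("<[^>]*>", "");
        Matcher m = TOTAL.matcher(text);
        if (!m.find()) {
            return -1;
        }
        // Remove the thousands separators before converting to a number.
        return Long.parseLong(m.group(1).replace(",", ""));
    }

    public static void main(String[] args) {
        long total = parseTotal("Results <b>1</b> - <b>10</b> of about <b>43,100</b>");
        System.out.println(total); // prints 43100
    }
}
```

With the code from the earlier post, the call would be something like parseTotal(resultText) once the whole response has been read.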
|
|
|
07-04-2008, 09:11 AM
|
#15 (permalink)
|
|
Java fanboy
Join Date: Aug 2003
Posts: 1,166
|
But you are storing it as a variable - "urlResult". What, specifically, isn't working with the code you wrote?
|
|
|
All times are GMT -8. The time now is 03:23 PM.
|
Copyright © 2000-2008, Milano Interactive