I am happy to join the forum and hope to have lots of issues to discuss.
I am working on a local search engine for my master's thesis. I have configured the Nutch search engine and am able to play with it.
I learned that Nutch automatically indexes the documents it has crawled. But what I need is to do some
preprocessing on the downloaded pages before indexing. Can someone tell me how to go about it? If this is the wrong place
to post the question, could you tell me which group to join?
Thank you in advance!
- Tesfaye
Your Help on Nutch!
- Posts: 1
- Joined: 2010-05-31 06:00
Re: Your Help on Nutch!
1) What does this have to do with Debian?
2) Download the source, see what it's doing when and where. Modify the code to suit your needs.