Extraction of Most Relevant Data from Deep Web Mining
Ashok P1, V. Hariharan2, R. Lavanya3, R. R. Prianka4
1P. Ashok, Department of Information Technology, Veltech Hightech Dr. R. R. S. R. Engineering College, Avadi, Chennai, India.
2V. Hariharan, Department of Information Technology, Veltech Hightech Dr. R. R. S. R. Engineering College, Avadi, Chennai, India.
3R. Lavanya, Department of CSE, Veltech Multitech Dr. R. R. S. R. Engineering College, Avadi, Chennai, India.
4R. R. Prianka, Department of Information Technology, Veltech Hightech Dr. R. R. S. R. Engineering College, Avadi, Chennai, India.
Manuscript received on December 01, 2014. | Revised Manuscript received on December 10, 2014. | Manuscript published on December 15, 2014. | PP: 16-18 | Volume-3 Issue-1, December 2014. | Retrieval Number: A0762123114/2014©BEIESP
© The Authors. Published By: Blue Eyes Intelligence Engineering and Sciences Publication (BEIESP). This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/)
Abstract: Extracting relevant data from deep web pages is a difficult task because the pages depend on the programming language used to build them. The challenges of such extraction increase every day as web databases continue to grow, which has led researchers to concentrate on deep web mining. Whenever a user submits a query to a search engine, the engine retrieves a list of the best-matching web pages, each with a short summary such as the title and some text from the specific site. However, the information retrieved from the web database remains locked within the web page as deep web (Hidden Web or Invisible Web) content. In this paper, we propose an ontological technique using WordNet to extract data records from deep web pages. This technique discovers the best-matching words, eliminates unnecessary tags, and is able to extract a large variety of data records with different structures.
Keywords: Ontology, Deep web, WordNet, Web Mining.
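The two steps named in the abstract, eliminating unnecessary tags and discovering the best-matching words, can be illustrated with a minimal sketch. It is not the authors' implementation: the `SYNONYMS` table is a hypothetical stand-in for WordNet synsets, each `<li>` element is assumed to hold one data record, and the names `RecordExtractor`, `expand`, and `rank_records` are introduced here only for illustration.

```python
import re
from html.parser import HTMLParser

# Hypothetical stand-in for WordNet synsets; a real system would query WordNet.
SYNONYMS = {
    "car": {"car", "auto", "automobile"},
    "price": {"price", "cost"},
}

# Illustrative page fragment; each <li> is assumed to be one data record.
SAMPLE_HTML = (
    "<ul>"
    "<li><b>Used automobile</b> cost 5000</li>"
    "<li>Bicycle <script>x=1</script>for sale</li>"
    "<li>New <i>auto</i></li>"
    "</ul>"
)


class RecordExtractor(HTMLParser):
    """Collects the visible text of each <li>, dropping tags and scripts."""

    def __init__(self):
        super().__init__()
        self.records = []
        self._buf = None  # text chunks of the record being read, or None
        self._skip = 0    # nesting depth inside <script>/<style>

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1
        elif tag == "li":
            self._buf = []

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1
        elif tag == "li" and self._buf is not None:
            self.records.append(" ".join(self._buf))
            self._buf = None

    def handle_data(self, data):
        text = data.strip()
        if text and self._buf is not None and not self._skip:
            self._buf.append(text)


def expand(query):
    """Expand each query word with its synonyms from the stand-in table."""
    terms = set()
    for word in query.lower().split():
        terms |= SYNONYMS.get(word, {word})
    return terms


def rank_records(html, query):
    """Return records sharing at least one term with the expanded query,
    ordered by the number of matching terms (highest first)."""
    parser = RecordExtractor()
    parser.feed(html)
    parser.close()
    terms = expand(query)
    scored = []
    for record in parser.records:
        words = set(re.findall(r"[a-z0-9]+", record.lower()))
        score = len(words & terms)
        if score:
            scored.append((score, record))
    scored.sort(key=lambda pair: -pair[0])  # stable sort keeps page order on ties
    return [record for _, record in scored]


if __name__ == "__main__":
    for record in rank_records(SAMPLE_HTML, "car price"):
        print(record)
```

In this sketch the query "car price" matches records that never contain either word literally, because the expansion step maps "car" to "automobile" and "auto" and "price" to "cost". Swapping the toy table for a real WordNet lookup would change only `expand`.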