Tasks and Applications
Knowledge Layer Building
- Raw web pages (from RetroIndex)
Classification of knowledge extraction tasks
- By knowledge type: entities, types, semantic classes, peer similarity, synonymy, antonymy, acronyms, is-a, part-of, attribute names, attribute values, general relations, events...
- By data source: plain sentences, web page contents, page URLs, anchor text, query sets, click-through data, dictionaries, databases...
- By source scope:
  - For web pages: all pages, pages in a subset of domains (e.g. Wikipedia pages)...
  - For query sets: all queries, queries containing "list of"...
- By technique element: lexical pattern matching, tag pattern matching, co-occurrence calculation, wrapper induction, template generation, context analysis, distributional similarity, graph analysis, bootstrapping, classification, clustering...
Knowledge extraction task items
- IsA extraction from web pages, performed iteratively using lexical patterns and previously produced knowledge
- Attribute extraction from Wikipedia and other data sources
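
As a rough illustration of the first item, lexical-pattern IsA extraction can be sketched with a single "X such as A, B and C" pattern. This is a minimal sketch, not the system's actual extractor: the regular expression, the `extract_isa` function name, and the restriction to capitalized single-word instances are all assumptions made for the example, and the iterative reuse of produced knowledge is not shown.

```python
import re
from collections import Counter

# Assumed pattern: "<hypernym> such as <Inst>, <Inst> and <Inst>".
# Instances are restricted to capitalized single words to keep the sketch small.
SUCH_AS = re.compile(
    r"(\w+) such as ([A-Z]\w*(?:(?:, and |, or |, | and | or )[A-Z]\w*)*)"
)
SEPARATOR = re.compile(r",? (?:and|or) |, ")

def extract_isa(sentences):
    """Count (instance, hypernym) pairs matched by the pattern."""
    pairs = Counter()
    for sentence in sentences:
        for match in SUCH_AS.finditer(sentence):
            hypernym = match.group(1).lower()
            for instance in SEPARATOR.split(match.group(2)):
                pairs[(instance, hypernym)] += 1
    return pairs

if __name__ == "__main__":
    demo = [
        "European cities such as Paris, London and Berlin attract tourists.",
        "He visited cities such as Paris.",
    ]
    for pair, count in extract_isa(demo).items():
        print(pair, count)
```

The pair counts give a simple frequency signal; pairs seen in many sentences are more likely to be correct than pairs seen once.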
Target knowledge layers
- Type system
- Attribute system
- Relatedness system
Common Utility Modules
- Common APIs for basic feature requirements
- Query parsing and exploration (QPE)
- Short text conceptualization
- Short text similarity calculation
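
One possible way to connect the last two utilities is to map each short text to a weighted concept vector and compare vectors by cosine similarity. The sketch below assumes that approach; the outline does not specify the actual method. The `TOY_ISA` table, its weights, and the function names are hypothetical stand-ins for a real type system.

```python
import math
from collections import Counter

# Hypothetical toy IsA table: term -> {concept: weight}. Values are made up.
TOY_ISA = {
    "apple": {"fruit": 0.6, "company": 0.4},
    "banana": {"fruit": 1.0},
    "microsoft": {"company": 1.0},
}

def conceptualize(text):
    """Sum the concept weights of every known term in the text."""
    vector = Counter()
    for token in text.lower().split():
        for concept, weight in TOY_ISA.get(token, {}).items():
            vector[concept] += weight
    return vector

def cosine(a, b):
    """Cosine similarity of two sparse vectors; 0.0 if either is empty."""
    dot = sum(a[k] * b[k] for k in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def similarity(text1, text2):
    """Short text similarity as cosine between concept vectors."""
    return cosine(conceptualize(text1), conceptualize(text2))

if __name__ == "__main__":
    print(similarity("apple banana", "banana"))
    print(similarity("apple banana", "microsoft"))
```

Comparing in concept space lets two texts with no words in common still score as similar when their terms share concepts.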