<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:71px; top:528px; width:226px; height:243px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">Every day, a large number of news articles are cre-
<br>ated and reported, many of which are unique. But
<br>certain types of events, such as hurricanes or mur-
<br>ders, are reported again and again throughout a year.
<br>The goal of Information Extraction, or IE, is to re-
<br>trieve a certain type of news event from past articles
<br>and present the events as a table whose columns are
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">filled with the name of a person or company, according
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">to its role in the event. However, existing IE
<br>techniques require a lot of human labor. First, you
<br>have to specify the type of information you want and
<br>collect articles that include this information. Then,
<br>you have to analyze the articles and manually craft
<br>a set of patterns to capture these events. Most exist-
<br>ing IE research focuses on reducing this burden by
<br>helping people create such patterns. But each time
<br>you want to extract a different kind of information,
<br>you need to repeat the whole process: specify articles
<br></span></div><div style="position:absolute; border: textbox 1px solid; writing-mode:lr-tb; left:313px; top:284px; width:226px; height:488px;"><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">and adjust the patterns, either manually or semi-automatically.
<br>There is a dangerous pitfall
<br>here. First, it is hard to estimate how good the sys-
<br>tem can be after months of work. Furthermore, you
<br>might not know if the task is even doable in the first
<br>place. Knowing in advance what kind of information is
<br>easily obtained would help reduce this risk.
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">An IE task can be defined as finding a relation
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">among several entities involved in a certain type of
<br>event. For example, in the MUC-6 management
<br>succession scenario, one seeks a relation between
<br>COMPANY, PERSON and POST involved in hiring/firing
<br>events. Each row of an extracted table can always be
<br>read as “COMPANY hired (or fired) PERSON for POST.”
<br>The same relation between these entities holds in every row of the table.
<br>Much existing work addresses obtaining extraction patterns
<br>for pre-defined relations (Riloff, 1996; Yangarber
<br>et al., 2000; Agichtein and Gravano, 2000; Sudo
<br>et al., 2003).
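<br>As a minimal sketch of this view of an IE task, the snippet below models one row of a management succession table and reads it back as a sentence. The class name, the field names, the extra action field, and the sample rows are all hypothetical and chosen only for illustration; they are not taken from MUC-6 data.

```python
from collections import namedtuple

# One row of a hypothetical management-succession table.
# COMPANY, PERSON and POST come from the scenario described above;
# the "action" field is an assumption added to distinguish hiring from firing.
SuccessionRow = namedtuple("SuccessionRow", ["company", "person", "post", "action"])


def read_row(row):
    """Render a row as 'COMPANY hired (or fired) PERSON for POST'."""
    return "{} {} {} for {}".format(row.company, row.action, row.person, row.post)


# Invented sample rows, not real extraction results.
table = [
    SuccessionRow("Acme Corp", "Jane Doe", "CEO", "hired"),
    SuccessionRow("Globex", "John Roe", "CFO", "fired"),
]

for row in table:
    print(read_row(row))
```

Every row is rendered by the same template, which is what it means for one relation to hold across the whole table.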
<br>Unrestricted Relation Discovery is a technique for
<br>automatically discovering relations of this kind that repeatedly
<br>appear in a corpus and presenting them as tables, with
<br>no human intervention. Unlike in most existing
<br>IE research, a user does not specify the type
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">of articles or information wanted. Instead, a system
<br></span><span style="font-family: QTLIUY+NimbusRomNo9L-Regu; font-size:9px">tries to find all the kinds of relations that are reported
<br>multiple times and can be presented in tabular form.
<br>This technique will open up the possibility of try-
<br>ing new IE scenarios. Furthermore, the system itself
<br>can be used as an IE system, since an obtained re-
<br>lation is already presented as a table. If this system
<br>works to a certain extent, tuning an IE system be-
<br>comes a search problem: all the tables are already
<br>built “preemptively.” A user only needs to search