Task 1 - Sequential Update Summarization

Overview

Unexpected news events such as natural disasters represent a unique information access problem in which the performance of traditional approaches deteriorates.  For example, immediately after an event, the corpus may be sparsely populated with relevant content.  Even when relevant content becomes available after a few hours, it is often inaccurate or highly redundant.  At the same time, crisis events are precisely the scenario in which users urgently need information, especially if they are directly affected by the event.

The goal of this track is to develop systems which allow users to efficiently monitor the information associated with an event over time.  Specifically, we are interested in developing systems which:
  1. Can broadcast useful, new, and timely sentence-length updates about a developing event.
  2. Can track the value of important event-related attributes (e.g. number of fatalities, financial impact); a minimal sketch of both output types follows this list.
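
As a concrete illustration of the two output types above, a minimal Python sketch of a broadcast update and a tracked attribute value might look like the following; the class and field names are assumptions chosen for illustration only, not part of the track specification.

    from dataclasses import dataclass

    @dataclass
    class Update:
        """A sentence-length update broadcast by the system (output type 1)."""
        event_id: str      # identifier of the developing event being monitored
        timestamp: float   # time at which the update was emitted (epoch seconds)
        text: str          # the sentence-length update itself

    @dataclass
    class AttributeValue:
        """A current estimate of an event-related attribute (output type 2)."""
        event_id: str      # identifier of the developing event
        timestamp: float   # time at which the estimate was emitted
        attribute: str     # e.g. "fatalities", "financial_impact"
        value: float       # current estimated value of the attribute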

Goals

  • to develop algorithms which detect sub-events with low latency.
  • to develop algorithms which minimize redundant information in unexpected news events.
  • to model information reliability in the presence of a dynamic corpus.
  • to understand and address the sensitivity of text summarization algorithms in an online, sequential setting.
  • to understand and address the sensitivity of information extraction algorithms in dynamic settings.

Problem Definition

For each event, a system will traverse the input stream of documents from the event onset time, t0, until some fixed period afterward, tT.  Throughout this simulation, the system will emit short timestamped text summaries whenever appropriate.  At the end of the simulation, the system will have produced a list of tuples,

  O = ((u1, t1), (u2, t2), ..., (un, tn))

where ti is the timestamp of the ith text update, ui.  The content of ui can be either extracted from documents in the stream or generated by the system.


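As a minimal sketch of this simulation loop, assuming a hypothetical time-ordered document iterator and a placeholder decide_update function that encapsulates the system's relevance and novelty logic (neither is part of the track infrastructure), a Python outline might look like the following.

    from typing import Callable, Iterable, List, Optional, Tuple

    Document = Tuple[float, str]          # (timestamp, document text)
    UpdateTuple = Tuple[str, float]       # (u_i, t_i): update text and its timestamp

    def run_simulation(
        stream: Iterable[Document],
        decide_update: Callable[[str, List[UpdateTuple]], Optional[str]],
    ) -> List[UpdateTuple]:
        """Traverse the document stream from t0 to tT, emitting timestamped updates.

        `stream` is assumed to yield documents in time order between the event
        onset t0 and the cutoff tT; `decide_update` returns a sentence-length
        update (extracted or generated) or None when nothing should be emitted.
        """
        output: List[UpdateTuple] = []
        for timestamp, text in stream:
            update = decide_update(text, output)    # may extract or generate a sentence
            if update is not None:
                output.append((update, timestamp))  # record the tuple (u_i, t_i)
        return output
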
Evaluation

In order to evaluate a system's simulation output, O, we need the set of all relevant sub-events, each annotated with the time at which it occurred.  This set can be derived by retrospective analysis of the event using a manual editorial process; this is the approach taken in the GALE distillation and NTCIR 1CLICK evaluations.  In addition to purely manual evaluation, we will consider a semi-automatic nugget-based evaluation: deciding which nuggets (updates) match which system outputs or documents can be done mostly automatically.  System performance will be aggregated over a set of events, each with its own system output and set of relevant sub-events.

We currently plan to use the Wikipedia edit history for Current Events, since it clearly defines a target audience and provides a great deal of manual extraction and summarization work that is updated as current events unfold and news becomes available.  Assessors will remove unnecessary components and aid in fact extraction based on the edit history.
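
As a rough sketch of the semi-automatic matching idea, assuming simple token-overlap matching with an arbitrary threshold and a plain recall score (illustrative assumptions, not the official metric), nuggets could be matched to system updates and scores aggregated over events as follows.

    from typing import Dict, List, Tuple

    def token_overlap(a: str, b: str) -> float:
        """Jaccard overlap between the token sets of two strings."""
        ta, tb = set(a.lower().split()), set(b.lower().split())
        return len(ta & tb) / len(ta | tb) if ta | tb else 0.0

    def nugget_recall(nuggets: List[str], updates: List[Tuple[str, float]],
                      threshold: float = 0.5) -> float:
        """Fraction of gold nuggets matched by at least one system update."""
        matched = sum(
            1 for nugget in nuggets
            if any(token_overlap(nugget, text) >= threshold for text, _ in updates)
        )
        return matched / len(nuggets) if nuggets else 0.0

    def aggregate(per_event_scores: Dict[str, float]) -> float:
        """Average a per-event score over the set of evaluated events."""
        return sum(per_event_scores.values()) / len(per_event_scores) if per_event_scores else 0.0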