Entretiens / Interviews / Entrevistas - Marie Lebert

EDUARD HOVY [EN, FR]

[EN] Eduard Hovy (Marina del Rey, California)

#Head of the Natural Language Group at USC/ISI (University of Southern
California / Information Sciences Institute)

The Natural Language Group (NLG) at the Information Sciences Institute of the University of Southern California (USC/ISI) is currently involved in various aspects of computational/natural language processing. The group's projects are: machine translation; automated text summarization; multilingual verb access and text management; development of large concept taxonomies (ontologies); discourse and text generation; construction of large lexicons for various languages; and multimedia communication.

Eduard Hovy, its director, is a member of the Computer Science Departments of USC and of the University of Waterloo. He completed a Ph.D. in Computer Science (Artificial Intelligence) at Yale University in 1987. His research focuses on machine translation, automated text summarization, text planning and generation, and the semi-automated construction of large lexicons and terminology banks. The Natural Language Group at ISI currently has projects in most of these areas.

Dr. Hovy is the author or editor of four books and over 100 technical articles.
He currently serves as the President of the Association for Machine Translation
in the Americas (AMTA). He is Vice President of the Association for
Computational Linguistics (ACL), and has served on the editorial boards of
Computational Linguistics and the Journal of the Society of Natural Language
Processing of Japan.

[Interview 27/08/1998 // Interview 08/08/1999 // Interview 02/09/2000]

*Interview of August 27, 1998

= How do you see the growth of a multilingual Web?

In the context of information retrieval (IR) and automated text summarization (SUM), multilingualism on the Web is another complexifying factor. People will write in their own language for several reasons (convenience, secrecy, and local applicability), but that does not mean that other people are not interested in reading what they have to say! This is especially true for companies involved in technology watch (say, a computer company that wants to know, daily, all the Japanese newspaper and other articles that pertain to what it makes) or some government intelligence agencies (the people who provide the most up-to-date information for use by your government officials in making policy, etc.). One of the main problems faced by these kinds of people is the flood of information, so they tend to hire "weak" bilinguals who can rapidly scan incoming text and throw out what is not relevant, giving the relevant stuff to professional translators. Obviously, a combination of SUM and MT (machine translation) will help here; since MT is slow, it helps if you can do SUM in the foreign language, and then just do a quick and dirty MT on the result, allowing either a human or an automated IR-based text classifier to decide whether to keep or reject the article.
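
The workflow described above (summarize in the source language, run a rough translation on the summary only, then let a classifier keep or reject the article) can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the systems built at USC/ISI: the function names `summarize`, `rough_translate`, `is_relevant` and `triage` are hypothetical, the summarizer is a simple word-frequency extractor, the "MT" step is a word-for-word glossary lookup standing in for a real translation engine, and the classifier is a plain keyword match.

```python
import re
from collections import Counter


def summarize(text, max_sentences=2):
    """Toy extractive summary: keep the sentences with the most frequent words."""
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]
    freq = Counter(re.findall(r"\w+", text.lower()))

    def score(sentence):
        # Average document frequency of the words in this sentence.
        words = re.findall(r"\w+", sentence.lower())
        return sum(freq[w] for w in words) / len(words) if words else 0.0

    top = sorted(sentences, key=score, reverse=True)[:max_sentences]
    # Emit the selected sentences in their original order.
    return " ".join(s for s in sentences if s in top)


def rough_translate(text, glossary):
    """Stand-in for quick and dirty MT: word-for-word glossary lookup.

    Words missing from the glossary are passed through unchanged.
    """
    tokens = re.findall(r"\w+|[^\w\s]", text.lower())
    return " ".join(glossary.get(t, t) for t in tokens)


def is_relevant(text, keywords, min_hits=1):
    """Keyword-match classifier: keep the text if enough keywords appear."""
    words = set(re.findall(r"\w+", text.lower()))
    return len(words & set(keywords)) >= min_hits


def triage(article, glossary, keywords, max_sentences=2):
    """Summarize first, translate only the summary, then keep or reject."""
    summary = summarize(article, max_sentences)
    gist = rough_translate(summary, glossary)
    return is_relevant(gist, keywords), gist


if __name__ == "__main__":
    # Hypothetical sample data, for illustration only.
    glossary = {"processeur": "processor", "nouveau": "new", "rapide": "fast"}
    article = ("Le nouveau processeur est rapide. "
               "Le processeur arrive demain. Il pleut a Paris.")
    keep, gist = triage(article, glossary, {"processor"})
    print(keep, "|", gist)
```

The point of the ordering is cost: the expensive step (translation) only ever sees the shortened text, and only the articles that pass the filter would go on to a professional translator.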