Annual Research Day

The group holds a research day at the start of the Autumn term every year to bring everyone together and foster interaction between new and returning members and invited speakers.

NLIP Research Day (incorporating the 19th Annual Language And Computation Day)

University of Essex
Friday 9 October 2020
Location: Zoom

Programme

0900 – Join us on Zoom for virtual tea and coffee

0930 – Welcome and Introduction to the Natural Language and Information Processing (NLIP) Research Group (Jon Chamberlain) 

0940 – Keynote talk: Mark Lattimer, Executive Director, CEASEFIRE Centre for Civilian Rights

Using NLIP to enhance human rights monitoring in conflict environments
This presentation will consider the experience of the CEASEFIRE Centre for Civilian Rights, in partnership with Essex University, seeking to develop civilian-led monitoring of violations in armed conflict. Difficulties in accessing war zones have raised interest in information circulating online and through social media. But how can NLIP be used to help gather data on rights violations? This talk will look at successes of the project to date and challenges for the future.

Session 1: Language and Computation (Chair: Jon Chamberlain)

1000 – Lightning updates (2 slides, 2 minutes)

  • Creating and evidencing impact (Matthew Wells)
  • Detecting hate speech in Evalita2020 (Udo Kruschwitz)
  • Update from DALI (Massimo Poesio)
  • Year 3 of the Royal Academy of Engineering Visiting Professorship (Tony Russell-Rose)
  • Update from BT KTP (Reyhaneh Hashempour)
  • Named Entity Detection on Mondaq Articles (Hui Yang)
  • Knowledge graphs and Perception: Update from Signal AI (James Brill)
1015 – Research updates (8 mins present, 2 mins Qs)
  • Legal chatbots (Mudita Sharma)
  • Update from Neotas KTP (Mozhgan Talebpour)
  • Automatic multi document text summarization from heterogeneous data sources (Mahsa Kia)
  • From academia to industry: Challenges and opportunities of NLP in the wild (Ayman Alhelbawy)
1130 – Break

1140 – Lightning updates (2 slides, 2 minutes)
  • Article prediction in the Chinese law system (Yunfei Long)
  • Life in China (Richard Sutcliffe)
  • Social Media Audience of Diplomatic Conflict in Korea and Japan (Akitaka Matsuo)
  • Fun with discourse representation (Chris Fox)
  • Learning to Rank Word Embeddings (Shoaib Jameel)
  • Does BERT Learn Latent Topics? (Mozhgan Talebpour)

1210 – Research updates (8 mins present, 2 mins Qs)

  • One word embedding to rule them all? Political Text Classification using CNNs (Kakia Chatsiou)
  • Textual Data Augmentation for Efficient Active Learning on Tiny Datasets (Husam Quteineh)
  • Drug recommendation system using sentiment analysis (Srinidhi Karthikeyan)

1240 – Lunch

Session 2: Computer Vision (Chair: Alba Garcia)

1330 – Lightning updates (2 slides, 2 minutes)   

  • Computer Vision and Machine Learning for Removing Defects from a Tube Manufacturing Line (Amit K Singh)
  • Biophysical modelling and deep learning in medical imaging (Giorgos Papanastasiou)
  • Multiscale computational modelling of the heart (Cunjin Luo)
  • Sparse representation via dictionary learning for image classification (Vahid Abolghasemi)
  • Multimodal Information Extraction through Evaluation (Alba García Seco de Herrera)
  • ImageCLEFcoral 2020: Transfer learning of coral images using deep learning (Jon Chamberlain)
  • Culture, Connection and Creativity through Eastern Arc partnerships (Faiyaz Doctor)

1400 – Research updates (8 mins present, 2 mins Qs) 

  • Investigating human impacts of Norfolk’s chalk reef using 3D photogrammetry (Jess Wright)
  • MediaEval 2020: Predicting media memorability (Rukiye Savran Kiziltepe and Janadhip Jacutprakart)
  • How does sound affect how you feel about art? (Inas Al-Taie)
  • Early prediction of Neurodegenerative Diseases using Deep Learning (Ekin Yagis)

1440 – Reading seminars for 2020/21 (agree time/date, suggest topics of interest to cover, agree conveners), AOB

1500 – Close 

NLIP Research Day (incorporating the 18th Language And Computation Day)
University of Essex
Friday 4 October, 2019

Location: 1N1.4.1

Programme

  • 10:00 Jon Chamberlain Welcome to the day and Introduction to the Natural Language and Information Processing (NLIP) Group

Session 1: Image & Multimodal Information Processing (Chair: Alba Garcia Seco de Herrera)

  • 10:15 Alba Garcia Seco de Herrera Intro to the session
  • 10:20 Ekin Yagis Deep learning for neurodegenerative disease detection
  • 10:40 Roberto Leyva NLP Features for Video Memorability Prediction
  • 11:00 Hector Palop and Jon Chamberlain Aerial image retrieval and classification to detect construction materials
  • 11:20 [Break]
  • 11:40 Jessica Wright and Alba Garcia Seco de Herrera ImageCLEFcoral 2019 and beyond
  • 12:00 Chris Madge Text processing with mindless clicking? The curious case of Wordclicker

Plenary Talk

  • 12:20 Jerry Shen Intelligent multimodal recommendation
  • 13:00 [Lunch]

Research Group Coordination

  • 14:00 NLIP 2019/20 group meetings: What to cover in this year’s meetings and coordinating group activities

Session 2: Natural Language Processing and Text analytics (Chair: Kakia Chatsiou)

  • 14:15 Kakia Chatsiou Intro to the session
  • 14:20 Tony Russell-Rose The language of search
  • 14:40 Deirdre Lungley Challenges for the Social Science Search Interface – Discoverability and Linkability
  • 15:00 Chris Fox The Philosophy of Language
  • 15:20 [Break, featuring PhD cake bakeoff]
  • 15:40 Ansgar Scherp What If We Encoded Words as Matrices and Not as Vectors? The case of a Hybrid CBOW-CMOW-Model
  • 16:00 Ayman Alhelbawy Deep Learning for Humanitarian Sphere
  • 16:20 Kakia Chatsiou Deep Learning Methods for Political Science
  • 16:40 Ansgar Scherp Closing

As usual, we plan to have a quick drink on campus afterwards, dinner in the Black Buoy (table booked for 18.45, please let Jon know if you wish to join us) and then more drinks later in Wivenhoe.

Annual Research Day – 2018

Friday 21 September, 2018; Location: Tony Rich TC 2.12 – 2.13, Co-located with CEEC, Chair: Jon Chamberlain

Keynote talks (TC2.8-2.9)

  • 09:00 Sameer Antani Artificial Intelligence and Clinical Image Analytics for Global Health
  • 09:55 Ansgar Scherp About Extreme Analyses of Texts and Graphs

Programme (TC2.12-2.13)

  • 10:50 [Coffee break]
  • 11:10 Husam Quteineh and Richard Sutcliffe Customer engagement with a chatbot
  • 11:30 Rina Zviel-Girshin enetCollect: Linguistics Apps Development for Second Language Learning
  • 11:50 Udo Kruschwitz BT Challenge Projects
  • 12:10 Toby Crayston Tricks for Building Big Classifiers with Little Data
  • 12:30 Jon Chamberlain Introducing Tony Russell-Rose, Royal Academy of Engineering Visiting Professor
  • 12:40 LAC 2018/19 meetings: What to cover in this year’s LAC meetings and coordinating group activities
  • 13:00 [Lunch]
  • 14:00 Massimo Poesio Disagreements and Language Interpretation (DALI)
  • 14:20 Doruk Kicikoglu Wormingo – Applying language learning games into anaphoric annotation
  • 14:40 Chris Madge Crowdsourcing and Aggregating Nested Markable Annotations
  • 15:00 Alexandra Uma Learning coreference from multiply-annotated data
  • 15:20 Janosch Haber The PhotoBook Conversation Task and Dataset
  • 15:40 Alba Garcia Benchmarks for Confronting Image Annotation Challenges: ImageCLEF 2019
  • 16:00 Nida Sae Jong Thai speech recognition for rehabilitation
  • 16:20 END

Annual Research Day – 2017

Thursday October 5th, 2017; Location: 1N1.4.1

Programme

  • 09:30 Welcome; recent activities of the LAC group
  • 09:40 Aline Villavicencio Computational Models of languages and learners
  • 10:00 Renato Amorim Feature weighting in clustering
  • 10:20 Udo Kruschwitz A Primer on Enterprise Search
  • 10:40 Chris Madge Gamifying Acquisition of Language Resources
  • 11:00 [Coffee]
  • 11:20 Chris Fox What on Earth are we talking about?
  • 11:40 Massimo Poesio DALI: The First Year
  • 12:00 Jon Chamberlain Games, Language and Social Media: Experiences From 8 Years of Phrase Detectives Online
  • 12:20 Ayman Alhelbawy Following Human Rights Abuse on Social Media
  • 12:40 Richard Sutcliffe Capturing the Meaning of Complex Texts about Music
  • 13:00 LAC 2017/18 meetings: Discussion as to what subjects to cover in this year’s LAC meetings
  • 13:20 [Lunch & Close]

Abstracts/Slides/Bios/Notes

Here is a link to the Human Rights Abuse Corpus that Ayman referred to: HRADS corpus and the LREC 2016 paper that describes the corpus: Towards a corpus of violence acts in Arabic social media.

Annual Research Day – 2016

Monday September 26th, 2016; Location: NTC.3.07

Programme

  • 09:45 Welcome; Recent activities of the LAC group
  • 10:00 Gabriella Kazai Lumi News AI: A smart reader for a personalised feed of crowd-curated content
  • 11:00 [Coffee]
  • 11:20 Udo Kruschwitz Enterprise search
  • 11:40 Dino Ratcliffe Deep reinforcement learning doom
  • 12:00 Chris Fox Existence and Freedom
  • 12:20 LAC updates: Quick updates on what everyone in the research group is currently working on
  • 12:50 LAC 2016/17 meetings: Discussion as to what subjects to cover in this year’s LAC meetings
  • 13:00 [Lunch]
  • 14:00 Massimo Poesio The DALI project
  • 14:40 Chris Madge The markable game
  • 15:00 [Coffee]
  • 15:20 Silviu Paun Contextual topic models
  • 15:40 Annie Louis Conversation Trees
  • 16:00 Jon Chamberlain Visualising Discussions: Project update
  • 16:20 [Close]

Abstracts/Slides/Bios

Gabriella Kazai – Lumi News AI: A smart reader for a personalised feed of crowd-curated content

Lumi Social News is a content discovery platform and recommender system with iOS and Android app front-ends that builds on crowd-curated content and social signals. Lumi automatically builds user profiles from the user’s Facebook and/or Twitter public feeds and continually learns from the user’s in-app actions. Recommendations of relevant or popular content through various channels are drawn from a large pool of crowd-curated content, contributed by the community of Lumi users. Lumi builds on technologies such as Elasticsearch and DynamoDB, and on a range of machine learning methods including SVM, CF and clustering. In this talk I will detail some of the challenges we face in building a consumer product that can process millions of content posts a day and distribute these to the right users based on their user models and locations.

Gabriella Kazai is VP of Data Science at Lumi, the startup company behind the Lumi social newsreader app that provides personalised recommendations of crowd-curated content from across the world’s media and social networks. Prior to that, Gabriella worked as a researcher at Microsoft Bing and at Microsoft Research. Her research interests include recommender systems, applied machine learning, information retrieval (IR), crowdsourcing, gamification, data mining, social networks and personal information management, with influences from HCI. She holds a PhD in IR from Queen Mary University of London. She has published over 100 research papers and organised several workshops (e.g., GamifIR 2014-2016, News IR 2015) and IR conferences (ICTIR 2009, ECIR 2015-2016, HCOMP 2018). She is one of the founders and organisers of the INEX Book Track 2007-2014 and the TREC Crowdsourcing track 2011-2013.

Annual Research Day – 2015

Tuesday September 29th, 2015; Location: CSEE Seminar Room (Network Building, 1N1.4.1)

Programme

This is a provisional programme to be finalised nearer the day:

Annual Research Day – 2014

Monday October 6th, 2014; Location: Teaching Centre TC1.11

Programme

Here is a provisional programme to be finalised nearer the day:

  • 09:45 Welcome; Recent activities of the LAC group
  • 10:00 Fawaz Alarfaj Searching entities at web-scale
  • 10:20 Ans Alghamdi Active Expert Learning for Digital Humanities
  • 10:40 Maha Althobaiti Combining Minimal Supervision Methods for Arabic Named Entity Recognition
  • 11:00 [Coffee]
  • 11:30 Steven Zimmerman Shedding the light on entities in a massive text processing pipeline
  • 11:50 Jon Chamberlain Groupsourcing: Problem Solving, Social Learning and Knowledge Discovery on Social Networks
  • 12:10 Richard Sutcliffe and Chris Fox C@MERATA at MediaEval 2014 – Extracting Answer Passages from Classical Music Scores using Natural Language Descriptions
  • 12:30 [Lunch]
  • 14:00 Jochen L Leidner (Thomson Reuters) Research and Development in Information Access at Thomson Reuters Corporate R&D
  • 15:00 James Rynn A Framework for Named Entity Linking (Undergraduate Placement Programme)
  • 15:20 [Coffee]
  • 15:40 Ayman AlHelbawy Collective Approaches for Named Entity Disambiguation
  • 16:00 Kakia Chatsiou Crowdsourcing Collections of Cultural Heritage: the Song Catchers Project

Abstracts/Slides

Welcome and Recent Activities of the LAC group
This talk will give an introduction to the group and an overview of what the group has been up to since the last LAC day.
Jon Chamberlain – Groupsourcing: Problem Solving, Social Learning and Knowledge Discovery on Social Networks
Increasingly social networks are being used for citizen science, where members of the public contribute knowledge to scientific endeavours. Tasks can be presented and solved using human computation, termed groupsourcing, with users benefiting from community tuition and experts gaining knowledge from the crowd. This talk gives details of a prototype that utilises groupsourcing to solve image classification tasks, to support social learning and to facilitate knowledge discovery in the domain of marine biology.
Jochen L Leidner (Director, Research – Corporate Research & Development, London, Thomson Reuters) – Research and Development in Information Access at Thomson Reuters Corporate R&D
Thomson Reuters is a modern information company. In this talk, I characterise the nature of carrying out research, development and innovation activities as part of its Corporate R&D group that add value to end customers and translate into additional revenue. A couple of R&D projects in the area of natural language processing, information retrieval and applied machine learning will be described, covering the legal, scientific, financial and news areas. The talk will conclude with a cautious outlook of what the near future may hold. Additionally, I will attempt a comparison of doing research in a company with pursuing academic research at a university.

Annual Research Day – 2013

Monday September 30th, 2013; Location: 1N1.4.1

Programme

Here is a provisional programme to be finalised nearer the day:

  • 09:45 Welcome; Recent activities of the LAC group
  • 10:00 Fawaz Alarfaj Enhancing Entity-Finding Using Adaptive-Windows
  • 10:20 Azhar Alhindi Profile-Based Document Summarisation
  • 10:40 Naoto Nishio Predicting the Quality of a Translation from the Attributes of a Translator using ML
  • 11:00 [Coffee]
  • 11:10 Maha Althobaiti A Semi-supervised Learning Approach to Arabic NER.
  • 11:30 Ans Alghamdi Active Learning for Archaeological Named Entities
  • 11:50 Roseline Antai TBC
  • 12:10 Deirdre Lungley Sentiment Analysis of Patient Feedback
  • 12:30 [Lunch]
  • 14:00 Andreas Vlachos (University of Cambridge) INVITED TALK: Imitation learning for structured prediction in NLP
  • 15:00 Cliff O’Reilly Modelling Mental Spaces
  • 15:20 [Coffee]
  • 15:40 Florence Myles Using oral learner corpora for second language acquisition research
  • 16:00 Doug Arnold TBC
  • 16:20 Sonja Eisenbeiss, Naledi Kgolo, Sarah Schmid and Janina Fickel Experimental Linguistics in the Field: A Morphological Processing Study on Setswana Noun Derivations and a New Resource Repository
  • 16:40 [Tea]

Abstracts/Slides

Welcome and Recent Activities of the LAC group
This talk will give an introduction to the group and an overview of what the group has been up to since the last LAC day.
Andreas Vlachos – Imitation learning for structured prediction in NLP
Imitation learning is a learning paradigm originally developed to learn robotic controllers from demonstrations by humans, e.g. autonomous helicopters from pilots’ demonstrations. Recently, algorithms for structured prediction in NLP were proposed under this paradigm and have been applied successfully to a number of tasks such as information extraction and summarization. In this talk I will describe in detail two imitation learning algorithms, SEARN (Daumé III et al., 2009) and DAGGER (Ross et al., 2011), and describe their application to biomedical event extraction and knowledge base population.

Annual Research Day – 2012

Thursday Oct 4, 2012; Location: 1N1.4.1

Programme

Here is a provisional programme to be finalised nearer the day:

  • 09:45 Welcome; Recent activities of the LAC group
  • 10:00 Lucy Bell and Mahmoud El-Haj SKOS-HASSET Project at the UK Data Archive.
  • 11:00 [Coffee]
  • 11:15 Stephen Clark (University of Cambridge) INVITED TALK: A Mathematical Framework for a Distributional Compositional Model of Meaning (slides)
  • 12:15 [Lunch]
  • 13:30 Naledi Kgolo, Sonja Eisenbeiss, Nancy Kula Corpus frequencies and subjective frequency ratings as predictors for lexical decision times in Setswana.
  • 14:00 Chris Fox Methodological Questions.
  • 14:30 Udo Kruschwitz Adaptation of the Concept Hierarchy Model with Search Logs for Query Recommendation on Intranets. (slides)
  • 15:00 [Coffee]
  • 15:15 Massimo Poesio BrainNet: combining evidence from corpora and from the brain to study conceptual representations.
  • 15:45 Deirdre Lungley GALATEAS Topic Classification
  • 16:05
  • 16:35 [Tea]

Abstracts/Slides

Welcome and Recent Activities of the LAC group
This talk will give an introduction to the group and an overview of what the group has been up to since the last LAC day.
Udo Kruschwitz – Adaptive Search at Essex
Will present our SIGIR 2012 paper (Adeyanju, I., D. Song, M-D. Albakour, U. Kruschwitz, A. De Roeck and M. Fasli. “Adaptation of the Concept Hierarchy Model with Search Logs for Query Recommendation on Intranets”)
Lucy Bell – SKOS-HASSET Project at the UK Data Archive.
SKOS-HASSET is an 8-month, JISC-funded project, being undertaken within the UK Data Archive to apply SKOS to HASSET (the UK Data Archive’s thesaurus), improve its online presence and test its automated indexing applications. SKOS is a language designed to represent thesauri and other classification resources. It encodes these in a standardised way using RDF to make their structures comparable and to facilitate interaction. SKOS-HASSET is being taken as the terminology source for an automatic indexing tool and applied to items from the Archive’s collection. This presentation will outline in more detail the project’s aims, objectives, progress and potential uses, post-funding. (SKOS-HASSET Project’s Blog)
Mahmoud El-Haj – UKDA Keyword Indexing using a Controlled Vocabulary.
Will present the automation of the indexing and keyword extraction process as part of the SKOS-HASSET Project. (SKOS-HASSET Indexing and Evaluation Blog Post)

Annual Research Day – 2011

Friday Oct 7, 2011; Location: 3.405

Programme

  • 09:45 Welcome; Recent activities of the LAC group (slides)
  • 10:00 Udo Kruschwitz – Adaptive Search at Essex (slides)
  • 10:30 Dyaa Albakour – AutoAdapt at the Session Track in TREC 2011
  • 11:00 [Coffee]
  • 11:15 Massimo Poesio – The Trento / IITP / Essex submission to the CoNLL-2011 Shared Task on Coreference
  • 11:45 David Hunter – New insights into telephone call dynamics: analysis of call record data from the BT Home OnLine study (joint work with Ben Anderson (Dept. of Sociology) and Alexei Vernitsky (Dept. of Maths)) (slides)
  • 12:15 [Lunch]
  • 13:30 Azhar Alhindi – Personalised Text Summarisation
  • 14:00 Richard Sutcliffe – An Analysis of Successful Question Answering System Components at CLEF, 2003-2010
  • 14:30 Jon Chamberlain – Using social networks to annotate corpora (slides)
  • 15:00 [Coffee]
  • 15:15 Kakia Chatsiou – Using ESDS data in Linguistics and NLP (slides)
  • 15:45 Roseline Antai – Sentiment Analysis – SentiWordNet and polarity classification
  • 16:05 Adindla Suma – NLP Driven Intranet Search

Abstracts/Slides

Welcome and Recent Activities of the LAC group
This talk will give an introduction to the group and an overview of what the group has been up to since the last LAC day.
Udo Kruschwitz – Adaptive Search at Essex
I will give an overview of our recent activities which will then nicely link into some of the talks scheduled later on.
Dyaa Albakour – AutoAdapt at the Session Track in TREC 2011
Coming Soon!
Jon Chamberlain – Using social networks to annotate corpora
The large amounts of annotated text required for modern computational linguistics research cannot be created by small groups of expert hand-annotators. Recently however the AnaWiki project released a version of Phrase Detectives (an online game-with-a-purpose interface for creating collaborative annotation) on the social network Facebook. The talk will focus on how social networking elements were integrated into the game, provide an initial overview of findings and hopefully inspire some discussion about using social networks as a human computation platform.
Azhar Alhindi – Personalised Text Summarisation
The process of summarizing text is difficult and requires effort and time. In the past, this process was completed by hand, but there are now automated systems that perform the same task with almost equal quality. Many researchers have worked in this area and have explored many forms of and approaches to summarization. Utilizing a user profile, which consists of a cluster of personal data associated with a specific user, to produce a summary of Web pages is an improvement in text summarization that helps to generate a personalised summary. The main advantage of this method is that the produced summary reflects the user’s interests. In this talk I will give a short overview of the basics of text summarization and personalization, my previous work for the master's project and its relevance to my PhD project.
Richard Sutcliffe – An Analysis of Successful Question Answering System Components at CLEF, 2003-2010
There has now been a Question Answering track at CLEF since 2003. Each year there were some systems which performed well, and others which performed less well. We will present here a recently conducted analysis of successful systems, both monolingual and cross-lingual, across this eight year period. We have looked at the architecture of each high-scoring system and identified components and their underlying algorithms which were contributing to its success. This analysis can give us an insight into the way in which the fields of text processing, named entity recognition, information retrieval and information extraction have developed so far and what trends we might expect to see in future.

Annual Research Day – 2010

Friday Oct 8, 2010; Location: 3.320

Abstracts/Slides

Welcome and Recent Activities of the LAC group
Abstract: This talk will give an introduction to the group and an overview of what the group has been up to since the last LAC day.

Dyaa Albakour (Essex) – AutoAdapt @ TREC 2010
Abstract: This talk presents the contribution of the AutoAdapt project in the session track of the Text REtrieval Conference (TREC) 2010. I will give a description of the task introduced in this year’s track and the experiments we ran here at Essex to solve this task.

Massimo Poesio – tbc
Abstract: Coming soon!

Deirdre Lungley & Jon Chamberlain (Essex) – Towards Adaptive Interactive Search
Abstract: The AutoAdapt project here at Essex is exploring various domain model learning frameworks. Deirdre opens this talk by detailing her lattice-based framework and recent lessons learnt. Jon continues with an outline of the new adaptive interactive search interface for the Essex intranet.

Chris Fox (Essex) – Judgemental Imperatives
Abstract: Coming soon!

Sonja Eisenbeiss (Essex) – Acquisition of English Possessive Constructions
Abstract: In this talk, I will present results from a study on English-speaking children’s acquisition of possessive constructions like “Mary’s cat” or “the engine of the old red car”. Based on an analysis of data from the Brown (1973) corpus of American child language, I will argue that children adapt to the constraints on the use of “-s” and “of” early on.

Mahmoud El-Haj (Essex) – Arabic Multidocument Text Summarisation
Abstract: Multi-document summarisation produces a single summary of a set of related documents that belong to the same category. The analysis in this area is mostly done on the sentence or document level. This talk will focus on the progress of my research towards Arabic single and multi-document text summarisation. I will present some results of experiments on other languages, including English and Hebrew. In addition, I will show the results of summarising automatically translated articles (from English to Arabic) using Google Translate.

Kakia Chatsiou (Essex) – Essex Gateway to Arabic Resources Project
Abstract: The talk will report on the Essex Gateway to Arabic Linguistic Resources, a collection of resources, bibliography and tools on Arabic Language and Linguistics. The project has been funded by a Departmental Research Innovation Scheme Award to Sonja Eisenbeiss and Louisa Sadler and its aim is to build a network of open-access digital collections and resources on Arabic Language and Linguistics at first, then expanding to Semitic Languages and Linguistics. We hope that these resources will unlock access to a wide range of our departmental scholarly output and serve as a point of reference for researchers/visitors interested in Arabic and Semitic Languages.

Udo Kruschwitz (Essex) – Moving towards Adaptive Search in Digital Libraries
Abstract: Search applications have become very popular over the last decade, one of the main drivers being the advent of the Web. Nevertheless, searching on the Web is very different to searching on smaller, often more structured collections such as digital libraries, local Web sites and intranets. One way of helping the searcher locate the right information for a specific information need is by providing well-structured domain knowledge to assist query modification and navigation. There are two challenges: acquiring the domain knowledge and adapting it automatically to the specific interests of the user community. We will outline how a domain model can be automatically acquired using search engine query logs and continuously be updated using methods resembling ant colony behaviour.

Richard Sutcliffe (Limerick), Udo Kruschwitz (Essex) & Kieran White (Limerick) – Named Entity Recognition in Intranet Documents and Query Logs
Abstract: The recognition of Named Entities – person names, company names, places, times, dates, and so on – has long been recognised as being of great importance in text processing. We have observed that NEs commonly occur in search engine logs such as that listing queries submitted to the University of Essex web search engine UKSearch. In this case, however, NEs are often specific to the Essex intranet domain, for example room numbers, committee names or research group titles. We call these Specific NEs or SNEs. Firstly, we have conducted a manual study of SNEs within a sample of queries to give us breakdown statistics of different SNE types, and to produce a provisional itinerary of them. Secondly, we have undertaken a pilot study in which SNEs within queries were recognised following training using the maximum entropy tagger within the NLPTools package. The key technique here was to obtain contexts for the candidate SNE (which is often the sole element within a query) by harvesting occurrences of them from web pages within the intranet. Thirdly, we have crawled the entire Essex intranet and hence have carried out further SNE analysis on this.


Annual Research Day – 2009

Time: 10:00– Thursday Oct 8, 2009
Location: 5A.332

Annual Research Day – 2008

Time: 10:00– Friday Oct 3, 2008
Location: Computer Science Seminar Room (on floor 4B)

Annual Research Day – 2007

5th October 2007, Computer Science seminar room (4B.531)

Annual Research Day – 2006

(5th Annual Language and Computation day), 4th October 2006, Computer Science seminar room (4B.531)

  • 09:00 Coffee and welcome
  • 09:30 Massimo Poesio (CS): MSDA: an unsupervised wordsense discrimination algorithm
  • 10:00 Olivia Sánchez-Graillet: tbc
  • 10:30 Louise Corti (Data Archive): SQUAD tools demo
  • 11:00 Coffee
  • 11:30 Doug Arnold (L&L): Epithets
  • 12:00 Chris Fox (CS): Imperatives
  • 12:30 Kakia Chatsiou (L&L): Resumptive pronouns and Modern Greek Relative Clauses
  • 13:00 Lunch
  • 14:00 Ron Artstein (CS): Identifying reference to abstract objects in dialogue
  • 14:30 Richard Sutcliffe (Limerick): tbc
  • 15:00 Udo Kruschwitz (CS): tbc
  • 15:30 Tea
  • 16:00 Invited speaker: Dawei Song (Open U): Concept-based document readability computation for domain-specific information retrieval