This PhD dissertation focuses on improving terminology extraction and alignment for applications in the translation industry. It explores three key use cases where these techniques benefit language professionals: creating client-specific terminology lists from large parallel corpora (i.e., translation memories), building domain-specific terminology resources from comparable corpora, and identifying important domain-specific …
Automatic terminology extraction, also known as automatic term extraction (ATE), is a natural language processing (NLP) task that identifies specialized terminology from domain-specific corpora. ATE is often used for terminographic tasks (e.g., the creation of specialized dictionaries) and contributes to several complex downstream tasks (e.g., machine translation and information retrieval). …
The thesis presents a novel representation learning framework that combines neural and symbolic text representations, and demonstrates its utility for diverse natural language processing problems. The proposed approach avoids the deficiencies of purely symbolic and purely neural methods and can be used to generate efficient text representations. Its usefulness …