Document Summaries in Danish with OpenAI
On Denmark’s most visited news site Ekstrabladet.dk, we publish automatically generated local news articles in Danish as a part of the Platform Intelligence in News (PIN) project. In these automatically generated articles we cover topics including companies’ financial reports, inspection reports for food establishments, and real-estate sales. The articles are generated with rule-based NLG (Natural Language Generation) using data retrieved automatically via APIs. Yet this rule-based approach using structured data doesn’t offer the complete story that we want to share with our readers. For instance, companies’ financial reports often includes information from management that explains why revenue, results, and equity turned out the way they did. This is important context to include in our articles. At the same time, these documents are often too long, not always formatted with high grammatical quality, and are sometimes also written in English. To incorporate information from these documents into our automated articles we have been experimenting with the latest GPT models (text-davinci-003, gpt-3.5 turbo and gpt-4) from OpenAI to generate clean, well-written summaries in Danish to include in the articles. However, these AI-generated summaries are not always acceptable according to our editorial standards. And while we publish the rule-based auto-generated articles directly on Ekstrabladet.dk, we do not add the AI-generated summaries to the articles before they have been reviewed by a human.
0 Comments