Data Driven Publishing
A Framework To Better Serve Our News Audience
From eager news readers to podcast lovers to people getting their information on social platforms: our users are diverse, and so are their news habits. Data Driven Publishing is a framework we developed to produce and deliver news journalism in a user-centered way and to fulfill our mission as a public service broadcaster.
By Uli Köppen and Verena Nierle
This method should help us deliver data journalism and automated content in ways that best fit our users’ news consumption habits. By creating modern data workflows and working on our infrastructure, we’re committed to finding a public service way of personalizing news and future-proofing the way we reach our audiences. We’re happy to share our learnings along the way, for example how to use prototypes to improve your systems, which kinds of personalization best fit our public service mission, or how to drive cultural change.
Is this article worth your time? Here’s what we’re covering:
- Who we are: Interdisciplinary Data Teams
- Addressing Challenges and Starting Small
- Goals of the Framework
- Personalizing News Experiences
- Principles of the Framework: Regionalizing the News, Versioning News Items, Making Our Content More Accessible, Using Prototypes to Improve Infrastructure
- Learnings
Who We Are: Interdisciplinary Data Teams
We’re three interdisciplinary teams at Bavarian Broadcasting, a German Public Service Broadcaster in the ARD Network, combining the work of journalists, software developers, machine learning experts and product designers. We’ve worked with data for quite some time, starting about eight years ago with the founding of our data journalism team BR Data. Over the years we grew with our investigative team, BR Recherche. Together we’re tackling investigative and news-driven data stories, specializing in algorithmic accountability and cybercrime reporting.
In 2020 we had the chance to build another team, the AI + Automation Lab, allowing us to use our skills and mindset on the product side. The team produces automated texts, graphics and audio news briefings and joins investigations with statistical knowledge and machine learning skills. All three teams collaborate on investigations and product development (here’s how we’re aiming at working together). We’re looking at both sides of AI and automation, asking: How can this technology be useful for journalism? How is it used in harmful ways that should be investigated and discussed by society?
Addressing Challenges and Starting Small
We wrote this framework to address some of the roadblocks we encountered when working with data in a media company. Data Driven Publishing should help us to:
- Integrate our prototypes and products into legacy systems
- Use our CMS for automated and personalized content
- Create modern data pipelines and workflows
- Standardize content and metadata warehousing
These steps work toward building and publishing small prototypes and products to grow our infrastructure for existing and future user needs.
Goals of the Framework
Data Driven Publishing is aimed at translating our mission to the online news world. While there are many ideas around how to use personalization for video and audio platforms, concepts around journalistic news personalization are scarce, and we hope to add to this discussion with our public service perspective.
The goals for Data Driven Publishing in a nutshell:
- Support our mission to do journalism for everyone
- Find a Public Service way of personalizing news
- Support growth of our core news brand
- Future-proof our infrastructure and workflows
It is crucial for us to hold ourselves accountable to our journalistic values, which is why we published a set of AI Ethics Guidelines. Principles such as transparency, algorithmic fairness and a strong user focus are the groundwork for Data Driven Publishing.
Personalizing News Experiences
As a public service broadcaster, it’s our mission to deliver important information for everyone. Personalizing our quality content is at our core: “Personalization is a structural necessity for universal public access to our journalism,” says BBC’s Executive Product Manager David Caswell, who leads this work at the British public broadcaster.
David is using modular journalism approaches (also known as structured journalism) with different teams at the BBC. News items are produced as content modules that can be repurposed and redistributed to best fit user interests. This approach to personalization is also a cornerstone of our Data Driven Publishing framework. There is some theoretical groundwork to look at, for example this article by Shirish Kulkarni on behalf of the JournalismAI Collaboration on how to translate user needs into modules for storytelling.
The public broadcaster Swedish Radio (SR) shows how a modular news approach works with personalization. SR decided to focus on audio for its digital distribution and is doing it with news clips. Those clips are arranged in Spotify-like lists put together based on topic of interest (e.g. sports, economy) or location, including a national playlist with the most important general news and 26 regional lists. Olle Zachrison, Head of Digital News Strategy at SR, points to the strategic advantages of this approach: “The modular approach adds flexibility as we can automatically mix news clips in new ways, and has also changed our production process to a digital-first mindset.” SR has also built a Public Service Algorithm that helps editors curate those lists.
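To make the playlist idea concrete, here is a minimal Python sketch of how self-contained news clips could be grouped into topic and regional lists. It is an illustration under stated assumptions, not SR's actual system: the `NewsClip` fields, the `priority` value and the sample clips are all hypothetical.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass(frozen=True)
class NewsClip:
    """One self-contained audio news item (hypothetical schema)."""
    clip_id: str
    title: str
    topic: str      # e.g. "sports", "economy"
    region: str     # e.g. "national" or the name of a regional list
    priority: int   # assumed editor-assigned importance, higher plays first


def build_playlists(clips, key):
    """Group clips by one attribute and order each list by priority."""
    playlists = defaultdict(list)
    for clip in clips:
        playlists[getattr(clip, key)].append(clip)
    return {
        name: sorted(items, key=lambda c: c.priority, reverse=True)
        for name, items in playlists.items()
    }


# Invented sample clips, for illustration only.
clips = [
    NewsClip("c1", "Cup final result", "sports", "national", 2),
    NewsClip("c2", "Interest rate decision", "economy", "national", 3),
    NewsClip("c3", "Local derby report", "sports", "north", 1),
]

by_topic = build_playlists(clips, "topic")    # one list per topic
by_region = build_playlists(clips, "region")  # one list per region
```

Because every clip is a standalone module, the same clip can appear in a topic list and a regional list without being produced twice.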
We’re using the modular approach for our prototype Remix Regional, an automated regional audio news briefing. It provides users with the news happening around them.
Principles of the Framework
We’re working toward our goals by following these principles:
- Support your Unique Selling Points (USPs) algorithmically
- Make your content more accessible
- Use your prototypes to future-proof your infrastructure
In our AI + Automation Lab we’re supporting two important aspects of our news DNA. We’re producing different versions of news pieces and regionalizing our news program — both with the help of automation and AI. These methods create the groundwork for any recommendation system built on diversified content.
Regionalizing the News
We have a dense network of reporters across Bavaria. It’s our goal to combine their quality journalism with algorithmically produced regional content that can be distributed in personalized ways. Our AI + Automation Lab and BR Data teams have already produced text automations with our sports department and our economics desk to get a better sense of how to use technology to serve our goals. Now we’re focused on regionalizing text automation such as our COVID Tracker to offer one story per county. We’re looking at automation not only as material for future personalization products but also as support for our journalists, removing mundane tasks and freeing them to do actual reporting or in-depth analyses.
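As an illustration of what "one story per county" can mean in practice, here is a minimal Python sketch of template-based text generation. The field names, the template wording and the sample figures are invented for this example and do not describe how the COVID Tracker is actually built.

```python
from string import Template

# Hypothetical sentence template; a real product would need many variants.
STORY_TEMPLATE = Template(
    "$county reported $new_cases new cases in the past week, "
    "$trend the week before ($previous_cases)."
)


def describe_trend(current, previous):
    """Turn two weekly counts into a short comparison phrase."""
    if current > previous:
        return "up from"
    if current < previous:
        return "down from"
    return "unchanged from"


def render_county_stories(rows):
    """Return one short text per county, keyed by county name."""
    stories = {}
    for row in rows:
        stories[row["county"]] = STORY_TEMPLATE.substitute(
            county=row["county"],
            new_cases=row["new_cases"],
            previous_cases=row["previous_cases"],
            trend=describe_trend(row["new_cases"], row["previous_cases"]),
        )
    return stories


# Invented placeholder data, not real statistics.
rows = [
    {"county": "County A", "new_cases": 120, "previous_cases": 95},
    {"county": "County B", "new_cases": 40, "previous_cases": 40},
]

stories = render_county_stories(rows)
for text in stories.values():
    print(text)
```

The same table of figures yields as many stories as there are counties, which is the kind of repetitive task automation can take off a reporter's desk.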
We’re also prototyping how regional audio personalization might work: Remix Regional (born at a hackathon, here’s more on that in German) provides users with the news happening around them. We’re working with a cross-departmental team (Lab, Archives, technical department, regional newsrooms) on a way to automatically segment audio news programs into individual news items. Those items are then geotagged and served to listeners based on their location and preferences. We’re aiming for an automated audio news feed that can be served on all our digital platforms and on smart speakers.
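The serving step described above can be sketched in a few lines of Python. This is a simplified illustration, not the Remix Regional implementation: the `AudioItem` fields, the 75 km radius, the topic filter and the sample items are assumptions, and the coordinates are approximate.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class AudioItem:
    """A segmented, geotagged news item (hypothetical schema)."""
    item_id: str
    topic: str
    lat: float
    lon: float


def distance_km(lat1, lon1, lat2, lon2):
    """Great-circle distance between two points (haversine formula)."""
    earth_radius_km = 6371.0
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = math.radians(lat2 - lat1)
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * earth_radius_km * math.asin(math.sqrt(a))


def build_briefing(items, lat, lon, topics, radius_km=75.0, limit=5):
    """Pick nearby items on preferred topics, closest first."""
    nearby = []
    for item in items:
        d = distance_km(lat, lon, item.lat, item.lon)
        if d <= radius_km and item.topic in topics:
            nearby.append((d, item))
    nearby.sort(key=lambda pair: pair[0])
    return [item for _, item in nearby[:limit]]


# Illustrative items with approximate coordinates.
items = [
    AudioItem("a1", "politics", 48.137, 11.575),  # around Munich
    AudioItem("a2", "weather", 48.371, 10.898),   # around Augsburg
    AudioItem("a3", "politics", 49.452, 11.077),  # around Nuremberg
    AudioItem("a4", "sports", 48.137, 11.575),    # around Munich
]

briefing = build_briefing(items, 48.137, 11.575, {"politics", "weather"})
```

A production system would likely also weigh recency and editorial importance; distance and topic are used here only because they are the two signals named above.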
Versioning News Items
Whether our users prefer short- or long-form content, listening to news podcasts or watching Insta stories, we want to offer the right version of the news for their needs. We’re working on support systems for our reporters and editors to produce different versions and formats of a story. Our summarization prototype shortens news stories with the help of AI so journalists can produce stories of different lengths. Our online graphics editor helps our colleagues at the news, graphics and social desks make daily charts with the latest COVID statistics for online news articles, social media and TV.
Making Our Content More Accessible
Data Driven Publishing helps us to make our content more accessible. The more we segment and tag our content, the better it can be found and reused. This is true for our users as well as our own reporters and editors doing research in our archives. Our colleagues in the BR archives are tagging specialists who use facial recognition algorithms to tag video content or speech-to-text methods to make audio and video recordings searchable. Integrating those technologies into our workflows will provide a better service to our audience and enable better journalism.
Creating standardized data visualization workflows and showing data in compelling ways will help us better tell the stories of our investigations and explainers. We’re committed to working on our distribution systems, such as our CMS, and to building standardized data pipelines for our distribution channels.
Using Prototypes to Improve Infrastructure
We’re using our prototypes and products to get a better grasp on where our systems and infrastructure need to become more flexible for personalization. Prototypes such as Remix Regional help us to build hybrid workflows with newsrooms to improve our audio metadata stores. Storing audio snippets together with the relevant metadata will help us to meet existing and future user needs more quickly.
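A minimal Python sketch of what "storing audio snippets together with the relevant metadata" could look like follows. The schema is purely hypothetical: the field names, the file path and the JSON record format are assumptions for illustration, not a description of our actual metadata stores.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass
class AudioSnippet:
    """One news item cut from a longer program (hypothetical schema)."""
    snippet_id: str
    source_program: str
    start_seconds: float
    end_seconds: float
    audio_path: str                          # where the audio file lives
    tags: list = field(default_factory=list)
    location: str = ""                       # e.g. a county or city name

    @property
    def duration_seconds(self):
        return self.end_seconds - self.start_seconds


def to_record(snippet):
    """Serialize a snippet's metadata to a JSON string."""
    return json.dumps(asdict(snippet), ensure_ascii=False, sort_keys=True)


def from_record(record):
    """Rebuild a snippet from its JSON record."""
    return AudioSnippet(**json.loads(record))


# Invented example values.
snippet = AudioSnippet(
    snippet_id="s-001",
    source_program="regional-news-evening",
    start_seconds=30.0,
    end_seconds=75.0,
    audio_path="snippets/s-001.mp3",
    tags=["traffic", "weather"],
    location="Example County",
)

record = to_record(snippet)
restored = from_record(record)
```

Keeping the metadata in one record next to the audio reference means a snippet can be found by tag or location without reopening the original program.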
Distributing our text automation products highlighted what our content management system needs in order to deal with automated content. Getting a better understanding of those challenges is the first step to addressing them. As a second step we’re building cross-departmental alliances with teams ranging from our tech department to editorial.
Learnings
Conceptualizing, publishing and discussing the framework is part of a journey. We will certainly adapt some of our goals and challenges along the way — but here are some learnings we can already share from our work with Data Driven Publishing:
- Build prototypes to learn from them.
- Find ways to feed those learnings back into your larger strategy.
- Publish your prototypes and work with the user feedback. We regularly run user tests and build in the feedback.
- Become a team with other teams: Build interdisciplinary squads within your company. Automation can’t work when siloed.
- Find ways to orchestrate those collaborations: we’re appointing representatives who drive projects like Remix Regional across different departments.
Was this helpful for your own work? We would love to learn more about that. Hit us up on Twitter: @BR_AILab, @BR_Data or @BR_Recherche

