
Newsletter
31.07.2025
# 3
#DSA40 Data Access Newsletter
Rising Temperature
Dear DSA40 Community,
welcome to the third #DSA40 Collaboratory newsletter. We know, it’s been a while since our last newsletter – and a lot has happened in the meantime.
Is this the place where we catch up on everything that has happened since? No. We’re aware that the summer break is approaching – and for some of you has surely already started. So, we’ll keep it crisp and skip the risk in focus this time (it will be back in the next newsletter for sure!). Instead of starting to sweat from the heat coming from the other side of the Atlantic, let’s focus on the benefits of the summer sun rather than on actual or foreseeable negative effects on research, (European) democracy, or the planet.
General Updates
In case you have not yet received an invite, we would like to wholeheartedly invite you to the first DSA40 Data Access Days, which we’re hosting on 25-26 September in Berlin. With this two-day conference, we plan to facilitate exchange and collaboration on the topic of data access.
On Day 1 (25.09.25), a keynote and several panel discussions will focus on lessons learned from Article 40(12), and opportunities for Article 40(4) implementation.
On Day 2 (26.09.25), we will provide practical hands-on sessions focused on drafting applications for 40(4) data access for a limited number of participants.
Unfortunately, we’ve already reached the maximum number of registrations for the hands-on sessions, but registration for day 1 and the waiting list for day 2 are still open! You can find the registration form and a preliminary programme here.
The Access Days will be right on time as the first week afterwards will not only mark the 3-year anniversary of the adoption of the DSA (on October 4) but also the end of the 3-month grace period of the Delegated Act on Data Access, which was adopted by the Commission on July 2. Alongside it, the DSA data access portal was also launched. Given that the Collaboratory had worked together with many different researchers to produce a response to its draft, published in October 2024, we directly went to work to see what was changed in the final version.
To this end, we aligned the relevant sections and produced a visual comparison, accessible on our website.
We’ve since published a short overview of key aspects [in German], a piece in Tech Policy Press, parsing the various expectations towards platforms, researchers, and regulators as assumed by the delegated act, and are working on more to be announced soon.
Also: we updated our FAQ with up-to-date information based on the delegated act. Originally launched two months ago, and built on top of initial work by Julian Jaursch and Philipp Lorenz-Spreen, it’s supposed to collect helpful resources in order to answer all kinds of questions researchers might have about data access. We will update the page continuously – so please let us know if we missed anything!
We are also happy to finally officially announce our standing group: Simon Munzert, Judith Möller, Jakob Jünger, Philipp Lorenz-Spreen, Katrin Weller, Hajo Boomgaarden, Juliane Mendelsohn, David Garcia, and Daniel Schnurr. Together with this interdisciplinary group of researchers we hope to raise awareness of the DSA’s data access options and will start to draft our own coordinated data access requests based on Art. 40(4) DSA soon! We’ve updated our website with links to their research profiles in case you want to take a look.
Speaking of our website, we added a whole bunch of new content that we would like to draw your attention to:
- Experiencing issues with data access? Does your API access not work or do you have issues finding an application form? You might not be the only one! Our issue tracker now allows you to let us know about any problems you have. We will review all submitted issues and – after verification – publish them on our website so that other researchers can know about them. In addition, we will bring these issues forward to the relevant regulators.
- We have also reworked our main page to give you a better overview of our most recent activities and upcoming events. Here you’ll also find a reference to the hackathon on public data access we organised on 9 and 10 April 2025 together with AlgorithmWatch and Mozilla, where we looked at unexplored public data [UPD] and obviously public (but inaccessible) data [OPD], like highly disseminated content. Join the access days or write us an email to learn more if you’re interested!
- Lastly, we’re glad to announce that we will host a monthly online data access hangout for everyone who wants to connect and exchange ideas on data access. It takes place every second Tuesday of the month, starting 12 August 2025. Click here to join the email list!
We’re also planning more updates to the website, which we’ll inform you about once the time is right.
Highlights
Apart from the Delegated Act being released, here’s a quick rundown of what we think have been the most interesting developments for research data access to take note of:
- You have a subjective right to data access
You likely did not miss this, but in case you did: at the start of this year, in the run-up to the snap Federal Election in Germany, Democracy Reporting International (DRI) and Gesellschaft für Freiheitsrechte (GFF) filed for interim measures against Musk’s X before the Berlin Regional Court because X had not decided on DRI’s request for data access for the purpose of election monitoring. In its final decision, the court declared that the temporary injunction to prevent serious or irreversible harm, initially issued against X, was not justified. However, it did two notable things:
- It interpreted Art. 40(12) DSA as granting vetted researchers a subjective right to data access in the positive sense. In the negative sense, too, platforms are obliged “not to hinder researchers entitled under Article 40(12) from data use and to enable this for them in real-time where possible.”
- It assumed that it was competent to hear the case. This means that litigation under the DSA can be initiated in any EU country where the researcher incurring damage from the obstruction of access is located, and does not have to be conducted in Ireland (where most platforms are headquartered and litigation is expensive).
If you want to know more, I highly recommend this summary of a panel discussion on the topic that DRI recently hosted.
- Commitments have been made
The second set of proceedings initiated by the European Commission has come to a close (the first being the investigation into TikTok Lite in August 2024), and it is the first to touch on data access. AliExpress settled the Commission’s proceedings by making a set of binding commitments relating, among other things, to research data access. The commitments can be understood as affirming a minimal standard for data access: well-documented APIs as well as the option to scrape. And while questions about the meaning of public data and the extent to which scraping needs to be permitted remain, the commitments notably made separate mention of another access modality that is not explicitly mentioned in the DSA: customised datasets, which are to be made available to eligible researchers upon request. This is not completely new, as Meta, for example, already offers some datasets to researchers. However, it is a reminder that researchers do not have to content themselves with APIs but could request specific datasets directly, especially when they are impossible to compile through the modalities offered.
- Commitments have been changed
Initially established in 2018 and expanded in 2022, the Code of Practice on Disinformation provided a self-regulatory framework for combating misinformation by defining various commitments signatories could subscribe to. One key area is “empowering the research community”, which acknowledges the importance of a robust data access framework and includes commitments like offering and documenting adequate access modalities and their updates (commitment 26); cooperating with researchers more generally (commitment 28); transparent research (commitment 29); and developing, funding, and cooperating with an independent, third-party body to vet researchers and research proposals (commitment 27). While Twitter withdrew from the code in 2023, after Musk’s takeover, the data published by the remaining signatories has allowed us to abstractly track the development of Art. 40(12)-based access requests over the last two years.
Figure: Bar charts showing the total number of applications (and their status) reported by the platforms as part of their commitments under the Code of Practice on Disinformation. All numbers, except for access options offered by Meta and LinkedIn, relate to the EEA region. The cumulative number of reported access requests is shown as an area in the background. The dashed lines represent the acceptance rate for each reporting time frame by access modality. Since Meta only provides information on the number of accepted requests, no acceptance rate can be calculated.
In July this year, however, the Code of Practice changed into a Code of Conduct under the DSA, which meant that commitments would be audited and that compliance is considered appropriate risk mitigation. An analysis by DRI of the updated subscription documents of the subscribed VLOPSEs showed that platforms used this change to withdraw from many commitments, most notably regarding transparency of political advertising and empowering the fact-checking community. In terms of research data access, most platforms completely withdrew from commitment 27 to support an intermediary body to help with researcher vetting. These withdrawals happened after the release of the draft delegated act, when its final shape was still up in the air. Art. 14(1) of the adopted delegated act now allows DSCs to consult experts “before formulating a reasoned request, or taking a decision on an amendment request”, but it remains unclear to what extent we’ll see an institutionalisation of this support and if it will also extend to Art. 40(12) vetting (as in the case of ICPSR).
- A sneak preview of data access to non-public data based on Art. 40(4) DSA
In April, the Coimisiún na Meán (CnaM), aka the Irish DSC, conducted a survey on DSA data access “to understand more about researchers’ needs, readiness and barriers to data access under Article 40”. While the full results are yet to be published, some information was shared by Kirsty Park, Assistant Director of Research at CnaM, during a panel at CPDP as well as in CnaM’s first Vetted Researcher Newsletter: while 103 of the 116 respondents (89%) planned to submit a data access application under Art. 40(4), only 32% had previously engaged with data access under Art. 40(12), indicating significant researcher expectations for more comprehensive data availability. On average, researchers planned 5.8 applications (602 in total), which already foreshadows the complexities of cross-platform research, as reasoned requests will have to be issued and negotiated individually with platforms. The fact that roughly 75% (448) of the applications are planned for the first 6 months after the DA comes into force, and that most of them relate to VLOPSEs established in Ireland, additionally provides a first estimate of the initial work volume for CnaM. It also highlights that initial applications will need to be well founded to ensure that the resulting reasoned requests are exemplary results of the access process. CnaM recommends The Data Management Expert Guide (DMEG) as an effective way to communicate how researchers plan on managing the data they intend to access. So, if you also want to apply for data access, perhaps consider it as preparatory summer reading.
- To VLOP or not to VLOP?
The systemic risks relevant to each VLOPSE are different, as are the necessary risk mitigation measures. While some, like Amazon and Zalando, are still fighting their designation in court, for other platforms, like the navigation app Waze, the designation may come with some overhead but would likely not significantly change the business model. This is why Waze clearly reported its number of monthly active users as above the DSA’s threshold, and it already has its own option in Google’s access application form.
For others, the designation can have significant consequences: in May, the Commission released its draft guidelines on protection of minors online, which link child protection to, among other mitigation measures, age verification technology. Age verification and age estimation have been explored by an EU task force since January 2024, the former of which is repeatedly linked to the EU Digital Wallet or an interim technology. Leaving aside the potentially significant implications for internet anonymity, age verification also poses a threat to porn platforms’ business model, since only the biggest platforms will have to implement these measures under the DSA, heavily incentivising the use of alternative platforms which don’t check the users’ age. It thus comes as little surprise that the average monthly user numbers reported by Pornhub, XNXX, StripChat, and XVideos dropped below the DSA’s threshold of 45 million in February. Did that keep the Commission from opening proceedings against them for failing to put in place appropriate risk assessment and mitigation measures, shortly after the release of the guidelines in May? No.
Another service for which the designation as a VLOPSE would be very significant is ChatGPT, for which the numbers reported by OpenAI jumped from 11.2 million in October to 41.3 million monthly active EU users in March. It thus seems like a matter of time until the VLOPSE threshold is crossed. The Commission is reportedly considering designating ChatGPT as a VLOSE due to its web searching functionality. While it is unclear if this designation would relate to the entire service or only to its search functionality, these considerations point towards the potential of data access to so-called “Artificial Intelligence”. As “AI” is integrated into more VLOPSEs, we may see researchers targeting it as a “feature of [VLOPSEs’] systems” based on Art. 35(1a). Thus, it seems like we won’t need to wait long to see a clash between the broad and mostly unexplored definition of systemic risk in the DSA and the rather explicit systemic risk definition in the General-Purpose AI Code of Practice, considering that OpenAI, Google and Microsoft signed while Meta refused. Also, the significant energy and water intensity of “AI” technologies is just one obvious way in which VLOPSEs can be linked to what is arguably the mother of all systemic risks: climate collapse. It will be interesting to see how access applications arguing this way will be treated. Either way, we have come full circle to the rising heat, in both a meteorological and a political sense.
As you may have noticed, we’ve not touched on the continued politicisation and attempted misrepresentation of the DSA by the US administration and actors closely affiliated with it. Instead, we hope that (contrary to recent reporting) the Commission keeps on enforcing the DSA unaffected, while the US is poised to become a great control group to study the effect of risk mitigation measures.
Calls, Surveys and Workshops
Apart from the DSA40 Data Access Days (which you should join 😉 ), we would like to direct your attention to a set of calls, surveys, and workshops coming up in the second half of the year:
- The ERC is holding a survey on interest in ERC activities related to DSA Art. 40 and the associated Delegated Act
- Members of the Joint Research Centre’s Algorithmic Transparency team are delivering tutorials and interactive sessions on researcher data access under the Digital Services Act (DSA). Consider joining them at
- ECML PKDD (15-19 September) in Porto, Portugal for a data access tutorial on 19 September, 12:00-18:00 (CEST)
- RecSys (22-26 September) in Prague, Czechia for a Keynote by Emilia Gomez, JRC Algorithmic Transparency Unit: Recommender Systems: A European, Science for Policy Perspective
- the Algorithmic Transparency team is also hosting a research workshop in Seville on 12 November (submission deadline: 20 August)
- The Network Media Structures at Uni Mainz is hosting a Workshop on Epistemic Governance of Platforms in Mainz on 23-24 October (submission deadline: 15 August, German only)
- The 4th Arcom Research Day will take place again in Paris on 13 November 2025 (submission deadline: 29 August)
- The Platform Governance Research Network will host PlatGovNet2025 online on 1-2 December (submission deadline: 2 September)
- The second international conference on ‘The DSA and Platform Regulation’ is coming up at Amsterdam Law School on 16-17 February 2026 (submission deadline: 30 September)
We hope you have a restful summer break and that you keep asking for data access once you’re back.