Introduction: Achieving Global Biodiversity Infrastructures

Advances in information technology assisted in the formation of biodiversity informatics as a scientific and professional field (Canhos et al. 2004; Johnson 2007). For biodiversity informatics, utilizing information technology to store, record, track, transfer, and analyze biodiversity data enables a primary ambition: to model and visualize “global biodiversity” (Bisby 2000; Guralnick and Hill 2009; Hardisty, Roberts, and the Biodiversity Informatics Community 2013: 1). In order to accomplish this aim, biodiversity informatics seeks to accrue as much biological data as possible, to standardize this data and make it transferable across differing scales and boundaries, to make data public or “open” for all audiences, and to streamline biodiversity informatics infrastructures so that they become “invisible” (Hardisty et al. 2013). Much of the literature on biodiversity informatics thus covers integrating its technology with conceptual frameworks (Peterson et al. 2010: 166), improving data linkages (Sarkar 2007; Page 2008), preventing data loss (Peterson et al. 2018), and addressing its potential impacts on select taxonomies and geographies (Arbuckle et al. 2001; Costello and Berghe 2006; Guenard et al. 2017). These efforts also coincide with the perception that biodiversity informatics fulfills society’s growing need for, and use of, its data (Bingham et al. 2017: 2–3; Hobern et al. 2013: 4–5). In many respects, biodiversity informatics has developed into a broad interdisciplinary and multi-sector field that brings together taxonomic databases, species and specimen observation records, and distribution modelling (Soberón and Peterson 2004: 690–691; Leidenberger et al. 2016: 2).
For instance, millions of individuals have registered to use mobile applications such as iNaturalist and eBird, while the Global Biodiversity Information Facility (GBIF), the world’s biggest repository of biodiversity data, has aggregated approximately 2.2 billion species occurrence records as of Fall 2022.

Since biodiversity informatics requires “cooperation among governments and nongovernmental organizations and between data providers and users of the information” (Scholes et al. 2008: 1044–1045), studying these interrelationships along with the tools and technologies of biodiversity informatics can assist in situating this field. Nevertheless, scholarship in biodiversity informatics pays little attention to the interests and values that shape its objectives or to the local, social contexts of its infrastructures. Hence, we argue that a deeper understanding of biodiversity informatics infrastructures could be gained through contributions from science and technology studies (STS), which expose how “knowledge and human artifacts are human products and marked by the circumstances of their production” (Sismondo 2004: 10). Rather than challenging the aims of biodiversity informatics, we posit that contextualizing global biodiversity infrastructures (what drives them, which communities and institutions make them, and what practices and technologies they employ) can form part of its agenda. If broader society does need biodiversity informatics data as suggested, understanding its infrastructures and their trajectories has the potential to help biodiversity informatics establish greater trust with its audience and users, critically engage with its social assumptions, and better deal with “messy features of real-world problems” (Sismondo 2004: 172). Attending to how biodiversity informatics infrastructures come about can pinpoint and highlight social and political challenges beyond the accrual and modelling of biodiversity data.

We present Sweden as a case story with respect to biodiversity informatics infrastructures, with the purpose of seeking answers to two questions. What contexts fueled the production of biodiversity informatics within this country? How have they contributed to the “legitimacy, appropriateness, and long-term efficacy” of its infrastructures (cf. Edwards et al. 2009: 372)? Sweden’s biodiversity informatics community offers a fruitful case because of its early adoption of IT infrastructures and its substantial contribution to global biodiversity knowledge. Sweden, for instance, has the second largest occurrence record dataset after eBird and the highest number of occurrences per publishing country or area within GBIF. Moreover, in 2021, two of the largest and most highly publicized biodiversity informatics infrastructures in Sweden merged to form the Swedish Biodiversity Database Infrastructure (SBDI), a research infrastructure connecting data from eleven Swedish organizations. Accordingly, we story the development of biodiversity informatics infrastructures within Sweden in order to make clear the general and unique visions that construct notions of national and international biodiversity data and to stimulate additional research on the many aspects of other infrastructural trajectories (e.g., technologies and practices) of biodiversity informatics. Understanding how this infrastructure has developed can help to explain why and how the production of local biodiversity data gets transposed into biodiversity records at national and eventually international scales (Peterson et al. 2022). Focusing on Swedish biodiversity informatics makes visible the contextual aspects that allow biodiversity informatics infrastructures to become normative projects worldwide, and it brings into focus social, economic, and cultural obstacles to producing global biodiversity data and knowledge.

Theory and Methods

Scholarship in STS and its emphasis on studying infrastructure can help to contextualize species observations (Latour 1987; Callon and Latour 2010; Ruhleder and Star 1996; Edwards et al. 2007; Edwards et al. 2009; Callon and Law 1997). We take infrastructure to mean the material technologies, the individuals associated with them, and the organizations, including scientific and governmental institutions, that “enable knowledge work” and “that are inherent to the functioning of science” (Bowker et al. 2009: 98). Due to space restrictions, however, we do not analyze in close detail the specific materialities of Sweden’s biodiversity informatics infrastructures or the practices that operate and maintain them. Rather, we chart the “growth” of these infrastructures’ organizational context, storying what perspectives, interests, and relations shaped the formation of this infrastructure (Edwards et al. 2009).

To do so, we map out an infrastructural ecology that highlights different “visions” (e.g., interests/perspectives/relations) that help shape biodiversity informatics infrastructures in Sweden (Star and Griesemer 1989: 396–404). Doing so also allows us to comment on the “boundary work” that brings these infrastructures together but also keeps them apart, specifically what STS scholars Susan Leigh Star and James Griesemer call “standardizing methods” and “boundary objects” (1989: 404–413). However, because this boundary work operates at a different scale in our case (across organizations as opposed to within a single organization), we look at standardizing methods and boundary objects from a different angle, nuancing them while acknowledging their limited value at this scale (Star 2010: 612–613). In this way, we chart key organizing visions, their institutionalization, and the efforts to maintain them, which allow Swedish biodiversity records to operate in national and international communities.

This examination uses scientific and popular literature on biodiversity informatics as well as organizational documentation. Additionally, we interview eleven specialists with varying experience and expertise in Swedish biodiversity informatics, including biodiversity researchers, policymakers, data scientists, and taxonomists, through personal correspondence and semi-structured interviews informed by the dialogic and conversational models of active, collaborative, and analytic interview methods (Holstein and Gubrium 1995; Kvale 1996; Ellis and Berger 2001; Kreiner and Mouritsen 2005).Footnote 1 We follow ethical principles and anonymize the names of our interviewees (Fontana and Frey 2005). To help prevent their identification, we interweave interview data into the narrative of the text, presenting it as a vision of an institution or community (Clark 2006: 5; Moore 2012: 332).

Doing so forms part of our analytic method, which synthesizes popular, scientific, and interviewee narratives (along with their differing perspectives) about the formation of Swedish biodiversity informatics infrastructures into a case story rather than a “factual” or “true” history. Hence, what follows stories the organizational context at stake in the development of Swedish biodiversity informatics infrastructure, building a description on the “mergers” and “fault lines…between the inner and outside realms of existence of all entities involved” (Shklovsky 1991; Knorr-Cetina 2007: 69). Our analysis is further informed by discussions in STS related to the “biographies of artifacts and practices” to provide an analytic juxtaposition between “detailed analyses of the most interesting processes” and “broader scale descriptions” (Hyysalo et al. 2019: 9–10). That is, our telling of biodiversity informatics infrastructures in Sweden relies upon “the interwoven nature of narratives” about the dynamic relationships between differing interests, concepts, technologies, practitioners, and institutions (Smocovitis 1994: 410).

Results: Swedish Biodiversity Informatics, 1970s–Present

Two Visions: Scientific Progress and Conserving Species

Many digital biodiversity databases were built throughout Sweden by various people for different ends as early as the late 1970s and into the 1980s. However, two key visions emerged as the reasons for digitizing species observations, specifically: the perception that adopting this technology would assist i) scientific progress and ii) the protection of certain species.

With respect to scientific progress, individuals working in ecology and taxonomy, for instance, recognized that digital databases could provide avenues for furthering research on a species group they cared about. At the same time, after losing some relevance in the 20th century, taxonomists found they could “resurface in the interest of science” by creating digital phylogenies, which required a supporting database infrastructure (Beckman 2012: 396–397; Godfray and Knapp 2004). Some of those interested in scientific relevance worked in natural history museums throughout Sweden and realized that the creation of digital databases could assist their research and daily operations (Naturhistoriska Riksmuseet Staff 2021b). Yet these database developers had little reason for building databases that extended beyond the confines of their own scientific interests. In particular, the Swedish Museum of Natural History had a history of divergence among its zoological and botanical experts in terms of what the museum and its collections were for (Beckman 2004: 101–104). Elsewhere, scientists working with long-term data collection projects, such as those involved with Svensk Fågeltaxering (Swedish Bird Survey) based at Lund University, also began using digital records for their own scientific purposes (Svensk Fågeltaxering, n.d.). Thus, budding infrastructures built around scientific relevance remained largely personal or restricted to tight-knit interest groups with existing data. Their scientific specialization made it uncommon to share this data among scientists and naturalists who focused on different animal or plant groups. However, what drew these disparate efforts together was their access to long-term data sets and existing collections that could be digitized and made available to larger groups of people.
These existing collections and records—whether personal or shared—represented stores of information that lent themselves to digital database creation as well as the promise of greater scientific progress and relevancy for those who managed them.

During this same period, many natural scientists and naturalists saw the direct consequences of intensive human activities upon the species they were interested in and sought solutions in compiling observation records on them. Efforts directed at compiling lists of species that were endangered or threatened became a second guiding vision for individuals and institutions in Sweden in the 1970s (Artdatabankens Verksamhetsberättelse 2014: 4). For instance, Skogsstyrelsen (Swedish Forest Agency), Sveriges lantbruksuniversitet (SLU, Swedish University of Agricultural Sciences), and Naturhistoriska riksmuseet, with funds also provided by the Swedish chapter of the World Wildlife Fund and Naturvårdsverket (Swedish Environmental Protection Agency), put together the projects “Projekt Linné” and “Floravård i Skogsbruket,” which made headway in compiling the first red-lists for Sweden. As project leader for Projekt Linné, botanist Örjan Nilsson notes, “extensive help was needed with both renewed inventories as well as archiving and analysis of received material, not to mention looking for new contacts and keeping them alive.” Existing individual lists and databases could not grapple with the problem of disappearing biodiversity on their own (Nilsson 2005: 154–156). With these projects brought together under the combined effort “Databanken för hotade arter” (“Threatened Species Unit”), this vision to protect nonhuman species through biodiversity data would eventually become institutionalized at SLU in 1990.Footnote 2 Unlike those pursuing scientific relevance, amateur naturalists and scientists concerned with biodiversity loss often shared their digital records by making copies and passing these on to friends and colleagues. Members of recording groups, wildlife organizations, and researchers would share data or make atlases on specific groups of organisms by bringing their respective databases together (Artdatabanken Staff 2021d).
Once a year, individuals and clubs were summoned to their Länsstyrelse (local county office) to voluntarily provide observational data to the public authority (Kasperowski and Hagen 2022). These efforts helped make the Swedish government consider biodiversity loss a matter of national interest. In 1992, the United Nations Convention on Biological Diversity opened for signature in Rio de Janeiro, where Sweden became a signatory, officially ratifying the convention in 1993 and promising to develop a national biodiversity strategy in the ensuing years. With national interests focused on biodiversity, the Swedish parliament provided permanent funding for “Databanken för hotade arter” in 1993 (ArtDatabanken 2020), which would eventually be renamed Artdatabanken (SLU Swedish Species Information Centre) in 1995.Footnote 3 These concerns over biodiversity loss brought individuals and institutions together to build a biodiversity informatics infrastructure that could help Sweden produce biological records that would serve its public institutions.

These two visions began to materialize in the technologies needed to produce digital databases. “The gradual computerization of Swedish society” in the late 1970s, including the introduction of personal computers, provided means for individuals to develop their own digital collections (Kaiserfeld 1996: 251–252, 257). This led individuals who later assisted in the development of biodiversity informatics in Sweden to create and maintain their own databases on select species (e.g., gall wasps, grasshoppers) using available software such as FileMaker (1985), dBase (1981), and Lotus 1-2-3 (1983) as early as the mid-1980s (Artdatabanken Staff 2021b; Artdatabanken Staff 2021c; Artdatabanken Staff 2021e). Additionally, Swedish scholars who had traveled to other countries and assisted the development of databases abroad (such as Morphbank, a biodiversity image database funded by the USA’s National Science Foundation) would bring home additional values, such as using open-source software, which further diversified the technical approaches to recording and increased the complexity of combining the data (Naturhistoriska Riksmuseet Staff 2021a). These early adopters laid the groundwork for innovative large-scale databases; however, by using multiple nomenclatures as well as various database structures or formats, they simultaneously created one of the biggest obstacles to later sharing and connecting biodiversity data amongst themselves.

In sum, different visions assisted in a widespread, weak assemblage of digital biodiversity data infrastructures throughout Sweden during the adoption of database and digital recording technologies. In these formative years, Swedish biodiversity databases primarily began to coalesce around two visions, specifically centering on advancing scientific expertise and relevance by scientists and their existing species collections as well as addressing normative concerns over biodiversity loss by naturalists and state officials. However, sharing data did not always mean combining data or databases, regardless of interest. Additionally, competing visions, such as hunting and logging (not discussed here), would lead to further independent efforts at collecting biodiversity data in Sweden. Such disparate visions meant that biodiversity data, though digitized by many individuals and organizations, would remain separate and that some of this data would overlap across the multiple infrastructures being set up. Hence, the development of digital databases at this time remained somewhat fractured, isolated, and, to some extent, redundant. With many databases being created by many people in separate places and for different purposes, initial practitioners began to tie a knot that is still being untangled (Artdatabanken Staff 2021d).

Institutionalizing Visions: Science and Internationality

Those interested in the scientific relevance of species observations pursued infrastructures in connection to a broader international context. In 1999, the Organization for Economic Co-operation and Development’s Megascience Forum endorsed biodiversity informatics, recommending the establishment of a global biodiversity database. The proposed facility’s main aim was to harness the potential of collection databases by making this data open and available to the international community (OECD 1999). As a distributed database system, the Global Biodiversity Information Facility (GBIF) sought to share contributor data virtually over the internet while allowing this data to remain under the control of the contributing institutions (Soberón and Peterson 2004: 690). GBIF sought to “bring the massive amount of biodiversity data located in natural history collections to the desktop of any user,” focusing on digitizing collection-based databases for an international audience (Edwards 2000: 2313). Sweden became an official supporting member of GBIF, signing GBIF’s memorandum of understanding and providing financial contributions to the network (GBIF January 23, 2001; Telenius 2011).

As a new member of GBIF, Sweden needed to set up its own national GBIF node, and the Swedish government gave this task to Vetenskapsrådet (VR, Swedish Research Council) in early 2001. VR set up the Swedish GBIF node, GBIF-Sweden, at the natural history museum in 2003—furthering the divide between the museum’s budding infrastructure and Artdatabanken’s Artportalen—where it existed as an independent, stand-alone collaboration for several years (Naturhistoriska Riksmuseet Staff 2021a; Naturhistoriska Riksmuseet Staff 2021b).Footnote 4 The intention for GBIF-Sweden was to play a strong and obvious role in the development of biodiversity informatics and biodiversity knowledge. After Vetenskapsrådet gave the museum the assignment to work with GBIF in 2001, they hired a node manager and a technical officer. With these hires, GBIF-Sweden became active, and its staff sought to bring together collection datasets throughout the country. These workers requested existing analog records to be digitized and assisted in restructuring the formats of digital records so that they could be shared. Though informatics had been implemented to structure collections at Swedish museums and herbaria, there was little to no coordination among the different databases. Collections could not be searched or accessed, because they typically existed as a spreadsheet on a personal computer (Naturhistoriska Riksmuseet Staff 2021c). GBIF-Sweden thus focused on finding such data repositories in Sweden, attempting to coordinate large collections from natural history museums and herbaria.Footnote 5 Observational databases, like Artportalen, were considered for inclusion by the larger GBIF network, but the inclusion of data from these databases in the early 2000s still felt “radical” (Naturhistoriska Riksmuseet Staff 2021c). 
That is, observation data, being tied to normative concerns about biodiversity loss and being generated by amateur naturalists, suffered from perceptions that this data was not of comparable quality to collection-based specimen data (Naturhistoriska Riksmuseet Staff 2021b). The different visions for pursuing digital biodiversity data contributed to incompatibilities between their ensuing infrastructures.

Thus, GBIF-Sweden tackled the sharing of collections and analog-to-digital conversions, seen as a key solution for addressing the “biodiversity crisis” internationally, by coordinating among different collection databases within Sweden (Krishtalka and Humphrey 2000: 611–614). Those holding collections seemed eager to work with GBIF at first, “lining up” in order to make their collections compatible with GBIF’s system. Their enthusiasm for the work meant that Sweden led in the total number of observations on GBIF near its inception. However, GBIF, for all its good effort at international cooperation, moved slowly and appeared to lag in the computing department. Recorders were hoping they would get information on their species of interest, and so they made a “leap of faith” to contribute their collections. Indeed, bringing together these collections was a work of trust and generosity on the part of the contributors. Yet, as GBIF failed to meet contributor expectations, it started to feel like GBIF-Sweden was trying “to sell a car with nothing under the hood” (Naturhistoriska Riksmuseet Staff 2021c). Though GBIF-Sweden brought many databases together through partnerships, digitizing records remained difficult and slow. Moreover, as technologies increased the capacity for digitizing existing collections and best practices were developed, previously digitized data revealed shortcomings (Naturhistoriska Riksmuseet Staff 2021b). As of 2012, only a little over one-fifth of the Swedish museum collections were digitized, with “hundreds of years or more” to go if the pace were to remain the same (GBIF-Sweden Strategic Plan 2014: 5).

It was not until the mid-2000s that “the museum gradually adopted the ideas of GBIF for its own development with internal resources added,” and the node was incorporated within the museum’s bioinformatics and genetics research unit in 2013 (personal correspondence). GBIF-Sweden also functioned in parallel with the “Digital Information system for Natural history data” (DINA) consortium in 2006, with partners from Canada, Scotland, and Germany. This consortium sought to produce open-source, web-based collection management systems to replace the various locally hosted applications in which collections-based data were stored. In Sweden, the consortium eventually produced or took part in several applications, including Naturarv, the Swedish DNA Key, and Naturforskaren. During this time, GBIF pursued building tools and applications and found an open-source software solution in the “Atlas of Living Australia” project. In 2013, this atlas project’s open-source software became a target system for standardizing across GBIF’s international participants, with a Living Atlases community coming together and an external review of this system taking place in the following year (https://bioatlas.se/the-living-atlases-community/). The growing expertise in and use of open-source software at the international level led GBIF-Sweden to adopt this interface and form the Biodiversity Atlas Sweden consortium to pursue open-source solutions for biodiversity informatics in Sweden. Their efforts would result in a BioAtlas online portal, which would go public in March 2017.

With interests rooted in scientific relevance, the practice of sharing museum collection data nationally and internationally among scholars, and the pursuit of open-source software, the informatics community and infrastructure built around GBIF-Sweden became institutionalized within an internationally oriented context. This integral relationship with the international informatics community led GBIF-Sweden to approach biodiversity informatics in a different direction from other parts of the country, where species observations spurred by biodiversity loss remained tied to local and national managerial interests.

Institutionalizing Visions: Biodiversity Loss and the Nation

As GBIF-Sweden came into being and began its work of digitizing and compiling data from analog collections spread throughout the country, technological advances in compiling data on biodiversity observations captured the attention of naturalists involved in species and habitat protection. Concerns about the spread of salmonella in dairy products by barn swallows became an issue throughout Sweden in the 1990s (Haemig et al. 2008; Lindström 2008). To assist in monitoring this issue, staff at Naturvårdsverket worked with birding organizations—Sveriges ornitologiska förening (Swedish Ornithological Society) and Club 300—to develop observation reporting software that would keep records on where barn swallows were spotted. Though housed on a database server intended for tracking bathing water quality, the database was named “Svalan” (the Swallow) and was launched in 2000, becoming one of the first online reporting systems for birds in the world. Before the release of this observation database, Sweden’s migratory bird counters were using “Stracknet,” an email listserv that they eventually printed as a newsletter in 1997, to capture a wider distribution and audience. However, Svalan quickly garnered the attention of birders throughout the country who saw it as a new tool for making records of their observations. This interest outstripped Svalan’s original purpose, and the database was further developed to be able to record all bird species, with the first report of a whooper swan on January 1, 2000 and with the reporting system being released that June. Interests centered on public health and the pastimes of birders converged to produce this specific database. Its growth drew interest from those who set up Artdatabanken and saw potential in Svalan for assisting in developing a national registry of biodiversity data. 
As a result, Artdatabanken entered into an agreement with Naturvårdsverket to take control of the database in 2001,Footnote 6 rename it Artportalen (Swedish Species Observation Portal), and develop it to include more than just birds.Footnote 7

Unlike GBIF-Sweden, Artdatabanken did not initially have to deal with the social coordination of collection-based databases spread throughout the country and could focus on the technological challenges of making the recording of biodiversity data as easy as possible. Focused on the national rather than international scale, Artdatabanken operated as an independent unit and used its preferred software and tools for building its databases and collecting biodiversity observation records. Staff used proprietary software solutions for managing the organization’s databases, in part to maintain in-house control over their project and in part because they believed that the state of open-source software in the early years of biodiversity informatics would have made everything the organization wanted to do more difficult to accomplish (Artdatabanken Staff 2021c). They used existing tools and often developed their own way of handling data, which worked for them but made connecting their database to others using different standards more complex. For instance, taxonomic nomenclature is essential for biodiversity informatics (Soberón and Peterson 2004: 689, 694). To develop a strong taxonomic “backbone,” databases must be flexible and able to cope with complex changes in how speciation is understood, including the use of synonyms, split species, and identification of new species (Artdatabanken Staff 2021b; Artdatabanken Staff 2021c; Naturhistoriska Riksmuseet Staff 2021b). Artdatabanken’s solution involved developing a taxonomic database called Dyntaxa that would provide the species names for multi-cellular taxa occurring in Sweden. This database initially required their taxonomists to input existing taxonomic information, and the platform was modified over time to keep up with taxonomic changes (Artdatabanken Staff 2021b).
However, unlike many biodiversity databases that rely upon name identifiers, staff at Artdatabanken came up with a “taxon concept,” which allowed their concept identifier to remain unmodified even if the species name changed (Kindvall et al. 2015: 7–8; Artdatabanken Staff 2021b). Though the taxon concept serves Artdatabanken and its partners’ infrastructures very well, it remains incompatible with other systems modeled on different identifiers, including the GBIF infrastructure, which relies upon name identifiers from approximately 100 sources (GBIF Secretariat 2021). Such incompatibilities are not intrinsic failures but reflect how the different visions become materialized differently in specific organizations. Additionally, because Artdatabanken was collecting observational data, its staff also had to develop protocols for handling this data. During Artdatabanken’s early years, not much thought was given to data ownership, so as much data as possible was gathered from their sources as quickly as possible. Also, because data on sensitive species was typically shared only for species’ protection, Artdatabanken developed ways of hiding and accessing sensitive data, which is still treated privately and needs to be transferred manually even though it is technically possible to share secure data online (Artdatabanken Staff 2021c). Moreover, by relying upon proprietary software, Artdatabanken gained access to a larger hiring pool. Educators in Sweden often taught their students to use proprietary rather than open-source software because IT companies targeted universities to buy licenses from them. Hence, Artdatabanken’s needs matched the expertise of up-and-coming developers, though the organization also became subject to increased maintenance and operations costs as its system grew (Artdatabanken Staff 2021c).
By pioneering their own system early on, Artdatabanken made innovative changes and created their own niche market while simultaneously making it harder for their system to be compatible with others. In sum, being an early forerunner of observation databases contributed to future fragmentation between their system and others that arrived later and focused on their own unique visions.
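The difference between concept identifiers and name identifiers described above can be illustrated with a minimal sketch. It is not Dyntaxa's or GBIF's actual data model: the class names, the functions, and the identifier 1001 are hypothetical, and the name-keyed lookup is deliberately naive (a real name-based system would typically add synonym resolution). The example uses the blue tit, whose scientific name changed from Parus caeruleus to Cyanistes caeruleus.

```python
from dataclasses import dataclass, field


@dataclass
class TaxonConcept:
    """A taxon identified by a stable concept id rather than by its name."""
    concept_id: int            # hypothetical identifier, for illustration only
    accepted_name: str
    synonyms: list = field(default_factory=list)

    def rename(self, new_name):
        """Record a name change; the concept id is left untouched."""
        self.synonyms.append(self.accepted_name)
        self.accepted_name = new_name


@dataclass
class Observation:
    concept_id: int            # link used by a concept-keyed system
    recorded_name: str         # link used by a naive name-keyed system
    year: int


def by_concept(observations, taxon):
    """Return all observations that share the taxon's concept id."""
    return [o for o in observations if o.concept_id == taxon.concept_id]


def by_name(observations, taxon):
    """Return only observations recorded under the currently accepted name."""
    return [o for o in observations if o.recorded_name == taxon.accepted_name]


blue_tit = TaxonConcept(concept_id=1001, accepted_name="Parus caeruleus")
observations = [Observation(1001, "Parus caeruleus", 1998)]

# The species is renamed; later records use the new name.
blue_tit.rename("Cyanistes caeruleus")
observations.append(Observation(1001, "Cyanistes caeruleus", 2010))

concept_hits = by_concept(observations, blue_tit)
name_hits = by_name(observations, blue_tit)
print(len(concept_hits), len(name_hits))  # 2 1
```

In this sketch the concept-keyed lookup finds both records, while the naive name-keyed lookup misses the record made under the older name. The same asymmetry suggests why exchanging records between the two kinds of systems requires an explicit mapping between concept identifiers and names.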

The growth of Artdatabanken’s system and the number of observation-based records it produced helped Artdatabanken to become a source of expertise at the national and regional scale as well as to eclipse other Swedish biodiversity informatics infrastructures. They worked with institutions in Norway, Finland, Denmark, and Iceland, assisting data providers in these countries to integrate their data with Artdatabanken’s and/or develop their own national observation-based informatics systems (Artdatabankens Verksamhetsberättelse 2015: 5; Norwegian Institute for Nature Research 2014: 2–4). This, alongside increases in costs and attention to conservation policy, turned Artdatabanken’s ambitions abroad to informatics operations at the European level. In 2008, the European Strategy Forum on Research Infrastructures endorsed the European LifeWatch project. Sweden became the first country to receive grants to achieve a national LifeWatch structure after receiving 36 mSEK in 2009 from Vetenskapsrådet and Naturvårdsverket (Artdatabankens Verksamhetsberättelse 2015: 4), with the Swedish LifeWatch consortium beginning its operations at Artdatabanken in 2010–2011.Footnote 8 VR perhaps placed LifeWatch at Artdatabanken instead of GBIF-Sweden because the LifeWatch initiative was handled within the council’s Department of Infrastructure while GBIF-Sweden was handled within the Natural Science Department (personal correspondence). However, it could also be that GBIF-Sweden, as part of an internationally rather than regionally oriented organization that had been advised to focus on building and storing data over developing analytical tools, was less suited than Artdatabanken to host this project (GBIF 2013). For example, the LifeWatch project sought to develop a web-oriented service that provided access to terrestrial and aquatic biodiversity data (Gärdenfors et al. 2014).
Not intending to create more databases, the Swedish LifeWatch endeavor aimed to connect existing databases in Sweden and then feed them into the European system by making their data “accessible to researchers, policy-makers and citizen scientists through a single entry point” (Leidenberger et al. 2016: 7). They forged national and international connections to national and EU-based biodiversity database projects, including the Nordic LifeWatch Initiative, BalticDiversity, Biodiversity Virtual e-Laboratory, and Group on Earth Observations Biodiversity Observation Network (Swedish LifeWatch 2013: 9–13). LifeWatch seemed to be the “perfect” project to “get resources to expand the current Artdatabanken system to a national [system],” including connecting all major informatics databases in Sweden, such as “national databases on seals, butterflies, hedgehog, moose and other game animals, as well as databases on invasive species, mussels, freshwater jellyfish and crayfish or the tree portal” (Leidenberger et al. 2016: 7; Artdatabanken Staff 2021e). Not only did LifeWatch aim to collect data from existing databases but it also sought to provide analytical tools for processing data, culminating in the web application “Analysportalen.” It aimed to overcome the problems of taxonomic and data standardization as well as data gaps in order to create a “virtual laboratory” that could inform political and conservation agendas at a “global” scale (Gärdenfors 2012: 82–83). Thus, LifeWatch helped Artdatabanken further its goal of becoming the “one-stop shop” for biodiversity data in Sweden by positioning itself as the primary provider of biodiversity data and data analysis in Sweden and promising open, available, standardized data in the near future (ArtDatabanken 2017: 5, 12–13). Though the Swedish LifeWatch project moved ahead, many of its European partners struggled.
Unable to apply for funding from the Swedish research councils in 2014, the Swedish LifeWatch consortium members contributed an additional 1.5 mSEK in co-financing to the project, and Vetenskapsrådet extended the Swedish LifeWatch contract from 2015 to 2016. Though the LifeWatch project would persist in Sweden, it would take on an observer role and no longer actively participate in the European consortium (https://www.slu.se/en/subweb/swedish-lifewatch/about/organisation/; Leidenberger et al. 2016: 1). As the non-Swedish initiatives around LifeWatch struggled, the internationally oriented, science-driven GBIF network appeared more attractive to the research councils.

Two Infrastructures: Funding Visions and Autonomy

From their beginnings, biodiversity informatics infrastructures in Sweden developed ad hoc. The two visions, along with their orientations and institutionalizations, influenced these infrastructures to represent and embody different motives, features, and capabilities. As these systems grew larger, they began to occupy a scale they had never occupied before and, thus, needed to enroll assistance from organizations external to their vision and for whom these infrastructures were not built. As a result, the various interests, needs, and assistance of the external organizations drawn into the organizational side of this biodiversity informatics infrastructure maintained and, in some cases, strengthened the divide between its multi-faceted communities.

In the early 2000s, the approval of the Swedish national budget hinged upon the support of the Green Party, which, in return for getting the budget passed, bargained with the Social Democratic Party to earmark 440 mSEK (approx. 47 million euros) for biodiversity research in 2002–2004, with additional funds remaining in the budget until 2007 (Beckman 2012: 400; Vetenskapsrådet 2003: 2; Vetenskapsrådet 2010: 3; Vetenskapsrådet Staff 2021b) [Footnote 9]. These funds were handled by Vetenskapsrådet, Formas, and the Swedish Taxonomy Initiative led by Artdatabanken to revitalize museums, develop biodiversity infrastructure, and fund research (personal correspondence). Yet, many possible large-scale collaborative biodiversity research projects never materialized, perhaps because Swedish funding organizations prioritize funding a diverse range of grant projects (Vetenskapsrådet Staff 2021b; Sörlin 2007). Instead, several groups, including GBIF-Sweden and Artdatabanken, received funding, which allowed them to develop somewhat independently of each other. Additionally, these funds were attractive to researchers outside the biodiversity informatics community, and research councils granted awards to many projects that were not expressly related to developing this infrastructure. As a result, some biodiversity informatics researchers and staff at funding agencies viewed this distribution as a misuse of funds because it directed money away from where it was intended (Beckman 2012: 400–401; Vetenskapsrådet Staff 2021b). However, from a research council perspective, those in the biological sciences were seen as approaching their most important challenges in a piecemeal and uncoordinated fashion (Vetenskapsrådet Staff 2021a; Vetenskapsrådet Staff 2021b) [Footnote 10]. Also, funds provided to the Swedish Taxonomy Initiative (2002) were subject to political maneuvering within the research councils.
Given 30 mSEK, the Swedish Taxonomy Initiative sought to “complete an inventory of Sweden’s fauna and flora of multicellular organisms within 20 years” (Ronquist and Gärdenfors 2003: 270; Beckman 2012: 400). However, because this initiative “straddle[d] the boundary between academic science and environmental policy […]. The Swedish Research Council for Environment, Agricultural Sciences and Spatial Planning (Formas) did not regard this as ‘research’ at all, and eventually managed to get rid of the assignment” (Beckman 2012: 395, 401). As such, funding distribution between biodiversity informatics endeavors varied over time and among recipients, leading to unequal development across competing infrastructures.

As different organizations and actors pursued developing their own infrastructures with increased government funding, their ability to maintain relative autonomy made coordination and collaboration between them more difficult. Though different biodiversity informatics groups connected, worked together, and established ties, the Swedish government determined that biodiversity knowledge contained within the various databases was not sufficiently coordinated and appointed a commission of inquiry in 2005 (Convention on Biological Diversity 2007: 19). In 2010, an evaluation of the biodiversity research funded during 2002–2009 concluded that the biodiversity sciences in Sweden still needed a clearer strategy, definition, and greater collaboration, among other concerns (Vetenskapsrådet 2010: 10). In 2013, bringing together data and analyses still figured as an integral part of the solution for achieving biodiversity goals (CBD 2014: 58). Collaborative partnerships, rather than building a comprehensive infrastructure, often aided the specific needs of one infrastructure belonging to one group over the other. Moreover, partnerships were not often viewed as beneficial or complementary. For instance, the museum’s DINA consortium received assistance and funding from Artdatabanken’s Swedish Taxonomy Initiative (Artdatabanken Staff 2021e). Yet, as GBIF-Sweden and others within the GBIF network collaborated with Artdatabanken, these collaborations, which appeared to merge the scientific agenda of GBIF with more policy and conservation-oriented efforts, were viewed as weakening GBIF’s global image as an “apolitical global science infrastructure” (Global Biodiversity Information Facility 2013: 1–3). Partnerships among biodiversity informatics groups, though evidencing signs of collaboration, did not strive for full integration among their biodiversity informatics systems.
The distinctions between and priorities placed on collection data, observation data, and the ensuing infrastructures developed to achieve varying aims for multiple disciplines (e.g., ecology, taxonomy) put strain on the efforts to come together. Thus, with significant investment in biodiversity informatics in Sweden during the early 2000s, different biodiversity groups were able to develop their own databases in their own ways, which increased infrastructure complexity and community autonomy and made the underlying challenge of unifying data more difficult. The various informatics groups could continue to build infrastructures that served the visions of their own interest groups.

Though increases in funds did not achieve a synthesis of projects, decreases in funding challenged the autonomy of the multiple infrastructures. As funding from the Swedish government began to be scaled back, with the “biodiversity earmark…removed” in 2007, observation and collection-based informatics projects needed to find funds elsewhere (Vetenskapsrådet 2010: 3). In 2013, the Ministry of Rural Affairs reduced funding to both Artdatabanken, removing 10 mSEK from the “Svenska artprojekt” (Swedish Taxonomy Initiative) budget, and the natural history museum, intending to halve the museum’s budget (Pihlstrand-Trulp 2013: 5). In 2012, GBIF “only received 64% of its anticipated budget,” causing “serious negative effects” to their organization (GBIF 2013: 3). Such moves complemented as well as contradicted the Swedish government’s biodiversity strategies that emphasized “integrative” protection, management, use, and restoration as well as involvement in global initiatives (Regeringskansliet 2013: 5). With available funds decreasing, maintenance costs became a real concern for Artdatabanken, especially due to their reliance on proprietary software as well as technological developments that rendered the Artportalen system outdated (Peterson et al. 2022). Upgrading their infrastructure would mean rebuilding it entirely; it would mean investing finances and labor in a product that would have no more functionality than its predecessor (SLU Staff 2021). Nevertheless, in 2015, it was already apparent that Artportalen needed “to be newly built and redesigned” (Artdatabankens Verksamhetsberättelse 2015: 4) [Footnote 11]. As GBIF-Sweden and Artdatabanken mobilized to deal with the fallout from reduced funding, Vetenskapsrådet pursued combining the two infrastructures, requesting that a joint steering group be created by the start of 2015 to run both GBIF-Sweden and Swedish LifeWatch (GBIF-Sweden: Strategic plan 2012–2016; Artdatabanken Staff 2021e).
GBIF-Sweden thus sought more “intense” collaborations with Swedish LifeWatch (GBIF-Sweden: Strategic plan 2012–2016). However, the two did not come together, as GBIF-Sweden and Artdatabanken submitted separate proposals for funding during 2017. The two proposals were seen as valuable to pursue if they were combined (Vetenskapsrådet Staff 2021b); hence, Vetenskapsrådet mandated that the two infrastructures merge to form the Swedish Biodiversity Database Infrastructure (Vetenskapsrådet 2018). Ultimately, the research council made a financial decision that served their egalitarian values. They sought to reduce costs in biodiversity database infrastructures by combining them into one, prioritizing GBIF-Sweden’s international approach based on open-source software while attempting to protect their investments in the Swedish LifeWatch program (Naturhistoriska Riksmuseet Staff 2021a). In this way, they sought to ensure the longevity of their previous investments while also putting stress on the multiple databases that sought to retain independence over their approach to biodiversity informatics. Funding for the Swedish Biodiversity Database Infrastructure now requires 50% co-financing, which comes from the partner institutions (Naturhistoriska Riksmuseet Staff 2021a) [Footnote 12].

Though touted as a success, this merger has not come together without other ties being loosened or removed. As research councils prioritized funds for “innovation” over maintenance, Artdatabanken needed to figure out how to maintain its informatics systems and continue operations, leading to the attempt to charge county administrations (länsstyrelser) for using their data. This attempt failed not only because the counties reacted by downloading as much data as possible from Artportalen to their own servers before they could be sent a bill but also because the expectation was that all observational data should be free and open (Artdatabanken Staff 2021a). Moreover, reduced funding and additional cuts in 2019 led Artdatabanken’s management to reorganize (Artdatabanken 2020; Skeri 2019a: 4). Approximately 10% of the employees were fired, which led to a conflict between Artdatabanken and the labor unions. Staff argued that funds were being given to increase administration and IT personnel and that biological expertise was less valued, while management deflected blame to the research councils and governance, arguing that they were forced to make cuts, that they did so according to Swedish law, and that the research councils had stressed making budgets more efficient by focusing on digital work (Skeri 2019b: 6; Skeri 2019c: 6). The changing infrastructures and their systems, the reductions in funding, and the mandate to collaborate all contributed to reshaping the connections that existed before and shaping those that exist now.

Discussion: Further Challenges to Biodiversity Informatics Infrastructures

Our case story reveals how the organizational aspect of Swedish biodiversity informatics infrastructures developed in the context of two main visions: scientific progress and species protection. In Sweden, these visions and their connection to collections and observations led to two main, distinct biodiversity informatics infrastructures dedicated to making digital records of these types of data. As these infrastructures materialized around these visions in specific organizational contexts (including the level to which their systems depended upon open or proprietary software and were oriented towards international or national concerns), the reluctance of their respective owners to give up such infrastructures became more pronounced. Achieving a single national biodiversity informatics system proved difficult in Sweden, and efforts to do so lacked coordination. However, through our case, it becomes clear that the unification of these systems was not an original vision but one that developed over time with the addition of external organizations that funded these infrastructures. Swedish organizations that provided governmental funding to these infrastructures contributed to expanding biodiversity informatics infrastructures in both size and diversity. However, because the funding did not always match needs, it fueled competition and forced cooperation among infrastructures. Funding allowed different groups to retain more-or-less autonomous infrastructures but also drove their unification when funding became less available. This organizational context thus played a role in the formation of multiple groups and infrastructures that digitize and produce data on biodiversity in different ways, highlighting similarities and differences in developing similar infrastructures in other European countries, such as the UK and Finland (Lawrence 2010; Schulman et al. 2021: 7–8).
Furthermore, it demonstrates that standardization of methods and the construction of boundary objects occurred primarily within institutions rather than among them (Star and Griesemer 1989). This context points to the difficulties in scaling up boundary work in an infrastructural ecology that encompasses multiple institutions with non-standardized and, in some circumstances, non-reconcilable visions.

In Sweden, many databases or “repositories” (Star and Griesemer 1989) were built to meet specific organizational contexts. Yet, these repositories rarely merge at the organizational level. Instead, data redundancies and overlap exist within the broader system as only certain data gets shared between some databases and not others (Hardisty et al. 2013; Peterson et al. 2010). For instance, Sweden’s forestry and hunting data remain disconnected from the broader informatics system. Observation data on predators may be reported in different databases, such as Artportalen, Rovbase, and Skandobs, but no one can know whether an observation recorded in one of these databases is duplicated in one or more of the others (Artdatabanken Staff 2021d). Many other biodiversity databases remain separate in Sweden, including the database run by the Swedish Fågeltaxering group (Artdatabanken Staff 2021e). Ways to reduce duplicates could be found from a technical standpoint, but this ignores why the data is gathered, for what purpose, and for whom. Digitization of biodiversity data theoretically allows for data sharing, combination, and recombination, but it does not guarantee it (Van der Wal et al. 2015). These digital infrastructures still need to become “FAIR,” so that digitized data may find use and reuse (Wilkinson et al. 2016). Additionally, digitization does not supplant or do away with concerns over issues such as sensitive data, data quality, scientific credibility, privacy, institutional prestige, and funding (Anhalt-Depies et al. 2019; Cooper et al. 2021; Ward-Fear et al. 2020). For instance, biodiversity data from competing databases provides a means for politicizing data in other contexts, such as legal conflicts in Sweden that have questioned which observation data is most credible (Kasperowski and Hagen 2022).
Essentially, many key aspects keeping biodiversity repositories separate, incompatible, or redundant are social and political challenges that may continue to exist even if technical solutions appear.

To maintain databases’ integrity and character as well as to bring data together and ensure institutional relevancy, the solution has been to mold data into “standardized forms” (Star and Griesemer 1989) in order to distribute the data (Soberón and Peterson 2004: 690). That is, it is nearly impossible to use and manage biodiversity data sets without distributing parts of the development process to other actors and to technology (Giere 2002; Giere and Moffat 2003). Consequently, in order to function and avoid a potentially harmful fragmentation into various epistemic monopolies, such highly distributed settings need to develop standards as well as trust between actors and technologies (Knorr-Cetina 1999, 2007; Reyes-Galindo 2014). The standardized forms of data in one repository, however, must be altered to fit those of another when moving from one repository to the next, meaning that data gains and loses certain attributes (Peterson et al. 2022). This validates concerns regarding the trustworthiness of data appearing on aggregate databases like GBIF (Ferro and Flick 2015; Sikes et al. 2016: 149). It also implies that not all records contribute to the same purposes and agendas and, therefore, entails challenges beyond technological expertise. For instance, our case illustrates that the value of data extends beyond its scientific credibility. The values and meanings of biodiversity data change based upon various indices, such as conformity, who it is shared with, and quantity. Biodiversity data depend upon how relational links are established between them, their features (including metadata), and the infrastructures in which they are stored, and how geographic, political, and technological boundaries affect these relationships. Bringing biodiversity data together shows how data formatting needs to be made compatible, thereby altering data as it populates differing databases.

This case story also demonstrates that the biodiversity being recorded by these infrastructures represents an “ideal type” that also operates as a “coincidental boundary” (Star and Griesemer 1989). That is, the different infrastructures contribute to developing not just biodiversity but Swedish biodiversity, of which each database compiles its own version. So, although the different infrastructures work towards a common end, they evidence differences in motivations and design within the broad characterizations made of biodiversity informatics infrastructures that involve non-specialist participation (Haklay 2013: 106–111; Arts et al. 2020). The efforts of those who contribute to these infrastructures suggest that variety, diversity, specializations, and collaborations among contributors serve to set biodiversity groups apart as much as bring them together. These factors function as additional drivers of the growth and diversity of citizen science biodiversity projects (and others), alongside those already identified, such as technological advances, increases in an educated general populace and leisure time, and degree and quality of participation (Haklay 2013: 111–112; Shirk et al. 2012). As institutions and people come together to do biodiversity informatics, they accrue prestige and recognition that give their efforts meaning and value amongst themselves, their networks, and broader society, which can establish hierarchies and power imbalances. As “data represent power” as well as “emotions and personal meaning” (Lawrence 2010: 262), professional and volunteer work must be cared for and acknowledged to retain participation and address tensions among coordinators and users (Verploegen et al. 2021).
Most participant-oriented infrastructures recognize that they need to cater to the interests of those who assist their work, but when this data moves to more global contexts, data aggregators must also shoulder accountability for giving back to the institutions and individuals that have freely given over their data as well as account for what types of data make up the dataset. This makes achieving a unified means for collecting and representing biodiversity knowledge subject to idiosyncrasies that hinder a transparent negotiation regarding what biodiversity as an ideal type represents or means.

This challenge becomes even more evident as different visions and their institutionalizations come to represent different versions of biodiversity. Bias related to what data gets gathered can be philosophical as well as temporal, spatial, or personal (Boakes et al. 2010; Ellis and Waterton 2005). For example, in Artportalen, an observed organism must be non-human and undomesticated, exist or have existed within Sweden, be easily identifiable by a human unless deemed of pressing concern, and be named, and the observation must have a registered observer. Without meeting these criteria, an observation cannot be logged. Dyntaxa includes only those species deemed of interest to the system taxonomists. It includes some unicellular organisms (e.g., cyanobacteria) but not the rest. It includes fossil records of living animals that once lived in Sweden (and no longer do); yet it does not include the species of specimens held by Swedish museums that do not occur in Sweden. It has been developed to potentially include species found in other Nordic countries, but doing so has largely remained a possibility more than a reality. Iceland, for example, uses the Dyntaxa database. However, approximately 30 species of lichen, moss, and other plants (mostly introduced species from North America) that occur in Iceland but not Sweden are not present in the database because the system does not include a country identifier with which to filter data by country. It also does not allow records to be made on domesticated animals, other species of little concern, or hypothetical species, evidencing similar kinds of taxonomic biases as observation databases (Petersen et al. 2021: 8–9). How biodiversity gets modelled thus depends upon assumptions made about what content and quantity of data suffice for biodiversity models and how these assumptions get inscribed within their respective infrastructures.
The biodiversity informatics system, in order to function, places limits on what counts as biodiversity and what does not, thus advocating for certain conceptions of biodiversity over others.

Conclusion

This case story set out to answer two thematic questions: first, what contexts fueled the production of biodiversity informatics within Sweden, and second, how have these contexts contributed to the “legitimacy, appropriateness, and long-term efficacy” of these infrastructures of biodiversity (cf. Edwards et al. 2009: 372)?

What we show is that specific visions of scientific progress and species protection assisted the development of biodiversity informatics infrastructures in Sweden but that their institutionalization in different organizations assisted in keeping these infrastructures apart. By paying attention to the organizational contexts of this infrastructure, we synthesized multiple perspectives in order to make visible and appreciate how these different visions and their institutionalization permeate the work of producing biodiversity informatics infrastructures at national and international scales.

This work highlights how these infrastructures depend upon their organizational contexts and displays how they do not produce uncontested goods that can be achieved through technoscientific interventions that connect data. The emphasis on achieving global biodiversity knowledge through accelerations in data collection and improved curation directs attention away from the diversity and situated characteristics that biodiversity informatics infrastructures presently embody and will accrue in the future, such as through the addition of genomic and environmental data as well as even more intricate technologies.

More importantly, this article reveals some of the complexity behind creating biodiversity informatics infrastructures, highlighting that their amalgamations could have been rather different (Woolgar and Lezaun 2015), as could our knowledge base of how biodiversity is distributed over space and time. We show that the systems designed to provide knowledge about the occurrence of species over time across the world capture this in ways limited by their organizational contexts, and thus lead to representations of biodiversity that could have been otherwise. The way the particulars come about in this development needs to be understood in order to appraise the resulting data. That is, disagreements and incompatibilities within biodiversity informatics infrastructures and data, differences in aims and purposes, and desires and needs for recognition and support all need to be recognized as forming part of biodiversity knowledge. These conditions delay and, in some cases, alter the kinds of boundary objects produced. For instance, compiling biodiversity data according to differing visions has led Swedish biodiversity informatics infrastructures to find ways to share and transfer data rather than standardize a single biodiversity databank. Hence, at this scale the multiple infrastructures contribute to merging the boundary objects of ideal type and coincidental boundary to construct Swedish biodiversity. Thus, the underlying visions, relationships, and structures of Swedish biodiversity informatics infrastructures, situated in their organizational context, both facilitate and complicate their efficacy and legitimacy in producing the biodiversity we think we know.