The role of crowdsourcing and social media in crisis mapping: a case study of a wildfire reaching Croatian City of Split

As climate change continues, wildfire outbreaks are becoming more frequent and more difficult to control. In mid-July 2017, a forest fire spread from the forests to the city of Split in Croatia. This unpredictable spread nearly caused emergency systems to collapse. Fortunately, a major tragedy was avoided thanks to the composure of the responsible services and the help of citizens. Citizens helped to extinguish the fire and provided a large amount of disaster-related information on various social media platforms in a timely manner. In this paper, we address the problem of identifying useful Volunteered Geographic Information (VGI) and georeferenced social media crowdsourcing data to improve situational awareness during the forest fire in the city of Split. In addition, social media data were combined with other external data sources (e.g., Sentinel-2 satellite imagery) and authoritative data to establish geographic relationships between wildfire phenomena and social media messages. This article highlights the importance of using georeferenced social media data and provides a different perspective for disaster management by filling gaps in authoritative data. Analyses based on the presented reconstruction of events from multiple sources contribute to a better understanding of these types of events, to knowledge sharing, and to insights into crowdsourcing processes that can be incorporated into disaster management.


Introduction
Natural disasters are often unpredictable and can cause significant human and material damage. The development of society and technology contributes to a better response to disasters (Tuladhar et al., 2015). Using a modern, robust, and complex decision system makes the response more precise and effective, and actions faster.
First, it is necessary to understand that modern disaster relief and rescue systems are already very efficient at ensuring a disaster's successful resolution. Dissemination of information about these systems, especially concerning spatial information, the rules of search and rescue operations, and even human adaptation to such situations, is crucial. One way to achieve a better and more efficient system is to use crowdsourcing to better align environmental disasters with human factors and to better understand the social process. Crowdsourcing allows us to find additional information and scale different solutions to respond faster and more accurately to disasters.
VGI, as a crowdsourcing technique (Haworth et al., 2018), is defined by Goodchild (2007) as "the voluntary collection and dissemination of spatial information by individuals who often have little training or formal qualifications in the spatial sciences." On the other hand, crowdsourcing can also be used without location in responding to disasters. According to some authors, crowdsourcing is part of the necessary level of VGI and does not necessarily require conscious data collection (Haklay, 2013; Klonner et al., 2016).
It follows that crowdsourcing is a broader term and can be any action with any goal through crowd participation. VGI is limited by the definition of location or the compilation of geographic information and refers to organized activities and campaigns, often of limited duration. In this respect, VGI usually includes training or guidance for users, as many people are involved, including non-experts in spatial sciences.
From a plethora of general and specific emergency management theories, the specific field of crowdsourcing data and its application in wildfire response and rescue systems has emerged. For example, Oliveira et al. (2019) presented a fire warning service FDWithoutFire that improved the emergency response system for wildfires with crowdsourcing data. Villela et al. (2018) used crowdsourcing as the basis for a decision support system for emergency and crisis management called RESCUER. They used mobile crowdsourcing data to detect and respond to an incident in an industrial area. There are several emergency management systems that incorporate different data sources, and some of them are crowdsourced or social media (Castillo, 2016). SaferCity (Berlingerio et al., 2013;Castillo, 2016) integrates social media and news. STED (Castillo, 2016;Hua et al., 2013) uses traditional news media over social media news. Yang et al. (2014) developed Crowdsourcing Disaster Support Platform (CDSP), which provides a social platform with user collaboration capability to source credible crowdsourced information. LITMUS (Castillo, 2016;Musaev et al., 2014) generates landslide warnings using information collected from social networks and official data from the USGS (U.S. Geological Survey), as well as precipitation data from NASA's Tropical Rainfall Measuring Mission (TRMM).
All these systems have similar difficulties in collecting data, identifying relevant data sources, determining data reliability, and obtaining data in real-time situations. These difficulties could be addressed by combining different data sources: crowdsourcing techniques, social media, and authoritative data. Crowdsourcing is present in many perspectives of disaster management. For example, in a review of VGI for disaster management, Haworth and Bruce (2015) recognised challenges in several categories: Data Collection and Visualisation, Data Quality and Security, Data Management, and Empowerment through VGI. Their categorization serves to acknowledge and support the existing four-phase theory of disaster management: prevention, preparation, response, and recovery (PPRR) (Abrahams, 2001; Cronstedt, 2002; Bajracharya et al., 2011; Rogers, 2011; Xiao et al., 2015). Crowdsourcing through VGI has opened up opportunities for citizens to participate in all phases of this theory of disaster management (Haworth and Bruce, 2015).
For this case study, we reconstruct spatiotemporal social media and other relevant data for 24 h from the start of a wildfire incident that happened in July 2017 on the outskirts of the city of Split. The wildfire, driven by heavy wind, reached several populated places, the suburbs of Split, and residential districts in a short period. During and after the disaster, many citizens wanted to provide help and data, and they also wanted to be informed. In this research, sources of data from social media (Twitter, Facebook) were identified and merged with other external data sources to develop emergency response capabilities and raise awareness of the risk based on social media information. To this end, a methodology workflow for aggregating data from different sources and data mining guidelines based on existing knowledge were developed. In comparison to other studies, our approach integrates several sources of data, including the theoretical background. The results presented could help develop new emergency response capabilities based on combining crowdsourcing, social media, and authoritative data to improve efficiency and analysis for disaster management. Finally, an overview of the contributions and directions for further research is presented in the conclusion.

The challenges of crowdsourcing in crisis
There is a large body of literature on crowdsourcing support for disaster management. From the wealth of information found in previous studies, we highlight a few recent review studies that look back at the importance of the practice in light of the constant evolution of technologies.
Crowdsourcing through VGI opens up opportunities for citizens to participate in all phases of the PPRR theory, with a focus on the response phase (Haworth, 2016; Haworth and Bruce, 2015). Zhang et al. (2019) present a roadmap for future research based on a systematic review of previous studies. Their research includes five aspects related to social media disaster communication: the content; spatiotemporal patterns of social media usage distribution; dissemination patterns; rumour and trust issues; and the public's experience of social media usage.
One way to identify the components of crowdsourcing in disasters (control, verification, and usage) is to interview emergency managers (Riccardi, 2016).
The following background relates to the challenges of crowdsourcing in disasters that emerged during this research. Several aspects were selected to highlight these challenges: data collection, data credibility and quality assessment, privacy issues, participant engagement, and data interpretation.

Data collection
With the development of technology, crowdsourcing is becoming an important way to collect data. Technology and connections are becoming more accessible to potential participants. Various platforms for crowdsourcing data collection and analysis are being developed (Berlingerio et al., 2013; Castillo, 2016; Oliveira et al., 2019; Shi et al., 2016; Villela et al., 2018; Zhong et al., 2016; Zhu et al., 2019). Moreover, authoritative organizations are paying increasing attention to social media data (Mooney et al., 2011). According to Zhang et al. (2019), most emergency response organizations and other organizations search through text content (filtering posts based on disaster name) during the disaster. Only a few of them use a geographic search. Some of them used a detailed examination of the 10,000 geocoded tweets posted during Hurricane Sandy to categorize posts and build an ontology base for a standard framework for social media content analysis during disasters.
In most cases, social media data is widely collected, especially data from Twitter, due to its partially open API (Eilander et al., 2016;Shelton et al., 2014). Twitter allows the use of specially developed software through its API, making it usable by researchers. Depending on the research topic, some researchers have combined some data collection methods, such as using search engines, RSS feeds and collecting data from various authoritative websites (Mejri et al., 2017). Collecting disaster data is especially important in the recovery phase (Riccardi, 2016) when citizens can provide valuable spatial information about losses.

Data credibility and quality assessment
Castillo (2016) pointed out that immediacy is key to the relevance of information in social media. People on the ground gather and share information before mainstream media or disaster management systems can even respond. The importance of crowdsourcing as a source of data in disaster management is acknowledged, but so are the limitations associated with it, such as unreliability and questionable data quality.
The need to develop methods and tools for assessing the quality of crowdsourced information must be emphasized, as resources and sometimes even human lives can be lost during a crisis (Riccardi, 2016). Even with the best intentions, crowdsourcing participants can provide poor-quality information. Through social media, unreliable information and rumors can spread quickly, obscuring valuable information (Mejri et al., 2017).
Quality assessment through double-checking and triangulation of crowdsourcing data with official national data has been proposed (Mejri et al., 2017). A much more concrete proposal to ensure the quality of crowdsourcing data is the VGI protocol for improving VGI data quality (Mooney et al., 2016). Eilander et al. (2016) proposed a concept where Big Data could shape the patterns of flooding, which affects the reliability of information collected via social media.
Volunteers can also assist by assessing data after an event, for example by using aerial imagery to assess Hurricane Sandy's damage (Munro et al., 2013). This type of volunteer involvement can be sensitive to data quality: in this case, only 37% of volunteer damage assessments matched expert assessments (Munro et al., 2013; Riccardi, 2016). A VGI campaign set up on a good platform with a proposed classification and guidance for volunteers can significantly increase data quality (accuracy: 89%; sensitivity: 73%; and precision: 89%).

Privacy issues
When using data from social networks, special care should be taken not to violate users' privacy. Although Twitter has fewer users than Facebook, the data on Twitter is public and available for processing through official APIs, and it has been more widely used by researchers (Fiesler and Proferes, 2018).
On the other hand, the Facebook API is more restrictive, and the use of its data raises more ethical issues. However, publicly available data can be used for research purposes without the user's explicit consent (Franz et al., 2019). Of course, the privacy and security of individuals should be considered. The ethical and legal issues related to VGI campaigns are not yet entirely clear or resolved. Nevertheless, it is necessary to ensure that both parties give and understand their consent to achieve the research goal (Mooney et al., 2017).

Engagement of participants
One of the components of crowdsourcing in disasters, as part of the participants' motivation, is control over the disaster (Riccardi, 2016). Some citizens were motivated to participate in social media sharing to share their information with other users. Citizens had a sense of control over the situation when they actively participated in sharing information, especially during the recovery phase. Other motivations for sharing information during a crisis are emotional expression, crisis coping behaviors, and information seeking (Bird et al., 2012; Smith et al., 2018; Zook et al., 2010). In general, participation in VGI campaigns is associated with human altruism and a sense of loyalty, and it can be reinforced by feedback from providers. It has also been found that as the VGI concept and technology evolve, volunteers become more interested in the impact and quality of the data (Baruch et al., 2016). Educated and motivated volunteers provide and collect more relevant data and even participate in the process of data quality assessment (Haworth, 2016; Riccardi, 2016; Rogers, 2011; Zhang et al., 2019).

Spatiotemporal data interpretation
The interpretation of the data proves to be one of the challenges due to the spatiotemporal context. The primary purpose of crisis mapping is to respond with accurate information. Later, during recovery, the data can be interpreted as static thematic maps (Mejri et al., 2017). One possibility is to use Big Data and appropriate statistical algorithms to derive probability maps, e.g. for floods (Eilander et al., 2016). One of the more sophisticated analyses based on spatiotemporal theories is mapping user-generated data as a basis for understanding socio-spatial relationships. When analysing data collected by crowdsourcing, the existence of more complex spatialities than longitude and latitude becomes evident (Shelton et al., 2014). Witanto et al. (2018) implemented a framework for predicting city events based on social media and thus created the basis for a smart government. This paper shows how visualising social media data and combining crowdsourcing and authoritative data can help to better understand the nature of events.

Study area and motivation
The city of Split, the second largest city in Croatia, is located mainly on a peninsula surrounded by the Kozjak and Mosor mountains (Fig. 1). The city is the centre of the Dalmatia region and the surrounding settlements are inhabited by more than 200,000 people (Croatian Bureau of Statistics, 2018). Due to the tourist attractiveness of this area, the number of inhabitants increases significantly during the summer season.
A Mediterranean climate characterises the area, with hot, dry summers and mild, wet winters. The mean annual precipitation is 782.8 mm, falling mostly in the period from October to April, and the mean annual air temperature is 16.1°C. Monthly extremes of precipitation and temperature occur in July, with the lowest mean precipitation of 25.5 mm and the highest monthly air temperature of 25.7°C (Croatian Meteorological and Hydrological Service, 2018). Undeveloped adjacent rural and mountainous areas are covered by scrubland and forests. Aleppo pine (Pinus halepensis) is the most common tree species found there, while a much smaller area is covered by black pine (Pinus nigra) and pubescent oak (Quercus pubescens). These areas are the main fire hazard zones as they provide highly flammable fuel, especially during the dry summer season.
The motivation for this research is, among other things, the frequent occurrence of forest fire outbreaks in Croatia during summer days. Due to high summer heat, strong winds and human factors, as in countries with similar climate (Spain, Portugal, Greece, Italy), Croatia is exposed to increased fire risk in highly populated coastal areas. Dalmatian fire units record at least 10 calls per day during the summer months (Copernicus EMS, 2018). The prolonged drought and the strong winds that blew from mid-July created ideal conditions for the rapid and terrifying spread of the forest fire. During the time the fire raged, the situation was on the verge of evacuation. The danger of a catastrophe arose when the fire spread to the city of Split landfill site called Karepovac (about 20 ha), located at the city limits, with the danger of releasing toxic gases. The fire was brought under control around noon the next day.

Input data
The input data combine three types of data: crowdsourcing, social media, and authoritative data. Each of these three data types involves a crowdsourcing practice. VGI is provided voluntarily by citizens following explicit instructions (related to this event) on a prepared platform. Because it was an organised campaign, this part of the data collection is referred to as a volunteered geographic information (VGI) activity. VGI data was collected through the Crowdmap (Herbert, 2017) platform.
Crowdmap is a free and open-source tool based on Ushahidi. Ushahidi is a tool, or concept, developed by Kenyan civil activists in 2008 to track and prevent ethnic clashes using geographic data (Mäkinen and Kuira, 2008). A map of the affected area was created on the Crowdmap platform and shared with the public via social media and networks. On this map, users could share information by textually describing an event, uploading media, or drawing on a map (with the addition of time attributes and a textual description).
Other crowdsourcing data was collected from social media (Twitter and Facebook), where users had approved the use of location and position information in the description of the post (by choosing to share public posts in public groups and pages). Groups and pages related to this area and to fires were found, and data on publicly published posts was obtained with the administrators' permission. Posts indicating a location and a timestamp were added to the input data. Data from social media helped fill the gaps between other data sets.
The crucial source of data was Public Fire Department of Split (PFDS) call centre data, in which over 4000 calls of citizens were received and interpreted. After the event, PFDS staff listened to the recordings of all calls received by emergency services and fire departments within the specified period and created the transcript. The result was a text file containing the time and transcript of every call that night. From this record, those that denoted a location were extracted and georeferenced with the appropriate time attribute.
The burned-area polygon estimated by the Natural Protection and Rescue Directorate (NPRD) and multispectral Copernicus Sentinel-2 satellite data were used for general verification of data collected through crowdsourcing. Data harvesting resulted in a database containing various data types and properties, as shown in Table 1, which lists the raw input data that is later processed and analysed.

Data processing
In the next step, all data were homogenised, mostly into shapefile format (except images), and used to visualise the information. The workflow of Split wildfire crowdsourcing (Fig. 2) presents a methodology for aggregating data from different sources. The methodology was derived by conducting a small search in our own data and following the process of using a different approach. The basic data framework is formed from several data types: crowdsourcing (VGI and social media) data and authoritative data. These types of data fed into the design of the geodatabase.
The VGI data from the Crowdmap was exported in table format. Although users can enter the location and time of the event in the Crowdmap application, some of the posts did not include this information. Users described the locations and typed the timestamps in the description field. These types of posts were manually geocoded and placed in the correct timeline.
Fig. 1 Geographical extent of the study area, the city of Split, center of the Dalmatia region in Croatia
Social media data was largely analysed manually with a little help from Octoparse software. In this case, the internet and social platforms also helped mobilise volunteers for the action and later for data collection. The Octoparse software was used to point out interesting posts and threads. The first step was to find the right keywords and hashtags (Murzintcev and Cheng, 2017), for which Google tools were used to show the popularity of different keywords in the Croatian language related to forest fire. The second group of keywords related to the location: the city of Split, nearby places, and Croatia. The third group of keywords referred to warnings and dangerous situations. Data irrelevant to this event were manually removed from the data collection.
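The three keyword groups described above can be illustrated with a small filter. This is a minimal sketch, not the procedure used in the study, where posts were largely reviewed by hand. The thematic and warning terms below are hypothetical placeholders (the actual terms are those listed in Table 2), the toponyms are taken from the text, and the rule of keeping a post only when it contains both a thematic term and a toponym is an assumption made for illustration.

```python
import re

# Hypothetical keyword groups; the study's actual terms are listed in its Table 2.
THEMATIC = {"požar", "vatra"}            # assumed fire-related terms
LOCATION = {"split", "žrnovnica", "podstrana", "kila", "srinjine", "tugare"}
WARNING = {"opasnost", "evakuacija"}     # assumed warning terms


def tokens(text):
    """Lowercase the text and split it into words; '#' is dropped, so hashtags match keywords."""
    return set(re.findall(r"\w+", text.lower()))


def classify_post(text):
    """Return the keywords from each group that occur in the post."""
    words = tokens(text)
    return {
        "thematic": sorted(words & THEMATIC),
        "location": sorted(words & LOCATION),
        "warning": sorted(words & WARNING),
    }


def is_candidate(text):
    """Assumed rule: keep a post if it names the fire theme and at least one toponym."""
    hits = classify_post(text)
    return bool(hits["thematic"]) and bool(hits["location"])
```

Exact token matching misses inflected Croatian word forms, such as case endings on place names, which is one reason a manual review step remains necessary.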
The official NPRD data were already georeferenced vector polygon data and did not need any special processing. Multispectral satellite images from Copernicus Sentinel-2 were used to calculate the differenced Normalised Burn Ratio (dNBR) (García and Caselles, 1991). The NBR or dNBR is generally used to estimate the severity of a burn or fire or to highlight the affected areas. The differenced NBR is a temporal measure computed from satellite images acquired before the fire event and satellite images acquired after it. It is based on the satellite bands from the near infrared and shortwave infrared regions of the electromagnetic spectrum (García and Caselles, 1991).
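The dNBR calculation can be sketched in a few lines. NBR is conventionally computed as (NIR - SWIR) / (NIR + SWIR), and dNBR as the pre-fire NBR minus the post-fire NBR, so burned pixels receive higher values. The sketch below works on plain nested lists and uses made-up reflectance values. The source does not state which Sentinel-2 bands were used, so the choice of near-infrared and shortwave-infrared inputs (commonly B8 or B8A and B12) is an assumption.

```python
def nbr(nir, swir):
    """Normalised Burn Ratio for one pixel: (NIR - SWIR) / (NIR + SWIR)."""
    total = nir + swir
    if total == 0:
        return 0.0  # guard against division by zero on empty pixels
    return (nir - swir) / total


def dnbr(pre_nir, pre_swir, post_nir, post_swir):
    """Differenced NBR per pixel: pre-fire NBR minus post-fire NBR.

    All four arguments are 2D grids (lists of rows) of the same shape.
    """
    return [
        [nbr(a, b) - nbr(c, d) for a, b, c, d in zip(r1, r2, r3, r4)]
        for r1, r2, r3, r4 in zip(pre_nir, pre_swir, post_nir, post_swir)
    ]


# Illustrative 1 x 2 grid: the first pixel burned, the second did not change.
example = dnbr(
    pre_nir=[[0.4, 0.4]], pre_swir=[[0.1, 0.1]],
    post_nir=[[0.2, 0.4]], post_swir=[[0.3, 0.1]],
)
```

In this example the first pixel yields a dNBR of about 0.8 and the unchanged pixel yields 0, which matches the interpretation that high values highlight fire impact.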
PFDS provided raw data in a tabular format containing three columns: Timestamp, Phone Number, and Call Description. Among the more than 4000 call records from citizens to the fire call centre, there were some official communications between fire units and police, made because of a lack of available communication equipment. Records that contained a location description, including toponyms, and calls made from landlines associated with an address were manually selected. Some calls from landlines indicated a fire near a house in the description, so the house address was used for geocoding. The result was approximately 100 records that were geocoded. Many factors, such as call location, call time, call type, time of contact, and call duration, are considered for later analysis.
Figure 3 presents a data flow diagram that gives a summarised overview of the data processing. After studying previous research, the criteria for identifying relevant data types for data collection became clearer. Spatial data were collected from the recognized sources and different contexts. The collected spatial data was processed and analysed to extract valuable information and later visualised to show examples of data use. Interpretation of the visualised data adds to the existing knowledge of the spatiotemporal characteristics of the Split wildfire.
Based on the retrospective data analysis presented, an approach and methodology for crowdsourcing data integration were proposed. The procedure is based on activities undertaken to reconstruct an event by combining crowdsourcing data. Therefore, the shape of an inverted pyramid representing the invested effort was chosen to describe this approach (Fig. 4).
The procedure consists of five phases: 1) Identify sources with relevant data. In this phase, the Crowdmap campaign is created and promoted. 2) Collect raw data from identified sources and send requests for official data. 3) Data mining and pre-processing the data from the combined sources. 4) Processing the data. 5) Processing and analysing the information and verifying it by designing test maps and other visualisations.
The first phase consists of two stages: the identification of various sources of existing data on the topic and the organisation of a VGI event through a Crowdmap campaign. In the first stage, various relevant data sources were identified by searching and crawling the web on this topic. Several interviews with experts on this topic and a discussion with the community about participation were then conducted. The most used keywords were selected for the study. In the second stage, a VGI event was organised as a Crowdmap campaign. Crowdmap is a platform where an organisation can build public awareness and engagement with their data, so we promoted our campaign through social media and networks. Each participant was asked to create a Crowdmap account and fill out the suggested form about the event. People responded and helped to reconstruct the event, drawing on lists of calls, transcriptions of messages, and media from their smartphones.
The second phase involved collecting raw data from identified sources on the internet and sending out requests for official data. Identified keywords and sources from the previous phase were used to collect all data relevant to this event by using the mentioned web crawling tools. Data collected through the Crowdmap campaign was also downloaded.
The third phase prepares the data for the next phase. It consists of rough data mining and data preprocessing from the combined sources. The primary data collected for this phase was subjected to data refinement. Relevant data were selected based on location and time. To aggregate data, especially spatial data, it is necessary to transform all data into a compatible format and reproject them into the same projection and datum.
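The selection of relevant records by location and time can be sketched as a simple filter. This is a minimal illustration under stated assumptions: the bounding box around Split, the 24-hour window on 17 July 2017, and the record layout (a dictionary with `lon`, `lat`, and `time` keys in WGS 84 longitude and latitude) are all hypothetical and not taken from the study's geodatabase. Reprojection into a common projection and datum is not shown; it would be applied before this step.

```python
from datetime import datetime

# Hypothetical study-area bounding box (WGS 84): min_lon, min_lat, max_lon, max_lat
BBOX = (16.35, 43.45, 16.70, 43.60)
# Assumed time window: the day of the fire
START = datetime(2017, 7, 17, 0, 0)
END = datetime(2017, 7, 18, 0, 0)


def in_scope(record):
    """True if the record lies inside the bounding box and the time window."""
    min_lon, min_lat, max_lon, max_lat = BBOX
    return (
        min_lon <= record["lon"] <= max_lon
        and min_lat <= record["lat"] <= max_lat
        and START <= record["time"] < END
    )


def select_relevant(records):
    """Keep only records relevant by location and time."""
    return [r for r in records if in_scope(r)]


sample = [
    {"id": 1, "lon": 16.44, "lat": 43.51, "time": datetime(2017, 7, 17, 22, 0)},  # in scope
    {"id": 2, "lon": 15.98, "lat": 45.81, "time": datetime(2017, 7, 17, 22, 0)},  # outside the area
    {"id": 3, "lon": 16.44, "lat": 43.51, "time": datetime(2017, 7, 19, 9, 0)},   # outside the window
]
relevant = select_relevant(sample)
```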

Results
The search for relevant keywords and hashtags for this fire event resulted in 7 thematic terms, 14 location terms (toponyms) and 4 warning terms (Table 2).
The map presented in Fig. 5 shows the spatial distribution of the 14 keywords and hashtags found during social media mining, which resulted in a total of 150 social media posts. The cumulative number of relevant social media posts per location published on July 17, 2017 is shown in dark blue (Facebook) and light blue (Twitter) above the location name. The sites with the highest number of posts were the city of Split, Žrnovnica (settlement with approximately 3000 inhabitants), Kila (district of the city of Split) and Podstrana (municipality with approximately 7000 inhabitants) (Fig. 5, Table 2). The Twitter platform was heavily used with the hashtags Split, Žrnovnica, Podstrana and Kila, while Facebook was used with the keywords Split, Srinjine (settlement with approximately 1000 inhabitants), Žrnovnica and Tugare (settlement with approximately 700 inhabitants) (Fig. 5, Table 2). Population figures were taken from the Croatian Bureau of Statistics (2018) data.
Table 3 provides an insight into the temporal as well as spatial distribution of social media posts on this topic. The highest number of posts published on the Twitter platform (45) occurred in the period from 22 h to midnight on July 17, 2017, while 33 more posts were published in the period from 20 to 22 h (Table 3, Fig. 5). Twitter activity was significantly lower in other time periods, with 8 relevant posts published in the period from 18 to 20 h and even fewer in the other time periods. On the Facebook platform, most relevant posts (17) were published from 20 to 22 h and 12 more posts were published from 18 to 20 h. Interestingly, there were 8 Facebook posts from 10 to 12 h at the locations where the fire broke out, while Twitter activity was much lower at that time (2). Generally, both social media platforms were more active in the evening (18-24 h), which corresponds to the time when the fire reached the urban area. On the other hand, there were no relevant social media posts in the early morning (0-6 h).
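The two-hour aggregation used for the temporal distribution can be sketched as follows. This is an illustrative example only: the input layout (pairs of platform name and timestamp) and the sample posts are assumptions, not the study's data.

```python
from collections import Counter
from datetime import datetime


def two_hour_bin(ts):
    """Label the two-hour window containing the timestamp, e.g. '22-24 h'."""
    start = (ts.hour // 2) * 2
    return f"{start:02d}-{start + 2:02d} h"


def count_posts(posts):
    """Count posts per (platform, two-hour window).

    posts: iterable of (platform, datetime) pairs.
    """
    counts = Counter()
    for platform, ts in posts:
        counts[(platform, two_hour_bin(ts))] += 1
    return counts


# Made-up sample posts for illustration
sample_posts = [
    ("Twitter", datetime(2017, 7, 17, 22, 15)),
    ("Twitter", datetime(2017, 7, 17, 23, 59)),
    ("Facebook", datetime(2017, 7, 17, 20, 5)),
]
post_counts = count_posts(sample_posts)
```

Because the result is a `Counter`, a window with no posts simply returns 0 when looked up, which is convenient for tabulating sparse periods such as the early morning.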
The Crowdmap platform provided 30 data points (Fig. 6). The Crowdmap data cover only the period from 12 to 24 h and are mostly concentrated near the built-up areas (Fig. 6). Although the least amount of data was collected through the Crowdmap platform, the data were of high spatial and temporal quality. The descriptions provided in the Crowdmap contributions were very helpful in fire reconstruction and mapping.
In this study, most of the data were obtained from the PFDS, resulting in more than 4000 phone call records (Figs. 7 and 8). Although only a smaller subset could be accurately georeferenced (Fig. 8), the call descriptions of the other telephone calls proved useful in reconstructing the fire event. Figure 7 shows an overview of the temporal distribution of all data collected via crowdsourcing and emphasises the importance of this dataset. The orange line in Fig. 7, representing the PFDS phone calls, is scaled down by a factor of 10 to fit the other data collected through crowdsourcing. It should be noted that the data on the start of the fire, from 0 to 8 h, could only be reconstructed from this dataset, since there was no information from social media or the Crowdmap platform from that time.
There is a spike in calls to PFDS at 6 pm (Fig. 7). The spike overlaps well with other geocoded spatiotemporal social media and Crowdmap data, as the fire was most turbulent and closest to the urban and suburban area at this time, as seen in the map (Fig. 8). It also suggests that people were calling because the risk and visible threat were higher during night-time (Fig. 7). The figures shown are extracted from collected data that appeared interesting and relevant to the reconstruction of the event.
Table 3 Social media posts distribution by source (Facebook or Twitter platform), time of day on 17 July 2017 and location toponym extracted from the post's description. *Split (city) is geographically related to the term Dalmatia (region)
For general verification of data collected through crowdsourcing, Copernicus Sentinel-2 images and vector polygons of estimated fire extent provided by NPRD were used. Copernicus Sentinel-2 images from May 18, 2017 and August 6, 2018 were selected and the differenced Normalized Burn Ratio (dNBR) was calculated to highlight fire impacts (Fig. 9).
The fire event's spatiotemporal reconstruction was done using the georeferenced and contextual data contained in the data descriptions (Fig. 10). The final product is the fire trajectory, which was created using all three types of crowdsourced data (Crowdmap, social media, and PFDS phone calls) in its visualization (Fig. 10). The trajectory consists of 24 consecutive line segments, each indicating the general fire movement during one hour of July 17, 2017. The fire started near the settlement of Tugare and entered the city of Split (Fig. 10). The number of phone calls to PFDS was used to divide the width of the segments into 6 classes (0-60, 61-120, 121-180, 181-240, 241-300, 301-360). The number of social media posts per hour was used to colour the trajectory segments on a linear scale from yellow to red, where yellow indicates a low number of social media posts and red a high number of social media posts in an hour. The number of Crowdmap posts was used to adjust the transparency of the trajectory segment. This was also done in a linear fashion, with hours (trajectory segments) in which there were few or no Crowdmap posts having a transparency of 70% and hours with a high number of Crowdmap posts being opaque (0% transparency).
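The styling rules for the trajectory segments can be sketched as three small mappings. The class boundaries and the 70% to 0% transparency range come from the description above. The exact RGB endpoints for yellow and red, and the choice to normalise by the maximum hourly count, are assumptions made for illustration.

```python
import math


def width_class(calls):
    """Map hourly PFDS call counts to width classes 1..6 (0-60, 61-120, ..., 301-360)."""
    return min(6, max(1, math.ceil(calls / 60)))


def colour(posts, max_posts):
    """Linear ramp from yellow (255, 255, 0) to red (255, 0, 0) by social media post count."""
    t = posts / max_posts if max_posts else 0.0
    return (255, round(255 * (1 - t)), 0)


def transparency(crowdmap_posts, max_crowdmap):
    """Linear ramp from 70% transparency (no Crowdmap posts) to 0% (maximum count)."""
    t = crowdmap_posts / max_crowdmap if max_crowdmap else 0.0
    return 70.0 * (1 - t)


def style_segment(calls, posts, crowdmap_posts, max_posts, max_crowdmap):
    """Combine the three rules into one style record for an hourly segment."""
    return {
        "width_class": width_class(calls),
        "rgb": colour(posts, max_posts),
        "transparency_pct": transparency(crowdmap_posts, max_crowdmap),
    }


# Hypothetical hour: 200 calls, the busiest hour for posts, no Crowdmap reports
segment = style_segment(calls=200, posts=10, crowdmap_posts=0, max_posts=10, max_crowdmap=5)
```

Encoding the three sources in width, colour, and transparency keeps each dataset visually separable on a single line geometry, which appears to be the intent of the visualisation described above.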

Discussion
The results shown in this paper confirm the usefulness of combining different crowdsourced data sources to support a disaster management system based on crowdsourcing data integration. However, several limitations emerged during and after the crowdsourcing data collection that need to be pointed out. From the disaster management perspective, the unpredictability of human behaviour and the prediction of hazards accompanying disasters are also key problems. The first problem can be solved by informing citizens about current warning mechanisms and providing them with accurate and timely information (Durand et al., 2018; Tuladhar et al., 2015). In case of a natural disaster, citizens can be informed in three phases: before, during, and after the event (Hua et al., 2013). The second problem can be solved by developing better technical response systems based on a theoretical framework, which is often developed by reconstructing past events. This type of analysis helps us better understand the cause and sequence of the event and use citizen science to reduce disaster risk (Parajuli, 2020). A valuable amount of data is extracted from social media. Although this larger amount of data raises privacy issues concerning the release of location data, these issues could perhaps be addressed with new privacy agreements on social networks and, later, technically with the numerous tools available.
Figure note: The number of phone calls to PFDS is much higher than the number of data in the remaining sources, so the actual number is 10 times the number seen on the "Number of data" axis.
Fig. 9 Map of differenced Normalized Burn Ratio (dNBR) derived from Copernicus Sentinel-2 images from 18 May 2017 and 6 August 2018, data collected by crowdsourcing and the estimated fire extent polygon provided by the Natural protection and rescue directorate (NPRD). Red colour indicates high fire impact and green colour no fire impact
In this research, only posts shared in public groups or pages were used, with great respect for the individuals' personal data and security. For data mining, social network keywords need to be selected, so different tools were used for identifying keywords and later for web scraping. Some posts with useful information may not have contained any of the selected keywords, which opens space for new research proposals in this direction, such as semantic web crawling to support disaster management.
In most cases, citizens used their mobile phones to post on social media due to the power outage. In this way, they became a kind of sensor. The authors are not aware of any recent research that evaluates the positional accuracy of text social media data. Senerathe et al. (2017), in their review of methods for assessing the quality of volunteered geographic information, group them into credibility-based methods and text content quality methods. The research reviewed does not address emergencies or disaster situations and is therefore not appropriate for this use case. This provides future research opportunities for developing models to assess the quality, credibility, and positional accuracy of social media text data collected during emergencies and disasters. The initial motivation for citizen participation in social network information sharing during disasters is communication and information seeking (Bird et al., 2012; Hjorth and Kim, 2011; Zook et al., 2010) and a sense of control over the situation (Riccardi, 2016). As many studies have shown, training VGI volunteers is as important as increasing their motivation (Haworth, 2016; Riccardi, 2016; Rogers, 2011; Zhang et al., 2019). As mentioned earlier, citizens can become part of a disaster monitoring system. In this case, the Crowdmap campaign was not launched in time: it started 10 days after the fire, while people were still under the impact of the event, so more effort was needed to promote the campaign and motivate more volunteers. Although there were instructions for the volunteers, many did not manage to enter data into the Crowdmap accurately, and these entries later had to be corrected during data pre-processing. The advantage of this type of data collection is that the exported data was georeferenced. In their research, Foody (2018) and others demonstrate the utility of the wisdom-of-the-crowd approach to increase the accuracy of data collected by volunteers.
They propose a voting process that is weighted with information derived from the contributed data. This provides an excellent technique to increase the quality of VGI data and achieve greater accuracy from data collected from a group than from data collected from a single contributor. As mentioned earlier, CDSP (Castillo, 2016) provides a system for checking the credibility of users, and it would be useful as well as challenging to integrate social media into these systems to obtain more accurate crowdsourcing results.
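The idea of a weighted voting process can be illustrated with a generic weighted majority vote. This is only a sketch of the general technique, not the specific method of the cited work: the weights here are hypothetical contributor reliability scores, and how such weights would be derived from the contributed data is left open.

```python
from collections import defaultdict


def weighted_vote(reports):
    """Return the label with the largest total weight.

    reports: iterable of (label, weight) pairs, where weight is a
    hypothetical reliability score for the contributor.
    """
    totals = defaultdict(float)
    for label, weight in reports:
        totals[label] += weight
    if not totals:
        raise ValueError("at least one report is required")
    return max(totals, key=totals.get)


# Hypothetical reports about one location (not data from this study).
reports = [
    ("fire", 0.9),        # one contributor judged highly reliable
    ("smoke only", 0.3),  # two contributors judged less reliable
    ("smoke only", 0.3),
]

print(weighted_vote(reports))
```

With equal weights the two "smoke only" reports would outvote the single "fire" report; with the weights above, the total for "fire" (0.9) exceeds the total for "smoke only" (0.6), which shows how weighting can change the outcome of a simple majority.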
Multispectral satellite data and official data from NPRD were used as reference data to represent the burned area's boundaries. In this verification phase, we also left open the possibility of adding different layers such as meteorological data, firefighter positions data, or even vegetation types to make future maps more informative. These datasets can inform us about wildfire processes in different locations. Changing fire patterns are still being studied through analyses of data at different scales (Mejri et al., 2017;Mooney et al., 2016), but there are several geographic relationships, as shown by the observations in this study.
The data from PFDS were authoritative data with an active crowdsourcing component. Call centre agents took more than 4000 calls from citizens and noted a brief description of each. For this study, it was possible to geocode selected calls. In some cases, precise toponyms were listed in the description, but other calls were less precise because they were geocoded from the fixed telephone address. Less accurate calls were included because they fit into the reconstruction of fire spread based on the more accurate data and fell within the limits of the reference data. The advantage of this type of data collection was the inclusion of seniors in crowdsourcing, since some are not used to social media and prefer to communicate directly (Alexander, 2014). As Riccardi (2016) mentioned, call centres hold a large collection of data, and this information may serve as the basis for determining the criteria for evaluating relevant information at the time of a disaster.
The merit of using multiple types of crowdsourcing data is that it covers all age groups of citizens and overcomes technical limitations, such as poor mobile network connectivity. After the disaster, in the recovery phase, it is possible to provide connectivity with solutions such as Cell on Wheels (COW) (Riccardi, 2016). The lack of network operation during a disaster can be overcome by developing a mobile application that collects urban data to reduce hazards and enables communication during the disaster without using a network (Zhao et al., 2018).
Since the event took place in micro-locations where not many people were involved and everything happened relatively quickly (about 24 h), most of the data analysis was done manually. Another advantage of manual data analysis is the possibility to check the relevance of the information and, based on that, to design the methodological approach later.
Authoritative and non-authoritative organisations have recognised social media's power and the possibilities of citizen science to better respond to various challenges (Becker and Bendett, 2015; Hossain, 2020; Mooney et al., 2011). Although there are differences between citizen postings and official announcements, in this case citizens served as an adjunct and provided valuable original information (David et al., 2016). The use of spatial data to prepare for and manage risks associated with civil emergencies is likely to be one of the challenges of the 21st century. The main task currently being worked on is coordinating information systems and technologies to improve the quality of disaster risk information available to decision-makers.

Conclusions
This research presents an innovative approach to disaster data collection using the 2017 Split forest fire event as an example. By collecting data generated during the event from multiple sources, this case demonstrated an approach to better disaster management. Citizens actively share information during the response phase. Their posts can serve as emergency alerts, highlighting how late traditional media outlets report from the field. This type of data collection in the recovery phase can be used for damage records, routing citizens to safe environments, and assessing information from the previous phase.
This paper focuses on integrating multiple data sources after a disaster: data from identified sources available after the disaster were combined to reconstruct the event. Based on the retrospective data analysis presented, an approach and methodology for integration were proposed. The proposed methods for integrating data could be applicable in real time for crisis mapping. Information is often needed immediately when a disaster occurs, and insight into the process can help develop an emergency response system with more reliable information. This research opens new horizons for organisations whose main activity is fire protection. The achievements shown in this article can also be applied more generally to other disaster management organisations.
The basis of a reliable disaster management system is accurate spatial data. This study suggests that citizen science can support the disaster management system by making citizens a part of the monitoring system. In this way, citizens would contribute to creating a warning system and to the subsequent reconstruction of the events that occurred. This article highlights the importance of using geospatial information from social media, which provides a different perspective on disaster management by combining data from multiple sources.
This type of combination of different data sources can only be used in densely populated urban and suburban areas. The data analysis presented shows that the number of contributions increases rapidly near urban areas. The main task that is currently being improved is the coordination of multiple sources to improve the quality of disaster risk information available to decision-makers. In future research, there is an opportunity to find a way to integrate this collected data with the early warning system for forest fires, for example, in forests near populated areas. As mentioned earlier, developing or improving models to assess the quality and location accuracy of social media text data collected during emergencies and disasters should also be considered in future actions.
The next phase of this research is to find and visualise spatial-geographic relationships between wildfire phenomena and the combined data sources, which we have identified as open problems. This will be addressed in future work, for example through a better geographic representation of mapped elements from crowdsourced spatiotemporal data. The next phase of research could also determine the criteria needed for the different functions of a future decision support system, such as extracting relevant information for prioritising actions. This approach can also help develop disaster management and analysis systems that collect spatial data from numerous crowdsourced data sources and serve as a recommendation for progress towards the more extensive use of crowdsourced data in various disaster management systems.