Presentations
Link to aftermovie/promotional video
| Session | Title of the presentation | Author | Link |
|---|---|---|---|
| | | | recording |
| OPENING SESSION | Opening speech | Dominika Rogalińska, Statistics Poland | download |
| | | Albrecht Wirthmann, Eurostat | download |
| KEYNOTE SPEECH | GenAI for official statistics - opportunities and dangers | Marko Grobelnik, Jožef Stefan Institute | download |
| SESSION I. WEB SCRAPING AND INFRASTRUCTURE | | | recording |
| Chair: Alexander Kowarik | |||
| I.1 | URL finding: looking back, progress and plans for the future | Heidi Kühnemann | download |
| I.2 | Identifying official firm websites: a comparison of machine learning-based URL retrieval methods and AI-powered search engines | Donato Summa | download |
| I.3 | State of play of the Data Acquisition Service (DAS) of the Web Intelligence Hub (WIH) | Mátyás Mészáros | download |
| I.4 | Providing online based enterprise characteristics with the Web Intelligence Hub | Jacek Maślankowski | download |
| I.5 | Firms innovation capabilities and corporate websites: evidence on Italian SMEs | Caterina Liberati | download |
| SESSION II. OJA USE CASE | | | recording |
| Chair: Fernando Reis | |||
| II.1 | Leveraging online job advertisements for green skills analysis in France | Emiline Roger | download |
| II.2 | Development of a labour shortage indicator by occupation from OJA data | Annalisa Lucarelli | download |
| II.3 | Combining online job advertisements with probability sample data for enhanced small area estimation of job vacancies | Donatas Šlevinskas | download |
| II.4 | Assessment of classifiers using pre-defined data source | Vladimir Kvetan | download |
| II.5 | Online job advertisements classification using encoder-like large language model | Mikołaj Tym, Jakub Żerebecki | download |
| II.6 | Using language models for extracting regions of employment from online job vacancies | Adam Tsakalidis, Antonio Ranieri | download |
| SESSION III. OBEC USE CASE | | | recording |
| Chair: Olav ten Bosch | |||
| III.1 | Evaluating the completeness of business databases: a comparison with official records using web scraping techniques | Josep Domenech | download |
| III.2 | Use of dedicated business website to enhance the statistical business register in the Netherlands | Arnout Van Delden | download |
| III.3 | Applying survey sampling theory to web-scraped data: an analysis of OBEC data using the IPW estimator | Vilma Nekrašaitė-Liegė | download |
| III.4 | Online based enterprise characteristics (OBEC) in Statistics Poland | Ewelina Niewiadomska | download |
| III.5 | Trade links: estimating interregional trade using weblinks | Juergen Amann | |
| SESSION IV. NEW USE CASES | | | recording |
| Chair: Klaudia Peszat | |||
| IV.1 | New use-cases of web data for official statistics | Olav ten Bosch | download |
| IV.2 | Measuring construction activities using advertisements from real estate portals. ESSnet WIN Work Package 3, Use Case 2 | Tobias Gramlich | download |
| IV.3 | Analysing housing market in Tricity Metropolitan Area in Poland | Olgun Aydin, Piotr Kłopotowski | download |
| IV.4 | Constructing a Hedonic House Price Index for Poland using listings data from 1996-2024 | Radosław Trojanek | download |
| IV.5 | Using web data for energy statistics: methodology and key lessons | Herbeth Sandrine | download |
| SESSION V. QUALITY OF WEB DATA | | | recording |
| Chair: Ciprian Alexandru | |||
| V.1 | Web content based statistics: the challenges ahead | Fernando Reis | download |
| V.2 | Exploiting the web presence of enterprises to improve NACE code classification | Johannes Gussenbauer | download |
| V.3 | Assessing the quality of enterprise characteristics and online job advertisement classifications derived from web data | Ville Auno, Johannes Gussenbauer | download |
| V.4 | Quality guidelines for acquiring and using web scraped data | Magdalena Six, Alexander Kowarik | download |
| V.5 | A specialised architectural framework for web data: the BREAL extension and enhancement | Giuseppina Ruocco | download |
| SESSION VI. METHODOLOGY ON USING WEB DATA | | | recording |
| Chair: Jacek Maślankowski | |||
| VI.1 | Selective scraping, sampling and other methods to minimize known causes of biases of web data | Alexander Kowarik | download |
| VI.2 | Online job advertisements deduplication using large language model | Jakub Żerebecki, Mikołaj Tym | download |
| VI.3 | Finding the Goldilocks data collection frequency for the Consumer Price Index | Luigi Palumbo | download |
| VI.4 | Integrating big data and administrative sources for estimating vehicle mileage and analyzing road traffic accidents | Marco Broccoli | download |
