Social media, online forums, darknet marketplaces, and various other digital platforms are increasingly used for or targeted by cybercrime. Open source intelligence (OSINT) has therefore become an important aspect of digital forensics and cybercrime investigations: leveraging publicly available data on the Internet provides new information and offers insights into criminal behavior, patterns, and relationships. Many tools and services exist to collect and extract data from websites for digital forensic investigations, but they are often expensive and prone to errors when target websites change their structure or content. In this paper, we present MAMPF, a media acquisition and multi-processing framework for OSINT tasks. The framework collects and extracts data from various websites and is designed with easy extensibility and maintenance in mind. We show that our framework enables a self-hosted approach to efficient OSINT in which a centralized core component is used in such a way that the nodes performing crawling and scraping tasks do not require any maintenance at all. To describe our approach, we use the analogy of a restaurant with chefs that prepare dishes following specific recipes.
York Yannikos, Marc Leon Agel, Julian Heeger, Simon Bugert, "Cooking Spiders: Efficient OSINT with Chefs and Recipes" in Electronic Imaging, 2025, pp 302-1 - 302-7, https://doi.org/10.2352/EI.2025.37.4.MWSF-302