
Ground‑truth “Jessica” entities (n = 274) are planted with varying visibility across categories (e.g., 38 % appear only in images).

5.1 Evaluation Metrics

| Metric | Definition |
|--------|------------|
| Recall@k | Fraction of true Jessicas appearing in top‑k results. |
| Precision@k | Fraction of top‑k results that are true Jessicas. |
| Mean Query Latency | Wall‑clock time per query (average over 10 000 runs). |
| Privacy‑Risk Score | Expected number of leaked attributes per query (lower is better). |

5.2 Baselines

| Baseline | Description |
|----------|-------------|
| B1 – Structured‑Only | Queries only official registers. |
| B2 – Text‑Only | Full‑text search on social media. |
| B3 – Image‑Only | Face‑matching on CCTV feeds. |
| B4 – Simple Union | Union of all category results without weighting. |

5.3 Results

| System | Recall@10 | Precision@10 | Latency (s) | Privacy‑Risk |
|--------|-----------|--------------|-------------|--------------|
| B1 | 0.31 | 0.94 | 0.42 | 0.02 |
| B2 | 0.44 | 0.68 | 0.78 | 0.04 |
| B3 | 0.38 | 0.71 | 1.96 | 0.06 |
| B4 | 0.61 | 0.73 | 2.31 | 0.12 |
| CSRF (proposed) | 0.71 | 0.85 | 2.24 | 0.09 |
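The Recall@k and Precision@k definitions from Section 5.1 can be illustrated with a minimal sketch. The function names, the entity identifiers, and the toy ranking below are hypothetical and are not taken from the study's evaluation code; the sketch also assumes each result ID appears at most once in a ranking.

```python
from typing import Hashable, Sequence, Set


def recall_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the ground-truth items that appear in the top-k results."""
    if not relevant:
        return 0.0
    hits = len(set(ranked[:k]) & relevant)
    return hits / len(relevant)


def precision_at_k(ranked: Sequence[Hashable], relevant: Set[Hashable], k: int) -> float:
    """Fraction of the top-k results that are ground-truth items.

    If fewer than k results were returned, the denominator is the number
    of results actually returned (an assumption of this sketch).
    """
    top_k = ranked[:k]
    if not top_k:
        return 0.0
    hits = sum(1 for item in top_k if item in relevant)
    return hits / len(top_k)


# Toy data (hypothetical): a ranked result list and a ground-truth set.
ranked_ids = ["e1", "e7", "e3", "e9", "e2"]
true_ids = {"e1", "e2", "e3", "e4"}

r3 = recall_at_k(ranked_ids, true_ids, k=3)     # 2 of 4 true items found
p3 = precision_at_k(ranked_ids, true_ids, k=3)  # 2 of 3 results are true items
print(r3, p3)
```

In this toy case the top three results contain two of the four ground-truth items, so Recall@3 is 0.5 and Precision@3 is 2/3.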

A. Researcher¹, B. Analyst², C. Data‑Scientist³

\[
S = \sum_{i=1}^{N} w_i \cdot c_i, \quad \sum_{i=1}^{N} w_i = 1
\]
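A minimal sketch of this weighted sum follows. It assumes that c_i is a per-category confidence value and w_i the corresponding category weight; this reading, the function name, and the example numbers are illustrative assumptions, not the authors' implementation. The weights are renormalised so that the constraint that they sum to 1 holds.

```python
from typing import Sequence


def fusion_score(confidences: Sequence[float], weights: Sequence[float]) -> float:
    """Compute S = sum_i w_i * c_i with the weights normalised to sum to 1."""
    if len(confidences) != len(weights):
        raise ValueError("confidences and weights must have the same length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Dividing by the total enforces the constraint sum_i w_i = 1.
    return sum((w / total) * c for w, c in zip(weights, confidences))


# Hypothetical confidences for four categories and hypothetical weights.
category_confidences = [0.9, 0.6, 0.4, 0.7]
category_weights = [0.4, 0.3, 0.2, 0.1]

score = fusion_score(category_confidences, category_weights)
print(round(score, 2))
```

With these numbers the terms are 0.36, 0.18, 0.08, and 0.07, giving a score of about 0.69.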

Searching for “Jessica” in Bangkok across All Categories: A Multi‑Modal Retrieval Study

Our work fuses these strands, extending the literature (e.g., Li et al., 2022) into the cross‑category domain.

3. Category‑Spanning Retrieval Framework (CSRF)

3.1 Overview

    +-------------------+      +-------------------+      +-------------------+
    | Structured DBs    | ---> | Fusion Engine     | ---> | Ranking Module    |
    | (civil, tax, etc) |      | (confidence       |      | (precision‑recall |
    +-------------------+      | weighting)        |      | trade‑off)        |
              |                +-------------------+      +-------------------+
              v
    +-------------------+      +-------------------+
    | Unstructured Text | ---> | Entity Extractor  |
    | (tweets, blogs)   |      | (NER + fuzzy)     |
    +-------------------+      +-------------------+
              |
              v
    +-------------------+      +-------------------+
    | Images / Video    | ---> | Face/Person Re‑ID |
    | (Instagram, CCTV) |      | (OSNet)           |
    +-------------------+      +-------------------+
              |
              v
    +-------------------+      +-------------------+
    | Graph Signals     | ---> | Linkage Analyzer  |
    | (call‑graphs,     |      | (probabilistic    |
    | co‑attendance)    |      | graph model)      |
    +-------------------+      +-------------------+

3.2 Data Sources

| Category | Example Sources | Access Modality |
|----------|-----------------|-----------------|
| Official Registers | Thai Civil Registration, Tax Office | Secure API (OAuth2, audit logs) |
| Commercial Services | Grab, LINE Taxi, Foodpanda | Partner data‑share agreements |
| Social Media | Instagram, Twitter, Facebook | Public endpoints + rate‑limited API |
| Multimedia | CCTV (BMA), YouTube geotagged videos | Stream processing (Kafka) |
| Relational Graphs | Mobile‑call records (anonymized), Event RSVP logs | Batch ETL pipelines |