CritiSense: Critical Digital Literacy and Resilience Against Misinformation

Firoj Alam 1, Fatema Ahmad 1, Ali Ezzat Shahroor 1, Mohamed Bayan Kmainasi 1, Elisa Sartori 2, Giovanni Da San Martino 2, Abul Hasnat 3, Raian Ali 4
1 Qatar Computing Research Institute, Qatar; 2 University of Padova, Italy; 3 APAVI.AI, France; 4 Hamad Bin Khalifa University, Qatar
fialam@hbku.edu.qa
https://critisense-web.digitqr.net/

Figure 1: Overview of the CritiSense app and its functionalities.

Abstract

Misinformation on social media undermines informed decision-making and public trust. Prebunking offers a proactive complement by helping users recognize manipulation tactics before they encounter them in the wild. We present CritiSense, a mobile media-literacy app that builds these skills through short, interactive challenges with instant feedback. It is the first multilingual (supporting nine languages) and modular platform, designed for rapid updates across topics and domains. We report a usability study with 93 users: 83.9% expressed overall satisfaction and 90.1% rated the app as easy to use. Qualitative feedback indicates that CritiSense helps improve digital literacy skills. Overall, it provides a multilingual prebunking platform and a testbed for measuring the impact of microlearning on misinformation resilience. Over more than three months, we have reached 300+ active users. It is freely available to all users on the Apple App Store and Google Play Store. Demo Video: https://shorturl.at/CDcdc

1 Introduction

The rapid diffusion of online misinformation threatens public health, democratic governance, and social cohesion. Since false claims often spread faster than corrections, post-hoc fact-checking can arrive too late and may suffer from fatigue and limited behavioral impact. Reflecting the scale of this challenge, the World Economic Forum's Global Risks Report 2026 ranks mis- and disinformation among the world's top near-term risks (second on the two-year outlook) (World Economic Forum, 2026). To address these challenges, social platforms rely on automatic detection, fact-checking pipelines, and warning interfaces to curb misleading content. While these measures provide essential first-line protection, they are largely reactive and claim-specific, can degrade under temporal and cross-lingual/domain shift, and do not by themselves build durable user competence (Berger et al., 2025; Stepanova and Ross, 2023).

Figure 2: Examples of fictional social media posts, demonstrating different manipulation techniques.

In practice, detection-centric mitigation is most effective when complemented with user-facing training, for three technical reasons: (i) Temporal drift: misinformation is non-stationary; narratives and multimodal presentation styles evolve quickly, producing temporal distribution shift that can challenge deployed classifiers; temporally out-of-domain evaluation shows that performance can drop even for strong models (Stepanova and Ross, 2023). (ii) Data and coverage: robust detection benefits from large, high-quality annotations, yet such resources are uneven across languages and regions, and cross-lingual transfer remains less reliable for lower-resource settings (Ozcelik et al., 2023).
(iii) Interface design: detection outputs must be translated into interventions (labels, banners, downranking), where careful UI choices are crucial to maximize critical evaluation and avoid unintended effects, such as "implied truth" for untagged items (Pennycook et al., 2020).

These limitations motivate complementary, user-centered approaches that improve critical evaluation rather than relying on perfect detection coverage. Digital media literacy interventions can improve users' ability to distinguish mainstream from false content, with effects that persist beyond immediate exposure (Guess et al., 2020). Prebunking (psychological inoculation) extends media literacy by targeting manipulation techniques (e.g., emotional (loaded) language, black-and-white fallacy (Da San Martino et al., 2019), as shown in Figure 2). Scalable inoculation interventions have been shown to increase resistance to misinformation tactics on social media (Roozenbeek et al., 2022a). Meta-analyses further support improvements in credibility judgment across studies and settings (Lu et al., 2023; Traberg et al., 2022).

Yet most media literacy and inoculation evaluations remain short, one-off web experiments, with limited attention to (i) sustained engagement in everyday contexts, (ii) multilinguality, and (iii) durability and behavioral outcomes beyond immediate post-intervention assessments (Lu et al., 2023; Traberg et al., 2022). CritiSense is motivated by these gaps. It treats automatic detection as a safety net, but prioritizes building transferable user competence through an iterative, mobile-first learning experience, along with an assessment method designed to measure not only immediate gains but also real-world impact.

Our contributions are as follows:
• We introduce CritiSense, a mobile-first micro-lessons app that trains users to recognize misinformation and manipulation tactics in realistic, everyday scenarios.
• To our knowledge, this is the first multilingual app of its kind, launched in Arabic and English and extended to Bangla, French, Hindi, Italian, Filipino, Nepali, and Urdu, offering full lesson content, quizzes, and feedback.
• We describe the complete user workflow, from learning and quizzes to simulated practice.
• We manually develop the learning content, covering core digital literacy concepts, propaganda techniques, and fake-news patterns.
• We integrate factuality, propaganda, and hateful-meme detection models to provide automatic signals for textual and visual manipulation, supporting in-app feedback and learning.
• We report a formative usability evaluation measuring (i) ease of use, (ii) visual design, and (iii) perceived content impact.

Does CritiSense provide a usable and satisfying experience? A formative usability evaluation with 93 first-time users demonstrates strong overall usability (mean construct scores of 3.99–4.20 on a 5-point Likert scale), with 90.1% of users agreeing the app is easy to use and 83.9% reporting satisfaction with their experience.

2 CritiSense App

2.1 App Design

CritiSense is a mobile-first media-literacy app designed to help users build and strengthen skills for evaluating online information. The design of the app is grounded in active learning.
Users read short lessons covering a wide variety of topics related to critical digital literacy, fake news, and propaganda; answer short quizzes; receive immediate feedback; and practice applying the same reasoning to new examples. The app is also informed by cognitive inoculation, training recognition of common manipulation techniques (e.g., loaded language, name calling) to increase users' resilience before they encounter them in the wild (Roozenbeek et al., 2022b).

In Figure 1, we provide an overview of CritiSense, organized around four core app components, and illustrate the end-to-end user journey from onboarding to practice. The app features a streamlined onboarding process: new users register and receive introductory guidance, while returning users authenticate directly. All pathways then converge on a dashboard/progress view, which branches into three functional modules: learning and knowledge assessment, interactive content verification, and simulated practice.

Dashboard/Progress. This module displays the user profile, including the number of completed lessons, a score summary, and a leaderboard of top users with their ratings.

Learning/Play. This module consists of lessons divided into chapters, organized by topic or manipulation technique. For example, the Critical Digital Literacy lesson is divided into: (i) the digital space and us, (ii) developing critical digital literacy, and (iii) critical digital literacy skills. Each chapter presents key definitions and concepts, followed by quizzes for reinforcement. All contents of the app are developed manually.

Learning/Test Knowledge. This module provides a flexible, non-linear self-assessment experience, allowing users to attempt targeted question banks and localized quizzes to gauge their understanding and earn rewards. The quizzes use multiple-choice questions to assess both conceptual knowledge and applied reasoning beyond individual lessons.

Detect. This module serves as a practical verification tool where users can submit multimodal content, via text input, image uploads, or direct URLs, for real-time analysis. Rather than issuing automated truth judgments, it highlights linguistic cues, emotional framing, persuasive patterns, and other indicators associated with misinformation and propaganda, and provides guided feedback to support users' reasoning (a minimal client-side sketch of this flow is shown at the end of this subsection).

Practice. This module operationalizes inoculation theory (Banas and Rains, 2010) through an interactive simulation titled "The Fake News Factory" (Figure 3). In this environment, users engage with a conversational system to practice generating and identifying disinformation tactics in a controlled setting, with immediate feedback. By exposing users to weakened examples of misleading arguments, paired with explanations and refutations, the module aims to build cognitive resistance to future persuasion attempts and strengthen users' ability to recognize manipulation in the wild.

Figure 3: A pictorial example illustrating inoculation theory: expose a weakened misleading message, explain the tactic, then test transfer to a novel post.

Design goals. CritiSense is guided by several design goals. First, it adopts a technique-first approach, training users to recognize propaganda and misinformation strategies rather than chasing individual claims, which promotes transfer across topics. Second, it emphasizes practice in realistic formats: examples mirror the kinds of posts and media users encounter in everyday online settings. Third, it provides immediate feedback with explanations, going beyond binary correctness to highlight the reasoning behind each decision. Fourth, it is designed to be accessible and scalable through short microlearning lessons that require minimal setup and support learning beyond classroom contexts. Finally, it is multilingual, expanding from its initial launch to nine supported languages: Arabic, English, Bangla, French, Hindi, Italian, Filipino, Nepali, and Urdu.
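To make the Detect flow concrete, the sketch below shows how a client might submit text to a technique-detection service and render cue-level feedback rather than a truth verdict. This is a minimal illustration, not the app's actual implementation: the endpoint path, request fields, and response schema are hypothetical stand-ins for the in-house API described in Section 2.3.

```python
import requests

# Hypothetical endpoint; the real in-house API (apihub.tanbih.org) may differ.
DETECT_URL = "https://example-apihub.invalid/detect/text"

def detect_cues(text: str, lang: str = "en") -> list[dict]:
    """Submit a post and return manipulation-cue annotations (assumed schema)."""
    resp = requests.post(DETECT_URL, json={"text": text, "lang": lang}, timeout=10)
    resp.raise_for_status()
    # Assumed response shape: {"cues": [{"technique": ..., "span": ..., "hint": ...}]}
    return resp.json().get("cues", [])

def render_feedback(text: str, cues: list[dict]) -> None:
    """Show guided, cue-level feedback instead of a binary true/false verdict."""
    print(f"Post: {text}")
    if not cues:
        print("No obvious manipulation cues found; verify the source anyway.")
    for cue in cues:
        print(f"- {cue['technique']}: '{cue['span']}' -> {cue['hint']}")

if __name__ == "__main__":
    post = "They ALWAYS lie to you. Share before it gets deleted!!!"
    render_feedback(post, detect_cues(post))
```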
2.2 Functionalities

CritiSense provides: (i) interactive micro-lessons with short quizzes, (ii) technique-first prebunking modules that teach common manipulation tactics through examples, (iii) immediate, explanation-rich feedback, and (iv) prompts for repeatable verification habits (e.g., check sources, separate claims from evidence, and cross-check across outlets). The app also offers progress tracking and topic-level assessments for continued reinforcement. Finally, an AI-assisted analysis tool helps users examine external text and images for potential manipulative cues. In Table 1, we summarize the current functionality coverage across languages.

Table 1: Language coverage and status of the modules. EN=English, AR=Arabic, BN=Bangla, FR=French, HI=Hindi, IT=Italian, TL=Filipino (Tagalog), NE=Nepali, UR=Urdu. ✓ = available; ◦ = in progress.

Module          EN  AR  BN  FR  HI  IT  TL  NE  UR
Lessons         ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓
Quizzes         ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓
Detect (text)   ✓   ✓   ◦   ◦   ◦   ◦   ◦   ◦   ◦
Detect (image)  ✓   ✓   ◦   ◦   ◦   ◦   ◦   ◦   ◦
Practice        ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓   ✓

2.3 Models

CritiSense currently supports (i) text-based fact-checking and propaganda detection, and (ii) image-based propaganda detection for Arabic and hateful meme detection for English. We make these models available through an in-house deployed API platform (https://apihub.tanbih.org/docs). We are extending this capability to additional languages as part of our ongoing work.

2.3.1 Datasets

Factuality (text). We train the factuality model on a curated collection of publicly available datasets in Arabic and English. For Arabic, we include AraFacts (Ali et al., 2021), ANS-Claim (Khouja, 2020), CT22Claim (Nakov et al., 2022), NewsCredibilityDataset (Samdani et al., 2023), and COVID19Factuality (Alam et al., 2021). For English, we include PolitiFact (Da San Martino et al., 2023), CT22T3_Factuality (Alam et al., 2021), CheckThat!-style misinformation datasets (Köhler et al., 2022; Shahi et al., 2021), and AVeriTeC (Schlichtkrull et al., 2023).

Propaganda (text). For text-based propaganda detection, we use PropXplain (Hasanain et al., 2025), which contains labeled instances in both Arabic and English.

Propaganda (image). For image-based propaganda detection, we use ArMeme (Alam et al., 2024) (propagandistic vs. non-propagandistic), a meme-centric dataset.

Hateful memes (image). For meme hatefulness detection, we use the Facebook Hateful Memes dataset (Kiela et al., 2020) (hateful vs. non-hateful).

Table 2 reports the data used for each task, grouped by modality and language, along with the train/dev/test split sizes. Across tasks, development splits are used for model selection and tuning, while test splits are held out for final evaluation.

Table 2: Dataset statistics for each task and modality, showing the number of instances in the train/dev/test splits used in our experiments.

Task/Dataset        Modality   Train     Dev      Test
Factuality (En)     Text       467,830   67,389   134,594
Factuality (Ar)     Text       28,221    4,507    7,847
Propaganda (En)     Text       4,472     621      922
Propaganda (Ar)     Text       18,453    1,318    1,326
ArMeme (Ar)         Image      3,604     522      1,021
Hateful meme (En)   Image      8,500     540      2,000
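Because the factuality models are trained on several heterogeneous public corpora, the source labels must be normalized into a single binary scheme before training. The sketch below shows one plausible way to do this; the toy records and the per-dataset label vocabularies are assumptions for illustration, not the authors' actual preprocessing code.

```python
import pandas as pd

# Toy stand-ins for two source corpora with different label vocabularies.
ans_claim = pd.DataFrame({"text": ["claim a", "claim b"], "label": ["true", "false"]})
covid19 = pd.DataFrame({"text": ["claim c"], "label": ["no"]})

# Hypothetical per-dataset mappings onto a shared binary scheme
# (1 = factual/credible, 0 = false/not credible).
SOURCES = [
    ("ans_claim", ans_claim, {"true": 1, "false": 0}),
    ("covid19factuality", covid19, {"yes": 1, "no": 0}),
]

def unify(sources):
    """Normalize heterogeneous corpora into one (text, label, source) frame."""
    frames = []
    for name, df, label_map in sources:
        out = df.assign(label=df["label"].map(label_map), source=name)
        frames.append(out.dropna(subset=["label"]))  # drop unmappable labels
    return pd.concat(frames, ignore_index=True)

# Each corpus would keep its published train/dev/test split so held-out
# test sets remain comparable to prior work.
print(unify(SOURCES))
```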
2.3.2 Model Training and Evaluation

We prioritize models that are low-cost to deploy in resource-constrained settings, including CPU-based servers. Accordingly, we use lightweight, widely adopted backbones: BERT-base for English (Devlin et al., 2019), AraBERT for Arabic (Antoun et al., 2020), and ViT-B/16 for image-based classification (Dosovitskiy et al., 2021). These choices provide a strong accuracy–efficiency trade-off and allow seamless integration into our in-house API platform. Note that all trained models are binary classifiers; an illustrative fine-tuning sketch is shown at the end of this subsection.

Table 3: Model performance on each task and dataset, reporting the evaluation metric (Micro-F1 for text; Macro-F1 for image) and the backbone used.

Task/Dataset        Metric   Model       Performance
Factuality (En)     Mi-F1    BERT-Base   0.726
Factuality (Ar)     Mi-F1    AraBERT     0.868
Propaganda (En)     Mi-F1    BERT-Base   0.772
Propaganda (Ar)     Mi-F1    AraBERT     0.762
ArMeme (Ar)         Ma-F1    ViT-B/16    0.554
Hateful meme (En)   Ma-F1    ViT-B/16    0.507

Table 3 reports performance across tasks. Overall, the text models perform well on factuality and propaganda detection. Arabic factuality is highest (Mi-F1=0.868), suggesting that the Arabic training data covers the target distribution well. English factuality is lower (Mi-F1=0.726), which likely reflects broader topic diversity and more varied claim formulations. Propaganda detection is similar across languages (Mi-F1=0.772 in English and 0.762 in Arabic), indicating that both BERT-base and AraBERT capture many lexical and stylistic cues, although subtle rhetoric and context dependence still make the task challenging.

In contrast, image-based hateful/propagandistic meme detection remains harder. ViT-B/16 achieves moderate Macro-F1 on ArMeme (0.554) and lower performance on Hateful Memes (0.507). The gap is expected because memes often rely on implicit cultural context, irony, and fine-grained visual–text interactions that lightweight vision-only models may miss. While stronger models exist (e.g., Shahroor et al., 2026), we use lightweight backbones to meet deployment constraints and CPU-oriented serving requirements.
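As a rough illustration of this setup, the following sketch fine-tunes a BERT-base binary classifier with the Hugging Face transformers Trainer. The toy data, hyperparameters, and metric code are illustrative defaults under our own assumptions, not the exact recipe behind the numbers in Table 3 (for Arabic one would swap in an AraBERT checkpoint).

```python
import numpy as np
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

MODEL = "bert-base-uncased"  # swap in an AraBERT checkpoint for Arabic
tok = AutoTokenizer.from_pretrained(MODEL)
model = AutoModelForSequenceClassification.from_pretrained(MODEL, num_labels=2)

def tokenize(batch):
    # Fixed-length padding keeps the default data collator version-agnostic.
    return tok(batch["text"], truncation=True, padding="max_length", max_length=128)

# Tiny stand-in for the unified factuality data (text, binary label).
train_ds = Dataset.from_dict({
    "text": ["Vaccines cause magnetism.", "Water boils at 100C at sea level."],
    "label": [0, 1],
}).map(tokenize, batched=True)
dev_ds = train_ds  # placeholder; use the real dev split in practice

def micro_f1(eval_pred):
    preds = np.argmax(eval_pred.predictions, axis=-1)
    # For binary single-label classification, micro-F1 equals accuracy.
    return {"micro_f1": float((preds == eval_pred.label_ids).mean())}

args = TrainingArguments(output_dir="factuality-en", num_train_epochs=3,
                         per_device_train_batch_size=16, learning_rate=2e-5)
trainer = Trainer(model=model, args=args, train_dataset=train_ds,
                  eval_dataset=dev_ds, compute_metrics=micro_f1)
trainer.train()
print(trainer.evaluate())  # reports micro-F1 on the dev split
```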
3 Evaluation of CritiSense

We evaluate CritiSense through a structured usability study covering both interface usability and perceived learning/behavioral outcomes. The questionnaire measures five usability constructs, along with single-item ratings for overall satisfaction, ease of completing the lesson, quiz flow, intention to continue using the app, and likelihood of recommending it. We also collect brief contextual information and open-ended feedback to identify concrete design priorities. The study was conducted in both Arabic and English.

Participants. We recruited 93 participants and administered the study using SurveyMonkey. Most participants were aged 18–24 (68.8%). Language preference was predominantly English (62.4%), followed by Arabic (17.2%) and bilingual English/Arabic use (20.4%). Most participants held or were pursuing a bachelor's degree (80.6%), with fields of study skewing toward STEM. Participation was voluntary; however, we provided a voucher worth $14.

Instrument. The instrument combines quantitative ratings with qualitative feedback (Section B). It includes 17 five-point Likert items (strongly disagree–strongly agree), a five-point overall satisfaction item, a seven-point ease-of-completion rating for the lesson/quiz flow, and a Net Promoter Score (0–10), i.e., "How likely are you to recommend [product/app] to a friend or colleague?" (Reichheld, 2003). We also measure intention to continue, collect two multi-select responses on activities completed and field of study, and include three open-ended questions on confusions, highest-priority fixes, and most-liked features.

Usability constructs and scoring. We group the 17 Likert items into five theoretically motivated constructs (Table 5): Usability/UX (ease of use and perceived usefulness), Visual Design (aesthetics and clarity of visual elements), Navigation (ease of moving through the interface and finding content), Content Effectiveness (quality of examples/quizzes and cross-language consistency), and Behavioral Impact (perceived changes in critical evaluation behaviors). For each respondent, we compute each construct score as the mean of its constituent items, supporting both item-level diagnostics and construct-level comparisons.

Analysis. We compute descriptive statistics (mean, standard deviation, and median) for all items and construct scores. We assess internal consistency with Cronbach's alpha. Finally, we examine inter-construct relationships using Pearson correlations.

Table 4: Construct-level summary statistics (mean, standard deviation), internal consistency (α), and percentage of positive responses.

Construct               Mean (SD)     α       % Positive
Usability / UX          4.12 (0.85)   0.653   77.4%
Visual Design           4.16 (0.70)   0.774   72.0%
Navigation              4.20 (0.84)   0.837   73.1%
Content Effectiveness   4.09 (0.62)   0.746   65.6%
Behavioral Impact       3.99 (0.72)   0.840   63.4%
Overall (17 items)      –             0.921   –
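For readers who want to apply this scoring scheme to their own response data, the sketch below computes per-respondent construct means and Cronbach's alpha. The data are synthetic and the item-to-construct mapping is abbreviated (the full mapping is in Table 5); only the formulas follow the procedure described above.

```python
import numpy as np
import pandas as pd

# Synthetic 5-point Likert responses (93 respondents x 4 abbreviated items).
rng = np.random.default_rng(0)
responses = pd.DataFrame(
    rng.integers(1, 6, size=(93, 4)),
    columns=["ease_of_use", "meets_needs", "design_appeal", "text_readable"])

# Abbreviated item-to-construct mapping; hypothetical item names.
CONSTRUCTS = {
    "usability_ux": ["ease_of_use", "meets_needs"],
    "visual_design": ["design_appeal", "text_readable"],
}

def construct_scores(df: pd.DataFrame) -> pd.DataFrame:
    """Per-respondent construct score = mean of its constituent items."""
    return pd.DataFrame({c: df[items].mean(axis=1)
                         for c, items in CONSTRUCTS.items()})

def cronbach_alpha(items: pd.DataFrame) -> float:
    """alpha = k/(k-1) * (1 - sum(item variances) / variance of item total)."""
    k = items.shape[1]
    item_vars = items.var(axis=0, ddof=1)
    total_var = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_vars.sum() / total_var)

print(construct_scores(responses).mean().round(2))  # construct-level means
print(round(cronbach_alpha(responses), 3))          # scale reliability
```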
4 Findings

Usability and satisfaction. Overall usability and satisfaction were strong (see Figures 4 and 5). All five constructs exceeded the agreement threshold (M > 3.0) (reported in Table 4), with Navigation rated highest (M=4.20, SD=0.84) followed by Visual Design (M=4.16, SD=0.70). At the item level, easy to use received the highest rating (M=4.43, SD=0.88), with 90.1% of respondents selecting agree or strongly agree. Self-reported satisfaction was similarly high: 83.9% of users indicated they were satisfied or very satisfied (M=4.14, SD=0.87).

Learning design and engagement. Participants responded positively to the pedagogical structure, particularly the quiz-based reinforcement. The item Quizzes reinforce learning scored M=4.30 (SD=0.79), and 90.3% of participants completed at least one quiz during their session. At the construct level, content effectiveness was high (M=4.09, SD=0.62), suggesting that lessons communicated the targeted critical-thinking concepts.

Behavioral impact. The behavioral impact construct scored M=3.99 (SD=0.72), with 63.4% of users providing positive ratings. Notably, Arabic-language users reported the highest behavioral impact (M=4.36), which is consistent with the hypothesis that the app might be valuable in contexts where localized critical-thinking resources are less available.

Instrument reliability. Reliability was high overall. Cronbach's α ranged from 0.653 for Usability/UX (2 items) to 0.840 for Behavioral Impact (4 items), with an overall scale reliability of α=0.921. The lower value for the two-item construct is expected since α is sensitive to the number of items, while the full scale shows excellent reliability.

Qualitative feedback and improvement targets. Open-ended questions had high response rates (86–100%), providing actionable feedback. Participants frequently highlighted ease of use and the quiz-based learning flow. Responses also identified issues that can help us improve the app.

Interpretation and implications. Overall, results indicate that CritiSense provides a usable and engaging first-release learning experience, with clear priorities for iteration. Navigation was rated highest (M=4.20), suggesting that the chapter → lesson → quiz flow is easy to follow. Quiz-based reinforcement was also well received (M=4.30), and 90.3% of participants completed at least one quiz. Correlations are consistent with a "UX → Content → Impact" pattern: UX ratings correlate with content effectiveness (r = 0.52–0.54), and content effectiveness correlates with behavioral impact (r = 0.63). While not causal, these associations suggest that improving UX may yield downstream gains in perceived learning outcomes. Finally, the NPS reflects an early-release profile (NPS = −8.6): 31.2% of users are promoters (mode=10/10), while others (12.9%) reported issues for improvement.
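As context for the NPS figure, the standard score is the percentage of promoters (ratings 9–10) minus the percentage of detractors (0–6). The sketch below applies that formula to an invented distribution that merely matches the reported aggregates (29/93 ≈ 31.2% promoters, NPS ≈ −8.6); the study's raw ratings are not published.

```python
def net_promoter_score(ratings: list[int]) -> float:
    """NPS = %promoters (9-10) minus %detractors (0-6), on a 0-10 scale."""
    n = len(ratings)
    promoters = sum(r >= 9 for r in ratings) / n
    detractors = sum(r <= 6 for r in ratings) / n
    return 100 * (promoters - detractors)

# Illustrative distribution only: 29 promoters, 27 passives, 37 detractors.
example = [10] * 29 + [8] * 14 + [7] * 13 + [6] * 15 + [4] * 12 + [2] * 10
print(f"NPS = {net_promoter_score(example):.1f}")  # -> NPS = -8.6
```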
5 Related Work

Detection-based and platform interventions. There has been a substantial amount of work addressing misinformation through automatic detection, fact-checking pipelines, and platform UI interventions (e.g., labels and warnings) (Hasanain et al., 2024; Alam et al., 2022, 2021; Zhou and Zafarani, 2020). While these systems provide important first-line safeguards, their effectiveness can degrade under temporal shift (Stepanova and Ross, 2023), vary across languages in cross-lingual transfer (Ozcelik et al., 2023), and yield limited average changes in real-world beliefs or consumption (Aslett et al., 2022). Moreover, selective labeling can increase perceived accuracy of untagged content (the "implied truth" effect) (Pennycook et al., 2020).

Prebunking and inoculation. Psychological inoculation builds resistance by teaching manipulation techniques rather than correcting individual claims. Games such as Bad News and Harmony Square improve recognition of propaganda tactics and reduce perceived credibility of misleading content (Roozenbeek and van der Linden, 2019, 2020), with similar effects reported in topic-focused and multilingual variants (Basol et al., 2021). Scalable prebunking videos can also improve technique recognition on social media (Roozenbeek et al., 2022b). However, many interventions are delivered as one-off web experiences and are evaluated primarily with immediate post-intervention assessments.

Media literacy tools. Mobile-first media literacy tools remain less common. Cranky Uncle uses fallacy training for climate misinformation (Cook et al., 2023), while feed simulations such as Fakey provide ecologically realistic practice and report improved source discernment in longitudinal deployments (Micallef et al., 2021). Overall, prior tools typically emphasize either short-session tactic inoculation or practice-based news literacy, but rarely combine both in a mobile-first experience designed for everyday use beyond classrooms.

CritiSense. CritiSense bridges these strands by combining tactic-level prebunking with practical verification skills (e.g., distinguishing opinion from evidence, fallacy spotting, and verification habits) in short interactive exercises with explanation-rich feedback. Unlike primarily browser-based prebunking games, it is mobile-first and supports multiple languages. Its modular design enables rapid topic updates while keeping learning objectives technique-centered.

6 Conclusions

We presented CritiSense, a mobile-first media-literacy app that delivers technique-focused prebunking and practical verification skills through short, interactive exercises with explanation-rich feedback. CritiSense is, to our knowledge, the first effort of this kind released in multiple languages, starting with Arabic and English and extending to seven additional languages. A formative usability study with 93 users shows strong user experience, including high usability (Navigation M=4.20; Ease-of-use M=4.43), high satisfaction (83.9% positive), and excellent internal consistency of the evaluation instrument (Cronbach's α = 0.921). Qualitative feedback further validates the chapter → quiz learning loop and highlights functionality improvements. Overall, these findings support CritiSense as a practical, scalable platform for multilingual prebunking in everyday settings, and motivate future work on long-term learning durability, behavioral outcomes, and stronger multimodal analysis.

Limitations. CritiSense is designed to strengthen awareness and skills for recognizing fake news, mis/disinformation, and propaganda. Our current evaluation is a pilot mixed-method study focused on usability, engagement, and short-term learning signals; it does not yet measure long-term retention or real-world behavioral change (e.g., sharing on users' own platforms). The app presently covers a curated set of techniques and item types, and the content requires ongoing updates and localization. While our design is intended to support transfer across topics, we do not claim broad generalization across all domains or user groups, and larger longitudinal and cross-cultural studies are needed. Finally, CritiSense complements platform detection and fact-checking rather than replacing them.

Ethics and broader impact. CritiSense operates in a sensitive domain where design choices can inadvertently amplify harmful narratives. To mitigate this, the app teaches manipulation patterns (e.g., emotional framing, scapegoating) using a technique-centered approach and avoids presenting harmful misinformation. Feedback emphasizes actionable verification steps (e.g., source checks, evidence tracing) rather than restating false claims. We also acknowledge dual-use risks: explanations of propaganda strategies could be misused to craft persuasive misinformation. CritiSense reduces this risk by prioritizing recognition and critical questioning, not step-by-step guidance for manipulation, and by curating examples for educational intent. Regarding privacy, CritiSense does not collect sensitive user information.
Overall, we expect positive impact: improved media literacy can support informed participation and reduce susceptibility to manipulation, particularly in multilingual settings. We plan longitudinal studies to assess durability and monitor potential unintended effects (e.g., overconfidence or blanket skepticism).

References

Firoj Alam, Stefano Cresci, Tanmoy Chakraborty, Fabrizio Silvestri, Dimiter Dimitrov, Giovanni Da San Martino, Shaden Shaar, Hamed Firooz, and Preslav Nakov. 2022. A survey on multimodal disinformation detection. In Proceedings of the 29th International Conference on Computational Linguistics, pages 6625–6643, Gyeongju, Republic of Korea. International Committee on Computational Linguistics.

Firoj Alam, Abul Hasnat, Fatema Ahmed, Md Arid Hasan, and Maram Hasanain. 2024. ArMeme: Propagandistic content in Arabic memes. In Proceedings of the 2024 Conference on Empirical Methods in Natural Language Processing (EMNLP), Miami, Florida. Association for Computational Linguistics.

Firoj Alam, Shaden Shaar, Fahim Dalvi, Hassan Sajjad, Alex Nikolov, Hamdy Mubarak, Giovanni Da San Martino, Ahmed Abdelali, Nadir Durrani, Kareem Darwish, Abdulaziz Al-Homaid, Wajdi Zaghouani, Tommaso Caselli, Gijs Danoe, Friso Stolk, Britt Bruntink, and Preslav Nakov. 2021. Fighting the COVID-19 infodemic: Modeling the perspective of journalists, fact-checkers, social media platforms, policy makers, and the society. In Findings of the Association for Computational Linguistics: EMNLP 2021, pages 611–649, Punta Cana, Dominican Republic. Association for Computational Linguistics.

Zain Shammur Ali, Watheq Mansour, Tamer Elsayed, and Amir Mohamed. 2021. AraFacts: The first large Arabic dataset of naturally occurring claims. In Proceedings of the Sixth Arabic Natural Language Processing Workshop (WANLP). Association for Computational Linguistics.

Wissam Antoun, Fady Baly, and Hazem Hajj. 2020. AraBERT: Transformer-based model for Arabic language understanding. In Proceedings of the 4th Workshop on Open-Source Arabic Corpora and Processing Tools, with a Shared Task on Offensive Language Detection, pages 9–15.

Kevin Aslett, Andrew M. Guess, Richard Bonneau, Jonathan Nagler, and Joshua A. Tucker. 2022. News credibility labels have limited average effects on news diet quality and fail to reduce misperceptions. Science Advances, 8(18):eabl3844.

John A. Banas and Stephen A. Rains. 2010. A meta-analysis of research on inoculation theory. Communication Monographs, 77(3):281–311.

Melisa Basol, Jon Roozenbeek, Manel Berriche, Fatih Uenal, William McClanahan, and Sander van der Linden. 2021. Towards psychological herd immunity: Cross-cultural evidence for two prebunking interventions against COVID-19 misinformation. Big Data & Society, 8(1):1–22.

Lara Marie Berger, Anna Kerkhof, Felix Mindl, and Johannes Münster. 2025. Debunking "fake news" on social media: Immediate and short-term effects of fact-checking and media literacy interventions. Journal of Public Economics, 245:105345.

John Cook, Danielle Kinkead, and Sander van der Linden. 2023. The cranky uncle game—combining humor and gamification to build student resilience against climate misinformation. Environmental Education Research, 29(11):1594–1611.

Giovanni Da San Martino, Firoj Alam, Maram Hasanain, Rabindra Nath Nandi, Dilshod Azizov, and Preslav Nakov. 2023. Overview of the CLEF-2023 CheckThat! lab task 3 on political bias of news articles and news media. In Working Notes of CLEF 2023–Conference and Labs of the Evaluation Forum, CLEF '2023, Thessaloniki, Greece.
Giovanni Da San Martino, Seunghak Yu, Alberto Barrón-Cedeño, Rostislav Petrov, and Preslav Nakov. 2019. Fine-grained analysis of propaganda in news articles. In Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing and the 9th International Joint Conference on Natural Language Processing (EMNLP-IJCNLP), pages 5636–5646, Hong Kong, China. Association for Computational Linguistics.

Jacob Devlin, Ming-Wei Chang, Kenton Lee, and Kristina Toutanova. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT '19, pages 4171–4186, Minneapolis, Minnesota, USA. Association for Computational Linguistics.

Alexey Dosovitskiy, Lucas Beyer, Alexander Kolesnikov, Dirk Weissenborn, Xiaohua Zhai, Thomas Unterthiner, Mostafa Dehghani, Matthias Minderer, Georg Heigold, Sylvain Gelly, Jakob Uszkoreit, and Neil Houlsby. 2021. An image is worth 16x16 words: Transformers for image recognition at scale. In International Conference on Learning Representations (ICLR).

Andrew M. Guess, Michael Lerner, Benjamin Lyons, Jacob M. Montgomery, Brendan Nyhan, Jason Reifler, and Neelanjan Sircar. 2020. A digital media literacy intervention increases discernment between mainstream and false news in the United States and India. Proceedings of the National Academy of Sciences, 117(27):15536–15545.

Maram Hasanain, Fatema Ahmad, and Firoj Alam. 2024. Can GPT-4 identify propaganda? Annotation and detection of propaganda spans in news articles. In Proceedings of the 2024 Joint International Conference on Computational Linguistics, Language Resources and Evaluation (LREC-COLING 2024), pages 2724–2744, Torino, Italia. ELRA and ICCL.

Maram Hasanain, Md Arid Hasan, Mohamed Bayan Kmainasi, Elisa Sartori, Ali Ezzat Shahroor, Giovanni Da San Martino, and Firoj Alam. 2025. PropXplain: Can LLMs enable explainable propaganda detection? In Findings of the Association for Computational Linguistics: EMNLP 2025, pages 23855–23863, Suzhou, China. Association for Computational Linguistics.

Jude Khouja. 2020. Stance prediction and claim verification: An Arabic perspective. In Proceedings of the Third Workshop on Fact Extraction and VERification (FEVER). Association for Computational Linguistics.

Douwe Kiela, Hamed Firooz, Aravind Mohan, Vedanuj Goswami, Amanpreet Singh, Pratik Ringshia, and Davide Testuggine. 2020. The hateful memes challenge: Detecting hate speech in multimodal memes. Advances in Neural Information Processing Systems, 33:2611–2624.

Juliane Köhler, Gautam Kishore Shahi, Julia Maria Struß, Michael Wiegand, Melanie Siegel, and Thomas Mandl. 2022. Overview of the CLEF-2022 CheckThat! lab task 3 on fake news detection. In Working Notes of CLEF 2022–Conference and Labs of the Evaluation Forum, CLEF '2022, Bologna, Italy.

Chang Lu, Bo Hu, Qiang Li, Chao Bi, and Xing-Da Ju. 2023. Psychological inoculation for credibility assessment, sharing intention, and discernment of misinformation: Systematic review and meta-analysis. Journal of Medical Internet Research, 25:e49255.
Nicholas Micallef, Mihai Avram, Filippo Menczer, and Sameer Patil. 2021. Fakey: A game intervention to improve news literacy on social media. Proceedings of the ACM on Human-Computer Interaction, 5(CSCW2):1–37.

Preslav Nakov, Alberto Barrón-Cedeño, Firoj Alam, Giovanni Da San Martino, Tamer Elsayed, Maram Hasanain, Reem Suwaileh, and Wajdi Zaghouani. 2022. Overview of the CLEF-2022 CheckThat! lab task 1 on identifying relevant claims in tweets. In Working Notes of CLEF 2022 (CEUR Workshop Proceedings).

Oguzhan Ozcelik, Arda Sarp Yenicesu, Onur Yildirim, Dilruba Sultan Haliloglu, Erdem Ege Eroglu, and Fazli Can. 2023. Cross-lingual transfer learning for misinformation detection: Investigating performance across multiple languages. In Proceedings of the 4th Conference on Language, Data and Knowledge, pages 549–558, Vienna, Austria. NOVA CLUNL, Portugal.

Gordon Pennycook, Adam Bear, Evan T. Collins, and David G. Rand. 2020. The implied truth effect: Attaching warnings to a subset of fake news headlines increases perceived accuracy of headlines without warnings. Management Science, 66(11):4944–4957.

Frederick F. Reichheld. 2003. The one number you need to grow. Harvard Business Review.

Jon Roozenbeek and Sander van der Linden. 2019. Fake news game confers psychological resistance against online misinformation. Humanities and Social Sciences Communications, 5(65):1–10.

Jon Roozenbeek and Sander van der Linden. 2020. Breaking Harmony Square: A game that "inoculates" against political misinformation. Harvard Kennedy School (HKS) Misinformation Review, 1(8).

Jon Roozenbeek, Sander van der Linden, Beth Goldberg, Steve Rathje, and Stephan Lewandowsky. 2022a. Psychological inoculation improves resilience against misinformation on social media. Science Advances, 8(34):eabo6254.

Jon Roozenbeek, Sander van der Linden, Beth Goldberg, Steve Rathje, and Stephan Lewandowsky. 2022b. Psychological inoculation improves resilience against misinformation on social media. Science Advances, 8(34):eabo6254.

D. Samdani, M. Taileb, and N. Almani. 2023. Arabic news credibility on Twitter using sentiment analysis and ensemble learning. Zenodo.

Michael Sejr Schlichtkrull, Zhijiang Guo, and Andreas Vlachos. 2023. AVeriTeC: A dataset for real-world claim verification with evidence from the web. In Thirty-seventh Conference on Neural Information Processing Systems Datasets and Benchmarks Track.

Gautam Kishore Shahi, Julia Maria Struß, and Thomas Mandl. 2021. Overview of the CLEF-2021 CheckThat! lab task 3 on fake news detection. Working Notes of CLEF.

Ali Ezzat Shahroor, Mohamed Bayan Kmainasi, Abul Hasnat, Dimitar Dimitrov, Giovanni Da San Martino, Preslav Nakov, and Firoj Alam. 2026. MemeLens: Multilingual multitask VLMs for memes. arXiv preprint arXiv:2601.12539.

Nataliya Stepanova and Björn Ross. 2023. Temporal generalizability in multimodal misinformation detection. In Proceedings of the 1st GenBench Workshop on (Benchmarking) Generalisation in NLP, pages 76–88, Singapore. Association for Computational Linguistics.

Christian Traberg, Jon Roozenbeek, and Sander van der Linden. 2022. Psychological inoculation against misinformation: Current evidence and future directions. The ANNALS of the American Academy of Political and Social Science, 700(1):136–151.

World Economic Forum. 2026. The Global Risks Report 2026. Technical report, World Economic Forum. Published: 14 January 2026.
Xinyi Zhou and Reza Zafarani. 2020. A survey of fake news: Fundamental theories, detection methods, and opportunities. ACM Computing Surveys (CSUR), 53(5):1–40.

A Usability Constructs

Table 5 summarizes the five usability constructs used in our evaluation and the number of Likert items mapped to each construct. These constructs span usability/UX, visual design, navigation, content effectiveness, and behavioral impact, and are used to compute construct-level scores by averaging their constituent items.

Table 5: Usability constructs and item counts used in the evaluation.

Construct              #  Description
Usability / UX         2  General usability: capabilities meet needs; ease of use
Visual Design          4  Aesthetics and clarity: design appeal; readability; icon clarity; animation clarity
Navigation             3  Ease of navigation; finding lessons; speed and responsiveness
Content Effectiveness  4  Misinformation identification; example relevance; quiz reinforcement; EN/AR consistency
Behavioral Impact      4  Confidence evaluating information; questioning reliability; noticing manipulation tactics; helping others

B Usability Study Questionnaire

The following questionnaire was administered to N = 93 participants via SurveyMonkey after interacting with the CritiSense application. All Likert items used a 5-point scale: Strongly Disagree (1) – Strongly Agree (5), unless stated otherwise.

Section 1: Consent & App Usage
Q1. Which device did you use for CritiSense? [Android phone / iPhone / Tablet / Computer]
Q2. Which language did you use in the app? [English / Arabic / Both English & Arabic]
Q3. Is this your first time using CritiSense? [Yes / No]
Q4. Which activities did you complete? (Select all that apply) [Browsed the home screen / Completed a quiz / Switched app language / Explored multiple sections / Other]
Q5. What feature do you see in the app after reading a chapter? [Multiple choice: questions related to the chapter / another lesson / another chapter]

Section 2: Usability & UX
Q6. CritiSense's technical capabilities meet my needs.
Q7. CritiSense is easy to use.
Q8. Overall, how easy or difficult was it to complete a lesson and quiz? [1 = Extremely difficult, 7 = Extremely easy]

Section 3: Visual Design
Q9. The visual design of the app is appealing.
Q10. The text is easy to read (size, contrast, spacing).
Q11. Icons and interface elements are clear and understandable.
Q12. Animations and transitions add clarity to the experience.

Section 4: Navigation
Q13. It is easy to navigate through the app.
Q14. I can find lessons and quizzes easily.
Q15. The app feels fast and responsive.

Section 5: Content Effectiveness
Q16. The lessons helped me better identify misinformation and manipulation techniques.
Q17. The examples used in the lessons feel relevant to real situations.
Q18. The quizzes reinforce what I learned.
Q19. The English and Arabic content feel consistent in quality and coverage.

Section 6: Behavioral Impact
Q20. After using CritiSense, I feel more confident evaluating information I see online.
Q21. I am more likely to question the reliability of social media posts after using CritiSense.
Q22. CritiSense helped me notice common tricks used to manipulate people online.
Q23. After using CritiSense, I am more likely to help friends or family spot misleading or manipulative online content.

Section 7: Overall Evaluation
Q24. Overall, how satisfied are you with CritiSense? [1 = Very Dissatisfied, 5 = Very Satisfied]
Q25. I would like to continue using CritiSense in the future.
Q26. How likely are you to recommend CritiSense to a friend or colleague? [0 = Not at all likely, 10 = Extremely likely]

Section 8: Open-Ended Feedback
Q27. What part of the app (if any) felt confusing or difficult?
Q28. If we fix one thing first, what should it be?
Q29. What did you like most about using CritiSense?

Figure 4: User satisfaction survey results (N=93). The distribution indicates high platform acceptance, with 83.9% of participants reporting positive sentiment (36.6% Very Satisfied and 47.3% Satisfied). Neutral responses accounted for 11.8%, while combined negative sentiment (Dissatisfied and Very Dissatisfied) remained minimal at 4.4%.

Figure 5: User perception of system usability (N=93). A significant majority (90.1%) expressed agreement, with 60.4% indicating they "Strongly Agree."

Section 9: Demographics
Q30. What is your highest level of education? [High school or equivalent / Some college / Bachelor's (pursuing/completed) / Master's (pursuing/completed) / PhD or Doctorate / Other]
Q31. What is your field of study or expertise? (Select all that apply) [Education / Media & Communication / Technology & IT / Business & Management / Engineering / Health & Medicine / Social Sciences / Prefer not to say / Other]
Q32. What is your age group? [Under 18 / 18–24 / 25–34 / 35–44 / 45–54 / 55+]